Hacker News
An amateur linguist loses control of the language he invented (2012) (newyorker.com)
272 points by godarderik on Aug 15, 2014 | 106 comments



Reading this story brings to mind the history of algorithms in the field of machine translation. Early attempts explicitly defined the rules for converting between tongues, using meticulously laid-out systems of vocabulary and syntax. This approach proved untenable, in part due to the complex and ever-changing nature of language. Modern systems such as Google Translate use machine learning algorithms that are fed large amounts of source material and computationally discern relationships between languages.

I wonder if a similar approach could be taken to language construction. Instead of spending 25+ years fleshing out the details of a language in painstaking detail, computer programs could be devised that, using large amounts of input, determine the most "efficient" means of expressing information. The approach would not only be far less labor-intensive, it could also accommodate the rapidly evolving nature of language, for example adding to its "dictionary" in response to new phenomena in need of naming.


Interlingua was constructed this way, at least its vocabulary. IMHO they made the mistake of making the grammar naturalistic, which made it very easy to read for people who already speak a Romance language; writing, on the other hand, was made difficult by this.

You could perhaps use a typological database with grammatical features of the world's languages and somehow select an "optimal" combination from it, but that's a far cry from letting a computer determine the most efficient means of expressing information; we have no idea how to define information/meaning, so that it's still an impossible dream. I don't think the problem is that designing languages is hard per se, it's that people can't be bothered to agree on one and learn it.


Maybe it could use Minimum Message Length http://en.wikipedia.org/wiki/Minimum_message_length
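To make the Minimum Message Length idea concrete, here is a minimal Python sketch; the `description_length` function and its crude model-cost bound are illustrative assumptions, not MML proper. A candidate "model" of the text is scored by the bits needed to state the model plus the bits needed to encode the text under it, and the cheapest total wins.

```python
import math
from collections import Counter

def description_length(text, alphabet_size=26):
    """Two-part message length in bits: (1) the model -- here just a
    symbol-frequency table -- plus (2) the data encoded under it."""
    counts = Counter(text)
    n = len(text)
    # Part 1: a crude bound on transmitting the frequency table
    # (one count per alphabet symbol, each count at most n).
    model_bits = alphabet_size * math.log2(n + 1)
    # Part 2: Shannon code lengths for the data given the model.
    data_bits = -sum(c * math.log2(c / n) for c in counts.values())
    return model_bits + data_bits

# A regular, repetitive "language" beats the no-model baseline of
# log2(26) bits per symbol once the text is long enough to pay
# for describing the model.
regular = "abab" * 500
assert description_length(regular) < len(regular) * math.log2(26)
```

The two-part split is the key point: a more elaborate grammar only pays off if it saves more bits on the data than it costs to describe.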


It sounds like an experiment worth trying and could lead to some interesting results. On the other hand, I imagine most conlangers enjoy devising the details of their language.


In fact it doesn't just sound like an experiment worth doing. It sounds like something that somebody somewhere might have already done.


To be glib, it has been done. We call it language.

Seriously, though, take a look at the link I posted in https://news.ycombinator.com/item?id=8180924

One of the techniques used is to computationally create a space of possible ways to partition semantic domains on a plane whose dimensions are simplicity and informativeness, in order to see where in that space real languages lie. While it hasn't been done (to my knowledge) for a whole language, it's a potential direction to go.


For anyone who is interested in what an ideal language would look like, particularly with respect to brevity vs. informativeness, I'd highly suggest looking into Terry Regier's work: http://lclab.berkeley.edu/

I worked in his lab on one of many projects showing that most human languages use a near optimal trade-off in various semantic domains (so far - color, kinship, containers, and spatial relations). His work also includes some of the best evidence for some language dependent forces in cognition interacting with some universal ones.


Ithkuil seems like what a language should be: as the article said, it is both precise and concise. It looks the way Esperanto ought to have looked. I find Quijada's effort deeply impressive.

I don't know much about designing human languages, but I know how hard it is to design a decent programming language (see http://colinm.org/language_checklist.html), and building a serious human language seems orders of magnitude more difficult. I've never seen an attempt that really intrigued me until I found Ithkuil.


Language should not be concise. Redundancy is built into language for a reason - spoken communication is extremely noisy, and if there were only a two-bit error distance between "I love you" and "I killed and ate your dog", then using the language would not be comfortable for humans.

Moreover, people communicating are imperfect. So if you have a language which is very precise and concise, you have to spend a lot of effort finding a word or set of words which exactly expresses your meaning (in programming, we call it design when we do it upfront, and debugging when we do it post factum), and communication becomes a very complex exercise. However, if you have a lot of words which mean roughly the same thing, you can be sure the meaning gets through even if the words are not chosen super-carefully.
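The "error distance" point above can be sketched with a toy Hamming-distance check; the codewords here are made up purely for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Count differing symbols between two equal-length strings."""
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b))

# In a maximally dense code, neighbouring codewords carry very
# different meanings, so one garbled sound silently flips the message.
assert hamming("ka", "kb") == 1   # "I love you" vs "I ate your dog"

# Redundant phrasing spreads the message over more symbols, pushing
# valid utterances further apart; noise must now hit three places
# at once to produce the wrong-but-valid utterance.
assert hamming("kaaka", "kbbkb") == 3
```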


In some languages the meaning of a word is highly dependent on the pitch accent, as in ancient Greek. If you're 1 bit off, you have trouble :) Surprisingly enough, I read an example of this just yesterday evening:

"Hegelochus, the actor in Euripides' Orestes (presented in 408 BC), in line 279 of the play, instead of "after the storm I see again a calm sea" (galeén' horoo), recited "after the storm I see again a weasel" (galeên horoo)."


One of the most famous passages from the Bible seems to have been affected by (or benefited from) a similar ambiguity.

Mark 10:25 (and parallel versions in Matthew and Luke) has prompted much speculation over the centuries with regards to the origin of its evocative metaphor: "It is easier for a camel to go through the eye of a needle than for someone who is rich to enter the kingdom of God."

But when you consider that the word for camel (kamêlos) and for rope (kamilos) differ by only one vowel, quite a mundane explanation springs to mind: someone in the early church misheard, misspelled, or mistranslated Jesus' original admonition.

The more satisfying explanation, the one that I prefer, is that this is a pun that happens to have gotten lost in translation.

The few comments in this blog article offer some interesting explanations:

http://rambambashi.wordpress.com/2010/06/03/common-errors-36...


Case in point, the Turkish I problem has had tragic consequences: http://gizmodo.com/382026/a-cellphones-missing-dot-kills-two...


Having had to deal in the past with code broken by Turkish having two i's and weird case-conversion rules (for both i and I, a case change goes to the other letter, which is not present in ASCII), I can only be happy it didn't come to these proportions.


It's hard to overstate the importance of redundancy in natural languages.

This is the whole reason we are able to make ourselves understood in a noisy, imprecise world. Even if you miss a few syllables - or even half a sentence - you can usually piece together what the other person was trying to say.

Imagine someone technology-illiterate trying to describe a problem they're having with their computer in this language. Impossible.


It's hard to overstate the importance of redundancy in natural languages.

There are other factors at play, such that speech perception is multi-modal. E.g. see:

https://en.wikipedia.org/wiki/McGurk_effect


> Ithkuil seems like what a language should be: as the article said, it is both precise and concise.

Language should be useful and expressive for its users. Human languages designed by the kind of people who prize simplicity and regularity above all other qualities tend to fail for much the same reason programming language designers are hurt and disappointed to discover C is still popular.


Ithkuil is not simple. It's so ridiculously complex that it is just not possible for anyone to learn to speak fluently. It never stood any chance of being adopted in the way Esperanto has. I'm surprised that it found any use at all, but people without linguistics training seem to find it useful for discovering nuances in their own languages that they hadn't considered.


>people without linguistics training seem to find it useful for discovering nuances in their own languages that they hadn't considered.

Learning any foreign language will tend to do that, though. It's a function of being forced to consider your native tongue as a language, rather than just speaking it. Plus if you learn a foreign language, you will have the advantage of being able to talk to an established base of people who speak it.


Yes but Ithkuil is a bit unique in that it decided to include as many grammatical distinctions as possible.


>Ithkuil seems like what a language should be: as the article said, it is both precise and concise.

Which is good for scientists and logicians, but would probably be an issue for novelists and poets. Exploiting ambiguity is a common feature of the arts.


Esperanto was designed first and foremost to be learnable, and it seems to be quite successful in this regard, whereas that's not a strong point of Ithkuil (it wasn't a priority). With "ought to have looked", do you mean in your own opinion, or are you referring to some of Esperanto's original goals? I always got the impression that learnability and ease of use were bigger priorities.

Personally, I don't have much faith in the idea that you can make people's communication more precise and concise by designing the language in a certain way, just as I don't think that politically correct language leads to meaningful change. This is simply because I don't believe that language can drastically change the way people think (linguistic determinism, strong Whorfianism). Esperanto is very similar to natural languages in its (im)precision, which might just be the level humans naturally converge to if left to their own devices.


I remember one criticism of Esperanto being that it was too similar to European languages. I read about a language that was similar to Esperanto but incorporated Chinese and Arabic qualities. Does anybody know which one this was?


It's really only the vocabulary which takes from European languages, because the grammar is schematic, very regular and simple to learn. Broadly speaking, you have two options: 1) you select a specific group (e.g., European languages) and sample from their vocabulary. 2) you sample from ALL the world's languages, or generate completely random words (effectively just as hard to learn).

In the first case you have privileged one group, but in the second everyone loses ... In the second case you create an additional barrier because there is a large amount of new vocabulary that everyone has to learn, while in the other anyone who doesn't know European languages will have to learn some of its vocabulary, but that might be useful to them anyway, so it clearly seems like the better option to me.


58 different phonemes means the language is just substituting one form of complexity for another.

And apq’uxasiu for 'gawk' doesn't strike me as a particularly huge win for conciseness.


> And apq’uxasiu for 'gawk' doesn't strike me as a particularly huge win for conciseness.

It seems like it's missing the linguistic equivalent of Huffman coding: make frequently used things shorter.

Also, it's hardly concise with respect to time if you have to spend 30 minutes consulting a dictionary to utter a single sentence.
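For reference, a minimal sketch of the Huffman idea applied to a made-up word-frequency table; the frequencies and the `huffman_lengths` helper are hypothetical.

```python
import heapq

def huffman_lengths(freqs):
    """Bits per symbol under a Huffman code for the given frequencies."""
    heap = [(f, i, {s: 0}) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)  # keeps tuple comparison away from the dicts
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        # Merging two subtrees pushes every leaf in them one level deeper.
        merged = {s: d + 1 for s, d in {**c1, **c2}.items()}
        heapq.heappush(heap, (f1 + f2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

# Frequent words get short codes; rare mouthfuls get long ones.
usage = {"the": 60, "of": 25, "gawk": 1, "sesquipedalian": 1}
lengths = huffman_lengths(usage)
assert lengths["the"] < lengths["gawk"]
```

Natural languages roughly do this already (Zipf's "law of abbreviation": common words tend to be short), which is part of the complaint about apq'uxasiu.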


The Huffman coding strategy relies too much on culture to remain properly useful across different groups and over time.


Given that you're communicating to an agent who can use context for disambiguation, all efficient languages will be ambiguous.

http://www.sciencedirect.com/science/article/pii/S0010027711... (Sorry for the paywall...)

So maybe Ithkuil isn't what a language should be after all. And maybe that's why no natural language looks like it...


So basically language decoding is multimodal like one of the other comments here said? Like in digital communication, perfect decoding needs appropriate SNR (context) in addition to optimal spectral efficiency (conciseness/preciseness of the language). Interesting research.



Oh gods! It sounds like a tongue-twister played backwards. Clearly not designed for ease of pronunciation or for singing songs.


> Ithkuil does not use the concept of zero

Interesting. How is one supposed to talk about math without a concept of zero?


It's likely not missing the concept of zero but rather the symbol in the numbering system. A 0 digit is not necessary for representing integers (if there's a new symbol for ten). I'm pretty sure it is necessary to write the decimal part of a real number, though.
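The "new symbol for ten" idea is exactly bijective base-10 numeration; a small sketch (using 'A' for ten is an arbitrary choice of glyph):

```python
def to_bijective_decimal(n: int) -> str:
    """Write a positive integer using digits 1-9 plus 'A' for ten:
    bijective base-10, which needs no zero digit."""
    assert n > 0
    digits = []
    while n > 0:
        # Subtracting 1 before dividing shifts the remainder range
        # from 0..9 to the zeroless digit set 1..10.
        n, r = divmod(n - 1, 10)
        digits.append("123456789A"[r])
    return "".join(reversed(digits))

assert to_bijective_decimal(10) == "A"     # a dedicated symbol for ten
assert to_bijective_decimal(20) == "1A"    # "one ten and ten"
assert to_bijective_decimal(123) == "123"  # most numbers are unchanged
```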


It's a fascinating problem. It makes me wonder what paths mathematics would have taken without zero. There have been civilisations that used math extensively without zero, and I hope Ithkuil-fluent mathematicians will some day continue exploring this.


"Zero: The Biography of a Dangerous Idea" was quite a memorable book for me, and goes into some good detail on those very points - even trying to guesstimate just how much damage the numerous rejections of zero at several points in our early history did to the progress of civilization. http://books.google.com.au/books/about/Zero.html?id=obJ70nxV...


The same thing happened to Blissymbols[1], as documented by radiolab[2].

1. https://en.wikipedia.org/wiki/Blissymbols

2. http://www.radiolab.org/story/257194-man-became-bliss/


That was a great episode.

This has also happened with the language Lojban[1], which was 'forked' from Loglan[2] when the creator started making copyright complaints, so that the community could maintain control.

Such an odd concept that someone could 'own' a language, but I guess if you created it I can see why you would want to.

[1] http://en.wikipedia.org/wiki/Lojban [2] http://en.wikipedia.org/wiki/Loglan


I think it's a great lesson for any creator. Just because you have invented something does not necessarily mean you have the right or are capable of dictating how people use it.

I seem to remember Umberto Eco mentioning that he does not offer his own interpretations of his novels for a similar reason, but I can't find the quote.


I can recommend his book.

The Search for the Perfect Language - Umberto Eco - 2010


It's a less odd concept if you compare it to Elvish rather than English.


This is attracting some reader interest here, so I should probably mention, for other Hacker News participants deeply interested in human languages, a definitive analysis of Esperanto[1] explaining why Esperanto has not caught on with more speakers.

[1] http://www.xibalba.demon.co.uk/jbr/ranto/


I don't think that explains anything; it looks like a list of aesthetic "faults" that the author finds, unlikely to be recognised by anyone without a degree in linguistics. Surely those faults exist, but it's a stretch to pretend that they alone explain anything.

I speak some Esperanto but I'm no zealot. The "explanation" that your neighbours don't speak Esperanto, if such a simplified thing can exist, is probably as simple as network effects. I can learn German and speak to the man next-door, or I can learn Esperanto and speak to some theoretical people that may exist somewhere but I don't know them and they are mostly a bunch of nerds that meet at the local co-op to speak in Esperanto mostly about how great Esperanto is.

Go ahead. Ask your neighbour why he doesn't speak Esperanto. Does he say " The 'basic' number‐terms tri, trio, tria ('three, threesome, third') are a crowded jumble, making a mockery of the regular root/noun/adjective pattern they imitate" (K5 in the article)?

Or does he say "what's that?"?


Agreed: studying Esperanto now makes as much sense as striving to provide CP/M compatibility in your product.


Please don't put words in my mouth, I don't think you agree with me at all.

I think Esperanto is a fun hobby and I have fun toying with it every few years, even if I'm under no delusion that it's a practical way to, say, do international business. I do think that if it or any universal second language (French, English, Mandarin, whatever) were to suddenly become more widespread, the world would be better off.

My claim was only that the reason that it's not more widespread has nothing to do with subjective flaws in the language itself.


Sorry, I wasn't trying to put words in your mouth. I was just trying to underline the fact that if you put "CP/M compatible" in the ad for your product you'd get exactly the same response from most people, i.e. "uh, what?"

And probably the same response after you patiently explained to them what CP/M (or Esperanto) was: "Why the hell should I care about that? Give me something that works with Windows (or Mandarin) so I can work with a sizeable portion of the rest of the world!!!"...


After reading it, I would say it is really badly written, but the author's point seems to be that most of Esperanto's structure was basically designed at random, with no particular goals in mind other than looking superficially similar to other European languages. His substantive objections are as follows:

* the language has an excessive number of phonemes

me: On this point I disagree strongly. A language with more phonemes can form shorter words and convey information faster. That's just basic math (43^9 >> 16^9).

* the vocabulary is too large to be practical and has dubious links to other languages. specifically, Basic English requires far fewer (>10x fewer!) words for competence and is more recognizable due to worldwide borrowing.

me: Here I agree. The amount of memorization required is quite large and borrowing words doesn't make up for it. Interlingua took a better path, but it targeted the only continent that doesn't need it.

* synthesis mechanisms are irregular and insufficiently general. in particular, esperanto's semantic structure and word synthesis fail to allow the speaker to compensate for missing vocabulary by using compound words or overly general words with attached descriptors

[I don't understand this sort of thing, but it sounds serious]

* the alphabet is needlessly complex. it uses symbols people do not recognize and cannot type in favor of ASCII digraphs with wide recognition.

me: Also, it does not accord significant importance to the way that an orange bikeshed would clearly clash with the trees in the background.

Okay, fine, I agree. This is just putting the nail in the coffin at this point.

* the noun-to-verb-to-adjective declension is not compatible with the structure of meaning, and so cannot be extended in a consistent, predictable way. that is to say, the variety of possible verbs is such that it is basically undecidable what noun should correspond to a given verb in the general case, and vice versa.

me: I don't know this but as far as I can tell he's right. For example, what is the "verb" associated with "electron"?

* the language is inherently sexist.

me: This is not a trivial criticism.

So there are some substantial criticisms in the article, methinks. And you can't discount the impact of difficulty on network effects: if one person has too much trouble learning Esperanto, they're not going to pass it on to their friends, who will not pass it on to their friends.

I wonder if a language designed to be usable with as small a core as possible -- something the author suggests -- could have a substantially better chance of catching on? Perhaps if it were also good at borrowing words from other languages ("extensible"), and grew in the right community for a little while...
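The phoneme arithmetic earlier in this comment (43^9 >> 16^9) can be restated as the minimum word length needed for a given inventory; a quick sketch with a hypothetical vocabulary size:

```python
import math

def min_word_length(vocab_size: int, phonemes: int) -> int:
    """Shortest uniform word length that can distinguish `vocab_size`
    items with an inventory of `phonemes` symbols."""
    return math.ceil(math.log(vocab_size) / math.log(phonemes))

vocab = 100_000  # a working vocabulary, order of magnitude only
assert min_word_length(vocab, 43) == 4  # bigger inventory, shorter words
assert min_word_length(vocab, 16) == 5
assert min_word_length(vocab, 8) == 6   # a Hawai'ian-sized inventory
```

The gain is logarithmic, which is why halving the inventory only costs a syllable or two rather than doubling every word.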


>>> A language with more phonemes can form shorter words and convey information faster.

Shorter words, yes. Convey information faster - only if you have perfect speakers and perfect listeners. Otherwise, in the best case you'd just have people ask - did you just say "hélló" or "hëllò"? In the worst case, they'd just misunderstand what you're saying. And since you strive to eliminate redundancy, instead of nonsense that would trigger a request for repetition, you'd get meaning - but a meaning other than the one you intended.


>Shorter words, yes. Convey information faster - only if you have perfect speakers and perfect listeners. Otherwise, in the best case you'd just have people ask - did you just say "hélló" or "hëllò"? In the worst case, they'd just misunderstand what you're saying. And since you strive to eliminate redundancy, instead of nonsense that would trigger a request for repetition, you'd get meaning - but a meaning other than the one you intended.

Perhaps I didn't include enough context from the document. He cited Esperanto as containing 34 phonemes, compared to English at 43, and then a list including Andean Spanish with 17, Japanese with 14, Hawai'ian with 8, or Rotokas with 6. That is, he cited (what seem to be) outliers with the smallest number of phonemes and implied that these were desirable. Later he notes that Eastern Polish has 49 phonemes.

I agree that a language can certainly have phonemes which are difficult to distinguish (Mandarin Chinese is famous for this) or hard to pronounce (the voiced "th" in English "the" is famous for this) or inscrutable for some listeners (most English speakers cannot distinguish dz from d), but I thought his minimalism was too extreme.


I have a lot of respect for your opinions on language (given your background as a professional translator and the solid advice you regularly give here on HN and your website).

Whenever a post about constructed languages comes up, you post this link, which I find disappointing: Rye's rant is, well, just a rant and his reasons for not learning Esperanto are just bad.

There was a time when I spent a few months learning Esperanto. I eventually gave up because 1) whilst it is relatively easy to get going with Esperanto, speaking it comfortably would require about as much work as any other language, and 2) the Esperanto community is made up of folks who are either much older than me or a little strange.

I don't think my first criticism is a fault of Esperanto. Human languages require a lot of convention and shared cultural ideas in order for communication to be compact and, at the same time, clear. No language, auxiliary or otherwise, can magically remove these requirements.

The second criticism is perhaps a little unfair but still important. Cultural cachet is important for a language and a language that appears to cater only to certain groups is going to have a tough time.

But if someone is keen to learn a new language, especially if that person is a monoglot, they should be encouraged to give it a try, even if their language of choice is Esperanto. They can only improve their linguistic skills.


There is also other criticism of Esperanto that's worth reading (imho). Here's a linguist and former Esperanto-proponent's opinion on the community surrounding the language.

http://www.christopherculver.com/writings/esperanto.html


One of the final paragraphs really resonates with my own experiences of international (mostly EU) meetings:

""" During recent travels to Spain, I had the opportunity to observe participants in a pan-European seminar on youth and globalisation. While English was the default language of this group, in conversations between any two people the participants would often switch to the native language of one or the other. For example, a young man from France would greet another in English, but upon discovering that his conversation partner is from Italy, would switch to Italian. This would not find approval among Esperantists. Ironically, English proves the neutral choice here. It is often seen as a sure bet for international communication among young people in many countries, but it is well understood that other languages may serve just as well. In the Esperanto movement, on the other hand, there is an ideological attachment to Esperanto which mandates its use even if there are other, more culturally rich possibilities. """

My experience was at large international demo-parties. I mainly noticed this for English and German, but that's because I speak those two languages relatively fluently (Dutch is my mother tongue). The same must have been happening around me for French or Spanish (or who knows what else) as well, but I don't speak those languages well enough to tell for sure whether they were using (say) Spanish as a common bridging language or were native speakers.

And yes, English is usually the sure bet. Although some of the comments elsewhere ITT have piqued my interest to perhaps learn some Spanish in the future, especially if it's really that easy to learn. Funny, I used to hate learning languages (French and German) in high school. It's only later in my life that I found out I actually have quite a knack for it :) (I should probably blame the way it was taught, but I can't really get that worked up about it; I'm pretty satisfied with the quality of my education overall.)


Hello. You cite the case where "a young man from France would greet another in English, but upon discovering that his conversation partner is from Italy, would switch to Italian." I'm not sure how frequent such a case would be. The dominance of English has pushed other languages to the sidelines in France.

As an Esperanto speaker, I have no objection to anyone using any language on any occasion, and I'm happy to do so myself. I have just returned from an Esperanto conference in Dinan, Brittany. I was able to use some basic Breton with a few individuals, but the sky did not fall on my head.

I wish you well with your language learning. You may wish to add Esperanto one day.


That rant is certainly not definitive, he clearly has a chip on his shoulder. He lists a lot of reasons why Esperanto is not a perfect language, but (obviously) no language can be; when a language is constructed people suddenly set way higher standards, but this is not reasonable and leads to endless bikeshedding.

Still, I'd say Esperanto is an interesting case because it is undeniably the most successful constructed language, with the longest continuous (still continuing) history. It is interesting to learn it for that reason alone, and as far as learning second languages go, it is also one of the easiest to learn. That could make it a useful first step towards learning other second languages.

Why hasn't Esperanto caught on with more speakers? Maybe it has something to do with the reasons in his rant; personally I find it much more likely that it's due to network effects. Even a "perfectly" constructed language would be very unlikely to win over the vested interests and inertia of people.


What I wrote about Esperanto and its failure to go viral:

"There's already an existing language that fulfills so many of those criteria that it's going to be very hard to organize a new one from scratch. And the existing language already has a deep legacy of literature and culture.

That language is the world's second most widespread: Spanish.

Spanish has been destroying the dreams of Esperantists and others over the years who hope to build a more regular, orderly, and easy to learn common language based on common Indo-European roots.

Turns out that it's dang hard to design anything easier or more accessible to speakers of any European language than Spanish already is. The spelling and pronunciation are already completely regular and predictable. The grammar is straightforward and common to almost all European tongues. The vocabulary is mostly based on Latin with some Arabic variety thrown in, but it's been standardized over the centuries so that a lot of it has a simpler and more natural morphology. The sounds are a simple subset of what most languages already use.

It's a great second language: it's fairly easy, the world's second most widespread tongue, and spoken in warm countries with very friendly natives. It's not likely to provide you with many lucrative business opportunities, though. None of the world's financial capitals use it."


Spanish is a beautiful language, I speak some myself, it's probably better than Esperanto or the more European-focused Interlingua, but... consider the number of forms of a regular verb:

(lavar (present lavo lavas lava lavamos lavan) (past lavé lavaste lavó lavamos lavaron) (imperfect lavaba lavabas lavábamos lavaban) (future lavaré lavarás lavará lavaremos lavarán) (conditional lavaría lavarías lavaríamos lavarían) (present-subjunctive lave laves lavemos laven) (gerund lavando) (participle lavado))

And that's ignoring the twenty or so irregular verbs and roughly ten "irregular" patterns (e.g. querer). It's a lot simpler than French or Italian, to be sure, or Latin, for that matter, but it could be a lot easier.


That's not to mention that Spanish is pro-drop (subjects are routinely omitted), so those sometimes subtly different verb conjugations are often the only way of conveying the subject of the sentence. It seems much more optimised for speaking rápidamente than for listening, especially for non-native speakers.

If you were designing a language from scratch you also wouldn't choose features like the unnecessary grammatical gender (even though it's relatively consistent and easy to get right) and the b/v distinction in the orthography that's unpronounced in most dialects.

On paper it's still a far, far better auxiliary language than English, though. (I wouldn't be surprised to live to see a day when most people worldwide speak a regularised English with much more basic grammar and sensible orthography; it's easier to build momentum when you take the second language people are already most exposed to as the starting point.)


Every time I encounter irregular verb forms, I remember my soft spot for Japanese (basically a handful of exceptions and only two tenses). The leverage of context is powerful and concise in a way strangely similar to Perl. Also, I can't decide if their system of counter words is brilliant or cumbersome. As for writing systems, I have a particular soft spot for Hangul.

I wish it would be as easy to prototype human languages as it is for software...


Ithkuil is definitely one of the most amazing pieces of work I have ever come across. I have been using the name as my email address for many years, and another variant of it he created, called 'ilaksh', as my screen name (note: I didn't have anything to do with the creation of Ithkuil/ilaksh, I'm just a fan). I think not only other conlangers but also anyone interested in fields like linguistics, computer programming, and knowledge representation can be inspired by what Quijada did.

I did get a few somewhat weird emails that I think were in Russian some years ago, but I think they figured out pretty quick that it wasn't the right email address to reach Quijada.


Losing control of a language seems to be standard procedure.

If this invented language were to catch on, it likely wouldn't be more than a generation or two before kids who grew up speaking it started saying the Ithkuil equivalent of things like "yo dog, that's the rad shizaz!". Then, several generations thereafter, grandmothers would be regularly using the word "shizaz" and it would have to go in the dictionary. That's just the way it goes, and it's probably the reason we don't all speak the same language in the first place.

That being said, I've always been fascinated by the idea of a systematically created universal language and think the world would be a much better place with one... if that were possible.

This was a neat article.


I think there's some research out there that suggests all natural languages have about the same information density, when you factor how two people in conversation will add error-correction or extra context to frame an idea.

IMO this suggests the bottleneck is something about our brains on a biological rather than linguistic level.


According to a study published several years ago, mainstream languages seem to operate on an information density/speed tradeoff [1]. The authors found that languages that are spoken faster seem to encode less information per syllable than those uttered at a slower pace.

This does seem to suggest that biology may be the limiting factor in the rate at which humans convey information. Indeed, the language mentioned in the article seems almost laughably cryptic and dense. However, I feel that the limitation of the mentioned study stems from the fact that it treats information on a relatively limiting per-syllable basis. Quijada seems to suggest that an artificially constructed language has the ability to incorporate all the implicit meanings of a phrase that are left unsaid in normal conversation.

Ultimately, while Quijada's project seems quite unlikely to catch on among those who are not fringe pseudoscientists, it poses interesting philosophical questions about the nature of speech and communication and perhaps earns its title as a "conceptual-art project."

[1] http://rosettaproject.org/blog/02012/mar/1/language-speed-vs...


The article seems to support the information density / speed tradeoff, in hinting several times that the language's inventor puts at least as much cognitive effort into agglutinating syllables to form a word in his language as he would into joining words to make a sentence in a second language.


>The authors found that languages that are spoken faster seem to encode more information per syllable than those uttered at a slower pace.

I think you mean the inverse.


Fixed it, thanks.


I think it is more auditory and has to do with our ability to error-correct. There is a really fascinating section in James Gleick's "The Information" on African drum communication, which is essentially a much less dense version of spoken language: you say everything using a long sentence, which reduces the possibility of misinterpretation even though most of the sounds of normal spoken language are missing. I'm not sure whether more scholarly work supports this idea, but if it does, then a written or thought language could certainly be denser... like symbolic algebra or Python.


I found this article fascinating and satisfying.

I'm curious about the desire to reduce ambiguity, which seemed to be emphasized as a motivation for the creation of Ithkuil and some of the other languages mentioned.

Is it desirable to completely eliminate ambiguity? I can see why it would be desirable in a scientific paper or a public political debate. But in everyday interactions, (intentional) ambiguity plays many important roles.

In my experience, politeness is bolstered by some level of ambiguity. Rather than explicitly state your needs, desires or opinions, you imply them at some level of abstraction, allowing other participants in the conversation to accept or decline more easily. Imagine Jessica who has brought two friends who don't know each other to see a play. They chit-chat a little afterwards, then Jessica goes home early leaving two virtual strangers to have a drink together. It's not hard to imagine the conversation going like this:

A: "Did you enjoy the play?"

B: "It was very interesting. I thought the stage dressing was a little unconventional."

A: "Yes, I noticed that too. Very creative. I was intrigued by the style of the narration. It really let the audience write the story for themselves."

B: "It certainly didn't constrain the imagination did it? I couldn't help noticing that many of the actors took a somewhat avant-garde interpretation of the source material."

A: "Yes, as if they didn't want it to seem like they were 'acting', so to speak?"

B: "It was awful wasn't it!?"

A: "Thank god! Yes, worst thing I've ever seen!"

Ambiguity allows subtle social cues (not so subtle in my example!) that avoid direct confrontation when it might be uncomfortable. If one person loved the play and the other hated it, they each might want to avoid offending the other.

Intentional ambiguity plays an important role in other social interactions like dating or friendship-making. Correct use of ambiguity protects feelings, demonstrates subtlety and good judgement, and avoids non-productive conflict.

In artistic expression too, ambiguity is often intentional or even necessary to the effectiveness of the work. Consider a poem like "My Papa's Waltz" [1]. Does it describe happy memories of the narrator's father, or dark memories of childhood abuse [2]? Can it describe both? Is there something in between? The ambiguity isn't a byproduct of imprecise language. The ambiguity is the meaning. To resolve it is to remove the point of the work. The poem cannot be effectively communicated in any medium that does not allow for the existence of ambiguity.

[1] http://www.poetryfoundation.org/poem/172103

[2] 'Yet, this poem has an intriguing ambiguity that elicits startlingly different interpretations. Kennedy calls it a scene of "comedy" and "persistent love", and Balakian, in part, labels it a "comic romp" (62). In contrast, Ciardi sees it as a "poem of terror"' - from http://www.mrbauld.com/exrthkwtz.html


>, politeness is bolstered by some level of ambiguity. Rather than explicitly state your needs, desires or opinions, you imply them at some level of abstraction, allowing other participants in the conversation to accept or decline more easily.

Steven Pinker explores this point in an entertaining presentation[1] for RSA. He also covers other language topics such as spacetime encoding and profanity, but the last part analyzes the need for ambiguity in a language. I deep-linked into the relevant portion of the presentation, although the entire talk is very enlightening and worth rewinding to the beginning. The second YouTube video[2] is mostly the same material, but it's the older version he presented at Google TechTalks.

[1]http://www.youtube.com/watch?v=5S1d3cNge24#t=32m55s

[2]http://www.youtube.com/watch?v=hBpetDxIEMU#t=40m38s


To quote Levinson (2000): "inference is cheap, articulation expensive, and thus the design requirements are for a system that maximizes inference". All natural languages [citation needed] rely heavily on inference — simply because it allows one to minimize what's said. That's not to say that everything maximally relies on inference — politeness phenomena clearly demonstrate otherwise, often being "needlessly" verbose.


I agree, and additionally I would say we shouldn't remove ambiguity for a more basic reason: humans are very good at resolving it. There may be lots of other areas where we have systematic biases such as estimating probabilities, but dealing with ambiguity is actually one of our big strengths so it doesn't make sense to spend so much energy on avoiding it.

This is a bit speculative, but I have the feeling there is a certain kind of personality that is typically fascinated with this idea of being able to rule out ambiguity. It seems to me a tendency to overextend the idea of mathematical rigor to other areas of life.


The point has also been made that ambiguity is a functional property of language, because it is more efficient from an information-theoretic point of view: context also provides information, but a purely unambiguous language would have to express that information anyway. Ambiguity also allows reusing words and sounds. (http://www.sciencedirect.com/science/article/pii/S0010027711...).


Couldn't agree more. Seven Types of Ambiguity published as far back as 1930 by William Empson launched a school of criticism called New Criticism. A definition of ambiguity is then "alternative views might be taken without sheer misreading." For Empson poetry is heavily reliant on ambiguity. And, arguably, poetry is language at its most wrought with ideas most distilled.


One of the threads in David Brin's Jijo trilogy is that development in the galaxy is being held back by (amongst other things) designed, unambiguous languages.


Ambiguity is necessary for art. If there is no room for interpretation, then it's just a photo-of-words.


I also wonder if a concise language like this wouldn't make lying easier, since you can manipulate the meaning of your words by only slightly altering their spelling and pronunciation.


I think an unambiguous human-speakable language would be a great candidate to teach to a computer.


"Among the Wakashan Indians of the Pacific Northwest, a grammatically correct sentence can’t be formed without providing what linguists refer to as “evidentiality,” inflecting the verb to indicate whether you are speaking from direct experience, inference, conjecture, or hearsay"

This is amazing. But I can't grasp the difference between inference and conjecture - they are both 'figuring out' what happened rather than knowing or hearing?


I wonder how well Ithkuil can be represented in Iain Banks's Marain script. http://trevor-hopkins.com/banks/a-few-notes-on-marain.html


Ithkuil has well over double the number of phonemes of Marain, so the answer would probably be "not so well".


Rotation and reflection of the basic set extend the phonemes and can link together similar sounding ones in Marain, so I would have thought it would be achievable.


>"Languages are something of a mess. They evolve over centuries through an unplanned, democratic process..."

I'm in awe of the creator of any language, because creating a (good) language isn't easy. This is true of both programming languages and natural ones. However, it goes without saying that adoption is a vital component of any language, and with mass adoption comes evolution.

People will often make changes in languages and create their own dialects (based perhaps on things they can relate to on a deeper level, etc.). This isn't a bad thing. To me it only signifies growth and expansion of the language.

+1


I really enjoyed this article when it was new. Not long ago, when I was learning Octopress, my first post was Hello World in Rust and Ithkuil. (I just wanted to make sure code formatting was working.) I have no idea how correct the translation is. I just googled around until I found someone else's.

http://screaming.org/blog/2014/07/12/ettawil-cutx/


Can someone list a few popular constructed languages (maybe comparing them to programming languages)? I'd only heard of Lojban and Esperanto before reading this.


Try http://www.reddit.com/r/conlangs for discussion and lists of languages. There are many subreddits for specific languages. For example, Toki Pona [1] and Lojban [2].

[1] http://www.reddit.com/r/tokipona [2] http://www.reddit.com/r/lojban


Toki Pona is a minimalist language with a bit of a following. It has 120 root words and tries to build all concepts based on those.

Loglan is a predecessor and inspiration to Lojban.

Slovio is the Slavic version of Esperanto.

Dothraki, Elvish (Quenya, Sindarin), Klingon, Na'vi are constructed languages from popular novels/movies.


One novel thing about Tolkien's languages: he constructed a root language (like proto-indo-european) and etymologies for them. (See http://en.wikipedia.org/wiki/The_Etymologies_(Tolkien) )


> Slovio is the Slavic version of Esperanto.

Someone doesn't understand the point of Esperanto.


Regardless of the point, the truth is Esperanto's lexicon is mostly inspired by Romance languages whereas Slovio's lexicon is mostly inspired by Slavic languages. It's not that shocking of a comparison.


TFA:

> A sentence like “On the contrary, I think it may turn out that this rugged mountain range trails off at some point” becomes simply “Tram-mļöi hhâsmařpţuktôx.”

Wikipedia:

> Romanization: Oumpeá äx’ääļuktëx.

> Translation: "On the contrary, I think it may turn out that this rugged mountain range trails off at some point."


That was an interesting read, but the reporter's breathless assertions frequently got in the way of appreciating Quijada and his idea.

I mean, things like:

> A sentence like “On the contrary, I think it may turn out that this rugged mountain range trails off at some point” becomes simply “Tram-mļöi hhâsmařpţuktôx.”

Simply?

We could have used the LZW algorithm and the sentence could probably become even shorter, just a "simple" sequence of random-ish bytes. If you increase the number of allowed symbols, of course you need fewer symbols to convey the same information. If you allow for a limitless set of words that are dynamically generated from combining many roots, of course the number of words decreases... sometimes down to 1, as in polysynthetic languages. This is Information Theory 101.


The English text is 97 ASCII encoded bytes.

Compressed with zlib: 86 bytes.

Compressed with lzma: 98 bytes.

The Ithkuil representation is just 30 UTF-8 encoded bytes.

Compressed with zlib: 39 bytes.

Compressed with lzma: 47 bytes.

(Measured using Python's zlib/pylzma modules to avoid e.g. file header overhead)

It's hard to achieve this kind of compression without an external dictionary. What Quijada has created with Ithkuil is, in part, a dictionary for the space of human thought and concepts, something I wouldn't have expected to work in the way the article describes it.
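A minimal sketch of how such a comparison might be reproduced with Python's standard-library zlib and lzma modules, using raw streams so no container headers or footers inflate the counts. The exact compressed sizes vary a few bytes with compression settings, and the raw Ithkuil string measures 31 UTF-8 bytes here because the final period is included:

```python
import zlib
import lzma

# The two sentences from the article, as bytes.
english = ("On the contrary, I think it may turn out that this rugged "
           "mountain range trails off at some point").encode("utf-8")
ithkuil = "Tram-mļöi hhâsmařpţuktôx.".encode("utf-8")

def deflate_size(data: bytes) -> int:
    # Raw DEFLATE (negative wbits): no 2-byte zlib header, no 4-byte Adler-32 footer.
    c = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
    return len(c.compress(data) + c.flush())

def lzma_raw_size(data: bytes) -> int:
    # Raw LZMA2 stream: no .xz container overhead.
    filters = [{"id": lzma.FILTER_LZMA2, "preset": 9}]
    return len(lzma.compress(data, format=lzma.FORMAT_RAW, filters=filters))

print("English:", len(english), deflate_size(english), lzma_raw_size(english))
print("Ithkuil:", len(ithkuil), deflate_size(ithkuil), lzma_raw_size(ithkuil))
```

Note that on inputs this short, general-purpose compressors barely help (and lzma's per-stream overhead can even make the output larger than the input), which is why the Ithkuil encoding, acting as a shared external dictionary, beats them.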


Actually, using the zlib format gets you an unnecessary 2-byte header and 4-byte footer, so the proper sizes are 80 and 33.

I'm having trouble figuring out what's going on with lzma because the spec is lying about the header, so I won't attempt to guess the correct number there.



While it looks like an impossible language to use in everyday life, I'm wondering if it could be used for science and technology. Just imagine having all scientific papers in it :)


I'm amazed that the article doesn't mention Lojban at all.


Off topic: Are the drop caps supposed to be lower than the line of text to which they belong? It looks kind of silly IMO.


Any font files for this? would be interesting to use.


If there was a site that summarised New Yorker articles in 2 pages I would be there in a flash.


That would be atrocious! You need to savour and enjoy the reading, just as you enjoy a nice drink or a good cup of coffee, or take the time to make coitus a never-ending engagement.

Just enjoy it.


Sometimes I need a quick snack, for the calories.


To be fair there are lots of "snack food" articles around these days but few three course meals. I try to appreciate the long form articles that I find interesting for what they are.


You can't say that on Hacker News. The downvote brigade will nail you every time for disagreeing with them. You can not comment on the verbosity of articles or unrelated extra paragraphs and asides that don't serve the overall narrative in The New Yorker, The Atlantic, etc.


Seems this could contribute to accelerating artificial intelligence towards the possibility of the singularity.


another hacker news TL;DR article


Too bad. It's a pretty good read.


It's a very interesting article. But it's done in old journalism/academic paper style where it takes 5 pages to get to the point and has huge multiparagraph asides that the reader is often uninterested in. I already know the history of esperanto... most people won't even care about it. I don't care at all that George Soros learned it as his first language, it's unrelated nonsense. Tell me about the topic of the article. If you want interested people to be able to learn more about Esperanto, link to a side article. We can do this today.


Infodump


Two things struck me about this article in hindsight when I read it.

-- Whose pot did the Croats, Bosnians and Slovenes piss in to not make it into this super Slavic union?

-- China Mieville wrote a book[0] along very similar thought lines which won the Locus Award.

Also, Garkavenko appears not to have taken the obvious side [1] in Ukraine's present conflict given how he is described in Foer's article

[0] https://en.wikipedia.org/wiki/Embassytown

[1] http://maidantranslations.com/2014/06/24/russian-volunteers-...



