The Death of the Urdu Script (medium.com/eteraz)
316 points by aarghh on June 25, 2014 | 137 comments



So I skimmed the article once, then read it again.

I see the writer is here. I just want to say I understand your sentiment, but your title and content are filled with some annoying misinformation and discrepancies.

First of all, Nastaliq is not "the Urdu script". If you even skimmed Wikipedia, you would notice that the script does come originally from Arabic (where it is used, but more to your point, sadly not as much in print media, at least from my experience in the GCC) and was later well loved by the Persians, and later different portions of the Indian subcontinent (India, Pakistan, and beyond). To call it Urdu script, just because it is popular, and to insinuate that Naksh is the default script of Arabic (it is not the preferred one, by the way) is a huge stretch.

Another point: Naksh was invented by a Persian, but one who spent his whole early life in Arab Baghdad and was an Abbasid court vizier in modern-day Iraq. He did not spend time in Iran (then Fars, as it is known today in Arabic), until he was a tax collector there after growing up in Iraq, and only for a period. He might have been ethnically Persian, but to call Naksh script Persian because of the guy's family background is far-fetched, especially if you know how loose the cultural boundaries between Abbasid Iran-Iraq were (they were part of the empire and people, goods, and ideas freely flowed between them). It is a derivative of Kufic script, and has strong ties to the religious intellectual history of Iraq (Basra and Kufa were the seat of a massive amount of religious scholarship and still prominent beliefs in Islam; scripts and art were an awesome by-product).

Sorry for the rant, but I know this is probably not your field. As a guy who spent a lot of time studying Arabic and Arab and/or Islamic history, I get very annoyed at the misinformation peddled by people about Arabic history and culture. I do not mean to be so blunt, but I hear people rattle off misinformation like this often, and it irritates me.


You are correct about the history. Unfortunately the Abbasid empire came to an end in 1258 AD and Urdu wasn't even around then.

I take your point that naskh has been widespread. And it is true that it is even in the Indian sub-continent. Sindhi, for example, is a naskh based language. Punjabi, which my parents spoke, on the other hand, prefers nastaliq, and regionally the two places are adjacent. That doesn't mean that those people didn't historically play around with scripts. They did.

However, my piece is entirely about what is going on today. Sure you can find Ottoman era signs in Egypt and the Levant that display Arabic in nastaliq (my readers sent me plenty such pictures), but by and large Arabic today is written in naskh and almost never in nastaliq. Take a look at some of the fonts that the Omani government is playing around with. They are not nastaliq. This is the political aspect I briefly touched upon in the article. I hope you will get a chance to look into the Arabization aspects of the political debates raging in that part of the world. I make brief mention of it by bringing up the fight over "Khuda Hafiz" or "Allah Hafiz."

Meanwhile, the past two or three generations of Urdu readers and writers grew up associating nastaliq and Urdu almost exclusively with each other (thus the jarring effect associated with having to read Urdu in something else). I have plenty of emails testifying to this from people from my parents' generation and some people of mine.

In short, no one is really talking about ancient Arab history here, as fascinating as that would be.


I understand you might find my comments on history annoying, but I was pointing out a few things because the average reader is not familiar with such things, namely these scripts are not specific to Urdu despite Urdu reading-writing communities appreciating it and there is some undertone that Arabic scripts of lesser beauty are being forced upon Urdu readers and writers. All of them come from Arabic (not just the letters, the scripts) as well, and as an Arabic speaker I find that attitude strange. I could be reading too much into it.

And thanks for your notes re politicization of Arabic and script. I happen to live in that part of the world, hence I know.

Off-topic: saw your bio blurb and saw what you write about. Sounds cool. I will definitely try to read your stuff when I have some down time.


You write: "All of them come from Arabic (not just the letters, the scripts) as well, and as an Arabic speaker I find that attitude strange. I could be reading too much into it."

I think that's where our disconnect lies and I do think you're reading too much into it. Urdu is not a child of Arabic. The word itself is Turkish, and the old story about Urdu's genesis is that it amalgamated during the military conquests of Babar, a Central Asian Turk who was funded by the Ottomans but operated independently. A great percentage of Urdu was Turkish (I once heard up to 40% of the words are Turkish), and it contains many Sanskrit words as well. This notion that Urdu is a "Muslim" language, and because of that, it has some special relationship or descent from Arabic, is more of a 20th century phenomenon having to do with Pakistan's affiliation with the Arab world as a result of becoming an Islamic Republic. That's what I think. You are welcome to try and persuade me that Urdu is some kind of Arabic-lite. But in my experience except for a very small minority of very recently Arabized Pakistanis, I have never heard of that view. What I have seen more frequently, however, is resentment against the idea that Urdu is Arabic's baby. It's kind of like how the Anglicans would resent being declared Children of Catholicism. Sure, without revolting against the Church there would not be Anglicanism, but that doesn't mean that an Anglican would like to be told, "you came from Catholicism."

I will give you an example. I once had an Arab yell at me for pronouncing Eteraz with a zay ending. He tried to show me that since in Arabic Eteraz ends with a dawd, and since a dawd is a hard sound, I should pronounce it Eterawd. I gently reminded him that for me it comes from Urdu and Persian and I will pronounce it with a Z. That may irk an Arabic purist. But only if he thinks that all languages written in Arabic-script should follow Arabic's rules, which makes no sense. By that logic, Indians should demand that all people who count with a zero in their system should pronounce their numbers in Sanskrit.

I did not intend to write this much in response to that one sentence, but your point of view is something I've experienced frequently enough that I felt inclined to respond longer. It is not just Arabs who say things like this, too. As I mentioned it now includes Pakistanis themselves, who consider themselves "of Arab culture." An example of these Pakistanis are kids in Lahore, Pakistan, who have a license plate tag that reads, "al-Bakistan." They want to Arabize. In you they would have an ally, I imagine :).

I appreciate you looking into my other work. If you get a chance to check out the short story collection, you will find a pre-Islamic story set in Mecca there. There is also a story set in an imaginary state called Islamistan, and there I adhere to Arabic rules of transliteration instead of Urdu or Persian i.e. Dhulfaqar instead of Zulfikar, etc.


> You write: "All of them come from Arabic (not just the letters, the scripts) as well, and as an Arabic speaker I find that attitude strange. I could be reading too much into it."

>

> I think that's where our disconnect lies and I do think you're reading too much into it. Urdu is not a child of Arabic.

Ok. I think I should have been more clear. Urdu is not Arabic. I specifically mean that the script, Nastaliq, and the Arabic alphabet as used in Urdu are from Arabic (even though, as we are both aware, letters were added to compensate for the lack of certain letters in the original Arabic alphabet, re your Pakistan->Bakistan comment).

As for your insights in pronunciation, I LOVE the irony here. I was also studying Farsi for a while, and I routinely had trouble with this disparity, as I had to relearn sounds. It is hard to shake as the eight plus years of Arabic will not go away overnight. I also had friends who were advanced Farsi students with the opposite problem. And I have a father-in-law who teaches Arabic, and unfortunately I hear students and him confirm the religious Pakistani students are poor performers and/or argumentative in class: they have trouble realizing the difference between their knowledge of Arabic vocab and pronunciation through Quranic Arabic, Urdu, and their complete inexperience with formal Standard Arabic (which is different from Quranic). I have also been party to these arguments, and heard of them. I have also been silly with my stuff.

In any event, I am amazed I am having this conversation on HN. I thought there would be such little interest in this topic. I am glad to have talked to you and others about this. I thought no one cares.


    It's kind of like how the Anglicans would resent being
    declared Children of Catholicism. Sure, without revolting
    against the Church there would not be Anglicanism, but that
    doesn't mean that an Anglican would like to be told, "you
    came from Catholicism."
This doesn't touch your main point, but I don't think most Anglicans have trouble with the idea that it comes from Catholicism.


Very true, hence the term 'Anglo-Catholic':

http://en.wikipedia.org/wiki/Anglo-Catholicism


I'm at a loss to find where 616c in any way talked about Urdu as a language as descending from or owing anything to the Arabic language. All of his comments were specifically about the script. Unless 616c has radically edited the post since you responded to it.


If your goal here is really to "reject[] the cultural Arabization of South Asia", have you considered writing your Urdu in... devanagari? Using nastaliq is already a pro-arabization move.


Apparently Nastaʿlīq was (a) developed in Iran (which is not an Arab country), and although it was developed as a combination of Nasḫ (Naskh) and Taʿlīq scripts, (b) it was never really used in the Arab lands, but in the Persian, Turkic, and South Asian spheres of influence.

So if the people in Pakistan feel that Nastaʿlīq is "their own" writing system and Naskh symbolizes Arab imperialism, in my opinion they have all the rights to feel that way.

Really, even if the script evolved from Arabic script, that was 700 years ago, and what matters in this matter is how the people feel about their letters today.

http://en.wikipedia.org/wiki/Nasta%CA%BFl%C4%ABq_script


> what matters in this matter is how the people feel about their letters today

Sure. It's a political decision. alieteraz even states that the reason he feels strongly about font choice is politics, and observes that current Pakistanis (overall) want to Arabize more than they already have. But for some reason, the article's arguments amount to "naskh is ugly", which is completely irrelevant to the dispute. (There's also a weird idea that nastaliq presents such insurmountable technical challenges that even if Pakistanis strongly preferred it, for example if naskh was viewed as an affront to Pakistani culture, there would still be no reliable way to type in it. And also that various organizations, from Pakistani governmental departments to Microsoft, have already implemented nastaliq fonts.)

This reminds me quite strongly of Megan McArdle's observation that when an election is actually about gentrification, the attack ads focus on ineffable concepts like "respect", because it's not felt to be workable to say "we want the whites to stop moving in, and in fact we also want the ones who are already here to get out". If you want to fight for Pakistani separation from the Arab cultural sphere, say so, and point out the problems with Arab cultural affiliation. If alieteraz is a true weirdo and Pakistan's choice of font matters more to him than its cultural orientation, he still needs to make the argument for separating from the Arab world, because, as he points out himself, what's actually happening is that the people of Pakistan want to Arabize further.


The "people of Pakistan" are not a monolith. So some of them want Arabization and some do not and some want Westernization and so forth. As for the ugliness of naskh, it's not that naskh is ugly, it's that the naskh based mutant on digital devices is ugly. Also angular Urdu, generally speaking, is dissonant and hard to read for Urdu speakers. Here is an email I got since last night:

Hey Ali,

I loved your essay on The Death of Urdu Script. Nastaliq was the default script when reading Urdu books in schools. Therefore, I am so used to reading Urdu in Nastaliq that I cannot read properly in Naskh. And because I am a tech entrepreneur who works online, I am forced to read Urdu in Naskh. Therefore, over the years I have lost interest in the Urdu language.

If Nastaliq is revived and becomes a common script online, I will be sure to read Urdu more.


Psychologists in the Dark Time did experiments that showed people couldn't read ALL CAPITAL LETTERS as easily as they could read lowercase. From that, in a wild leap of imagination, they concluded that lowercase letters were easier to read than capitals.

Unsurprisingly, that was garbage. Later psychologists easily documented that the reason test subjects read lowercase text faster was because they were used to it -- the advantage of lowercase quickly erodes as the subjects gain practice reading capitalized text.

What I'm saying is, this problem:

> I am so used to reading Urdu in Nastaliq that I cannot read properly in Naskh

is imaginary, not real.

> The "people of Pakistan" are not a monolith. So some of them want Arabization and some do not and some want Westernization and so forth.

Note how I summarized you as saying "current Pakistanis (overall) want to Arabize". Who's winning? If you're angry about being on the losing side, whining about ugly fonts is misplaced. Whine about the thing you're actually upset about.


>>Who's winning? If you're angry about being on the losing side, whining about ugly fonts is misplaced. Whine about the thing you're actually upset about.

Now you're just being mean. He's advocating (not whining) for a positive step forward. A step that has some political overtones, but also some cultural and artistic merit.


As I read it, he's not advocating for a step forward, he's advocating against taking a step backward. He wants things to stay as they are instead of shifting toward greater Arabic identification.

However, he's chosen to make his battle over an issue of no significance whatsoever. Because it's so insignificant, it's a proxy that reflects the issue he actually cares about, the cultural orientation of Pakistan. But the causality here is all in the other direction. Preventing people from taking umbrellas with them when they leave the house won't stop it from raining, even though people taking umbrellas with them when they leave the house reflects that it's likely to rain later. This is both dishonest and fundamentally misguided. I don't appreciate the dishonesty, so I'm taking a snappy tone. But I'm also offering what I see as good advice -- that if he actually cares about the culture issue, he shouldn't go taking dramatic stands on fatuous proxies. He should make arguments that address the culture issue. Political battles happen over innocent proxies all the time, but the proxy battle is never won on its own terms; it's won or lost based on how the issue it was a proxy for ended up.

So:

1. If alieteraz cares about the culture issue, fighting over a font won't win the battle or lose the battle. It's wasted energy, and it will anger people looking for sincerity in their position pieces. (Personally, I have no intimate connection to Pakistan, broadly but weakly believe that it would be a bad idea for it to shift towards the Arab world, and place great value on sincerity and honesty over BS.)

2. If alieteraz really, truly cares about the font, identifying font choice with Arabization is the worst thing he could possibly do, as it will likely tie the outcome to whether Pakistan decides to make the cultural shift.

Therefore, 3. In no case is this essay a good idea.

Furthermore, I don't get at all what he's trying to say about the technical difficulties of implementing a nastaliq font. He seems to contradict himself in multiple ways within the body of the essay. I don't like that either.


How do you know which percentage of Pakistanis want to Arabize? Do you have some sources?


Note the frowny face. He's a Pakistani.

https://twitter.com/kamilwaheed/status/481792094135009280


Plenty of people in India do this.


Yes, I know, but Urdu written in devanagari isn't called "Urdu", it's called "Hindi". The only thing reifying "Urdu" at all is the choice to make a cultural affiliation with the Arab world.


Especially around Lucknow, there are certainly vocabulary differences, and some very minor grammatical ones (matching numbers in verbal phrases, some nouns changing gender, but overall yes, I see your point). I would be interested in your opinion of what Kamleshwar's novel Kitne Pakistan was written in or Munshi Premchand's early work published in Devanagari?


> I would be interested in your opinion of what Kamleshwar's novel Kitne Pakistan was written in or Munshi Premchand's early work published in Devanagari?

You've exceeded the bounds of my knowledge, but I can provide some generalities. Writing systems can have no effect on the, um, "identity" of a language, since it's quite possible for people to be illiterate. So to me, saying something is written in Urdu, or saying something is written in Hindi, are exactly the same claim about the language involved. I have been taught that Urdu is written in Arabic letters and Hindi is written in Devanagari, and since that is the only difference between the two, iconic cultural works of Pakistan, if written in Devanagari, would fairly be described as being written in Hindi -- they have all of the defining characteristic[] of Hindi and none of the defining characteristic[] of Urdu.

The existence of geographically-based variation in vocabulary and grammar, especially minor variation, doesn't make for a claim that the languages are different. In Georgia (and, so I'm led to believe, other parts of the American South, but I can speak to Georgia from personal experience) it's possible to combine modal verbs, which is grossly ungrammatical in standard American english. Using just examples from country music, it's also possible in southern dialects to use verbs that don't exist in SAE ("I ain't [...]") or to conjugate verbs in impossible ways ("He don't [...]"; "We was [...]"). There is no serious case to be made that they're not speaking english in the South, though (Jamaica would be a more interesting case).

The only other place I'm familiar with where the alphabet (term used loosely here) is held to, under its own power, define the language, is China. I actually had a dispute some weeks ago on HN with someone advancing the argument that the "Chinese language" is the writing system, and any spoken language is a dialect of "Chinese". Despite its total incoherence, this is essentially the "party line" in China, and a very common view among people educated in the Chinese school system. So here are some examples I used there:

1. If the english sentence "How old are you?" is written as 多老是你? (pronounced "how old are you?"), english has not become a dialect of Chinese.

2. If the mandarin sentence 你多大? is written "Ni duo da?", mandarin has not ceased to be a chinese language, nor has it become a dialect of english (or turkish, or any other language written with roman characters).

3. Considering two illiterate Chinese people, one of whom speaks mandarin and the other hakka, is it correct to say, since they're both illiterate, that neither knows any language at all?

It should be easy to see that choice of writing system is orthogonal to what the language is that's being written (or, in example 3, not written). Chinese politics dictate that different languages with 0% mutual intelligibility be made to appear "the same". What we see in Pakistan is the opposite impulse, that the same language be made to appear different for political purposes. But this has no basis in reality; Urdu and Hindi, with their 100% (!) mutual intelligibility, are the same thing. Arabic letters don't make Urdu different from Hindi any more than they make it similar to Arabic.


Ok then, why do I regularly encounter Hindi words like "saphalta" and "viruddh" that I have never encountered written or spoken anywhere in the Pakistani mediasphere and have to resort to a dictionary for (when there are perfectly good 100% mutually intelligible equivalents "kamyabi" and "ke khilaf/mukhalif hona/rokna")?


You know, for my whole life people have been complaining to me that I use words they don't understand. Words like "quixotic", "propriety", and "fee schedule" (those aren't invented examples; they all happened recently enough for me to remember). They've never suggested that my vocabulary actually means I speak a different language, though.


As a person who grew up reading Urdu in Pakistan- I can attest to the fact that I associate nastaliq with Urdu exclusively. Regardless of the history - his post is about what's currently happening with the Urdu script. If given the opportunity to read Urdu in nastaliq, I would probably read Urdu more often. I have tried reading BBC Urdu various times, and have given up due to the naskh script.


I can see the value in said script, but I have to live a more practical lifestyle: as an IT guy, if I could stop reading English, Arabic, or Persian every time it is in an ugly font, I would.

And if you want to see ugly fonts, try Arabic news sites. BBC, Nastaliq or Naksh, have nothing on Al-Jazeera or Al-Arabiyya.


According to Wikipedia, the difference between Nastaʿlīq and Naskh is much more than just a different font.

And also, according to this discussion, I have also learned that Nastaʿlīq is deeply connected to feelings of origin and pride in (at least some) Urdu speakers, and these are delicate matters.

So why would you even make such an obnoxious, and totally incorrect, reduction of this matter to differences in mere font choices?


> Sindhi, for example, is a naskh based language. Punjabi, which my parents spoke, on the other hand, prefers nastaliq, and regionally the two places are adjacent.

Saying that Sindhi is Naskh based language while Punjabi is Nastaleeq based language is like saying English is Arial based language while French is Times New Roman based.


The difference is considerably larger than Arial vs. Times. I think it's closer to discussing, in earlier centuries, which European languages used blackletter scripts versus Latin-style scripts. Which is something people actually do discuss, and at times it was a major political issue.


Respectfully, you are ignoring European history where certain languages were pretty much exclusively rendered in certain "scripts." This was also the case -- and still kind of is -- in the subcontinent. Anshuman Pandey has an interesting anecdote about how the British tried to "standardize" a script in the early 20th century, only to discover that everyone wrote in different scripts and didn't want to change. In the end, the British tried to force everyone to write in Roman letters, but that failed too.


This is rather confusing. IIRC both Arial and Times New Roman let you express all 26 letters. Now if Naskh and Nastaleeq have differing numbers of characters, is this a mere font issue or is there something more fundamental differing between the two?


You got it wrong. European Latin-based scripts are not identical. They have, besides the basic Latin set of letters, different diacritics. The original Latin alphabet has/had fewer than 26 letters, and the European languages had to adapt (either by supplementing it, or by expressing sounds through combinations of existing letters). Now, a font may or may not have all the European letters, but that makes it merely incomplete from a language's perspective, not that it belongs or not to a given language.


There are two issues. One is that some operating systems (iOS) don't offer the full alphabet. As for those that do (Android/Windows), they don't offer the right script. Hope that helps.


restalis, FYI you have been hellbanned, so no one can read your posts unless they are logged in and have "showdead" turned on.


I think it's because I've touched political matters in a post a while ago: https://news.ycombinator.com/item?id=5166069


Another reason why some prefer Allah Hafiz over Khuda Hafiz is 'Khuda' is a word Zoroastrians used to refer to their 'God' and not Allah, the one true God revered by the Muslims.


I find it interesting that many variations of Kufic script, which as far as I know is the 'original' Arabic script, seem to be almost ideally suited for print/digital reproduction.

With the square normalized forms it almost looks like QR codes: https://www.google.com/search?q=kufic+script&tbm=isch

This is in contrast to modern Arabic writing which seems to be geared to handwriting, with its flowing and connected shapes, kind of like old Latin cursive scripts.


Hardly optimal, to be honest. The QR code look is because no diacritics are used at all, period. Those dots actually signify, you guessed it, different letters (ironically, they are grouped in Arabic by articulation: b, t, and th are letters in Arabic, same shape and different dots; their order in the alphabet is grouped because you make them with the same part of your mouth).

Problem with that is, grokking text is difficult without a lot of practice. I am not a native Arabic speaker, and reading that stuff, even if I can read novels, the Quran, and weird pre-Islamic poetry (vocab is very different) would take me days to decipher if written in that script.

What is even funnier is that there are not even short vowels in Kufic script. So after I learn to read and know contextually which letters are which, I would only know the consonants. And the vowels in Arabic are amazingly important (same consonants, different vowels = same word root, but the meaning can change from book to desk to library to writer, for example), so Kufic is an easy way to bring an Arabic student to suicide.


just a minor correction - the script is naskh from the triliteral root n-s-kh (نسخ - copy) not naksh from the root n-k-sh (نقش - engrave)


This is Ali Eteraz. Thank you for recognizing this piece. It was a lot of fun to write. I've actually been to a Ycombinator event. Twice I think. A couple of your alumni managed to talk me into downloading stuff or getting on email lists that, well, I didn't always use (though they did seem promising!). I am a writer based in the Bay Area and I really enjoy getting to learn about interesting technological developments and human stories within that. So definitely hit me up whenever you're congregating or plotting something interesting. My website is alieteraz.com or I am at @eteraz or FB.

Ali Eteraz

ps - I later got to meet Michael K. at the Unicode conference in Santa Clara, along with the Microsoft Windows team. The Windows Phone people, who I also really wanted to talk to because this is a mobile problem too, did not want to talk to me because they had a new phone coming out.


There is another problem with fonts. Popular fonts used on the internet only have information about the joining of the 28 Arabic characters. When the extra characters in Urdu are typed they don't join with the other characters in the word. E.g. I will type the first line of the couplet here. Note how the third, fourth and seventh words are broken up. اور باذار سے لے آے اگر ٹوٹ گیا

I have written about some other issues with the Urdu language over here. http://upgoerurdu.nfshost.com/technical.html
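To make the point about the extra Urdu characters concrete, here is a rough Python sketch that lists which letters in a piece of text fall outside the basic Arabic letter range. It rests on some assumptions that are not from the comment above: the range U+0621 to U+064A is used as a stand-in for "the 28 Arabic characters" (it actually covers a few more code points), the helper names are made up for illustration, and the sample string is a hypothetical two-word example written with explicit escapes. It only inspects code points; whether a letter actually joins on screen depends on the font and the shaping engine, which this does not check.

```python
import unicodedata

# Assumption: U+0621..U+064A approximates the basic Arabic letters.
BASIC_ARABIC = range(0x0621, 0x064B)
# The Unicode "Arabic" block, which also holds the letters added for Urdu, Persian, etc.
ARABIC_BLOCK = range(0x0600, 0x0700)


def extended_letters(word):
    """Return letters in `word` that are in the Arabic block but outside the basic range."""
    return [
        ch for ch in word
        if ord(ch) in ARABIC_BLOCK
        and ord(ch) not in BASIC_ARABIC
        and unicodedata.category(ch) == "Lo"  # letters only, skip marks and digits
    ]


def report(text):
    """Return (word position, [code points]) for every word containing extended letters."""
    result = []
    for position, word in enumerate(text.split(), start=1):
        extras = extended_letters(word)
        if extras:
            result.append((position, [f"U+{ord(ch):04X}" for ch in extras]))
    return result


# Hypothetical sample: a word using only basic letters, then one starting with U+0679.
sample = "\u0627\u0648\u0631 \u0679\u0648\u0679"
for position, codes in report(sample):
    print(position, codes)
```

A font whose joining tables cover only the basic range would have nothing to say about the code points this reports, which is one plausible reading of why those words show up broken.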


Unsurprising to see a link to Michael Kaplan in there. His blog was taken down due to what sounds like a dispute over patents with his boss. See this twitter conversation between Spolsky and him: https://twitter.com/spolsky/status/447024470256283648.

--

"It has been made officially clear to me that [my] Blog ... is for all intents and purposes dead."

Well, I blogged about a patent and included art, so my former manager took the Blog down with extreme prejudice.

--

Blog is now archived at http://www.siao2.com/ and has much more information on localisation.


Wat? I thought the whole point of patents is to make information public, because competitors are simply not allowed to use the information without payment anyway.


> Utility had defeated tradition.

Is that supposed to be self-evidently bad?


Sadly popular opinion seems to be that languages and other traditions must be preserved because, well, we can't lose them. It's certainly good to preserve them in museums so people can use them for research, etc. But there's little or no actual need for more than one language in the world, let alone more than one writing system for the same language!

People don't mourn the loss of codepage 437. For many people that's a big part of their growing up. But now it's not in common use and the world is better for it. We don't see people trying to slow the adoption of UTF8 because it's making the world one big dull homogenous borg that's unable to express box drawings.

Nowadays, there's a quiet revolution going on in the form of many small dialects disappearing. That's wonderful. It's giving millions the opportunities that come with being able to communicate with more other people by sharing a common language. We should be celebrating the massive advantages that keep growing each time another obscure script or dialect becomes nobody's native language.


I think the concern is more the loss of literature written over the ages in a language. Equally importantly, language - as an extension of culture - shapes the way people think. Different people thinking differently is how problems can get solved, instead of the monstrous uniformity you seem to be envisioning.


This doesn't make any sense. Ancient literature will be lost no matter how many languages the world holds at any time -- language change means that it isn't possible to understand ancient works. Sure, it's easier for an English speaker to pick up Old English than Ancient Egyptian, but both take years of effort.

> Different people thinking differently is how problems can get solved, instead of monstrous uniformity you seem to be envisioning

People think plenty differently while speaking the same language. They don't think any more differently if they speak different languages.


Are you bilingual? If so, then perhaps you'll know what I'm talking about.


I detest this view. Typical navel gazing, utility-uber-alles (at least for things that "I" am not interested in) programmer who is more concerned with some kind of compatibility (of course using IT examples) than anything else.

Even if we only strictly consider languages themselves, what if some people find pleasure in them? You can't say that what they enjoy doing is wrong with some argument for its utility - because the use and consumption of that language is a utility in itself. Neither can you confine them to these "museums" since they might want to use it outside of the halls of dusty books. Then, so much for "one and only one language".

Now, considering not just the language but culture. Languages are so incredibly context dependent and vague that I think they are deeply intertwined with culture. Outside of whatever you can laboriously define and enshrine in some book to stuff in a museum, there are incredible nuances that can be hard to appreciate if one is not involved with it, through things like social interaction and the written word. If you lose or discard the language of some culture, then you might also lose an appreciation for a lot of subtleties and ways of thinking. Words and expressions that might seemingly translate easily may have nuances that only an active speaker of the language can appreciate.

Thirdly. Consider if we with one magical button could make everyone in the world speak one language, say English. Now we are all on the same page! Then fast forward a few units of time - are we still speaking the same language? Probably, but we might have developed distinct dialects in different corners of the world. Why? Because every society and culture practising the language "evolved", namely added new expressions, words, phrases. Even pronunciation (accents). Why didn't the language simply evolve in the same way, worldwide? Of course because of the fact that the world is made up of a lot of different places and cultures. All of these cultures affect each other, but it's like a web of intermingling sharing. There is, largely, not a single, dominating culture, and so there is no central "authority" on how the language is going to evolve naturally. But you need a single "reference" point for this language to evolve in a uniform way, and avoid a lot of dialects! So if the goal is to have a single language, and a single dialect, your only bet is cultural imperialism. But culture is just a silly thing that just gets in the way, right? Better just stuff them in a museum and adopt a single, standardized culture. It's so efficient!

Given enough time, these dialects might turn into their own languages (let's say that languages are now distinguished by not being mutually intelligible with each other). Then you're back to the same problem, basically. Well, it's a lot better since all of these languages are closely related. And maybe these languages would never really become their own languages, since there is sufficient inter-cultural, global communication to avoid any significant divergence. But to eventually have no dialects, no linguistic misunderstandings? Something's gotta give, namely everyone has to adopt the same culture and palette of beliefs and values. But I guess that kind of robotic efficiency would be a Utopia for you.


Why preserve culture? Your answer still comes down to "because we can't lose it". The other side of the coin is it's being done at the expense of forcing all minority language speakers to be economically disadvantaged. Great news if you're an American, not so great if you're a Nepalese. I have to disagree with you there. You haven't shown any reason why a member of a minority culture should be excluded from most of society and prevented from even learning modern knowledge, reading literature, or even understanding the ways of thinking of most of the world.

This is why I favor storing languages in museums, rather than burdening millions of innocent people with the job they didn't even sign up for.


> Why preserve culture? Your answer still comes down to "because we can't lose it".

Why have cultures? Because it increases the diversity of thought and ideas. This diversity in turn can inspire each other and create new impulses, ideas, ways of thinking. This diversity is severely limited - and in turn how much it can influence and inspire other cultures - if they simply disappear. It's like taking all the work and time it took, for perhaps thousands of years, to create that culture, and simply destroying it.

Another reason is that I think that it's a shame for majority, dominant cultures to simply swallow up minority cultures. If the people of that minority culture want to preserve it, I think they should stand a chance.

Lastly, I think it's a worthy end unto itself. This is the end-road of Socratic reasoning. Would you say that it is frivolous to just say "because it is worthy in itself"? Well, this is how all arguments end up, anyway. Just like your argument might end up with the assumption that "convenience and less struggle to communicate is worthy in itself". But then I can say, "a little struggle and inconvenience in communication is part of the fun! We should preserve our differences for that reason".

> The other side of the coin is it's being done at the expense of forcing all minority language speakers to be economically disadvantaged. Great news if you're an American, not so great if you're a Nepalese. I have to disagree with you there. You haven't shown any reason why a member of a minority culture should be excluded from most of society and prevented from even learning modern knowledge, reading literature, or even understanding the ways of thinking of most of the world.

Nonsense. Do you know how many people speak two, three and so on languages? How many people who speak a minority language in some country, also speak some majority language? The same goes for speaking a native, minority, or whatever language, and speaking English (or perhaps French or Portuguese etc., depending on where they are in the world).

I have to emphasize that I would never condone forcing people to practice and learn a language. It should be by their own volition. If they don't want to practice it, then that is of course entirely up to them. Just as I wouldn't force them or encourage them to give up their language (and perhaps in turn also their culture) in favour of some international standard language.

> This is why I favor storing languages in museums, rather than burdening millions of innocent people with the job they didn't even sign up for.

Is that so.


Indeed. I feel like languages/scripts dying out has some advantages too.


Languages do evolve and die. English itself is changing all the time. Same goes for Urdu. The question here is whether the changes are organic or is there simple oversight going on. My view is that the problem is of oversight, and of disregarding the Urdu speaking and writing user. If the world's most popular operating systems were designed in Pakistan or Iran and they were not offering enough English language letters as a default on the devices, I think we would be plenty upset.


"Organic" change is little more than the result of repeated oversight. It is often triggered by changes in hardware availability.

Scripts that were developed in the age of stone tablets will change when the tablet and chisel are replaced with paper and brush, and again when the brush is replaced with a fountain pen, and again when the whole setup is replaced with a computer. People will naturally adopt the style that allows them to convey meaning the most conveniently in a given medium.

Chinese, Japanese, and Korean were all written vertically from right to left (|||) until only a few decades ago. Now they are very often written horizontally from left to right (≡). A lot of factors have contributed to the change, such as the ease of embedding Western, mathematical and scientific symbols in a CJK sentence, the need to keep one's hands clean when writing with a modern pen, and of course the introduction of computerized typesetting.

Nobody forced anyone to write horizontally. People just gravitated towards what they found more convenient given the availability of new writing tools. And of course, many artists and calligraphers continue to write vertically when it suits their intentions better. Even better, new calligraphical techniques have been developed to take advantage of unique opportunities that horizontal writing offers. As long as everyone gets what they need (art for artists and utility for utilitarians), I don't see any problem with this arrangement.


The question is whether languages are really worth preserving. I'm not a native English speaker yet I don't really have much emotional attachment to my first language. In fact, my life would have been easier if I grew up speaking English.

And it's not that I think that English is somehow inherently a better language, it's more that there are a lot more people speaking it. It's kind of like if you are deciding what open source project to use. Having a big community around it is a major deciding factor, if not THE deciding factor.


Although English is seen as the lingua franca of the 21st century, it wasn't always like that.

In the several millennia that we have been able to speak and write, lots of languages have taken that role.

And even English is not as common as many people think, just traveling around the world will teach people that there are many places where knowing English will be of no help.

Besides languages represent culture, not all concepts are expressible across all languages and they are also a door to our past as mankind, sure they are worth preserving.


> Although English is seen as the lingua franca of the 21st century, it wasn't always like that.

I'm aware of that.

> And even English is not as common as many people think, just traveling around the world will teach people that there are many places where knowing English will be of no help.

I guess it's not just about sheer numbers but also about what sort of stuff is produced in the language. E.g. the expert books are mostly written in English. You don't even have to go that far, e.g. there aren't all that many, say, compiler books that were not translated from English, that are up to date and not terrible. This is for me more important than being able to have a discussion with a Bhutanese farmer.

> Besides languages represent culture, not all concepts are expressible across all languages and they are also a door to our past as mankind, sure they are worth preserving.

Well it's a tradeoff. And I think that at some point the cost of preserving them outweighs the benefits.


> the expert books are mostly written in English.

This depends very much on the domain.


Just out of curiosity, what concept is not expressible in English?


There are lots of examples. Citing just two off the top of my head.

Feierabend (German) - expresses the concept you are finished with the work duties for the day and can enjoy your private time with family and friends

Saudade (Portuguese) - a mixture of loneliness, melancholy, a sense of losing part of your feelings, even mixed with a kind of sad joy, while remembering something that is no longer there.


You did a pretty good job expressing those concepts in English in your post. Sure, there's no single word for each of those expressions, but that only means they're not that popular in everyday usage.

If they were popular enough, we'd say: English does have words for those ideas! They're "feierabend" and "saudade".

Exactly the same as "gesundheit".


Thanks for the compliment, however I feel from the linguistic point of view, I explained them.

Expressing means there is a similar word.


The Turkish "huzun" is a lot like saudade.

You should compare Orhan Pamuk's treatment with Fernando Pessoa.


Thanks for the hint. Although I imagine reading a translated version won't be quite the same thing.


You just expressed these in English.

Although you might spend far more words on it, and it might be hard to properly capture abstract concepts in another language, these are not examples of concepts that cannot be expressed in English.


> You just expressed these in English.

From linguistic point of view, I explained them.

Expressing means there is an English word with the same meaning.


What definition of "to express" are you using? Because expressing does not come with a single word restriction that I know of. I think you mean whether or not there is a translation of a certain word?

We were talking about concepts being "expressable" in any language. Of course many words are not directly translatable (half of Chinese isn't, that isn't news to anybody), so that's not particularly interesting. But you'd be hard pressed to find a concept that is not at least explainable in English.


Tamil and probably many other Indian languages have various examples: Love as in parental (anbu), as between couples (kaadhal). "How many -eth" (ethanaiyaavathu) child are you to your parents? Many forms of family relationships: younger brother (thambi), elder brother (annan), etc.


Those concepts seem to be expressed just fine. The only difference is that you used multiple words instead of a single word. (I'm a native Tamil speaker.)


> Just out of curiosity, what concept is not expressible in English?

My native language doesn't force feed gendered personal pronouns on me, so when other languages (English, Swedish, German etc.) do, I feel they are forcing me to be sexist.

(The Swedes are actually trying to take some steps fixing this aspect of their language, by introducing a gender-neutral pronoun.)

Even worse (i.e. feels even more unnatural) are the languages with gendered nouns (German, Spanish etc.).


Interesting — my native language is Portuguese and it suffers from having gendered nouns. Writing in English to me is an improvement in that sense. I didn't know there are other languages which are even better in that respect. What is your native language?


Finnish. But there are others, like Armenian, Basque, Bengali, Chinese, Estonian, Hungarian, Japanese, Korean, Persian.

http://en.wikipedia.org/wiki/Gender_neutrality_in_genderless...


I feel much the same way (though, considering I learned English at the age of 4 and never really learned much more of my parents' language after that, calling English my second language is grossly misleading). I feel a vague attachment just because my grandfather was a professor of Bengali literature and I'd like to be able to read his work, but I don't sweat over the fact that my grasp of the language is very basic. I don't need to use it too often, and people back in the "motherland" can worry about preserving the language or not.

The analogy to open source projects is spot on; while you might have some "emotional" attachment to a library/technology that you learned early on, you probably wouldn't make significant decisions based on that. I learned to program in VB5, yet I dropped it as soon as I learned Java because it was just an awful language (though I guess human languages don't really have notions of superiority or inferiority like some programming languages might).


The analogy to open source projects is spot on

Except that when an open source project dies, its code remains in full -- to study, take ideas from, or even potentially be revived at a later time.

When a language dies though, it's pretty much gone. Even if it's received a lot of attention from field linguists, it's near impossible to fully codify the grammar and document the nuances of any language.


I doubt that a language dying in 2014 would be completely gone from all record.

Almost all the languages that we "lost" did not have a rich literature or sometimes even an alphabet.


There are plenty of extant languages that don't have an alphabet either. They will be "lost" in every sense of the word.

The # of languages currently spoken is typically pegged somewhere between 3000-6000 and it's thought that we lose around one a week. Most have almost no documentation whatsoever, and even so a couple papers written in the 1970s aren't going to capture any potential unique aspects of a language let alone allow for reconstruction at a later point.

The analogy to code is way off.


Just a guess: (1) It's maybe difficult to separate one missing their country, and missing the language of their country. (2) Also it's maybe easier to miss the homeland, when the homeland is a rich western country, and moving back home would not mean any decrease in standard of living or safety (might even be an improvement over US or UK).


"English itself is changing all the time."

As a writing style, cursive is basically dead. This would have made an awesome analogy for the original essay. Very few Americans in Gen X and younger actually write cursive and there are efforts to remove it from the curriculum.

I can sign my name in cursive when I write a legacy paper check (a couple times a year) or sign forms. That's about it. I now write legacy paper checks in block capitals and only use cursive for my .sig. I would have to check Wikipedia to see how to make a cursive capital "F" and some of the weirder letters (what is a cursive Q?)

It presents some problems to genealogy, I have trouble reading centuries old census forms because I haven't read cursive since grade school, some decades ago. In grade school we were told we will have to write all our papers in cursive in middle school and up; not so; all typed, mostly on computers.


I don't think it's a bad thing either.

Those scripts will not be forgotten soon because of the way information is preserved in our current age. Nothing of their historicity is lost to us at all. Yet this script (and many others) do not present compelling reasons to stay in use. The same goes for Chinese characters; the knowledge of this script among Chinese youth is waning simply because these characters are not as useful as pinyin romanization is. It's "natural selection".

From my limited knowledge, there are only some "effective" scripts in the world that translate efficiently to a digital age. Off the top of my head, only Latin, Cyrillic, (other descendants of Greek) and Hangul script feature a small enough 'standardized' amount of characters to express a wide variety of information. That is what is needed to survive in a digital age methinks.


I haven't noticed that about Chinese. Pinyin and Bopomofo are useful input methods but I hardly ever see material actually written in pinyin other than children's books. My kids have never really complained about having to learn characters and find a lot of pleasure in being able to read and write characters.

Most people I hear calling for the eradication of Chinese characters are just a few non-native speakers. Of course, that is just anecdotal, I am sure there are also a few native speakers that hold that view but it seems to be a very small minority.

Post-education most people hand write characters a lot less, but they are still reading characters and producing texts with pinyin/bopomofo/cangjie, etc. There are huge amounts of information being written in Chinese with the increase of internet usage in China and increase in education. Unless there is another massive education reform in China, I don't see Chinese characters going anywhere.


These kinds of people decrying losses of languages have existed throughout the ages. It's just that now they have a large slice of the world as their audience instead of their town square/academic halls/whatever.

I say as long as we ("we" being...fuck I don't know) make a concerted effort to translate any ancient texts not yet translated into English, then let it die gracefully. The people crying for these losses will eventually pass and...well, such is life.


Translation is an admirable goal, but so is preservation. Not necessarily the "preservation" that implies that the languages will remain in day-to-day use forever; rather the "preservation" that implies that a complete (or nearly complete) grammatical description of the language, along with a relatively comprehensive vocabulary, will be recorded. That has more than historical interest, since the varieties of language and linguistic structures can tell us an awful lot about how language itself works in humans. It's sort of like doing astronomy/cosmology now (or at least recording the observations we can make now) will tell us very different things than we could possibly find out after the cosmic microwave background has become undetectable and galaxies outside of our gravitationally-bound local cluster have receded beyond the visibility threshold (where space between us and them is expanding at a rate faster than the speed of light). Before Hixkaryana was described, it was thought that OVS was "impossible". Amharic "made no sense" until Greenberg noticed something going on with adpositions (mostly prepositions, but a few incipient postpositions) that indicated that it was only partway through the process of settling into SOV from the ancestral VSO. That, in turn, indicates that there is something to the idea of markedness hierarchies, which, in turn, might be giving us an insight into something more fundamental about the way we're wired up.


I think that it is bad. Technology should help us preserve, not dictate what words and letters we're allowed to use.


For people who value tradition, yes. For people who value utility, no.


It's an interesting read and I acknowledge the intent to keep the traditional and authentic writing system for Urdu.

I have a few thoughts on this though. One would be the legibility. Of course I cannot judge legibility of writing systems I do not like, but it seems that nastaliq would be hardly readable on a lot of mobile devices and I wonder how difficult learning the ornate script is. I am talking about alphabetization here.

Next thing is: I am learning Turkish, and Turkish is written in the Roman alphabet. As far as I know, it was previously written in an Arabic/Persian script, which was then reformed to use the Latin alphabet. As far as I can tell this is really uncontroversial today, and the Latin alphabet is actually the more suitable alphabet for Turkish and its rich vowel system, which is really important for grammar and meaning. Again I cannot really say anything for Urdu, but knowing it is not Arabic but afaik a language of the Indo-European family, I wonder if there are more reasons to use the Latin alphabet than just availability of nastaliq fonts and rendering engines.

As a side note I would add a few observations relating to the cultural/heritage aspects. In Germany, the "Fraktur" was used widely even at the beginning of the 20th century (see http://en.wikipedia.org/wiki/Fraktur ). Some authors, like Hermann Hesse, refused to let Antiqua fonts be used for their writings until publishers convinced them that their works could just not be read by young folks. In a way, a lot of people argued against using non-gothic fonts, but in the end Antiqua became quite standard. Nowadays we use the Latin alphabet (and most people are not conscious that there ever was a switch).


A note about Fraktur: Even though the wiki article states that the main reason for its downfall was the Nazis forbidding it themselves, as a native German speaker I quite strongly feel it to be associated with Nazi culture - which immediately renders it impossible to use in any context, since you want to be as far from that as you can. Allergic reaction to anything resembling Nazi symbolism is quite strong in German-speaking culture, even in Switzerland where it never took hold. So it seems to me that postmodernist culture is most responsible for the sudden downfall of that script.


I associate Fraktur with Karl May :-) (He died 1912, so he predated the Nazis.)


As a native German I rather feel it associated with the German Empire - probably also not a desirable association (read "Der Untertan" by Heinrich Mann (in English this book seems to be known under different names: "Man of Straw", "The Patrioteer", "The Loyal Subject")), but it has nothing to do with nazis.


I would say it just looks really old fashioned and outdated. But of course this is a matter of perception.


Great point about German language. I have come across that again and again. And you're right, there ARE Urdu speaking youth now who are basically just as comfortable in naskh as they are in nastaliq, if not more so. This was a revelation to me and it added a wrinkle to my thinking. But I go back to my point, that we can't fully adjudge if nastaliq is meant to die off until we make at least a good faith effort to get it onto the devices.


What makes me wonder is, I would have thought that the prospects of nastaliq are increasing with electronic devices (as opposed to old-school printing presses).

What I also observe is that Arabic speakers sometimes write with the Latin alphabet, using characters like "3" to represent some missing phonemes. Urdu surely is not alone, and having a proper nastaliq rendering on all devices might not solve the bigger issue here.

Could you elaborate a bit on nastaliq and how/whether it does represent Urdu's phonemes properly? Are people considering Devanagari as an alternative?


> But I go back to my point, that we can't fully adjudge if nastaliq is meant to die off until we make at least a good faith effort to get it onto the devices.

Is that really true, though? Writing styles, fonts, heck even languages disappear for myriad reasons, but the intersection of technology and change (I don't just mean modern technology, but even the invention of the printing press itself, shipping and cross-continent contact, modernisation over the past few centuries) seems to me to be the dominant reason why they disappear. Perhaps this is just another example of that.

Of course, you can make the argument that because of technology now, there is no reason for a writing style to die off, and I wouldn't disagree! The question remains, though, whether it's worth it outside of the digitization of languages for archive purposes, if it's not used widely as-is. It's certainly interesting to think about, thanks for posting!


Although Antiqua is used almost ubiquitously in Germany, Fraktur still has its proponents in Germany (mainly among older people). Just to give one reason, which I consider as pretty convincing: Fraktur has two different small s (long s and round s)

> http://de.wikipedia.org/wiki/Fraktur_%28Schrift%29#mediaview...

while Antiqua only has one. Since in German the combinations st and sp are pronounced differently depending on where the syllable boundary falls, and since German words are concatenated by juxtaposition, this can lead to situations where writing a German word in Antiqua makes it ambiguous: Consider the word "Wachstube". Written in Antiqua it can either mean "Wach[-]stube" (guardroom) or "Wachs[-]tube" (tube of wax). If you write this word in Fraktur, it is clear which word is meant:

> http://commons.wikimedia.org/wiki/File:Wachstube.svg

I only explained the situation for the small s. There are lots of other rules, too. For details see

> http://de.wikipedia.org/wiki/Fraktursatz


You could still write Wachſtube and Wachstube. Long s (ſ) could be made available in any script, and the fact that it isn't maybe tells that it is not such a big problem.
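A minimal sketch of this point, assuming Python 3 and its standard `unicodedata` module (the variable names `guardroom` and `wax_tube` are only illustrative). It shows that the long s is its own Unicode code point (U+017F), so the two spellings are distinct strings, and that compatibility normalization (NFKC) folds the long s back into a plain s, which is one way the distinction can get lost in software:

```python
import unicodedata

# "Wachſtube": long s (U+017F) starts the syllable "stube" -> guardroom
guardroom = "Wach\u017ftube"
# "Wachstube": round s ends the syllable "Wachs" -> tube of wax
wax_tube = "Wachstube"

# The two spellings are different strings at the code point level.
print(guardroom == wax_tube)

# The long s has its own character name in Unicode.
print(unicodedata.name("\u017f"))

# NFKC normalization maps the long s to a plain "s", erasing the distinction.
print(unicodedata.normalize("NFKC", guardroom) == wax_tube)
```

So the character is available to any font that chooses to include a glyph for it; whether text pipelines preserve it is a separate question.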


There are additional rules for non-optional ligatures hinting the pronunciation which sometimes must be used and sometimes must not; see my post at

> https://news.ycombinator.com/item?id=7944134

Indeed your argument shows that long s and pronunciation-hinting ligatures are not strictly necessary. But neither are high-level languages for programming (you could code anything, say in C or assembler). Nevertheless most programmers prefer using higher-level languages over C or assembler, because they implement desirable features. So why not use a writing system which already implements such nice features by construction?


The cló Gaelach https://en.wikipedia.org/wiki/Gaelic_type had a similar history.


Minor nitpick: fraktur is not an alphabet, it is a style that some fonts exhibit. So "nowadays we use the latin alphabet" is not a change.


This is not true: Fraktur has two small s (long s and round s)

> https://news.ycombinator.com/item?id=7944021

and some ligatures that in German words have to be used or not depending on whether the letters of the ligature separate syllables or not.

Consider the German words "Tatzeit" (time of the crime) and "Tatze" (paw). For Tatzeit you must not use a tz ligature since the z is the beginning of a new syllable. On the other hand for "Tatze" you have to use the tz ligature.

So the Fraktur ligatures carry a meaning on the pronunciation of the German words that is lost when writing them in Antiqua.
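In Unicode plain text, one way to record this distinction is the zero width non-joiner (U+200C), which asks the renderer not to form a ligature across it. A minimal sketch, assuming Python 3; whether a tz ligature is actually drawn depends on the font and the shaping engine, which this code does not test:

```python
import unicodedata

ZWNJ = "\u200c"  # zero width non-joiner: requests that no ligature be formed here

# Syllable break between "Tat" and "zeit": mark it so t and z are not ligated.
tatzeit = "Tat" + ZWNJ + "zeit"
# No syllable break between t and z: a Fraktur font is free to use its tz ligature.
tatze = "Tatze"

print(unicodedata.name(ZWNJ))
# The marker is an invisible extra code point in the string.
print(len(tatzeit))
# Stripping it recovers the plain Antiqua-style spelling.
print(tatzeit.replace(ZWNJ, "") == "Tatzeit")
```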


Nitpick on the nitpick: You could not take a German text and change the font type to get proper Fraktur typesetting. It involves using some different rules (final s and inner "s"), ligatures that are context sensitive as well as different punctuation and font "markup" (no bold font but Sperrsatz, etc.).


Please correct me if I'm wrong, but if Apple and Android make a nastaliq font part of the directory of fonts that go into their devices from the get go, there wouldn't be a problem.

Such fonts have already been designed. The issue now is to find Android and iOS developers who can make them default in the operating systems.

When I talked to Microsoft they said that on the smaller devices every last bit of memory was precious so they didn't want to stick in extra megabytes or however big a font is. I think that Apple's cheaping out is worse than Android's, because at least Android offers the entire Urdu alphabet (though in naskh). Apple doesn't even offer the Urdu alphabet, requiring Urdu users to make do with 12 fewer letters.


I don't think it is just a matter of adding a font with 12 additional letters for Urdu. The actual layout algorithm has to change (how characters are laid out, their size, how the beginning of one connects to the ending of another, etc.).


Meanwhile, in the free software world...

https://docs.google.com/presentation/d/1ySTZaXP5XKFg0OpmHZM0...

https://github.com/behdad/harfbuzz/blob/master/test/shaping/...

(Harfbuzz is the default text layout engine for Linux GUI frameworks: Qt, GTK, etc)


This talk at Google by Roozbeh Pournader is very good at laying out the issues with bidirectional languages in software. This is a non-trivial problem and is rarely ever solved correctly.

https://www.youtube.com/watch?v=wOEzYefrqo4


And not so ironically, Behdad Esfahbod, one of the original developers of HarfBuzz, works for Google now, according to his blurb on GitHub.

https://github.com/behdad

I actually emailed him a few times before I discovered mlterm to read Arabic and other RTL fonts in a terminal (you might ask why: file names and I use mutt for email). Very nice guy, and I am very appreciative of this library.


That was an amazing article, and a great insight into middle eastern scripts in the digital world.

It doesn't look like it'll get out of the new queue, but I wanted to thank OP for sharing it anyway.


Pro tip: Mughal India is not the Middle East, and neither is Pakistan or Iran.

There are arabic derived scripts at least as far east as China and the Philippines (jawi), but describing them as Middle Eastern is plain wrong - just as you would not describe Balinese or Cambodian as Indus Valley.


The article talks about two Arabic scripts. Saying "Middle Eastern" is more inclusive than saying "Arabic" and while it does not include Pakistan, it does almost reach it. Kind of like when we talk about Europe and increase the reach a bit more. Or when you say "the United States" when you really mean North America (a much worse offender since you talk about a country). This was my logic when writing it.

I'm not a history expert. I would love a history lesson but you seem like a very poor teacher, "picking" on people who just try to be nice. I did not know this article was going to be on the frontpage as I said in my original comment - I caught it at 2 points. Had I known my comment was going to go through a seven-way review by a wide variety of experts, I'd have just said "non-latin scripts" to be on the safe side. (/s)

Sorry for the rant - People correcting each other where correction does not at all matter bothers me to no end. It feels like a sneaky and thoroughly annoying way of flashing your "I know THINGS!" credentials.


You seem to be admitting mistake, complaining about being corrected, and dismissing the process all at once. But I'm sure you learned something :)


According to Wikipedia, Iran is in the Middle East.

http://en.wikipedia.org/wiki/Middle_East#Traditional_definit...


In modern times things are very vague. For example, according to Wikipedia, it's also part of Central Asia https://en.wikipedia.org/wiki/Central_Asia and even South Asia https://en.wikipedia.org/wiki/South_asia


> There are arabic derived scripts at least as far east as China and the Philippines (jawi), but describing them as Middle Eastern is plain wrong

Is it, though?

I'd describe the Roman script as European (since it originated in Europe) without expecting to be corrected, even though the vast majority of people who use it live outside Europe.


Would you describe a page of written Vietnamese or Hanyu Pinyin as European? That's a fairer comparison to the author's turn of phrase.


I would describe the scripts they're written in as European. The languages themselves aren't, obviously.


You seem to be trying very hard to look very smart, yet you (probably purposefully) seem to be missing the point of everyone you reply to. That is not really making you look "smart".


Sad. I remember as a kid, growing up in Kashmir, I took special pride in my Urdu calligraphy skills. Writing each sentence was like an art project.


BTW. I published this piece in fall of last year. Long before it came on Hacker News, it had been passed around the Pakistani and Indian Twittersphere, and no one there was saying that we should let tech companies off the hook for not offering a) the full alphabet and b) the right script. It was, actually, the Pakistanis and Indians that made this article what it was. When I wrote it, I honestly thought I was just some eccentric who was annoyed for very personal reasons. Even as we speak the piece is being retweeted by professors and students at LUMS, in Lahore, and universities in Islamabad.

What's even more interesting is that as a result of writing this piece my own favoritism towards nastaliq actually LESSENED. But by the time my thinking evolved, the nastaliq purists had made the piece their own.


The article is great, but I don't think it adequately captures one of the difficulties with this problem: the very high minimum number of characters that must be designed for an Urdu font. I think it is an order of magnitude higher than many would guess at first.


The Wikipedia article on Nastaliq puts the number at around 20,000.

https://en.wikipedia.org/wiki/Nasta%CA%BFl%C4%ABq_script


This article was making rounds in my social feed about a year ago. Disturbed by the extremely weak and some absolutely flawed arguments, I wrote a note: https://www.facebook.com/sharjeelqureshi/posts/1015180699836...


And for the Arch Linux people, have a Nastaliq font if you need it (or want it).

https://aur.archlinux.org/packages/ttf-nastaliq/


In the 1960s, the Government of Pakistan tried to make naskh common in order to facilitate mechanization of printing. To that end, naskh was introduced in schools so that children would get used to it from an early age. Some books started appearing in naskh. The daily newspaper Nawa-i-waqt started printing its second page in naskh. It was very unfortunate for Urdu that these efforts failed. The reason was simple. Instead of requiring that the textbooks be typeset in naskh, they continued to be handwritten, albeit in a naskh-like script. Gradually, over a few years, that naskh started resembling nasta’aliq more and more, and eventually all traces of naskh disappeared. This, to my mind, was the greatest setback suffered by Urdu-language printing.

Let me explain why I call it the greatest setback. Urdu books and newspapers were written by hand and then lithographed. The process was slow, mistakes were plentiful, dots above and below letters were misplaced, letters such as daal and waaw were indistinct from each other, and many times the printing was illegible. Please try to read a book produced by that method, and then compare it to the same book typeset in naskh; you will note the difference. I have a copy of Divan-e-Hafiz published in nasta’aliq in the subcontinent, and another copy in naskh published in Iran. The difference in the clarity of text in the two is phenomenal.

This abortive attempt in the 1960s to switch to naskh had nothing to do with religion or Arabization. Ayub Khan, in whose time the effort started, can hardly be accused of religion-inspired initiatives. It was simply to promote mechanized printing. If anything, the later religion-enthusiast Zia-ul-Haq did nothing to popularize naskh. Ironically, the computer has saved nasta’aliq, since software is now commonly available and almost universally used to compose material for printing. But naskh is still clearer to read, and in my opinion it is not too late to give it another try and use naskh, while reserving nasta’aliq for calligraphy. Undoubtedly nasta’aliq is elegant and no script can match its beauty.


For Hindi speakers who don't read Nastaliq, the sign on the wall is hilarious. Transliterated to Devanagari, it reads:

हाय मैं मर गई ऐन्ना (इतना) टेस्टी बर्गर


@alieteraz - your comments below have been extremely informative, and though Urdu occupies the collective consciousness of all Indians through Bollywood (90% of melancholic melodies are invariably in Urdu), we have never given the script much thought.

I think you fought admirably with Apple, Twitter and Microsoft for getting Urdu fonts included ... but that is the wrong battleground.

The arena for you is simply the browser and mobile. All you need to do is get Nastaliq fonts adopted into Google Webfonts (free for anybody to use in their webapps) and into Android (installing a font through a Launcher theme is incredibly easy [1]. Do NOT try getting it included in the Android base)

Do you think you can get together a Kickstarter to fund a nastaliq font that is good enough to be usable? And then try getting it into webfonts and Android themes?

[1] https://play.google.com/store/apps/details?id=pete.app.apext... and http://appcrawlr.com/android-apps/best-apps-launcher-fonts


For web development, perhaps you could create a simple system that detects whether the user's device supports nastaliq. If not, convert it to naskh or Romanized Urdu (not sure which is preferred). As for dealing with all the data already written in naskh and Romanized Urdu, this is a bit more complicated, because I am guessing there is not a one-to-one match between naskh and nastaliq (naskh has fewer letters), or between Romanized Urdu and nastaliq. You would probably need to collect some data and use some machine learning techniques. If it is not too big you could use a client-side JavaScript program, otherwise something like a Chrome extension or an API. There are similar things done for Arabic to handle various forms of Romanized Arabic (for when the user does not have access to an Arabic keyboard) and non-Modern Standard Arabic dialects: http://www.yamli.com/


Data is stored as Unicode (except some really old desktop app formats that were written before Unicode). Naskh and Nastaliq are font styles, used when the system displays the data. The solution is simply that good Nastaliq fonts be created and then used on websites where Urdu is being used. Since Urdu fonts can be much larger in size than Roman fonts (the fonts have to specify joining behavior between all possible combination of characters), delivering over the web is not really viable. Fonts have to be native to all the OSes in existence, so that websites can use them.

(I don't think Roman Urdu is ever used for anything serious.)


The author of the article mentions that Naskh does not contain all the letters that Nastaliq has, so if someone types an Urdu text in Naskh it would not have the same underlying Unicode as if they had typed it in Nastaliq. Correct me if this is wrong.

My assumption is that not all OSes will adopt Nastaliq simultaneously. So assume that the website is storing data in complete coding for Nastaliq and they want to send it to a device that cannot render Nastaliq so the Unicode should be converted so it is properly renderable in Naskh.

The other half is just to make one's Urdu language web experience completely in Nastaliq. Even unimportant stuff written in Roman Urdu and then converted to Nastaliq might help promote the use of Nastaliq.


Yes, that's wrong. The author was talking about the fact that Urdu has more letters than standard Arabic, mostly additional diacritics etc., which makes entering it with an Arabic keypad/keyboard painful.

Both Naskh and Nastaliq can be encoded using the Arabic block in Unicode, it's just that Nastaliq's vertically stacking nature makes it difficult to deal with in computer systems that expect clean rows of text.
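
To make the point above concrete, here is a minimal sketch (an illustration, not anything from the article) that checks a handful of letters used in Urdu but not in basic Arabic and confirms they all sit in the Unicode Arabic block. The particular selection of letters and the helper name `in_arabic_block` are my own assumptions for the example; the list is not exhaustive.

```python
import unicodedata

# A few letters used in Urdu that are not part of the basic Arabic alphabet.
# Illustrative selection only (assumption): tteh, peh, tcheh, ddal, rreh,
# jeh, gaf, noon ghunna, heh doachashmee, yeh barree.
URDU_EXTRA_LETTERS = "\u0679\u067E\u0686\u0688\u0691\u0698\u06AF\u06BA\u06BE\u06D2"

# The Unicode "Arabic" block spans U+0600..U+06FF.
ARABIC_BLOCK = range(0x0600, 0x0700)


def in_arabic_block(ch: str) -> bool:
    """Return True if the character's code point lies in the Arabic block."""
    return ord(ch) in ARABIC_BLOCK


for ch in URDU_EXTRA_LETTERS:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}  "
          f"in Arabic block: {in_arabic_block(ch)}")
```

The stored code points are the same whichever style is used to draw them; whether the text looks like Naskh or Nastaliq is decided by the font at display time.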


Ok, I went back over and I did misread that part of the article. The problem is not as complicated as I thought, but that vertical stacking is definitely a challenging problem to handle.


A big reason nastaliq is not widely available on our screens is because rendering it correctly requires 'context sensitive shape substitution'[1]. Simple substitution of one character for another is not enough.

I've spent a couple of hours looking at how to render my own text, perhaps using something like http://typeface.neocracy.org/, but my day to day work is so far from text layout, rendering, client side javascript that it would be (too) large an undertaking.

[1] http://ww.cle.org.pk/Publication/papers/2006/context_sensiti... [PDF]


This feels like unusual timing. I am presently reading 'A Suitable Boy' by Vikram Seth. One of the main characters is sent into virtual exile where he learns how to read and write Urdu. Until now I don't think I'd heard of it.

My own opinion is that we should preserve history, remember it, but not mourn it. We are programmers! How many writing styles and languages have we seen die to make way for something better (or worse?). This is a natural survival-of-the-fittest cycle that affects writing styles, languages, civilisations and everything else. I have Japanese friends who read even less Kanji than I do. I see that writing style dying out over the next 50 years also.


I don't like how the article opens with the premise of a people struggling their whole lives to master writing in a particular style only to have their expression oppressed, then proceeds to talk about font rendering..

Generally though, I'm not really sure how I feel about trying to get people to care about something so precious. On one hand it's a shame to lose culture. On the other, the internet/technology has its own culture, and is it really lost in the age of 5c per GB storage?


It seems like something analogous has happened with European scripts. There were huge changes in how characters were presented between different styles of handwriting, engraving, printing presses, and finally everything the computer world has brought us.

It just seems like the author has an extreme emotional attachment to a typeface.


I don't know about the author's state of mind, but generally speaking, this is not just a matter of preference. For those of us who grew up reading Nastaliq, other typefaces such as naskh are fairly difficult to read 'fluently.' Imagine if all books, newspapers and websites were handwritten by someone with poor handwriting. It would actually slow down your reading speed. Reading is such a natural thing we do on a daily basis that we notice even small things which cause us to slow down.

btw, I use 'bad handwriting' simply as a way to explain the effect of a different writing style on reading; naskh is not good or bad -- it is just different from what Urdu readers are used to.

Also, nastaliq isn't just different because its letters are formed differently (as if there were a simple one-to-one correspondence between the typefaces). Nastaliq letters are laid out in a context-sensitive manner, where a letter can take on a fairly large number of shapes depending on where in the word it is (and which letter it is next to). My earlier post links to a paper which describes how Nastaliq rendering requires context-sensitive shapes. As an analogy, imagine randomly capitalized letters in everything you read online. Eventually you would get used to it, but that doesn't mean it wouldn't be annoying for a long time.
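
For a rough feel of what "context sensitive" means, here is a small sketch of the simplest version of the idea: the four positional forms (isolated, initial, medial, final) that Unicode records for an Arabic letter in its legacy presentation-form block. This is only the basic positional model, not Nastaliq shaping; as described above, Nastaliq needs many more shapes per letter depending on its neighbours, and that selection happens inside the font and text shaper. The function name `positional_forms` is made up for this example.

```python
import unicodedata


def positional_forms(base: str) -> dict[str, str]:
    """Map a form tag (isolated/initial/medial/final) to the matching
    character in the Arabic Presentation Forms-B block for one base letter."""
    forms: dict[str, str] = {}
    target = f"{ord(base):04X}"
    for cp in range(0xFE70, 0xFF00):  # Arabic Presentation Forms-B
        # Decomposition looks like "<initial> 0628"; it is empty for
        # unassigned code points, and longer for multi-character forms.
        parts = unicodedata.decomposition(chr(cp)).split()
        if len(parts) == 2 and parts[1] == target:
            forms[parts[0].strip("<>")] = chr(cp)
    return forms


# BEH (U+0628) as an example base letter.
beh = positional_forms("\u0628")
for tag, ch in beh.items():
    print(f"{tag:9s} U+{ord(ch):04X}  {unicodedata.name(ch)}")
```

Four shapes per letter, chosen by position alone, is the easy case. Picking among many shapes based on the surrounding letters is the harder problem the paper is about.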


That's not the only way in which Urdu is "dying" - due to colonization, basically all modern words are in English.


Nastaliq is lovely to look at.


Anyone ready to work with me on creating a Nastaliq webfont?


>Utility had defeated tradition.

There is hope for humanity yet!


You need to come up with more details, and accurate ones, because some of the stuff you mentioned is utter crap.



