The miracle of reading and writing Chinese characters (upenn.edu)
100 points by lermontov on March 28, 2017 | 49 comments



Brian Spooner's comment on the article is interesting, too:

"...In Iran I was always fascinated and puzzled by seeing Persians read Persian much faster than I could read English. [...] Our answer was that literate Persians (and Urdu-writers) think, see and act much less in terms of individual letters (than we do, and than we are taught to do as we learn to read and write). They see and write pen-strokes, each of which may contain 1, 2, three or even four letters, possibly (though rarely) more. The letters are written differently according to where they come in the pen-stroke and the pen-strokes are written differently according to what comes before or after them. Readers read pen-strokes (the number of which is very large), not letters. They read faster because they are scanning by series of pen-strokes. We also tend to read English by scanning words rather than reading letters, but the Persian reader is much more efficient and faster at it. The general approach and relationship to writing is different from ours. Its practice is different. And its result is different. There is much more to be said about this. But enough for this comment."

I don't know anything about Persian, but I now feel the need to go learn more.


That's because it cheats!

Arabic script has three layers:

1. The base forms (rasm) denote groups of consonants.

2. Then dots distinguish individual consonants (e.g. B, T, TH, Y and N have the same basic form but have different arrangements of dots).

3. Finally, diacritics - loops and dashes - are used to denote vowels.

When Arabic is fully laden with vowel markings, the result is visually noisy. So usually vowel markings are left out.

But without vowels, it is impossible to tell how a word is pronounced if you don't already know. It's a really heavy price to pay for any advantage in reading speed.
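The three layers can be sketched in a few lines of Python. This is only an illustration, not a real transliteration tool: the `RASM_BASE` table is a hypothetical, deliberately tiny mapping that covers just B, T and TH (N and Y are left out to keep the sketch small), and the function names are made up for this example. The vowel layer is removed by dropping Unicode combining marks.

```python
import unicodedata

# Hypothetical, deliberately tiny table: dotted letter -> shared dotless base form.
# Only B, T and TH are mapped, all to U+066E (ARABIC LETTER DOTLESS BEH).
RASM_BASE = {
    "\u0628": "\u066E",  # beh  (B)
    "\u062A": "\u066E",  # teh  (T)
    "\u062B": "\u066E",  # theh (TH)
}

def strip_vowel_marks(text: str) -> str:
    """Remove layer 3: drop combining marks such as fatha, damma and kasra."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")

def to_rasm(text: str) -> str:
    """Remove layers 2 and 3: also collapse the mapped letters onto one base form."""
    return "".join(RASM_BASE.get(ch, ch) for ch in strip_vowel_marks(text))

vocalized = "\u0643\u064E\u062A\u064E\u0628\u064E"  # k-t-b, each followed by a fatha
print(strip_vowel_marks(vocalized))  # consonants only
print(to_rasm("\u0628\u062A\u062B"))  # B, T, TH become indistinguishable
```

Once the marks are stripped, the same consonant string could stand for several differently vowelled words, which is the ambiguity described above.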


This doesn't make sense to me. When I'm reading, it feels like comprehension of ideas is the limiting factor, not optical recognition.

A cursory internet search backs up this intuition:

https://persquaremile.com/2011/12/21/which-reads-faster-chin...

https://www.reddit.com/r/askscience/comments/2elbtp/do_avera...


One of my friends was a fast reader. I was astonished by how well he remembered a book he had read so fast. It seems that optical recognition is a distraction that reduces the focus on the ideas. When you read faster, this distraction is reduced.


It would be interesting to see if the same system could be implemented for English. Say, make a text renderer that groups common sequences of 2-to-4 letters into one distinct (but still immediately legible) glyph; the set of available glyphs would remain constant. With practice, readers' brains might learn the glyphs as a "shortcut" to take in these sets of letters, almost like a very basic compression scheme programmed into your reading-comprehension neurons.
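A rough sketch of the grouping step, assuming the glyph set is simply the most frequent 2-to-4 letter sequences in some sample text and that words are split greedily, longest match first. The function names and both of those choices are assumptions made for illustration; actual rendering of the glyphs is not covered.

```python
from collections import Counter

def build_glyph_set(corpus: str, size: int) -> set[str]:
    """Pick the `size` most frequent 2-to-4 letter sequences found inside words."""
    counts = Counter()
    for word in corpus.lower().split():
        for n in (2, 3, 4):
            for i in range(len(word) - n + 1):
                counts[word[i:i + n]] += 1
    return {gram for gram, _ in counts.most_common(size)}

def segment(word: str, glyphs: set[str]) -> list[str]:
    """Split a word into glyphs, preferring the longest match at each position."""
    pieces, i = [], 0
    while i < len(word):
        for n in (4, 3, 2):
            chunk = word[i:i + n]
            if len(chunk) == n and chunk in glyphs:
                break
        else:
            chunk = word[i]  # no glyph matches here: fall back to a single letter
        pieces.append(chunk)
        i += len(chunk)
    return pieces

print(segment("thething", {"th", "the", "ing"}))  # ['the', 'th', 'ing']
```

Because the glyph set is fixed once it is built, a reader would always see the same glyph for the same letter group, which is the property the comment asks for.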


What you're describing sounds a lot like shorthand.

https://en.wikipedia.org/wiki/Shorthand


I'm pretty sure I read, write, and type words as glyphs.

When I read the word 'word', I don't read 'w'-'o'-'r'-'d'.... 'word'!

Alternatively, when skiing in Japan, having only learned katakana on the plane-ride, it was less than a week before I reflexively processed 'スキー' as 'ski' instead of sounding out su-ki in my head.


Yep, having taken Japanese in college, I found that after a while you don't read letters, you read words. Basically we recognize what a word is based on the shape of the word.


Chinese is a lot of fun as a language, I think you'll be more pleasantly surprised by it, and there's a hell of a lot more Chinese content. Tonnes of cool papers and forums.


Arabic-based scripts like Persian and Urdu are a lot like cursive handwriting.


Writing is a technology invented to help practically disseminate information over time and space. It's a good invention -- arguably one of our best -- regardless of the minutiae of alphabets vs. syllabaries vs. ideographs, etc. However, one shouldn't confound it with the thought patterns of speech and the mechanisms required to produce and process language.

Chinese is no different to English (or any other language) in that respect. Do you visualise the shape of the word "bee" when asked to think how to spell it? I doubt it. The act of writing is completely removed from the linguistic process.


Chinese is different, regarding homographs. Written English forces a lot of disambiguation of a purely abstract kind (English homograph collisions don't usually reflect associations, just random collisions) else you can't cope with the homographs. Ideogram collisions or overlaps are far more meaningful, generally non-co-incidental. (Terrific for poetry.)

Examining all this, some while ago, a Chinese professional philosopher and I concluded that this had indeed had some real influence on the two cultures and thinking. This point isn't about visualization per se, but instead the fact that ideograms combine concepts, not sounds. Thanks to its extreme borrowing and therefore an abundance of synonyms, English is no doubt the most extreme example of a language with random homonym and homograph collisions, forcing both listener and reader to make constant choices between abstract (unassociated) options. That's not generally how the human mind works, brains are association engines. So it's a good bet that the fact that English is the language of science and technology (for now) is not entirely coincidental.

This is no bar to progress for China today, however, to the extent that the hànzì alphabet is adopted for text.


> So it's a good bet that the fact that English is the language of science and technology (for now) is not entirely coincidental.

I have to disagree on this. To me it seems that world politics was the major, if not the only, reason why English overtook French and German in these fields (too).


I can't interpret this. Do you mean the supremacy of the English navy due to the improvements in iron forging that gave the English iron cannons first around the time of the Spanish Armada? That world politics?

It's true that Germany had better rates of literacy 500 years ago.


No, it is because the British Empire spanned a quarter of the globe and the Americans won World War II. No other language has had the advantage of being the language of two world superpowers with such vast geopolitical influence spanning over 200 years and huge expanses of the globe. French was the language of international diplomacy, but the French language never had the unparalleled reach that English had (and still has to this day).

In terms of technology, the Industrial Revolution and American scientific achievements were important drivers too. German was the language of science, but once the vast majority of new findings were being published in English (due to the sheer volume of American/British scientific output), its decline was inevitable.


Those iron cannons, together with a wealth of Elizabethan technical innovations, are a VERY large reason why Britain spanned the globe. The advantage of world-dominance didn't precede innovation, it resulted from it. Better weapons, better tactics, better hull designs for ships, better mining techniques and on and on and on.

The U.S. won WWII with no small amount of accumulated Western technology and technological advance. What WWII did do was allow the US (and Russia, but not Britain) to simply seize every German patent in existence without compensation, afterward. If you research German technological history you might have a chance to overturn my hypothesis. But the industrial revolution didn't begin there. In any case you'd have to do a lot more homework than you have so far. Japan was a latecomer, and can't count as a counterexample.

English as the primary language of technology and science long predates WWII. The exception might be chemistry (German.)


> So it's a good bet that the fact that English is the language of science and technology (for now) is not entirely coincidental.

Unless it was the intricacy of English spelling that finally drove Hitler over the edge, resulting in the US rising as the sole superpower while France and Germany crashed and burned, I'm not sure how they could possibly be related.


Repairing a clock, or creating a weaving machine, requires thought with an underlying simplicity (and definitiveness) to its logic - you have to take simple facts and relationships (without rich associations) that can be piled on top of each other and yield precise calculations. Poetry is a great preparation for many things, but probably not clock design or repair.

Math, logic, mechanics - what's so different about these tasks is that they're simple-to-complex. But because our minds are association-machines, they're all hard for us. Chess gets you closer to that sort of thinking than Go does (it's a more intuitive game.) However Chinese chess has a long history and isn't much different than Western chess, so chess hasn't been a Western advantage. And a language with an alphabet and tons of homonyms strips away many of the rich associations present, say, in the written Chinese language. I'm going to guess you haven't experienced and can't imagine the difference, or perhaps the notion that the Chinese way might be richer or superior in any respect.

Compare many of the mind-bending "reasonings" of Medieval thinkers from a time when Latin was the medium of scholarship. So many of these, which were taken very seriously at the time, are just stupefying to moderns - because the arguments are so easily taken over by coincidental associations which to a modern person have no logical importance at all, of any kind. Latin borrowed, but never as freely as English did, and by Medieval times was a pared-down, rationalized version of Ancient Latin. No forest of homographs.

Another pretty well-known hypothesis holds that English common law, which in contrast with so many other top-down legal traditions necessitated the logical reconciliation of decisions across many circumstances over hundreds of years, reinforced the mechanical bent of mind. I guess you'd find it plumb crazy too.


How did you conclude this? Through experimentation or reasoning?


I don't think "good bets" are conclusions. The professor and I were comparing different thinking patterns. He had the most exemplars, having moved to North America within a year or two of the conversation.


Echoing abecedarious, I do visualize the letters. In fact, I've read in some books about different types of language learners that there's a category/subset of visualizers that can only visualize the letters. (Not everyone is a visualizer, either.) It noted that these people are often very good at spelling and spelling bees.

Interestingly enough this seems to play out in various ways for me -- I can't remember people's names unless I remember how they're spelled, and often cannot recall the correct pronunciation if it's somewhat uncommon. And because my knowledge of Chinese characters is middling at best, I'm unable to remember Chinese names since I can't visualize and know the characters used!

EDIT: I should note I'm practically aphantasic, and that my visualizations are closer to being kinesthetic spatializations.


I do visualize "bee", don't you? Maybe relevant: many people notice puns that I don't, perhaps because I think of words more visually? And back in school, spelling tests were an easy 100% every time with no study; I was more likely to not know the pronunciation.


> Do you visualize the shape of the word "bee"

Not sure about Chinese, it is supposedly much more phonetic, but the Japanese unquestionably do visualize words, to the point where they are on occasion absolutely dumbfounded by words that are just slightly out of context, and unless you tell them what kanji you meant they won't be able to easily associate pronounced words with meanings. In my opinion this is so deeply ingrained that society and language subtly changed in a way to alleviate the embarrassment from misunderstandings, paradoxically by making the already highly context-sensitive language vague while also placing the responsibility of interpretation (but at the same time also allowing for some generous amount of flexibility) on the listener. The result - in my opinion - is a spoken language that is in a way tangibly insufficient for _practical_ dissemination of information exactly because of this constant need for visualization.

You can find the same thing in written language, where writing words in just the phonetic alphabet renders the text completely unreadable without the symbology. In return, having the words rendered in the logography frees the mind from having to visualize and performs much better as a conduit for information dissemination (with the small caveat that the spoken language has already changed to include the various ambiguities).

In the end, while you don't necessarily visualize the image of a bee or the shape of the _phonetic_ spelling of the word, you do visualize the placeholder character for bee. (BTW, interestingly, I don't think there are separate words for bee and wasp in Japanese, so naturally there is no difference perceived.)


> The act of writing is completely removed from the linguistic process.

https://en.wikipedia.org/wiki/Phonocentrism


I was under the impression that Chinese characters were created as part of religious rituals and originally had only very vague meanings.


I'm sorry, but that sounds like something that Athanasius Kircher[1] (who created a fascinating but completely off-the-wall translation system for Egyptian hieroglyphics[2]) would say.

As far as I know, Chinese writing likely followed the same path as Sumerian and Egyptian hieroglyphs (both of which, IIRC, have well-attested archaeological evidence): originating as markings for inventory-like purposes (a drawing of two penguins => this shipment contains two penguins) and then expanding into recording spoken language.

There is a complicating factor, AFAIK, in that the earliest known form of Chinese writing, the oracle bone script used to write on bones for divination purposes, already has clear meaning as words and is possibly traceable through to modern characters.[3][4]

On the other hand, as well, "had only very vague meanings" seems to be the initial take on every untranslated writing system: Chinese, hieroglyphics, Mayan, Minoan/Mycenaean, etc.

If you're interested in this sort of thing, I recommend Umberto Eco's book, The Search for the Perfect Language.

[1] https://en.wikipedia.org/wiki/Athanasius_Kircher#Egyptology

[2] https://publicdomainreview.org/2013/05/16/athanasius-kircher...

[3] https://en.wikipedia.org/wiki/Chinese_characters#History

[4] http://www.ancientscripts.com/chinese.html

Edit: I just got to digging around and remembered that the cover of John DeFrancis' The Chinese Language: Fact and Fantasy has the same phrase in several forms of Chinese:

https://upload.wikimedia.org/wikipedia/en/7/7f/Defrancis.jpg

The top row, IIRC, is oracle bone script, the third is modern, simplified Hanzi, and the bottom is pinyin. The phrase is "Chinese Language", roughly.


This is strange for me - and now I need to ask my Japanese acquaintances the same questions!

I'm able to visualize the Chinese characters I know just fine, and I'm by no means fluent. I speak Japanese instead of Chinese, but I don't think that would make a difference when visualizing the characters since most of them are one and the same.

I can visualize「雑」(from: 複雑) because of my familiarity. I struggle with 「爆」, a character I just recently learned, because I am not familiar with it. I would have thought that natives of Chinese or Japanese would have no difficulties with visualizing the characters! So it is interesting for me to hear anecdata that they struggle with it.

E:

Reading Xophmeister's reply it actually makes sense to me. For myself, Chinese characters are much more artistic and it is like visualizing a familiar painting or drawing. For a native, they are closer to a tool used for writing, like how English is for myself. For English, I can only visualize words with eight or nine characters or so. It's a weird practice to try and visualize the word "adenohypophysis" - it is difficult. So I can imagine any non-trivial Chinese character would be difficult to visualize for native speakers.

E2:

Replaced "Kanji" with "Chinese characters" in a few places it slipped through.


Kanji -> Hanzi would work.


I've been told this more times than I can remember and I always forget about it. Here's hoping this time is different. Thank you for the reminder.

It really gets tiresome to type "Chinese characters" all the time, and I try to avoid referring to them as "kanji" when used in a Chinese context.

Cheers!


To me "kanji" is an English word but "hanzi" isn't. I never saw any problem using "kanji" in a Chinese context, since "kanji" seems to be a Japanese sounding of the word "hanzi". But unfortunately some pedants (or one pedant with many usernames) complained when I used "kanji" so nowadays I stick to "Unihan" when writing in a technical context.


Chinese people generally don't like the usage of Japanese terms when referencing their language, mostly due to WWII and historical reasons.

Usage of Chinese characters in Japanese language - kanji

Usage of Chinese characters in Chinese language - hanzi


AFAIK the historical reasons are well-placed. Japanese Kanji is essentially a spin-off system, since altered, wherein many of the characters that exist in both Chinese (e.g. in the Tang Dynasty) and Japanese now have completely different modern meanings. In addition, grammars in both places have since changed (Japanese possibly never really had one, deferring to Middle Chinese in early use, and only slowly adopting characters elsewhere, interspersing other writing systems to provide grammatical structure). Therefore it makes little sense to conflate the two when discussing language, though the evolution of Japanese character use can provide some insight into Middle Chinese and earlier periods, particularly in subjects such as Nara Period and Silk Road Buddhism. (I am less knowledgeable about Japanese than Chinese, and claim no particular expertise in either, though I do translate old Chinese texts to English for fun and have visited Japan.)


"Han Characters" seems to be the preferred way in English to describe the Traditional Chinese, Simplified Chinese and Japanese versions of Han characters.


...and you opened the can of worms labeled "Han unification".


Writing is primarily a motor skill, not a visual one. You don't need to visualize a keyboard to be able to touch-type, either. Muscle memory is a form of procedural memory very much distinct from any visualization ability.


Honest question: do you have any experience with Sino-Japanese writing? (I practice ShoDo, while not really having any Japanese language proficiency - I suspect that there is a big difference between writing in a phonetic alphabet and in anything like Chinese or Japanese).


The main difference between using an alphabet and a non-phonetic writing system is that with the latter, knowing the pronunciation does not really help you remember how to write a word. Since hanzi also tend to have more strokes than alphabetic handwriting, the sequences of hand movements you have to produce in one go are quite long. The way to cope is essentially lots of handwriting practice to really burn in each character.


I agree. I think the germinal point here is that visualization is a form of 'recall' and that shouldn't be necessary if muscle memory suffices.


Well, often they can't write these characters at all [1] (this video is for Japanese, but the situation must be similar with Chinese)

[1] https://www.youtube.com/watch?v=sJNxPRBvRQg

I can recognize more than a thousand characters (thanks to Anki) but visualising/writing them is another matter entirely, and I would probably fail to write a large fraction of them. Just like with alphabetic characters, I memorize the "shape" of the character rather than its sequence of strokes. Only when the character is indistinct, or similar to other characters, do I bother to remember which radicals it consists of.

For example the character for "dream" (夢) has a very distinctive shape that is easy to recognize, so I wouldn't be able to write it because I never learned how.

Now take the character 緑 (green). It looks similar to 線 (line), but lacks the radical 白 (white). Because green is not white, you see (not actually, it's just a memorization technique). But it also looks similar to 録 (recording). That's because the character for "green" has radical 糸 (thread), just like "line" character. Threads can be green, and they are also lines. But "recording" uses radical 金 (metal) in its place, because recorders are made from metal. So the only things I remember about 緑 is that it has 糸 and doesn't have 白. And thus, I still won't be able to write it.


I expect that the problem would be less pronounced in China, given that the Japanese can revert to kana if they forget the kanji, whereas the Chinese have no other choice but to remember the characters.

That said, I've heard of, and even seen once or twice, Chinese people forgetting how to write common words like "sneeze" (喷嚏). So it does happen.


I found that memorising the shape in general actually took longer. Remembering each radical allowed me to form a mnemonic story involving them. I learned to write 30 characters a day with this method with ~2 hours study each day.


As a student of Japanese and having previously thought I might have Aphantasia this was interesting to me.

But I'm still not convinced that we understand Aphantasia completely. I think there's a whole spectrum of different conditions in visualisation and everyone has different skills and abilities. Here's mine:

- I can currently recall just under 1000 Kanji

- The learning system I am using (wanikani) doesn't require me to, but from previous study I can write a few hundred

- I can easily recognize people's faces from memory ("never forget a face")

- But if you asked me to draw a face from memory I'd have a very hard time

- I can visualize geometric figures in my mind's eye and move them around

- As a long time software developer I can effortlessly deconstruct a complex system into base abstractions, hold these in memory, and manipulate/recall the constructs in my head

- I get a very abstracted visualisation when reading fiction. Lengthy overwrought descriptions bore me and I throw away most of it anyway, so I tend to favour Charles Bukowski over Dan Brown

- My imagination is similar: I can set up basic scenes but I can't "see" the detail in any way like looking at a photograph

- My art skills are decent but I'm not working off any kind of internal image. I think of the main points I want to express but I don't know exactly how they'll look until I commit them to paper/pixel

- Sometimes when I'm really tired I have very short (under one second) open eye hallucinations where I do visualize random things in extreme detail, indistinguishable from reality.

- I have visual dreams that feel as accurate as reality

- I have sporadic occasions when I (start to) lucid dream. Before I'm lucid, everything in the dream is vivid, but once I become aware then all the detail of the dream unravels and I wake up shortly after


Writing characters is a bit like thinking of a song. Harder to jump into the middle, easier to start from a beginning point and work your way through it.


It's called "muscle" memory. This is no different from how pianists play piano from memory. They don't actually recall every individual key; it's the connections between keys that matter.

Ask people to write Chinese words or play pieces of music from arbitrary starting points and you will see them struggle. This is because you're breaking up the sequence into individual parts, which is not how people learn things.

To put it in an Anglo-Saxon perspective, when someone asks you whether the letter n comes before r, you will likely recite the alphabet from the start. So you know the answer, but to get it you have to rely on some more "muscle" memory.

I am also sure this happens when people try to spell certain words. They can't visualize exactly the word but once they start writing or typing it they get into the flow.


The question seems ill-formed to me. I don't know anything about Chinese characters, but I am one of those aphantasic people described in the post. There is no impact on recall or access to information AFAIK, it's just that you can't generate "fake" input to your visual system. Or whatever system is used by people who can visualize things.

So - I have no trouble remembering faces. I could very easily describe accurate visual details of any number of places I've visited or seen. I just don't know how / am not able to turn that into something I can "see" that is in any way analogous to actual vision.


Why are people in general so confused by proprioceptive memory? I'd be completely nonfunctional as a programmer if I had to "visualize" things in fewer dimensions than my brain stores them.


Huh? I found this article kind of tone deaf.

Writing uses motor memory, there is no need for visualization or rehearsal once mastery has been achieved.

It's no harder to write Chinese than English or any other alphabetic language.


I like pulling apart the different Chinese characters into the radicals.


I have very poor visualization but have been able to learn to read and write over 2,000 characters as an adult. It's all motor skill and thinking in metaphors (for _me_).



