…both use accents on vowels, but only Scots uses grave (left-pointing) accents, like on à in Gàidhlig.
Just a quick note of caution here: "Scots" and "Scots Gaelic" are two completely different languages, the former being a Germanic language closely related to English and spoken in the Lowlands, and the latter being a Celtic tongue largely confined to the Highlands, the Western Isles, and Nova Scotia. If you can read English you can probably make some vague sense of written Scots, but unless you have training there's no way you'd understand a word of Scottish Gaelic. This article is referring to Gaelic, not Scots.
Some of my Chinese friends use の instead of the Chinese equivalent 的 just for fun. Personally I distinguish them by just going and learning the languages. It's easy to distinguish them by noticing Japanese has curvy characters mixed with blocky complicated Chinese ones, whereas Chinese is 100% complicated blocky characters.
Also I just want to put it out there randomly that if you want to learn a language but believe you can't, you are almost certainly wrong. If you are able to read this text, you have demonstrated possession of a wet sack of neurons capable of learning a second language. I've witnessed or read about all sorts of people learning a new language: old people, shy people, autistic people, even people dealing with brain cancer. It is a myth that children learn faster than adults. The only time this happens is when the adult is hindered by their own reluctance.
Go get some beginner materials with audio, not just text, and dive in. Don't waste time torrenting 12312^23 TiB of learning materials. Glossika, Teach Yourself X, Xpod, Xclass101, whatever. Learn how to ask some trivial questions relevant to your life, write them down because you will forget, then go chat with a native speaker somehow. Read out your questions because you're nervous and forgot, and then fail to understand anything they say, but just pay attention and listen to the sounds of the language. Then go home and learn a bit more, but don't worry too much about memorising anything, just listen, and comprehend a little bit. Then meet up with a native speaker again with some more questions prepared. This time you might understand 0, 1, or a few words, still be nervous, but you'll be a little bit better than last time. Basically you just keep this up without giving up and you will pick up pace. For inspiration, look at blogs like Benny Lewis' and others. It may take 10000 hours to master, but it only takes hundreds of hours to find yourself understanding and contributing to group conversations comfortably. If you can enjoy the process, you'll be able to study for N hours for any N as the clock keeps ticking regardless. Just set N=1000.
For some languages a couple hundred hours might be enough to participate meaningfully in a normal conversation. But if you're for example an English native speaker and want to learn Chinese, you'll have a really hard time understanding anything but very simple sentences.
On the other hand if you know English and German well, it is very easy to learn for example Dutch, and a couple hundred hours will get you really far.
You're discussing how language overlap affects the amount of time and effort necessary to get results. That is true, of course, and it is often thanks to the (shared and) already possessed mental models necessary to master the new language. The most important bit of this mental model of a given language is the way speakers phrase their thoughts. This is exactly what you get right by following brendyn's advice of taking it slow. In time you'll sense and reproduce the natural expressions. (This advice is especially valuable for learning English, BTW! Trying to make sense out of the compound verbs is a poor strategy, so just let them sink in slowly, each within an appropriate context.)
> It's easy to distinguish them by noticing Japanese has curvy characters mixed with blocky complicated Chinese ones, whereas Chinese is 100% complicated blocky characters.
Additional bonus: Korean uses lots of basic angles, squares, and lines, and many Hangul have "three parts", e.g. 한국어의. It's a beautifully simple writing system.
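If you want to see those "parts" for yourself, Unicode can take the syllable blocks apart. This is only a rough sketch: it assumes the "parts" are the conjoining jamo that canonical decomposition (NFD) produces, and `jamo_parts` is a name I made up for illustration.

```python
import unicodedata

def jamo_parts(syllable):
    """Split one precomposed Hangul syllable into its conjoining jamo.

    NFD (canonical decomposition) turns a syllable block into a leading
    consonant, a vowel, and an optional trailing consonant.
    """
    return list(unicodedata.normalize("NFD", syllable))

# Count the parts of each block in the example word.
for block in "한국어의":
    print(block, len(jamo_parts(block)))
```

Blocks without a final consonant come out as two jamo rather than three, which fits "many" rather than "all".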
You can go quite a long comment chain in Japanese without seeing の. I always tell people "look for lots of simple characters that can be written with only 2 or 3 lines mixed in between a bunch of really complex characters".
来週は学校に行きません
For those who don't know Japanese, only following my rule, can you identify the Japanese characters in the sentence above?
Except with names, Japanese writing will have a bunch of Chinese characters with many "simple" Japanese characters sprinkled in between. To someone who doesn't read Japanese, both are "unintelligible" but I find people can identify the more complex Kanji from the more simple Kana quite easily and can point them out with pretty good accuracy, even if they can't read any of the kana.
The above example without Japanese characters if you'd like to see if you guessed correctly:
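The split can also be checked mechanically. A minimal sketch, assuming "Japanese characters" means the hiragana and katakana Unicode blocks and "Chinese characters" means the main CJK Unified Ideographs block (the rarer extension blocks are ignored); the function names are invented for illustration:

```python
def is_kana(ch):
    """True for hiragana (U+3040-U+309F) or katakana (U+30A0-U+30FF)."""
    return "\u3040" <= ch <= "\u30ff"

def is_kanji(ch):
    """True for the main CJK Unified Ideographs block (U+4E00-U+9FFF)."""
    return "\u4e00" <= ch <= "\u9fff"

def split_scripts(text):
    """Return (kana, kanji) found in text, each in original order."""
    kana = "".join(ch for ch in text if is_kana(ch))
    kanji = "".join(ch for ch in text if is_kanji(ch))
    return kana, kanji

kana, kanji = split_scripts("来週は学校に行きません")
print(kana)   # はにきません
print(kanji)  # 来週学校行
```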
Another quirk about the の thing is that some Chinese-speaking people, in Taiwan at least (not sure about the mainland) use の as a colloquial replacement for 的, presumably due to Japanese cultural influence. So it's not completely impossible to see の in a Chinese-language text.
(They still pronounce it "de", only the writing is different.)
Being in Japan is always a bizarre experience as a student of Chinese. I can perfectly understand a number of signs and get the gist of a surprising amount of other written language despite not understanding a word of it spoken aloud.
Twist: when you see a pair of complex-looking Chinese character strings and one of them looks somewhat "simpler", chances are that it's Chinese and the other is Japanese, because Mainland Chinese people use simplified characters.
来 and 学, from the original example, are both simplified characters (the traditionals being 來 and 學). I was a little bemused to see them in an example of Japanese text, but it turns out they are the common Japanese characters as well. Japanese has its own set of simplifications ( https://en.wikipedia.org/wiki/Shinjitai ), often overlapping with the Maoist simplifications.
In a different pattern, mainland China simplified 龍 to 龙 while Japan simplified it to 竜. Despite being a "simplified" form, 竜 is actually the oldest of those three characters.
>> For those who don't know Japanese, only following my rule, can you identify the Japanese characters in the sentence above?
I would technically meet that requirement, but knowledge of Chinese makes the question pretty easy regardless of knowing Japanese. ;)
The reason for this is contextual (internet posts), not grammatical. You'll get more results if you remove the polite ending and/or the topic particle (try "来週学校に行かない").
I hope you don't find it nonsensical, though, I understood it just fine.
And checking ghits for plain forms shows:
"学校に行く" - 3,680,000
"学校へ行く" - 444,000
so I think the other way is actually the variant!
BTW, I feel like "学校に行く" - where に means "for/into" - has a sense of "going to school to go to class", but "学校へ行く" - where へ means "towards" - has a sense of "going to the school building as a physical place".
Hi, I'm a gainfully employed translator of Japanese and linguist.
Short answer: the sentence is fine as it stands, although your alternative is equally grammatical (if less idiomatic - "学校へ行かない" gets about 1/4 of the ghits of "学校に行かない" by my count, and personally I'd never use へ).
Long answer:
Japanese draws a distinction between verbs of A→B movement (e.g. 行く to go; 引っ越す to move house; 移動する to move/change location) and verbs that describe the manner of motion (e.g. 歩く to walk; 走る to run). The A→B verbs can happily take an indirect object complement (a に phrase) as the destination, because there's really no other possible meaning. You can't "go for school", as you put it.
Verbs of motion, on the other hand, are less flexible. These can't take an indirect object complement, and need a more expressly destinatory case (think へ、へと、まで). The reason for this is that a change of location is not implied in the verb, however counterintuitive that might seem to an English speaker.
Interestingly, motion verbs can take a direct object (を), such as in 街を歩く 'to walk around town'. Another indication that they're not strictly destinatory. Also they can take adverbs that further qualify motion, such as ぽつぽつ歩く 'to dawdle/mosey/toddle'(? tough to translate), whereas 行く can't. A→B verbs can however take adverbs that qualify the speed, like ゆっくり行く 'to go unhurriedly' - but this is different from ゆっくり歩く 'to walk in a slow manner'.
This is my first ever comment on HN so I've tried to be as informative as possible... If you have any counterexamples I'd be happy to try to explain them.
Only tangentially related, but why can't spelling check software automatically figure out which language I'm typing in? I write mostly in English but sometimes I write in French or in a mix of both languages and I usually struggle with the spelling corrector which keeps bothering me.
As hirsin said, SwiftKey does this really well on Android. It's probably my main irritation on iOS: having to switch keyboard all the time when I could just type English on the Swedish keyboard.
SwiftKey (Android keyboard) does an amazing job of this. I routinely switch between French and English, and usually by the end of the first word in the other language my autocorrect and suggestions are both in the right language.
AFAIK it uses Markov chains, so I think it just lumps all dictionaries together, and as soon as you write a couple of words, the probabilities for the following words will be in the proper language. It doesn't even need specific rules per language, it's all automatic.
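To illustrate the idea (I don't know what SwiftKey actually does internally), here is a toy sketch of a first-order Markov chain over words, trained on one lumped-together English and French corpus. The corpus and the names `train_bigrams` and `suggest` are made up for the example.

```python
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count, for every word, which words follow it (first-order Markov chain)."""
    model = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            model[prev][nxt] += 1
    return model

def suggest(model, prev_word, k=3):
    """Return up to k most frequent followers of prev_word."""
    followers = model.get(prev_word.lower(), Counter())
    return [word for word, _ in followers.most_common(k)]

# One mixed corpus, with no per-language rules anywhere.
corpus = [
    "i am going to school",
    "i am going home",
    "je vais à la maison",
    "je vais à la plage",
    "je vais au marché",
]
model = train_bigrams(corpus)
print(suggest(model, "am"))    # ['going']
print(suggest(model, "vais"))  # ['à', 'au']
```

Because each word was only ever followed by words of its own language in the training data, the suggestions switch language as soon as the previous word does.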
Ð/đ may also be Croatian, where it sounds like a "dj". Technically it could also be Serbian (which is pretty much the same spoken language, called Serbo-Croatian), but Serbian is usually written using the Cyrillic alphabet while Croats chose Roman letters.
To be fair, the original question was how to recognise English text, not how to recognise the archaic microcosm of English text that resides between the covers of The New Yorker.
And anal English speakers (such as the copy editors of the New Yorker) also use the diaeresis (the double dot over a vowel). In English, it is used over the second of two vowels in a row which are voiced separately, such as naive (should have the double dot over the i) or cooperate (double dot over the second o). The distinction is between a word like coop (meaning a house for chickens) in which the two vowels make one sound and a word like cooperation, in which the two o's are separate sounds.
Nitpick: In Turkish, "ğ" is silent by itself but it makes the pronunciation of the vowel before itself longer and sometimes makes the pronunciation end at the back of the mouth, especially after "e". "Erdoğan" is indeed just "Erdooan" though.
Correction to the article: There is no Ů in Czech (except for CAPS LOCKED words) - The longer "u" is written as "Ú/ú" as the first letter in a word, and "ů" in other positions (strange for sure, but because of historical reasons)
It's not just historical. "Ů" is phonologically a longer "u", but it is actually an alternation [1] of "o": for example see how nominative "dům" becomes "domu" in the genitive (likewise "stůl", "bůh", etc.). When a "u" becomes longer, instead, it becomes a "ú" as the first letter of the word, but otherwise it becomes "ou"; for example see how the feminine nominative of (some) nouns and adjectives is "a" and "á" respectively, while the accusative is "u" and "ou", or how the perfective companion of "kupovat" is "koupit".
(Also, see how I sneaked in a "Ů" in the second sentence :)).
> Welsh is actually quite different from the other two. It uses lots of ll and ff and it uses w as a vowel (e.g., cwm).
Welsh also uses a circumflex accent to extend any of the vowels, and since both 'w' and 'y' are vowels in Welsh (leading to many jokes by English speakers about words with no vowels) they can have the circumflex accent too. I've had problems in the past finding the alt-codes to generate w or y with a circumflex accent - so those may be unique to Welsh.
From http://symbolcodes.tlt.psu.edu/bylanguage/welsh.html :
> Because of the writing system, Welsh places accents on the letters w (phonetic /u/) and y (phonetic /ɨ/ or /i/), which is very unique in languages of the world. These symbols require Unicode support apart from that of other Western European languages.
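For anyone hunting for those characters today: they have precomposed Unicode code points, and you can also build them from a plain letter plus the combining circumflex (U+0302). A small sketch, where `with_circumflex` is just a helper name made up here:

```python
import unicodedata

COMBINING_CIRCUMFLEX = "\u0302"

def with_circumflex(base):
    """Compose a base letter with a circumflex into one code point (NFC)."""
    return unicodedata.normalize("NFC", base + COMBINING_CIRCUMFLEX)

# Print each composed letter with its code point and official Unicode name.
for base in "wyWY":
    ch = with_circumflex(base)
    print(ch, f"U+{ord(ch):04X}", unicodedata.name(ch))
```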
Persian will have three dots in a triangle above a single upward stroke or below the line. Arabic only has the three dot combo above the script on a multiple upward stroke grouping (sometimes a flat line between upstrokes).
The same is true for Urdu as well, so if you want to distinguish Urdu from Persian: look for a backward moving (i.e. towards the right) horizontal stroke at the end of a word. This stroke will always run under the preceding letters of the word, except that some dots of the preceding letters may be moved beneath the stroke in order to avoid collision.
Nothing on Filipino/Indonesian languages? Those always confuse me, since the users also heavily mix them with English, so you might see a comment mostly in English but also have a bunch of native words or phrases mixed in.
I like Hacker News for that. The topic of this article is interesting. Thanks for bringing that up. However, when you read the comments here, you realise the article is quite wrong :)
Yes, if you spot ß, that's a dead giveaway for German. You can't really rely on it alone for identification however because it's not that frequent (or rather, it's very inconsistent – German can run for paragraphs without a single ß only to make up for it with five of them in a single sentence). It's also not used at all in Swiss German.
Another near-certain giveaway is that all nouns in German are capitalised. The only other language that does that and uses the Latin alphabet is Luxembourgish, and you're probably not looking at that.
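Both hints are easy to turn into a crude check. This is only a heuristic sketch under my own assumptions: sentences end at ".", "!" or "?", the first word of each sentence is skipped because it is capitalised in any language, and the function names are invented for illustration. Proper names and headlines will throw it off.

```python
import re

def capitalised_ratio(text):
    """Share of capitalised words, ignoring the first word of each sentence."""
    inner = []
    for sentence in re.split(r"[.!?]+", text):
        words = re.findall(r"[^\W\d_]+", sentence)  # runs of letters only
        inner.extend(words[1:])                     # skip the sentence-initial word
    if not inner:
        return 0.0
    return sum(word[0].isupper() for word in inner) / len(inner)

def has_eszett(text):
    """True if the text contains ß, which standard German uses and Swiss German does not."""
    return "ß" in text

german = "Der Hund läuft über die Straße. Das Kind isst einen Apfel."
english = "The dog runs across the street. The child eats an apple."

print(has_eszett(german), capitalised_ratio(german))
print(has_eszett(english), capitalised_ratio(english))
```

A high ratio with no ß is still consistent with German, since ß can be absent for paragraphs at a time.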
There is a character only used in Taiwanese, not Chinese: 互
By that, I mean the Taiwanese language, which is not the same as Mandarin Chinese. Both languages are used in Taiwan, although Mandarin is the official language of the (outgoing) KMT government. Taiwan number 1 ;-)
On a meta level I find that just a little troubling. It sounds to me like "Crap, I agreed with this until I noticed it was an opinion from a tribe I don't identify with - so I can't agree with it". Maybe theweek.com is some uniquely evil thing I haven't heard about?
Yes, but "quality" is both vague and subjective: not only will different people evaluate aspects of quality differently, different people will legitimately have different views on what components "quality" of a link has. I don't think it's unreasonable to consider the source as one factor in overall quality (if nothing else as a proxy for things the rater is unable to evaluate about the article in isolation).
But why should things other than the article directly linked to matter? Why should it be acceptable to downvote an otherwise interesting and correct article just because of the source?
That smacks of voting for ideological correctness over truth or interestingness, a problem that otherwise intelligent people should be able to look past. What makes this site meaningfully different from the front page of Reddit if people will crap on an article because it comes from a source that doesn't align with their politics?
> Why should it be acceptable to downvote an otherwise interesting and correct article just because of the source?
"Correct" is often a probabilistic assessment, not something a potential up/downvoter can determine absolutely.
The source is often an important input to that probabilistic assessment.
> That smacks of voting for ideological correctness over truth or interestingness
Different outlets of the same ideological bent (whether relatively neutral or not) can have wildly different editorial standards which produce wildly different reliability.
I wasn't implying that upvote means agree - only that upvote and agree are positive rather than negative sentiments (because I was proposing a broad pattern match not an exact semantic match). But your explanation does make sense to me, that is a plausible stance, thanks.