Hacker News
Great Vowel Shift (wikipedia.org)
259 points by docdeek on April 14, 2021 | 268 comments



The vowel shift is one of the major reasons why Shakespeare's poetry doesn't rhyme when spoken with modern English pronunciation. There was a rather fascinating video that went around a while back of a father and son demonstrating the changes.[1]

[1] https://www.youtube.com/watch?v=gPlpphT7n9s


Shakespeare's plays post-date the bulk of the Great Vowel Shift. The reason they don’t rhyme is only partly because of the very last phases of the Great Vowel Shift, and partly due to the changing English dialectal landscape.

The classic English author for whom the Great Vowel Shift is most relevant, in terms of audiences today not pronouncing the text anywhere near as society then would have, is Chaucer.


It's funny, for at least the first example, original pronunciation just sounds like Hagrid from the Harry Potter movies, which is especially interesting considering that's intended to be a less formal/"lower class" accent. Reminds me of something I heard once that the modern American accent is actually closer to the British accent at the time of the Revolutionary War, and it's the British accent that's changed more since then. No idea if that's actually true, but it was really interesting when I heard it.


What Americans think of as “the” British accent is known as “Received Pronunciation”, an accent that arose in London in the latter 19th century.


There's an accents guy on Wired's youtube channel that has a really interesting slew of videos on accents. His latest ones are a bird's eye view of American accents: https://www.youtube.com/results?search_query=wired+accent+ex...


The British (well, everywhere really) have a lot of accents and dialects! Most of them never come to your awareness.


Hagrid has a West Country accent. The character is supposed to come from the Forest of Dean, in Gloucestershire, and the accent is approximately from that area. The most obvious difference from RP English is that it has the rhotic "R", which is how it's closer to standard American and to historical English accents. The non-rhotic R is also found in the Boston accent ("hahvahd yahd").


I seem to remember that the closest thing to an old school English is a coastal Maryland/Virginia accent. Which sorta makes sense historically speaking.


Another slightly related video by Tom Scott https://www.youtube.com/watch?v=dUnGvH8fUUc


It's fascinating how much of this holds out in various accents across the UK, most particularly outside of London and going up north. My mother and grandparents on her side have similar-ish pronunciation of some words with their thick Lancs accents. Not quite farmers' accents but still quite archaic in sound sometimes.

(Imagine other examples, like pronouncing 'couch' a bit more like 'cooch')


Poetry does not have to rhyme.


They really missed an opportunity by not calling it the Great Vowel Movement


Vowel shift is a term of art in linguistics, so I'm not sure the pun would be worth it.


> Vowel shift is a term of art in linguistics

Aside: I see "term of art" everywhere on HN lately. Why?! What does it add to saying "Vowel shift is a linguistics term" or "a term in linguistics"? Why do people want to sound like patent lawyers? Am I missing something?


"A term in linguistics" is imprecise — it's also a term outside of linguistics. The phrase "term of art" means basically "I know that you think you know what this means based on the words, but it is a special defined term that means something particular in this field." It is generally used to correct someone who seems to be viewing a term of art as a common English phrase, so the precision is useful.


Seems like "idiomatic" already fits this perfectly. Is "term of art" in some way different? Maybe it has more of a "you can't really understand it as an outsider" sense?

Idiomatic:

A speech form or an expression of a given language that is peculiar to itself grammatically or cannot be understood from the individual meanings of its elements, as in keep tabs on.

A specialized vocabulary used by a group of people; jargon.


"Term of art" means that it has a technical meaning in a specific field, while "idiom" just means that a phrase has a meaning. For example, "a dime a dozen" is an idiom but not a term of art. If you wanted to avoid the phrase "term of art" for some reason, "idiom" would be a reasonable alternative.


"a technical meaning in a specific field" is one of the definitions of idiom (as given above)—they literally list 'jargon' as a synonym ("The specialized language of a trade, profession, or similar group, especially when viewed as difficult to understand by outsiders."). That entry was from: The American Heritage® Dictionary of the English Language, 5th Edition.

Since idiom has both the structural denotation of a phrase being semantically 'atomic' (you lose the meaning if you break it into pieces), plus its synonymy with 'jargon', there appears to be nothing added in "term of art".

—but maybe this is a lesser known meaning of 'idiom'.


To start, “idiomatic” is an adjective while “term of art” is a noun. The word “idiom” in the sense you are quoting refers to the whole language/dialect/way of speaking, not one word. The sense of “idiom” meaning one phrase (maybe shortened from “idiomatic expression” or something?) does not mean the same thing as “term of art”, but is more like a common expression in a particular language / dialect. It does not have the sense of a specific technical meaning for a word, distinct from the ordinary definition.

These words are not synonyms, and should not be substituted.


And there’s what I was looking for :) I did overlook that ‘idiom’ applies to a full language/dialect.

I think you’re overstating your case though when you say “It does not have the sense of a specific technical meaning for a word, distinct from the ordinary definition”—since jargon is a synonym for one sense of idiom and idiomatic is less specific in the group/individual distinction (and yes it’s an adjective but you can trivially employ it to construct an equivalent noun phrase so this matters little). That said my own case is clearly a stretch here lol.

Maybe more relevant is the fact one could just say “technical term” and they’d be understood perfectly—I’m pretty sure that phrase is the reason I’ve also never come across “term of art”.


A lot of things are jargon that are not terms of art. When I was practicing law, a lot of people in my legal circles would jokingly refer to their spouses as their "domestic associates". That's not a term of art, but it is lawyer's jargon or an idiomatic expression.


That’s an interesting example, though I read it as a kind of parody of jargon rather than actual jargon. It’s a tricky case though since jargon also has multiple senses and your classification fits one of them.

I’m curious whether you would consider “domestic partnership” to be both jargon and a term of art (I would).

It’s interesting reviewing definitions of ‘jargon’: while you do find things like e.g. “specialized technical terminology characteristic of a particular subject”, mostly you find references to incoherent/nonsensical speech.


"term of art" is a term of art in pedantry circles


I've found that using 'term of art' more effectively communicates 'this word has a domain-specific meaning that might differ from what you would naively think, pay attention' than simply saying 'x is an x term' or the like, which people tend to gloss over.

[edit] I am pleased to see that HN readers as a group love to define things, and also that I need to learn to refresh tabs I've had open for a while before responding


Then why not just say 'domain-specific meaning?'

Term of art doesn't really have a clear meaning in general (native English speaker, and this is the first time I've heard this), and it's unclear/ambiguous as to what it actually indicates.


Because "term of art" is the term of art...

"Domain-specific" is nonsense gibberish to people outside ~computing, more or less.


It’s pretty much exactly as intuitive as “state of the art”—a term of art I didn’t understand as anything beyond a superlative until I had heard it for 20 years and decided to give it a think. Which is to say, not very intuitive at all, until you just pick it apart and think about how it might relate in context, and then it’s just reflexive.


First time I read "term of art" I knew exactly what was meant. It doesn't seem that obscure.

However, I was already familiar with "state of the art" so it wasn't that great a leap.


"Jargon" is a good word for this case.


"Jargon" sounds pejorative to me, and implies that the phrasing is difficult for outsiders to understand.

"Term of art" only implies that it is the specific phrasing used by practitioners to describe the matter at hand.


Does that imply that the meaning is well-defined like "term of art" does?


Yes, it requires that the idea is defined in a concise way that is professionally/contextually restricted.

Use "term of art" if you'd like. I think "jargon" is a commonplace alternative and I like it.


Terms of art are pre-scientific jargon that survive mainly due to utility. They are common in crafts such as pottery or the Law. They are references to heuristically derived conveniences. They're linguistically cool.

In Math, lemma is a term of art.


> They are references to heuristically derived conveniences

That's every word in existence


Yes, and this is not trite. Every word we hang on to has special meaning. Think about the number of words we've forgotten.

Could a computational linguist chime in please? How many words in how many human languages will never be spoken again in 2021?


Not a computational linguist (or a linguist at all), but this is an area I know a bit about. The question you ask is at the centre of the study of ‘glottochronology’ [0]:

> The original method presumed that the core vocabulary of a language is replaced at a constant (or constant average) rate across all languages and cultures …

Unfortunately, the only major discovery glottochronology has revealed is that rates of change vary too much to be of any use:

> in Bergsland & Vogt (1962), the authors make an impressive demonstration, on the basis of actual language data verifiable by extralinguistic sources, that the "rate of change" for Icelandic constituted around 4% per millennium, but for closely connected Riksmal (Literary Norwegian), it would amount to as much as 20%.

And there are other factors affecting replacement rate as well. For instance, a curious trait about the non-Austronesian languages of New Guinea is that the word ‘louse’ is practically never replaced: it evolves according to normal sound change, but never gets thrown out entirely. By contrast, many languages have a taboo against mentioning the names of the deceased; this can speed up lexical replacement if those people have names homophonous with common words.

For these reasons, I suspect that a general answer to your question will be extremely difficult — if not impossible — to find.

[0] https://en.wikipedia.org/wiki/Glottochronology
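
To put rough numbers on the Bergsland & Vogt figures quoted above: if a fraction `r` of the core vocabulary survives each millennium, then after `t` millennia a fraction `r**t` survives, and a measured share `c` gives `t = ln(c) / ln(r)`. The sketch below is a toy single-language model, not the actual method (the function names are made up for illustration, and real glottochronology compares pairs of languages on a standardized word list). It only shows how far a date estimate drifts when the assumed rate is wrong.

```python
import math


def retained_share(rate_of_change: float, millennia: float) -> float:
    """Share of core vocabulary left after `millennia` at a constant loss rate."""
    return (1.0 - rate_of_change) ** millennia


def estimated_millennia(observed_share: float, assumed_rate: float) -> float:
    """Elapsed time implied by `observed_share` if loss ran at `assumed_rate`."""
    return math.log(observed_share) / math.log(1.0 - assumed_rate)


# Rates quoted in the thread: about 4% per millennium for Icelandic,
# about 20% per millennium for Riksmal.
ICELANDIC_RATE = 0.04
RIKSMAL_RATE = 0.20

# Vocabulary retained after one real millennium of Riksmal-style change...
share = retained_share(RIKSMAL_RATE, 1.0)
# ...dated as if the language had changed at the Icelandic rate.
apparent_age = estimated_millennia(share, ICELANDIC_RATE)

print(f"share retained: {share:.2f}, apparent age: {apparent_age:.1f} millennia")
```

Under these assumptions, one millennium of Riksmal-style change would be misread as more than five millennia, which is the kind of spread that makes a single universal rate unusable for dating.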


In linguistics, lemma is also a term of art.


Maybe you just noticed it lately.

Term doesn't imply a specialized meaning. Term of art does. And it doesn't have a negative secondary meaning like jargon.


It’s shorthand for “in context, that might not mean what you think it means.” I would use term of art to distinguish something that has a particular meaning for the context. For example “clobber” has a useful meaning, a meaning that many people use, but when programmers use it, it’s a term of art that means something specific and non-obvious.


Yes..it's a term in programming, a programming term.

None of these replies justifying "term of art" seem at all convincing to me. "Term doesn't imply a specialized meaning" – but I think it does here: if "clobber" didn't mean something different in programming, you wouldn't (need to) say "it's a programming term". "It's a programming term" precisely means "the term has a meaning in programming different to what it commonly means". The listener already knows it's a word in everyday English, which is apparently all "term of art" adds.


To my ears, "term of art" has a certain connotation of precision that the alternatives lack. Not just "this means something you might not guess" but "it has a very particular, well-defined meaning you might not guess."


> I see "term of art" everywhere on HN lately. Why?!

Probably a combination of the Baader-Meinhof phenomenon and virality (meme).


"Term of art" is a vector of meaning.


Seems like linguistics is the perfect arena for exploiting puns. Term of art I think not when they miss a perfect chance for artistry.


Maybe because "V" and "B" used to be pronounced the same ;-)


It certainly reinforces what I know about techies that I just had to scroll all the way down to here to get past a group of them not only missing the point of a joke, but then arguing with each other pedantically over the meaning of words that add nothing to the discussion.

Is bikeshedding a habit or a temperament?

Glad to see someone managed to get the joke, anyway.


But everyone got the joke without the need for the follow up comment? That’s why it’s the top comment.


In Spanish they still are.


Canadian English is going through a vowel shift right now. The shift is much smaller than the great vowel shift, but still notable.

https://www.macleans.ca/society/life/in-the-midst-of-the-can...


In the US there is a Northern Cities Vowel shift going on right now, that this linguist claims is the biggest vowel shift since “The Great Vowel Shift.” See this video starting at 7:10 https://youtu.be/IsE_8j5RL3k

Maybe related to what’s happening in Canadian English?


Huh, glad that is real, and not just my ears.

I’m an American, recently back from 2 years in Toronto, and the Toronto accent is just not the stereotypical Canadian accent that I imagined I’d hear. Far less Minnesota, a bit more New Jersey.


Go to Calgary for that stereotypical Minnesota - with a bit of Texas for fun.


Tron funkin' Blow (https://youtu.be/XARhrIRF85A)


Thanks for the link, it was fascinating!


For anyone wondering about the field that studies this sort of phenomenon, welcome to the wonderful field of historical linguistics!

Historical linguistics is a really cool intersection between anthropology and linguistics - in short the idea is we can look at languages around today and use certain well-founded assumptions about how languages change to understand the languages of the past; essentially the words we speak and sign today are the fossils of our linguistic history!

Suggested reading is Historical Linguistics by Lyle Campbell, and Language Files for a more general Linguistics textbook.


Moreover, there wasn't just one or a few shifts: they happen all the time and there were plenty of them. Apparently a big discovery in etymology (in the 19th century iirc, and I forget the guy's name) was that words mostly don't change pronunciation individually, but the same parts in different words change in the same way. This makes it possible to track those changes over time and thus work backwards to how modern words would have sounded in the past.

This is why folk etymology usually misses the mark, since it focuses on single similar words that in fact often turn out to be unrelated. Also afaik spelling is irrelevant for etymology, at least until very recent times.
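
The regularity described above (the same sound changing the same way in every word that contains it) can be sketched as a rule table applied to a whole lexicon. The snippet below is a rough illustration, not a reconstruction: the `SHIFT` table is a heavily simplified set of Great Vowel Shift correspondences, the broad transcriptions and the `apply_shift` helper are assumptions made for the example, and real sound changes are conditioned by context and happen in stages.

```python
import re

# Simplified long-vowel correspondences, Middle English -> Modern English.
# Illustrative only: the real shift went through intermediate stages.
SHIFT = {
    "iː": "aɪ",  # bite
    "eː": "iː",  # meet
    "aː": "eɪ",  # name
    "uː": "aʊ",  # house
    "oː": "uː",  # boot
    "ɔː": "oʊ",  # boat
}

# One alternation over all source vowels, so every rule is applied in a
# single pass and the output of one rule is never fed into another.
_PATTERN = re.compile("|".join(re.escape(vowel) for vowel in SHIFT))


def apply_shift(transcription: str) -> str:
    """Apply every correspondence in SHIFT to one transcription at once."""
    return _PATTERN.sub(lambda match: SHIFT[match.group(0)], transcription)


# Hypothetical broad transcriptions of Middle English forms.
LEXICON = {
    "bite": "biːt",
    "meet": "meːt",
    "name": "naːm",
    "house": "huːs",
    "boot": "boːt",
    "boat": "bɔːt",
}

for word, older_form in LEXICON.items():
    print(f"{word}: {older_form} -> {apply_shift(older_form)}")
```

Because every word goes through the same table, `biːt` and `huːs` cannot shift independently of each other, and that is what lets the correspondences be run backwards from modern forms.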


Is this Grimm's Law, by Jacob Grimm of "Grimm's Fairy Tales" fame?


Sure sounds like it. Though I'm fairly certain that I wouldn't forget Grimm's name like that (the tales are a childhood staple where I am), so maybe Rasmus Rask or Karl Verner was mentioned instead when I heard the general concept being attributed at all. Which, rather weirdly, was only one time in a lecture by Andrey Zaliznyak on YouTube, even though I heard of how shifts work before and after that; you'd think such an important discovery would at least more often bear the name of its author.


The very same! He wrote a whole series of books about German — a history of German, a grammar of German, a dictionary of German, and, yes, a collection of old German stories.



related, but has anyone noticed younger Americans pronouncing a few specific types of words differently as of late?

I'm 29 and grew up in the Midwest, so in my accent, when I say "button," the "tt" is nearly silent. however, at my last job doing remote web development, one guy who was a bit younger than me (and also American) would pronounce it "BUH 'in," with noticeable "stop" between the two syllables. I have since noticed this in other younger Americans as well, and for other words I cannot recall right now but with the same general pattern.


Glottalization of T https://linguistics.byu.edu/faculty/deddingt/t-glottalizatio...

My anecdata says we covered this dialect in intro linguistics back in the 80s. I have been hearing it from New Jersey natives for a long time.


That sounds like the glottal stop you hear quite a bit in England. Bu'hu for "butter", wa'hu for "water" and so on. I've been in heated discussions with Englishmen arguing my pronouncing the Ts in the middle of words is incorrect. I personally hold to the notion that at the very least if you go to the length of putting the letter T *twice* in the same spot, it really wants to be pronounced.


That is the rule for RP. Lots of English are afraid of sounding posh and avoid it. Estuary English is what the elite kids speak these days so they can blend in with the working class.


I tend to associate this with Brits. I have definitely noticed a rise in young Americans doing this in the last year or two though.

The typical American "nearly silent" one you are describing tends to be more of a flapped /ɾ/, by the way. <d> is often the same.


American young person here: I do the glottal stop thing with "button" but a flapped r with "butter." I think it has to do with the "n" - I would also do the glottal stop for "bitten" but not for "bitter." For me the second syllable is a pure syllabic /n/ - no preceding vowel or consonant.


> The typical American "nearly silent" one you are describing tends to be more of a flapped /ɾ/, by the way. <d> is often the same.

yes, exactly! also out here in South Dakota we specifically pronounce the "t" in "Dakota" as a "d," I've noticed.


Voiceless phonemes(?) tend to become voiced when they're between two vowels. Boise natives hypercorrect to say Boy-see, when most folks from outside the area say Boy-zee.

Historic shifts like that are loafes -> loaves.


Midwesterner here but if I try to say Dakota with a t sound, I feel like I'm doing a British accent or something.


Also originally a Midwesterner. I've noticed that I and others with the accent also put a compensatory breathy h on the end of words like Dakota, and hence pronounce it "Dakodah". Often this results in a devoicing of that second "a". I've never seen any studies on that though.


Is that the same thing as "D-Troit" versus "Duh-Troit" for Detroit? That's one of my favorites.


The pronunciation "dee-troit" is just local slang meant to sound folksy. It's used by sports announcers and singers, and for comedic value. Most of the time people in the area pronounce it the same way as the rest of the country pronounces it.


I think you're right there. But also going back, there was a more 'neutral' pronunciation of "deh-troit". I believe from there, we started getting "duh-troit" as a type of Schwa[1]. Again, I agree that there's a newer "DEE-troit" which is kind of like if you drew a line from "duh-troit" through "deh-troit" and kept going, then you'd get "DEE-troit".

[1]https://en.wikipedia.org/wiki/Schwa


Funny enough, this has almost become the common British pronunciation of most "hard t" words. It's quite common for people to say "a bottle of water" like "a boh-ell of wah-er", or "butter" like "buh-ah". The sound comes from the back of the throat instead of the tip of the tongue.


I had the impression such is considered "lower class" talk, for lack of a PC way to describe it. Please hear me out before giving me negative points.

If you are "educated", then you are "supposed to" pronounce the middle consonants, as skipping them is considered "lazy". I'm not making a value judgement, but reporting that this is what parents often tell their kids in private. All things being equal, parents want their children to sound wealthy and well educated. Similar for certain rural or "redneck" talk in the US. Example: "Murica" instead of "America". My mother used to lecture me against certain verbal shortcuts so that I didn't "sound ignorant". Her words, not mine.


It's no big secret that different social groups exhibit language variation, and that while there's nothing inherently lesser about certain dialects, they are nonetheless coded as e.g. low or high status.

You might feel a lot more comfortable talking about this subject and find useful alternatives to your scare-quoted words if you skim https://en.wikipedia.org/wiki/Sociolinguistics

Also, due to https://en.wikipedia.org/wiki/Hypercorrection , I think https://en.wikipedia.org/wiki/Rhoticity_in_English is a fascinating and accessible example. A terrible summary: dropping the R is lazy (dropping anything is "lazy"), but at the same time sounds British and therefore fancy to some American ears. But other people, to avoid laziness, add Rs. But then their speech can sound low-status in some cases too.


It is (or was) considered "lower-class" to drop your Ts in the UK. But these days, it's not fashionable to talk like Boris Johnson, even if you went to Eton, so many upper-middle class people emulate aspects of "lower-class" speech, including the glottalization of "t" when it's followed by an unstressed syllable (like water). Even someone as indisputably posh as Prince Harry will T-glottalize on occasion.


Not being posh has been fashionable before. A fashion for dropping terminal g existed among the upper classes in the UK in the 1920s. Dorothy L. Sayers's Lord Peter Wimsey did this, and it was fascinating to hear this affectation in the 1970s BBC adaptations with Ian Carmichael. https://www.irishtimes.com/opinion/no-g-men-frank-mcnally-on... Also, there is a backlash against Received Pronunciation as inauthentic. Now I enjoy BBC announcers with regional accents, unlike the voices I heard on BBC in the seventies and eighties.


Famously in "huntin', shootin', and fishin'." But I'm not sure it's the upper classes emulating the lower. It seems to be parallel evolution from different forms of Old English [0]. Which makes sense; it's unlikely an upper-class English person of the 1920s would be motivated to copy lower-class speech patterns. Bertie Wooster wouldn't want to sound like his scullery maid.

[0]: https://books.google.co.uk/books?id=YMS3AwAAQBAJ&pg=PT26&lpg...


> Now I enjoy BBC announcers with regional accents, unlike the voices I heard on BBC in the seventies and eighties.

I agree. The downside of the loosening of the old standard is heard in the number of actors who mumble through their lines. I wouldn't want to go back to a time when RADA enforced a kind of RP but I would like them to focus just as hard on diction.


>but reporting that this is what parents often tell their kids in private //

I correct my children's pronunciations. I've honestly never thought of it as a class thing. In part it's "inherited", so it could have consciously been a class thing for one of my ancestors. But really to me it just seems necessary for preliterate speakers to understand the proper way to say a word based on its spelling, which they don't know. I find myself correcting USA-ian word use ("garbage") and pronunciation far more with my youngest than I had to at that age with my oldest ... we probably let him watch too much TV (several shows, like Paw Patrol have shifted to USA versions from British English versions).

It's hard to discern every phoneme from a new word, and hard to say some of them.

I do shift my accent, and vocabulary, to mark myself as a local I guess (when I'm back in that area of the country). If I get to use certain words in their localised (to about a 20mi long area) meaning it makes me happy for some reason. But I've never consciously worked on my accent to sound more/less posh; but have modified it to be more understood.

My wife is what I call a 'sympathetic speaker', she very quickly adopts the accent of those she's speaking with. An interesting phenomenon.


Re: My wife is what I call a 'sympathetic speaker', she very quickly adopts the accent of those she's speaking with.

Politicians are often known to do the same. It's usually painted as "pandering", but I'm not making a value judgement here. There's arguments both for and against.


it's called a glottal stop, it's also pretty common in Irish accents to substitute this for a T


I've noticed this recently among some of my American colleagues, and it's so obvious to me that at first I thought they were imitating an English accent. To my English ears, standard American pronounces the "t"s in butter and water as "d", and I hadn't realised that some accents do have the glottal "t". I think it is mostly in people from the East coast.


spot-on in your analysis & as is mentioned elsewhere in the replies, I think it might be a definite east-coast influence on the rest of the US population as a result of video-based social media. at least, that's my running theory!


I don’t think it is a new phenomenon in the language. I associate this with Northeast US accents and many British accents. It may be that regional accents which use a glottal stop for this pattern are leaking into younger people’s accents around you simply because YouTube and TikTok are making less common accents heard more widely.


I'm 46, also grew up in midwest, and I say "BUH 'in".


Lexicon Valley covers these sorts of things regularly. And I think he takes ideas from listeners too!


I think I've always said with a stop. I'm from the East Coast, though.


I notice s being pronounced sh. It shtrikes me as more frequent now.


while we're on the topic, my younger-than-me girlfriend from Idaho says "warsh" for "wash," which until now I fully associated with only people the age of my late grandmother. truly, the wide gamut of American accents is something to behold!


Interesting how in a "micro" sense we know people knew it was happening ("young people today..." nothing changes). Yet it's a 300-year process.

One of those "I was there" -no, you weren't things. Like sea level rise. You see a bit of it. The totality is a story which spans generations. Nobody owns all of it.


And yet, people experience changes in accents during their own life; they hear recordings from themselves from the 70's and think "did I really talk like that?".

One thing though, take older movies with a grain of salt; in old movies, actors were taught a specific accent (called the Mid-Atlantic accent, https://en.wikipedia.org/wiki/Mid-Atlantic_accent).


It's a sequence of shifts. I guess, significant numbers of people get to experience maybe one shift in a chain.

Modern example. Router. Two forms, the British one rooter is being deprecated strongly for rowter.

Etc is etcetera. Now, it's etsy.


> Etc is etcetera. Now, it's etsy.

I can't say I've ever heard that, anywhere.

Definitely have heard a lot of "esetra" / "esedra", though.


/etc etsy has become more current in my (oz) community. there's still a divergence between /sbin (s'bin) and s-bin ess-bin

/tmp is pretty much temp to everyone. /var vahr not vair. and luckily, we all think soodoo is pronounced the same. Oh wait, soodough. Damn.


If anyone finds this subject interesting, I recommend The History of English Podcast: https://historyofenglishpodcast.com.


Can't upvote this enough; it's a great podcast. If this seems just a little too esoteric, might I appeal to everybody's prurient side and recommend Chaucer's Vulgar Tongue [0] as a gateway episode?

[0] https://historyofenglishpodcast.com/2019/09/25/episode-129-c...


Oh wow, I've never seen anyone rep that podcast elsewhere, but I highly recommend it. It's absolutely fascinating. I'm not much of a podcast listener, but this one pretty much immediately captured my attention and has held it (I'm now ~70 episodes in IIRC).


Also the ITV series "The Adventure of English" with Melvyn Bragg. They're all on YT.


A great podcast that has started a personal hobby/interest in linguistics and etymology.

I love that history is able to break your notion of what is or isn't possible and this podcast is great at that; it repeatedly shows how no language is set in stone, it is a human construct, and how languages are interrelated.


That is one of my favorite podcasts! I often go back and listen to older episodes.


And ‘Mother Tongue’ by Bill Bryson.


Please don’t recommend that book. Bill Bryson has no training in linguistics and that book is so bad that there is a factual error on virtually every page.


My hero!


I'm amazed! Late Middle English ‘boat’ sounds just like modern Norwegian ‘båt.’ :p I heard the stories that Scandinavians could go to Britain and it would only take a short while to get used to the language. Kind of like Norwegians going to Denmark today. In writing, Danish and Norwegian are almost the same, but it takes some getting used to the pronunciation.

There's a fantastic dialogue illustrating the similarity of Old Danish / Norse vs Old English by Jackson Crawford and Simon Roper: https://www.youtube.com/watch?v=DKzJEIUSWtc


My mother’s family is from the Durham area in the north east so I’m very familiar with the broadly Geordie accent. To me Norwegian sounds a bit like German spoken by a Geordie.

It’s not really surprising considering that region of England used to be colonised by “Danes” and was called The Danelaw before the Norman conquest, as against the south Germanic tribes that settled southern England. I suppose the Norman French influence peters out the further north you go.


Wow, so English sounded just like a Scandinavian language! I knew Old English words are often similar to the equivalent modern Scandinavian languages in writing (Swedish, Norwegian, Danish) but I was not sure if they sounded similar.


Well, to be fair, Danish doesn't really sound so similar to Norwegian, but since most of the words are written the exact same way, it's usually not hard to decipher the pronunciation, especially after a day or two among them (I'm Norwegian). We've got some pretty special old mountain dialects in Norway too. Technically they are Norwegian, but it's nigh impossible to understand a word they say lol.


Any non-native English speakers feeling like: So is that why we need to suffer a lingua franca that does not map 1:1 spoken and written?

This must be the world's largest tech debt. :)))


John McWhorter has argued a few times on the Lexicon Valley podcast that although English has many problems (especially spelling), it's not a terrible lingua franca. Many complexities from other languages do not exist in it.

Obviously, any artificial language is going to be much simpler. But these have never caught on for a variety of reasons.


English is a great lingua franca IMO because of its inherent decentralization and willingness to accept new words. There is no academy regulating its use, as is the case with French and Spanish.


In the case of Spanish, I can't speak for French, having an academy (the RAE in this case) regulating the language sounds like a problem on paper, but really isn't a problem in practice. The RAE accepts Americanisms and treats American Spanish just as it would European Spanish (referring to the continent(s), not the country) and encourages linguistic diversity. Anglicisms are only discouraged when an acceptable Spanish alternative is in WIDE use (but never prohibited, as can be evidenced by the sheer quantity of anglicisms in the Spanish language.)

Spanish is very willing to accept new words, and as diverse as English in terms of decentralization. Grammar doesn't make or break a lingua franca, number of speakers does, which is where English really shines.

So what does the RAE do you ask? They write grammars and compile dictionaries, describe phonology and answer people's questions on Twitter. Just like what Merriam Webster or Oxford would do, but the RAE has official backing and creates consensus among the hispanophone countries. English is a regulated language, just not officially regulated.


I'd say it's more accurate to describe Webster and Oxford as documenting established usage, rather than being informal regulators. They generally view their role as being purely observers and not as active influencers of the language.

Some other language authorities do not take a usage-evidence-based approach to defining their dictionaries, and take into account cultural or historical concerns.


Oh how I envy you. Here in Korea we are stuck with the tax-funded imbeciles of the National Institute of Korean Language, whose hobby is saying "ALL you Koreans are using the Korean language wrong!" with a straight face at every occasion.


Similarly in Norway the authority in charge often proposes Norwegian alternatives to Anglicisms, but they'll yield if an import becomes dominant, and will even reverse official reforms if they fail to gain traction.


I wish the French Academy were this tolerant.


The problem of the French Academy isn't that it's intolerant, there are linguistic bodies in Europe that are far more conservative, it's that it's too slow and simply cannot keep up.


All languages are decentralized regardless of regulation by official government bodies.

The role of regulation is mostly a legal role employed by countries that use a civil legal system. Most English speaking countries use a common law legal system and so interpretation of words is fairly fluid and subject to interpretation by the courts. Civil law does not have this kind of flexibility and room for interpretation, so courts will make use of regulatory bodies to uniformly interpret language.

This is why almost all countries with a civil law system have a regulatory body, while countries with a common law system do not.


To be honest, I still wish it were Spanish, which is easier to learn/read and pronunciation of which is somehow easier for most people (except for native English speakers, as I've noticed).


Surely it will be a Romance/Germanic split? So Spanish is easier for speakers of French, Portuguese, Italian and Romanian, and English for speakers of Dutch, German, Swedish, Norwegian and Danish?


What I've noticed while working on most continents is that when people learn just a little English, they can communicate better than I can if I try to learn their language. The reason is, I think, that English grammar is, in a way, so redundant that even if the speaker uses wrong grammar all the time the meaning often doesn't change. You can say "I is", which is wrong, but there's no problem with understanding (at least not for non-native English speakers who are so used to hearing all kinds of wrong grammar). But make a single mistake in a Romance language like Italian, and suddenly your sentence "I'm stupid" now means "You're stupid". The grammar matters. In English the grammar, for the most part (not for everything, of course), is about correctness more than meaning - which is why the language is so robust, which is one reason it works as a lingua franca: A little English knowledge goes a long way.


Not sure about those others but the Germans that I know struggle with English just like everybody else. (Despite common roots German and English are very different, and, in fact, I find the written Spanish and even French to be quite close to English, and rather easy to understand if you know English and learn a few words that are different or spelled differently; not so with German.)


I don't know John McWhorter, but is this the "second languages are hard" argument? I tend to get annoyed when people compare the relative difficulty of learning a language they spoke 1636518 hours vs ... 18 hours.


McWhorter is a linguist (as well as a political commentator). His argument isn't that English is easier because people already speak it natively. Instead, English lost a lot of its complexity (especially around suffixes) during the Viking invasions when it was taken up by adult speakers.

This doesn't mean that learning the language is easy--no language really is. And English has some things that make it harder to master, especially its very large vocabulary.

Some of it is covered in this recent episode: https://slate.com/podcasts/lexicon-valley/2021/03/english-la...


English's large vocabulary actually makes it easier as a second language.

Adults are really good at learning new words, but struggle with morphological and gender systems. Complex morphology seems to have some benefits that push languages towards including it, but it is only sustainable when those languages are predominantly learned by children, who can pick up morphological complexity easily, not adults, who can't.


> Adults are really good at learning new words

You have no idea :)


From a Slavic perspective, the really complex thing about English grammar is the sequence of tenses.

Having spoken English for almost 30 years by now, I am still not sure if "X has died" or "X died" is correct, or in which context.

But the other direction would be worse. Languages like Czech, with its 20+ classes of declension, must be a true nightmare for any English native speaker to learn.


> Having spoken English for almost 30 years by now, I am still not sure if "X has died" or "X died" is correct, or in which context.

Okay, strictly speaking, this is a distinction in aspect, not tense. But colloquially, tense, aspect, and mood are all referred to as "tense", especially since the conflation is present in most Indo-European conjugation patterns.

"X has died" is the present perfect. The perfect aspect is kind of like the past tense in that it is referring to something that has happened. Indeed, the past perfect ("X had died") is usually described as "the past of the past". But in keeping the tense in the present, the present perfect means that the past occurrence has relevance to the present. This can carry a few connotations. It can be a recent past, especially if you use "just" as an infix (c.f., "X has just died"). Or it can highlight the consequences of the event having occurred (e.g., "Our lord has died. What will become of us now?"). In any case, the speaker is drawing the listener's attention to the connection between past and present when they use the perfect aspect.

So which is correct? "X died" you would expect to find more in a biographical context or maybe a novel. "X has died" would be common in a news report, or someone informing you that a loved one died not too long ago. Which is more correct in a given scenario can usually be informed by the dominant tense in surrounding text; after all "X has died" is the present tense, despite conveying an action that happened in the past. If there's not enough text to dictate a tense, then it's often the case that either form will end up being acceptable--it just sets the tense that will be used.


A good guideline I've heard is that you say "X has died/eaten/gone" but "X died/ate/went this morning." So simple past if there is a specific time attached and present perfect if it's a general statement. More examples:

"Have you [ever] been to New York?" "Did you go to New York last week?"

"Have you seen Star Wars?" "Did you see Star Wars this afternoon?"

Might not work in every circumstance but a good rule of thumb.


Even we don't know sometimes.


> especially its very large vocabulary

Vocabulary isn't a problem IMHO.

The fact that I have to learn each word twice (how to write it and how to say it) - is.

I was learning German for 3 years at school. After the first month I had no problems with pronunciation. Now after almost 2 decades of not using it I can still pronounce any German word I see.

I've been learning English since I was 10 or so. I'm 36 now. I still have many English words I know (and use correctly in writing) that I'm not sure how to pronounce.


> I still have many English words I know (and use correctly in writing) that I'm not sure how to pronounce.

Do you mind sharing some examples? As a native English speaker, I'm so curious! Do you think that if you heard them without seeing the word you'd realize what the written form was? Or might there be words where you know the written form and the spoken form and don't realize it's the same word?


For me it was "awry" and "lichen". I didn't even recognize "lichen" when I first heard it spoken (on an episode of QI).

There are quite a few others that tripped me up over the years like "cleanliness".

In general, "Chaos" aka "Dearest creature in creation" shows this problem (I would still struggle to read it even if I know every word there): https://pages.hep.wisc.edu/~jnb/charivarius.html


Yeah, I'm a native speaker and I still pronounce "awry" as "AH-ree" in my head sometimes. I think I saw it in print a lot before I heard it.


"Vocabulary" is one. I just checked it and I almost guessed right. I thought the u was more of an oo and the second a was ah not eh.

Parallel - for some reason the second a is eh not ah. Can't remember that, have to check it every time.

I play a lot of D&D over the internet in English and even as common a word as "sword" is for some reason hard to remember. Every time I have to guess if the "w" is pronounced or not.

been == bin ? - the rules for that are just evil

> Do you think that if you heard them without seeing the word you'd realize what the written form was?

Sure, from the context if not instantly. I listen to a lot of English media with different accents (I watched the whole Big Bang and IT Crowd and I listen to Critical Role when I'm commuting).

> might there be words where you know the written form and the spoken form and don't realize it's the same word?

Leicester and queue. But these are famous enough that I remember them now. I obviously won't be able to give you examples that I still haven't realized ;)


> the second a was ah not eh

> for some reason the second a is eh not ah

> been == bin ? - the rules for that are just evil

Actually, the rules are rather simple. They all have to do with unstressed syllables in English: unstressed vowels are reduced to /ə/ or /ɪ/ (the latter is what comes into play in your been -> bin case). Stress rules in English are not simple compared to other languages, and I can definitely see where non-native speakers might get confused.

One downside of the schwa reduction rules is that it can trip you up when you realize that you need to spell a word with a reduced vowel and you're not sure how it's actually written, because every vowel can be reduced to /ə/.


I once lost a bet because I couldn't believe "finite" and "infinite" sounded so different.

But the best examples are in the poem _The Chaos_ (http://ncf.idallen.com/english.html).

Through, though, throw, tough... I know a trough exists but I have no idea how it's pronounced.


Dr. Seuss has a lesser-known book called The Tough Coughs as He Ploughs the Dough (based on the observation that none of these -ough words rhyme -- the /tʌf/ /kɔfs/ as he /plaʊz/ the /doʊ/). (And yes, that's not even all of the sounds that are spelled by -ough, like /-u/ in "through".)

Trough is /tɹɔf/ and rhymes with "cough".


Or, “Steve was a fungi”.


Some examples from I Love Lucy : https://youtu.be/uZV40f0cXF4?t=9


epitome. I knew what the word meant written and spoken, but didn't realize it was the same word until well into adulthood.


This reminded me of "catastrophe". I'm still unsure :)


It poses a problem for native speakers as well. After all, there are no Spelling Bee competitions in France, or Germany ... or anywhere.


There are in Poland. I suspect that they are also elsewhere, any language will have at least some tricky words.

See https://pl.wikipedia.org/wiki/Og%C3%B3lnopolskie_Dyktando


In Polish the mapping letters->sounds is pretty simple, and nobody asks kids to pronounce a word (because anybody can do it after learning the general rules for a few weeks).

But the mapping sounds->letters isn't as obvious, because there's some tech debt there (ch == h, ó == u, rz == ż or sz depending on the preceding letter).

So if you know how the word sounds you're not always sure how it's spelled, but if you know how it's spelled you always know how it sounds.

In English the mapping is non-obvious both ways.
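A rough sketch of that asymmetry in Python, using only the pairs mentioned above (ch/h, ó/u, rz/ż). The sound labels and the helper name `spellings_for` are made up for illustration, the context-dependent "rz as sz" case is ignored, and this is nowhere near a full model of Polish spelling.

```python
# Toy model: each spelling unit maps to exactly one sound (letters -> sounds),
# but several spelling units can share a sound (sounds -> letters).
# Sound labels are informal placeholders, not a real transcription.
SPELLING_TO_SOUND = {
    "ch": "x", "h": "x",
    "ó": "u", "u": "u",
    "rz": "zh", "ż": "zh",
}

def spellings_for(sound):
    """Return every spelling unit in the toy table that yields the given sound."""
    return sorted(s for s, snd in SPELLING_TO_SOUND.items() if snd == sound)

# Reading is unambiguous: one lookup, one answer.
print(SPELLING_TO_SOUND["rz"])   # zh
# Writing is not: the same sound has two candidate spellings.
print(spellings_for("zh"))       # ['rz', 'ż']
```

For English, the table would need several sounds per spelling as well, so the lookup would be ambiguous in both directions.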


Here’s one: Chrząszcz.


There are in French


I don't see the point... One can easily spell any French word, there are hard rules.


Especially when things like ‘row’ [rou] and ‘row’ [rau] are both correct (but the meaning is different).


So, the good news is that lots of native English speakers don't know how to pronounce things either. Of course I know how to say all the words I speak frequently and hear others using - but if I use a word I haven't heard there's a risk I'll say it wrong. You just learn not to worry about it (the biggest problem really is when you unconsciously try to correct somebody else and realise you've got no basis for your assumed pronunciation).

For example, I'd seen adenovirus written down, but never heard it said out loud. I was describing the vaccine I'd had to friends, one of whom works in medicine (a doctor, but not of medicine), and she corrected my pronunciation because she's used that word plenty of times, so she (presumably) knows how to say it correctly.

Even for a completely native immersed speaker, there's just no clue in English how to correctly say a completely new word you've only seen written down, so you're at no disadvantage there. For "real" words there may be an etymological clue, but those aren't reliable. In fiction it's anything goes. Hearing fictional words I've read pronounced out loud in movies is as weird for me as seeing the (inevitable) transformation of a woman described as plain in the books into a beautiful Hollywood actress...

It's obviously a bigger problem with some common English words - either where they are actually two separate words with different pronunciations but the same spelling, or worse, one word but with different stress patterns. But once you've got a fair-sized vocab the new words you're learning won't have that sort of weirdness.

It's definitely true that if you're not confident, pronunciation can really be an obstacle. Fortunately the huge vocab helps again - a (non-English native but UK citizen) friend of mine will carefully choose to talk about liking the "seaside", never the "beach", because she's concerned she'll manage to make people think she said "bitch". She has a few other words like that, and in each case English provides convenient alternatives.


> never the "beach" because she's concerned she'll manage to make people think she said "bitch".

Is her native language Spanish?

It's really cool how you can have "blind spots" depending on your native language. To me, the difference between beach and bitch is huge, because my native language uses short and long vowels extensively, and there are tons of words that only differ in a single vowel length.

But at the same time, I have other blind spots in English. For example, I have to make an effort to remember to use sounding "s" and "j" where appropriate, and the lack of those is a dead give-away for identifying Swedish English speakers.


She could also be French, or a speaker of any other Romance language, or really any language that doesn't have the vowel [ɪ]. For speakers of such languages, "beach" and "bitch," as well as "sheet" and "shit," can be very hard to distinguish from one another.


Yeah it's amazing how bad people are at this without practice (including myself). Reading complicated / unfamiliar words aloud is a key skill in reading quizbowl questions (a type of trivia). Often these questions will have pronunciation guides to help but even then it can be a slog for some people. I've actually found TTS better than all but the most experienced readers for this task.


> You just learn not to worry about it

I don't know why but a lot of people my age do worry about it and get very uncomfortable guessing at a pronunciation of a new word. Doubly so for names. I guess it's an insecurity thing? I have no problem just going for it, with a little question tagged on or just an upward tone if I'm really unsure.


> talk about liking the "seaside" never the "beach" because she's concerned she'll manage to make people think she said "bitch".

Thanks for that anecdote, I somehow thought that I was alone in living with the fear of that happening :)


Also, try to avoid saying “it’s actually...”


Any lingua franca will end up deviating in pronunciation from how it's written. For example, Finnish is said to have very shallow orthographic depth (the technical term for this phenomenon), but you can imagine if people from across the world began speaking and writing Finnish that over 100 years you'd end up with different accents and different variations of the original Finnish. The language would branch and merge and branch in unpredictable ways as it gets incorporated into various cultures and you'd end up with a situation similar to English today.


I don't think Finnish (or any language in a state of diglossia) is the best example here. That is, Finnish is already essentially two separate languages. Written Finnish (kirjakieli) is a somewhat artificial language that was a compromise between different dialects, because at the birth of Finnish literacy there was no longer an "original Finnish" on which to base writing.

Written Finnish is mainly a written thing, and while there is a close correspondence between the orthography and how it would be read aloud, when Finns actually speak they use spoken Finnish (puhekieli), which isn't standardized and varies from region to region.


I am referring strictly to the correspondence between a written language and its pronunciation. The fact that people speak differently from how they read/write is an altogether separate matter.

When reading Finnish, one pronounces the words very closely to how the word is spelled, regardless of whether when they speak they do so in an altogether different manner.


Changes to the written representation of languages tend to be much more conservative than the changes in the spoken languages they're based upon. This is often due to the desire to preserve continuity in the ability for people to gain literacy without having to relearn a new system and retranslate all works that were written in the previous system. This generally lasts until the difference between speech and writing becomes bad enough to hinder literacy, at which point script reforms may happen. Many of the more "phonetic" writing systems encountered in continental European languages were due to the fact that they had relatively more recent script reforms that made spellings closer to how their words were pronounced in their modern languages. Written English, on the other hand, still tends to conserve spelling that reflects older pronunciations.

An example of a language that has it worse than English is Tibetan, which hasn't had a script reform since 800: https://www.youtube.com/watch?v=btn0-Vce5ug


The other side of that coin being that the more the script is reformed, the less accessible to non-specialists become a culture’s great works of literature. Shakespeare being the obvious example.


> So is that why we need to suffer a lingua franca that does not map 1:1 spoken and written?

I think the fact that it is a lingua franca is actually one of the main reasons keeping any spelling reform from occurring. It's in far too widespread use and there isn't any centralized authority that would do the spelling reform. Maybe take solace in the likely fact that written English and spoken English will probably only be _more_ different as time goes forward. In other words, you have it easier than all future generations.

Also, as much as spelling matching pronunciation is a convenience, it isn't really necessary. The variety of spoken Chinese languages using the same characters is greater than that of the spoken Romance languages. Maybe English is really slowly becoming more character-like over time. There are languages that can undergo spelling reforms, and there are languages people actually use.


To be considered a competent reader of a standard newspaper you need 6 years of education in English, while only 5 in Spanish. I understand Japanese requires 9 years of school for the same feat (I don't know how to verify this claim, but it seems reasonable).

There is a very real cost to not mapping spoken and written languages: kids need to spend more time in school learning basic reading skills - time that could be spent either learning something else, or playing.


In the region, countries like Korea and Vietnam did move away from Chinese characters - though I'm not sure what impact that's had on educational outcomes and time spent studying different stuff.


An old book "The Fifth Generation Fallacy" (Unger) addresses this somewhat.

As I recall it speaks about the desire by the Japanese leadership of the time to build AI translation and relates it to the challenge of full literacy in Japanese due to the requirement of learning 4 alphabets: Hiragana, Katakana, Kanji and Romaji.


It really is a shame Spanish didn’t become the lingua franca since it does have almost a 1:1 spoken and written.

Of course, it could have been worse. We could have ended up with French as the lingua franca (yes, I know what franca means) where there is almost no correlation between written and spoken language.


This might be somewhat subjective, as I don't know how you'd measure correlation between spoken and written, but French seems to have a much higher match between written and spoken language than English.

Going from spelling to pronunciation in French follows (admittedly complex) rules that are rarely broken except for common words (or endings such as -ent). Vowel pronunciations for a given spelling are far more variable in English, and often depend on the etymology of the word. Plus, English has word-level stress that is not marked in writing (French has none, and it's marked in Spanish), and moving the stress will usually make a word unintelligible! That alone makes writing => pronunciation very difficult.


French spelling is unidirectional. I can comfortably say any French word (albeit with a couple key exceptions, such as "et" or "clef", that break rules), but I can't reliably go from someone talking to how to spell it. "Eaux", "eau", "au", and "aux", or alternatively e.g. "ou", "oux", etc., all have identical pronunciations, but different spellings.

Unsurprisingly, we can vaguely quantify this by looking at dyslexia amongst languages. English and various Southeast Asian languages that rely on Chinese ideographs are by far the worst, followed by things like Arabic, French, Hebrew, and German that have fewer exceptions but less guidance, and then followed last by things like Spanish, Cherokee, and so on that are truly one-to-one.


> how you'd measure correlation between spoken and written

There are a number of ways currently used, but I have a new one to propose: compare the size of two G2P models (1 for each language), which have similar RMS errors. Assuming they are generated using similar techniques, the one which requires the bigger model probably has a less clean phoneme-to-grapheme correspondence.


It's not subjective; French is better than English in this regard, and Spanish is better than French. English has more complex pronunciation rules and many, many more exceptions than those languages.


I confess being super happy that English is not so hung up on the gender of words. So, it has that advantage over Spanish.


The phoneme-grapheme correspondence in Spanish is better than English, but let's not pretend it is 1:1. Does it account for assimilation in rapid speech? Does it account for coarticulation of adjacent consonants? Does it account for regional/dialectal variation? Does it account for secondary articulation?

Even ignoring all of these, it's clearly not bijective. For example:

C --> /k/, /θ/

Z --> /θ/ [0]

K --> /k/

Q --> /k/

G --> /ɡ/, /x/

J --> /x/

N --> /n/ (with several distinct secondary articulations), /m/ (rarely)

M --> /m/

R --> Can be tapped or trilled.

Etc. You can see many more bijection-failures here: [1]
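To make the "not bijective" point concrete, here is a minimal Python sketch that checks the letter-to-phoneme table above in both directions. The function name `bijection_failures` is just illustrative, and R and the rarer realization of N are left out to keep it small, so treat it as a toy rather than a description of Spanish.

```python
from collections import defaultdict

# Letter -> phonemes, copied from the list above (simplified: R and the
# rare /m/ realization of N are omitted).
LETTER_TO_PHONEMES = {
    "C": ["k", "θ"],
    "Z": ["θ"],
    "K": ["k"],
    "Q": ["k"],
    "G": ["ɡ", "x"],
    "J": ["x"],
    "N": ["n"],
    "M": ["m"],
}

def bijection_failures(mapping):
    """Return (letters with several phonemes, phonemes with several letters)."""
    phoneme_to_letters = defaultdict(list)
    for letter, phonemes in mapping.items():
        for p in phonemes:
            phoneme_to_letters[p].append(letter)
    one_to_many = {l: ps for l, ps in mapping.items() if len(ps) > 1}
    many_to_one = {p: ls for p, ls in phoneme_to_letters.items() if len(ls) > 1}
    return one_to_many, many_to_one

one_to_many, many_to_one = bijection_failures(LETTER_TO_PHONEMES)
print(one_to_many)   # letters that are ambiguous to read (out of context)
print(many_to_one)   # phonemes that are ambiguous to spell
```

Both dictionaries come back non-empty for this table, which is all "not bijective" means here; it says nothing yet about how much context resolves the ambiguity.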

I am being intentionally unfair to Spanish (which truly does have a much, much better phoneme-grapheme correspondence than English[2]), mostly to illustrate the point that there aren't really any languages which have a 1:1 mapping between spellings and pronunciations. Even if you decide to use the IPA to write your language, non-standard dialects end up needing to read words that don't match their pronunciations. What happens when inevitably the language undergoes change - do we update all of the books to use the 'new' spellings of words?

The ideal orthography shouldn't be completely 1:1, but it should be relatively shallow. From that perspective, Spanish orthography is a fairly attractive option.

[0] The non-1:1 situation with /θ/ gets much worse in most dialects of Spanish, where it is not distinguished from /s/. See: https://en.wikipedia.org/wiki/Phonological_history_of_Spanis...

[1] https://en.wikipedia.org/wiki/Spanish_orthography#Alphabet_i...

[2] Look at how effective Spanish-speakers are at reading without "decoding" compared with Portuguese, which also has a good p-g correspondence. In particular, look how much faster the Spanish students are at pseudowords, on page 141: https://www.academia.edu/17872463/Differences_in_reading_acq...


Most of these examples are unambiguous in context. For example, C is always pronounced /k/ when preceding A, O, U, and always pronounced /s/ or /θ/ depending on the dialect when preceding E and I.

The function from written Spanish to spoken Spanish (provided we are talking about a single dialect) is surjective, but darn close to bijective, especially if we exclude words of recent foreign origin.
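As a sketch of what "unambiguous in context" means, the C rule above fits in a few lines of Python. The function name, the `dialect` parameter and its values are hypothetical, the CH digraph is ignored, and treating every position other than before E/I as /k/ is my simplifying assumption.

```python
def pronounce_c(word, index, dialect="castilian"):
    """Toy rule for the letter C at word[index].

    Before E or I (with or without an accent mark) it is /θ/, or /s/ in
    dialects that merge the two; in every other position this sketch
    returns /k/. The digraph CH is deliberately not handled.
    """
    if word[index].lower() != "c":
        raise ValueError("index does not point at a C")
    nxt = word[index + 1].lower() if index + 1 < len(word) else ""
    if nxt in ("e", "i", "é", "í"):
        return "θ" if dialect == "castilian" else "s"
    return "k"

print(pronounce_c("casa", 0))                   # k
print(pronounce_c("cena", 0))                   # θ
print(pronounce_c("cena", 0, dialect="seseo"))  # s
```

The point is that the reader only needs the next letter and the dialect to decide; nothing comparable works for, say, English "ough".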


From Mark Rosenfelder (https://www.zompist.com/spell.html):

> Many people expect … to predict the spelling from the pronunciations-- not realizing that few orthographies meet this goal. It's far from true of Spanish, for instance, which is often held up as an example of a good orthography. I stopped fervently admiring Spanish orthography when I saw a sign in a Mexican bakery with about one spelling mistake every third word.

So, no, hardly bijective!


Well, rules can be simple and there can be few exceptions and people will still screw up, so I don't think that anecdote proves anything. In any case, my claim is that there is exactly one possible pronunciation per correctly spelled Spanish word. The opposite direction is not quite 1:1, but again, it's very close, and anyway it's far closer than English.


Sure, that I can agree with.


Laughs in Chilean or Andalusian rap god dialect


It was my second language, so I started out assuming that this was a problem with any language you'd learn.

I do, however, wish native English speakers were more aware of this for their own sake. Most seem to default to pronouncing vowels in foreign words as if they were English, whereas they'd be much closer to the correct pronunciation if they defaulted to pronouncing them like words from any language other than English they might know even a little of. To me this should be one of the first things you learn when learning your second language as a native English speaker. It even holds true for romanizations of Asian languages like pinyin for Chinese or Hepburn for Japanese.


It really is a shame Spanish didn’t become the lingua franca since it does have almost a 1:1 spoken and written.


Well, that, and no one wanting to update the orthography. Though speaking of lingua franca, that language also has a very outdated orthography.


This is what frustrates me about grammar fanatics. Clearly language changes and evolves, and grammar is defined as standard usage, not some rule formed 50 years ago that no one follows in common speech. In short, grammar is fundamentally descriptive, not prescriptive.


It's both. Language changes, but at any particular time, there's some consensus about what's correct and incorrect. You can ignore it if you want, but people will think you're ignorant.


There may be an elite consensus, but more often than not that can differ from what is actually being spoken.

For example, in linguistics studies AAVE (or “Ebonics”) is considered like other dialects to basically have a grammar of its own that is internally consistent, and it is the first English of a fairly large population.

That’s before we get into which English is the “correct” one; there’s British English, American English, etc. for the US and CANZUK, but there’s also Indian English, Singaporean English, Euro English, etc.


Yes, dialects exist. But within any given dialect, there's some consensus about what's correct and incorrect. You can call that elitist if you want.


Use of "whom" is a great example of this. If you use it correctly (prescriptively) then people think you sound strange.

There aren't just descriptive and prescriptive approaches: utilitarian and apathetic approaches seem distinct (perhaps they're on a different axis).


I learned about another vowel shift yesterday: the (US) "Northern Cities Vowel Shift" - https://en.wikipedia.org/wiki/Inland_Northern_American_Engli...


Very good demonstration of how that sounds: https://youtu.be/IsE_8j5RL3k?t=433


I wonder if our new age of global culture and free information exchange will slow down language mutations?

Surely tons of existing movies and songs and other media should give a clear idea of a speaking norm so spoken language doesn't deviate much from it. Or doesn't it?


The pre-shift vowels are quite similar to how modern Swedish is pronounced.


For the front ones, maybe. But for the back ones Old Swedish went through "stora vokaldansen" (the great vowel dance) leading to "ut" (out) being pronounced /ʉːt/ rather than /uːt/.


One interesting side effect of the Great Vowel Shift occurred when the word Uisce Beatha (Whisk(e)y) was anglicized twice. It was first brought in during the Elizabethan period from Ireland as "Whiskey" and later re-brought in as "Whisky" in Scotland. I've often wondered if that is the same reason behind surname anglicization of Ceallaigh to "Kelley" and later "Kelly".


Wikipedia tells me that

> Uisce beatha, literally "water of life", is the name for whiskey in Irish. It is derived from the Old Irish uisce ("water") and bethu ("life").

In France, many liquors used to be simply called "eau de vie". They now tend to each have their own name (origin and/or brand), but most people would still understand that you're not looking for water if you ask for some eau de vie.


I thought this fascinating video lecture[0], History of English - The Great Vowel Shift by Jürgen Handke, explains things very well, using the diagram and by his demonstrating all the vowel sounds in question. It even explains the origin of some other modern English accents.

[0] https://www.youtube.com/watch?v=zyhZ8NQOZeo


Interesting. It has always been hard to find a rule that dictates how vowels are pronounced in English words. This is a funny video about that:

https://www.tiktok.com/@therealivancohen/video/6921416585412...


Wow. I was (again) reading about this whole thing last week. Lot-cloth, Father-brother, cot-caught, Psalm by itself...

If you are interested to know more, take a look at the way native speakers of other languages utter "a" and "o" and their variations. Fascinating, to say the least.


The article doesn't explain what evidence we have of this. How do we know this happened?


It's a good question. It turns out that linguists can discover quite a lot about how some languages were pronounced before audio recording existed. I got some insight into this by reading the first chapter of an academic book on the pronunciation of Latin where there was a summary of the different kinds of evidence that can be used. Unfortunately I don't have the list to hand so the following is just my random brainstorming:

* the pronunciation of descendent languages, especially when there are lots of them, or lots of different dialects

* comparison with languages that have a common ancestor

* how words were transcribed from one language into another

* how words were changed when adopted from one language into another

* what spelling mistakes were made, particularly in cases where the writer was less educated or being less careful, such as graffiti

* rhyme and metre in poetry

* in literature, cases where someone is mocked for their pronunciation

* puns and wordplay in literature

* and of course cases where ancient authors have written more or less explicitly about pronunciation, either descriptively or prescriptively


> * the pronunciation of descendent languages, especially when there are lots of them, or lots of different dialects

It's worth calling this out as being one of the key elements of historical linguistics. Establishing genetic relations between languages requires proposing systematic sound shift laws that can explain why cognates sound the way they do, and this means that cognates in modern languages may not bear much resemblance to each other (English five and Sanskrit pañca are cognate yet share 0 sounds!). For example, there's a rule in the Germanic languages that shifts /k/ to /h/, so where Latin has "centum" and "canis", English has "hundred" and "hound" [1].

Now these pronunciation shifts often have caveats in them, such as shifting only before certain kinds of vowels or consonants. These restrictions can give you some clues as to why certain words seem to undergo a change while other words with seemingly similar pronunciations didn't. In Proto-Indo-European, this leads to the notion that there are several consonants (specifically, laryngeals) which are no longer present in any modern Indo-European language but whose existence in the original is responsible for sometimes shifting vowels that otherwise appear somewhat anomalous in descendant languages.

[1] To be clear, Latin is not the initial word and English is not the final word. I'm just using Latin to illustrate a word closer to the original Proto-Indo-European pronunciation and English to illustrate what the Proto-Germanic pronunciation shifts towards.


English orthography is a dead giveaway. For example, why does the letter "i" in most continental European languages represent the vowel /i/, while in English it represents a diphthong? The way French loanwords or Latinisms are pronounced in modern English is similar evidence.

Also, vowel chain shifts are exceedingly common across the languages of the world: compare Tatar and Kazakh to the other languages of their subgroup (Kipchak) within the Turkic family, for instance. Sometimes a vowel chain shift can even be observed in progress, as in the case of the Northern Cities Shift in the USA. [0]

[0] https://en.wikipedia.org/wiki/Inland_Northern_American_Engli...


I used to have an English lit professor who could read and conversationally speak Old English. To my ears, it sounded like a Lord of the Rings style movie, or the Silmarillion come to life. Very musical in a sort of imaginary elvish kind of way.


One of the things I like about Old English is kennings.

For instance, "battle-sweat" to mean blood.


Middle English, before the Great Vowel Shift, sounded much more Germanic: https://www.youtube.com/watch?v=GihrWuysnrc


Sadly the article doesn't explain how we know vowels were pronounced differently back then. There was no voice recording, obviously. Did they have some phonetic system that we can use to reproduce their pronunciation?


There are a lot of pieces of indirect evidence. They include:

* poetry that has a rhyme scheme, so you know that two words rhymed for the author

* puns and wordplay (Shakespeare is overflowing with this kind of thing) that only work if words were pronounced a certain way

* misspellings or variant spellings, which before the typewriter era would be almost entirely sound based (i.e. unlike typos in general)

* borrowings into other languages, which generally are "frozen" at the time of their borrowing and from then on start mutating by the rules of the other language

* related languages with cognate words, if you can devise a coherent set of sound-shift rules to reconstruct the shared ancestor-word and the intermediate forms along the way

* complaints from the older generation about how kids these days are mispronouncing words (these are always fun, and usually come with specific, if informal, descriptions of both the old and the new pronunciation)


We know that German, Swedish, etc. descended from the same old language that English descended from, and the pronunciation of those languages is very similar to the pre-shift pronunciation of English.


Some good previous discussion: https://news.ycombinator.com/item?id=16774428


I recommend reading the following article before reading that wiki post. We need to understand what a Vokal (vowel) means first.

https://github.com/yogurt-cultures/kefir/blob/master/kefir/p...

Also I shifted one Const sound in the wikipedia article.


Of course, when I first heard of this, it was referred to as the “Great Vowel Movement.”


The recordings sound very Dutch!


The Frisian languages are the closest living relatives to Old English, so perhaps there's a connection there?


Here's what it'd look like (taken from Reddit):

"The European Commission has just announced an agreement whereby English will be the official language of the European Union rather than German, which was the other possibility.

As part of the negotiations, the British Government conceded that English spelling had some room for improvement and has accepted a 5-year phase-in plan that would become known as "Euro-English".

In the first year, "s" will replace the soft "c". Sertainly, this will make the sivil servants jump with joy. The hard "c" will be dropped in favour of "k". This should klear up konfusion, and keyboards kan have one less letter.

There will be growing publik enthusiasm in the sekond year when the troublesome "ph" will be replaced with "f". This will make words like fotograf 20% shorter.

In the 3rd year, publik akseptanse of the new spelling kan be expekted to reach the stage where more komplikated changes are possible.

Governments will enkourage the removal of double letters which have always ben a deterent to akurate speling.

Also, al wil agre that the horibl mes of the silent "e" in the languag is disgrasful and it should go away.

By the 4th yer peopl wil be reseptiv to steps such as replasing "th" with "z" and "w" with "v".

During ze fifz yer, ze unesesary "o" kan be dropd from vords kontaining "ou" and after ziz fifz yer, ve vil hav a reil sensibl riten styl.

Zer vil be no mor trubl or difikultis and evrivun vil find it ezi TU understand ech oza. Ze drem of a united urop vil finali kum tru.

Und efter ze fifz yer, ve vil al be speking German like zey vunted in ze forst plas."


Believe it or not, that joke originated in a 1946 issue of Astounding Science Fiction. See https://news.ycombinator.com/item?id=26468884 and the links back from there.

(We detached this subthread from https://news.ycombinator.com/item?id=26809658.)


This is cute, up until the point where it starts reaching and makes changes which are important for the sound of the language.

The e in many words isn't silent, it's a modifier. Kit and kite are different words. "th", "z", "w", and "v" are all really different sounds. While certain accents (or children) do occasionally conflate them, in basically every case, it's technically incorrect. Zebra, Webra, Vebra, and Thebra are all totally different words to a native English speaker.

I'm absolutely behind the idea of simplified English where spelling and pronunciation match. But that's a lofty goal, as first one would have to canonize English, which is basically impossible at this point. Then they'd have to tackle homophones, like cot and caught (assuming canonized English has these pronounced the same).


So basically English spelling is trapped in a local minimum. Any small change will make spelling closer to some dialect, and further from most other dialects. The current situation is suboptimal for everyone, but at least it's a Nash equilibrium.


That's certainly true for things like vowels, but there are other situations where spelling could be changed with no ambiguity problems in any dialect. For example, basically every word that contains "gh" can be simplified, as <gh> represents a /x/ sound like German "Bach" that dropped out of the language in the 1200s. Nobody is confused by "tho" and "thru".

Another easy example is just fixing "island". The <s> was never pronounced. Medieval scribes put it there because they incorrectly guessed that the word was related to Latin "insula".


I personally think there's a lot of room for improvements that would work in nearly all English dialects. The "ough" mess, for example, could stand some clean up. But in general, yes, there is too much variation in English around the world to "fix" it now.


You're hitting the phonological analogue of Moravec's paradox: the "ough" mess is precisely the kind of thing that can't easily be simplified in a way that works across English dialects. It hits a wide swath of basic vocabulary that originally had a small range of slightly awkward pronunciations that diverged in descendant English. Different dialects don't split those pronunciations the same way.

The real low-hanging fruit is getting the British to give up on those spellings that follow a dead branch of French ;-)


Because 'Kite' should be spelt 'Kait'. The English I/E/AI vowel sounds are a massive mess. I realised this only after starting lessons in Japanese, where the vowel sounds are almost entirely regular. Ke always sounds like Kelp, and Ki always sounds like Kit and Ka will always sound like Kart.


I'd pronounce kite and kait differently I think. More rising on the latter. Being acceptably good at Dutch pronunciation, though non-native, I'd suggest keit, and I'm assuming that it's the same in German too.

But really, it should be kite because that's what it is :)


Well kite could be written as keit to match the pronunciation :)

For most non-native speakers, the backtracking of silent `e` is more confusing. It does not help in case of `sake` vs `saké` where most people do not add the acute mark and use context for disambiguation.


The Economist had a recent article on why spelling reform never really gains traction. Basically, those in power to change it have no incentive to do so: https://www.economist.com/books-and-arts/2021/04/10/why-its-...


Yes. In British English, one problem with any sort of spelling reform is that the pronunciations of many words have significant regional variations. For example, in a Southern English accent, words like Bath and Castle are pronounced with a long vowel (almost as if they were spelt 'Barth' and 'Carstle') compared with the much shorter Northern English versions ('baf', 'cassl').

To pick a particular pronunciation-based spelling of words would therefore be to prefer one region over another. This would at least trigger a monumental North vs. South argument, assuming that the plans survived the inevitable knee-jerk reactions / incredulity of the usual media suspects.

Probably best if we stick to arguing about daylight saving time, or changes to the format of cricket matches.


There are a few changes that I believe are universal. I'm thinking hard/soft "c" becoming "k"/"s" and soft "g" becoming "j". I don't think I've heard any dialects where there's a difference between whether you pronounce the "c" in a word as hard or soft.

But then again, those rules are pretty standard: c/g before e/i (the "skinny vowels") are almost always soft, and they are hard otherwise. If you need a soft "c" sound before an "a" then you just use the letter "s". So maybe it's not even worth the effort.
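To make that rule of thumb concrete for "c", here's a minimal sketch in Python. The function name `respell_c` is made up for illustration, treating "y" like e/i is my own assumption, and "ch" is left alone. It's a naive letter-by-letter pass (it would turn "back" into "bakk"), not a serious reform proposal.

```python
# Assumption: "y" counts as a "skinny vowel" alongside e and i.
SKINNY_VOWELS = {"e", "i", "y"}


def respell_c(word):
    """Respell 'c' as 's' before a skinny vowel and as 'k' otherwise.

    Hypothetical helper for illustration only; 'ch' is kept unchanged.
    """
    w = word.lower()
    out = []
    i = 0
    while i < len(w):
        nxt = w[i + 1:i + 2]  # next letter, or "" at the end of the word
        if w[i] == "c" and nxt == "h":
            out.append("ch")  # leave the "ch" digraph untouched
            i += 2
        elif w[i] == "c":
            out.append("s" if nxt in SKINNY_VOWELS else "k")
            i += 1
        else:
            out.append(w[i])
            i += 1
    return "".join(out)


for word in ("civil", "clear", "accept", "church"):
    print(word, "->", respell_c(word))
```

Note that a word-final "c" falls into the "hard" branch, so "music" would come out as "musik".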


> I don't think I've heard any dialects where there's a difference between whether you pronounce the "c" in a word as hard or soft.

Um, yes, I can't think of regular words where a "c" changes.

<thinks a bit more>

Place names. Place names - at least in England - can have significant differences between spelling and pronunciation, and the locals will often use or be aware of a local pronunciation that isn't obvious to outsiders. Examples include Bicester ('bister'), Leicester ('lester'), Salisbury ('solsbry'), Tottenham ('totnam') and many others. It's not quite the same effect as with dialects but it certainly complicates spelling reform.


People also fucking haaaate changes in language. Especially older people.

I had an 85-year-old (yes) technical writing professor in college who insisted that we do our papers using the "Queen's English" that he learned growing up in Catholic school. He even went so far as to write a damn book outlining his rules for English writing because none of the existing style guides out there matched his view of the language.

You'll never get people like that to adopt any sort of change to the language.


That's not universal at all. Many languages have spelling reform every couple of decades. French spelling was updated in 1991 and German in 1996. Dutch was reformed in 1996 and 2006.


But other languages (like Polish or German for example) have spelling reforms from time to time.


I think it's easier when there's a single polity that represents the "entirety" of that language's speaking community. Nobody controls English in the same way - I'd say the US and UK have equal claim to being able to formalize language changes, but good luck getting 2 billion people to follow them.


I think for those languages (I know French and German have them, unsure about Polish) there is a central body which "controls" the language, meanwhile in English we don't have one.


There is one for Polish but I think English speakers overestimate the power such bodies have over people :)

The only power they have is influencing the way kids are taught at school. Everything else changes by social pressure and exposure - most media choose to follow the new convention and people get used to it over time.

The changes are very gradual - the only big one I remember was in the 90s - changing how "not" was written with adjectives and adverbs. The rules got much simpler so few people complained.


So the only power such bodies have is being able to influence the entirety of the next generation of the speakers of the language? That's surely some amount of power that nobody holds whatsoever on the English language. Israel was able to use the educational system to revive a millennia-dead old language, that's quite a lot of power.


I meant the direct influence would only be noticeable after decades, but because of media and social pressure after a few years most adult people switched.


Also Norwegian and Swedish.

Romania also had some spelling reform, although it was motivated more by a desire to distance itself from a communist past than by cleanup of tech debt.


You still need c for “ch” so perhaps, like Indonesian, you can just use “c” in place of “ch.”

Oh, and Indonesian is my favorite lingua franca. Super easy to learn for anyone and also simple, phonetic spelling.

I had only ever learned Indo-European languages (ie English, Spanish, French) and a bit of Japanese (also unrelated to Indonesian), but I was able to pick up a useful amount of conversational Indonesian in about 3-4 weeks. Indonesian is an Austronesian language (actually a standardized variant of Malay) and totally unrelated to my mother tongue (American English), yet it was the easiest thing to pick up. Sounding out new words in Indonesian is actually easier than English to me.


I dont vant tu teik saids, but German haz diklention und konjugation, vich maiks it a suboptimal lingua franka. Not tu mention meni iregular verbs.

Mai vot gos tu som nordik languag like Svedish.


If you want to go that route, it's fun to imagine a "unified Germanic" achieved by systematic reform to bring the main Germanic languages closer together.

Start by purging English of French influence, starting with words for which a word of Germanic origin with a similar meaning exists. Simplify German grammar. Undo some consonant shifts. E.g. (the German and Dutch I had to check/adjust w/Google translate; no guarantees for accuracy):

Swedish: En dag kan vi alla tala samma språk

Norwegian: En dag kan vi alle snakke samme språk (or "det samme språket")

Danish: En dag kan vi alle tale det samme sprog

German: Eines Tages können wir alle dieselbe Sprache sprechen

Dutch: Op een dag kunnen we allemaal dezelfde taal spreken

English: One day we can all speak the same language

Now consider "speech" as an alternative to "language" in English (alternatively: "tale" is valid but archaic in Norwegian in this context and we have the English cognate "talk"), and undo that D->T consonant shift in German (e.g. compare Tag to Low German "Dag"), and replace "the" (compare det/de/die/das/der etc.).

There are a whole lot of simple spelling and sound changes that'd bring the above languages a lot closer together very easily.

Of course it's easy in theory - in practice I've lived through multiple Norwegian language reforms and know how excruciatingly slow it can be to get people to adapt (e.g. Norway changed the spoken form of numbers above 20 in 1952 from the equivalent of "four and fifty" to "fifty-four"; my parents learned the new forms in primary school, yet I still picked up the old forms from them in the late 70's and still switch back and forth between the old and new forms now)


Unpopular opinion: Declension and conjugation are nice! It means different forms for the act of watching X (watching birds), X that is watching (watching eyes), or X that is used for the act of watching (watching post). So you immediately know which one is which, instead of trying to figure that out from context (you have 200 milliseconds before the next sentence starts, good luck).


In English it's not based on context, it's based on word order. Declension is a cool concept but it's far more confusing to foreigners because word order basically doesn't matter.

For example in Czech:

Jan zabil Petra

Jan Petra zabil

Petra Jan zabil

Petra zabil Jan

Zabil Petra Jan

and

Zabil Jan Petra

are all equivalent to the English "John killed Peter." Change Jan to Jana and Petra to Petr and all 6 of those become "Peter killed John." Even more confusing to a foreigner learning it is that Petra and Jana are the feminine forms of those names in the nominative case.
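Here's a small Python sketch of that point: if you tag each name form with its case, all six word orders gloss to the same English sentence. The `LEXICON` table and the `gloss` and `all_orders` helpers are made up for this example and only cover the four name forms mentioned above.

```python
from itertools import permutations

# Toy lexicon (an assumption for this example): form -> (English name, case).
LEXICON = {
    "Jan": ("John", "nom"),
    "Jana": ("John", "acc"),
    "Petr": ("Peter", "nom"),
    "Petra": ("Peter", "acc"),
}


def gloss(sentence):
    """Find subject and object by case ending, ignoring word order."""
    roles = {}
    for word in sentence.split():
        if word in LEXICON:
            name, case = LEXICON[word]
            roles[case] = name
    return f"{roles['nom']} killed {roles['acc']}."


def all_orders(*words):
    """Return every ordering of the given words as a sentence string."""
    return [" ".join(p) for p in permutations(words)]


for sentence in all_orders("Jan", "zabil", "Petra"):
    print(sentence, "=>", gloss(sentence))
```

Swapping in "Jana" and "Petr" flips the roles for every ordering, which is exactly the part that trips up learners used to word order carrying the meaning.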


You're not going to get a lot of support here, unless you change your vote to Rust.


Esperanto > Rust


I don't think declensions and conjugations matter all that much, seeing how Latin was the Lingua Franca for 2000+ years, and _the_ Lingua Franca: French, has a load of irregulars too.

But we all know that it should be Esperanto.


Esperanto is more like Go. Rust is more like lojban


Does German have (significantly) more irregular verbs than English?


No.

German has significantly fewer irregular verbs than English.

It's ~200 to ~300. (French is double that?)

There's enough to be moderately annoying, but not that bad. Also (in my personal opinion), German irregular verbs tend to be not-as-irregular as English.


The fuzzy thing with counting French irregular verbs is that there are so many that follow similar patterns that they really can't be treated as fully irregular. More, like...oddly specific variants of the -re/-ir/-er verb classes. (You can get into this with English, too, in things like "to come", "came"/"to become", "became", or "to hold", "held"/"to behold", "beheld", but we actually IIRC have fewer groupings like this.) So the raw French number is higher in a strict sense, but potentially between English and German overall.


I've half-jokingly proposed a similar change to Spanish, basically:

z, c (as in "ce", "ci"): use "s" (non-European Spanish speakers do not distinguish these sounds anyway)

v: always use "b"

c (as in "ca", "co", "cu"), q(u) (as in "que", "quiso"): replaced with "k"

w: why do we have this letter?! use "u"

y (as vowel): use "i" (basically only used as "and" in Spanish)

y (as consonant): stays like it is now (important in some variants where it sounds pretty much as "sh" in English)

ll as in "lluvia": replaced with "y"

h (mute as in "hueso", "humano"): Just remove it (ueso, umano)

ch (as in "chorizo"): replaced with "c"

r, rr: Couldn't yet find a good replacement that's not ambiguous for the soft and vibrant sounds in all the use-cases...

ñ: this stays. it gives the language personality!

I haven't gotten much traction with my friend, though!!!!!
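For fun, the rules above can be sketched as a single left-to-right pass in Python. The `reform` function is hypothetical, it only treats a standalone "y" as the vowel case, it leaves r/rr and ñ alone, and it assumes "qu" only needs changing before e/i. Doing it in one pass matters: "ch" has to become "c" without that new "c" then being turned into "k" or "s".

```python
SOFT_VOWELS = {"e", "i", "é", "í"}
# Single-letter replacements; "h" is simply dropped.
SIMPLE = {"z": "s", "v": "b", "w": "u", "h": ""}


def reform(word):
    """Apply the half-joking spelling rules to one Spanish word.

    Hypothetical sketch: handles digraphs first, then single letters.
    """
    w = word.lower()
    if w == "y":  # "y" used alone as a vowel ("and")
        return "i"
    out = []
    i = 0
    while i < len(w):
        pair = w[i:i + 2]
        if pair == "ch":
            out.append("c")
            i += 2
        elif pair == "ll":
            out.append("y")
            i += 2
        elif pair == "qu" and w[i + 2:i + 3] in SOFT_VOWELS:
            out.append("k")
            i += 2
        elif w[i] == "c":
            out.append("s" if w[i + 1:i + 2] in SOFT_VOWELS else "k")
            i += 1
        else:
            out.append(SIMPLE.get(w[i], w[i]))
            i += 1
    return "".join(out)


for word in ("chorizo", "lluvia", "hueso", "quiso", "cielo", "año"):
    print(word, "->", reform(word))
```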


I would prefer B and V to keep being distinct letters.

As far as I know, when properly pronounced, the V in Villa doesn't sound the same as the B in Billete.

Sure, sometimes they blend into each other, but not always.


Yes, in "proper" Spanish they sound different. That said, except when exaggerating, I know no one who makes the distinction in day to day conversations!


If you are pronouncing B and V differently, then it's not proper Spanish.


At least not where I live, that is!


> ñ: this stays. it gives the language personality!

We can remove it and call the entire transition the Convergencia año-ano.


LOL! Nooooo

I stand by the Ñ!


Adding to my post, with regards to "y": In school we are taught there are 5 vowels "aeiou" yet the "y" sound when used alone is a vowel!


Jaja, I see that you like Andrés Bello, don't you?


Haven't read him, actually! I will now!


That one predates Reddit's existence, it was already being passed around the internet in the 90s. There are a couple of slightly different variations floating around.

https://web.archive.org/web/19991006200917/http://users.ox.a...


Very adept adaptation of the Mark Twain original!


That is lovely, thanks for ‘ze laf’. One minor note, it reads much more like Dutch to me :)


As someone who speaks both English and German natively, I can say Dutch makes my brain hurt whenever I hear it. It’s like my brain can’t pick which neural pathways to use and the dissonance is awful.


Even as a native English speaker who speaks no German, I find being in The Netherlands to feel a lot like the part of my brain that is hearing a conversation and trying to tune in is working at native speed, but the language parsing part can't make any sense of it.

Sometimes when I'm in parts of the world where I'm surrounded by other languages I don't speak, I find I'm almost automatically tuning out people speaking them around me. That doesn't happen at all with Dutch.


c can then be used as "ch", as in Italian.

x can then be used as "sh", as in Portuguese.


Speaking of Portuguese, I've always found the usage of "ç" to be interesting. I've never studied the language but just from looking at words it's clearly the way to represent the soft "c" sound before a "fat" vowel (a/o/u). How did those words end up with a "ç" instead of an "s"?

This thread finally made me remember to go look it up and it seems like the "ç" used to be a different sound (/dz/). I guess it evolved to the "s" sound we hear today sometime by the 1700s.

I wonder if that means only words older than the 1700s have the cedilha and newer words would just be spelled with an "s"?


Yes, I always figured it was related to the French usage. Not sure who used it first. Hmm, guess that leaves Þ | Θ - thorn or theta for "th", but I'm less enthusiastic.


In German e is a, and i is e. It really is striking.


It isn't striking for speakers of German nor for speakers of Romance languages. These are the vowels that these letters originally stood for in the Latin alphabet, and they still do in many languages. But English obviously isn't one of them.


I’m a native speaker of both English and German and it was striking to me when I was learning to write.


It struck you that /e/ and /ɛ/ were spelled with an "e"?



