The capital sharp S is now part of the official German orthography (typography.guru)
188 points by fanf2 on July 3, 2017 | 364 comments



As a German, I see this as a step in the wrong direction.

The sharp S is a PITA to begin with, so I'd rather abolish it completely.

Now there's a whole bunch of complications coming from this - how do I type this? To type lowercase "ß", I press the key above "p" (in the same position as with qwerty) and "ü" (to the right of it), which will do "?" when pressed together with shift. So do we move "?" somewhere else?

QWERTZ is already bad for e.g. programming with all its punctuation (typing "]" means holding altgr and pressing 9), so that would make it even worse.

Personally I'd rather use composing since that would mean I could continue ignoring "ß" like I already do (I use "ss" instead), only now I'll ignore both forms.

Note also that a capital "ß" is barely useful to begin with - no word begins with it, so the only reason to use it is to write a word in ALL CAPS.
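For what it's worth, the ALL CAPS case is exactly where software has to pick a side. A minimal sketch in Python, assuming the default Unicode case mapping that `str.upper()` follows; `upper_with_capital_eszett` is a hypothetical helper name made up for illustration, not a standard function:

```python
CAPITAL_ESZETT = "\u1e9e"  # ẞ, LATIN CAPITAL LETTER SHARP S


def upper_with_capital_eszett(text: str) -> str:
    """Uppercase text, mapping ß to ẞ instead of the default "SS"."""
    # Swap in the capital form first; upper() leaves it untouched
    # because it is already an uppercase letter.
    return text.replace("ß", CAPITAL_ESZETT).upper()


default_caps = "Straße".upper()                 # default mapping expands ß to "SS"
new_caps = upper_with_capital_eszett("Straße")  # keeps a single letter
```

The default mapping is lossy: lowercasing "STRASSE" again gives "strasse", not "straße", which is the kind of inconsistency the capital form avoids.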


As a German, I disagree with you. German's not English, even though you might prefer if it was. Others don't. It's fine that it uses the letters it ended up with historically. This change fixes an internal inconsistency, your reply seems fairly irrelevant. It feels like replying to some bugfix in, say, Ruby with something like "I never liked Ruby anyhow".


How is using "ss" instead of ß in German any worse from using "th" instead of þ in English?


Annoyingly, English uses "th" for two rather different phonemes: þ ("thing") and ð ("this").

Icelandic maintains this useful distinction. I feel like English spelling would need at least half a dozen more letters to disambiguate. As it is, English is actually a not great choice for a global language because it's so unphonetic.


Every language has something dumb about it that makes it "not a great choice for a global language". English spelling is a nightmare, but the grammar isn't too bad, and most importantly, it has no gendered nouns.


Spanish would be a much better global language than English or French due to simple phonetics and overall consistency.

Gendered nouns are really simple. I'd say the hardest part is the subjunctive mood but it's not critical for communicating.


Disclaimer: I'm a Spanish newb.

While there are good things about Spanish, speaking as a native English speaker there are annoyances too. One big one compared to English is that English is seemingly more economical with syllables than Spanish for many common words, e.g. demasiado, todavía, and many adverbs like recientemente.

Spanish does seem to be one of the better languages for gender of nouns in that fairly simple rules seem to correctly classify many, but still, not having a gender for table or chair is imho inherently better from a cognitive load standpoint. Likewise the agreement of article, gender and number is much simpler than in, say, German.

English also wins with much simpler verb conjugation. The forms in Spanish are distinct enough that pronoun subjects are often dropped. I prefer "I speak" and "we speak" to "[yo] hablo" and "[nosotros/nosotras] hablamos". Again English is far more economical with syllables here.

Spanish object pronouns are a PITA because they're ambiguous to the point that you have to add "a ella" to things.

I also find the number of accents used to be incredibly tedious but at least they don't seem to be critical and are consistent with the pronunciation.

Modern English also has only one second person form. Spanish has 3 or 4 (tu, usted, ustedes and Latin American Spanish doesn't seem to use the vosotros form that Spain does).

I also find the ordering of pronouns common in mainland European languages a PITA, meaning object pronouns before the verb, non-pronoun objects after.

English actually front weights important information pretty darn well from a linguistic standpoint but at least Spanish doesn't have anything silly like German's separable verbs or putting the verb last in Nebensätzen.

The interesting thing about English was that it lost a lot of the linguistic bullshit (basically) during the centuries where it wasn't the official language of England (French was), this being the transition from Old English (essentially a foreign language at this point) to Middle English (mostly comprehensible to Modern English speakers).


Spanish conjugations are not harder for me than saying the right English verb (do/does, was/were, am/are/is, has/have). Yes, conjugation adds a bit more to learn at the start but learning English spelling and irregular verbs took me years at school. And I'm still not sure when to use many of the English perfect tenses.

(Editing this comment because I made a conjugation mistake for the millionth time – "add" instead of "adds".)


Yes, Spanish is mostly simple and consistent. It feels too verbose though. I think there are more (space or time) efficient languages.


I think Spanish speakers compensate for this by speaking quickly. Since Spanish has a simple and unambiguous syllable structure, this is perfectly possible. Contrast with Mandarin, which can be pretty terse but also requires precision in pronunciation.

Apparently the information density of most languages when spoken by native speakers is much the same, despite variations in verbosity.


Blond/blonde and fiancé/fiancée would like to disagree with you (yes yes, loan words and you’re probably talking about gender in articles anyway).


I think those are more like ram/ewe, just words to signify specific objects having gender. But words themselves don't have grammatical gender, not like in Russian and German, where every noun is gendered and all (well, most) adjectives/verbs/etc. have to be modified to fit the gender of the noun. Presents tons of problems for localization, because a) you have to have 3 separate ways of saying the same thing and b) you cannot say something like "user has logged in" without knowing the gender of the user, at least not if you want to sound grammatical.
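To make the localization pain concrete, here is a rough sketch of what a translated "user has logged in" string can turn into. Everything in it is an assumption for illustration: the `LOGGED_IN` catalog, the `logged_in_message` helper, and the choice to fall back to the masculine form when the gender is unknown are hypothetical, not taken from any real i18n library.

```python
from typing import Optional

# Hypothetical message catalog: Russian past-tense verbs agree with the
# subject's gender, so one English string needs several translations.
LOGGED_IN = {
    "m": "{name} вошёл в систему",
    "f": "{name} вошла в систему",
}


def logged_in_message(name: str, gender: Optional[str] = None) -> str:
    """Pick the variant matching the user's gender, if it is known."""
    # Falling back to the masculine form is one common (and debatable) default.
    template = LOGGED_IN.get(gender, LOGGED_IN["m"])
    return template.format(name=name)
```

Calling `logged_in_message("Анна", "f")` picks the feminine verb form, while `logged_in_message("Alex")` has no gender to go on and silently uses the masculine one, which is exactly the awkward part.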


Also the dispute this generates! You wouldn't believe how many friendships are torn apart over the question of whether it is "die Nutella" or "das Nutella". It is "die Nutella", just so we are clear!


Das ~Glas, die ~Creme/Butter, der ~Aufstrich. Since it's uncountable, it's just "gib mir <x>".


Every time an old lady orders "das Cola", I twitch a little.


> das Nutella-Glas

FTFY


Der Nutella-Brotaufstrich.

Two can play this game. There is no sane answer.

Except for rules of thumb like "Romance-sounding word ending in -a is probably female".


My point is that, whenever you want to use a brand name like "Nutella" in a specific context and you're unsure about the gender, you can just use a compound word with a clearly defined gender instead.


Touché.

On a semi-related note: in the Rhineland gendered articles are also used colloquially when talking about people: "Der Mark", "Die Steffi" and so on. I was born into this but I'm unsure how I feel about it after finding out it's actually non-standard in the rest of Germany.


I assume Russian and German are like French and Spanish, where the standard practice is just to use the masculine version if the gender is unknown.


Not at all. It's often something some people feel quite strongly about and disagree with each other about.

The gender is often derived from synonyms or etymologically related words. I'm guessing the reason it's "Der Computer" for example is that "Computer" works similarly to e.g. "Trockner" (drier) or "Rechner" (calculator, in fact often the German synonym for "Computer") and these are all male (as generally words ending in -er tend to be).

OTOH it's "Das Steak" (maybe because "Das Fleisch" or "Das Kotelett"?) and "Das Shirt" (maybe because "Das Hemd"?). The fallback generally seems to be "das", i.e. neuter, not male.

An example of something with inconsistent use is "Der Laptop" vs "Das Laptop". Presumably this is because "Der Laptop" is derived from "Der Desktop", which in turn is likely simply a shorthand for "Der Desktop-Rechner", making it male. Even the Duden (the closest thing to a "standard" dictionary in Germany) lists it as "der; auch das Laptop" (i.e. "der" seems to be somewhat more common but both are in use).


Oh. Well romance languages don't generally have a neuter gender so that's not an option.


Not really. In Russian, it usually follows what feels intuitively closer - i.e. if it ends with a vowel like "a", it's likely to be female - Coca-Cola, Nutella, Fanta, etc. If it ends in a consonant, it'll be male. If it ends in "o", "e", etc. - probably neuter. Though there could be more complex cases - e.g. I'm not sure what gender something like "Subaru" would assume. It probably depends on the implied noun - if it's a "company named Subaru" then probably female, because "company" is female; if it's a restaurant named "Subaru" (assuming such a one exists and is not sued for trademark violation) then it'd be male because "restaurant" is male. Some speakers would just avoid putting such words into a context where such a decision needs to be made (fortunately, Russian has no articles, so die/der/das is not an issue :) or attach an explicit native noun like "company" or "restaurant" to make it clear.

Some words are kind of gender-fluid too. E.g. a word meaning "hall" was female and neuter in past centuries, and now is male. The word for "coffee" is male in formal/educated speech and neuter in less-educated (but, to my despair, rapidly gaining legitimacy) speech. Russian is fun :)

For things like the user, you can always assume male, but as with other such assumptions, some people may object to it. In some cases, there is a built-in assumption that male includes female when gender is unknown, but in others it just means male and that's it.


It once was (and e.g. newspapers still generally only use the masculine form) but this is a very political question. At work we are ordered to use e.g. Bürger/-in (others use Bürgerin oder Bürger which is just both forms written out). Sometimes there is a gender-neutral word (such as Lehrkraft instead of Lehrer) that is a good alternative.


In Russian, at least, the grammatical gender would correspond to that of the word user - which is always masculine.


In Spanish there's usuario and usuaria depending on the gender of the user.


You can technically construct a female-gendered noun for "user" in Russian as well - there's a generic construct for that - it just looks really awkward.


I see. My sense is that the Spanish one is actually used if you do know the user is female, but of course most software doesn't know the gender and defaults to masculine form.


No, it's not standard practice.


German now has a number of competing, made up schemes to degender so that a minority of women can do an inner victory fistpump whenever they see the language pockmarked with their particular brand of degendering.

What they all missed was that German already had degendering built right in, the diminutive (something dearly missing from the English language) : der Holzfäller -> das Holzfällerchen, der General -> das Generälchen, die Kanzlerin -> das Kanzlerchen, die Maid -> das Mädchen


Yeah I meant gender in articles, not words where it indicates actual gender of a person/animal (although too many of those can be annoying too, here in Germany I see signs where it has to say "Lehrer oder Lehrerin" or the equivalent everywhere because way more professions/nouns have gendered forms than in English).


That's silly. Gendered nouns seem difficult to you, as an English speaker, because you speak a language that doesn't have them, and not because they're inherently harder than any other language feature.


No, that's silly. Gendered nouns are no hassle whatsoever in your native language, but a major stumbling block in one you wish to learn later on, believe me.

Danish is my primary language. We are blessed with only two genders, not three as in German. As closely related as the two languages are, I really cannot make a useful gender mapping from one to the other. That's one - not the only, but definitely one - reason I shall never speak German with anything like the ease with which I've mastered English, despite having been exposed to the former at a much younger age.


My native language is English.


My bad. Unfounded inference.


As an ignorant English speaker who has learned enough German to get by, I wonder:

- what utility do you think a language gains from gendered nouns?

- (more immediate to my experience) how do you know what gender a noun is? I assume there are "themes" that might inform you. But if you're unclear, how do you find out? Is there a social cost to getting it wrong?

- how are new genders decided? I'm struggling to think of an example as "laptop" presumably has the same gender as "computer", but it must come up sometimes.


"el ordenador" vs. "la computadora" (both computers) would be a spanish example of a gender change for the same concept (former is more Spaniard, latter more Latin-American).

I would say that (at least in Spanish) gendered nouns mainly provide some redundancy that helps error-correct sentences, but apart from that and the occasional meaning subtleties that can carry, they are not very useful.


> how are new genders decided?

Many new nouns are nouned forms of verbs. In German there are a bunch of fixed suffixes for turning verbs into nouns, and they have their own fixed genders: "-er" (masculine), "-ion", "-ung", "-age" (feminine), "-en" (neutral).

For other words, speakers, over time, sort of converge on a consensus, which may only be regional. For example, in some regions "Cola" (the drink) is "die Cola" (female), in others it's "das Cola" (neutral).

Others... not sure. "Mobile phone" is "das Handy" (neutral), I guess from "das Telefon". I'm fairly sure "das Internet" came by analogy from "das Netz", which is the German word for "net".
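The suffix part of this is regular enough to write down. A minimal sketch, assuming only the five suffixes listed above; `guess_article` is a hypothetical helper, and it only makes sense for nouns actually formed with those suffixes (it will happily mislabel words like "Mutter" or "Fenster" that merely end in the same letters):

```python
from typing import Optional

# Suffix -> article, taken from the rule of thumb in this comment.
SUFFIX_ARTICLES = [
    ("ion", "die"),
    ("ung", "die"),
    ("age", "die"),
    ("er", "der"),
    ("en", "das"),
]


def guess_article(noun: str) -> Optional[str]:
    """Return a guessed article, or None when no listed suffix matches."""
    lower = noun.lower()
    for suffix, article in SUFFIX_ARTICLES:
        if lower.endswith(suffix):
            return article
    # No listed suffix: this is where regional consensus or analogy takes over.
    return None
```

So "Rechner" comes out as "der" and "Zeitung" as "die", while "Cola", "Handy" and "Internet" fall through to `None`, matching the "others... not sure" cases.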


I also speak English as a native language, but allow me to try to answer your questions:

> - what utility do you think a language gains from gendered nouns?

Not a whole lot, but a language doesn't gain much utility from English features like mandatory plural agreement or a/an distinction either. Even something like "his/her" and "he/she" distinction isn't necessary and a lot of non-native speakers of English struggle with it. You can dream up some sentences that are clarified by the distinction, but in a different language you obviously just express the same idea differently.

Ultimately a committee sitting around asking if a feature has utility and should be added is not very close to the way languages actually form.

> how do you know what gender a noun is? I assume there are "themes" that might inform you. But if you're unclear, how do you find out? Is there a social cost to getting it wrong?

Sure. If a Spanish word ends in a it's probably feminine unless it's of Greek origin. But there is also memorization involved. Getting the wrong gender on an inanimate object is a lot like any other non-native mistake; it's likely someone who is talking to you can figure out what you meant.

Again, having to just memorize some cases is common in languages. How do you know what preposition goes with which verb? It probably seems very natural to you, but there isn't much logical basis for a lot of them. In Spanish you dream "with" someone rather than "of" them. Someone who wants to speak native-sounding English has to spend a lot of time learning the right pairings (as well as stuff like which compound expressions should have the infinitive versus the gerund).

> - how are new genders decided? I'm struggling to think of an example as "laptop" presumably has the same gender as "computer", but it must come up sometimes.

It's not really like a group sits around deciding, but of course there's a lot of precedent with existing words that could inform you which answer is likely. Synonymous words can have different genders, so it's not really like, for instance, computing words are all masculine. In general if a totally new word is borrowed into Spanish it's more likely to be masculine (so, like, el kabuki, el ánime, etc.).


I speak Russian, which is gendered (masculine/feminine/neuter) as my native language.

> what utility do you think a language gains from gendered nouns?

None whatsoever. When speaking of things that don't have any meaningful gender, it makes things unnecessarily complicated. When speaking of things that do have physical gender, it often forces you to make assumptions about it even when it's completely irrelevant.

> how do you know what gender a noun is? I assume there are "themes" that might inform you. But if you're unclear, how do you find out?

In Russian, it mostly follows the spelling of the words - i.e. if a word ends with a certain vowel, it's feminine, except if ... etc.

Sometimes it doesn't work that way, mostly with loanwords. Some loanwords so obviously don't match any "genderizing" pattern, that they basically end up with neuter by default.

In other cases the historical form of the word did match a pattern, and was assigned a gender accordingly - and then changed (e.g. by re-loaning it in a more accurate spelling), and no longer fits. When that happens, people will use the more "appropriate" rather than the "right" gender in colloquial speech, and eventually it becomes the new standard, correcting the mismatch.

A good example of this is the Russian word "coffee". When it was first loaned back in 18th century, it was "kofiy" - and in Russian, that is definitely masculine. Eventually it got re-loaned as "kofe", which would normally be neuter; but the masculine gender assignment stayed from past spelling. In the dictionaries, that is - in practice treating the word as neuter became one of the common incorrect colloquialisms, just because it doesn't "look" masculine. Language purists fought this for several decades, and eventually lost: it's still nominally masculine, but neuter is considered an "accepted variant" in modern dictionaries.

> how are new genders decided? I'm struggling to think of an example as "laptop" presumably has the same gender as "computer", but it must come up sometimes.

For new words, by their spelling, as described above. So "laptop" is masculine because it ends with a consonant, for example.

Personal names are an exception to this stuff. As in, when a foreign name is used in a context where gender cannot be guessed, people will often apply the usual rules (and then often get it wrong). But once context is established, it's properly accounted for.

On the other hand, for place names, gender is automatically assigned by the usual rules. So to a Russian speaker, New York and Texas are masculine, while California and Florida are feminine.
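As a very rough sketch of the "usual rules" for new words and place names, here is the ending-based guess applied to Latin transliterations. This is a toy under loud assumptions: `guess_russian_gender` is a hypothetical helper, real assignment works on Russian spelling and has many exceptions, and treating a final "y" as the consonant й is a simplification of this sketch.

```python
from typing import Optional

VOWELS = "aeiou"  # "y" is treated as the consonant й, an assumption of this sketch


def guess_russian_gender(transliterated: str) -> Optional[str]:
    """Crude guess from the last letter of a Latin-transliterated word."""
    word = transliterated.strip().lower()
    if not word or not word[-1].isalpha():
        return None
    last = word[-1]
    if last == "a":
        return "feminine"
    if last in "oe":
        return "neuter"
    if last not in VOWELS:
        return "masculine"
    # Other vowel endings (e.g. "Subaru") do not fit any pattern here.
    return None
```

With this, "New York", "Texas" and "laptop" come out masculine, "California" and "Florida" feminine, "kofiy" masculine and "kofe" neuter, which is the mismatch behind the coffee story above.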


My understanding is that Proto-Indo-European had two genders: animate and inanimate which would have served a purpose. Later animate split into masculine/feminine. Some languages preserved all three, others dropped the inanimate (or it morphed into "neuter").

Much like English spelling it's all a mess now.


My native tongue is Lithuanian. There was never an inanimate/neuter gender and we got masculine/feminine ingrained deeply in the language.

Some people say Lithuanian has kept most of the Proto-Indo-European features. If that's true, inanimate/neuter had to evolve later than the gender split.


Over the last 20 years or so, the consensus has grown in Indo-European linguistics that Proto-Indo-European had only the animate/inanimate distinction mentioned above. After the Anatolian branch and possibly Tocharian split off from the other Indo-European languages, the remaining core of IE languages developed the three genders masculine, feminine, and neuter.

The idea that Lithuanian is extremely conservative from a PIE perspective is rather outdated. Lithuanian does preserve a number of features of what might be termed “Proto-Nuclear-Indo-European” (to use the terminology of Lundquist & Yates), but the insights from the Anatolian languages show that not all of these features can be reconstructed back to Proto-Indo-European itself.


What is interesting is that neither Lithuanian nor Latvian has a neuter gender, while Slavic languages, like Russian or Polish, do. To be fair, Old Prussian seems to have had a neuter gender. But given Germanic influence, they likely adopted it in later days. It'd be weird if IE had developed a 3rd gender, then Baltic languages dropped it while other languages around them kept it.

Got any links or literature on how that evolved according to the new school?


It isn't at all strange that the East Baltic languages developed (along with other PNIE languages) the neuter and then dropped it later. The very same happened in Albanian, Irish, and the Romance languages (except for Balkan Romance).

Any recent introduction to IE linguistics starting from Beekes' Comparative Indo-European Linguistics will discuss the neuter being an innovation after the Anatolian languages broke off. However, I would especially recommend the Handbook of Comparative and Historical Indo-European Linguistics that will be published by Mouton de Gruyter this September, as it will contain a state-of-the-art survey of the field by a number of prominent scholars. See Lundquist & Yates' chapter on Morphology in it (you can also download a preprint PDF of this chapter from Yates' papers on Academia.edu if you have an account there).


If the neuter gender was dropped, I'd expect there to be at least some leftovers... Anyway, I'll try to get hold of that book once it's out. Looks like it'd be an interesting read.


> Some people say Lithuanian has kept most of proto-Indo European features. If that's true, inanimate/neuter had to evolve later than gender split.

For my part I'd say that it probably doesn't maintain them that much more than the huge number of other modern-day descendants of PIE.


@xenadu02

Some languages (Dutch and Danish at least) still only have two genders: male+female and neuter.


Gendered nouns are difficult even for those who have them in their native language (articles in German behave very strangely for example). There are usually no direct mappings between languages, there are exceptions etc.


Gendered nouns are incredibly difficult for non-native speakers (even if their native language contains gendered nouns) as they are completely arbitrary, just like the words themselves. So instead of remembering (word) a non-native now has to learn (gender, word) for every noun.

It's not the concept that is difficult, it's the extra piece of information that makes it more difficult.


Memorizing some arbitrary crap is most of what learning a language consists of.


Difficulty aside, gendered nouns make gender-neutral communication awkward. If you want to refer to a group in a gender-neutral way you have to refer to both the feminine plural and the masculine one, or invent some silly new forms.


What's so bad about gendered nouns? They are quite useful, especially when the language allows you to omit the noun and leave only a pronoun, or maybe the entire subject altogether.


How are gendered nouns useful? (I'm referring specifically to assigning genders to nouns that refer to objects that aren't intrinsically gendered.)

My native language is English, and I never have to remember whether a "keyboard" or a "rock" is masculine or feminine. I've studied a few other languages (French, German, Spanish) that do expect me to remember such things.

To be clear, I'm not trying to refute your statement that they're useful, just asking how. I'm interested in learning about a different perspective.


I'd imagine the use is that it sometimes makes pronouns more useful.

Substituting "he" and "she" for gendered articles:

"I have he keyboard and she rock. He is large and she is grey"

Without that, I'd have to repeat the "keyboard" and "rock".

The question of course is whether this is worth all that rote memorization (since no language I know is fully logical here - German's "Das Mädchen" - girls are apparently of neuter gender - being a particularly egregious example).


The case of «das Mädchen» is a mere historical curiosity and doesn't require memorisation as long as the person in question is acquainted with the basics of German noun formation, i.e. the "-chen" suffix in German, when added to a noun of masculine or feminine gender, unequivocally results in the "neutralisation" of the noun gender, as well as the "umlautisation" of the stressed vowel of the noun being gender bent, e.g. der Hund + "-chen" -> das Hündchen. It's a very simple rule, really.


The thing about this, though, is that it only works if you get lucky and none of your nouns share a gender. I feel like that undermines the argument at least a bit.


I feel like I could come up with some examples if I had kept up with German after high school. I remember it being difficult for a year or two, then it seemed more helpful as we got into more complex language mechanics. In any case, German felt more consistent than English, and most of the words just felt right with one gender or another (and in speech you could usually get away with something between "the" and "duh" if you weren't sure about der/die/das).


They are generally going to sound good together because the word and the pronoun co-evolved. If they didn't sound good together, either the pronoun or the word would have changed. That isn't about the gender being right so much as just the sound, though.


There are other possibilities, but trying to apply logical rules seems pointless; the whole point of language -- what actually makes something a language -- is a completely arbitrary set of rules.


If you do that in German (i.e., "Ich habe eine Tastatur und einen Stein. Sie ist groß und er ist grau."), everyone would start slapping you with a style manual. It's just so unnecessarily contrived.

The major argument for keeping gendered nouns as they are in existing languages is that speakers would be uncomfortable with having their language changed by some standards body.


German isn't my mother tongue, but I think it's a form of diminutive, which often becomes neuter in Germanic languages, as for instance in Dutch.

I agree it feels silly to learn the gender of "sexless" words; but this case and the rule behind it are quite clear in the respective languages this occurs in. I guess it's not unlike English using neuter for animal pronouns, which feels strange to speakers of most Germanic languages.


What is an animal pronoun?


I translate Italian 16th century dance manuals into English. There are many pronouns in dance descriptions, and having gender as an extra clue is very helpful when I'm trying to figure out what pronouns refer to.

Now, in English, the author might have used fewer pronouns if they were ambiguous... but I've seen a lot of ambiguous pronouns in English writing.


I don't think anyone is objecting to gendered pronouns (he, she, him, her) but rather gendered nouns in general. For example in French "night" and "baguette" are female whereas "book" and "chair" are male. It seems arbitrary and makes learning the language more difficult.


I was speaking about gendered nouns. They help me resolve pronouns.


Oops I'm sorry about the mansplaining then. Can you give an example? I'm struggling to conceive what you mean.


The nouns for heel, foot, and toe have gender. When I get to the end of a dance step mentioning all 3, and it says "and in the final beat you lift <pronoun>", I use the clue that the pronoun gender should match the noun gender to try to figure out what the antecedent is.


I speak two languages with gendered nouns, and I didn't find much specific usefulness for it. It's just something that is part of the package, so you go with it, and you can claim that it provides a richer texture or such (though I'm not sure why knowing "table" is "male" and "government" is "female" really has any meaning, but maybe poets have one more tool to play with), but I'm not really sure it's that useful outside of using it for objects for which gender does make sense. But even then saying a different word for "walked" depending on whether it was a male or female walking doesn't really seem to me much of an advantage. It's just what it is.


It certainly does provide some extra flavors for poets to play with.

It also makes translations to other languages hell, when noun genders are used for allegoric purposes.


> when noun genders are used for allegoric purposes

Ugh, really? How disappointing.


Just to give an example, consider this short poem:

https://de.wikisource.org/wiki/Ein_Fichtenbaum_steht_einsam

Now try translating this to a language where the tree names are gendered differently, such that e.g. both are male, or both are female.


That is exactly what happened to the Russian translation:

https://ru.wikisource.org/wiki/%D0%9D%D0%B0_%D1%81%D0%B5%D0%...

In Russian, both words are female.


Yeah, Russian is the one I was thinking of.

Although it's actually more complicated, because there's more than one Russian translation, and this problem was tackled in different ways. Lermontov just did a straightforward translation, changing the implied meaning. Tyutchev and Fet both changed the pine to another tree such that the word is male: cedar or oak (in the latter case, this also required changing the described environment in which it grows).


Just out of curiosity, which language are your examples from? I'm asking because my native language is Serbian and it also features "male" tables and "female" government.


Hebrew has the same. Spanish though has it reversed (I didn't mean it among languages I speak since I'm just beginning studying it).


German, too. He might accidentally be onto something.


And it's the exact opposite in French and presumably all (most?) other Romance languages as well: female table, male government.


If it's your native language, or if you're at a good level of fluency, you don't have to "remember" it, it comes naturally.

I agree it's harder for students though.


What you find hard is mostly a function of what other language or languages you speak. People who speak languages without articles find articles extremely confusing in English and honestly I have a hard time articulating rules for when "the" or "a" would be appropriate.


I think a simple way of putting it (as a native non-bilingual English speaker) would be to think of it as "a" -> non-specific and "the" -> specific.

Example:

A basket is in the car -> some kind of basket of unknown shape or color.

The basket is in the car -> a specific, known basket.

HTH


Having a simple way to put it doesn't mean it's easy to do right.

My native language doesn't have anything like articles. I frequently misuse a/the in borderline cases. Or just omit the article completely.

Yet gender is super easy in my language! Each noun has a gendered suffix. Once you know the word, you know its gender. Or once you know the gender, you know the suffix... :)


But it is easy once you understand :-)

A == non-specific The == specific

'A dog' could be any old dog.

'The dog' is a specific dog we know.

Thus

A == any The == thing

Perhaps I'm not clear where with this understanding it would trip you up. Can you think of an example?


Can't come up with anything on the spot. But sometimes I see text I wrote and wonder why I put "a" or "the" instead of vice versa. Sometimes it's just not clear enough whether this item is specific enough or not.

On the other hand, I omit the article completely more frequently than I use the wrong one. I find I have to consciously think about whether I should use an article and which one. When I write/talk quickly without double checking, shit happens. Even after using English a lot for 2 decades, articles are just a foreign feature that I have to use consciously.


My wife is Korean and often asks me questions and, while you aren't wrong, the rules are way more complicated than that.


I'm trying to think of an example--do you have one by chance?


Sorry, I'm coming up short. But the theme, I think, is that there are a lot of cases where that sense of specificity isn't quite obvious to a learner, combined with the fact that in some cases the right choice is no article.


We're talking about what makes a good world language though, which implies that how hard it is to learn for non-native speakers is important.

Tones in Chinese are another example of "easy enough for native speakers, really freaking hard for non-native".


It's a significant bit just like the ones that form the letters: la tour, le tour, entirely different words that have their most famous examples within visual range once a year.


It's really easy, because "keyboard" is "teclado" which ends with o (so it's masculine) while "rock" is "roca" which ends with a (so it's feminine). :)


And there are, of course, counter-examples: el águila, el ala. I believe the rule should just be "it sounds right", since saying "la águila" puts, well, a third 'a' in there, two of them consecutive, and is just harder to even pronounce.


It gets worse in French: un livre (book), une livre (pound, in France equal to 0.5 kg). They're even pronounced the same.


Yes, yes, but this example is so well trod that it immediately popped into your mind, and, what's more, try to think of a sentence where you might honestly be confused about whether "pound" or "book" was meant.


I can certainly imagine a French learner being confused about which gender goes with which noun. Before you say that this is ingrained in French speakers, so are unphonetic English orthography, Chinese tones and ideographs, and other linguistic sticking points ingrained in the native speakers of those languages.

French has quite a few words that break the apparent gender rules: un musée, un lycée, un mille (meaning "mile"; the homograph meaning "thousand" is feminine but usually doesn't take an article), le mort (dead person) vs. la mort (death), etc. All adding to the shit you gotta memorize. If you don't, your meaning will still come across but you'll sound "off".

(I don't even think I remember all my Vandertramp verbs...)


Well my whole point is that it doesn't make a whole lot of sense to talk about whether one language is or isn't appropriate as an international one because the idea that one is just objectively harder than another doesn't really hold.


Le port and la port.


Port is always masculine though, it only means port/harbour.

But it's not that important, gender does add some redundancy and some error-correction to a language. It's not necessary (as shown by English) but it has some use.


I guess he's thinking of "la porte" but the pronunciation isn't the same.


That only happens with a handful of words: those that 1) are feminine and 2) their first syllable begins with an "a" and 3) their first syllable is stressed.

It's similar to English which uses "an" instead of "a", but it happens very very very rarely.

(Needless to say, "águila" and "ala" ARE feminine nouns, what's changing here is the determiner so the two a's don't clash, not the gender of the nouns)


If the last letter determines gender what's the point of using "el" or "la" at all?


English lets you omit the noun and only use a pronoun. Colloquial English lets you omit the subject sometimes, too. And it doesn't have official gendered pronouns.

This sentence is short. It is short.

"What did you do today?"

"Rode bike"

Sometimes we even use genders for nouns even though it's not official:

The ship sank. She sank.

Compared to German:

The dog jumped on the table and bit the man.

Aww shit there are three "the"s in there... if I'm speaking, I can just slur through it and say "d'Hund" or "d'Tisch" but that doesn't work when I'm writing... okay let's try to get through this.

Der Hund sprang auf... Hmm, well "the table" is Der Tisch, but hang on, I have to figure out what case this is... ah, accusative! den Tisch und biss Ah fuck, "the man" is "der Mann" but what case is this... I don't even care anymore... d'Mann. Nailed it.

It's not like anyone is going to say "he jumped on him and bit him", so the genders serve absolutely no purpose.


Mmmh, but in German you can say "Der Hund sprang auf den Tisch und biss ihn". This kind of ambiguity can be a lot of fun. (It can also be confusing, I admit. But it can be fun, too.)

EDIT: Well, in English you can say "The dog jumped on the table and bit him", but it is not ambiguous. ;-/ The ambiguity in German is not due to gendered pronouns, but due to the gender of table/Tisch. :-|


The benefit of languages with cases is that you can switch the order of the sentence without changing the meaning. In those languages I can say "the dog bit the person" and "the (accusative) person bit the (nominative) dog" and it would not change the meaning of the sentence, but it would allow me to emphasize one object more.


You can still do that in English though: the dog bit the person, the person was bitten by the dog.


The person, the dog bit.

- Yoda


They may be useful, but they make a language much harder to learn to speak correctly, which is a downside for a lingua franca.


But so do inconsistent spellings. There is a reason most languages apart from English have no use for spelling bees. [1]

[1]: https://www.washingtonpost.com/news/wonk/wp/2013/05/30/spell...


They make a language harder to learn and increase its cognitive profile, especially for newer speakers.

I'll also make the contentious claim that they are clearly visible as being kinda sexist to progressively-oriented native speakers of languages that don't have them. Of course, if you honestly follow that train of thought, it doesn't stop at having gendered pronouns for inanimate objects. See https://www.cs.virginia.edu/~evans/cs655/readings/purity.htm...


If they appear to be sexist to progressively oriented speakers of languages without gender, the problem isn’t the language, but the ignorance of the progressively oriented speakers of languages without gender.

The masculinity or femininity of a table ought not be controversial except among those who are eager for things about which to be offended. It would seem there are certain groups of people that seem to derive pleasure from being offended. It might also seem, to speakers of languages with gender, that these so-called progressives are being regressive by imposing their linguistic ideology upon others.

Of course I support sex equality for humans, but really sometimes so-called progressives venture into Newspeak territory or at the very least, absurdity. “Womyn” is another example of similar nonsense.


Grammatical genders are nonsensical and almost completely arbitrary. For example "girl" is an "it" in German.

Languages also differ significantly in terms of which grammatical gender they assign to things.

It's worse than useless.


> What's so bad about gendered nouns?

Gendered nouns are a significant barrier to learning a new language.


Differences make learning difficult. More at 11...


There have been many, many failed attempts at introducing phonetic/phonemic/simplified spelling systems for English. It’s a surprisingly hard problem both technically and socially.

For starters, we probably can’t split “th” into “þ” and “ð” because of words like “with”, where “wiþ” and “wið” are in free variation; even words like “thank”, which you might expect to be uniformly “þank”, show up as “ðank” in some dialects, sometimes also in free variation.

English consonants are pretty consistent across dialects, with a few rare exceptions (like [x] in Scottish “loch”), but the vowels are all over the place phonetically. Even attempts to represent them phonemically (e.g. the 24 Standard Lexical Sets[1]) aren’t perfect across dialects because of the presence of different vowel mergers & splits, as well as one-off exceptions.

[1]: https://en.wikipedia.org/wiki/Lexical_set


I think that the fact that English uses a small alphabet and no diacritical marks (except for a few loan words) and hence cannot closely specify pronunciation is a strength not a weakness. It makes English inclusive and usable by people with widely (or wildly) varying accents and dialects.


English would probably be 50% improved if we'd just get over things and add a "ə" vowel.


What's this?


Bizarrely, in English, the most common vowel sound has no single glyph to represent that sound. Among the many variations of sounds that vowels can make, all of them can be pronounced as 'ə' (schwa) in certain circumstances. This is part of what makes English orthography a nightmare and the pronunciation of written words difficult.

As an example: "banana" has two different vowel sounds, 'ə' and 'æ' (in American English at least). If English allowed for 'ə' it could be spelled bənanə.

Other words with schwa:

amazing - əmazing

tenacious - tənacious

replicate - repləcate

percolate - percəlate

supply - səpply

Not all spoken Englishes might agree with all of these, but every English variant uses the sound.

http://www.slate.com/blogs/lexicon_valley/2014/06/05/schwa_t...


> As it is, English is actually a not great choice for a global language because it's so unphonetic.

English would be a more-useful global language — in that it'd be easier for non-native speakers to learn and use it — if we pedants who secretly look down on anything less than "proper" spelling and grammar (and I'm embarrassed to be one of them) would get over ourselves and accept simpler, easier forms as they naturally arise.

EXAMPLE: Less vs. fewer — it's disconcerting to read or hear, e.g., less and less people when the supposedly-correct form is fewer and fewer people; there's no logical reason that the former shouldn't do double duty, as it does in math.

EXAMPLE: Who vs. whom.

EXAMPLE: It's vs. its.

EXAMPLE: Different forms for the subjunctive — e.g., if I was a carpenter (supposedly-incorrect) vs. if I were a carpenter.


As a foreign speaker, I find some of your examples curious, because for them it would actually make it harder for me to deal with English if we went with those suggestions. Reason being that some of these words reflect similar semantic changes in my language, and losing them would feel unnatural. "Who" vs "whom" is a good example.

"It's" vs "its" is also bothersome, but for a different reason - this probably has to do with learning English in a way that emphasized formal sentence structure, rather than just the way it sounds. But I think that's not all that uncommon for non-native speakers. When I studied in college in New Zealand, I once had a chance to converse with our ethics teacher (native Kiwi) one on one, where she remarked on my spelling in a recent essay. From there she quickly went on to a general rant about how immigrants are much better than natives at spelling, citing "its" vs "it's" as one of the examples - according to her, she had never seen a non-native speaker substitute one for the other inappropriately.

On the other hand, "if I were" never made any sense to me, and is one of the more common mistakes I still make, even after 8 years of speaking and writing English 99% of the time.


Those are all examples of problems learners hit only when they've successfully learned English. By the time a non-native speaker has to worry about subjunctive or less vs. fewer, they've already reached a level of proficiency that allows them to communicate fluently despite an occasional grammatical gaffe.

Before reaching that level, however, they have to deal with truly maddening stuff like how to pronounce words like "read" or "tear" or "close", or why it's "an interesting little book" instead of "a little interesting book", or why you can see a movie but you're not "seeing TV", or the fact that both "thir-tee" and "thir-dee" mean 30.


I don't think it is a given that languages naturally evolve to become simpler and only narrow-minded people are holding them back.

Languages evolve to suit the needs of their speakers. If people find themselves without words to describe something succinctly, languages will evolve to become more expressive.

I think compound nouns (as used in German) are a good example of this. They make the language more complex and harder to learn, but allow speakers to capture what exactly an object is or how it relates to others by slapping multiple nouns onto each other.


English can do compound words, except with spaces between the words. This is better: it is clear where the boundaries are. When the word combination is used frequently, it may become a single word, like pushchair.

My pushchair wheel lock mechanism is broken.

Or my pushchairwheellockmechanism is broken.


Let's be careful on proclaiming what's better.

Yes, the spaces make it easier to distinguish the individual components, but on the flip side it makes it much harder to figure out where the compound group ends. Even your examples illustrate that.

Different languages have evolved according to different, sometimes diametrically opposite goals. As a consequence, some things are easier to express in some languages than in others. That's the beauty.


The group ends at the verb. I expect there can be ambiguity sometimes, but I don't see it in the example.

I'm learning Danish, and regularly have trouble working out where compound words should be divided. I can't see any benefit compared to separating the words with hyphens or spaces. The pronunciation would be the same.


@TulliusCicero: the only true genderless language is Persian. There is no he, she, it even. There is no way to hear if someone is a man or a woman without a precise context or asking "Man or woman?", which I often hear Iranians ask in conversations.


> the only true genderless language is Persian.

https://en.wikipedia.org/wiki/List_of_languages_by_type_of_g... begs to differ. There are whole language families without genders.


There certainly are. To their point, quite a few of the languages on that list with no grammatical gender are of a similar ancestor language as Persian.


English, German, French, etc are also "of a similar ancestor language as Persian".


Interestingly, it's an easy language & script to learn. In terms of what someone referred to as a "good global language" for universal ease of learning, it provides a great base.


At least the "th" has close function in "thing" and "this". A non-native speaker wouldn't even notice the difference.

But don't expect a non-native speaker to pronounce "pothole" correctly...


We absolutely do notice the difference. One is voiced, the other one is voiceless. The way you articulate them is different as a result, even though there are obvious similarities as well.


I actually don't think the distinction between "thing" and "this" was pointed out when I learned English in elementary school in Germany.

They were just ambiguously referred to as the English "Tee Aitch", which doesn't help anyone.

Note they did point out the different pronunciations for each word, but it was pretty much along the lines of "they're the same thing but sometimes sound different".


A number of other languages maintain that distinction as well. For example, θ versus δ in Greek, and th versus dd in Welsh. But when it comes to confusing aspects of English orthography, the list is much longer than that...


English spelling, especially for words of Germanic or Romance origin, is actually pretty consistent if you think of the spelling as reflecting meaning rather than sound. As accents vary quite a bit, not trying to be phonetic is a plus imho.


"Rather different" phonemes? There is no room for them to be more similar to each other without coinciding. And they contrast in only a single minimal pair, one element of which is incredibly rare.


Well, ok. I'm not a native English speaker, so I suppose "rather" wasn't the right word. Maybe "notably different"?


"Rather" appears to be a fine word choice for your intended meaning. I'm objecting to your intended meaning (as far as I can perceive it):

- [θ] and [ð] are objectively similar sounds, differing only in voicing.

- The argument is not even very good that /θ/ and /ð/ are different phonemes in English -- they contrast only in the pair of words teeth/teethe, and one of those words is fairly rare. If you, as a foreign speaker, pick the wrong one of those sounds, you have 100% odds of being correctly understood as a "guy with a foreign accent". Even if you get confused between s/z (compare "visible"), you're very likely to be understood, but with θ/ð there is literally no chance of confusion.


Thanks for the reply! I appreciate the clarification.

The variations around "th" are one of the things I remember finding particularly difficult when learning English pronunciation, so my comment came from there.


There is ambiguity. For example, "Masse" and "Maße" are two different words meaning "mass" and "measures". It makes a difference whether you drink beer "in Maßen" or "in Massen" (in moderation or en masse).


They happen to be both "Masse" in that little country with lots of cheese and mountains. :)

However I fully understand the problem, given that one of the "clever" ideas of the latest Portuguese revision was to get rid of a few accents.

So "para" (going, giving to some one) and "pára" (stop doing something) are both "para".

Same for "côr" (color) and "cór" (know something by heart), which are "cor" now.

Or "fato" (suit) and "facto" (fact about something), which became "fato".

Oh and in spite of having been reduced to the same spelling, they kept the original pronunciation when spoken, which adds more fun for the foreigners learning the language.

There are lots of other examples. The goal was to make all Portuguese spelling variants more consistent, but it made the language much more context dependent.


I was confused by the idea that 'in measures' could mean 'en masse,' but it looks like you've switched the two terms by accident and 'in Maßen' corresponds to 'measuredly.'


The rules of spelling and pronunciation have evolved over time, and include all sorts of nonlinear effects: the 'ss' indicates short vowels around it (as all double consonants do), and a sharp, voiceless 's'. For 's' it's the other way around. So ß fills a somewhat required place.

That's not to say people wouldn't get around it, or get used to it. But trying to change the rules of languages top-down is like trying to change biology top-down: it makes a lot of sense, and yes, cells are the worst sort of side-effect spaghetti-code, ever. But it won't make life easier.

It's very much comparable to eliminating some random letter from the English alphabet. Life would definitely go on, but it would seem to be a net-negative suggestion to do so now. Hey, keyboards would be so much easier if we switched to binary.

There's obviously the other side of that argument, saying that an alphabet of 80 letters is probably not the best idea. But having studied biology, I've seen too many examples of a messy long-term process arriving at something approaching an optimum, and of people taking a long time to grasp the elegance of some of these solutions, to be willing to entrust this domain to top-down decision making.


> eliminating a random letter

Reminds me of the scifi short story "Xong of Xuxan" by Ray Russell about a girl with a typewriter that didn't have the letter 's'.

http://www.isfdb.org/cgi-bin/title.cgi?68165


In the German orthography reform of the 1990s the "ss" replaced the "ß" after short vowels. But after long vowels it is still used.

So with the current spelling system, it's possible to distinguish words pronounced with long vowels from the ones with short vowels (e.g. by a German learner who sees a word for the first time). So it does have its advantages.

I guess this goes back to the origins of the ß - it can be interpreted as either a ligature of ſs=ss or ſz=sz (ſ is the "long s" which is no longer used).

Maybe we should go back to writing the long vowel words with "sz". I am sure that would irritate everybody :D


In the current orthography, a single consonant causes the previous vowel to be read as long, and this holds for `ß`, which is considered in the current orthography to be a single letter (though it was historically a ligature). Changing `ß` to `sz` would make an exception to this general rule.


Take the words Maße (measurements, sizes) and Masse (weight, mass).

There’s hundreds more of such examples.

ß and ss have a different effect on surrounding letters (the a in Maße is short, the one in Masse is long)


I think you meant it the other way 'round? long/short, I mean.

Otherwise I totally agree, ß is important to distinguish phonetics.


It is important but you are marking the wrong thing, you could make it Māsse and Masse if you really cared.


I don't think speakers of the language that invented Worcester Sauce are in a position to give prescriptive logical advice to other languages.

Languages are the products of a sort of evolutionary process. Rarely do the conventions appear to be the best possible solution at first sight. But it isn't uncommon for convincing arguments to appear later, showing that some seemingly arbitrary rules are almost beautifully constructed to make languages efficient. Not only do common words tend to be shorter than uncommon ones; I've also seen examples of word pairs that would be dangerous to confuse being further apart in spelling and pronunciation than statistics would suggest (I believe sailing was the example I saw: single words can mean life or death, and are often spoken under difficult circumstances).


> I don't think speakers of the language that invented Worcester Sauce are in a position to give prescriptive logical advice to other languages.

Of course they are. They didn't invent it, and even if they did they could still be right on _this_ argument.

>Not only do common words tend to be shorter than uncommon ones; I've also seen examples of word pairs that would be dangerous to confuse to be further apart in spelling and pronunciation than statistics would suggest

And there's a whole bunch of counter-examples for that. One is the German "zwei" (the number 2) and "drei" (3). The German military commonly uses "zwo" for 2 because those two are so easily confused.

It's not like our languages are willfully poorly designed, but there's still a bunch of bad decisions in there, and we might be in a position to fix it, especially when it comes to things like spelling (which is mostly decreed from on high, since what schools teach matters most).


Or Maesse, which is closer to what it sounds like anyway.


Which would be equivalent to Mässe, though. And then you'd have another letter you may want to abolish.


You're right! It's a good job there are no words in English where one spelling encompasses many meanings, of which some could even be contradictory, which would have to be inferred from context. That would be so bad!


There's no reason for sarcasm, especially not with such misplaced sarcasm. Nobody is arguing that the ß is required for the future of humanity, or that German is somehow different, or even superior, to English, and requires this letter to do its magic.

But for good or bad, it is part of the language. People used to it consider replacing it with 'ss' as ugly. Because of custom, and also because a double consonant almost always indicates a short vowel, whereas a single 's' indicates a soft s. Only ß actually represents the intended pronunciation.

f course w'd gt by witout it. Bt Englis wld similry srvive vs fewr lttrs, so y nt ablsh 1/2 t lphbt?


I don't know whether you're a native English speaker or not, but you certainly reason like one. I started learning English when I was 6 years old and have had very few problems with it. My wife started learning it when she was 39. She's having a horrible time trying to wrap her head around all the inconsistencies, the incredible amount of stuff that has to be inferred from context and the utter lack of pronunciation rules.


See "read" with a different pronunciation depending on tense (/riːd/ vs. /red/).


So, you’re saying we should actually make German a worse language just for your own convenience? No thanks.


as somebody else already said in this thread: this argument is irrelevant, it works just fine w/o the ß in Switzerland.


By that argument, PHP works just fine. I still prefer Haskell, or even Java.

And I’ll prefer a natural language that tries to be as unambiguous as possible, and with as few exceptions as possible, while being as expressive as possible.


It isn't, the Swiss easily live without ß and use ss – the meaning is usually apparent from the context.


Well, many things are clear from context and spelling could be simplified.

For example, in Year 1 that useless letter "c" would be dropped to be replased either by "k" or "s", and likewise "x" would no longer be part of the alphabet.

The only kase in which "c" would be retained would be the "ch" formation, which will be dealt with later.

Year 2 might reform "w" spelling, so that "which" and "one" would take the same konsonant, wile Year 3 might well abolish "y" replasing it with "i" and iear 4 might fiks the "g/j" anomali wonse and for all.

Jenerally, then, the improvement would kontinue iear bai iear with iear 5 doing awai with useless double konsonants, and iears 6-12 or so modifaiing vowlz and the rimeining voist and unvoist konsonants.

Bai iear 15 or sou, it wud fainali bi posibl tu meik ius ov thi ridandant letez "c", "y" and "x" -- bai now jast a memori in the maindz ov ould doderez -- tu riplais "ch", "sh", and "th" rispektivli.

Fainali, xen, aafte sam 20 iers ov orxogrefkl riform, wi wud hev a lojikl, kohirnt speling in ius xrewawt xe Ingliy-spiking werld.

(Attributed to Mark Twain)

About the Swiss and the ss: I remember one funny occurrence I had while reading a Swiss paper. It took me two paragraphs in an article "Neue Busse für $company" till I understood this wasn't about buses, but a penalty fine (Buße). The two are pronounced differently and mean different things.


One of the ways in which we know how classical Latin was pronounced, is the recovered writings of a Roman scholar who seriously proposed dropping the letter 'k' as redundant and replacing it with 'c' anywhere it is used. Should give you a hint as to the actual pronunciation of "veni, vidi, vici".

EDIT: Erroneously swapped 'c' and 'k'.


> the recovered writings of a Roman scholar who seriously proposed dropping the letter 'c' as redundant and replacing it with 'k' anywhere it is used

Can you source this? It doesn't make a lot of sense -- the Latin letter C wasn't redundant, because there is no Latin letter K.


I got it from this video, which attributes it to Quintilian: https://www.youtube.com/watch?v=_enn7NIo-S0

And I seem to have swapped 'c' and 'k'; Quintilian wished to eliminate 'k' in favor of 'c', which is probably why you think there is no Latin letter 'k'.


If you look at the qualifier Quintilian adds (the video translates it as "the letter C keeps its strength before all the vowels"), you can see why K was in use at the time -- the sound change familiar from modern Romance languages, where C "softens" to [s] or [tʃ] before front vowels, made indicating [k] before front vowels difficult.

That is an innovation to deal with language change. It is not a part of classical or old Latin. Quintilian isn't proposing dropping K because C makes it unnecessary; he's proposing retroactively not adopting K in the first place because the change that prompted people to use it is wrong.


I didn't even notice the changes in spelling until really far in. Blew my mind!


I love different characters, so if þ and ð were to be reintroduced, I would fully embrace it. I don't þink it would happen ðough, because it does read really weird to ðe untrained mind. Kind of like a 'b' and a 'd'.

I wonder if I even used them right.


I don't think there's anything wrong with þ either. It fell out of favor, so it went away. If ß falls out of favor one day that's fine, but at the moment it's part of the language and that's fine too.


Thorn wasn't in the printing presses (imported from Germany) so the letter wasn't used. "Y" was in the font, but wasn't much used in English, so for a while that was used instead. "Ye Olde Shop"


A double letter signifies a short vowel beforehand. So the word "Stoß" (impact, jolt) should be written with "ß", because it is pronounced with a long 'o'. If you were to write it as "Stoss", the 'o' should be pronounced short, which is inconsistent.


In theory (standard language), ss and ß mark differences in pronunciation. That's why some words are written with ß. In practice, the words are pronounced differently in different regions, which is why some German-speaking people don't know the difference.


Really?

As someone who speaks German natively, can you give me an example or a source for this?

ß is after a long vocal, ss is after a short vocal. I don't know any regional problems with that (and I come from a region with a VERY different language than most of the German speaking people)


The op claimed it makes no difference if you write words with ss or ß. I simply claimed it makes a difference in standard language but certain dialects differ from that so that many people don't know/hear/speak certain words correctly and thus don't understand the difference. For some words, both forms are said to be correct (e.g. Geschoss and Geschoß) but not for all.


BTW, the English word for Vokal is vowel, not vocal (which means something else).


Oops - thank you.


It's rather like using the ^ symbol for exponentiation: it's a commonly accepted workaround when you're limited to ASCII, but it's not the proper notation.


As a European, I disagree with you. I don't have the German letters on my keyboard and I've yet to come across someone from Germany who took exception to my butchered German spelling. I do the same in my own language which has a compound letter for ij, which I wouldn't even know how to create and which is part of my name.

Communication is not about glyphs, it's about getting meaning across from one person to another and whatever works is good enough.

Schoene gruesse :)


Of course nobody is going to fault you for succumbing to such practicalities, and it's perfectly fine to use substitutions when necessary.

But there's something to be said for preserving a few aspects of the diversity of cultures, as each one acts as its own preserve of centuries of wisdom. Hey, it's the German language that gave you Uber (the name). Which should actually be Über, but I digress. (and it'll be Over soon enough, anyway).

We've come a long way with making unicode universally available, and there's really no reason to give up now when it's almost done. It's not a big issue for German, but cultures with non-latin script would rightfully be offended if forced to abandon it for ASCII, in the same way that the world would be poorer if all restaurants were replaced by McDonald's.


Agreed on that, but culture to me goes a lot further than exchanging messages on the net.

I'm thinking of France here, which has something called the 'Academie Francaise' (apologies for butchering the c-cedille in this context, the accent aigu 'e' in cedille and academie). The only effect this seems to have had is some token reinforcement of French as a 'world language', but in practice it became a 'come up with French words for words where we already have a perfectly good one that everybody the world over understands but which isn't French'.

If that's culture then I'd much more concentrate on the content than on the form and remove these artificial barriers. Which of course will lead to more and bloodier wars ;)

Unicode is great, it made all languages equal. But I don't see the point of artificially making the problem worse by introducing new glyphs, that's the wrong message to send. We should aim for simplification within limits (not to go down to caricature) and if German got by so far without the 'capital sharp S' I see absolutely no reason to introduce it in the modern age. It's a step back, a small one, but still.


I find introducing "new" unicode glyphs for characters that have historically been in wide use is extremely useful, as people working with historical texts no longer need to use workarounds to express them. See [1] for an example of the kind of characters you find in historical texts. Having an official way of representing anything that's actually (or has been) in use makes sense. Whether that's the same as recommending it for widespread adoption is a different question. Several German-speaking countries have dropped the ß from their orthography, but still agree on what the ß character should look like and how to represent it in text.

Now, emoji, OTOH, just feels to me like unnecessarily polluting the character space. I'd much rather have a capital ß in the codespace than some smiling poo family.

[1] http://www.unicode.org/L2/L2015/15327r-n4704-medieval-punct....


I have never understood the idea of protecting a language against foreign words. I think languages naturally strive to be as expressive as possible and will adopt words accordingly.

However, I think it is important to note that the ẞ is not a letter they just made up. Heck, even the picture in the article shows an old book cover, which uses this never standardized letter. To me this is just correcting a previous oversight and formally allowing (not forcing) people to use the 'capital sharp s', for example in print, where they want to typeset a title in capitals.


I can see a lot of trouble coming from this. Sort order for one. If a language this old didn't need a glyph up to 2017 it can be done without, especially if the use case is limited to all caps words. But I guess in that case you could convert its occurrence to SS first and then do the sort.


Actually I was a bit surprised when I learned that Unicode only added a codepoint for it in 2008 (and even rejected an earlier proposal). While the spelling rules have said for some time to replace it with "SS" the glyph was used quite often, including official documents such as passports.

For sorting there is a standard (DIN 5007). Replacing ß with ss is correct (even in the lower case variant). The other letters are more fun: ä is replaced with a for sorting, except if you sort a list of names. In that case you need to replace it by ae. Probably not something international software is aware of. The Austrian sorting of names is different and other languages that have the same glyph (e.g. Swedish) also have other rules (e.g. by placing ä at the very end of the alphabet).
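
The two conventions described above can be sketched as sort keys in Python. This is only a rough illustration, not a complete DIN 5007 implementation: the function names are made up for this example, only ä, ö, ü and ß are handled, and the Austrian and Swedish rules are ignored.

```python
# Hypothetical helpers sketching the two conventions described above.
# Only ä, ö, ü and ß are handled; this is not a full DIN 5007 implementation.

# Dictionary-style ordering: umlauts sort like their base vowel, ß like ss.
_BASE = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})

# Name-list ordering: umlauts sort like vowel + e, ß like ss.
_EXPANDED = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def dictionary_key(word: str) -> str:
    """Sort key treating ä/ö/ü as a/o/u."""
    return word.lower().translate(_BASE)


def name_key(word: str) -> str:
    """Sort key treating ä/ö/ü as ae/oe/ue."""
    return word.lower().translate(_EXPANDED)


names = ["Müller", "Mueller", "Muller", "Mutter"]
dictionary_order = sorted(names, key=dictionary_key)
name_order = sorted(names, key=name_key)
```

With the dictionary key, "Müller" lands next to "Muller"; with the name key it lands next to "Mueller". Real software would normally rely on locale-aware collation rather than hand-written keys like these.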


> Probably not something international software is aware of.

Collation rules that vary by locale exist for this reason, and all major programming languages and OS'es support this. Of course whether the software you use does this or not depends on the developers writing the software.


Thanks for the term, I wasn't aware of it. This looks to be a standard problem in DBMSs at least. While not all of the rules I mentioned seem to be shipped by default, it looks fairly straightforward to add them. I just remember our development team moaning quite a bit after I added the requirement to sort names differently.


Funny you mention Swedish. Let's all bond over the shared experience of "why tf is this using latin1_swedish_ci collation?"


Actually there was a time where many typefaces actually included a capital sharp s and the letter was in somewhat common use, cf. https://de.m.wikipedia.org/wiki/Gro%C3%9Fes_%C3%9F, especially the second section. Limited availability of a glyph and no way of representing the letter in any digital text encoding probably meant it fell out of use and was more or less forgotten.


> I'm thinking of France here, which has something called the 'Academie Francaise' (apologies for butchering the c-cedille in this context, the accent aigu 'e' in cedille and academie).

Actually, German has the Council of the German Language, which is what did this standardization effort of the new ẞ


In my language omitting diacritics can change the meaning a lot. "Ji ilgėjosi savo sūnų" vs. "Ji ilgėjosi savo šunų". Many lazy compatriots of mine would write "Ji ilgėjosi savo sunu“. It may be obvious from the contexts, but without it you could interpret it either as the first one ("She was missing her sons") or as the other one ("She was missing her dogs").

This lazy way of writing had some valid technical reasons in the past (like phones not supporting diacritics), but these have long been solved.
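
The collision between the two words is easy to reproduce. Here is a small Python sketch (the function name is invented for this example) that strips diacritics by decomposing each character and dropping the combining marks, which is roughly what the lazy spelling does by hand:

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


sons = strip_diacritics("sūnų")  # "sons" in the sentence above
dogs = strip_diacritics("šunų")  # "dogs" in the sentence above
```

Both calls return the same ASCII string, so anything downstream (a search index, a spell checker) can no longer tell the two words apart.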

P.S: if you think gendered nouns are hard, try language where almost everything is gendered: nouns, verbs, adjectives. In English: "I saw a nice boy walking"/"I saw a nice girl walking". In Lithuanian: "Mačiau gražų vaikiną einantį"/"Mačiau gražią merginą einančią".

And then there is this: https://s-media-cache-ak0.pinimg.com/originals/51/91/e2/5191...


Sure, languages are about communication. As long as the other person understands what you are trying to get across, you're probably fine.

However, orthography does serve an important part in getting your point across. German for example heavily relies on capitalization (compared to e.g. English) to help the reader parse the sentence structure. The Umlauts serve a similar purpose to make it easier to differentiate words.

For instant messaging this probably does not matter too much, but what about books, technical reports, etc.? What's to gain from not using "proper" orthography in them, which might give the reader more hints about the sentence structure and hence its meaning?

Schöne Grüße ;)


> German for example heavily relies on capitalization (compared to e.g. English) to help the reader parse the sentence structure.

For Germans, maybe. And maybe for Germans it is a bit harder to read an English text where only proper names and the first word of a sentence are capitalized. But I routinely forget to capitalize even proper names (especially with companies, days of the week, names of months and such), and I've never felt that having upper case for the first letter of each noun made German any easier to parse, I'm mostly blind to 'case' and have to really remind myself that 'I' in English is with a capital letter.

In Dutch this is even stronger, we do not capitalize most of those except for proper names and the first letter of a sentence.


> and the first letter of a sentence.

Except for digraph IJ, which needs both I and J capitalised, e.g., IJsselmeer.

> I'm mostly blind to 'case' […]

Most people aren't though. Leaving out capital letters where they should be jars the reader's flow. Let each language have its idiosyncrasies; it makes life more interesting.


I've seen Greek, Russian and even Arabic written with Latin letters on the internet for convenience. So ss for ß or the other way round is only a minor inconvenience, if even that. By the way, typing ij as two letters is the official way to do it; only in handwriting and occasional signage will you see it written as one glyph. I have no idea why the ij digraph/ligature ended up in Unicode, really.


It's probably for round-tripping with old single byte character sets. In this case, with Dutch VT220 character encodings.


>This change fixes an internal inconsistency

I agree that it does! "ß" turning into "SS" when uppercasing is awful.

>It feels like replying to some bugfix in, say, Ruby with something like "I never liked Ruby anyhow".

In this analogy, "Ruby" would be the German language. And in that case... yes, I can disagree with a bugfix if it makes matters more complicated! I can argue that something is a misfeature, like ß is. Removing it would improve the situation, especially from a technological viewpoint.

>German's not English, even though you might prefer if it was

That's kind of a low blow. "ß" is not a defining feature of the German language.


>"ß" is not a defining feature of the German language.

It is though. As a language-lover and German-learner, AFAIK, no other language uses this letter. "ß", along with less exclusive "ä", "ö", and "ü" are what distinguishes German orthography from other languages.


English seems to manage okay with no 'special' latin characters.


What a coincidence that English manages fine with just the letters in ASCII.

Seriously though, what definition of "special Latin characters" are you using? Italian does just fine with fewer...


G, J, K, U, W, Y, and Z are all characters that aren't used in Latin (in the case of G, J, U, and W, because they didn't exist; in the case of Y, it is used in Greek words but doesn't represent a sound that's possible in Latin; I'm not sure if Z was used in Greek words or not. K is just a Greek letter -- since it represents the sound written with C in Latin, it is never used at all).


Sure about the G?



Latin wikipedia isn't exactly an ancient source. Without bothering to read the article, why do you think it has a subhead of "Fontes_de_vita_C._Iulii_Caesaris_praecipui"?


I'll take this cum cranulo salis.


You think "cum granulo salis" is evidence for G being an original part of the Latin alphabet, while not being evidence that U was an original part of the Latin alphabet?


Sure, G wasn't "originally" part of the Latin alphabet—until they invented that letter.


What do you think makes G different from J and U in that regard?


Timing. Where do you think G slots in the timeline, relative to the Latin classics?


G dates to the classical period. But, as I pointed out above, it's still intrusive enough then that Caesar's praenomen, Gaius, is written C, not G. The name is older than the letter, but younger than Latin writing.


What do you mean by "special"? If you mean "can be encoded in US-ASCII", then that's a bit circular, because that character set was obviously designed to encode the English language. In a parallel universe where the ancient Romans had designed a character set, it would be the English language that would require all those weird unicode characters like "J" and "U" (which didn't exist in the Roman alphabet).


It actually seems fairly intuitive to view Ä as a cheap knockoff of A whereas U and W are independent forms.

That would be wrong -- as the name suggests, W derives from U, similarly to how G, J, R, and U derive from C, I, P, and V. But it shouldn't be hard to see the intuition behind the idea that s is a "natural" letter and š is an unnatural modification.


I meant that at least AFAIK all the English characters exist in several languages. No special tics or dots that are unique or very unusual.


> tics or dots that are unique or very unusual

It's almost like a lot of languages have a shared ancestry. Differences may be unusual to you. Are ș, î, or ț unusual because they don't exist in English? Those phonemes aren't common in English which is why we don't have distinctions for them, but they're important in Romanian. Not unusual at all if you're Romanian.

By your logic, 대한민국 characters are weird and unusual also, and so on...


Yes, hangul is unusual, because there's only one language that uses it. Duh. Latin characters are less unusual because tons of languages use them. I'm not sure why this is difficult to grasp.


Tons of languages use é, ø, and other accents so they are "normal" by your definition.


Except "café"


And naïve, and façade, and mediæval, and coöperate, which are either correct spellings, or a matter of editorial style.


When I think "weird special English letter" I think Œ, as in fœtus. Granted, in American English it has been eliminated almost completely and even in British English it's now often seen as a bit archæic (pardon the pun) but it's still there and to me signifies "English" just as ß signifies "German", ø signifies "Nordic", ı signifies "Turkish" and é signifies "French".



Yes, but that's not modern English.


The sharp s is only used in Germany and Austria, but not in Switzerland. There really is no need for it.


As a German and Unicode nerd, I love this. It'll take forever to be standardized in keyboard layouts and to be fully supported in mainstream fonts but I love that this makes it officially blessed and gives it leverage when convincing type foundries and standard bodies to support it.

DON'T PANIC. You'll be fine to use "SS" when shouting at people online for the foreseeable future. But those of us who care about typography can now use the proper capitalization and maybe 5-10 years from now keyboard layouts will support this like they added support for the Euro symbol (although that was a bit easier).


As a French person living in Germany I love it too. Restricting the wonderful differences between languages just because of the artificial "keyboard entry issue" is a lowest common denominator approach which makes no sense at all.

Can you imagine restricting Chinese to just the ideograms you can type on a keyboard?

So, thank you for all the ß in German, the ø in Norwegian, the œ in French and all these small variations which are making my reading a bit more tasty.


FWIW the French ç is a pain to type on a German keyboard too (hooray for compose-comma-c on Linux) but I like these quirks of our languages that all share a variation of the same alphabet but took it in different directions.


This really doesn't have much of any effect on anybody–as you noticed yourself. It's for use in ALL CAPS, which is mostly done in software these days (i. e. headlines formatted with CSS), so you actually don't have to type it.

It in fact simplifies lots of things, because the ß->SS capitalisation screws up all sorts of algorithms: It's unusual to replace a single lowercase letter with two capital letters. It's also not reversible, so you actually need a dictionary to do 'AUSSENTEMPERATUR'.lowercase.
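
For anyone who wants to see this on their own machine, here is a quick Python sketch. It only relies on the built-in `str.upper()` and `str.lower()`, which follow Unicode's default case mappings; the variable names are just for illustration.

```python
word = "außentemperatur"
shouted = word.upper()        # ß expands to two letters: SS
round_trip = shouted.lower()  # the ß is not recovered

# The string grows by one character when uppercased.
grew_by = len(shouted) - len(word)

# Two different words collide after uppercasing.
collision = "maße".upper() == "masse".upper()
```

After the round trip you are left with "aussentemperatur", and without a dictionary there is no way to know whether the original had ß or ss.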


As a developer the ß->SS transformation has been one of my pet peeves about digital German. Uppercase meant destroying data. It's awkward. The ẞ glyph fixes that.


A language should not be optimized for easy keyboard typing. The power of German comes through its richness and beauty of expression, including ß. Downgrading this for easy typing would be the wrong direction.


A language should be optimized for what its speakers want it to be optimized for. Of late, this has always been easy keyboard typing.


It's trivial to support ẞ in smartphone virtual keyboards. Physical keyboards may follow.


Humans downgrade for easy typing all of the time. It's a nightmare for search engines.


If we have this character, I think it makes sense to have an uppercase form, turning toUpperCase() into a proper bijective mapping for German :)
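
A minimal Python sketch of what that could look like, assuming you opt in to ẞ yourself. The helper name is made up, and the built-in `str.upper()` still produces SS by default:

```python
def upper_keep_eszett(text: str) -> str:
    """Uppercase text, mapping ß to ẞ (U+1E9E) instead of SS."""
    return text.replace("ß", "\u1e9e").upper()


shouted = upper_keep_eszett("straße")
restored = shouted.lower()  # U+1E9E lowercases back to ß
```

With this variant, "maße" and "masse" stay distinct in capitals, and lowercasing gets you back to where you started.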


> how do I type this?

On Linux, capslock then ß works. No idea how to do that on Windows, though.

> so the only reason to use it is to write a word in ALL CAPS.

Yep, it was only added for ALL CAPS. It was actually added to Unicode back in 2008.

https://en.wikipedia.org/wiki/Capital_%E1%BA%9E

IMO it makes sense that it exists. ß -> SS -> ss is a lossy transformation.


I love the compose key on Linux and other -nix-like OS'es. I'm Dutch, and use a standard US-layout keyboard. I can type every diacritic and special character I need — including these em-dashes — just by hitting the compose key (most people use right-alt for this), and a logical sequence of keys.

For ë, this is (compose key) " e. For the em-dash; (compose key) - - - (for en-dash, it is - - .). I.e., it all makes sense.

So when I write in German, I can enter the ß by hitting (compose key) s s. Makes sense right? So reading this article I wondered: could I compose the upper-case ß?

Sure: ẞ. (compose key) S S. Totally guessable. Someone must have submitted a patch for it sometime these past few years, and another person submitted the glyph to a couple of fonts.

It's such a shame the compose key never took off on Windows or Mac OS. I cringe every time I see a colleague enter a memorized alt-code on Windows, or just forego diacritics and non-ASCII characters in general; even in people's names.


As a learner/non-native speaker of German, please don't abolish the only way to tell the vowel length of an unfamiliar word.


Weg vs weg. No way to tell either...


The use of ß permits consistency in the syllable splitting rule (split ,,ss'' but not ß).

It's a little weird that it looks and is pronounced like s-z but languages do evolve...and we no longer have need for the ch and tz Fraktur letter/ligatures.

Would you always write ,,ue'' instead of ,,ü''?


We are German. We have rules for everything. DIN 2137 declares that the upper case ß should be typed with 'ALTGR + H'. On most Linux distributions you can type it by activating CAPS-LOCK and type a normal ß. I also think that this is the best solution for this problem.


Thanks. Do you know how to fix this on Linux (Ubuntu in particular)? When I press ALTGR+H I get ħ, not ẞ.


AFAIK the official keyboard driver for the DIN 2137 T2 layout is available only for Windows. I think you have to rebind the keys manually...


Belgian here, I grew up on AZERTY, but ended up learning QWERTY for coding.

Story time... My last laptop had a physical QWERTY keyboard, but it met a fateful end (I drove on it) and had to replace it immediately. Good luck finding a QWERTY MacBook in the stores of the corner of France I currently live in... So now I'm typing QWERTY on an AZERTY keyboard.

Not sure how it works on PCs, but OS X has a "US. International" keyboard mode, where ` ' " and ^ àré dëâd keys to type accented characters. You can toggle plain US and "US International" with a simple keyboard shortcut.


You can find almost any laptop parts on eBay.

The Mac parts are usually 2 or 3 times as expensive as the similar PC parts.


Oh, yes, I know. I usually source Mac parts from iFixit though, they provide a warranty and are a known quantity.

I'm used to repairing my computers, but the broken one was wrecked (it still booted, but the case was bent around the battery, and the screen was broken).

Changing the keyboard of the new one would void the warranty, and having it changed by Apple staff would cost half a new notebook.

So I type QWERTY on an AZERTY keyboard. The only issue I have is the closing paren, (<shift> + <0> in QWERTY) is next to the corresponding key in AZERTY (<-> in QWERTY), so I end up sometimes typing underscores (<shift> + <->) when I want to close parentheses.

I touch-type for the most part, but that key is too far from the home row for me to hit reliably, so I end up looking and mistyping anyway :-)


"Personally I'd rather use composing"

Then use a keyboard with a compose key: compose-s-s for ß is already typical, set up compose-S-s for the few cases you need ẞ.


I'm on Ubuntu and have a laptop with a Menu key, so I rebound that to Compose. compose-S-S also works, btw, as does compose-s-s with capslock.


> QWERTZ is already bad for e.g. programming with all its punctuation (typing "]" means holding altgr and pressing 9), so that would make it even worse.

This is why I switched to a US layout a few years ago. Mac in particular makes US layout pleasant to use with European languages. Want a funny S? Hold S, pick what you want. All these are one longpress, and one num key, away: ß ś š


I personally think the bubble takes way too long to pop up. There are hotkey alternatives to a lot of the common characters. alt+u makes the next character have an umlaut (¨). alt+e -> ´. alt+s -> ß. alt+c -> ç, etc.


As a german with an ß in my last name I love it.


> I see this as a step in the wrong direction.

Agreed. The council's press release mentions that this will enable the use of the character in passports. The horror for the poor people who have this in their name! I hope for their sake they will have a say in this. Otherwise, filling in any form abroad will become a headache.


For us Poles it's a toss-up whether MICHAŁ is interpreted as MICHAL or MICHAK at borders.


Actually, funnily enough German government agencies properly write Ł on all my papers. I needed to settle for L anyway because banks, telcos etc. have a huge problem with it.

On the other hand in some other countries it ends up being some random crap. Eg.: Micha%b


Interestingly, the German government has defined their own subset of Unicode for use on documents: http://www.bmi.bund.de/SharedDocs/Downloads/PERS/Themen/Vorh...


Wait, how does it become K suddenly? I mean, if it's something other than L, shouldn't it then really be W?


It looks a bit like A K.


It is already standard (and always was) to use the ß in passports as this is the only correct way, problems abroad notwithstanding. However, if you can prove that it creates hardship abroad this is one of very few permissible reasons to change your name in Germany.


How is this different from names that contain an umlaut (ÄÖÜ)?


Foreign officials ignore the umlaut when looking at your passport or when transcribing what you scribbled onto a paper form. It's (rightly) viewed as a modifier of a letter they know. (In contrast, the sharp S is a letter they don't know. Many people think it's a B.) When filling in electronic forms, you're typically better off leaving off umlauts.

Source: A lifetime of experience with funky accents in my name, not living in the place where the name comes from.


That’s actually the worst thing you can do. That actually violates the relevant DIN standards for transcription.

You have to replace Ä with AE, Ö with OE, Ü with UE.

You won’t find a person named Müller under Muller in any database, especially not digital databases.

For electronic forms, always use äöüß, or, if the system happens to be an American one (the only form I’ve ever had to fill where I couldn’t use äöüß was a US DoD export declaration), use ae oe ue ss. Although be aware that some US software will have an issue with that, too, and just remove ß and assume äöü to be aou.
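
To make the Müller/Muller point concrete, here is a rough Python sketch of a lookup key that applies the ae/oe/ue/ss replacement before comparing names. The function name and table are invented for this example, and it says nothing about how any real government or airline system does it.

```python
# Hypothetical transcription table: umlauts become vowel + e, ß becomes ss.
_TRANSCRIBE = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss",
    "Ä": "AE", "Ö": "OE", "Ü": "UE", "\u1e9e": "SS",
})


def lookup_key(name: str) -> str:
    """ASCII key for comparing German names regardless of case."""
    return name.translate(_TRANSCRIBE).casefold()


same_person = lookup_key("Müller") == lookup_key("MUELLER")
wrong_match = lookup_key("Müller") == lookup_key("Muller")
```

Here `same_person` is true and `wrong_match` is false, which is the behaviour described above: simply dropping the dots produces a key that no longer finds the record.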


Yeah, I guess you're right for German. My accents are different.


Do people outside Germany even care about DIN standards?


The machine-readable part of the German passport uses this DIN standard, so to avoid having to explain German orthography to some suspicious border guard, you'd better make sure you use it.


But when you get a US visa, the name in both the visual and machine readable part is written with ä replaced by a, so you end up with three different spellings in your passport. And then you get a Russian visa, where they use a pretty strange transliteration into the Cyrillic alphabet (not the one I would use) and then transform that back to ASCII in the machine readable zone for variant 5. And so on.


I see your point. But wouldn't this problem be better addressed with an additional field on one's passport, which states the name in a standardized alphabet, e.g. ASCII?

Asian passports seem to include something like this.


Most (all?) countries that use non-Latin alphabets include a romanized name for these reasons.

The only problem is that often there isn't a good clearly unambiguous set of rules on how to do that. For the forms, it doesn't matter, so long as the country chooses one and sticks with it. But it can mess up the pronunciation of your name real bad.


The passports contain such a field in the machine readable zone. It has created some confusion in the past, as the transformations ß -> ss and ä -> ae are pretty surprising if you never heard of that.


Why do they need all these orthography reforms anyway? Like with the reform of 1996 - it was supposed to simplify things, instead they tend to introduce a huge set of rules and a bunch of exceptions to these rules; as a result it gets to be more complicated than before...


German millennials must have something to do with this; it didn't bother anyone throughout the 20th century.

The strange thing is that it looks indistinguishable from the lower case one.

It's as though they have taken the lower ß and tweaked its font metrics to make a capital version, and not actually invented a proper upper case character.

For all we know, the examples shown could just be a lower ß taken from a different font.

They should have changed something substantial in the glyph, like added a corner, or loop or whatever; some feature that the lower case one doesn't have.

How about borrowing a Chinese radical: ⻏ ?

That can't be, of course; this whole thing is part of some initiative to save Germany from cultural invasion.

GRÜ⻏ GおTT!


The idea for a capital ß is pretty old and goes back to the 19th century. One edition of the DDR Duden had a capital ß on its cover. German ID cards only use capital letters and need to distinguish ß and ss. It has nothing to do with millennials.


> goes back to the 19th century

Yes, and so that indicates people are reviving a problem that their great grandparents didn't bother with.

> German ID cards only use capital letters and need to distinguish ß and ss

Are these newly introduced? If not, why hasn't it been a problem so far? If they are newly introduced, why do they only use capital letters?


The 1988 passports had capital letters and used ß as well, not sure what was used before. International standards (namely ICAO Doc 9303, starting in 1980) strongly recommend capital letters but I don't know the reasoning.


The idea is old, yes. But except for the one singular example, no one ever used it. It has no place in the language.


It is not one singular example. Speakers of German have used that letter occasionally for the last 120 years or so and there was a debate to introduce it the whole time.


Show an example of more than ten years ago, please.


Well, one is the first picture in the featured article (it is from 1910). I collected some more here: http://imgur.com/a/NaFgA

And any German passport (at least those issued after 1988, don't know about what was used before).



That can be said about every feature of every language at some point in time.


Sure. Language changes and that is good.

It's not good when an unelected body decides on language changes, without that change already extant and in use by a significant part of the speakers.


This body was (on the German side) put in place by the Kultusministerkonferenz. Their recommendations don't "change" the language (and never could do so). They just say what administration and schools have to use. Everybody else is free to do whatever they want (and many publishers have their own version of the rules).

The changes that were made before (such as the ones in 1996, 1901, or 1876) were made by similar bodies.


This article: https://typography.guru/journal/capital-sharp-s-designs/ (linked from the featured article) shows some designs with corners and stuff. Some of them look less ugly than others. But in all examples except the Backstein one the letter still stands out like an ugly thing that doesn't go well with the rest of the font. More work is needed.


As a german with a US layout for coding I do not care.


I'm a big fan of ASCII German (downconversion of scharfes s and umlauts); but some people just like the way the hand orthography looked, and will do anything to hold on to it.


I'd suggest maybe it should just be a ligature, but that would make it hard to type English words like crossing and German sharp-S words in the same font.


And there are double-s words in German (like "Amtssicherheit", off the top of my head) that don't use a sharp S - it's not a ligature.


There are words where the difference is actually significant, e.g. "Masse" (=mass) vs "Maße" (=dimensions)

It's also not a ligature for sz ("Amtszeit"), although it has developed from one (of the long s and the tailed z, see https://en.wikipedia.org/wiki/%C3%9F )

I like that the information loss in toUpperCase() is gone now in this case, so I am all for it O:)


Amtszeit is a bad example, though, as German never uses ligatures for letter sequences spanning syllables. And I think there are no examples of sz appearing within a syllable.


It's actually a ligature for "sz" - at least it used to be. You probably won't find many german words that have an actual "sz" in them.


But there are some – e.g. Auszeit. And at the same time, we use ß in locations where it'd otherwise be ss. While it started as a ligature, its current usage doesn't fit that.


That doesn't mean it's not a ligature. If English still had a "th" ligature, the word "outhouse" wouldn't use it.


Interesting claim ... a similar compound word with a present-day ligature is "halfling." TextEdit renders that with an "fl" ligature, but should it not?


Is w then still a ligature of vv, and should vve remove it?


th in English wouldn't be a ligature, it would be a letter (thorn or eth).


Absolutely true, it's not really its own letter nor is it really a ligature. It's a unique feature of the german language, which is hopefully here to stay...


It's more apparent if you look at the long s (ſ) which has since fallen out of favour.

If you squint and your font permitting, ſz should look more like ß.


Or, even more similar, ſƷ. (Which is long s (ſ) + capital ezh (Ʒ))


There are tons according to http://www.wordmine.info/Search.aspx?slang=de&stype=words-wi...

FRAKTIONSZWANG FREIHANDELSZONE KONFISZIEREN [...]

A lot of them are combined, but still quite a few 'regular' ones


Here is the full list of non compounded words containing an "sz".

Abszeß - Adoleszenz - Disziplin - Eszett - Faszie - Faszination - Fluoreszenz - Koaleszenz - Konfiszierung - lasziv - Lumineszenz - Obszönität - omniszient - Oszillation - phosphoresziere - Plebiszit - Proszenium - Rekognoszierung - Rekonvaleszent - Szene - viszeral

Most of these actually are of latin origin. My favorite is obviously "Abszeß" - probably the only word in the german language that will show a non compounded co-existence of "sz" and "ß".


It did, 20 years ago ;)

Nowadays it's written Abszess as the e is short.


FWIW, in all of them the "sz" spans a syllable boundary (yes, even "Szene" although I despise the syllabic "S" with a vengeance because it's so awkward).


Yes, historically it's a ligature.


From Wikipedia (https://en.wikipedia.org/wiki/ß):

In German orthography, the grapheme ß, called Eszett (IPA: [ɛsˈtsɛt]) or scharfes S (IPA: [ˈʃaɐ̯.fəs ˈʔɛs], [ˈʃaː.fəs ˈʔɛs]), in English "sharp S", represents the [s] phoneme in Standard German, specifically when following long vowels and diphthongs, while ss is used after short vowels. The name Eszett represents the German pronunciation of the two letters S and Z.

It originates as the sz digraph as used in Old High German and Middle High German orthography, represented as a ligature of long s and tailed z in blackletter typography (ſʒ), which became conflated with the ligature for long s and round s (ſs) used in Roman type.

The grapheme has an intermediate position between letter and ligature. It behaves as a ligature in that it has no separate position in the alphabet. In alphabetical order it is treated as the equivalent of ⟨ss⟩ (not ⟨sz⟩). It also has no traditional capital form (although some type designers have introduced forms of "capital ß" de facto). It behaves like a letter in that its use is prescribed by orthographical rules and conveys phonological information (use of ß indicates that the preceding vowel is long).


This seems to be a poorly written article for various reasons:

> The grapheme has an intermediate position between letter and ligature.

It might have been a ligature in the past, but it is definitely not a ligature today. A ligature is a purely typographical device that is completely interchangeable with the non-ligature variant. This is not the case with ß.

> It behaves as a ligature in that it has no separate position in the alphabet.

Not having a separate position is not sufficient evidence that it is a ligature. ä, ö, ü also don't have separate positions when ordering stuff.

> It also has no traditional capital form (although some type designers have introduced forms of "capital ß" de facto).

It had the same traditional form for almost 130 years. Yes, there are a few stylistic variations on the traditional forms, as well as some early experiments trying something completely different which have never been widely adopted.

> It behaves like a letter [...]

It is a letter, just like w is a letter, not a vv ligature.


Most commenters seem to be under the impression that German orthography gained a new letter, or that uppercase sharp S will be the norm. This is not the case. From the original article:

> The change doesn’t mean that everyone now has to use a Capital Sharp S. The previous spelling of replacing ß with SS in uppercase texts remains the default for the time being.

The Council for German Orthography sanctioned the use of the uppercase ß. They just accepted the fact that it exists and is used. No more, no less.

That being said, I find it a bit ironic that at first Unicode had to bend over backwards to allow the strange surjective mapping that only German requires, only to later resolve the problem the easy way with the introduction of a new character. I mean, it's probably the right thing to do, because both things are used in German, but on the other hand it introduces a lot of complications just for one very specific special case.
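
To make the "surjective mapping" point concrete, here is a minimal sketch in Python (my choice of language, not something from the article). It assumes a Python 3 interpreter, whose built-in str.upper() and str.lower() apply the default Unicode case mappings; the example words are the Maße/Masse pair mentioned elsewhere in this thread.

```python
# Default Unicode case mapping: two different words collapse to one
# uppercase form, so the mapping cannot be undone.
words = ["Maße", "Masse"]  # "measures" vs. "mass"
uppercased = [w.upper() for w in words]
print(uppercased)  # ['MASSE', 'MASSE']

# Lowercasing the result cannot tell which original was meant.
print([u.lower() for u in uppercased])  # ['masse', 'masse']

# Text that was written with the capital sharp S (U+1E9E) round-trips.
print("MAẞE".lower())  # maße
```

So the new character only helps for text that actually contains it; the default uppercase of ß is still SS.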


It's a bit more important than that, since ẞ didn't "exist" in official use before (if you have a name with ß, the spelling on your ID card can now use ẞ instead of SS, where it's unclear whether ss or ß is meant; it also becomes valid in schools, ...), but yes, it's only an accepted variant now.


The article and comments are a bit misleading. The actual official document is a bit clearer:

> E3: Bei Schreibung mit Großbuchstaben schreibt man SS. Daneben ist auch die Verwendung des Großbuchstabens ẞ möglich. Beispiel: Straße – STRASSE – STRAẞE.

http://www.rechtschreibrat.com/DOX/rfdr_Regeln_2017.pdf (page numbered 29, page 27 of the PDF)

Roughly translated:

> When writing in uppercase one writes SS. Additionally the use of the uppercase ẞ is also possible. Example: Straße – STRASSE – STRAẞE.

In other words, it's an officially blessed variant spelling. The new standard actually defines several variant spellings for cases that have been debated since the last reform, e.g. Delfin ("new") vs Delphin ("old"). It's a lot more forgiving, standardizing a lot of existing usage rather than making up arbitrary rules.

It's far more descriptive than the previous spelling reform some fellow Germans still like to bitch and moan about. The reason the capital sharp S is singled out as being newsworthy is mostly that it's one of the few additions; almost everything else is just establishing widespread non-standard variants as permissible.


While the glyph is still missing from many fonts, Unicode 5.1 designated a code point for it:

U+1E9E ẞ LATIN CAPITAL LETTER SHARP S
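
If you want to check what your own runtime knows about this code point, something like the following should do. It is only a sketch, assuming Python 3 and its standard unicodedata module; the variable name is mine.

```python
import unicodedata

capital_sharp_s = "\u1E9E"

# Code point in the usual U+XXXX notation.
print(f"U+{ord(capital_sharp_s):04X}")  # U+1E9E

# Official character name and general category (Lu = uppercase letter).
print(unicodedata.name(capital_sharp_s))  # LATIN CAPITAL LETTER SHARP S
print(unicodedata.category(capital_sharp_s))  # Lu

# UTF-8 encoding, three bytes.
print(capital_sharp_s.encode("utf-8").hex())  # e1ba9e
```

Whether the glyph actually renders is a separate question that depends on your fonts, not on the Unicode database.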


I for one find this GROẞARTIG. Maybe not using umlauts or writing nouns in lower case is more efficient, but sometimes it feels good to have something like this.


Interestingly, this is the opposite of what happened in Spanish [1]

We had two extra letters, CH and LL that were just ligatures, so we dropped them and everyone is better off.

[1] http://www.rae.es/consultas/exclusion-de-ch-y-ll-del-abeceda...


As others have written already, ß is not simply a ligature. Since the orthography reform in the nineties it universally indicates that the preceding vowel is lengthened, so dropping it would lead to a large number of homographs. An example is Maße (measures) vs Masse (mass).


What a non-issue.


>The removal of the digraphs ch and ll from the inventory of letters of the alphabet does not in any way mean that they disappear from the graphic system of Spanish.

Important distinction to make. They are still part of written Spanish. To be completely honest I didn't even know they were part of the alphabet before and I wouldn't know how that affects everyday life.

The capital Eszett is a new letter that (almost literally) no one used before.


>and I wouldn't know how that affects everyday life.

It mostly affected lexicographic order. 'ch' was between 'c' and 'd'.

So for example, when 'ch' was a separate letter in the alphabet, 'chico' (little, boy) would come after 'cristal' (crystal) in the dictionary. Now it's the other way round.

Something similar happened with 'll'.
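
A rough sketch of the difference, in Python. This is not how any real collation library does it; TRADITIONAL, RANK and traditional_key are names I made up for illustration, and accented vowels are simply ranked last to keep it short.

```python
# Hypothetical sort key for the old Spanish order, where "ch" and "ll"
# counted as single letters placed after "c" and "l" respectively.
TRADITIONAL = [
    "a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k", "l", "ll",
    "m", "n", "ñ", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z",
]
RANK = {letter: i for i, letter in enumerate(TRADITIONAL)}


def traditional_key(word):
    """Turn a word into a list of letter ranks, treating ch/ll as one letter."""
    word = word.lower()
    key = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in ("ch", "ll"):
            key.append(RANK[pair])
            i += 2
        else:
            # Simplification: unknown characters (e.g. accented vowels) sort last.
            key.append(RANK.get(word[i], len(TRADITIONAL)))
            i += 1
    return key


words = ["cristal", "chico", "dedo", "luz", "llave"]
print(sorted(words))  # ['chico', 'cristal', 'dedo', 'llave', 'luz']
print(sorted(words, key=traditional_key))  # ['cristal', 'chico', 'dedo', 'luz', 'llave']
```

The first line is the current order (plain letter by letter), the second the old one with 'chico' after 'cristal' and 'llave' after 'luz'.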


In German, ßẞ is not just a ligature.

Since the recent orthographic reforms, ßẞ means a very specific sound that’s separate from s, and from ss.


> ßẞ means a very specific sound that’s separate from s, and from ss

Kinda. ⟨ss⟩ and ⟨ß⟩ are both /s/, ⟨s⟩ being /z/ for contrast, but they do affect how you read the preceding vowel.


Well damn. Now I have to learn myself a new alphabet song.


So when will we see it in string.toUpper()?


I don't like it, it looks too much like B. At least the lower-case version stands out among its lower-case peers, but the big one is the same size as B.


Were words using ß ever spelled with sz? The ß is pronounced as sz (in German) and I think it's a ligature of z and the old German script s, which is a tall line when found in the middle of a word. So again, were words like strasse spelled as strasze, hence the ß? And then later it became ss as the sharp s?

Edit: OK, never mind. Other comments and Wikipedia discuss this.


"straße".toUpperCase() is still "STRASSE", though.

Wondering how any current i18n system will get out of this mess ;)
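
For what it's worth, an application that wants the new variant can opt in itself. A minimal sketch in Python rather than JavaScript (str.upper() shows the same default behavior); upper_with_capital_eszett is a hypothetical helper of mine, not a standard library function.

```python
def upper_with_capital_eszett(text: str) -> str:
    """Uppercase text, mapping ß to ẞ (U+1E9E) instead of the default SS.

    Hypothetical helper: ß is swapped for ẞ before calling upper(), so the
    default ß -> SS expansion never fires.
    """
    return text.replace("ß", "\u1E9E").upper()


word = "Straße"
print(word.upper())  # STRASSE (default mapping, one character longer)
print(upper_with_capital_eszett(word))  # STRAẞE (same length as the input)
```

The default stays SS, which matches the official rule that SS remains the standard form and ẞ is the permitted variant.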


Hopefully this will motivate a libpango replacement written in Rust. :)


The Rust Evangelism Strikeforce is already gathering :)


So, I've read a few articles now but I'm left wondering: How do I enter the new character "ẞ" on a German keyboard? Pressing the key for "ß" + shift obviously doesn't work (it yields a question mark).

(I used cut + paste to get the ẞ into this post)


And how do I type thiß capital sharp S?

The Unicode input seems a little complicated :-/

CTRL + SHIFT + u1E9E = ẞ


On a Unix-like system with X11 and a compose key, I believe Compose + S + S should work (away from my computer, so can't test).


Do tell, which key is this "Compose"?


I have it set to Caps Lock. GNOME used to (and might still) have a setting for it in its keyboard preferences; I think KDE and Xfce do, too. Otherwise, you can add "setxkbmap -option compose:caps" to your ~/.xsession or ~/.xinitrc (whichever you're using) or to some other startup script.

Some old Unix workstations actually do have a physical "Compose" key on their keyboards. Alas, I do not have such a keyboard, but it ain't like I use Caps Lock all that much anyway.


Depending on your keyboard layout, the key might be the "AltGr" key, or have an Apple logo, an MS logo or a penguin on it.


Thanks for the hint. While neither of those keys seems to be the compose key on my Linux, SHIFT + ALTGR + S workẞ here.


YEẞ, it workß.


On Windows, altgr-shift-ß.


on linux: [AltGr]-[Shift]-[S]: ẞ


[compose]-S S also works


actually Capslock + ß, on Fedora at least


both work here (Ubuntu and Arch)


And it has been in Unicode since 2008. https://en.wikipedia.org/wiki/Capital_%E1%BA%9E


ParselsprachebenutzerInnen feiern! ẞssssßs sss!


So ß is finally out of β-test!


First of all, it's not a 'sharp' s, as one can see comparing 'Maße' and 'Masse'. I doubt this will be used in practice. Many people nowadays are confused by the latest orthography rules and even write 'Straße' wrongly with double-s.


Well, it is commonly called "scharfes S" (sharp S), at least in Austria -- and I'm pretty sure in Germany too. I certainly plan to use it in practice, and have already started telling my friends and relatives about it. The substitution with "SS" always rubbed me the wrong way, since it should be "SZ". The capital "ẞ" solves that problem for me, and I no longer have more letters in the uppercase version of certain words, which is nice.


Some street signs in the town where I live (Weingarten) are written with double-s. It is really confusing.


What a typographic disaster.

Making an uppercase version of a lowercase ligature was a bad idea from the get go.

I suggest that type designers who want to add this symbol draw an SZ or SS ligature.

Caveat: I studied typography in another life.


It's not a ligature. It's not even a variant anymore. Yes, in orthography it's defined in contrast to "ss" but it is not acceptable to substitute "ss" for "ß" in names. Mesner, Messner, Meßner and Meszner are different names.

Typography is not just about making pretty shapes that work nice alongside each other.


Again, an unelected body decides on my mother tongue (the orthography reform of 1996 was the same, but more catastrophic).

The Rechtschreibrat is tasked with codifying existing changes to the language, not inventing something new.

Capital ß does not exist and never did. Before a few years ago when Unicode accepted it and a handful of designers played with it, there was exactly one documented occurrence in hundreds of years of the language: the title page of one singular run of an orthography book in the GDR. The runs before and after that had SS.

I think there should have been an uprising (violent, if necessary) in 1996, but the ship has sailed. Very few people care about their language. Heck, most people get the simplest parts of orthography wrong and imitate English patterns (even the famous concatenative property is dying, people are just writing constituent words serially, with space in between).

Never forgive, never forget.


> I think there should have been an uprising (violent, if necessary) in 1996, but the ship has sailed.

Lol you want a popular revolution over the Rechtschreibreform?

Strong contender for First World Problem hall of fame here.


You're a prime example of what I've talked about: people without any love for or sense of their language (whichever one in your case).


Personal attacks are not allowed on Hacker News, regardless of how strongly you feel about orthography. Please don't post like this.

It's hard to know how to read your comment upthread but invoking violent protest and using political rhetoric like "never forgive" can't possibly have a salutary influence on an online discussion. When you're hot under the collar, please cool down before posting.


You say that as if it's an inherently bad thing. Different people have different passions, and assuming one's own interests are "more important" seems kind of ridiculous to me. Most people simply use their language to communicate, and nothing more.


Mother tongue isn't a hobby you choose. It's part of your identity and a core of your being.

That's why we consider the Turks eradicating the Kurdish language a human rights violation, but don't say the same thing when, say, a swimming pool closes.


> Mother tongue isn't a hobby you choose. It's part of your identity and a core of your being.

How much it is a part of one's identity and a "core of one's being" is up to every specific person to decide, not for you to impose.


Do I see a hint of a Reichsbürger in that rant?

If you don't like the official orthography, you're free to ignore it. Just don't complain if you get reprimanded when doing so in situations where standard orthography is required (e.g. written exams in school and university).


Please don't insinuate Nazism to score points in an argument on HN. We all know where that leads to and we don't wanna go there.


While some Reichsbürger share neonazi ideologies, the term actually refers to the German equivalent of the Sovereign citizen movement. But I admit that was a bit of a low blow even without insinuating neonazi leanings, I apologize.


Ah—guilty of my own charge I see. Thanks for responding so charitably! These subtleties escape me.


ASK HN: Why is this at the top of HN?


Because it was upvoted.

HN likes linguistic things, and typographical things, and Unicode things, and this is all three. And the weird .toUpper behavior required by ß->SS is an old example of string-handling pitfalls that non-German programmers probably remember (and obviously German ones know about the issue!)


Languages with priesthoods make me chuckle.


>Languages with priesthoods make me chuckle.

You'll likely be downvoted by the way you made your point, but there is a point to be made here. Native English-speakers, for instance, probably couldn't fathom there being an official body that makes decisions like this about the language. It just so happens that some of these official bodies make a point to stave off the influence of English on their respective languages (for better or worse - I won't argue either here).


Right, we don't have an academy of English. Instead we have several competing organizations publishing dictionaries and style guides to the same purpose.


Arguments against:

ß is a ligature, one of ss. ffl is a ligature, but that doesn't mean FFL should have one too.

Just get rid of it. Switzerland did and they're managing just fine.

Capitalisation in Unicode for German locales is just a headache, and this really doesn't help.

How would one enter it on a keyboard?

Argument for:

ss is pretty much the same shape as SS, and Dutch has ij and IJ.

It's just an optional ruling, anyone can do whatever they want and as with the new orthography, they did. Prescriptive linguistics is useless anyway, especially over the past two decades in German-speaking regions.


Spelling reform got rid of quite a lot of the use of ß in (German) Standard German, didn't it? As a German learner, I actually quite like the eszett, as the ß is a very obvious hint that the preceding vowel (if "monophthongic") is long.

> Capitalisation in Unicode for German locales is just a headache, and this really doesn't help.

Huh, if anything doesn't this make it easier? Replacing one code point (ß) with two (SS) when capitalizing seems like a bigger headache.


It started out as a ligature of ſ and ʒ, forms of s and z that don't exist anymore today in German. And it became a letter like æ did. Also this happened at a time where z was pronounced differently from how it is pronounced today.


> It started out as a ligature of ſ and ʒ

I know that, but again it's prescriptive vs. descriptive linguistics, and descriptive definitely says ss.


Ss is horrible because double consonants signify short preceding vowels in German. Yet ß is preceded by long vowels. If you want to replace it, then just with s.


I can't speak for other languages, but in English æ isn't really a letter in the modern tongue. It's just used when someone wants to be 'fancy'. You'll see athenaeum far more than you see athenæum, for example.


He might be referring to Danish, where æ is a standard letter used all over the language.


Actually it's a ligature of SZ (as even its name says, Eszett)


And it has a separate, and distinct sound from s or sz, and has a separate meaning for the pronunciation of the surrounding letters than ss.

For example, Maße and Masse are only possible to differentiate due to ß.



