Hacker News
Windows 11 calls a zip file a 'postcode file' in UK English (twitter.com/jymfox)
827 points by TonyTrapp on June 7, 2023 | 469 comments



I remember reading an interview with a British author, it might have been Neil Gaiman. He was just about to get his first book published in the US, and the American publisher contacted him and asked if it was OK if they changed a couple of British words to American, like "flat" to "apartment". Not wanting to risk the publishing deal, he said sure. A couple of months later he got the first edition of the US version back and found lines like:

"He looked out over the apartment landscape"

and

"'Come with me', he said apartmently"


- "Someone’s done a lot of find and replaces -- NEVER a good idea in galleys. Dave Langford put something in Ansible recently about how on the galleys of my novel Neverwhere someone Found-and-Replaced all the flats to apartments. People said things apartmently, and believed the world was apartment."

- "None of these were quite that bad – they were subtler..."

- "F’rinstance: All instances of the word round have become around. Fine for walking around the lake, less helpful for the around glasses, the around holes in the ice; blonde has uniformely become blond, and so blonder has become blondr; for ever has become, universally, forever, and for everything thus became foreverything, and we also got foreveryone, forevery time and so on. Each had to be found and caught."

- "Little things – the icelandic þú became , which won’t bother anyone who isn’t Icelandic. Blowjob had inexplicably become blow job again. (I think a blowjob is a unit of sexual currency, whereas a blow job is something you can get -- or indeed, give --instead of a wrist job, a sleeve job or a window job.) And once again every damn comma gets scrutinised. And I changed an Advertise to an advertize which was nice of me."

https://journal.neilgaiman.com/2001/03/ ("American Gods Blog, Post 24" [2001])


My dad had a work anecdote about a company invite to an "African-American tie event".

It was a British company.


My South African sister-in-law, who recently immigrated to America, said that her black peers in grade school would sometimes refer to themselves as African American.

Cultural hegemony complete.


It's not too surprising in America, but I know a person who would refer to any Black person as African American in Europe. In most cases, neither party would have set foot in America in their lives.


I think the parent is talking about people in South Africa referring to themselves as ‘African American’.


Yes, this was happening in classrooms in South Africa.


For good or bad, in many underdeveloped places where black people are rare, they are intuitively perceived as Americans (which stereotypically means rich, cool, free and capable of teaching you great English) and treated with additional respect because of this.


I had a South African friend in middle and high school. They fled the country when it became popular to put a tire around a white person's neck, fill it with gasoline, and light it. Mandela's wife promoted doing this.

They gained American citizenship. She became my African American friend. Everyone told me I was terrible for talking that way.


Heard my daughter using the n-word for her schoolmates (they aren't black) - they all listened to rap music, and obviously our schooling system doesn't cover American history i.r.o. slavery.

Had to explain to her why it would not be a good idea to do that in the US.

I never believed in the idea of embracing an offensive word and somehow "taking out the sting" like rappers did.

We have a similar word - the k word for blacks - it is fortunately banned, and if you use it on social media you clearly identify yourself as a racist.


I'm a little confused - are you talking about Black South Africans? Hmm.

I foolishly expected that people like Elon Musk could be truthfully called "African-American", but I did a survey of the definitions, and most all of them agreed that an African-American must be Black (and I'm not sure how that is recursively defined, or whether it has to do with race, ethnicity, or simply color of skin: it must be the latter if it has any meaning separate from "African"!)


Typically, African-American means black AND, unintuitively, American for generations IME. Maybe it shouldn't mean that but that's kind of assumed where I live at least. For first or second generation Americans you might use the country (eg Nigerian-American). For other races from Africa you'd also rarely hear them referred to as African American.


It's a bit weird from the outside tho cause if you're a white south-african and you hear the phrase used in America you'd be like "oh, I should refer to myself as an African-American" although unlike Americans, people generally only refer to themselves as being from the place once they have citizenship (and would therefore have learnt the intricacies of the American phrase) and even then many people choose not to.

American: "I'm Irish" Me: "No, you're not"


Of course, part of the issue here is that there isn't really a good shorthand term for "the group of American Black people of diverse African origin whose cultures were oppressed and destroyed by the white majority for centuries, leading to a mélange culture intentionally constructed out of the remaining scraps".


But "Caucasian" is (was?) used for people who for many generations weren't anywhere near Caucasus Mountains.


Oh God I hope people showed up in some outrageously... I don't even know what, but just some outrageous (not black bow) ties. That's brilliant.


I could see how some ties could upset everyone.


Alternatively, that's how you end up with an accidental cakewalk.


The Scunthorpe problem

In my experience Americans are shocked that it's perfectly fine to say that someone's black in the UK.


Those of us who follow usage guidelines more closely know that it's OK to use the 'b-word'. Case in point, https://apnews.com/article/race-and-ethnicity-us-news-busine...


It gets an even more politically incorrect meaning when you read "tie" as a verb.


> It gets an even more politically incorrect meaning when you read "tie" as a verb.

The change from "black tie" to "african-american tie" seems much less like a goofy 'find and replace' fail when you read it that way.


I once worked on a project where we had the concept of a "case" and we had to change it to a "task"..

Suddenly, we had a lot of "switch/task" statements that somewhat perplexed the compiler.
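
For what it's worth, a rename like this can be made somewhat safer by only touching identifiers and leaving language keywords alone. Here's a minimal Python sketch of that idea. The keyword set, the sample snippet, and the `rename_in_identifiers` helper are all made up for illustration, and it knows nothing about strings or comments.

```python
import re

# Tiny illustrative subset of C-style keywords (an assumption, not a full list).
KEYWORDS = {"switch", "case", "default", "break"}

# Matches one identifier-like token at a time.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def rename_in_identifiers(source, old, new):
    """Replace `old` with `new` inside identifiers, but never in bare keywords."""

    def fix(match):
        ident = match.group(0)
        if ident in KEYWORDS:
            return ident  # leave `case` the keyword alone
        # Handle both the lowercase and the Capitalized spelling.
        renamed = ident.replace(old, new)
        return renamed.replace(old.capitalize(), new.capitalize())

    return IDENT.sub(fix, source)


code = "switch (caseType) { case OPEN: closeCase(caseId); break; }"
print(code.replace("case", "task"))                # naive: breaks the keyword
print(rename_in_identifiers(code, "case", "task")) # keyword survives
```

It would still turn an identifier like `lowercase` into `lowertask`, so the diff needs a human read either way.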


An SEO bot stole my blog post about Old Street in London, it was now about Ancient Road instead.


You should check out the TV show Old Aliens.


Don't you mean Senior Citizen Foreigners?


Re>> "Someone’s done a lot of find and replaces -- NEVER a good idea in galleys.

Ah, the Scunthorpe problem. [0] It's never a good idea, but the immature 12 y/o in me always giggles at the examples.

One of my favorites is the word "ass" being replaced by "butt", resulting in this gem [1]...

"'Prohibition on buttbuttination' states, 'No employee of the United States government shall engage in, or conspire to engage in, political buttbuttination.'"

[0] https://en.wikipedia.org/wiki/Scunthorpe_problem#cite_note-b...

[1]


> In February 2004 in Scotland, Craig Cockburn reported that he was unable to use his surname ... In 2010, he had a similar problem registering on the BBC website

So the BBC was allowed to register their domain without issue, but Craig couldn't register his account with the BBC. That's Dick.


This was great, thank you.

One positive outcome though is the invention of a collection of excellent words: foreverything, foreveryone and forevery. I don't know what they mean, but I like them.


Some of those could be avoided by using a search regex such as "\bblonde\b". I believe a sophisticated enough regex could solve all of these, if it were a central library used and maintained by many.
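
To make the word-boundary point concrete, here is a small Python sketch comparing a plain substring replace with a `\b`-anchored one. The sample sentence and both helper names are invented for the example. It shows that the boundary stops "flatly" from being mangled, but does nothing about "flat" used as an adjective.

```python
import re


def naive_replace(text, old, new):
    """Plain substring replacement, the kind that produces 'apartmently'."""
    return text.replace(old, new)


def whole_word_replace(text, old, new):
    """Replace `old` only where it stands alone as a word."""
    pattern = r"\b" + re.escape(old) + r"\b"
    return re.sub(pattern, new, text)


sample = ("He looked out over the flat landscape from his flat. "
          "'Come with me', he said flatly.")

print(naive_replace(sample, "flat", "apartment"))
print(whole_word_replace(sample, "flat", "apartment"))
```

The second call leaves "flatly" intact, yet still yields "the apartment landscape", so a word boundary alone can't tell the dwelling from the adjective.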


Maybe people can just deal with books having regional differences in the same way authors do. Is it so scary for F. Scott Fitzgerald to write about the color of the light, but Rowling to talk about the colours of the school houses?


This. Stop dumbing things down (like my browser's dictionary that apparently doesn't know the word "dumbing"). Let the readers find and delight in the differences of diction and usage.


I think the sort of people who read books can probably cope. It sounds like the publishers are making allowances for the sort of people who don’t do much reading anyway.


Uhh... People who count listening to the audio book as "reading."


Well people listening to the audio book:

- can’t hear the differences in spelling between American and British English

- are listening to someone read. That person probably is able to read either American or British English.

That said, some words really are different and could cause confusion/disconnect so probably are potential candidates for being changed even in audio books.

- the one he gave “apartment” vs “flat” is a good example. I think British people would all be fine with “apartment” whereas I think “flat” might confuse some Americans who hadn’t heard that usage.

- “pavement” means the road surface in the US vs it means the pedestrian path on the side of the road in the UK (that in the US you would call a “sidewalk”). “Sidewalk” wouldn’t confuse a British person but it’s not a word a British person would use. “Pavement” meaning the road surface does confuse British people.

- “jumper” in the UK vs “sweater” in the US

- Don’t even ask about “pants”

- etc


I'm not having that though - it's all one way in favour of keeping Americanisms because we're familiar enough with them to understand them. No thanks.

On the flipside, I was delighted to hear stories of US kids confusing their parents with British English because they'd watched so much Peppa Pig during lockdown. Very good.


> might confuse some Americans who hadn’t heard that usage.

Imagine learning something about the world after reading (or hearing) a book!


I basically agree and wouldn't want things to be changed in general. That being said, the choice of words in a novel should generally be governed by the artistic intent of the author rather than the desire to teach people stuff.

But here's an example that's worth thinking through. This movie https://www.imdb.com/title/tt0110428/?ref_=fn_al_tt_1 was called "The Madness of King George". It's based on a play called "The Madness of George III". They changed the name when producing the movie because of the difference between Britain and the US.

In Britain pretty much everyone would know that George III refers to King George III. However, when they first started discussing this project in the US, a common reaction was "I haven't seen 'The Madness of George' 1 and 2". Now the difference between the titles wasn't that important, and clearly this was just an impediment to understanding with no benefit.

Now you could say "Well the people could go on wikipedia and find out that there was a King George and a King George II and then this guy, George III" and that's true but the problem is negative self-selection. People who don't know that don't realise that's what's at play here and so won't do that search so nobody learns anything.


It's not particularly reassuring that "George III" (spoken aloud as "George the Third" ), and "George 3" (spoken aloud as "George Three" ) are apparently thought to be interchangeable in the US.


This! One of the things that used to happen is you'd pick up a book, and encounter a word or phrase you don't know or understand, and you'd go look it up. This is easier than ever with the internet.

Don't know why someone would "rent a flat?" You go look the phrase up and discover that it's British usage. Confusion over... because you learned. Reading Moby Dick and don't know what "scraggy scoria" is? You look the words up and learn they're the perfect fit for the landscape Melville is painting.

https://webstersdictionary1828.com/Dictionary/scraggy

https://webstersdictionary1828.com/Dictionary/scoria


My favourite anecdote is when an english bloke goes to the U.S. and asks if he can "bum a fag" from someone, meaning "ask for a free cigarette" in British.


At least it is the full book, unlike movie adaptations, also a nice way to "read" while driving.


That example isn't scary, of course. The -or/-our spelling difference is only the simplest example of what's different between American and British English. I think it's pretty easy for someone who's never even encountered the other spelling to infer what the word means.

But there are difficult examples, even in Rowling. She uses "revision" in a way that American readers don't know ("studying") and might have a hard time piecing together from context. It's not like the reader will be prompted to open a dictionary, since "revision" is a common word in AE. It just makes for a poor reader experience.


That's the most ridiculous thing I've come across in a while. It absolutely doesn't make for a poor reader experience, and that's some US Defaultism if I've ever heard it. And I'm neither British nor American! US writing is basically never localised to Australian English, why in the world must the converse be true?


If this is what the level of education has become in the US, well, I guess it kind of explains a lot


Doesn't quite have the same ring to it:

> Righto, mate, gimme a shout as Ishmael. Few donkey's years ago—don't get your knickers in a twist 'bout when exactly—findin' me wallet as dry as a dead dingo's donger, and not a bloody thing worth a squiz on the land, reckon I'd chuck a U-ey on life, put a bit of water under me bridge. Wanted a stickybeak at the wet half of this great wide world, didn't I?

(With ChatGPT assist!)


Donkeys years ago means a long time. You don't prefix it with a few.


Do you imagine people without any dialect of English as a native language read books and articles on the internet with a dictionary at hand or do you think they somehow cope with the unknown and manage with context alone?


As someone who reads fantasy fiction a lot, honestly I'd prefer if I read everything in American English. The biggest thing that gets me is e.g. "defence" instead of "defense" and other places where British English uses a 'c' instead of an 's'. It seriously diminishes the immersion for me, even if I've read probably 50+ books with the version I'm not used to by now.

It's never stopped me from reading a book before, but it does diminish my enjoyment quite a bit. For nonfiction, I don't care, I'm reading actively regardless. But when reading passively for enjoyment, it's noticeable and irritating.


For fantaſy, eſpecially, I diſagree. When I'm in charge, the long-ſ will be mandatory in any ſtory involving magick, to better ſhew the hiſtorical baſis of the mythology (unleſs, of courſe, a ſkinwalker were to be involved, in which caſe American ſpellings are indeed preferred)


Фор бетер іммерсіон, ол тектс ін сторіес И ред шѵд бе урітен уіþ олд сирілік летерс, регардлес ов þе лангѵадж.


ᚳᚫᚾ᛫ᛁ᛫ᚫᛞᛞ᛫ᚫ᛫ᛒᛁᛏ᛫ᚩᚠ᛫ᚱᚢᚾᛖᛋ᛫ᛏᚩ᛫ᛏᚻᛖ᛫ᛗᛁᛉ?


פור טקסטס סט ין אנשיאנט טיימס, אונלי עברית וילל דו.


Incredible. I sent that to Google translate and it translated fairly well.

(For those who can't read it - the text is English transliterated into Hebrew script, not actually Hebrew. I'm impressed that Google managed to make sense of it)


> уіþ

Was the thorn really a part of old Cyrillic? Would make some sense given the shared history, but still.

> И

I struggled for a bit with this one, but I don't know of a better way to represent it other than "ai" in Cyrillic I guess.

The "ѵ" is interesting too, never seen that before since you can make do with the sorta B-looking letter instead.


There're several mistakes:

- ѳ could be used instead of þ;

- с in тектс is missing; it should be текстс or, better, теѯтс;

- ѵ is basically Greek upsilon, which appeared in English mostly as u [auto] or y [system]; therefore шѵд would be better written as шоѵд, where /oѵ/ in old Cyrillic is pronounced /u/ as in Greek, or just шꙋд;

- дж in лангѵадж could be written as џ;

And so on.


Thanks, these are good. I took inspiration from this horrible PDF: https://drive.google.com/file/d/14YgpdJ-k5M-GiwRDblvnEtsFvok...


> Was the thorn really a part of old Cyrillic?

No, never was a thing. Early Cyrillic had a bunch of unique letters that are long gone, but it never had "þ" in there.

> The "ѵ" is interesting too

It's from Greek "Y" (upsilon) and it - depending on the place - could've meant either /i/ or /v/ sound (and /u/ when in "оѵ" digraph, so parent comment has it wrong): https://en.wikipedia.org/wiki/Izhitsa


I think you’re on to something


𐡅𐡉 𐡍𐡕 𐡀𐡓𐡌𐡉𐡊


I have to say that the long-s looks quite horrible in Verdana, and probably in most other sans-serif fonts. That makes sense, since most sans-serif typefaces significantly postdate the abandonment of the long-s.

If you want to see how it should look in context, here's a typical 17th century example: https://www.raptisrarebooks.com/images/86066/paradise-lost-a...

The prints from the 18th century are even nicer with better quality and consistency in general. The Caslon typefaces are pretty exemplary here: https://upload.wikimedia.org/wikipedia/commons/4/45/A_Specim...

Roman typefaces with a long-s are a very 17th-18th century thing. A blackletter (gothic) typeface gets you closer, but all of that is still not very medieval. What you really want is a meticulously written manuscript in Carolingian minuscule[1] or Uncial script[2], complete with killer rabbits[3] and knights fighting snails[4].

[1] https://en.wikipedia.org/wiki/Carolingian_minuscule

[2] https://en.wikipedia.org/wiki/Uncial_script

[3] https://blogs.bl.uk/digitisedmanuscripts/2021/06/killer-rabb...

[4] https://blogs.bl.uk/digitisedmanuscripts/2013/09/knight-v-sn...


Sadly I still read the s as f so I keep getting stumped. "fantafy?"


Reminds me of a Just William story where he copied a letter from a letter-writing guide book, studiously converting all the "f"s into "s"s.


I read someone saying that reading text with the long s is like hearing someone speak with a lisp, and that is SO true. I love it.


> seriously diminishes the immersion for me

Upvoting because this is seriously interesting. I can see old or Tolkien English being a hurdle, so I relate to what you’re saying. But small changes in diction tend to draw me into the world, by othering it from the familiar.


I think it might be partially because I've done & currently do some copyediting work. So I'm pretty trained to see any grammar or spelling mistake. To be clear, it's not the difference in language that bothers me, just words that are spelled differently from American English. I literally can't not see them.


...what? Why would this make a difference?


I've found this jarring only in one case: the dub of Mary and the Witch's Flower, where the setting was an emphatically English-style public school, the lead character's VA had (or was using) a strong Home Counties accent, but the lines were full of US-only terms like "Somophore". It just felt horribly fake (even in a fantasy story!) because you'd never see those things together in real life.


> but the lines were full of US-only terms like "Somophore"

“Sophomore”, maybe?


Oh probably, I don't really speak American and it was a while ago.


Somophore is the same thing but they carry flags with them everywhere.


A differense, please, GP's reading.


For me, it's reading words like 'centre' and 'colour' or 'practise' because I can't help but read them as 'sen-trey', 'col-ore' and 'prac-tize'


People knowing enough about regex replace to use word boundaries are most likely aware just how dangerous bulk replacing is, and will either be super careful, or just won't do it like that.


Troublingly, "blonde" isn't even a British vs. American English thing -- it's a gendered descriptor, of which we have relatively few remaining in English. Blonde is the feminine form, blond is the masculine/neuter form. Whether you want to preserve gendered adjectives is a different question entirely, but the definitions are rather clear, having carried over from French.


Surely, an LLM could handle this, if it were to get fed the entire book in chunks. "Change all mentions of the word 'flat' in its connotation as the British word for 'apartment', to say 'apartment'. Make no other changes to the manuscript." Then have it double-check its work afterwards. "Does every instance of the word 'apartment' in this text make sense? Does every instance of the word 'flat' in this text make sense in American English?"

and then run this prompt on each pair of words that you want it to do. When it's done, run a quick diffcheck on it & check its work to learn of any gotchas.


Surely the effort to manage LLMs is wasted on keeping a publisher from having to do the job they’re paid for.

Are we really going to let big corp automate away their role in the social contract? Ostensibly Random House needs to provide real output not just be a brand that farms out all the work to AI, and thus captures value on the fiat ledger.

The number of rent-seeking non-contributors is too damn high. We cannot base society on dying people's hallucinations about how the world works anymore. This is bonkers.

I never asked to exist. Why am I constrained to beliefs like Bill Gates and the like are divine in contemporary ways?


Why the hell does it have to be AI? A regex would be faster to run, test, and diff check. Even better, instead of paying a programmer to waste your time, just pay an actual editor to proofread it.


It doesn't. It perhaps shouldn't.

That said, I found myself doing little text mangling tasks with GPT-4 instead of appropriate command line tools, because fuck if I remember the flags to sort and unique, and it's much faster to just describe what I want and have the LLM take a crack at it.

This is why, I think, you'll see a lot of simple(ish) tasks being done by LLMs, at a couple of orders of magnitude more compute cost - it's just that much more convenient.

EDIT: nevermind. That was about just doing regex replace.

But if you want to do it correctly in an automated way, LLMs are indeed the best tool for the job in the general case, because they understand how language works, in a way that you can't really formalize in code. No, feeding the complete definition of English grammar to the computer won't help, because a) it probably doesn't exist, and b) even if it does, it's merely a suggestion - natural language is a living thing, and is not bound by fixed rulesets.


Typing out the instructions, copying and pasting the data, then analyzing the output for correctness is faster than running `sort -h` ?


It can be less mental effort, not necessarily faster.


Faster than recalling or finding out again that it's `sort -h`, and not `sort -n` or `xargs -wtf -- sort -xkcd {}` or whatever.


But the LLM already "understands" the changes that need to be made. A regex to figure out if "flat" means "apartment" or not? Consider the text (which I just made up & I'm really not a fiction writer so I apologize for it being terrible):

"Oh, I live near there too! House or flat?"

Stanley hesitated. These questions were getting more and more personal. Was this her idea of casual conversation? Or was she trying to get to know him personally? Well, he thought, what could go wrong if I treat this like a conversation. "Flat," he said. And then asked a question of his own. "What kind of pop do you like?"

How are you going to decide if "flat" is talking about non-fizzy pop/soda, or a dwelling space with a regex? It's a lot closer to the word "pop."

Of course, GPT handles this perfectly https://chat.openai.com/share/fb6a56a2-749c-4e71-9506-541bb8...


Some places in America do say "pop" instead of "soda," so I'm hesitant to call it perfect...

But I'll admit I agree that this is well beyond the scope of regex.


Because replacing all appearances of apartment with apartment just apartment out isn’t going to work?


If you're going to use an ML model, you might as well use an existing language translation model rather than asking ChatGPT or the like to convert words one by one. You'd probably get better results treating American English and British English as entirely different languages rather than assuming one is the same as the other but with a handful of different words.


instead of feeding it into a black box and hoping it does the right thing, how about just paying someone to manually find each instance and only replace it if appropriate. it's tedious and grueling but assuming every other word isn't "flat" should only take a few hours.
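
A manual pass like that is easier with a small keyword-in-context listing: show the editor every hit with some surrounding text, and only rewrite the ones they approve. Below is a rough Python sketch under that assumption. The function names, the context width, and the sample sentence are invented here, and the capitalization handling is deliberately simplistic.

```python
import re


def _pattern(word):
    # Whole-word, case-insensitive match for the term under review.
    return re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)


def keyword_in_context(text, word, width=30):
    """Yield (offset, snippet) for each whole-word occurrence of `word`."""
    for match in _pattern(word).finditer(text):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        yield match.start(), text[start:end].replace("\n", " ")


def apply_approved(text, word, new, approved_offsets):
    """Replace only the occurrences whose offsets a human approved."""

    def swap(match):
        if match.start() not in approved_offsets:
            return match.group(0)
        # Keep a leading capital if the original had one.
        return new.capitalize() if match.group(0)[0].isupper() else new

    return _pattern(word).sub(swap, text)


text = "The flat tyre was outside his flat."
for offset, snippet in keyword_in_context(text, "flat"):
    print(offset, repr(snippet))
```

The editor reads the listing, notes which offsets really mean "apartment", and passes just those to `apply_approved`; everything else is left untouched.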


You must be American if you think it's that easy. I think Americans don't realise how Brits end up learning both our versions of words and theirs too, because so much TV is American.


It's a shame that part-of-speech analysis tools aren't more widely available or widely used. That could at least reduce the damage done on "flat" -> "apartment", even when only considering entire words.


This is not going to save you completely unfortunately. You still need to know if a flat is a flat tyre, a flatbed, or an apartment.


Douglas Adams also battled with this:

> One general point. A thing I have had said to me over and over again whenever I’ve done public appearances and readings and so on in the States is this: Please don’t let anyone Americanise it! We like it the way it is!

> There are some changes in the script that simply don’t make sense. Arthur Dent is English, the setting is England, and has been in every single manifestation of The Hitchhiker's Guide to the Galaxy ever. The ‘Horse and Groom’ pub that Arthur and Ford go to is an English pub, the ‘pounds’ they pay with are English (but make it twenty pounds rather than five – inflation). So why suddenly ‘Newark’ instead of ‘Rickmansworth’? And ‘Bloomingdales’ instead of ‘Marks & Spencer’? The fact that Rickmansworth is not within the continental United States doesn’t mean that it doesn’t exist! American audiences do not need to feel disturbed by the notion that places do exist outside the US or that people might suddenly refer to them in works of fiction.

https://news.lettersofnote.com/p/please-dont-let-anyone-amer...


This hampered me when playing the HHGTTG text adventure as a youth. My memory is fuzzy on the specifics, but one of the early things needed is to relieve a headache or similar, and a search of Arthur's home would reveal an "analgesic" eventually. Unfortunately you wouldn't be able to do something else until you consumed the analgesic, and eventually you'd lose the game because the home was demolished while you're in it.

I did not know at the time what an analgesic was, despite having a fairly broad vocabulary. If anything, I assumed it was some dirty adult toy; certainly not a thing to be consumed as a remedy for pain.

It was frustrating because I searched high and low for aspirin, Tylenol, pain-killer, etc. but never got past that section until I had a chance to search the Internet for a solution - many years after I really cared to play a text adventure again.


The word you really needed to know was dictionary, not analgesic ;-)


Analgesic isn't a British word, though. It's used plenty in American medicine.


No, Americans only know brand names because #capitalism.


...that's why everyone in the UK has a Hoover. The most common term I've heard for {acetaminophen, ibuprofen, etc} in the US is "pain killer".


So aggressive. In Dutch it's a pijnstiller, or "pain silencer".



Sometimes it turns out great, though. Wasn't there a scene with an award for "best use of the word 'fuck'" that was changed to "best use of the word 'Belgium'"? That was an improvement.


Yes! It was the “Rory Award” from Life, the Universe and Everything:

https://hitchhikers.fandom.com/wiki/Rory_Award

Definitely funnier as “Belgium”.


"So why suddenly ‘Newark’ instead of ‘Rickmansworth’? And ‘Bloomingdales’ instead of ‘Marks & Spencer’?"

Newark (on Trent) is in Notts. Rickmansworth is pronounced "Rick-uth" (not really but it should be) and Bloomingdales would be pronounced something like "Blimmin'dolls", if it ever showed itself over here.

The Adams quotes are from some time ago and we have all passed a lot of water since then. The entirety of en_* are now routinely bombarded with everyone else's TV/stream output in various guises. I might be a Brit but I am now (not really) intimately familiar with Australian and New Zealand police and "Border Force" etiquette and more. I can even hold my own when confronted with a particularly tricky issue in say Nunavut with the RCMPs. I am of course an expert in Texan rangering thanks to a bloke called Walker.

You say tomato and I say fuck that: viva la difference!


> The Adams quotes are from some time ago

I would be surprised if they were recent, but I wouldn't put it past him, either.


There's a Newark in England.


There's an American sprinter named Tyson Gay. Or, as some conservative media referred to him, Tyson Homosexual. (https://languagelog.ldc.upenn.edu/nll/?p=294)


We'll all be homosexual when Johnny comes marching home.


The best part of this link was learning about Reuters saying Queen Elizabeth lays 2000 eggs per day.


I missed that!


My favorite bad translation was a Philip Jose Farmer novel where all measurements were given in Imperial and Metric units. As in:

"It was about 1000 feet, or 304.88 meters". "He was 6 feet tall, or 182.88" meters."

Every single measurement, converted with two or three decimal places. It got old fast.
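
The underlying issue is false precision: "about 1000 feet" carries roughly one significant figure, so the converted value should too. A minimal Python sketch of rounding a conversion to a chosen number of significant figures follows. The helper names are made up, and picking the number of figures is left to the caller as an assumption. For reference, a foot is exactly 0.3048 m, so 1000 ft is 304.8 m and 6 ft is 1.8288 m.

```python
import math

FEET_TO_METRES = 0.3048  # exact by definition of the international foot


def round_sig(value, sig):
    """Round `value` to `sig` significant figures."""
    if value == 0:
        return 0.0
    # Position of the leading digit decides how many decimals to keep.
    digits = sig - int(math.floor(math.log10(abs(value)))) - 1
    return round(value, digits)


def feet_to_metres(feet, sig=2):
    """Convert feet to metres, keeping only `sig` significant figures."""
    return round_sig(feet * FEET_TO_METRES, sig)


print(feet_to_metres(1000, sig=1))  # "about 1000 feet" as a rough figure
print(feet_to_metres(6, sig=2))     # a person's height
```

With those settings the two examples come out as 300 m and 1.8 m, which reads like something a narrator might actually say.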


I'll see you and raise by some movie or other with Walter Matthau in it which I saw on the Hallmark Channel a few years ago - in a scene, he asks a kid if he knows Lincoln's Gettysburg Address.

The Norwegian subtitle? 'Do you know where in Gettysburg Lincoln lived?'

Oh, and of course, from Star Wars - 'Luke, this is your father's light sabre' was translated as 'Luke, this is your father's lightweight sabre'


> Star Wars

Putting this here for anyone who needs a laugh:

Star War The Third Gathers: Backstroke of the West: https://www.youtube.com/watch?v=9DI5WyiHQno

(Someone took a copy of Revenge of the Sith that had been translated from English to Chinese and back, and then re-recorded that dialogue and put it back over the original footage.)


And here's the entire 2h20 of this mistranslated movie: https://youtu.be/XziLNeFm1ok

(Baader-Meinhof is a funny thing. I have learned of this movie less than 12 hours ago and here it is again)


Thanks, I just watched a fragment of that video.

I'm amazed by how much more realistic the space fights looked in the first three Star Wars movies compared to the CG stuff.


I think it's worth citing the original source for this!

http://winterson.com/2005/06/episode-iii-backstroke-of-west....

It was originally bought from a DVD street vendor in Shanghai. And it's the source of "do not want" as a meme!


This is brilliant; I didn't know there was a well-known source for this. Thank you!


In the Finnish subs for The Royal Tenenbaums DVD, a character who says in English “There’s a dent in the car. There’s one here, too” gets translated as “There’s a dentist in that car. There is a dentist, too.”

A few years after I saw this, I entered the film translation business myself. Generally for anything Hollywood or otherwise big-budget, you can watch a copy of the whole film yourself to understand the context, and you can bill the client for the time spent doing that. Therefore, I tend to suspect that such cases of mistranslation are laziness or a company with an incompetent workflow.


Another classic is a Simpsons episode where Homer shouts „Isotopes rules!“ which in the German dub turned into „Isotopenspielregeln!“ (rules of the game Isotopes)


Very avant-garde. I like it. It must be possible to subtly translate an entire episode like that to make it be about something else entirely.


Translations of foreign media that make the story about something else entirely are a well-established genre. For example, one of Woody Allen’s earliest films.[0] Granted, this wasn’t done “subtly” at all.

[0] https://en.wikipedia.org/wiki/What%27s_Up,_Tiger_Lily%3F


I've often seen TV series translations (both as subtitles and as scripts for later dubbing) done by handing over the text to be translated, without any access to the show itself. So the translator has zero context for what is on the screen. For things similar to that Star Wars scene, the translator would have no way of telling whether that "light sabre" is bright or lightweight.


I mean... The first one wouldn't be confusing, if it was plainly called Lincoln's Gettysburg Speech.

So a Norwegian interpreter - who has no obligation to know of Lincoln whatsoever - can read it and reasonably interpret it as a request for an address in Gettysburg, for Mr Lincoln.


> who has no obligation to know of Lincoln whatsoever

Part of being a decent translator is having a knowledge of the common cultural references of the source-language’s country/countries. Films very frequently play on local history or previous films or literature, and you are expected to be able to deal with that.


I suspect it was an early attempt at machine translation, or perhaps that the translators are paid so badly there is no incentive to pause even for a moment to evaluate if the translation makes sense in context.


The problem is what you don't know you don't know. If your understanding is lacking, how do you know you're not watching a movie veering into the absurd?


The first one wouldn't be confusing, if it was plainly called Lincoln's Gettysburg Speech.

Do you seriously not know that a speech is an "address?"


The translator apparently did not, and it would probably have been fine if it was actually called Speech.

that was the entire point of the comment


The weirdest number translation I saw was on a package of spaghetti. The cooking time was 8-10 minutes in English, and 10-12 minutes in Spanish. Note that they were both in Arabic numerals, not spelled out. Why do I have to cook my spaghetti longer if I speak Spanish??


I've got a pack of rice that says, in Danish, to put in X amount of water and cook until dry; in Swedish it says to put in 2*X amount of water, cook Y minutes and strain the rice. Double confusion: why different instructions, and who strains water from the rice?


It's an official recommendation in Sweden due to arsenic content of rice.

https://www.livsmedelsverket.se/en/food-and-content/oonskade...


Thanks for the revelation of the mystery


For food to be kept in the refrigerator, the Danish, Swedish, and Finnish instructions differ by 1 degree centigrade each. I don't remember the values off the top of my head, something like 8, 7, 6 respectively.


To this day, after living in the US for 8 years, I have to stop and think if I need to translate minutes and seconds to American.


There's a handy rule of thumb for this: is the number of seconds per minute (60 aka 2*2*3*5) a weird, random-sounding number that's way too easy to make clean subdivisions out of[0], rather than a nice, math-hostile power of ten? Then it's probably already in American units, unless it's British.

0: https://en.wikipedia.org/wiki/Highly_composite_number


Robespierre wanted to saddle us with metric time too, you know.


The 60 is from Sumer.


It's in the context of:

> I have to stop and think if I need to translate [] to American.

and I assume (reasonably though not certainly) that the commenter is not Sumerian.


After careful double-blind studies, it was determined that if you directed impatient Spaniards to cook for 10-12 minutes, they cooked it for 8-10 minutes?

This translation really is weird! :-D


Spanish speaker has different al dente preference?


We do. We tend to overcook pasta.


I'm surprised to learn that the English don't.


Because spanish people like their pasta mushy and overcooked (true story)


Many cities in Mexico are at relatively high altitude (but probably not high enough to make a difference...).


Probably a cut and paste job from another style of pasta they offer, and the intern was asked to just change the pasta name and cooking times...


The English speakers will overcook it anyway.


Atmospheric pressure? Water hardness?

Go figure.


I remember seeing a snippet of a text talking about global warming that said something like "a rise of 2C(35.6F)"
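
For anyone wondering why that figure is off: 35.6F is what you get by converting a thermometer reading of 2C, not a change of 2C. A minimal Python sketch of the two conversions (the function names are my own, purely illustrative):

```python
def celsius_to_fahrenheit(reading_c: float) -> float:
    """Convert an absolute reading; the two scales have different zero points."""
    return reading_c * 9 / 5 + 32


def celsius_change_to_fahrenheit(change_c: float) -> float:
    """Convert a temperature difference; the +32 offset cancels out."""
    return change_c * 9 / 5


if __name__ == "__main__":
    # What the quoted text did: treat "a rise of 2C" as a reading of 2 C.
    print(f"reading of 2 C = {celsius_to_fahrenheit(2):.1f} F")
    # What a rise of 2 C corresponds to.
    print(f"rise of 2 C    = {celsius_change_to_fahrenheit(2):.1f} F")
```

So a rise of 2C is a rise of 3.6F; the offset only applies when converting a reading.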


Exactly 1,000 feet is 304.8 meters, but "or about 300 meters" would have sufficed given the context. "About 1000 feet" already implies not being exactly 1000 feet.

Of course, exactly 6 ft is 182.88 cm, but the precision is unnecessary there too. 183 cm would do if it was right on the 6 ft mark.


The precision might be overkill, but providing SI measurements alongside the obsolete units (which sound better to the North American ear) is quite nice. Most of the weird units offer no straightforward or memorable ratio to multiply by to get a regular one, so people outside the bubble cannot translate them into anything meaningful.


you have a small typo which is a bit humorous in this context: 182.88" meters (or about 183 inches-meters?)


That, and 6 feet is closer to 2 meters than 200.


I'd bet it was centimeters, which is the way to give height in most countries using metric.


My dad wrote the gardening column in the local newspaper, and a yearly gardening book. When NZ went metric they updated the book, replacing things like "plant seeds an inch apart" with "plant seeds 2.54cm apart" .....


Maybe this actually is the best spacing between seeds :)


Dad just thought it was stupid


There are several browser extensions which will do "metrication" for you. I leave one on just for the occasional amusement (although I'm actually more comfortable with US units), and it recently did that to the title of this article: https://news.ycombinator.com/item?id=35539595


In Windows 10, the emoji keyboard shortcut was translated into Polish as [Win+Okres], which pretty much means [Win+menstruation]


182 meters tall? Dayum.


with two digits precision even? that's bad ...


To add to the many find and replace issues that have shown up, at one point TSR's style guide for D&D material said that "wizard" had to be used to the exclusion of "mage". So when proofs of a book arrived where the authors had used "mage" they did a find and replace job.

Resulting in such lines as "The tower can take up to 200 points of dawizard before it is destroyed. Dawizard sustained is cumulative." and references to a spell called "Silent Iwizard".


There once was a company in Russia that tried to make their own Wikipedia. Of course, they didn't have the resources to build their own content, so in their world, Europe in the Middle Ages was terrorized by mighty Nordic warriors called encyclongs (w and v are the same letter in Russian).


Please help me understand this! Encyclongs doesn’t compute for me and also does not contain any w or v.


If you replace (in Cyrillic script) viki[-pedia] with encyclo[-pedia], then vikings become encyclongs. You can't do a whole-word replace, because in inflectional languages like Russian every word has many forms with different endings, so you must replace the beginnings of words to catch all the inflected forms.


Vikings == wikings (in Russian spelling)

> echo "wikings" | sed "s;wiki;encyclo;" # which is what enci.ru did

"encyclongs"


This happened with D&D. Someone replaced all instances of "mage" with "wizard", and as a result, you would incur "dawizard". [1]

[1] - https://www.reddit.com/r/DnD/comments/s82mi4/til_that_in_199...


Is it so hard to only replace full word matches?
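
For what it's worth, a minimal Python sketch of the difference, using the D&D sentence style quoted in this thread. The helper names are made up for illustration, and this is just one way to do it with the standard `re` module:

```python
import re


def naive_replace(text: str, old: str, new: str) -> str:
    """Blind substring replacement, as in the anecdotes above."""
    return text.replace(old, new)


def whole_word_replace(text: str, old: str, new: str) -> str:
    """Replace `old` only where it stands alone between regex word boundaries."""
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    # A function as the replacement keeps any backslashes in `new` literal.
    return pattern.sub(lambda _match: new, text)


if __name__ == "__main__":
    sample = "The mage takes 200 points of damage."
    print(naive_replace(sample, "mage", "wizard"))
    # The wizard takes 200 points of dawizard.
    print(whole_word_replace(sample, "mage", "wizard"))
    # The wizard takes 200 points of damage.
```

Word boundaries stop "dawizard", but they also skip inflected forms such as "mages", and they can't tell a flat you live in from a flat landscape, so a human pass is still needed.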


Also dawizard and wizardnta (when a tabletop RPG company tried changing "mage" to "wizard"), "amDanielan" (when an Eric was renamed to a Daniel) and a company that was "formerly in the red, but now in the African american."


Clbuttic mistake!


I feel like folks in editing or similar roles put their "problem detection" hat on too tightly.

I get lots of advice like that regarding code and UI and the suggestions about perspective problems are often absurd. Nobody has been confused by the thing yet and they're concerned that someone "might" be confused and not able to figure it out themselves so they change some words or UI and ... I kid you not more often than not the solution is the thing that trips up users.

In the above example, maybe if the reader doesn't know what a "flat" is, maybe they'll just look it up or understand by context and they'll be ok?


I don't think they're concerned so much by the people who don't know what a flat is, but by the people who do. There's more than a handful of bookreaders out there who are very protective of any difference between their national form of English and some other national form of English, who will get upset if a local publisher uses the foreign form.


Seems really strange to me not to defer to whatever the author wants.

Having said that, I don’t disagree with what you’re saying.


Why would such people read things written by a filthy foreigner?


That's exceptionally lazy. When coding, unless the false positive rate is exceptionally low, I just find (without replace) and go through them by hand. How many "flat"s can there be in 1 book? C'mon. You might have to read 20 sentences, oh no.


When I'm feeling down, reading about stuff like this always gives me an elevator


I developed KeenWrite[1] to make using variables in documents trivial[2]. My editor has no search and replace function. For my sci-fi novel, there's a variable {{location.protagonist.tertiary.Type}} that has the value "Bavarian Village". I could change this to "apartment" and every instance throughout the document prose and reference diagrams would update automatically and contextually. My typical example for explaining the use case is changing the name "May" to "June", but flat/apartment is hilarious.

[1]: https://github.com/DaveJarvis/keenwrite

[2]: https://youtu.be/CFCqe3A5dFg?list=PLB-WIt1cZYLm1MMx2FBG9KWzP...


That definitely must've been early. I would've thought Gaiman of all people knows how easily English humour can get lost in translation. Often even when not changing the text


the irony being that many readers probably attributed this to neil's sense of humor. i would have. "he said apartmently? what? aaah ... now i get it - haha - good one!"


ChatGPT, GPT4, 5, 6, ... :

Change all occurrences of flat to apartment, intelligently!

Got you!

;-)


As a bonus it will even insert random facts about current political candidates!


Ha ha, good one. In fact, (shhh ...), that is the main purpose of all AI. ;-)


"He looked out over the flat landscape" -> "He looked out over the landscape, intelligently"


Nice try.

Ch(e)ating, obviously. You omitted apartment in the replacement.


Oh man, that's a clbuttic mistake!


I laughed so hard, I let out an apartmentus.


A couple of years ago, Turkey/Türkiye had a campaign to get people to use Türkiye in English. At that time I flew Turkish airlines, which had a promotional video about this, and with all mentions of Turkey in safety cards, magazines and the seat back screen changed to Türkiye. Then when browsing for a tv show to watch on my seat back screen, I came across an episode of Everybody Loves Raymond with a description something like “on thanksgiving day, Raymond burns the Türkiye“


Should Turkey decide what their country is called in other languages?

America is "Měiguó" in Chinese.

I'm pretty sure the Chinese decide for themselves what they want to call America.


In general I think yes. People should be able to decide what they want to be called. I think it is really strange that we assign names to countries.

Imagine if I asked you what your name was, then ignored what you said and announced that I would be calling you Frank. That would be quite rude.


That's the story of my life, unless you speak French you will probably pronounce my name wrong, no matter how many times I repeat it and help people pronounce it, I don't get mad nor think it's rude, it's kind of amusing actually. Once I realize they can't handle it I just tell people to call me by my last name which is very easy to say in English. There are too many interesting things in life to get hung up on a petty detail like a name.


Hey, if you write your name out as LAST First (which I see many French people do) then you might not even have to ask people to call you by your last name.


Most of the time (and all of the time with gendered pronouns), you don't use it to address me. You're using it to talk about me. And then, it's really between you two what you call me, isn't it? Of course I would be happier knowing you didn't refer to me in a rude manner, or as something I'm not, but I believe in privacy too, so it's really your business.

Turkey isn't rude, and it's usually understood from context that we don't refer to the bird (the Turkish government also push Türkiye on countries where the native word has no bird connotation). We used to translate all names, and that's understandable because names are often unpronounceable or otherwise violate grammatical rules if you just blindly drop them in a different language. I think it should be fine to use "Turkey" when talking to another English-speaker.


It's a bit more complicated than that. It would also be quite rude of you, were you not to accept that other people write using other letters and have other abilities in terms of what phonemes they can pronounce.


I see a distinction between mapping a name more or less faithfully to the sounds and spelling of a language and coming up with a completely different name. For example Brazil in English is not the same as Brasil but is fairly close and fits the language. Whereas IDK where Germany came from.


> Whereas IDK where Germany came from.

Von den Germanen.

https://en.wikipedia.org/wiki/Germanic_peoples

I'm not a fan of this modern idea that you should get to dictate how others refer to you. E.g. I don't think it makes sense to refer to China as Middle Country for anyone not from China.


Sometimes, sometimes not. Everybody changed Peking to Beijing when China requested that. But living in Netherland, I shake my head at all the languages insisting on pluralising the name of my country. And that's not even addressing "Dutch".


I'm 100% opposed to telling people how to speak their language. I don't even like telling people who use English as a lingua franca how to use it in a culturally "correct" way.

Normally, I'm outspoken against most "domestic" proposals to change words, since the rationale and implementation are usually very poor.

But as an American, I've decided that Turkiye without the ü is more desirable than Turkey. It resolves an annoyingly ambiguous search term, doesn't change anything about how it's pronounced, reads the exact same way, and it's trivial to switch how I write it. It really is a superior design with a painless transition.


no one's gonna type the "ü". this new name will either not get adopted at all or get adopted in wrong ways, like people using turkiye, adding more to the search fragmentation.


Not that I disagree, but it's not "no one". There are more countries than just Turkey with the letter ü :)


So should Germans write it as Tuerkiye when the ü is not available?


I think that request passed the UN last year; it's being adopted by countries now, so looking forward to more of these


Since the bird was named after the country shouldn’t the name of the bird change too?


My guess is they are doing this whole thing in the first place to get rid of association between their country and a large gobbling bird.


Is that the reason? I thought it was just more of an Erdogan nationalism thing


sounds like both no?


> TRT World explained the decision in an article earlier this year, saying Googling “Turkey” brings up “a muddled set of images, articles, and dictionary definitions that conflate the country with Meleagris – otherwise known as the turkey, a large bird native to North America – which is famous for being served on Christmas menus or Thanksgiving dinners.”

Ha, so yeah it seems like that's definitely a part of it.


Why do you think this is a good thing?


Idk what's happening to the Windows/file explorer teams, this is embarrassing.

Another issue I've had in newest file explorer, (probably a bug, unless they're trying to get rid of ".txt" files), is this one also reported on Twitter by "MittringMartin":

https://twitter.com/MittringMartin/status/166378202005572812...

can't do "right click" > "New" > "Text document".

Happens on my system too, a stock windows install on a surface laptop, for which I've never messed around with any obscure settings.

It's also gone from the "shift right click" menu (aka the good old menu)...


> Idk what's happening to the Windows/file explorer teams

I'd say whatever it is, it's been happening for a looong time, considering they've still never fixed the glaring bug in previous versions where the down arrow takes you to the second file in the list. Or where you go to save a file (save as), and you put your cursor at the end of the file name to rename just part of it, the cursor randomly-but-not-always jumps to the beginning of the file.


I really don't think this is a bug. The first element is focused when you start, but not selected. Pressing down moves the focus down, and selects. If you want to select the focused element, press 'Space'.


This collection-/tableview technique is known by maybe 0.1% of users and used by far fewer. You can also alt-arrow(?) to move the cursor around and select more elements with space. Never seen anyone use that. Instead of clinging onto this cursor nonsense they’d better finally make it work as (no selection, arrows select next/prev, [modifier+]mouse selects as usual) and get rid of the cursor. And make a proper tree view. Everyone does it this way except Explorer.


What percent of users navigate explorer with a keyboard at all? I'd hate to lose keyboard selection over this. Ctrl and shift do different things and I use them both.


I do this all the time.

Every once in a while, navigating with the keyboard in Explorer will cause the window thread to hang up in a busy loop and I have to kill it. I have no idea if this is an Explorer bug or caused by other stuff.


I would hate if Microsoft "fix" it because it has been this way since at least Win 98 and is what I am used to.


I was going to reply but then I realized I already have before! https://news.ycombinator.com/item?id=35516321


This is hilarious. Somehow I responded to both.


At least they eventually fixed the bug where "New/Text Document" thought file names weren't allowed to start with a dot.


This smells like a bug. I have Windows 11 Pro and I'm seeing weird behavior in this menu.

1. Historically it used to contain a bunch of blank file types, including .txt files.

2. Now it only has a Folder, Shortcut, WinRAR archive. That's right, only three items. This has to be a bug.


You got WinRAR installed? Is it still alive or are you doing archeology?


7-zip's GUI is basically an incomplete WinRar clone, and WinRar has a couple more features. Most notably recovery records to be able to recover from mild bit rot, but it can also preserve more NTFS attributes of your files (saving alternate streams, security records, turning hard links into links) and can optionally respect the Archive attribute of files (archiving only files with the attribute set, clearing the attribute after, optionally deleting them).

It's a very solid archival program. I wouldn't recommend that everyone runs out and buys it, and I use 7zip more than Winrar, but it does have some advantages, especially on Windows.


[flagged]


They just released a new beta last month. Doesn't seem very "stopped" to me.


After almost a year since last stable.


Then call it slow development, not stopped. A stable a year ago and a beta a month ago is more active than most software out there, for which development literally stopped.

Or do you have any affiliation with WinRAR?


No affiliation, I just use both and compare.


"...and WinRar has a couple more features"

Like the one that 7zip has AES256 encryption and WinRar can't open it, that one?


I've been using WinRAR for multiple decades and it works well. I'm not about to rearchive all my stuff with a new format so of course I still use it regularly. If I were a new computer user I might choose something else as I certainly experimented with various tools back in 2000. WinAce comes to mind as a decent competitor of the time.


I believe 7-Zip can open RAR archives.


Was just going to mention 7-Zip after reading GP post, then read yours.

Yes. I've used 7-Zip some, earlier, and found it to be quite good. IIRC, from both command-line and GUI.


And IIRC the command-line options were like tar; they could be typed without a leading hyphen.


7-Zip's GUI is very lacking, at least on Windows it doesn't have the basic feature of opening the folder where you extracted the files automatically


Use Peazip then.


... Or just stick with WinRAR, which works better than both of those, and is still supported?


And is closed source with an annoying pop up on every startup.


Oddly enough, I ran into an odd case a few months ago with a zip file which required me to use winrar.

In windows, when you extract a zip file that contains japanese characters, they get 'garbled', which can cause problems if you need to maintain the directory and file names. I tried with 7-Zip as well with the same outcome.

I found a fix on stackoverflow [1] which mentioned using winrar, as it had an option to change the name encoding for archived file names. Using that I was able to extract the zip and the files and directories maintained their original japanese names.

[1] - https://superuser.com/questions/554108/extracting-a-zip-file...


infozip also has problems with Japanese characters. I ended up solving the problem with the Python zip module. Considerably more awkward to use, but it offers a lot of control over the extraction process.

I did not look at it very closely, so I don't know what exactly infozip was getting incorrect in my case, but I did find this interesting bug report from 2012. Apparently a lot of encoders are sloppy about the spec and will leave header fields zeroed rather than set them. And if infozip reads a header that says the zipfile was created by DOS (a zero), it believes it and extracts it using a DOS-compatible encoding.

https://sourceforge.net/p/infozip/support-requests/10/
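
Roughly what the Python route looks like, as a sketch rather than the exact script used: Python's `zipfile` decodes entry names as cp437 when the entry's UTF-8 flag is not set, so the raw bytes can be recovered and decoded again. The Shift-JIS default and both helper names are assumptions for illustration; the real encoding has to be known or guessed.

```python
import zipfile

UTF8_FLAG = 0x800  # general-purpose bit 11: "file name is UTF-8"


def repair_name(garbled: str, real_encoding: str = "shift_jis") -> str:
    """Undo zipfile's cp437 decoding, then decode the raw bytes properly."""
    return garbled.encode("cp437").decode(real_encoding)


def repaired_names(zip_path, real_encoding: str = "shift_jis") -> list[str]:
    """List entry names, repairing only those not flagged as UTF-8."""
    names = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.flag_bits & UTF8_FLAG:
                names.append(info.filename)  # already decoded correctly
            else:
                names.append(repair_name(info.filename, real_encoding))
    return names
```

To extract, open each entry with `archive.open(info)` and write the bytes to a path built from the repaired name, rather than calling `extractall()`.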


He's got to get value for money out of it; after all, registering WinRAR isn't cheap.


I still have a valid WinRAR licence, I think.


There were a lot of programs distributed as rar archives.


Knowing a bit of what goes on under the hood, every time I have to use Windows, I just marvel at it actually working.


I still have 20 things in that menu in windows 11, but it hangs for a good 10s whenever I open it.


Regardless of the menu problem.

Why couldn't he just head for Notepad in the Start menu and open it there, rather than going through the process of gutting a Publisher file?


> Idk what's happening to the Windows/file explorer teams, this is embarrassing.

They've saved a boatload of money by offshoring this work.


I’ve got the same problem on my new laptop it’s super annoying


> Idk what's happening to the Windows/file explorer teams

They spent all their creative and time budget on implementing the new explorer tabbed interface :D


To save you a query:

- "The name "zip" (meaning "move at high speed") was suggested by [Phil] Katz's friend, Robert Mahoney.[4] They wanted to imply that their product would be faster than ARC and other compression formats of the time.[4]"

https://en.wikipedia.org/wiki/ZIP_(file_format)#History


Conversely:

The term ZIP is an acronym for Zone Improvement Plan; it was chosen to suggest that the mail travels more efficiently and quickly (zipping along) when senders use the code in the postal address.

https://en.wikipedia.org/wiki/ZIP_Code


And for anyone who hasn’t heard of him, look up “mr zip”! The smithsonian has some great material. I’m hoping to create an online archive (hah) of mr zip materials at https://mr.zip sometime soon


Interesting. In iconography I often see an allusion to a zipper being used as if compressing an overfilled suitcase.


That's funny, I always pictured it as "zipping up" like closing up some sort of container.


Quite a lot of zip application icons used this imagery, so that's not particularly surprising


to quote the famous bash.org:

"What should I give my sister for unzipping?"

"20 bucks?"

"no like, for files and stuff"


Hahaha


It's funny how well that would fit tar.


Huh, I had always assumed the etymology was related to "zipper," as in "all my clothes fit in my bag once I zipped it."


I think the zipper iconography was used by pretty early versions of pkzip, so it was definitely on the mind of the format's creator. Of course the more common implementation WinZip (these days from Corel but I assume it was acquired, not sure of the history) went the C-clamp route, which I think it shared with WinRar.


Like other commenters, I always thought it was an allusion to zipping up some sort of storage or container, like a suitcase - zipping would still be useful even if it didn't compress, as it's a way to send a folder as a single file.


I could've sworn it was named for the authors - took a while to find with nothing more to go on, but I was thinking of Lempel-Ziv (and I suppose Ziv became ZIP, sorry Ziv). I must be using a lossy compression for memory.


Interestingly, gzip came after zip (some 3 years later)


Well, GZIP is GNU ZIP. I always expected it was developed after the original ZIP.


gzip is just the compression/decompression algorithm (Deflate) from PKZip.

They needed a new compress/decompress tool once IBM asserted their patent rights on the LZW algorithm used in the compress tool.


gzip adds its own file header with lots of useless information like a field for a file name.

zlib, the other common wrapper around a Deflate stream, uses a different and smaller header (ZIP archives themselves store the raw stream inside their own per-file headers).

They also use different checksums at the end (CRC32 vs Adler32).
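
If you want to see the framing difference yourself, here is a small Python sketch that compares the gzip and zlib wrappers as produced by the standard library. The helper names are made up for illustration, and it does not touch ZIP archives at all:

```python
import gzip
import struct
import zlib


def describe_gzip(blob: bytes) -> dict:
    """Pick apart the fixed parts of a gzip member (RFC 1952)."""
    # Trailer: CRC-32 of the data, then its length mod 2**32, little-endian.
    crc32, size = struct.unpack("<II", blob[-8:])
    return {"magic": blob[:2], "method": blob[2], "crc32": crc32, "size": size}


def describe_zlib(blob: bytes) -> dict:
    """Pick apart a zlib stream (RFC 1950)."""
    # Trailer: Adler-32 of the data, big-endian.
    (adler32,) = struct.unpack(">I", blob[-4:])
    return {"method": blob[0] & 0x0F, "adler32": adler32}


if __name__ == "__main__":
    data = b"postcode file " * 50
    print(describe_gzip(gzip.compress(data)))
    print(describe_zlib(zlib.compress(data)))
```

Both report compression method 8 (Deflate); the gzip trailer matches `zlib.crc32(data)` and the zlib trailer matches `zlib.adler32(data)`.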


That's what the name might mean but it's simply wrong. gzip is not the same format as ZIP, and the tool itself is unable to read or write ZIP files.


It is however GNU's alternative to the zip format.


A gigazip is 1000 zips


1024


That would be a g̶i̶b̶i̶ kibizip.


No, it's a kilozip, like kilometer (= 1024 meters). A gigazip (not gibizip) would be 1'073'741'824 zips (1024^3 = 2^30).

Also it should be Gzip, not gzip, although it's not ambiguous since the reciprocal prefix is nano-, and in general there are no 2^-X0 prefixes that start with g- like there is for Mega-/milli-.


kibimeter = 1024 meters

kilometer = 1000 meters

https://physics.nist.gov/cuu/Units/binary.html


Was this post sponsored by the hard drive industry?


I remember fruitlessly trying to open a .zip file with gunzip back in the mid 90s. That was my first experience of the GNU project.


When I worked at Microsoft as a localization lead, we had actual human beings that would look over this stuff before calling machine translation good (of course, we had actual human translators back then, too). But that was a long time ago, back when Microsoft had human software testers, too.

And what options are available when I right click on a .zip/.postcode file? Say, perhaps I wished to uncompress it?


I think the worst example of this overzealous localization was when some person translated the VB(A) reserved keywords to the appropriate language (I want to say it was german or swedish). Happened in the 1990s.

Your programming language reserved keywords changing depending on your selected language was a big facepalm.

Searching for this problem on the web doesn't produce any valid relevant results, so the web has forgotten or maybe the horror has been plastered over.


VBA is still localized in many languages, I saw macros in italian. It's not just VBA keywords either, even excel functions are localized. e.g. this is from microsoft's docs[0]

    =CERCA.VERT(B2;C2:E7;3;VERO)

[0] https://support.microsoft.com/it-it/office/cercare-valori-co...


E.g. Excel does that still.


So obnoxious, although apparently in newer versions you can modify what language you want.

I remember seeing the dictionary to translate between German and English function names[1]. One time I googled for "Excel German translation" and Google proudly showed the Translate interface pre-filled with "English: excel / German: übertreffen"

[1] https://www.perfectxl.com/excel-glossary/what-is-excel-funct...


The feature itself isn't that crazy – it's even a good thing, especially considering that Excel (and to lesser degree, VBA) were designed for ordinary people and that in the 80s/90s English was less common than it is today.

Where it went wrong is that as soon as you chose a language you were stuck with that. It should have been a display option.


Yes. The best is when you get an Excel file from Japan and the formulas are in Japanese. Who needs encryption or obfuscation.


No? Formulas are stored in a language-independent way, you should see English formulas in that file.


The missing part of that story is probably that the file is a .xlsm (or even an older .xls with macros) and therefore has VBA scripts, which are not saved in a language-independent way.


Probably someone manipulating formulas with VBA and improperly using the FormulaLocal property. Formulas are stored in a language-neutral way and it shouldn't be a problem under normal circumstances.


But is it Shit-JIS, or are you lucky enough to get UTF-8?


Maybe you can safely believe it was all just a nightmare... certainly sounds like one!



There was a famous translation mistake back in the days of IE 5 for Mac or so where in one of the settings screens they translated the country code for Norway "no" as "nej" (literally No) in Swedish


I get that in the latest version of the Panasonic aircon apps. They have the term "No." (with a dot, as the column header for the column "number" in a table). So in Swedish the app says

    Nej.  
    1
    2

i18n is HARD. But it's not THIS step that is hard. This is day one of i18n school. You can't translate from english terms. Just stop translating things from english, translate from a neutral key!

    key         en-US     sv-SE
    num_short   "No."     "No."
    yesno_no    "No"      "Nej"
    yesno_yes   "Yes"     "Ja"
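
A minimal Python sketch of that table as a lookup, using the exact keys and strings above. The `translate` helper and the fall-back-to-en-US behaviour are my own assumptions, not any particular framework's API:

```python
# Catalog keyed by neutral identifiers, never by the English text itself.
CATALOG = {
    "num_short": {"en-US": "No.", "sv-SE": "No."},
    "yesno_no": {"en-US": "No", "sv-SE": "Nej"},
    "yesno_yes": {"en-US": "Yes", "sv-SE": "Ja"},
}


def translate(key: str, locale: str, fallback: str = "en-US") -> str:
    """Look up a string by key; use the fallback locale if none exists."""
    entry = CATALOG[key]  # an unknown key is a programming error: KeyError
    return entry.get(locale, entry[fallback])


if __name__ == "__main__":
    print(translate("num_short", "sv-SE"))  # No.
    print(translate("yesno_no", "sv-SE"))   # Nej
```

Because the column header and the yes/no answer live under different keys, the translator never has to guess which English "No" they are looking at.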


This is still a problem today with YAML. Say you have something like a list of locales: the word 'no' gets interpreted as a boolean.
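
A toy Python model of how that happens. This is not the code of any real YAML parser; the word list is my assumption about what YAML 1.1-style parsers treat as booleans, and `resolve_plain_scalar` is a made-up name:

```python
_BASE_WORDS = {
    "yes": True, "no": False,
    "true": True, "false": False,
    "on": True, "off": False,
}

# Assumed spellings: lowercase, Capitalized and UPPERCASE only.
BOOLEAN_WORDS = {}
for word, value in _BASE_WORDS.items():
    for spelling in (word, word.capitalize(), word.upper()):
        BOOLEAN_WORDS[spelling] = value


def resolve_plain_scalar(text: str):
    """Return a bool for boolean-looking unquoted words, else the text."""
    return BOOLEAN_WORDS.get(text, text)


if __name__ == "__main__":
    locales = ["dk", "fi", "no", "se"]
    print([resolve_plain_scalar(code) for code in locales])
    # ['dk', 'fi', False, 'se']
```

In real YAML the usual fix is to quote the value ("no") so it stays a string.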


This seems to me like a case of a dev that used the zipPostalCode instead of the zipFilename key from the string table on accident or out of laziness.


Still, any machine translator worth your time would know what the phrase zip file is, unless it's not a machine but someone manually changing it.


Since translation is given less and less thought, the underlying system is also becoming more basic and less context-aware.

I don't remember exact occurrences but I've noticed quite a few similar problems on macOS that are just the result of words being translated out of context.

On a popular forum software I used to do support for, "flag" (for moderation) was translated in French as "drapeau" for example. Which is technically correct I suppose, drapeau does mean flag, but the wavy kind of flag that countries like to use as symbols. Nothing else. And it's not a verb.

In this case, there is simply the string "zip" somewhere that is used both in some address forms and for the file type, and the translator just happened to translate the first occurrence, probably without even knowing that it was also used in another context. Or without any means to ask the dev team to separate them into two terms. Which might not even be possible if they used too simplistic a translation system.


I want to share this gem of a screenshot I took on the Belgian Amazon (Amazon.com.be) settings page a few weeks back:

https://media.discordapp.net/attachments/323831494718128130/...

This is French, English, German and JavaScript all in one sentence.

Two trillion dollar company.


Other localizations of Amazon are just as hilarious. And to understand the product listings, you have to know (pretty complete) English, because the translations only make any sense if you know the English description and can reverse the translation in your head back to English.

So, you have to understand 2 languages to understand these listings.


The programmers and the translators are different people who have probably never met each other.

Translators usually just get to see the string they are translating, and if they're lucky a 1 sentence explanation written by a programmer.

With no context, a "Zip file" might indeed mean a "postcode file" - perhaps it is a file for storing all the zip codes, and now it will be used for storing all the postcodes?


I’ve drapeaued this comment to favorites :)


J’ai drapeaué


To be fair, translation systems genuinely are a pain, and it's easy to make a mistake either as a developer or as a translator who has thousands of strings to translate.

It doesn't help that a lot of developers probably never translated an application; when I added translations the first thing I did was translate it myself, but that's not really feasible for larger systems and/or with people who don't speak a second language well enough to actually do translations.


> that is used both in some address forms and for the file type

Oh yes, the classic "I just picked the string from my IDE's dropdown list, it was there so it's fine right?"


They can't work on a per-word basis; translation only works for long enough/large enough texts and images. Words ("zip") and partial sentences ("expand zip") are garbage in. Out comes "postcodes" and "elaborate addresses" or whatever the PRNG chooses.


I’d assume they have to use Bing Translate instead of something decent like DeepL.


We had an almost identical bug, also at Microsoft, when I was working on Hotmail (it may have been Outlook.com by then, I can't remember): POP, the email protocol, was localized into UK English as DAD.


I find POP->DAD a little odd, because I've never (as an American) heard anyone say 'pop' referring to their father. 'pops'? yes. Now, POP->SODA, would not surprise me in the least.


"Soda" is even more of an americanism than "pop". ("Fizzy drink" is the only term I've ever heard used in the UK).


“Pop” gets used in the UK sometimes but I’ve only heard it in northern England. Never heard Soda anywhere in the UK though, apart from the specific case of cream soda.


It's always confused me why Wikipedia calls its UK language version "simple English".


Pop is common where I live in the Midlands


In Aus I use Pop to refer to my Grandad who was British. I do think it was to help with confusion about which grandparent I was referring to: was it my dad's dad or my mum's dad?


It's an older generation thing. See Pop Tate in Archie comics [1], for example; the soda fountain guy. Guessing it's probably derived from Papa, like Pa, Paw (hillbillies), etc.

("Hillbillies" not meant derogatorily.)

[1] https://en.m.wikipedia.org/wiki/Archie_Comics


Didn't read Dr. Seuss's treatise on patricide, Hop on Pop, I guess.


Vaguely related, I’ve noticed before manuals in which tech writers provide expansions of acronyms - but they don’t realise that in the context of that manual the acronym means something different.

I remember one z/OS manual saying AIX stood for “Advanced Interactive eXecutive”-which is true if we are talking about the Unix, but this manual was talking about a VSAM AIX (Alternative IndeX) - a secondary index for the VSAM flat file database system. Another example was USS being wrongly expanded as “UNIX System Services” when in the context it was actually a reference to VTAM’s “Unformatted System Services” (the part of VTAM which handles the initial LOGIN command, I think “Unformatted” because it is plain EBCDIC not 3270)

In Windows, there is an account flag called “UF_MNS_LOGON_ACCOUNT”. A lot of people claim “MNS” stands for “Majority Node Set” - even one MS doc - see MNSLogonAccount in https://learn.microsoft.com/en-us/powershell/module/activedi... - however, while it is true that’s what the acronym stands for in the context of Windows clustering, I’m pretty sure in the context of Windows user accounts it actually means “Microsoft Netware Services”-this flag was used by Microsoft’s 1990s era software which enabled Windows NT to pretend to be a Netware server, and the unrelated Windows clustering “MNS” has never used that account flag. “Microsoft Netware Services” was renamed to “Microsoft Services for Netware” (MSN-but not that MSN!-hence MSFN or SNW), no doubt for trademark law reasons, but the name of this flag got stuck with the original acronym. I don’t think it actually does anything unless you have that old stuff installed, which probably doesn’t work on newer versions of Windows. I suppose that being an effectively disused flag, there is nothing stopping people from stealing it for their own purposes, and probably someone out there has.


I've always wondered why "Shopping Cart" isn't called "Shopping Trolley" in the checkout flows of British websites...


One of our sites uses the word "hopper", and I was once told that when a client asks what it means, our UK people tell them it's an American term, and our US people tell them it's a British term.


That’s funny. It’s definitely an actively used term in the U.K. though I can imagine people not knowing it. Outside of heavy industry I’d associate it with centuries-old flour mills.

I’ve found that often when my fellow Brits complain of a word being American it’s actually just an old English word we stopped using.


> Outside of heavy industry

Even rather light industry. Coffee grinders have a feed hopper.


Joke's on all of you, it's a Canuck word.


The very first thing we do when we create a new e-commerce website is to change "cart" to "basket" everywhere we can find it. Without that, as our manager says, we wouldn't sell anything on the UK market.


It's often called Basket to navigate this


because most people don't care that much, i've lived in the uk my whole life but my english is extremely americanized due to my use of the internet growing up


It is on Argos.


Argos is great. Like Amazon but you can walk in (at least before they started closing loads of their stores and merging them into Sainsbury's)

Same day delivery online before Amazon started doing it with Prime Now too.

Their app update release notes used to be hilarious too but they've stopped doing them, I think the person responsible left. https://thecircular.org/argoss-hilarious-app-updates/


It is on a few places, here it is on Argos - https://i.imgur.com/kik8y91.png


It is on some of them, but it isn't super common.


You find a lot of British-English terms are different from what we use in the US.


'isn't' means the same though


Another British English oddity is with Mac OS. Classic Mac OS had a British version until circa Mac OS 8, which called the Trash the Wastebasket. But the icon wasn't changed, so was still visually a dustbin. British English returned at some point with recent Mac OS X versions, but now the icon really is a wastebasket, they called it the Bin.

(If anyone has this British English Windows 11, is it still the Recycle Bin, or the more British Recycling Bin?)


Recycling Bin is the normal name in US English too. I'm in the US and when I search google for "Recycle Bin" (https://www.google.com/search?q=recycle+bin), the results (in order) are

* "Recycling Bins" from Home Depot

* "Recycling Bins" from Amazon

* "Find the Recycle Bin" from Microsoft

* "Recycling Bins" from Lowes.com

* "Recycle Bins" from Target

* "Recycling Bins" from Walmart


Gives new meaning to putting my shell scripts in /usr/local/bin :-)


In America, we usually either say "Trash Can" or "Recycle Bin". For some reason, we don't do Trash Bin or Recycle Can.


Dustbin is also UK English, and I (Canadian) don’t actually know what “visually a dustbin” looks like. In Japanese/English spoken in Japan, “dustbox” is also a term sometimes used for what I would call a small garbage can.

Do they also localize the folder path? It’s “~/.Trash” in US English


A dustbin is I believe called a trashcan in North America, ie: it’s what the Classic Mac OS Trash / “Wastebasket” icon shows. A (usually) metal cylindrical container for rubbish with a lid, usually kept outside. Nowadays in the UK mostly replaced by wheelie bins…

It's still internally .Trash, as that isn't user visible I suspect it's the same for all localisations of Mac OS.


It's still Recycle Bin in the British version of Windows.


On Windows, I think it has always been "Recycle Bin" in any English dialect since the beginning. I guess that leaves a question I never thought of: what do the British call a recycling bin?


we call it the "recycling bin", or the "recycling"


I check out what's under the bonnet.


Recycle Bin is oddly enough not idiomatic American English either (we would call it as you do).


I've honestly just given up using British / Commonwealth English localisations in software. You run into constant little bugs and annoyances, and it's not like I can't decipher the US English version.

Please always remember to let your users change regional settings separately and individually for currency, measures, first day of the week, etc.


ZIP Code was a service mark owned by the USPS. The generic term is a postal code. I've personally seen more and more software use the term postal code, especially anything that handles international addresses.


Ironically, colour (like a few other ou-isms) is actually a modern French influence that took root in Britain but not "in the colonies".


In the "colonies" of Australia and New Zealand, "colour" is the accepted spelling.


Many places have states (for example, Mexico) but "the states" typically refers to the ones in the USA. Perhaps the same applies to "the colonies"?


If you're American, who are generally seen as incredibly parochial (especially by the British), then that may be the case. If you're British, then the colonies really could be any of 75% of the world.

As a Brit, I think it's a tad unfair on the yanks to see them as that parochial. As a test, ask any Brit moaning about the NHS and railing against the US private system (these complaints often come together from people who would never consider themselves parochial) whether the health services of Germany, France or the Netherlands are private. They won't have the first clue.

Just shows you, all that consumption of worldly and international news by worldly and international types remains strangely parochial, and that's just one example.


There's quite a few things that are considered "Americanisms" now, but are actually British. After the American revolution, British English continued to change, but many old-timey Britishisms remained frozen in American English.

Now, they're thought of as American.


There's also two forms of "British" English, with the less common being Oxford English, which uses ‑ize instead of -ise for most words https://en.wikipedia.org/wiki/Oxford_spelling


Got an example? Sounds interesting.


Sure. "Soccer" is one. It started in England as college slang. "Soc" short for "association" from "football association", and "er" added as a jocular formation. That guy is a soccer. Rugby players were similarly called ruggers.

Also, more vague but fascinating, is that the American "southern accent" (quotes because there isn't a single southern accent, but most people think of a specific one when they think "southern accent") is largely a "British accent" (quotes for the same reason), but British circa the 1700s.

This is actually a fascinating subject, well worth looking into if you're interested in English language history. Also interesting is that the division between American and British English has been growing softer over time, and more and more modern Britishisms are being used regularly by Americans (and vice versa).


This isn’t true though!

The Southern accent in the US did not emerge until after the Civil War, replacing a diverse array of local accents. It came from the Appalachian South which was settled by the Scotch-Irish and Germans, none of whom had English accents.


It's quite funny and depressing, how one of the youngest and wealthiest countries on Earth has already fundamentally forgotten huge chunks of its history - after barely 250 years. Ten generations are clearly enough to lose a lot of data.


Whenever I hear the "British accent of the 1700s" I just don't really believe it. There isn't a British accent now. There are many many different accents, all wildly different.


Right, as I mentioned. Same with American or even Southern American accents. There are a wide variety of distinct ones on both sides of the pond.

The various accents do tend to have features in common, though, so you can hear that the various southern accents are part of the same family, and similarly with the various British accents. There are, of course, exceptions to this as well. This bit of linguistic history is really referring to a fairly specific American accent and a fairly specific British accent.


They said "a British accent", as in, "one accent among others".


Rhoticity is the big one I know. Parts of Britain started dropping the R before 1776, but it became more widespread after that. The US port cities (most notably Boston, but it was also a class difference) had enough contact with Britain that they dropped the R too, whereas the rest of the US kept it.

https://en.wikipedia.org/wiki/Rhoticity_in_English


I don't think anyone would consider rhoticity as especially American considering several dialects across the UK, Ireland & Canada are rhotic. It is just another way in which English varies globally.


True, and I didn't mean to imply it was. I was just thinking of things Americans say because of the British that the British no longer do, wasn't thinking America-exclusive things.


"Tire" and "curb" were once the normal spellings in the UK; "kerb" is an innovation whereas "tyre" is either an innovation that is coincidentally the same as an archaic spelling, or the restoration/repopularisation of an archaic spelling. The spelling "kerb" upsets me whenever I see it because it's clearly referring to the curvature of the kerb, but fortunately I almost never see it.

Likewise, spellings like "programme" are deliberate changes to mimic the French spelling. These have been rather more successful than they have any right to be, but some have completely failed (like "gramme") and a lot of people still use the older spelling.

-ize, also, used to be the standard spelling with -ise an alternative also found in the UK. In this case, it's clear that an understanding that -ize is used in the US and -ise is used in the UK became an understanding that -ize is the US spelling and -ise is the UK spelling which raised its currency. But I think they both remain in use in the UK (-ise has more-or-less chased out -ize in Australia though).

-or spellings like "honor" and "color" were once much more common in places where they are now rarely seen and vice versa. To an extent they follow the same story as -ize/-ise, with the US standardisation of one chasing its use out in the UK. In Australia, -or was much more common (than now) when the power one needed to distinguish oneself from was the UK, but now that the main power one needs to distinguish oneself from is the US, -our has chased it out except in the name of the Labor party (because the paperwork was filed by someone who happened to prefer the shorter spelling in a time when both were current) and some uses of "honor" that are literally etched in stone. (The last general use was until about the year 2000, by "The Age", a Melbourne newspaper which used -or as its house style, but by then it was seen to be improperly American and they switched to -our.)

Generally, spellings and spelling variations remain open and subject to gradual change in all English-writing countries.


Gotten: “English speakers in North America preserved gotten as the past participle of got. Outside of North America, the shortened version [got] became standard.”


This is hilarious, because when I was in grade school, "gotten" was considered incorrect English and you would be corrected or marked down for it.

I do hear "gotten" said occasionally in my part of the US, but it's not common. It's a bit like "ain't" in that way.


gotten still remains in occasional use in Britain, certainly in spoken English.


Beyond words, there are the imperial units and Fahrenheit for temperature, and I wonder if the mm/dd/yyyy format happened that way too.


Modern French has had little influence on English. As I understand it, the -our forms are evolutions from the Norman French spellings that introduced these Latinate words into English after the Conquest. In the Renaissance, as many words made their way into English directly from Latin, there was something of a desire to Latinize the spellings of these words to their original -or forms. This had some traction in Britain (eg, horror, tremor, governor) but really took root in America.


Next level Clbuttic [1] mistake

[1] https://en.wiktionary.org/wiki/clbuttic
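
For anyone who hasn't seen how this class of bug arises, here is a small Python sketch. The helper names are made up for illustration; the sketch contrasts a blind substring replace with a whole-word replace using the standard `re` module:

```python
import re


def naive_replace(text, old, new):
    """Blind substring replacement: also rewrites the inside of longer words."""
    return text.replace(old, new)


def whole_word_replace(text, old, new):
    """Replace only when `old` stands alone between word boundaries."""
    pattern = r"\b" + re.escape(old) + r"\b"
    # A function is used as the replacement so `new` is inserted literally.
    return re.sub(pattern, lambda match: new, text)


print(naive_replace("classic", "ass", "butt"))
print(naive_replace("he said flatly", "flat", "apartment"))
print(whole_word_replace("he said flatly", "flat", "apartment"))
print(whole_word_replace("a flat landscape", "flat", "apartment"))
```

The whole-word version avoids "clbuttic" and "apartmently", but it still turns "a flat landscape" into "an apartment landscape"-style nonsense, because word boundaries say nothing about which meaning of the word is in use.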


My favorite is when Google Translate translates the Chinese word for product/item/thing as "baby." You shop for babies, babies go on sale, here, have a coupon for $2 off any baby! If it doesn't work out there's always the "return defective baby" button!


I'm guessing this is for Taobao? They do stylize the term they use for products.

The more accurate translation would have been "precious/treasure" -- i.e., Shop for [Treasure]. (Taobao lit. means to [Pan] for [Treasure])

In common colloquial usage, though, the term would indeed refer to "baby" in the same sense as "my precious/darling".


Yep! That's exactly it, thanks for the context.


This reminded me of the song "All that she wants" by Ace of Base. The mistranslation of the word "baby" makes the song sound like it's about a woman who wants more kids, rather than a woman who wants a lover.

https://www.youtube.com/watch?v=DrwlFTqS_bg


Reminds me of the gourmet cuisine to be had at Translation Server Error.


Thank you for the term, I didn't know about it.

Another very ironic example of that is this recent Reddit post [1], where the name "Nasser" was censored in-game to become "N***er", making it look much, much worse.

[1] https://www.reddit.com/r/gaming/comments/znfw7y/my_name_is_n...


Probably an issue during localization, where someone saw "zip" and assumed it meant "zip code" and so "translated" it to "postcode". This is a perfect example of why it's important to also supply the context instead of just the raw strings you need translated!

It also brings to mind a translation issue from my home country, Wales, where road signs must be bilingual (English and Welsh). A request was sent to a Welsh translation service asking for the translation of a specific phrase. The signwriters received the response, completed the sign and it was then erected. The problem: it was an out-of-office auto-reply! https://www.snopes.com/fact-check/mistranslated-welsh-traffi...


zip doesn’t translate to postal code. ZIP does.


I don't think you understand how localization works. They have a localization file and they send them off to a translation service. The translation service goes through the file and translates the individual strings (or string fragments).

Either the translators made a mistake, and thought it was referring to a ZIP (regardless of capitalization) and translated accordingly, or a developer used the wrong key when assembling the string references - i.e. he used the equivalent of (this is pseudocode as I don't know how they handle localizations):

  localize("CompressToArchive", localize("Zip"), localize("File")) - i.e. with a reference to localization of "Zip" (or ZIP, or zip - the dev likely just searched for a string that matched what he wanted to localize)
instead of

  localize("CompressToArchive", "Zip", localize("File")) - i.e. with a string of "Zip"
where the strings are defined as:

  CompressToArchive: Compress to %1.%2 (same for us and uk)
  Zip: ZIP (us) or postcode (uk)
  File: File (same for us and uk)
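
The pseudocode above can be made runnable as a rough Python sketch. Everything here is hypothetical: the `STRINGS` table, the explicit `locale` argument, and the "%1 %2" template are assumptions for illustration, not how Windows actually assembles the label:

```python
# Hypothetical string table modelled on the comment above; the names and the
# "%1 %2" template are assumptions for illustration only.
STRINGS = {
    "en-US": {"CompressToArchive": "Compress to %1 %2", "Zip": "ZIP", "File": "file"},
    "en-GB": {"CompressToArchive": "Compress to %1 %2", "Zip": "postcode", "File": "file"},
}


def localize(locale, key, *args):
    """Look up `key` for `locale` and substitute %1, %2, ... with `args`."""
    text = STRINGS[locale][key]
    for position, value in enumerate(args, start=1):
        text = text.replace(f"%{position}", value)
    return text


def buggy_label(locale):
    # "Zip" is resolved through the string table, so en-GB gets "postcode".
    return localize(locale, "CompressToArchive",
                    localize(locale, "Zip"), localize(locale, "File"))


def fixed_label(locale):
    # The format name is passed as a literal and is never translated.
    return localize(locale, "CompressToArchive", "ZIP", localize(locale, "File"))


print(buggy_label("en-GB"))
print(fixed_label("en-GB"))
```

In en-US both variants produce the same label, which is one reason a bug like this can slip past a developer who only ever runs the US build.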


This kind of mistake isn't that amazing, it's actually very common in translation because translators often don't get enough context - they just see a string with no comments.

But you're supposed to find it in QA after that.


That'd be easier if they hadn't got rid of all their QA people years ago.


Did they really do this? How can they possibly justify that? Are their users so held hostage that broken software isn't enough to get them to leave?


Yes? People have been joking about Microsoft's buggy software for longer than some of my friends have been alive. Backwards-compatibility is a hell of a drug; combine it with "every time my computer changes, it's really hard to figure out how to use it, so I don't want to change OS", and it's unstoppable.


Yes, there was a reorg in 2014 where they eliminated the role of 'software developer in test', the justification was something something testing as a separate function is slowing us down and we want to put out more releases. Or something. Supposedly testing responsibilities didn't go away, they were just moved to the software developers, but my lived experience is quality went down, so I'm convinced that testing isn't happening as much. Certainly windows mobile 10 had much less polish than any windows phone >= 7.5 release. Of course, design changes I don't like can be hard to separate from some of the things I think QA would have found.


Nah they just outsourced their QA department to the users, it's the modern thing to do.


It's the right thing to do, for Microsoft. Their users aren't going to leave them, no matter what, so why waste money on QA staff and testing? It's better for the company's profit margin to just let the users deal with those problems, and the users agree since they're happy to keep using MS products and paying for them.


Reminds me of the time I used my French bank card in a ticket machine in the UK. The machine detected it was a French card and switched the interface to French (fine). It then asked me to "entrez votre broche" as a prompt for my PIN.

In the context of electronics "pin" (as in on a component) translates to French as "broche" but it makes no sense for a pincode!


I saw a kiosk in the UK which cycled through languages inviting you to interact with a "t touch screen to start!"

The Swedish translation said "The touchscreen that starts!"


I was initially confused and thought why you would use pincode at ATM, then realized you meant the card's PIN.

In India pincode refers to zipcode or postalcode. The pin in pincode means Postal Index Number.

It would be interesting to know if a similar bug exists for the locale en_IN.


Many gas stations in the US still ask for the zip/postal code instead of the actual card PIN code :|


The scope and scale of translation work has always seemed daunting to me.

I can’t imagine a way to make it any more efficient, without sacrificing significant accuracy by missing context, than to do manual translation of everything.

I can’t imagine an entire team of localization translators mobilizing, every time I added a button to a UI, to make sure the right context came through in every language the UI supported. Not to mention the tools that support that workflow of passing context to translators, compiling the translations, and binding them all into a data structure I can use to populate my UI.

And that’s just for language. Iconography and colors have their own localizations.


Screenshots are all you ever need. Just give a screenshot or terminal output to a bilingual SME and have it double checked. It’s not like translators must be given sweat and blood of engineers just for their motivations.

A screenshot of just the dropdown menu could have been generated in CI and could have easily prevented the “postcode file”, so long as the translator was presented that image along with the string and had recognized that Windows dropdown.


Can someone report back on whether it returns a .postcode archive :)


How does that even happen? I mean for localizing software - any software - you need some kind of lookup key for each resource and then you can do a lookup for that key and a language, to get the resource (an image, a text string, whatever). So the function is (key, language) -> resource. E.g. ("filetype:zip", "en-US") -> "Zip file". Since any term can be used in multiple places and words can have multiple meanings, you can't localize software as a mapping from one language to another. Especially not English.


A lot of software does it gettext-style, where the code contains one language (typically English) and wraps it in _() so it's picked up by the translation system. For example: https://www.gnu.org/software/gettext/manual/html_node/Prepar...

You can add context this way, but programmers often forget, translators don't look for context, and seemingly no one reviews the final product in other languages.
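
As a small illustration of adding context: Python's standard `gettext` module has had `pgettext(context, message)` since 3.8, mirroring GNU gettext's `msgctxt`. The sketch below is only an approximation. `DictTranslations` is a made-up subclass that stands in for a compiled .mo catalog, and the context names and translations are invented:

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Hypothetical in-memory catalog keyed by (context, message)."""

    def __init__(self, catalog):
        super().__init__()
        self._entries = catalog

    def gettext(self, message):
        # No context: fall back to the source string if nothing matches.
        return self._entries.get((None, message), message)

    def pgettext(self, context, message):
        # Same source string, different entry per context.
        return self._entries.get((context, message), message)


en_gb = DictTranslations({
    ("address", "Zip"): "Postcode",
    ("filetype", "Zip"): "ZIP",
})

print(en_gb.pgettext("address", "Zip"))
print(en_gb.pgettext("filetype", "Zip"))
```

With a context attached, the translator sees two separate entries for "Zip" instead of one ambiguous string, but as noted, it only helps if the programmer remembers to supply it.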


That's exactly my point: using gettext and hoping to hard-code one language (typically English) as the "key" language and then sending it to translators who will try to map context-less strings to a different language is just not a good way of localizing software. I think the key design flaw is that it's the best solution to an unviable problem: the idea of adding i18n to software as a simple transform from one language to another (i.e. an afterthought).

The proper solution, again, is using keys that provide the context. A special syntax for these neutral strings (e.g. prefix, uppercase, whatever) will quickly show where a translation is missing. Translating is then a mapping from keys to English, or keys to French and so on, and never English -> French, even with gettext. Instead of "Zip file" you'd have to hard code ":filetype_zip".
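
A rough Python sketch of the "missing translations show up quickly" part. The `CATALOG` contents, the `tr` helper and the `missing_keys` check are all hypothetical; the only idea taken from the comment is that an untranslated key is displayed as the raw key, with its distinctive prefix:

```python
# Hypothetical catalogs keyed by neutral, prefixed identifiers.
CATALOG = {
    "en-US": {":filetype_zip": "Zip file", ":address_zip": "ZIP code"},
    "fr-FR": {":address_zip": "Code postal"},  # ":filetype_zip" not translated yet
}


def tr(key, locale):
    """Return the translation, or the raw key so the gap is obvious on screen."""
    return CATALOG.get(locale, {}).get(key, key)


def missing_keys(locale, reference="en-US"):
    """List keys present in the reference locale but absent from `locale`."""
    return sorted(set(CATALOG[reference]) - set(CATALOG.get(locale, {})))


print(tr(":filetype_zip", "en-US"))
print(tr(":filetype_zip", "fr-FR"))
print(missing_keys("fr-FR"))
```

A check like `missing_keys` could also run as part of a build, so gaps are caught before a user ever sees a raw key.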


I have worked with volunteer translations for Valve using the now defunct Steam Translation Server (and Crowdin very rarely nowadays) and discussing problems like these with other people from the community, the mods agreed that this is a lack of context issue.

Then, I started playing World of Warcraft and realized how bad the localization is. I mean, the translation isn't bad per se, I really appreciate that they tried to adapt the game to my country's culture and that is totally awesome.

But seeing Blizzard pay a ton of money for some people that don't even play the game to verify if their translation is correct or makes sense is just astonishing to me. When Blizz released the new UI for WoW, there was an option called "snap" which meant "snap to grid". The amazing translator team, having no context and not even the decency of checking the context in game, translated this option as "estalo", which, in Portuguese, means "snap (sound)", as in a finger snap.

Another example is that when they added minimap sizing to the UI editor, there's an option that allows the name of the zone to be below the minimap. The brilliant translator team translated it in Portuguese to "abaixo cabeçalho" (literally under header) or something very silly, where you could literally see that they put no effort into the translation.


Playstation used the equivalent of "Store/archive/stock 20%" on the PSN Store for years in Norway instead of "Save 20%", because they translated Save without realizing that we have different words for saving money and saving things.


This was initially reported over 2 months ago when it first showed up in canary builds.

Microsoft, at the time: "This is an issue in the latest Canary Channel build and the loc folks are working on a fix" (https://twitter.com/JenMsft/status/1643599284120723456)


Using software in a language other than US English has always been a pain.


Yup. Every graphic designer I know around the world uses Photoshop in English, not in their native language.

Every Photoshop user knows the difference between the image and the canvas, between layers and channels and levels. They learn what dodge and burn mean.

But nobody can even guess what arbitrary terms a translator will use for layers and canvases and channels and levels, or what terms printers used for dodging and burning back in the day.

Not to mention most Photoshop tutorials you find are in English as well.


And to someone who hasn't done touch-up work on film, the terms "dodge" and "burn" have no obvious connection to lightening or darkening an area, so even native English speakers need to learn what the terms mean.


Burn should be fairly logical. Burning things tends to make them black.


Russian translations tend to be quite robust even for open source software - large user base who are used to translation being there and of acceptable quality.

MS was the first mover here by having a robust translation of Windows 95 from the start (and yeah, they did translate the start button).


Open Source translations tend to be better in general. Especially for smaller languages like Estonian. MS uses pretty horrible direct machine translations here that usually don't make sense.


Russian seems to be better than even English at grabbing words wholesale from elsewhere. Maybe that makes translations easier? Probably not, but it's a fun thought.

My favorite example of that was a sign at an airport. Marking a spot off for taxis and rideshares. The latter was spelled out phonetically in Cyrillic. I was a little surprised they didn't use the Russian words for "ride" and "share".


A big issue, already mentioned in this thread, is when ignorant people start to reimport words from other languages (English these days), and that catches on to some extent, even though educated people have been using some proper term for decades or centuries.

You may end up having multiple conflicting ways to describe the same thing or even a person. A few days ago I saw Hades transliterated as Гадеc in some web comic translation, even though he is usually called Аид for many centuries already.


Most people aren’t proficient enough in English. And even if they are, many would prefer to see their native language, just like you would prefer to see a nice GUI instead of MS-DOS.

(Though I personally find most Polish translations of software abysmal, especially the translations into Microsoft Polish, and use US/UK English everywhere.)


Microsoft's German translations at least used to be really good and professional. It's painful to see the decline.


I haven't worked with MS products in ages, especially not in German. But I sure remember the gruesome "Schaltfläche" which unfortunately became the industry's standard translation for "button". And please don't forget about the "Eingabegebietsschemaleiste".

Professional? Maybe. Elegant? Not the least bit.


Gotta convince the corporate Germans to switch to English somehow ;)


Since when?


Misspellings, mistranslations, strings that don't fit in the label/button, harder to google errors in other languages, etc.


These are localization issues. If the GUI was done in a language other than US English from the start, it wouldn’t be an issue. Of course sloppy work is bad; that doesn’t mean it’s inherently bad


What GUI toolkit are you using in 2023 where a widget doesn't adjust its size to the size of its text label?


Literally any fixed-size dialog box ever.

Which seems to still be mostly the norm for apps in any desktop OS for things like preferences/settings dialogs.


There are so many other ways this can happen that it's really not surprising this happens. Like: started 20 years ago with the UI toolkit in English and a junior dev, hence fixed size, then suddenly got a growing customer base in another language and slammed on a translation, only to realize it would be a pretty large effort to track down and change each label to resize. Or even just: things adjusting size can simply be a big no depending on the software (and you'd use truncation + tooltip), because one doesn't want to waste space and/or wants to keep things aligned so it's easier to learn where everything is. Your browser tabs probably don't adjust size to text, just to name one thing.


A lot of UIs just can't deal with a box being resized, so they instead adjust the size of the text, but for languages like Arabic and Japanese this can easily make text unreadable, because characters often rely on finer details to be understood.


Also text that's the wrong way or not ligaturized


Since forever according to the OP


The worst localization bug I'm aware of is the default conversion of string to datetime in MS SQL Server.

If the session language is non-US, the ISO formatted string '1991-02-03' becomes 2nd March rather than 3rd Feb. yyyyMMdd works as expected.
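
A rough illustration of the ambiguity, not of SQL Server itself: the Python sketch below parses the same string under two assumed field orders, year-month-day and year-day-month. The format strings are stand-ins picked for the example; what the server actually does depends on the session's language and date-format settings.

```python
from datetime import datetime

text = "1991-02-03"

# Field order a reader of ISO 8601 expects: year-month-day.
as_ymd = datetime.strptime(text, "%Y-%m-%d").date()

# Assumed stand-in for a session that reads the same string as year-day-month.
as_ydm = datetime.strptime(text, "%Y-%d-%m").date()

# The separator-free form, parsed as year-month-day.
compact = datetime.strptime("19910203", "%Y%m%d").date()

print(as_ymd)   # 1991-02-03
print(as_ydm)   # 1991-03-02
print(compact == as_ymd)
```

Same input, two different dates, and no error to warn you which reading was picked.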


I was going to say since the invention of ASCII in 1963, but there were encodings before it and arguably you could say since the invention of Morse code in 1844.


There we go, Google's TLD fiasco resolved. Thank you, Microsoft!


This feels like such a Microsoft thing to do


No shit. In the Spanish version, the column in Explorer showing the 'link target' of shortcuts is translated to the equivalent of 'connect the goal'.


I thought AI would fix the context-awareness of automatic translators, but apparently not.


I guess it was a simple "search/replace", no intelligence involved, human or artificial.
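
A sketch of how a blind replace could produce this, purely as a guess at the mechanism: the one-entry glossary and the sample strings below are made up for illustration and are not taken from Windows.

```python
import re

def naive_localize(s: str) -> str:
    # Hypothetical US-to-UK glossary entry applied with no regard for context.
    return re.sub(r"zip", "postcode", s, flags=re.IGNORECASE)

def phrase_localize(s: str) -> str:
    # Slightly safer variant: only rewrite the full phrase "zip code".
    return re.sub(r"\bzip code\b", "postcode", s, flags=re.IGNORECASE)

samples = ["Enter your ZIP code", "Compress to ZIP file"]
for s in samples:
    print(naive_localize(s), "|", phrase_localize(s))
```

The naive version turns the second string into "Compress to postcode file" and also mangles the first into "Enter your postcode code"; the phrase-level version leaves the file-type label alone.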


There are keys for locks (Schlüssel*) and keys on keyboards (Tasten*). Guess which word VMWare uses for its special keys menu in the *German translation. :-)


All software is buggy. By translating it more bugs will be introduced. So I try to avoid using any translated software.

I remember in Windows 95 (that's been a while) a little binary, probably ipconfig.exe was broken in the Finnish version we had on our home computer. I copied the binary from the English version we had at work and it worked.


Related: Is there already a .postcode toplevel domain?


In Japan, they are simply called .prefecture


Windows localizations have been getting worse and worse over the past decade and a half. And now apparently they are further degrading to automated translations? Who could possibly think this is a good idea?


>I like a clean install and ripping out all the non-English languages takes a while. I also trawl through programs and extensions and get rid of the unnecessarily installed [for me] languages. Additionally, any OS images - backgrounds, MS logos etc. get reduced to 10*10 pixels. I don't need eight versions of the same wallpaper. And as for themes...


Can’t wait for 7postcode x64 installer!


What's very funny to me is that there is a very important "file" called the postcode address file in the UK, which is Royal Mail's mapping of postcodes to addresses.


I especially object to the case of zip-code being used internationally, because this isn’t a case of a sensible alternative usage, but rather one very specific US term being interpreted as universal.

Please, for internationalisation's sake, use the generic term ‘postal code’ in English, then regionalise as appropriate (e.g. Post Code in the UK, Eircode in Ireland, Postal Index Number in India, Zip code in the US, etc.)


I was browsing Airbnb the other day and noticed a sentence in a listing containing “P[SENSITIVE CONTENTS HIDDEN]”. Turns out the word was “Pagoda”.


Given the .zip domain kerfuffle maybe we should just accept this and change all archive software to save .postcode files in the PKPOSTCODE format.


I am skipping Windows 11, but have no idea what could possibly force Microsoft to make Windows 12 user-friendly again.


They shall implement the Windows 2000 GUI and Control Panel.


Fun fact: the Windows 2000 Control Panel (debuting in Windows 95, iirc) was largely HTML-based. Everything old is new again.


Not really? Windows 95 pre-IE4 has no HTML stuff in the shell, so the core Control Panel UI and applets dating back to then are native. The main HTML bits in Explorer of that era, once IE was glued in, were the waste-of-space side bars. I guess some newer applets might be HTML, but the main ones weren’t.


A mass switch to Linux.


If only that was on the cards. MS really needs something to give them a good kick up the arse so they stop pumping out crummy software. Sadly I don't see it happening any time soon.


Is it the Year of the Linux Desktop?


Fix all the bugs they introduced by trying to rewrite stuff that worked fine like the taskbar, stop changing functionality that works like the taskbar, remove the telemetry and need for an MS account, and give back users control over updates.

That's basically it.


guess they've got their amazing AI doing the translations now


Windows is a goldmine of horrible translations. Since the introduction of the Windows Fax & Scan tool, the Dutch version has used "Zoeken" for the button that starts the scan. This translates back to "Search". In Dutch we actually just call it scan, just like in English.


Reminds me of the Scunthorpe problem :-)



I'm late to this discussion, but a datapoint:

On the two Windows 11 machines that I have access to, one Pro and one Home, and both set to UK English, the file compression option says "Compress to Zip file". I've never seen "postcode file" in the Windows UI.


This indeed seems to be an old thing:

https://www.reddit.com/r/CasualUK/comments/12cwylk/microsoft...

It's odd that apparently nobody on HN bothered to personally check this story, except you. (Is there a lesson somewhere for the coming wave of AI fake news?)


Reminds me of this, where they similarly inappropriately went for ‘ticked’ instead of ‘checked’: https://twitter.com/jawj/status/1547894728447778818


This is an example of why everything, including Windows, is trying to move over to icons and just icons for everything.

Don't need to translate/localize anything when there's no text to translate/localize. Presumably that's easier.


I hate that design trend. Text is far faster to recognize and clearer than icons in most cases.


I wonder if we could get a sane Windows experience if ReactOS was a little bit more advanced in its drivers support. Microsoft's incentives with Windows are clearly not aligned with the customers'.


It was less of a priority than putting the Kardashians on the Start Menu.


I've heard from sources that this ticket has been sitting stale at MSFT for at least a couple weeks now.

Surely they could have avoided the embarrassment of pushing this to prod?


How has this happened? It was always correctly called "zip file" in Windows <11 - did they not just copy existing translations over to 11?


I've seen a lot of translations breaking lately also in Windows 10 - I guess someone is just eager to rework a lot of things that have been working perfectly to get a raise.


It's not a trash can but a rummage bin!

And I love how the English called filets fill-its but when it comes to eggplants it's aubergine!

What a colourful language.


I've never heard of a rummage bin, maybe you misheard "rubbish bin"? I prefer not to have to rummage through it!


A rummage bin is where they put the discounted items in shops.


That's a bargain bin


Yeah my bad it's rubbish bin. Phone posting is a curse.


So... how many crisps does your motherboard have?


This is the most 2023 answer possible but I wonder if LLM translation will help handle this sort of situation better.

An LLM first translates to tokens in more of a "concept space", then translates to other languages. So it would translate "zip file" to something completely different than "zip code", and translate to UK English correctly, or at least better I would hope.

They may not solve all problems, but maybe this one?


There's like two main ways this kind of mistake happens.

One is a programmer using the wrong string ID. Probably a mistake when searching a translation file/db.

The second is missing context. The word "zip" has multiple meanings in software, so you need more information to disambiguate when translating. "Zip code" and "zip file" are probably sufficient on their own, but you could imagine them also having a templated translation for file types like "{} file".

I could see an unspecialized LLM translation working well for whole sentences or paragraphs, because that kind of text isn't really likely to run into either of these problems. And if you have the context you need to avoid the second problem for small snippets, then human translation is also pretty unlikely to fail.
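
To make the second failure mode concrete, here is a minimal sketch assuming a catalogue keyed by (context, source string), in the spirit of gettext's message contexts. Every name and entry below is invented for illustration; this is not how Windows stores its strings.

```python
# Hypothetical en-GB catalogue keyed by (context, source string).
CATALOGUE = {
    ("address-form", "Zip"): "Postcode",
    ("file-type", "Zip"): "Zip",
}

# A context-free catalogue can hold only one answer per source string.
FLAT = {"Zip": "Postcode"}

def translate(context: str, source: str) -> str:
    # Fall back to the source text when no entry exists.
    return CATALOGUE.get((context, source), source)

def file_label_flat(kind: str) -> str:
    # Templated "{} file" label whose argument is translated without context.
    return "{} file".format(FLAT.get(kind, kind))

def file_label_ctx(kind: str) -> str:
    # Same template, but the argument is looked up in the file-type context.
    return "{} file".format(translate("file-type", kind))

print(file_label_flat("Zip"))  # Postcode file
print(file_label_ctx("Zip"))   # Zip file
```

The template itself is fine in both cases; the difference is only whether the lookup for the piece being substituted knows it is naming a file type.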


So, such an LLM would still have to be informed that this is referring to a zip compressed archive, not a postcode file, whatever such a file might be. This is basically the same thing as LLM hallucination, in that it comes from a lack of constraints in latent space, if I dare use that buzzword.

Without context, a “zip” or “zip file” still can be anything. Could even be “zip [this] file” as a command.


This is, in fact, the task that transformers (the technology currently branded as "LLMs") do well. It was the task they were invented for: https://arxiv.org/abs/1706.03762


Why would you even localize an American operating system for the UK?

Most of the concepts in an OS are named by Americans.

If I were using a UK-made OS, I'd happily live with "dustbin" rather than a "trash can". Unless you have your head inserted in your ass, you know some UK words if you're North American and vice versa.


Apple and Google do the same, calling the Recycle Bin/Trash just "Bin".


Must be machine translated and then not looked over by a human. Actually sad.


It’s not even the same word: one is a word, the other is an acronym.


That's hilarious. Talk about AI/automation gone wrong.


It is rather obvious they don't give a flying flamingo about it.


In England they call a zipper a postcodemonger.


Not going to lie, this is quite cute


That's _AWESOMELY_ British.


Rather Microsoftish. Similar things are commonly happening in Windows translations to other languages.


Epic. L10n via sed.


This is true AI!


Well, my take here is that the English language should be renamed to "American", and Brits should be forbidden to speak it, because they make it sound ridiculous.

Honestly, speaking as a foreigner who has been learning English one way or another for half of his life, I cannot even attempt to speak in a British accent without feeling like I'm doing stand-up comedy. How can Brits speak with a straight face?


How can you type in English with your foreign accent? I can barely make out a word, and your pronunciation is terrible.


Sigh ... unpostcodes

(from Twitter)



