North Yorkshire Council to phase out apostrophe use on street signs (bbc.co.uk)
166 points by IMSAI8080 7 months ago | 221 comments



As far as I can tell, this isn't an issue with the specific database itself, but the standard they are required to record geographic data in, which the end of the article mentions as "BS 7666".

https://www.agi.org.uk/wp-content/uploads/2020/11/BS7666Guid...

    # Standard data types used in BS 7666
    CharacterString: a sequence of alphanumeric characters
    ...
If I had to guess, alphanumeric is interpreted as [0-9a-z].

The sign printer probably expects this format when printing signs for the government, or worse, has a contract that says the government must provide this standard format for the sign information.

So it's just a government mandated database schema... I don't think that's any better of a reasoning though lol
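
To illustrate that guess, here is a minimal sketch of how a validator built on the strict reading would behave. The pattern, the function name and the sample street names are assumptions made up for this example; nothing here is taken from BS 7666 itself or from the council's software.

```python
import re

# Hypothetical strict reading of "alphanumeric": ASCII letters and digits only.
STRICT_ALPHANUMERIC = re.compile(r"[0-9A-Za-z]+")


def is_strictly_alphanumeric(name: str) -> bool:
    """Return True only if every character is an ASCII letter or digit."""
    return STRICT_ALPHANUMERIC.fullmatch(name) is not None


# Sample names chosen for illustration only.
samples = ["Broadway", "St Marys Walk", "St Mary's Walk"]
results = {name: is_strictly_alphanumeric(name) for name in samples}

for name, ok in results.items():
    print(f"{name!r}: {'accepted' if ok else 'rejected'}")
```

Under this reading only the single-word name passes, so dropping the apostrophe alone would not make a multi-word street name conform.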


The Council is making the same mistake you are:

> How to use these standards
>
> The UPRN and USRN standards form a machine-readable addition to an address or street record held in a system. When using UPRNs and USRNs you can continue to use existing formats but add a field for these identifiers.

https://www.gov.uk/government/publications/open-standards-fo...


Better not prod Falkirk council, as otherwise they'd have the issue of Bo'ness vs Borrowstounness to deal with.

Similarly Grangemouth apparently has a Bo'ness Road...

A related issue would be places in England named 'by-the-Sea', e.g. Newbiggin-by-the-Sea.

So the signs shall simply have to match reality.


Don’t forget Westward Ho!

https://en.wikipedia.org/wiki/Westward_Ho%21

Edit: ironically HN formatting couldn’t handle the exclamation mark either.


Such a strict interpretation would also exclude spaces.


FORTUNATELYSPACESAREARELATIVELYNEWINVENTIONASARELOWERCASELETTERSANDPUNCTUATIONINGENERAL


A·L·L·S·T·R·I·N·C·S·M·U·S·T·B·E·C·O·M·P·A·T·I·B·L·E·V·V·I·T·H·R·O·M·A·N·S·T·O·N·E·T·A·B·L·E·T·S


*·M·V·S·T·


G·R·A·T·I·A·S


at least over there something would probably be "isle of pen" rather than "pen island"



S'cunt'horpe


You don't want to visit Pen's Island?


The actual standard (well, the previous version, BS 7666-1:2006) does contain that wording, but also says:

> Abbreviations and punctuation shall not be used unless they appear in the designated name, e.g. “Dr Newton’s Way”, and only single spaces shall be used.

Given that "alphanumeric" is vague, not defined, and prominently contradicted, I'd say it's quite clear that they can't really blame it on BS 7666.

The council probably have some horrible CSV-infested GIS workflow, and have decided to change reality to avoid the bugs.


Ahhh I was looking for something like that! Good find.

So, even worse than imagined... it's an ambiguous standard blamed for a bad implementation of that standard.


BS in "BS 7666" stands for British Standard and not anything else.

If the standard restricts street names to those certain characters, maybe it really is BS.


The '666' portion doesn't inspire confidence either.

We love apostrophes so much we have them on our supermarkets. If they're not there we add them.

e.g. I'm going to Sainsbury's

e.g. I'm going to Tesco's

...despite the fact that its real name is plain old Tesco.


That's because you're not going to Tesco the registered company, you're going to (one of) Tesco's (shops). It's just traditional and logical grammar, not misnaming.


This is not true. We don't apply this "rule" to any other establishment, e.g. people don't say "I'm going to Burger King's" or "I'm going to Costa's".

There's nothing to analyze: "Tesco's" is just wrong. It's not the name of the business.


> people don't say "I'm going to Burger King's" or "I'm going to Costa's"

Oh yes they do!

> There's nothing to analyze: "Tesco's" is just wrong.

It’s kind of funny that there are prescriptivists on both sides here, all convinced that they’re right.


In 40 years I have not seen such usage in print


People say it.


You may not be wrong that people do say it; I'm sure people say pretty much anything across this vast planet. Yet people also use double negatives when they mean a negative. That doesn't make it right.


I think language is defined by real-world usage, not by the logical structure that we theorise underpins it. If so, double negatives actually are right, or at least they can be, if they successfully communicate meaning.


You're not wrong.


Interesting observation! Might it be because Tesco and Costa resemble ordinary names, and therefore easily appear in a possessive form?

Or it might be about imaginary hierarchy. There could only be one actual duly-anointed king of burgers, but if one were to use the definite article to mark this, saying "I am going to the Burger King's", it would imply firstly that the king of burgers does actually exist, and possibly also secondly that he has but one solitary burger outlet.

A grammatical construction not quite so jarring when used with (say) "Tesco".


Pasting my reply to someone else rather than rewrite the same thinking:

I agree it's not consistent, but who ever accused English of being that?! My point isn't that all business names are treated that way, just that for the ones that are, the reason is grammatical tradition, not (for the most part) people who incorrectly think the shop is called "Tesco's" or whatever.

(But as others have replied to you, it's also more common than just Tesco, definitely including "Costa's" for lots of people.)


Costa’s I for sure have heard. Burger King’s I agree though.


lol… I am glad for you that you haven’t heard those yet.


Thank you, I'll remember that next time I pay a visit to Sainsbury's's!


I agree it's not consistent, but who ever accused English of being that?!

My point isn't that all business names are treated that way, just that for the ones that are, the reason is grammatical tradition, not (for the most part) people who incorrectly think the shop is called "Tesco's" or whatever.


I agree swores. Your comment did say 'traditional' and my comment was facetious.

There's been an historical transition from small chains owned by individuals (e.g. the Victorian Mr John Sainsbury) to big brands (e.g. Superdrug), hasn't there.

The possessive apostrophe was appropriate for the former but surely less so nowadays. I would guess "Sainsbury's" was a rebrand intended to reflect tradition.


Bleh, I might've realised you were kidding had I gone to Starbucks's this morning


Up in the northeastern USA there was a discount retail (department store) chain called Caldor:

https://en.wikipedia.org/wiki/Caldor

My mom used to buy small appliances and Christmas gifts there -- and she always called it Caldor's.

Clothes, she usually bought at Marshalls -- which hasn't had an apostrophe since 1974.


Fun fact, Lotus (Tesco's Thailand stores) have an erroneous apostrophe: "Lotus's" ...

https://upload.wikimedia.org/wikipedia/commons/8/8b/Lotus%27...


I say Sainsbury's because the name is exactly that. I don't say Tesco's because the name, as you say, is Tesco. I would guess those who say Tesco's (and maybe Asda's) are just getting confused because of Sainsbury's.


Sounds like this Sainsbury person is responsible for spreading an apostrophe infection to Tesco.


Marks and Spencer are studiously looking the other way, innocent expressions on their faces.


I believe standards are the minimum, not the maximum to implement, don't you think so?!

Why do those who cannot comprehend the purpose of standards dare to implement, and explain, them?


Standards are just that. You can say that standards are the 'minimum, not the maximum', but if the standard says only alphanumeric characters are permitted (and it turns out this isn't a very well written standard, so there's a load of discussion possible on this point), an implementation that allows non-alphanumeric characters is wrong.

As a thought experiment, how far beyond the 'maximum' is acceptable? Latin letters with diacritics? Cyrillic letters? Arabic letters? Chinese logographs? Emojis? Would you expect all systems which are standard-compliant to be able to handle all of the above?


The UK includes Wales and Northern Ireland, both of which have place names that include diacritics. Whether or not Wales and Northern Ireland choose to follow this particular British Standard, I don’t know. Some examples here:

https://english.stackexchange.com/questions/111148/do-any-uk...


Yep, that just emphasises how stupid the standard is. Others have mentioned Westward Ho![0] which is in England and contains a non-alphanumeric character.

For Wales for sure, often there are two versions of the place name, the Welsh, and an English butchering of it (see for example Pont-y-pŵl, which is spelt Pontypool in English), so I suppose it would be the English thing to do to simply pretend the Welsh/Irish spelling doesn't matter, and only use the English spelling (a little tongue-in-cheek from my side, but sadly likely the reality).

However, I suspect that whoever authored the standard was just sloppy and wrote 'alphanumeric' without giving it any careful thought.

[0]: https://en.wikipedia.org/wiki/Westward_Ho!


You can print apostrophes on a street sign without any database issues because a street sign doesn't interface directly with a database. At least not yet...

All the technical issues here have already been solved a hundred times, there's plenty of other options. It's a little worrying that we're eliminating punctuation in real life because of issues with integrating with geographical databases.


Interestingly, the OpenStreetMap project considers road signage to be unquestionable "ground truth". If the sign changes, the map changes. We can't use any other database because that might not be available under a compatible licence.


Just because something is written on a road sign does not make it magically copyright or license free.

I know a business which has to pay a nominal $1 license fee for the names of its own stores to a map maker, simply because the business has lost any evidence that it had the names before the map maker put them in the map.


> Just because something is written on a road sign does not make it magically copyright or license free.

That's untrue; facts can't be copyrighted. Whether there's a road sign or not, the name of the road is not subject to copyright, but if there is a road sign, then it's really easy to prove that your claim about the road is an uncopyrightable fact.


Petty clarification: single facts can't be copyrighted, but collections of facts (e.g. sports scores) can:

https://en.wikipedia.org/wiki/Copyright_in_compilation


That doesn't apply here; no matter how much street name data is collected, no amount of the collection, including the entire thing, can be copyrighted in a way that would impose any restrictions on OpenStreetMap.

From your link, which is quite short:

> such copyright may exist when the materials in the compilation (or "collective work") are selected, coordinated, or arranged creatively such that a new work is produced. Copyright does not exist when content is compiled without creativity, such as in the production of a telephone directory.

So you'd have to ask yourself, "was any creativity, of any kind at all, required in order to call this street by its own name?" And the answer is even more obviously "no" than in the paradigm case in which the telephone directory can't be copyrighted because it consists of facts (involving no creativity) in an externally specified order (alphabetical) in a collection specified by an external rule ("everything is included").


On that page:

> facts are not copyrightable

...compilations are, but you're obtaining a copyright on the compilation, not the facts.


No they aren't, but there is such a thing as a database right which is separate from copyright. This is why OSM changed from a Creative Commons licence (a copyright licence) to ODbL (a database licence).


> but there is such a thing as a database right which is separate from copyright

That's purely an unforced error on the part of OpenStreetMap. They are incorporated in England, which recognizes a database right. But there is no reason for that. In general, there is no such thing as a database right.

And this isn't even relevant to londons_explore's point; he is stupidly arguing that even if the information is available on a road sign, the public cannot use it because it might be included in a protected work somewhere. That is obviously untrue; the availability on the road sign does in fact automatically mean that inclusion in a protected work is irrelevant.

If you see a sign giving you the name of a road, and then publish information to the effect that the road's name is what is printed on that sign, no database was involved at any point, and a database right cannot apply. All you have is a bare fact.


I haven’t checked but how do they handle streets if the signs have two different spellings? There was a street growing up that was spelled two different ways and I still don’t know what the right one was!


My favourite example is a small road in Killarney called The Hahah in English but in Irish written "An Fhaiche" in one place and "An Háhá" in another. OpenStreetMap seems to designate the first one as name:ga and the second one as alt_name:ga.
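
As a rough sketch of how a consumer of that data might use such tags: the `name:ga` and `alt_name:ga` values below are the ones reported above, while the plain `name` value and the `pick_name` helper are assumptions invented for this example, not part of any OSM tooling.

```python
from typing import Dict, Optional


def pick_name(tags: Dict[str, str], lang: str) -> Optional[str]:
    """Prefer the language-specific name, then fall back to the generic name."""
    return tags.get(f"name:{lang}") or tags.get("name")


tags = {
    "name": "The Hahah",       # assumed default name, not checked against OSM
    "name:ga": "An Fhaiche",   # as reported in this comment
    "alt_name:ga": "An Háhá",  # alternative Irish spelling seen on another sign
}

print(pick_name(tags, "ga"))  # Irish name is tagged, so it is used
print(pick_name(tags, "cy"))  # no Welsh name tagged, falls back to "name"
```

The alternative spelling stays available under its own key, so neither sign has to be declared wrong for the data to be usable.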


This is a recurring problem in Ireland due to nationwide local government incompetence in using the Irish language.

Centuries old Irish language place names get replaced with bastard gaelicised versions of their English names, and now you’ve a mishmash of signage all over the place. Often the new names are just an invention of the council that sort of sounds right.


This is very true, but there is also an issue of which Irish orthography to use, right? Place names are very conservative, but modern speakers would be more used to O'Donnell's dictionary's variants than to ones from Dinneen, basically. Paradoxically, English forms sometimes give hints as to the correct pronunciation.


This tagging is most likely the result of a single OSM editor's decision. Such decisions are quite often wrong, suboptimal, or not in line with the guidelines.

It is always better to consult the extensive OSM wiki to figure out how something should be tagged rather than to work it out from existing examples.

In the case of names, here is the relevant guide: https://wiki.openstreetmap.org/wiki/Names


Naming is complicated. OSM community invented approaches to capture lots of nuances.

https://wiki.openstreetmap.org/wiki/Names


The rules are as open to change as the data I guess, but if there really isn't one obvious "proper" name I've seen things named with a slash like "name 1/name 2". Often there is specific tagging to overcome this, though. For example in bilingual countries like Wales many roads have an English and a Welsh name. The language-specific names are uncontroversial, but the overall "name" (for the international map) is supposed to be in the "local language". As you can imagine this isn't always uncontroversial.


Don't know if it is still an issue but Atlanta used to have problems with the in-house sign shop producing street signs with misspelled names. Apparently, boulevard in particular was difficult for the city to get correct which was a problem since there's a major road simply named "Boulevard". Maybe it got corrected after one of the local news channels did a segment about it but does highlight the fact that road signs sometimes aren't authoritative.


You are talking about the country where accounting systems cannot add numbers...

And people believe anything that comes out of a computer regardless of thousands of signs that something is really not right.

They would rather prosecute others on a mass scale, without any doubt, than examine themselves.

(see the Post Office scandal)


This routinely happens to names. Even now you get systems which refuse to deal with O', let alone non-English characters.


It seems like it is yet another example of software not written to requirements so now the requirements are adjusted to fit the software implementation.


Apparently, this is because of a standard they're required to conform to, not database software in specific:

https://news.ycombinator.com/item?id=40265929

> As far as I can tell, this isn't an issue with the specific database itself, but the standard they are required to record geographic data in, which the end of the article mentions as "BS 7666".

On the other hand, you’re naïve if you think English hasn’t already been simplified to fit on machinery such as typewriters and cheap printing presses. This process began long before computers.


The Linotype machine supported fl, ff, ffi, ℔, œ, and æ. See https://upload.wikimedia.org/wikipedia/commons/4/46/Linotype... linked to from https://en.wikipedia.org/wiki/Linotype_machine. Or see https://archive.org/details/LinotypeKeyboardPractice/page/n9... .

Even cheap printing presses could handle more than you think. Here's a type case layout from 1846, again with æ, œ, fl, ff, and ffi, fi and ffl. https://archive.org/details/printingapparatu00holtrich/page/... .

That book is for the "Parlour printing press [which] was invented by Mr. Cowper for the amusement and education of youth, by enabling them to print any little subject they had previously written, provided the printing did not exceed in size the dimensions of an ordinary duodecimo page, which measures about 5 inches by 3 inches."


Did it support þ? That would be the best-known example of written English being changed to accommodate a printing press rather than the other way around.


No. But that doesn't really fit since þ had mostly died out by the 1300s, with only a few remaining uses by Caxton's printing - long before the 1800s.

I also wouldn't call those typewriters or cheap printing presses.


Is there something relevant about the 1800s? I said þ was an example of the written language changing to accommodate the printing press, not changing to accommodate the 1800s.


I had responded to msla's comment about "typewriters and cheap printing presses". Those didn't exist in the 1500s.

þ's disappearance from English is not due to either, though the lack of þ in available type faces was certainly an issue.


But the reference to "cheap printing presses" has to be interpreted as actually being a reference to type, as you have already done with your archive.org link. The press itself cannot either support or fail to support any graphical forms; there's no difference in the press whether you're using type or block printing.

Press quality issues are things like "are the surfaces flat" / "how much pressure can the press apply" / "how many pages can we press before running into a mechanical issue".


I disagree. It does not have to be that. I pointed to a printing press designed for kids. That should be much cheaper than one used to print larger sizes or many copies, much less the quality of the Bible in the 1500s.

I do get your point, but I think it's still wrong to use "cheap" to refer to the typefaces available for the Caxton and KJV Bibles. I suspect they were quite expensive.

The physical press is only part of the printing cost. The Linotype typesetting machine made it possible to set an entire line of type, drawing from 90 or so characters. While the press itself didn't care, adding new symbols required manual effort, making it more expensive.


> but I think it's still wrong to use "cheap" to refer to the typefaces available for the Caxton and KJV Bibles. I suspect they were quite expensive.

I don't think this argument quite works; something can be stunningly expensive, in an absolute sense, at the same time that it's the shoddy low-price option people choose for budget reasons.


(Grrr! I've been saying Caxton Bible but I meant Tyndale Bible.)

Sure, it can be, but was it?

The KJV was very expensive as it was. ("Robert Barker invested very large sums in printing the new edition, and consequently ran into serious debt" says https://en.wikipedia.org/wiki/King_James_Version .) That doesn't mean they used a shoddy low-price option.


With the printing press it makes sense, there's physical constraints that make things very difficult if you want to support a huge number of different characters. Those limitations don't exist anymore though.

On one hand we have systems supporting Unicode and just about every character imaginable, it's an example of using technology to go beyond the limitations of the past. You could have a system that supports emojis on street signs, but instead we're going the opposite direction and introducing artificial limitations that are even more restrictive than 500 year-old technology.


I used to live in North Yorkshire until I moved away to start a job in mainland Europe.

When I moved I tried to fill in the form on their website to indicate that I wouldn't be paying the council tax on the house I used to live in anymore. Weirdly I couldn't make the form work, it broke with weird errors about timing out. After some headscratching I decided (on a whim) to change my computer's timezone back to GMT and hilariously the form started working perfectly.

Sadly I couldn't finish filling in the form, because it required a postcode for my new address, and would ONLY accept postcodes which matched the UK format (which my new address, in a different country obviously didn't match).

You shouldn't be surprised about any IT insanity from North Yorkshire Council, they are impressively incompetent.
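
For what it's worth, the postcode part is the kind of check that only needs to look at the country first. A minimal sketch, assuming a deliberately simplified UK postcode shape and made-up sample values; the function name and country codes are hypothetical and this is not how the council's form works.

```python
import re

# Simplified UK postcode shape: an assumption for this sketch,
# not the full official rule set.
UK_POSTCODE = re.compile(r"[A-Z]{1,2}[0-9][0-9A-Z]? ?[0-9][A-Z]{2}")


def postcode_ok(postcode: str, country: str) -> bool:
    """Apply the UK pattern only to UK addresses; accept any non-empty code elsewhere."""
    cleaned = postcode.strip().upper()
    if country == "GB":
        return UK_POSTCODE.fullmatch(cleaned) is not None
    return bool(cleaned)


# Sample values are illustrative only.
print(postcode_ok("HG1 1AA", "GB"))  # UK-shaped code for a UK address
print(postcode_ok("1012 AB", "NL"))  # foreign code accepted for a foreign address
print(postcode_ok("1012 AB", "GB"))  # foreign code rejected for a UK address
```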


Australia removed apostrophes from all official place names back in 1966, leading to names like Surfers Paradise, Princes Highway and Wisemans Ferry. Given the date, I doubt computing was a major consideration though.


And my father-in-law is still upset about it.

It's led to some interesting outcomes - for example, I've seen people write "Princess Highway" a fair few times, presumably on account of the official spelling ("Princes") falling confusingly somewhere half-way between the more usual "Prince's" and "Princess".

In day to day use, I don't think most people actually use the apostrophe-free official spellings, at least for names comprised of ordinary English words (Prince's Hwy, Surfer's Paradise), but it might be more of a free-for-all with proper names (Wiseman's Ferry or Wisemans Ferry).


I have never seen anyone write "Prince's Hwy" (or "King's Hwy" for that matter), the apostrophe-free version would be far more common.

The "Princess Hwy" thing is also common I think because the pronunciation most often used sounds a lot like "Princess", particularly when it's run right into "Highway" without a gap.


> I have never seen anyone write "Prince's Hwy" (or "King's Hwy" for that matter), the apostrophe-free version would be far more common.

I don't know what to tell you; I've seen it often. I'm talking day to day here, text messages and handwritten directions. If you're thinking of things like road signs, those will obviously follow official usage.

> The "Princess Hwy" thing is also common I think because the pronunciation most often used sounds a lot like "Princess", particularly when it's run right into "Highway" without a gap.

Obviously that is so, but I would suggest that the reason 'Princess Hwy' is comparatively so much more common than other similar mistakes (e.g. 'sea' for 'see' etc.) is because the official name (Princes Highway) feels so unnatural in English that it fails to act as a meaningful standard for usage.


Alternative solution: put the e back in that the apostrophe is standing in place of. St Maryes Place


There would be little choice with St. James'.

St. James means something else.

So unless they go for St. Jamess it'd have to be St. Jameses.


This seems silly for the reason they're doing it, in that a modern database should be able to handle characters with a little sanitization.

However, it does seem like it could be helpful when it comes to satnav applications to remove ambiguity. Google's going to autocorrect most of the time anyway, but this way, you're less likely to run into an issue where it takes you to Kings Landing in the wrong town because you didn't type King's Landing in the town you meant.

Sure, they tell you what town you're looking at, but I can't be the only one who's quickly typed in a destination and didn't take the time to double check and ended up driving to the wrong location for something. For some reason all of the hockey rinks near me have almost identical names...
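
On the matching side, the ambiguity can also be reduced in software without touching the stored name. A minimal sketch, assuming a hypothetical `search_key` helper; this is not a description of how Google or any particular satnav does it.

```python
import unicodedata


def search_key(street_name: str) -> str:
    """Build a comparison key that ignores case, apostrophes and extra spaces."""
    text = unicodedata.normalize("NFKC", street_name).casefold()
    # Drop both the straight and the typographic apostrophe.
    for apostrophe in ("'", "\u2019"):
        text = text.replace(apostrophe, "")
    # Collapse runs of whitespace into single spaces.
    return " ".join(text.split())


stored = "King's Landing"   # name as it appears on the sign
typed = "kings  landing"    # what a hurried user might type

print(search_key(stored) == search_key(typed))
```

This only helps with spelling variants; it does nothing about two towns that each have a street with the same name.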


>but this way, you're less likely to run into an issue where it takes you to Kings Landing in the wrong town because you didn't type King's Landing in the town you meant.

That's pretty uncompelling. Should we also get rid of the letter s to avoid mix-ups between Kings Landing and King Landing?


Devils advocate: the apostrophe isn’t pronounced. The s is.


England would have to do some serious work to get rid of all the unpronounced letters in their place names.


"This seems silly" is the nicest way to describe government standards that were already outdated in the 90's


I have a street in a town near me which grammatically should be called St. Thomas' road. However, the street signs call it St. Thomas road at the north end and St. Thomas's road at the south end.


Not quite related to apostrophes, but I feel the need to point out my favourite street name of all time (as read on a street sign): St John St (-> Saint John Street)


I've seen Doctor King Drive (Dr King Dr) in a few different towns too. One of them typographically differentiated it like "DR King Dr" though.


Formally, that should be “St John St.”


I would say that on average, the street name is correct ;-)


Thomas' is not grammatically correct in any version of English that I know. It's not plural. There is no special rule for that. Both the street signs you mentioned are at least grammatically coherent


When a word ends with an s, the use of an apostrophe without another s is valid English.

Thomas’ and Thomas’s are the same thing.


I actually had to look this up, it depends if the possessive form is actually said with one or two Ss. For instance, it's Jones's and Bridges'.

https://grammar.collinsdictionary.com/easy-learning/what-are... https://www.sussex.ac.uk/informatics/punctuation/apostrophe/...


lol no. arguably the word ain't is more proper English than Chris' or boss'


It is trivial to find more than sufficiently authoritative sources that cover the rules that make "Chris'" and "boss'" perfectly valid contracted possessives in English. [1]

However, it's English: there isn't just one rule, another rule can also be valid and might be the one you're familiar with on a day to day basis. That doesn't mean any other way to say or write the same thing is wrong, it's just a pattern you never saw. Like someone going "lol snuck isn't a real word, it's sneaked!" and then you hand them a dictionary and they learn something new about their own language.

[1] https://www.ox.ac.uk/sites/files/oxford/media_wysiwyg/Univer...


Page not found while accessing your link.


You pronounce both of your examples with two S phonemes at the end. Putting that in the written form is absolutely Ok.


> You pronounce both of your examples with two S phonemes at the end.

Um... what? Pronouncing a possessive suffix with /səs/ isn't valid anywhere. The only possibility is /səz/. Same goes for the plural suffix.


Being confident doesn't change the fact that you're misinformed about English grammar.


Lol yes though?


Confidently wrong.

> Some writers and editors add ’s to every proper noun, be it Hastings’s or Jones’s. There also are a few who add only an apostrophe to all nouns ending in s.

> ..One method, common in newspapers and magazines, is to add an apostrophe plus s (’s) to common nouns ending in s, but only a stand-alone apostrophe to proper nouns ending in s.

> Examples: the class’s hours; Mr. Jones’ golf clubs; The canvas’s size; Texas’ weather


"Some writers" say that the French Foreign Legion has been deployed to the front lines in Ukraine too. We must take this very seriously, then!

Some writers write things like Four Fat Harvard Girls Lose Book Bag too. They use sentence fragments. They try to save ink by doing weird shit. Professional Buzzfeed writers write AF (yes, in caps) to mean as fuck. The Atlantic used the words electroöptical and rôles in 1940. Just because some minimum-wage burnout or penny-pinching editor breaks a rule doesn't mean that the rule doesn't exist.

If you go around the office saying that someone drank out of the boss' mug, they'll think you're fresh off the boat. Not only is it wrong in written English, it's not even accepted colloquially in spoken English, anywhere. And so it makes perfect sense that the written form would reflect the pronunciation.

Saying Texas' weather out loud just confuses people into thinking you're using it as an adjective when what you're really doing is trying to sound smart when you're actually sounding dumb. If you point at a book and say, that's Chris', you sound like you have brain damage. How is the book Chris'? Chris is a person, not a book! The only reason that people don't correct you is that they're being polite. And people misspell words all the time and the world doesn't cave in. That doesn't imply any particular thing about English grammar.

Another commenter found that you can say Jeff Bridges' because this is an irregular case to avoid saying the same sound twice—an exception which proves the rule (and also, I don't think it's irrelevant at all to point out the fact that Bridges is literally a plural noun made into a name). But Thomas is decidedly not in this narrow category. His source even uses Thomas' as an example of what not to do, lol. Normally I wouldn't dumpster someone this hard but hn rate limits so I may as well lengthen my response. Nothing personal.


Rules, especially in English, are not grounded in any agreed-upon authority and never have been, tracing all the way back to Chaucer more-or-less codifying the written form of the language by simply writing one story that then became the most popular English printed work for a generation.

Try not to lean too hard on how other people use the language just because it ain't how you use it. Makes you look outta touch with the way folks are playin' around with one of our shared human comms protocols, neh?

> but hn rate limits

Not in general. Only if you've proven yourself to be a poster for whom the mods think that rate-limiting you improves the health of the discourse 'round these parts.

Being someone who is also rate-limited. ;)


> it's not even accepted colloquially in spoken English

How the hell would you hear the apostrophe in "spoken English"?


That's exactly his point. You don't hear the apostrophe but you do hear the "s," meaning that Thomas and Thomas' cannot be distinguished. And so Thomas's must be used instead.


The spoken and written language are not the same thing. Even if you say "Thomas's," sources disagree on whether you write "Thomas's" or "Thomas'", because the latter is more consistent with the rules for other ends-in-s words and, therefore, easier to remember.

(My personal prediction: give it 100 to 200 years and we're going to drop the trailing 's' in all these cases. "Cat'" will just be pronounced "cats" and understood to mean "an adjective indicating the noun is owned by the cat").


As the OP said, "Thomas'" is pronounced "Thomases".

"Thomas's" means "belongs to Thomas". Pronounced the same, but spelled differently, because it is a different word.


Thomas’ is pronounced the same as Thomas’s.


Now you've provided an explanation I can see you're right (but downvoted). We would say Thomas-es in the possessive so it's written Thomas's.


It is easy to observe that this "rule" is false. Even though everyone pronounces it "thomases", some spell that "Thomas's", others spell it "Thomas'". It is a purely stylistic spelling difference, and both forms are in common use, in literate environments. So, there is no one rule about how this word is spelled. And since neither form reflects the pronunciation, both are purely conventional, they don't have a much deeper meaning to lean on.


You’re wrong. It’s definitely correct, but even native English speakers get the Rules of Apostrophe wrong all the time.


It's an older rule that's falling out of style, but it is real. Until 2017 Thomas' was correct by the Associated Press stylebook.


I do like Thomas'® English Muffins though.


As if databases were not able to deal with apostrophes or other special characters... Yes, you have to sanitize your queries, but you have to do that anyway. Client applications will of course have to be smarter.


As someone with an apostrophe in my name, it has been my experience whenever I come across this sort of thing, you can be sure the project is crap.


It's like the stories of people with the last name "Null" who get errors when trying to enter their name into websites. If that's true then I don't want to think about how poorly built (and insecure) those systems must be.


Legally change your name to have \’ such as D\’Armond.

That should break untold numbers of systems.


Lots of countries limit what you can name yourself. I very much doubt having a special apostrophe in your name can be a legal requirement.

Either way, apostrophes in names are quite uncommon in many parts of the world, and they are very likely to be ignored even on paper forms in those areas, if they are even allowed.

In general, each area has certain limits on what kinds of names it allows/understands, and it is up to the minority to adapt one way or another. It's very reasonable to want an Irish or maybe even British system to recognize a name like O'Reilly, but it's not really something you can expect of a Japanese system. Just as much as you shouldn't expect a name like 田中 to be recognized in France.


This results in amusing side-effects, like buying a house requiring you to sign every variation of your name that the credit check found, but it can also get you multiple “one per person” signup options, so there’s that, too.


H2 offers quite a comprehensive solution for dealing with this:

> [H2] provides a way to enforce usage of parameters when passing user input to the database. This is done by disabling embedded literals in SQL statements. To do this, execute the statement:

> SET ALLOW_LITERALS NONE;

> Literals can only be enabled or disabled by an administrator

https://www.h2database.com/html/advanced.html


I would argue if you sanitize your input you are already doing it wrong, you should parameterize queries and send the data entirely separately from code.
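
A minimal sketch of what that looks like, assuming Python's built-in sqlite3 module and a hypothetical `streets` table (the table, column and sample names are made up for illustration). The `?` placeholder passes the value separately from the SQL text, so an apostrophe is just data:

```python
import sqlite3

# Hypothetical in-memory table of street names, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE streets (name TEXT)")

sample_names = [
    "St Mary's Walk",
    "King's Road",
    "Robert'); DROP TABLE streets;--",
]
# Values travel as parameters, never spliced into the SQL string.
conn.executemany("INSERT INTO streets (name) VALUES (?)",
                 [(n,) for n in sample_names])

def find_street(name):
    """Look up a street by exact name using a bound parameter."""
    cur = conn.execute("SELECT name FROM streets WHERE name = ?", (name,))
    return [row[0] for row in cur.fetchall()]

def count_streets():
    """Return how many rows the table holds."""
    return conn.execute("SELECT COUNT(*) FROM streets").fetchone()[0]
```

With this shape there is no escaping step to get wrong: the hostile-looking third name is stored as an ordinary string and the table survives.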


from a certain perspective, parameterization could be seen as sanitation, no?


If it sanitizes anything, parameterization sanitizes the code, not the data, and has much lower impact on the outside world (because the rest of the world isn't pressured to rename things in the real world to fit arbitrary constraints in the computer).


I think part of the problem with apostrophes is also that there's two characters for it. ' and ’


The Hawaiʻian ʻokina symbol begs to differ...


My keyboard has at least 3 already: ` ´ and ' ...

I guess there are lots more in other languages...


The first two are accents, and to me it always looks extremely unprofessional when they are abused as apostrophes.


Right single quote is not less-correct than the neutral character.


And now they've added the problem of some roads having two names. Such as the example in the article's first photo.


There’s not though. My iPad gives this one, but that’s probably the fault of Apple thinking they know better.


The best part of this article is:

Ruby Wang... did not mind the changes. "To be honest with you, because I'm not from this country it doesn't matter because it's the same pronunciation," she added.


This could be from The Onion.


And then the photo of her looks exactly like my wife when she doesn’t give a shit about my bullshit


Looks like a much easier solution than having a couple people learn how to escape a string and prepare a sql statement.


My local (US) county has a web service to look up property deeds and titles.

They "solved" this problem by just having you enter the street name with no punctuation or suffix (ave, st, etc), and if there is a collision the form pops out a drop-down selector to have you disambiguate.

It's not the cleanest solution but it works. I agree that bending the humans to serve the needs of the machine feels... Sub-optimal.


To be fair, a degree of fuzziness in matching is quite valuable. Is it Eighth Ave, 8th Ave, Eighth Avenue or 8th Avenue? Or W 8th Ave, perhaps? Is that deed for Unit 7 or Apt 7 or #7 or Ste 7? Is the street address 15 8th Ave or 015 8th Ave? In cases where one zip code spans multiple cities, should anyone really be matching on the city? How about communities in a city in which no one is quite sure what goes in the city field? (Is “Pacific Palisades, CA” a valid city+state? How about Van Nuys, CA or Hollywood, CA? But don’t confuse this situation with West Hollywood, CA or Beverly Hills, CA, which are actual cities.)

I wonder what address verification services expect for the “numbers in the address” when the address is “123 1/2 4th Ave #5”.

Apostrophes seem like the least of anyone’s concerns.
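
A rough sketch of that kind of fuzzy matching, assuming a matching key that is computed alongside the stored address rather than replacing it. The `SUFFIXES` and `ORDINALS` tables and the `match_key` name are hypothetical and deliberately tiny; a real service would need far larger tables and locale rules:

```python
import re

# Hypothetical lookup tables, nowhere near complete.
SUFFIXES = {"avenue": "ave", "street": "st", "road": "rd"}
ORDINALS = {"fourth": "4th", "eighth": "8th"}

def match_key(address):
    """Build a lowercase, punctuation-free key used only for comparison."""
    text = address.lower()
    # Apostrophes (straight or curly) and full stops vanish entirely.
    text = re.sub(r"['’.]", "", text)
    # Any other punctuation except / and # becomes a space.
    text = re.sub(r"[^a-z0-9/# ]", " ", text)
    words = [ORDINALS.get(w, SUFFIXES.get(w, w)) for w in text.split()]
    return " ".join(words)
```

Two spellings match when their keys are equal, while the address as written is kept for display.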


My favorite confusing street I've dealt with is Boulevard in Hartford, CT[1]. Often abbreviated, so addresses like "1600 Blvd" are common. And of course there are streets where "boulevard" is the last word in the name, like usual, to add to confusion.

[1] https://www.openstreetmap.org/way/1091898993


For the American case, the USPS has a free API to do address validation and standardization.

USPS APIs in general:

https://www.usps.com/business/web-tools-apis/#dev

https://www.usps.com/business/web-tools-apis/documentation-u...

The specific API:

https://www.usps.com/business/web-tools-apis/address-informa...


FWIW, I've used this API for some years. On the con side, it uses XML, but on the pro side, it's fast, reliable, and consistent. And free, at least for my volumes.


I can't find a public copy of the recent versions of BS7666. The 2006 version had zero instances of the word 'apostrophe' so not sure what they think they are referencing.

BS 7666: 2006 is based upon an International Standard ISO 19112 Spatial referencing by geographic identifiers.


“The AGI now has a membership portal and it may be that the page you require sits within that portal. As a member you will be able to reach the page by logging in. If you are not yet a member of the AGI and are interested in joining please visit”

If a standard isn’t publicly available under a free license it should not be called a standard.


While I agree with you, it's common that standards are not available without payment. For example, to get a copy of ISO 8601-1:2019 (as in, the date format standard!), it'll cost you $190.

https://www.iso.org/standard/70907.html


Just get it from Anna's Archive

https://annas-archive.org/md5/6b38669dbfb1042a40be0f804258fd...

And upload anything you think is important enough to Library Genesis

https://wiki.mhut.org/content:how_to_upload


Or, my personal gripe, ISO 7816 - your everyday smartcard standard. A full copy sets you back > 2k €.

[1] https://www.vde-verlag.de/iec-normen/suchen/?publikationsnum...



That's just Part 4 (and 13 is available there as well). The rest remains hidden, sadly - I assume that there aren't that many people willing to get Aaron Swartz'd.


There are guidelines for street gazetteers here:

https://www.agi.org.uk/wp-content/uploads/2020/11/BS7666Guid...

4.4.1 Street names The designated street name is usually to be found on the name plate on the street. However, these may not always be correct, and may differ between the ends of the street. Unofficial street names are ones that have not been adopted by the appropriate Highways Authority but may be in common usage, e.g. "The Great North Road". Street names, whether designated or unofficial, should be recorded in full. Abbreviations and punctuation should not be used unless they appear in the designated name, e.g. "Dr Newton's Way". Only single spaces should be used.

So I think all it's saying is that gazetteer editors should not add punctuation if it's “missing” from the designated place name.

Punctuation is fine if it is already part of the place name.

I think the intention is to preserve the original place name.

So the council is wrong to blame the specification.


Interesting! However the company which runs the official gazetteer has advice, and the reasoning is nonsense:

>GeoPlace does not advise that councils include or remove punctuation in official naming or on the street name plate. Street naming and numbering is a council policy decision.

>However, the Data Entry Conventions documentation does state that GeoPlace would prefer not receive data (including street names) with punctuation.

>This is for two main reasons:

> machine readability – punctuation can be misinterpreted by computers

> usability – for example, if loaded into say an emergency service command and control system and a caller provides a street name, the search will be faster if the search is entered and returned without punctuation.

https://www.geoplace.co.uk/street-naming-and-numbering/guida...


I refuse to believe any standard considered fit for use in 2024 allows absolutely no special characters via escape sequences.


Councils can barely afford standards fit for use in 1994, unfortunately


Can't the government fund a single open standard? Every council is now individually paying to license this broken spec.


You can make that suggestion at https://github.com/co-cddo/open-standards

That's the UK Government's discussion space for adopting open standards.

(I used to work there.)


Thanks, I will do!

I'm a big fan of the UK government's commitments to open standards and their whole IT philosophy in general.

That said local councils usually give me quite the opposite feeling. I will definitely look into your suggestion in the hope of seeing some trickle-down!


If you give responsibility for street signage to local councils then you'll inevitably find that one of the hundreds of councils eventually does something dumb like this. The technical solution for this specific problem is readily available but the political solution seems more interesting/complicated.


Airlines around the world, and many US-specific online forms, sternly refuse to accept that my surname is Hugh-Jones. So it becomes HughJones which makes me seem like a rogue robot.


So, no "O'Malley" or "O'Kelly" or "O'Brien" will ever be honored with a street name in North Yorkshire using their actual (Anglicized) name?

This isn't a new issue. Around 1990 one of the computer labs at my school was run by someone with an Irish surname starting "O'", and I remember him complaining about software which couldn't handle his name.

It's been 30 years, and there are still problems?!?

(To say nothing of "Madeleine L'Engle" or any of many others with an apostrophe in their name.)


Even among current and recent MPs and Lords, we've had: O’Brien (5), O’Donnell, O’Halloran, O’Hara (2), O’Mara, and O’Neill.

https://members.parliament.uk/members/commons?SearchText=%27...


And if the "no punctuation" mandate were to be taken seriously, it would also affect innumerable hyphenated names.


To add insult to injury, many government systems in Ireland cannot handle áccented létters in names.


Doctor M’Benga


We cannot (competently) make computers work like we want them to, so let's change our lives to adapt to computers instead.


I live on Bobby Tables Close.


Help I'm Stuck In A Street Sign Factory Avenue


I live on Butts Wynd


Do you live next to Seymour Butts?


In the UK, if you venture from a side street to a main road, the chances are that there won't be a sign to tell you what that main road is, unless you venture down that main road to where it meets another main road. This can be a considerable distance.

I am all for keeping in the apostrophes as they are mini 'flashcards' to help the youngsters learn the value of punctuation. I also think that it is out of respect for residents, if I was on 'St. Mary's Road' and I had to write 'st marys rd' then I would worry that people outside Yorkshire might think I was illiterate.

One day a UK county will do an excellent job of signs, so people always know where they are without SatNav. Remember that many signs were removed just in case the Germans arrived, and we couldn't have them finding their way around, could we?

North Yorkshire council could trial some best practice signage that involves having actual signs instead of making the punctuation vanish. They could get an unexpected tourism boost from doing so with mildly fewer cars on the roads.


Apparently, this is because of a standard they're required to conform to, not database software in specific:

https://news.ycombinator.com/item?id=40265929

> As far as I can tell, this isn't an issue with the specific database itself, but the standard they are required to record geographic data in, which the end of the article mentions as "BS 7666".

On the other hand, you’re naïve if you think English hasn’t already been simplified to fit on machinery such as typewriters and cheap printing presses. This process began long before computers.


There is a public recreation facility called Peter'?s Field in NYC.

It is named primarily for Peter Stuyvesant and Peter Cooper (NYC historical notables), secondarily for Peter Piper, Peter Parker, Peter Pan, Peter Peter Pumpkin Eater, Peter Rabbit, and Peter from "Peter and the Wolf".

Seems they use Peters Field and Peter's Field but not Peters' Field.


I don't really understand why they've decided to do this in 2024.

We all know that older systems had problems with encoding and escaping special characters, but wouldn't they have encountered and dealt with all the possible problems by now?


The implications are bizarre here. Only software/database impacted by this change is something custom for the city, anything more widely used must handle apostrophes etc anyways because other places have them too. And also this being a change now implies that whatever custom software in question is something new, because any old software must have been dealing with these street names with apostrophes.

So first question in my mind is what/why is this software that they are attempting to accommodate??? Or is this all just based on misinterpretation of the mentioned BS7666 and nobody thought to check it??


Next: due to an issue with primary key mapping, all cities in the US are now required to have a unique name, so all affected cities must work together to come up with new names. Start with the Springfields.


No problem, we'll just rename every city to the latitude and longitude of the city hall, expressed to 3 decimal places and concatenated.


This reminds me of the more extreme case of Spain changing the treatment of <ll> due to limitations of PCs in the 1990s.

Our tools should adapt to the needs of humans, not the other way around!


The digraphs ch (named che), ll (named elle or doble ele) have traditionally also been treated as letters of the alphabet, since 1803. However, in 1994, the tenth congress of the Association of Spanish Language Academies agreed to alphabetize ch and ll as ordinary pairs of letters in the dictionary by request of UNESCO and other international organizations, while keeping them as distinct letters for the alphabet and other purposes. In 2010 the Spanish Language Academies agreed that these two digraphs were not separate letters. Similarly, rr (named erre or ere) has sometimes been considered a separate letter but is no longer.

https://en.wiktionary.org/wiki/Appendix:Spanish_alphabet

https://web.archive.org/web/20150426001803/https://www.nytim...


The other day (in 2024!!) , I got a message saying my password needed to include a special character, with a helpful list of special characters that were forbidden.


Hmm... However, language is itself also a tool. There are cases where adapting language to another tool is easier than the other way around.


Sure, but the computer is a tool designed to be malleable and adapt to the human. This kind of grunt work (adapting to humans' models of collation) is exactly the kind of task that should be handed off to machines.
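
As a small illustration, here is a sketch of a sort key for the traditional ordering, in which "ch" sorts after "c" and "ll" after "l". The function name is made up, and accented vowels are not handled (they simply sort last here), so this is a toy and not a full collation:

```python
# Traditional Spanish alphabet with ch and ll as letters in their own right.
TRADITIONAL = ["a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k",
               "l", "ll", "m", "n", "ñ", "o", "p", "q", "r", "s", "t", "u",
               "v", "w", "x", "y", "z"]
RANK = {letter: i for i, letter in enumerate(TRADITIONAL)}

def traditional_key(word):
    """Turn a word into a list of letter ranks, treating ch and ll as units."""
    word = word.lower()
    key = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in ("ch", "ll"):
            key.append(RANK[pair])
            i += 2
        else:
            # Unknown characters (e.g. accented vowels) sort after everything.
            key.append(RANK.get(word[i], len(TRADITIONAL)))
            i += 1
    return key

words = ["llama", "luz", "chico", "cuna"]
modern = sorted(words)
traditional = sorted(words, key=traditional_key)
```

Here `modern` comes out as chico, cuna, llama, luz, while `traditional` gives cuna, chico, luz, llama.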


Do you have pointers to this? As a Spaniard I recall that even though I was originally told the Spanish alphabet treated the LL as its own letter, it always felt quite inconsistent. And I always assumed its removal was more about simplifying things than having to do with computers


There is a comment from "adolph" parallel to yours with footnotes. But I remember the event well (though it was 30 years ago) as the difficulty for programmers was given as the justification for the law (this was a change in collation).

This particularly irked me as unicode was a few years old at this point, and while not really adopted yet, was clearly the future.


Not really a problem but some buses and digital displays where I live in Austria skip umlauts and just use the 26 letter spelling ae rather than ä or strasse rather than Straße.


Just yesterday I generated a 64 char length mysql password, which had \ and ` and '. I couldn't properly escape it, to pass via argv, so I had to truncate it and remove all those symbols.

So I thought, how can this problem be solved? IMHO by doing a hex representation 0x00-0xFF per char. That would also increase entropy.

MySQL and other databases would need to support hex input of passwords, also setting of hex passwords via SQL.
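
A sketch of the two approaches in Python, using a made-up password value: a hex round trip, and `shlex.quote` for getting the original string through a POSIX shell unchanged. Whether a given database client accepts a hex form is not something this shows:

```python
import shlex

# Hypothetical password containing a backslash, a backtick and an apostrophe.
password = "pa\\ss`wo'rd"

# Hex form: only the characters 0-9 and a-f, so nothing needs escaping.
hex_form = password.encode("utf-8").hex()
restored = bytes.fromhex(hex_form).decode("utf-8")

# Alternative: quote the original so a POSIX shell passes it through intact.
quoted = shlex.quote(password)
```

The hex form is twice as long but carries exactly the same information as the original, so it changes only the representation.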


MySQL at least should support that last bit (it has hex strings, and passwords are set with strings using SQL), but not the first as far as I know.


Wrapping reality around computer systems that are supposed to (and proclaim themselves able to) solve any problem.

The Post Office scandal was about an inability to add numbers; a tiny line can f uk the whole system, and we want to put our lives into the hands of AI systems. What can go wrong?


Replace it with a QR, but it contains a UUID which has to be used to look up the current street name.

(If all you have is a phone, everything can be solved with an app.)

https://en.wikipedia.org/wiki/QR_code


The Danish Address Register (DAR) assigns a UUID to everything in the addressing system - street names, postal codes, building addresses... It's a nice official database to have in Denmark but I don't know of any products that actually store the UUIDs in the database.


I'm going to guess this is going to be one of those stories that ends with "and that person is an idiot". And "there was nobody with a clue present to tell them they're obviously an idiot".


So, I take it the UK now needs to rename Land's End and John o' Groats.


Not to mention Westward Ho!.


Only if they get somewhere moved to North Yorkshire.


Base64 (or any other similar encoding) could be of help.

But unrecoverably modifying the data to fit within constraints on input, storage or output seems a rather poor "solution".
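
For instance, a minimal sketch with Python's standard base64 module (the helper names are made up). The encoded form uses only letters, digits, `-`, `_` and `=`, and decoding gives back the exact original:

```python
import base64

def encode_name(name):
    """Encode a street name into a URL-safe Base64 token."""
    return base64.urlsafe_b64encode(name.encode("utf-8")).decode("ascii")

def decode_name(token):
    """Recover the original street name from its Base64 token."""
    return base64.urlsafe_b64decode(token.encode("ascii")).decode("utf-8")

token = encode_name("St Mary's Walk")
original = decode_name(token)
```

Unlike stripping the apostrophe, this is reversible, at the cost of a field that is no longer human-readable.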


These discussions always remind me of the old adage that capitalisation is important because it means the difference between helping your Uncle Jack off a horse…


How many hundreds or thousands of new streets are being created in North Yorkshire each year I wonder?


Existing street signs need replacement over time too.


They used to be made of cast iron, probably good for a couple of hundred years I’d imagine, unless smashed or stolen for scrap (sometimes this does happen). I wonder what they make them out of today that leads to needing frequent replacement.


Before reading tfa I thought this was going to be about "Bar t'At Lane".


The Street of Saint Mary--that's what they should write instead.


Tell me you're vulnerable to SQL injection without telling me you're vulnerable to SQL injection.


Makes me wonder if this was the root cause of that software glitch that wrongly sent all those 1000s of postmasters to prison.


Different issue. In that case, the vendor had given some guarantees of consistency of data across network nodes that the network didn't actually support. Because there were guarantees, the law went looking for horses instead of zebras, and the "horses" in this case were that only a few people had admin rights to mess with the transactions and the audit logs.

... but in reality, no human was messing with those; system bugs were dropping or duplicating data. The government should not have trusted claims of a third-party without independent auditing they controlled (and, ultimately, I think that's the takeaway that all governments should be taking from this disaster).


dropping and duplicating data is exactly the symptom you get from not sanitising apostrophes in your data correctly.


Not at all. If I understand correctly, the failure to synchronize was a fundamental flaw in the networking code and had nothing to do with the payloads inside the networking code.


that could still be ' handling,

try to sql

(`type`:'express post',`from`:'st mary's street',`value`:'50')

and it will drop the value field, throw an error, and quite possibly duplicate several type and half filled froms depending on how the error handling is done.

This article is basically admitting it's cheaper to change the street names than unfux their buggy software, so something is up. What are the other options that meet that criteria?


The software glitch that was involved in 1,000 people being arrested had nothing to do with street names.

By this point, the flaws are pretty well-documented. If you find anything in the reports about handling of apostrophes, feel free to cite it.

The underlying communication protocol from node to node wasn't even SQL; it was an XML format called "Riposte." There was, perhaps, SQL involved in eventual account database updating, but issues had occurred in message transit even before that phase, and it's those issues that led to account reconciliation errors and (incorrect) charges of fraud on the part of the subpostmasters.


source for it being well documented? xml suffers apostrophe issues too.

https://www.theguardian.com/uk-news/2024/jan/09/how-the-post... says As early as 2001, McDonnell’s team had found “hundreds” of bugs. A full list has never been produced,

seems almost guaranteed to me it had apostrophe bugs.


Source for it being identified as a root cause of the Horizon issues.

As far as I can see, there is no evidence that it was.


The evidence is they are changing the street names to stop bugs in software.

The question is what software is so hard to fix it's easier to change the physical street names than fix the data entry for those street names.

Horizon seems the likely candidate, and the fix is equally stupid.


Little Johny Drop Tables gets a bad rap every time.


There might be good reasons for not printing them on signs, but this reason seems dumb.

Around here (Melbourne, Australia) I don't think they are ever used. "Princes St" etc. It's fine.


I can't believe nobody has yet mentioned the obligatory relevant XKCD:

https://xkcd.com/327/


Makes sense IMO, the apostrophes don't add much but can cause a lot of issues in manual transcription, simple queries, etc.


As always, the problem is always the underlying issue, even though the surface one is pretty ridiculous. If the council doesn't understand how to sanitise database input, imagine just how bad they are at the stuff that's mildly difficult or worse. Do NOT give any sensitive data to them, whatever you do!


It's not just queries though - but CSV exports, etc. - every single system that ever handles this data needs to handle them.

Searching if users omit the apostrophes, etc.
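
For the CSV case at least, a standard library can carry the quoting. A sketch with Python's csv module and made-up rows, showing apostrophes, double quotes and commas surviving an export and re-import:

```python
import csv
import io

# Hypothetical rows: an apostrophe, embedded double quotes and a comma.
rows = [
    ["St Mary's Walk", "Harrogate"],
    ['The "Old" Road, North', "Ripon"],
]

# Export to CSV text; the writer adds quoting only where it is needed.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
exported = buffer.getvalue()

# Re-import and confirm nothing was lost or mangled.
restored = list(csv.reader(io.StringIO(exported)))
```

The trouble usually starts when a system builds or splits CSV lines by hand instead of using a reader and writer like these.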


I'm sure that IT working in the council understand input sanitisation and escape sequences, but they're working to a broken standard.


The standard isn't the point. You can have a "search key" or "standards compliant name" column in a table, and also have a "sign name" column. Whoever came up with this plan was either a fool or likes annoying people (possibly those two things are the same thing).
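
A sketch of that two-column idea, assuming SQLite and a made-up normalisation rule (uppercase, keep only letters, digits and spaces). The table, column and function names are hypothetical:

```python
import sqlite3

def search_key(sign_name):
    """Hypothetical rule: uppercase, keep only letters, digits and spaces."""
    return "".join(ch for ch in sign_name.upper() if ch.isalnum() or ch == " ")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE streets (sign_name TEXT, search_key TEXT)")
for name in ["St Mary's Walk", "King's Road"]:
    # The sign keeps its punctuation; the key is derived for lookups only.
    conn.execute("INSERT INTO streets VALUES (?, ?)", (name, search_key(name)))

def lookup(query):
    """Find the name as printed on the sign, however the caller typed it."""
    cur = conn.execute(
        "SELECT sign_name FROM streets WHERE search_key = ?",
        (search_key(query),),
    )
    return [row[0] for row in cur.fetchall()]
```

A search for "st marys walk" and one for "St Mary’s Walk" both return the name as it appears on the sign.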


And/or this is #145 on their to-do list, so after a good 15 months of postponing it, an overworked IT council guy told his supervisor "if you want X implemented without overhauling system Y just get rid of apostrophes" so this article happened.


Aside from making some signs grammatically incorrect, it means places/streets cannot be named after anyone with an apostrophe in their name without mangling their name. It's a bit ironic to honour people by failing to respect their name.


*dont


next to go will be capital letters

thenspaces


y not i fink sum ppl do that already in sum context nyway


It's been common though not exclusive practice to not use apostrophes in street names for a very long time in the south of England, is Yorkshire just catching up?


I can provide pictures in evidence if the down voter needs it.



