As far as I can tell, this isn't an issue with the specific database itself, but the standard they are required to record geographic data in, which the end of the article mentions as "BS 7666".
# Standard data types used in BS 7666
CharacterString: a sequence of alphanumeric characters
...
If I had to guess, alphanumeric is interpreted as [0-9a-z].
The sign printer probably expects this format when printing signs for the government, or worse, has a contract that says the government must provide this standard format for the sign information.
So it's just a government mandated database schema... I don't think that's any better of a reasoning though lol
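To make that guess concrete, here is a minimal sketch of how two different readings of "alphanumeric" treat a few sample names. The pattern and both helper names are my own assumptions for illustration; nothing quoted from BS 7666 defines the term this precisely.

```python
import re

# Assumed narrow reading: ASCII letters and digits, words separated by single spaces.
STRICT = re.compile(r"[0-9A-Za-z]+(?: [0-9A-Za-z]+)*")


def is_strict_alphanumeric(name: str) -> bool:
    """True if the name fits the narrow, ASCII-only reading."""
    return STRICT.fullmatch(name) is not None


def is_unicode_alphanumeric(name: str) -> bool:
    """True if every space-separated word is alphanumeric in the Unicode sense."""
    return all(word.isalnum() for word in name.split(" "))


for name in ["Kings Road", "Dr Newton's Way", "Westward Ho!", "Ynys Môn"]:
    print(name, is_strict_alphanumeric(name), is_unicode_alphanumeric(name))
```

Under either reading the apostrophe and the exclamation mark are rejected, while a letter with a diacritic passes only the Unicode-aware check, so which "alphanumeric" a system implements matters quite a lot.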
> How to use these standards
>
> The UPRN and USRN standards form a machine-readable addition to an address or street record held in a system. When using UPRNs and USRNs you can continue to use existing formats but add a field for these identifiers.
https://www.gov.uk/government/publications/open-standards-fo...
The actual standard (well, the previous version, BS 7666-1:2006) does contain that wording, but also says:
> Abbreviations and punctuation shall not be used unless they appear in the designated name, e.g. “Dr Newton’s Way”, and only single spaces shall be used.
Given that "alphanumeric" is vague, not defined, and prominently contradicted, I'd say it's quite clear that they can't really blame it on BS 7666.
The council probably have some horrible CSV-infested GIS workflow, and have decided to change reality to avoid the bugs.
That's because you're not going to Tesco the registered company, you're going to (one of) Tesco's (shops). It's just traditional and logical grammar, not misnaming.
You may not be wrong that people do say it, I'm sure people say pretty much anything, across this vast planet. Yet people use double negatives, when meaning the negative too. Doesn't make it right.
I think language is defined by real-world usage, not by the logical structure that we theorise underpins it. If so, double negatives actually are right, or at least they can be, if they successfully communicate meaning.
Interesting observation! Might it be because Tesco and Costa resemble ordinary names, and therefore easily appear in a possessive form?
Or it might be about imaginary hierarchy. There could only be one actual duly-anointed king of burgers, but if one were to use the definite article to mark this, saying "I am going to the Burger King's", it would imply firstly that the king of burgers does actually exist, and possibly also secondly that he has but one solitary burger outlet.
A grammatical construction not quite so jarring when used with (say) "Tesco".
Pasting my reply to someone else rather than rewrite the same thinking:
I agree it's not consistent, but who ever accused English of being that?!
My point isn't that all business names are treated that way, just that for the ones that are, the reason is grammatical tradition, not (for the most part) people incorrectly thinking the shop is called "Tesco's" or whatever.
(But as others have replied to you, it's also more common than just Tesco, definitely including "Costa's" for lots of people.)
I agree swores. Your comment did say 'traditional' and my comment was facetious.
There's been an historical transition from small chains owned by individuals (e.g. the Victorian Mr John Sainsbury) to big brands (e.g. Superdrug), hasn't there?
The possessive apostrophe was appropriate for the former but surely less so nowadays. I would guess "Sainsbury's" was a rebrand intended to reflect tradition.
I say Sainsbury's because the name is exactly that. I don't say Tesco's because the name, as you say, is Tesco. I would guess those who say Tesco's (and maybe Asda's) are just getting confused because of Sainsbury's.
Standards are just that. You can say that standards are the 'minimum, not the maximum', but if the standard says only alphanumeric characters are permitted (and it turns out this isn't a very well written standard, so there's a load of discussion possible on this point), an implementation that allows non-alphanumeric characters is wrong.
As a thought experiment, how far beyond the 'maximum' is acceptable? Latin letters with diacritics? Cyrillic letters? Arabic letters? Chinese logographs? Emojis? Would you expect all systems which are standard-compliant to be able to handle all of the above?
The UK includes Wales and Northern Ireland, both of which have place names that include diacritics. Whether or not Wales and Northern Ireland choose to follow this particular British Standard, I don’t know. Some examples here:
Yep, that just emphasises how stupid the standard is. Others have mentioned Westward Ho![0] which is in England and contains a non-alphanumeric character.
For Wales for sure, often there are two versions of the place name, the Welsh, and an English butchering of it (see for example Pont-y-pŵl, which is spelt Pontypool in English), so I suppose it would be the English thing to do to simply pretend the Welsh/Irish spelling doesn't matter, and only use the English spelling (a little tongue-in-cheek from my side, but sadly likely the reality).
However, I suspect that whoever authored the standard was just sloppy and wrote 'alphanumeric' without giving it any careful thought.
You can print apostrophes on a street sign without any database issues because a street sign doesn't interface directly with a database. At least not yet...
All the technical issues here have already been solved a hundred times, there's plenty of other options. It's a little worrying that we're eliminating punctuation in real life because of issues with integrating with geographical databases.
Interestingly, the OpenStreetMap project considers road signage to be unquestionable "ground truth". If the sign changes, the map changes. We can't use any other database because that might not be available under a compatible licence.
Just because something is written on a road sign does not make it magically copyright or license free.
I know a business which has to pay a nominal $1 license fee for the names of its own stores to a map maker, simply because the business has lost any evidence that it had the names before the map maker put them in the map.
> Just because something is written on a road sign does not make it magically copyright or license free.
That's untrue; facts can't be copyrighted. Whether there's a road sign or not, the name of the road is not subject to copyright, but if there is a road sign, then it's really easy to prove that your claim about the road is an uncopyrightable fact.
That doesn't apply here; no matter how much street name data is collected, no amount of the collection, including the entire thing, can be copyrighted in a way that would impose any restrictions on OpenStreetMap.
From your link, which is quite short:
> such copyright may exist when the materials in the compilation (or "collective work") are selected, coordinated, or arranged creatively such that a new work is produced. Copyright does not exist when content is compiled without creativity, such as in the production of a telephone directory.
So you'd have to ask yourself, "was any creativity, of any kind at all, required in order to call this street by its own name?" And the answer is even more obviously "no" than in the paradigm case in which the telephone directory can't be copyrighted because it consists of facts (involving no creativity) in an externally specified order (alphabetical) in a collection specified by an external rule ("everything is included").
No they aren't, but there is such a thing as a database right which is separate from copyright. This is why OSM changed from a Creative Commons licence (a copyright licence) to ODbL (a database licence).
> but there is such a thing as a database right which is separate from copyright
That's purely an unforced error on the part of OpenStreetMap. They are incorporated in England, which recognizes a database right. But there is no reason for that. In general, there is no such thing as a database right.
And this isn't even relevant to londons_explore's point; he is stupidly arguing that even if the information is available on a road sign, the public cannot use it because it might be included in a protected work somewhere. That is obviously untrue; the availability on the road sign does in fact automatically mean that inclusion in a protected work is irrelevant.
If you see a sign giving you the name of a road, and then publish information to the effect that the road's name is what is printed on that sign, no database was involved at any point, and a database right cannot apply. All you have is a bare fact.
I haven’t checked but how do they handle streets if the signs have two different spellings? There was a street growing up that was spelled two different ways and I still don’t know what the right one was!
My favourite example is a small road in Killarney called The Hahah in English but in Irish written "An Fhaiche" in one place and "An Háhá" in another. OpenStreetMap seems to designate the first one as name:ga and the second one as alt_name:ga.
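For anyone unfamiliar with how that looks as data, here is a rough sketch of the tags as a plain key/value mapping. The `name:ga` and `alt_name:ga` entries are the ones described above; the plain `name` entry and the helper function are assumptions of mine, not something checked against the live map.

```python
# Tags as described above; the plain "name" value is an assumption.
way_tags = {
    "name": "The Hahah",
    "name:ga": "An Fhaiche",
    "alt_name:ga": "An Háhá",
}


def names_for_language(tags: dict, lang: str) -> list:
    """Return the primary name first, then the alternative, for one language code."""
    keys = (f"name:{lang}", f"alt_name:{lang}")
    return [tags[key] for key in keys if key in tags]


print(names_for_language(way_tags, "ga"))
```

A consumer that only reads `name:ga` would never see the second spelling, which is why the alternative is kept under its own key rather than dropped.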
This is a recurring problem in Ireland due to nationwide local government incompetence in using the Irish language.
Centuries old Irish language place names get replaced with bastard gaelicised versions of their English names, and now you’ve a mishmash of signage all over the place. Often the new names are just an invention of the council that sort of sounds right.
This is very true, but there is also an issue of which Irish orthography to use, right? Place names are very conservative, but modern speakers would be more used to O'Donnell's dictionary's variants than to ones from Dinneen, basically. Paradoxically, English forms sometimes give hints as to the correct pronunciation.
This tagging is most likely the result of a single OSM editor's decision. Such decisions are quite often wrong, suboptimal, or not in line with the guidelines.
It is always better to consult the extensive OSM wiki to figure out how something should be tagged than to try to work it out from existing examples.
The rules are as open to change as the data I guess, but if there really isn't one obvious "proper" name I've seen things named with a slash like "name 1/name 2". Often there is specific tagging to overcome this, though. For example in bilingual countries like Wales many roads have an English and a Welsh name. The language-specific names are uncontroversial, but the overall "name" (for the international map) is supposed to be in the "local language". As you can imagine this isn't always uncontroversial.
Don't know if it is still an issue but Atlanta used to have problems with the in-house sign shop producing street signs with misspelled names. Apparently, boulevard in particular was difficult for the city to get correct, which was a problem since there's a major road simply named "Boulevard". Maybe it got corrected after one of the local news channels did a segment about it, but it does highlight the fact that road signs sometimes aren't authoritative.
It seems like it is yet another example of software not written to requirements so now the requirements are adjusted to fit the software implementation.
> As far as I can tell, this isn't an issue with the specific database itself, but the standard they are required to record geographic data in, which the end of the article mentions as "BS 7666".
On the other hand, you’re naïve if you think English hasn’t already been simplified to fit on machinery such as typewriters and cheap printing presses. This process began long before computers.
That book is for the "Parlour printing press [which] was invented by Mr. Cowper for the amusement and education of youth, by enabling them to print any little subject they had previously written, provided the printing did not exceed in size the dimensions of an ordinary duodecimo page, which measures about 5 inches by 3 inches."
Did it support þ? That would be the best-known example of written English being changed to accommodate a printing press rather than the other way around.
Is there something relevant about the 1800s? I said þ was an example of the written language changing to accommodate the printing press, not changing to accommodate the 1800s.
But the reference to "cheap printing presses" has to be interpreted as actually being a reference to type, as you have already done with your archive.org link. The press itself cannot either support or fail to support any graphical forms; there's no difference in the press whether you're using type or block printing.
Press quality issues are things like "are the surfaces flat" / "how much pressure can the press apply" / "how many pages can we press before running into a mechanical issue".
I disagree. It does not have to be that. I pointed to a printing press designed for kids. That should be much cheaper than one used to print larger sizes or many copies, much less the quality of the Bible in the 1500s.
I do get your point, but I think it's still wrong to use "cheap" to refer to the typefaces available for the Caxton and KJV Bibles. I suspect they were quite expensive.
The physical press is only part of the printing cost. The Linotype typesetting machine made it possible to set an entire line of type, drawing from 90 or so characters. While the press itself didn't care, adding new symbols required manual effort, making it more expensive.
> but I think it's still wrong to use "cheap" to refer to the typefaces available for the Caxton and KJV Bibles. I suspect they were quite expensive.
I don't think this argument quite works; something can be stunningly expensive, in an absolute sense, at the same time that it's the shoddy low-price option people choose for budget reasons.
(Grrr! I've been saying Caxton Bible but I meant Tyndale Bible.)
Sure, it can be, but was it?
The KJV was very expensive as it was. ("Robert Barker invested very large sums in printing the new edition, and consequently ran into serious debt" says https://en.wikipedia.org/wiki/King_James_Version .) That doesn't mean they used a shoddy low-price option.
With the printing press it makes sense, there's physical constraints that make things very difficult if you want to support a huge number of different characters. Those limitations don't exist anymore though.
On one hand we have systems supporting Unicode and just about every character imaginable, it's an example of using technology to go beyond the limitations of the past. You could have a system that supports emojis on street signs, but instead we're going the opposite direction and introducing artificial limitations that are even more restrictive than 500 year-old technology.
I used to live in North Yorkshire until I moved away to start a job in mainland Europe.
When I moved I tried to fill in the form on their website to indicate that I wouldn't be paying the council tax on the house I used to live in anymore. Weirdly I couldn't make the form work, it broke with weird errors about timing out. After some headscratching I decided (on a whim) to change my computer's timezone back to GMT and hilariously the form started working perfectly.
Sadly I couldn't finish filling in the form, because it required a postcode for my new address, and would ONLY accept postcodes which matched the UK format (which my new address, in a different country obviously didn't match).
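For what it's worth, the fix for the postcode part is not complicated. A minimal sketch, assuming the form also captures a country code (that field, the function name, and the simplified pattern are all my own assumptions; the real UK rules have more special cases than this regex covers):

```python
import re

# Simplified UK postcode shape, for illustration only.
UK_POSTCODE = re.compile(r"[A-Z]{1,2}[0-9][A-Z0-9]? ?[0-9][A-Z]{2}")


def postcode_is_acceptable(postcode: str, country_code: str) -> bool:
    """Apply the UK format only to UK addresses; accept any non-empty value elsewhere."""
    cleaned = postcode.strip().upper()
    if not cleaned:
        return False
    if country_code.upper() == "GB":
        return UK_POSTCODE.fullmatch(cleaned) is not None
    # Other countries have their own formats, so don't second-guess them here.
    return True


print(postcode_is_acceptable("YO1 7HH", "GB"))
print(postcode_is_acceptable("1012 AB", "NL"))
```

The point is only that the format check should depend on the country of the new address, rather than being applied to every address the form ever sees.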
You shouldn't be surprised about any IT insanity from North Yorkshire Council, they are impressively incompetent.
Australia removed apostrophes from all official place names back in 1966, leading to names like Surfers Paradise, Princes Highway and Wisemans Ferry. Given the date, I doubt computing was a major consideration though.
It's led to some interesting outcomes - for example, I've seen people write "Princess Highway" a fair few times, presumably on account of the official spelling ("Princes") falling confusingly somewhere half-way between the more usual "Prince's" and "Princess".
In day to day use, I don't think most people actually use the apostrophe-free official spellings, at least for names comprised of ordinary English words (Prince's Hwy, Surfer's Paradise), but it might be more of a free-for-all with proper names (Wiseman's Ferry or Wisemans Ferry).
I have never seen anyone write "Prince's Hwy" (or "King's Hwy" for that matter), the apostrophe-free version would be far more common.
The "Princess Hwy" thing is also common I think because the pronunciation most often used sounds a lot like "Princess", particularly when it's run right into "Highway" without a gap.
> I have never seen anyone write "Prince's Hwy" (or "King's Hwy" for that matter), the apostrophe-free version would be far more common.
I don't know what to tell you; I've seen it often. I'm talking day to day here, text messages and handwritten directions. If you're thinking of things like road signs, those will obviously follow official usage.
> The "Princess Hwy" thing is also common I think because the pronunciation most often used sounds a lot like "Princess", particularly when it's run right into "Highway" without a gap.
Obviously that is so, but I would suggest that the reason 'Princess Hwy' is comparatively so much more common than other similar mistakes (e.g. 'sea' for 'see' etc.) is because the official name (Princes Highway) feels so unnatural in English that it fails to act as a meaningful standard for usage.
This seems silly for the reason they're doing it, in that a modern database should be able to handle these characters with a little sanitization.
However, it does seem like it could be helpful when it comes to satnav applications to remove ambiguity. Google's going to autocorrect most of the time anyway, but this way, you're less likely to run into an issue where it takes you to Kings Landing in the wrong town because you didn't type King's Landing in the town you meant.
Sure, they tell you what town you're looking at, but I can't be the only one who's quickly typed in a destination and didn't take the time to double check and ended up driving to the wrong location for something. For some reason all of the hockey rinks near me have almost identical names...
>but this way, you're less likely to run into an issue where it takes you to Kings Landing in the wrong town because you didn't type King's Landing in the town you meant.
That's pretty uncompelling. Should we also get rid of the letter s to avoid mix-ups between Kings Landing and King Landing?
I have a street in a town near me which grammatically should be called St. Thomas' road. However, the street signs call it St. Thomas road at the north end and St. Thomas's road at the south end.
Not quite related to apostrophes, but I feel the need to point out my favourite street name of all time (as read on a street sign): St John St
(-> Saint John Street)
Thomas' is not grammatically correct in any version of English that I know. It's not plural. There is no special rule for that. Both the street signs you mentioned are at least grammatically coherent.
It is trivial to find more than sufficiently authoritative sources covering the rules that make "Chris'" and "boss'" perfectly valid contracted possessives in English. [1]
However, it's English: there isn't just one rule, another rule can also be valid and might be the one you're familiar with on a day to day basis. That doesn't mean any other way to say or write the same thing is wrong, it's just a pattern you never saw. Like someone going "lol snuck isn't a real word, it's sneaked!" and then you hand them a dictionary and they learn something new about their own language.
> Some writers and editors add ’s to every proper noun, be it Hastings’s or Jones’s. There also are a few who add only an apostrophe to all nouns ending in s.
> ..One method, common in newspapers and magazines, is to add an apostrophe plus s (’s) to common nouns ending in s, but only a stand-alone apostrophe to proper nouns ending in s.
> Examples: the class’s hours; Mr. Jones’ golf clubs; The canvas’s size; Texas’ weather
"Some writers" say that the French Foreign Legion has been deployed to the front lines in Ukraine too. We must take this very seriously, then!
Some writers write things like Four Fat Harvard Girls Lose Book Bag too. They use sentence fragments. They try to save ink by doing weird shit. Professional Buzzfeed writers write AF (yes, in caps) to mean as fuck. The Atlantic used the words electroöptical and rôles in 1940. Just because some minimum-wage burnout or penny-pinching editor breaks a rule doesn't mean that the rule doesn't exist.
If you go around the office saying that someone drank out of the boss' mug, they'll think you're fresh off the boat. Not only is it wrong in written English, it's not even accepted colloquially in spoken English, anywhere. And so it makes perfect sense that the written form would reflect the pronunciation.
Saying Texas' weather out loud just confuses people into thinking you're using it as an adjective when what you're really doing is trying to sound smart when you're actually sounding dumb. If you point at a book and say, that's Chris', you sound like you have brain damage. How is the book Chris'?
Chris is a person, not a book! The only reason that people don't correct you is that they're being polite. And people misspell words all the time and the world doesn't cave in. That doesn't imply any particular thing about English grammar.
Another commenter found that you can say Jeff Bridges' because this is an irregular case to avoid saying the same sound twice—an exception which proves the rule (and also, I don't think it's irrelevant at all to point out the fact that Bridges is literally a plural noun made into a name). But Thomas is decidedly not in this narrow category. His source even uses Thomas' as an example of what not to do, lol. Normally I wouldn't dumpster someone this hard but hn rate limits so I may as well lengthen my response. Nothing personal.
Rules, especially in English, are not grounded in any agreed-upon authority and never have been, tracing all the way back to Chaucer more-or-less codifying the written form of the language by simply writing one story that then became the most popular English printed work for a generation.
Try not to lean too hard on how other people use the language just because it ain't how you use it. Makes you look outta touch with the way folks are playin' around with one of our shared human comms protocols, neh?
> but hn rate limits
Not in general. Only if you've proven yourself to be a poster for whom the mods think that rate-limiting you improves the health of the discourse 'round these parts.
That's exactly his point. You don't hear the apostrophe but you do hear the "s," meaning that Thomas and Thomas' cannot be distinguished. And so Thomas's must be used instead.
The spoken and written language are not the same thing. Even if you say "Thomas's," sources disagree on whether you write "Thomas's" or "Thomas'", because the latter is more consistent with the rules for other ends-in-s words and, therefore, easier to remember.
(My personal prediction: give it 100 to 200 years and we're going to drop the trailing 's' in all these cases. "Cat'" will just be pronounced "cats" and understood to mean "an adjective indicating the noun is owned by the cat").
It is easy to observe that this "rule" is false. Even though everyone pronounces it "thomases", some spell that "Thomas's", others spell it "Thomas'". It is a purely stylistic spelling difference, and both forms are in common use, in literate environments. So, there is no one rule about how this word is spelled. And since neither form reflects the pronunciation, both are purely conventional, they don't have a much deeper meaning to lean on.
As if databases were not able to deal with apostrophes or other special characters...
Yes you have to sanitize your queries, but you have to do it anyway.
Client applications will of course have to be smarter
It's like the stories of people with the last name "Null" who get errors when trying to enter their name into websites. If that's true then I don't want to think about how poorly built (and insecure) those systems must be.
Lots of countries limit what you can name yourself. I very much doubt having a special apostrophe in your name can be a legal requirement.
Either way, apostrophes in names are quite uncommon in many parts of the world, and they are very likely to be ignored even on paper forms in those areas, if they are even allowed.
In general, each area has certain limits on what kinds of names it allows/understands, and it is up to the minority to adapt one way or another. It's very reasonable to want an Irish or maybe even British system to recognize a name like O'Reilly, but it's not really something you can expect of a Japanese system. Just as much as you shouldn't expect a name like 田中 to be recognized in France.
This results in amusing side-effects, like buying a house requiring you to sign every variation of your name that the credit check found, but it can also get you multiple “one per person” signup options, so there’s that, too.
H2 offers quite a comprehensive solution for dealing with this:
> [H2] provides a way to enforce usage of parameters when passing user input to the database. This is done by disabling embedded literals in SQL statements. To do this, execute the statement:
> SET ALLOW_LITERALS NONE;
> Literals can only be enabled or disabled by an administrator
I would argue if you sanitize your input you are already doing it wrong, you should parameterize queries and send the data entirely separately from code.
If it sanitizes anything, parameterization sanitizes the code, not the data, and has much lower impact on the outside world (because the rest of the world isn't pressured to rename things in the real world to fit arbitrary constraints in the computer).
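A minimal sketch of what that looks like in practice, using Python's built-in `sqlite3` module with an in-memory database. The table, column, and function names are made up for illustration; the `?` placeholder style is sqlite3's own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE streets (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

names = ["St. Mary's Road", "Dr Newton's Way", "Null"]
# The values travel separately from the SQL text, so the apostrophes are never
# parsed as part of the statement and nothing needs escaping or stripping.
conn.executemany("INSERT INTO streets (name) VALUES (?)", [(n,) for n in names])


def find_street(connection, name):
    """Look a street up by exact name using a bound parameter."""
    rows = connection.execute(
        "SELECT name FROM streets WHERE name = ?", (name,)
    ).fetchall()
    return [row[0] for row in rows]


print(find_street(conn, "Dr Newton's Way"))
print(find_street(conn, "Null"))
```

The string "Null" is stored as ordinary text here, which is quite different from an SQL NULL, and the stored names keep their punctuation exactly as entered.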
Ruby Wang... did not mind the changes. "To be honest with you, because I'm not from this country it doesn't matter because it's the same pronunciation," she added.
My local (US) county has a web service to look up property deeds and titles.
They "solved" this problem by just having you enter the street name with no punctuation or suffix (ave, st, etc), and if there is a collision the form pops out a drop-down selector to have you disambiguate.
It's not the cleanest solution but it works. I agree that bending the humans to serve the needs of the machine feels... Sub-optimal.
To be fair, a degree of fuzziness in matching is quite valuable. Is it Eighth Ave, 8th Ave, Eighth Avenue or 8th Avenue? Or W 8th Ave, perhaps? Is that deed for Unit 7 or Apt 7 or #7 or Ste 7? Is the street address 15 8th Ave or 015 8th Ave? In cases where one zip code spans multiple cities, should anyone really be matching on the city? How about communities in a city in which no one is quite sure what goes in the city field? (Is “Pacific Palisades, CA” a valid city+state? How about Van Nuys, CA or Hollywood, CA? But don’t confuse this situation with West Hollywood, CA or Beverly Hills, CA, which are actual cities.)
I wonder what address verification services expect for the “numbers in the address” when the address is “123 1/2 4th Ave #5”.
Apostrophes seem like the least of anyone’s concerns.
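As a rough illustration of the easy part of that fuzziness, here is a sketch that reduces a few of the variants above to one comparison key. The lookup tables and the function name are my own and deliberately tiny; this is not how any particular verification service works.

```python
import re

# Small illustrative lookup tables; a real service would need far larger ones.
ORDINAL_WORDS = {
    "first": "1st", "second": "2nd", "third": "3rd", "fourth": "4th",
    "fifth": "5th", "sixth": "6th", "seventh": "7th", "eighth": "8th",
}
SUFFIXES = {"avenue": "ave", "street": "st", "boulevard": "blvd", "road": "rd"}
UNIT_WORDS = {"unit", "apt", "ste", "#"}


def canonical_address(address: str) -> str:
    """Reduce an address to a lower-case key for comparison, not for display."""
    text = address.lower().replace("#", " # ")
    tokens = re.findall(r"[a-z0-9#]+", text)
    out = []
    for token in tokens:
        if token in UNIT_WORDS:
            out.append("unit")  # Unit 7, Apt 7, Ste 7 and #7 all compare equal
        elif token in ORDINAL_WORDS:
            out.append(ORDINAL_WORDS[token])
        elif token in SUFFIXES:
            out.append(SUFFIXES[token])
        elif token.isdigit():
            out.append(str(int(token)))  # drops leading zeros: 015 becomes 15
        else:
            out.append(token)
    return " ".join(out)


print(canonical_address("15 Eighth Avenue Apt 7"))
print(canonical_address("015 8th Ave #7"))
```

It deliberately ignores the harder cases raised above, such as directional prefixes (W 8th Ave), fractional house numbers (123 1/2), and the question of what counts as the city.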
My favorite confusing street I've dealt with is Boulevard in Hartford, CT[1]. Often abbreviated, so addresses like "1600 Blvd" are common. And of course there are streets where "boulevard" is the last word in the name, like usual, to add to confusion.
FWIW, I've used this API for some years. On the con side, it uses XML, but on the pro side, it's fast, reliable, and consistent. And free, at least for my volumes.
I can't find a public copy of the recent versions of BS7666. The 2006 version had zero instances of the word 'apostrophe' so not sure what they think they are referencing.
BS 7666: 2006 is based upon an International Standard ISO 19112 Spatial referencing by geographic identifiers.
“The AGI now has a membership portal and it may be that the page you require sits within that portal. As a member you will be able to reach the page by logging in. If you are not yet a member of the AGI and are interested in joining please visit”
If a standard isn’t publicly available under a free license it should not be called a standard.
While I agree with you, it's common that standards are not available without payment. For example, to get a copy of ISO 8601-1:2019 (as in, the date format standard!), it'll cost you $190.
That's just Part 4 (and 13 is available there as well). The rest remains hidden, sadly - I assume that there aren't that many people willing to get Aaron Swartz'd.
4.4.1 Street names
The designated street name is usually to be found on the name plate on the street.
However, these may not always be correct, and may differ between the ends of the street. Unofficial street names are ones that have not been adopted by the appropriate Highways Authority but may be in common usage, e.g. "The Great North Road".
Street names, whether designated or unofficial, should be recorded in full.
Abbreviations and punctuation should not be used unless they appear in the designated name, e.g. "Dr Newton's Way". Only single spaces should be used.
So I think all it's saying is that gazetteer editors should not add punctuation if it's “missing” from the designated place name.
Punctuation is fine if it is already part of the place name.
I think the intention is to preserve the original place name.
So the council is wrong to blame the specification.
Interesting! However the company which runs the official gazetteer has advice, and the reasoning is nonsense:
>GeoPlace does not advise that councils include or remove punctuation in official naming or on the street name plate. Street naming and numbering is a council policy decision.
>However, the Data Entry Conventions documentation does state that GeoPlace would prefer not receive data (including street names) with punctuation.
>This is for two main reasons:
> machine readability – punctuation can be misinterpreted by computers
> usability – for example, if loaded into say an emergency service command and control system and a caller provides a street name, the search will be faster if the search is entered and returned without punctuation.
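Both of those reasons can be addressed without touching the official name. A minimal sketch, assuming the system stores the designated name as-is and searches on a separate derived key; `search_key` and `StreetIndex` are hypothetical names of mine, not anything from GeoPlace's documentation.

```python
import unicodedata
from collections import defaultdict


def search_key(name: str) -> str:
    """Lower-case the name, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    kept = [ch for ch in decomposed if ch.isalnum() or ch.isspace()]
    return " ".join("".join(kept).lower().split())


class StreetIndex:
    """Keeps official names intact and matches queries on the derived key."""

    def __init__(self):
        self._by_key = defaultdict(list)

    def add(self, official_name: str) -> None:
        self._by_key[search_key(official_name)].append(official_name)

    def lookup(self, query: str) -> list:
        # May return several official names when their keys collide.
        return list(self._by_key.get(search_key(query), []))


index = StreetIndex()
for street in ["St. Mary's Road", "Kings Landing", "King's Landing"]:
    index.add(street)

print(index.lookup("st marys road"))
print(index.lookup("kings landing"))
```

A caller can type the name with or without punctuation and still get the official spelling back, and where two official names collapse to the same key the operator is shown both rather than silently getting the wrong one.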
I'm a big fan of the UK government's commitments to open standards and their whole IT philosophy in general.
That said local councils usually give me quite the opposite feeling. I will definitely look into your suggestion in the hope of seeing some trickle-down!
If you give responsibility for street signage to local councils then you'll inevitably find that one of the hundreds of councils eventually does something dumb like this. The technical solution for this specific problem is readily available but the political solution seems more interesting/complicated.
Airlines around the world, and many US-specific online forms, sternly refuse to accept that my surname is Hugh-Jones. So it becomes HughJones which makes me seem like a rogue robot.
So, no "O'Malley" or "O'Kelly" or "O'Brien" will ever be honored with a street name in North Yorkshire using their actual (Anglicized) name?
This isn't a new issue. Around 1990 one of the computer labs at my school was run by someone with an Irish surname starting "O'", and I remember him complaining about software which couldn't handle his name.
It's been 30 years, and there are still problems?!?
(To say nothing of "Madeleine L'Engle" or any of many others with an apostrophe in their name.)
In the UK, if you venture from a side street to a main road, the chances are that there won't be a sign to tell you what that main road is, unless you venture down that main road to where it meets another main road. This can be a considerable distance.
I am all for keeping in the apostrophes as they are mini 'flashcards' to help the youngsters learn the value of punctuation. I also think that it is out of respect for residents, if I was on 'St. Mary's Road' and I had to write 'st marys rd' then I would worry that people outside Yorkshire might think I was illiterate.
One day a UK county will do an excellent job of signs, so people always know where they are without SatNav. Remember that many signs were removed just in case the Germans arrived, and we couldn't have them finding their way around, could we?
North Yorkshire council could trial some best practice signage that involves having actual signs instead of making the punctuation vanish. They could get an unexpected tourism boost from doing so with mildly fewer cars on the roads.
> As far as I can tell, this isn't an issue with the specific database itself, but the standard they are required to record geographic data in, which the end of the article mentions as "BS 7666".
On the other hand, you’re naïve if you think English hasn’t already been simplified to fit on machinery such as typewriters and cheap printing presses. This process began long before computers.
There is a public recreation facility called Peter'?s Field in NYC.
It is named primarily for Peter Stuyvesant and Peter Cooper (NYC historical notables), secondarily for Peter Piper, Peter Parker, Peter Pan, Peter Peter Pumpkin Eater, Peter Rabbit, and Peter from "Peter and the Wolf".
Seems they use Peters Field and Peter's Field but not Peters' Field.
I don't really understand why they've decided to do this in 2024.
We all know that older systems had problems with encoding and escaping special characters, but wouldn't they have encountered and dealt with all the possible problems by now?
The implications are bizarre here. Only software/database impacted by this change is something custom for the city, anything more widely used must handle apostrophes etc anyways because other places have them too. And also this being a change now implies that whatever custom software in question is something new, because any old software must have been dealing with these street names with apostrophes.
So first question in my mind is what/why is this software that they are attempting to accommodate??? Or is this all just based on misinterpretation of the mentioned BS7666 and nobody thought to check it??
Next: due to an issue with primary key mapping, all cities in the US are now required to have a unique name, so all affected cities must work together to come up with new names. Start with the Springfields.
The digraphs ch (named che), ll (named elle or doble ele) have traditionally also been treated as letters of the alphabet, since 1803. However, in 1994, the tenth congress of the Association of Spanish Language Academies agreed to alphabetize ch and ll as ordinary pairs of letters in the dictionary by request of UNESCO and other international organizations, while keeping them as distinct letters for the alphabet and other purposes. In 2010 the Spanish Language Academies agreed that these two digraphs were not separate letters. Similarly, rr (named erre or ere) has sometimes been considered a separate letter but is no longer.
The other day (in 2024!!), I got a message saying my password needed to include a special character, with a helpful list of special characters that were forbidden.
Sure, but the computer is a tool designed to be malleable and adapt to the human. This kind of grunt work (adapting to humans' models of collation) is exactly the kind of task that should be handed off to machines.
Do you have pointers to this? As a Spaniard I recall that even though I was originally told the Spanish alphabet treated the LL as its own letter, it always felt quite inconsistent. And I always assumed its removal was more about simplifying things than having to do with computers.
There is a comment from "adolph" parallel to yours with footnotes. But I remember the event well (though it was 30 years ago) as the difficulty for programmers was given as the justification for the law (this was a change in collation).
This particularly irked me as unicode was a few years old at this point, and while not really adopted yet, was clearly the future.
Not really a problem but some buses and digital displays where I live in Austria skip umlauts and just use the 26 letter spelling ae rather than ä or strasse rather than Straße.
Just yesterday I generated a 64 char length mysql password, which had \ and ` and '.
I couldn't properly escape it, to pass via argv, so I had to truncate it and remove all those symbols.
So I thought, how can this problem be solved? IMHO by doing a hex representation 0x00-0xFF per char.
That would also increase entropy.
MySQL and other databases would need to support hex input of passwords, also setting of hex passwords via SQL.
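A minimal sketch of the two workarounds in Python, not tied to any particular MySQL client: quoting the password for a POSIX shell with `shlex.quote`, and hex-encoding it so only `[0-9a-f]` ever reaches the command line. The helper names (`to_hex`, `from_hex`) and the sample password are made up for illustration. Hex encoding is only a change of representation: it round-trips to exactly the same bytes.

```python
import shlex

def to_hex(password: str) -> str:
    """Encode the password's UTF-8 bytes as lowercase hex (two digits per byte)."""
    return password.encode("utf-8").hex()

def from_hex(hex_password: str) -> str:
    """Reverse of to_hex: decode hex digits back to the original string."""
    return bytes.fromhex(hex_password).decode("utf-8")

# Hypothetical password containing the characters that caused trouble.
tricky = "pa\\ss`wo'rd"

# Option 1: quote it for a POSIX shell; shlex.split mimics how the shell
# would parse the quoted word back into a single argument.
quoted = shlex.quote(tricky)
round_tripped = shlex.split(quoted)

# Option 2: hex-encode it so no shell metacharacters are present at all.
hexed = to_hex(tricky)
```

If the command is launched from Python, passing an argument list to `subprocess.run` without `shell=True` skips the shell entirely, so no quoting is needed at all.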
We keep bending reality around computer systems that are supposed to solve any problem (and proclaim that they do).
The Post Office scandal came down to an inability to add numbers: one tiny line can wreck the whole system, and yet we want to put your life into the hands of AI systems. What can go wrong?
The Danish Address Register (DAR) assigns a UUID to everything in the addressing system - street names, postal codes, building addresses... It's a nice official database to have in Denmark but I don't know of any products that actually store the UUIDs in the database.
I'm going to guess this is going to be one of those stories that ends with "and that person is an idiot". And "there was nobody with a clue present to tell them they're obviously an idiot".
These discussions always remind me of the old adage that capitalisation is important because it means the difference between helping your Uncle Jack off a horse…
They used to be made of cast iron, probably good for a couple of hundred years I’d imagine, unless smashed or stolen for scrap (sometimes this does happen). I wonder what they make them out of today that leads to needing frequent replacement.
Different issue. In that case, the vendor had given some guarantees of consistency of data across network nodes that the network didn't actually support. Because there were guarantees, the law went looking for horses instead of zebras, and the "horses" in this case were that only a few people had admin rights to mess with the transactions and the audit logs.
... but in reality, no human was messing with those; system bugs were dropping or duplicating data. The government should not have trusted claims of a third-party without independent auditing they controlled (and, ultimately, I think that's the takeaway that all governments should be taking from this disaster).
Not at all. If I understand correctly, the failure to synchronize was a fundamental flaw in the networking code and had nothing to do with the payloads inside the networking code.
and it will drop the value field, throw an error, and quite possibly duplicate several typed and half-filled forms, depending on how the error handling is done.
This article is basically admitting it's cheaper to change the street names than unfux their buggy software, so something is up. What are the other options that meet that criteria?
The software glitch that was involved in 1,000 people being arrested had nothing to do with street names.
By this point, the flaws are pretty well-documented. If you find anything in the reports about handling of apostrophes, feel free to cite it.
The underlying communication protocol from node to node wasn't even SQL; it was an XML format called "Riposte." There was, perhaps, SQL involved in eventual account database updating, but issues had occurred in message transit even before that phase, and it's those issues that led to account reconciliation errors and (incorrect) charges of fraud on the part of the subpostmasters.
As always, the problem is always the underlying issue, even though the surface one is pretty ridiculous. If the council doesn't understand how to sanitise database input, imagine just how bad they are at the stuff that's mildly difficult or worse. Do NOT give any sensitive data to them, whatever you do!
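For anyone wondering what "handling apostrophes" amounts to in practice, here is a minimal sketch using Python's built-in `sqlite3` with parameterised queries. The table name, column name, and street names are invented for the example, and nothing here is known about the council's actual stack.

```python
import sqlite3

def store_and_fetch(names):
    """Insert street names via placeholders and read them back unchanged."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE streets (name TEXT NOT NULL)")
    # The "?" placeholder sends each value as data, never as SQL text,
    # so apostrophes need no escaping and cannot alter the statement.
    conn.executemany(
        "INSERT INTO streets (name) VALUES (?)",
        [(name,) for name in names],
    )
    rows = conn.execute("SELECT name FROM streets ORDER BY rowid").fetchall()
    conn.close()
    return [row[0] for row in rows]

# Hypothetical inputs, including a classic injection attempt.
names = ["St Mary's Walk", "O'Brien Close", "Robert'); DROP TABLE streets;--"]
stored = store_and_fetch(names)
```

The names come back byte-for-byte as entered, punctuation included.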
The standard isn't the point. You can have a "search key" or "standards compliant name" column in a table, and also have a "sign name" column. Whoever came up with this plan was either a fool or likes annoying people (possibly those two things are the same thing).
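A rough sketch of that two-column idea in Python, assuming a punctuation-free, upper-case search key is what the downstream systems want. The function names and the normalisation rules are my own guesses, not anything taken from BS 7666 or GeoPlace.

```python
import re
import unicodedata

def search_key(sign_name: str) -> str:
    """Derive a punctuation-free key from the name as printed on the sign."""
    # Split accented letters into base letter + combining mark, then drop the marks.
    decomposed = unicodedata.normalize("NFKD", sign_name)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Remove straight and curly apostrophes outright: "Mary's" -> "Marys".
    no_apostrophes = re.sub("['\u2019]", "", no_marks)
    # Collapse every other run of non-alphanumerics into a single space.
    spaced = re.sub(r"[^0-9A-Za-z]+", " ", no_apostrophes)
    return spaced.strip().upper()

def build_index(sign_names):
    """Map each search key to the untouched sign name."""
    return {search_key(name): name for name in sign_names}

def lookup(index, query):
    """Normalise the caller's query the same way, then return the sign name."""
    return index.get(search_key(query))

# Hypothetical street records: the sign keeps its punctuation.
index = build_index(["St. Mary's Walk", "Dr Newton\u2019s Way", "O'Brien Close"])
```

Here `lookup(index, "st marys walk")` finds the record whether or not the caller typed the punctuation, and the sign still reads "St. Mary's Walk".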
And/or this is #145 on their to-do list, so after a good 15 months of postponing it, an overworked IT council guy told his supervisor "if you want X implemented without overhauling system Y just get rid of apostrophes" so this article happened.
Aside from making some signs grammatically incorrect, it means places/streets cannot be named after anyone with an apostrophe in their name without mangling their name. It's a bit ironic to honour people by failing to respect their name.
It's been common though not exclusive practice to not use apostrophes in street names for a very long time in the south of England, is Yorkshire just catching up?
https://www.agi.org.uk/wp-content/uploads/2020/11/BS7666Guid...
If I had to guess, alphanumeric is interpreted as [0-9a-z]. The sign printer probably expects this format when printing signs for the government, or worse, has a contract that says the government must provide this standard format for the sign information.
So it's just a government mandated database schema... I don't think that's any better of a reasoning though lol