
You can print apostrophes on a street sign without any database issues because a street sign doesn't interface directly with a database. At least not yet...

All the technical issues here have already been solved a hundred times; there are plenty of other options. It's a little worrying that we're eliminating punctuation in real life because of issues with integrating with geographical databases.




Interestingly, the OpenStreetMap project considers road signage to be unquestionable "ground truth". If the sign changes, the map changes. We can't use any other database because that might not be available under a compatible licence.


Just because something is written on a road sign does not make it magically copyright or license free.

I know a business which has to pay a nominal $1 license fee for the names of its own stores to a map maker, simply because the business has lost any evidence that it had the names before the map maker put them in the map.


> Just because something is written on a road sign does not make it magically copyright or license free.

That's untrue; facts can't be copyrighted. Whether there's a road sign or not, the name of the road is not subject to copyright, but if there is a road sign, then it's really easy to prove that your claim about the road is an uncopyrightable fact.


Petty clarification: single facts can't be copyrighted, but collections of facts (e.g. sports scores) can:

https://en.wikipedia.org/wiki/Copyright_in_compilation


That doesn't apply here; no matter how much street name data is collected, no amount of the collection, including the entire thing, can be copyrighted in a way that would impose any restrictions on OpenStreetMap.

From your link, which is quite short:

> such copyright may exist when the materials in the compilation (or "collective work") are selected, coordinated, or arranged creatively such that a new work is produced. Copyright does not exist when content is compiled without creativity, such as in the production of a telephone directory.

So you'd have to ask yourself, "was any creativity, of any kind at all, required in order to call this street by its own name?" And the answer is even more obviously "no" than in the paradigm case in which the telephone directory can't be copyrighted because it consists of facts (involving no creativity) in an externally specified order (alphabetical) in a collection specified by an external rule ("everything is included").


On that page:

> facts are not copyrightable

...compilations are, but you're obtaining a copyright on the compilation, not the facts.


No they aren't, but there is such a thing as a database right which is separate from copyright. This is why OSM changed from a Creative Commons licence (a copyright licence) to ODbL (a database licence).


> but there is such a thing as a database right which is separate from copyright

That's purely an unforced error on the part of OpenStreetMap. They are incorporated in England, which recognizes a database right. But there is no reason for that. In general, there is no such thing as a database right.

And this isn't even relevant to londons_explore's point; he is stupidly arguing that even if the information is available on a road sign, the public cannot use it because it might be included in a protected work somewhere. That is obviously untrue; the availability on the road sign does in fact automatically mean that inclusion in a protected work is irrelevant.

If you see a sign giving you the name of a road, and then publish information to the effect that the road's name is what is printed on that sign, no database was involved at any point, and a database right cannot apply. All you have is a bare fact.


I haven’t checked but how do they handle streets if the signs have two different spellings? There was a street growing up that was spelled two different ways and I still don’t know what the right one was!


My favourite example is a small road in Killarney called The Hahah in English but in Irish written "An Fhaiche" in one place and "An Háhá" in another. OpenStreetMap seems to designate the first one as name:ga and the second one as alt_name:ga.
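A rough sketch of how a data consumer might read that kind of tagging. The tag values are copied from the comment above, not fetched from OpenStreetMap; the plain `name` value and the helper `names_for` are assumptions for illustration only.

```python
# Hypothetical tag set for the street described above. Values come from the
# comment, and the plain "name" value is an assumption.
tags = {
    "name": "The Hahah",
    "name:ga": "An Fhaiche",
    "alt_name:ga": "An Háhá",
}


def names_for(tags, lang):
    """Return (primary, alternates) for a language code.

    Falls back to the plain "name" tag when no language-specific name exists.
    """
    primary = tags.get(f"name:{lang}", tags.get("name"))
    alt = tags.get(f"alt_name:{lang}", "")
    # Multiple values in a single OSM tag are conventionally separated by ";".
    alternates = [a.strip() for a in alt.split(";") if a.strip()]
    return primary, alternates


if __name__ == "__main__":
    print(names_for(tags, "ga"))
    print(names_for(tags, "en"))
```

The point is only that both spellings can coexist in the data, so a renderer or search index can match either one.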


This is a recurring problem in Ireland due to nationwide local government incompetence in using the Irish language.

Centuries old Irish language place names get replaced with bastard gaelicised versions of their English names, and now you’ve a mishmash of signage all over the place. Often the new names are just an invention of the council that sort of sounds right.


This is very true, but there is also an issue of which Irish orthography to use, right? Place names are very conservative, but modern speakers would be more used to O'Donnell's dictionary's variants than to ones from Dinneen, basically. Paradoxically, English forms sometimes give hints as to the correct pronunciation.


This tagging is most likely the result of a single OSM editor's decision. Such decisions are quite often wrong, suboptimal, or out of line with the guidelines.

It is always better to consult the extensive OSM wiki to figure out how something should be tagged than to infer it from existing examples.

For names, here is the relevant guide: https://wiki.openstreetmap.org/wiki/Names


Naming is complicated. The OSM community has invented approaches to capture lots of nuances.

https://wiki.openstreetmap.org/wiki/Names


The rules are as open to change as the data I guess, but if there really isn't one obvious "proper" name I've seen things named with a slash like "name 1/name 2". Often there is specific tagging to overcome this, though. For example, in bilingual countries like Wales many roads have an English and a Welsh name. The language-specific names are uncontroversial, but the overall "name" (for the international map) is supposed to be in the "local language". As you can imagine this isn't always uncontroversial.


I don't know if it is still an issue, but Atlanta used to have problems with the in-house sign shop producing street signs with misspelled names. Apparently, "boulevard" in particular was difficult for the city to get correct, which was a problem since there's a major road simply named "Boulevard". Maybe it got corrected after one of the local news channels did a segment about it, but it does highlight the fact that road signs sometimes aren't authoritative.


You are talking about the country where accounting systems cannot add numbers...

And people believe anything that comes out of a computer regardless of thousands of signs that something is really not right.

They would rather prosecute others on a mass scale, without any doubt, than look at themselves.

(see the Post Office scandal)


This routinely happens to names. Even now you get systems which refuse to deal with O', let alone non-English characters.
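For what it's worth, handling these names is not hard on the storage side. A minimal sketch, assuming SQLite and a made-up `people` table (the function name `store_and_fetch` is also hypothetical): with parameterized queries the apostrophe is passed as data, not spliced into the SQL text, so no escaping or stripping is needed.

```python
import sqlite3


def store_and_fetch(names):
    """Insert names into an in-memory table and read them back unchanged."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (name TEXT)")
    # The "?" placeholder lets the driver bind each value separately from
    # the SQL statement, so quotes in the data cannot break the query.
    conn.executemany(
        "INSERT INTO people (name) VALUES (?)",
        [(n,) for n in names],
    )
    rows = conn.execute("SELECT name FROM people ORDER BY rowid").fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    for name in store_and_fetch(["O'Brien", "St Mary's Walk", "Ó Dónaill"]):
        print(name)
```

Systems that reject such names are usually building SQL by string concatenation or applying an overly narrow validation rule somewhere upstream.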


It seems like it is yet another example of software not written to requirements so now the requirements are adjusted to fit the software implementation.


Apparently, this is because of a standard they're required to conform to, not database software specifically:

https://news.ycombinator.com/item?id=40265929

> As far as I can tell, this isn't an issue with the specific database itself, but the standard they are required to record geographic data in, which the end of the article mentions as "BS 7666".

On the other hand, you’re naïve if you think English hasn’t already been simplified to fit on machinery such as typewriters and cheap printing presses. This process began long before computers.


The Linotype machine supported fl, ff, ffi, ℔, œ, and æ. See https://upload.wikimedia.org/wikipedia/commons/4/46/Linotype... linked to from https://en.wikipedia.org/wiki/Linotype_machine. Or see https://archive.org/details/LinotypeKeyboardPractice/page/n9... .

Even cheap printing presses could handle more than you think. Here's a type case layout from 1846, again with æ, œ, fl, ff, ffi, fi, and ffl. https://archive.org/details/printingapparatu00holtrich/page/... .

That book is for the "Parlour printing press [which] was invented by Mr. Cowper for the amusement and education of youth, by enabling them to print any little subject they had previously written, provided the printing did not exceed in size the dimensions of an ordinary duodecimo page, which measures about 5 inches by 3 inches."


Did it support þ? That would be the best-known example of written English being changed to accommodate a printing press rather than the other way around.


No. But that doesn't really fit since þ had mostly died out by the 1300s, with only a few remaining uses by Caxton's printing - long before the 1800s.

I also wouldn't call those typewriters or cheap printing presses.


Is there something relevant about the 1800s? I said þ was an example of the written language changing to accommodate the printing press, not changing to accommodate the 1800s.


I had responded to msla's comment about "typewriters and cheap printing presses". Those didn't exist in the 1500s.

þ's disappearance from English is not due to either, though the lack of þ in available type faces was certainly an issue.


But the reference to "cheap printing presses" has to be interpreted as actually being a reference to type, as you have already done with your archive.org link. The press itself cannot either support or fail to support any graphical forms; there's no difference in the press whether you're using type or block printing.

Press quality issues are things like "are the surfaces flat" / "how much pressure can the press apply" / "how many pages can we press before running into a mechanical issue".


I disagree. It does not have to be that. I pointed to a printing press designed for kids. That should be much cheaper than one used to print larger sizes or many copies, much less the quality of the Bible in the 1500s.

I do get your point, but I think it's still wrong to use "cheap" to refer to the typefaces available for the Caxton and KJV Bibles. I suspect they were quite expensive.

The physical press is only part of the printing cost. The Linotype typesetting machine made it possible to set an entire line of type, drawing from 90 or so characters. While the press itself didn't care, adding new symbols required manual effort, making it more expensive.


> but I think it's still wrong to use "cheap" to refer to the typefaces available for the Caxton and KJV Bibles. I suspect they were quite expensive.

I don't think this argument quite works; something can be stunningly expensive, in an absolute sense, at the same time that it's the shoddy low-price option people choose for budget reasons.


(Grrr! I've been saying Caxton Bible but I meant Tyndale Bible.)

Sure, it can be, but was it?

The KJV was very expensive as it was. ("Robert Barker invested very large sums in printing the new edition, and consequently ran into serious debt" says https://en.wikipedia.org/wiki/King_James_Version .) That doesn't mean they used a shoddy low-price option.


With the printing press it makes sense: there are physical constraints that make things very difficult if you want to support a huge number of different characters. Those limitations don't exist anymore though.

On one hand, we have systems supporting Unicode and just about every character imaginable; it's an example of using technology to go beyond the limitations of the past. You could have a system that supports emojis on street signs, but instead we're going in the opposite direction and introducing artificial limitations that are even more restrictive than 500-year-old technology.



