
I'm pretty sure the reason only some of the currency symbols come out wrong has to do with the database.

If you think about it, the item names are most likely coming from a database that might not be using the right encoding (latin1 is still the default in MySQL, I think). The symbols that do work are probably hard-coded into the receipt's template, and hence don't have this problem.

Why a shop owner would store the price and currency symbol in an item's description is beyond me, but having worked in the POS world and seeing what shop owners do with their items I'd definitely believe it.
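
To illustrate the kind of mismatch described above, here is a minimal Python sketch. It assumes the item text was written as UTF-8 bytes and then read back as cp1252; the helper name `mojibake` and the sample strings are made up for illustration and are not taken from any real POS system.

```python
def mojibake(text: str, stored_as: str = "utf-8", read_as: str = "cp1252") -> str:
    """Encode text one way and decode it another, as a mismatched
    connection or column charset might."""
    return text.encode(stored_as).decode(read_as)


# Hypothetical item descriptions containing currency symbols.
print(mojibake("£2.50"))  # Â£2.50
print(mojibake("€5.00"))  # â‚¬5.00

# Plain ASCII survives unchanged, which is why only some text looks wrong.
print(mojibake("Coffee 2.50"))  # Coffee 2.50
```

A symbol baked into the template never makes this round trip, so it would print correctly even when the database text does not.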




Note that the encoding that MySQL calls "latin1" (and uses as its default) is not, in fact, latin1. It is windows cp1252 except with 8 random characters swapped around. I wish I was joking.


Haha, and what mysql calls "utf8" is not, in fact, all of utf8. That's called "utf8mb4".
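
In practice this means MySQL's "utf8" stores at most three bytes per character, so anything above U+FFFF (emoji, for example) needs "utf8mb4". A rough Python check, where `needs_utf8mb4` is a hypothetical helper name and not part of any MySQL client library:

```python
def needs_utf8mb4(text: str) -> bool:
    """True if any character lies above U+FFFF and therefore
    takes four bytes in UTF-8."""
    return any(ord(ch) > 0xFFFF for ch in text)


for sample in ["$", "€", "😀"]:
    byte_count = len(sample.encode("utf-8"))
    print(repr(sample), byte_count, needs_utf8mb4(sample))
# '$' 1 False
# '€' 3 False
# '😀' 4 True
```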


It also can't sort unicode correctly according to the standard UCA algorithm. The ticket for this is closed as a wontfix.


Jesus Christ.

Is there a reason for this? Or is it just another case of lol@mysql?


So in 2017, what's the correct character set to use in MySQL?


"utf8mb4", though really in 2017 the correct thing is to use Postgres.


but not everyone can use postgres


Anyone can if they want to enough. Some don't want it enough, or want other things more, but not using postgres is always a choice.


Which ones? I can't find info on that.


https://dev.mysql.com/doc/refman/5.7/en/charset-we-sets.html

Looks like I misremembered - 5 rather than 8 characters. But it isn't standard cp1252 and this can matter.
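
For reference, the five bytes can be found without MySQL at all. This is a small sketch that relies on Python's own cp1252 codec, which as far as I can tell leaves the same five positions unassigned; `undefined_bytes` is just an illustrative helper name.

```python
def undefined_bytes(codec: str) -> list[int]:
    """Return every single-byte value the codec refuses to decode."""
    missing = []
    for value in range(256):
        try:
            bytes([value]).decode(codec)
        except UnicodeDecodeError:
            missing.append(value)
    return missing


print([hex(b) for b in undefined_bytes("cp1252")])
# ['0x81', '0x8d', '0x8f', '0x90', '0x9d']

# ISO 8859-1 (what Python calls latin-1) maps all 256 bytes.
print(undefined_bytes("latin-1"))
# []
```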


That's just describing how it will handle erroneous data. If you give it cp1252 text, it will work exactly as expected. If you give it certain invalid characters, it will treat it as those code points.


There's nothing erroneous about U+0081. MySQL's encoding functions are documented to behave in particular ways when a given character cannot be represented in a given encoding, and its handling of "latin1" violates that documentation unless you take into account the nonstandard extra mappings MySQL uses.


cp1252 does not have a character assigned to 0x81. If you store 0x81 in the database as 'latin1', then it needs to error or do something weird. If you store U+0081 in the database, that's a control character in the C1 block that doesn't exist in latin1, so it needs to error or do something weird.

If it violates the documentation about invalid characters, that's a problem, but that's not latin1 being incompatible with 1252.
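
To make the disagreement concrete, here is a sketch that emulates the mapping as the linked MySQL page describes it: cp1252 for everything it defines, with each of the five unassigned bytes passed through to the code point of the same number. This is an emulation in Python under that assumption, not MySQL's actual code, and `decode_like_mysql_latin1` is a made-up name.

```python
# Bytes that standard cp1252 leaves unassigned.
CP1252_UNDEFINED = frozenset({0x81, 0x8D, 0x8F, 0x90, 0x9D})


def decode_like_mysql_latin1(data: bytes) -> str:
    """Decode as cp1252, but map the five unassigned bytes to the
    code point with the same number (assumed MySQL behaviour)."""
    chars = []
    for value in data:
        if value in CP1252_UNDEFINED:
            chars.append(chr(value))  # e.g. 0x81 -> U+0081
        else:
            chars.append(bytes([value]).decode("cp1252"))
    return "".join(chars)


raw = b"\x80\x81"

# Strict cp1252 rejects 0x81 outright.
try:
    raw.decode("cp1252")
except UnicodeDecodeError as exc:
    print("strict cp1252 fails:", exc.reason)

# The lenient variant yields the euro sign followed by U+0081.
print([hex(ord(ch)) for ch in decode_like_mysql_latin1(raw)])
# ['0x20ac', '0x81']
```

Valid cp1252 input decodes identically either way; the two only differ on those five bytes, which is the whole argument here.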


If mysql handled columns declared as "ascii" as utf8 that would be "compatible" in the sense you're describing. I think it would be fair to say "mysql ascii isn't actually ascii" in that case though.


I see your point, but in that case I certainly wouldn't say it mishandles ascii. Defining some undefined behavior is very different from changing existing behavior.

Plus that's a different scale of change because it's going from fixed width 7-bit to a variable width scheme.


> Why a shop owner would store the price and currency symbol in an item's description is beyond me

If you've worked in the POS world you know exactly why: bad software. It either doesn't support the use case the owner has or isn't easy to use.


Shop owners (almost) invariably just bash their price lists into a spreadsheet or a Word file. A lot of this can be reasonably blamed on their only tools being hammers, so to speak.


> Why a shop owner would store the price and currency symbol in an item's description is beyond me...

Have you never watched Pulp Fiction?

https://www.youtube.com/watch?v=zoJAc_aSM7E

This video will explain exactly why a vendor might put a price in the description field.


Indeed, latin1 is currently the default for MySQL. But it changes to utf8 in MySQL 8.0 (in development).


Hopefully to utf8mb4 instead of the broken one ...




