
The BOM is optional in UTF-16 too:

  The UTF-16 encoding scheme may or may not begin with a BOM.
  However, when there is no BOM, and in the absence of a 
  higher-level protocol, the byte order of the UTF-16 
  encoding scheme is big-endian.
(D98 in http://www.unicode.org/versions/Unicode6.1.0/ch03.pdf)



Oh jeez, that's just begging to get mangled by a transition between protocols, e.g., HTTP PUT followed by rsync (to a box which doesn't know the PUT was UTF-16LE).
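To make that failure concrete, here is a minimal Python sketch of the D98 rule quoted above, and of what happens when UTF-16LE bytes arrive with no BOM and no record of the byte order. The helper name `decode_utf16_scheme` is made up for illustration. It deliberately avoids Python's generic "utf-16" codec for the no-BOM case, since as far as I know that codec falls back to the platform's native byte order rather than big-endian.

```python
import codecs


def decode_utf16_scheme(data: bytes) -> str:
    """Decode bytes labeled plain 'UTF-16' per the rule quoted above.

    Hypothetical helper: honor a BOM if one is present; otherwise,
    with no higher-level protocol to consult, assume big-endian.
    """
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[2:].decode("utf-16-be")
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[2:].decode("utf-16-le")
    # No BOM and nothing else to go on: big-endian by default.
    return data.decode("utf-16-be")


# The uploader wrote UTF-16LE without a BOM; the byte order was only
# known to the first protocol and got lost on the second hop.
le_no_bom = "hi".encode("utf-16-le")
mangled = decode_utf16_scheme(le_no_bom)

# The same text with a BOM survives the hop intact.
le_with_bom = codecs.BOM_UTF16_LE + le_no_bom
intact = decode_utf16_scheme(le_with_bom)

print(repr(mangled), repr(intact))
```

The mangled case raises no error: every byte pair still decodes to a valid code point, so the receiving box silently ends up with the wrong text.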



