Hacker News new | past | comments | ask | show | jobs | submit login

MS popularized the idea of adding the UTF-16 BOM into UTF-8 to distinguish between UTF-8 text files and Windows code page files, or what they called "Unicode" and "ANSI." There's (nearly?) unanimous agreement among everyone else that BOMs in UTF-8 text are really stupid.

Note that the "BOM" in this case means storing the U+FEFF character in UTF-8 form (just as UTF-16 stores it in the appropriate endianness). This means that the result would be EF BB BF.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: