It's interesting how history seems to have repeated itself with UTF-16. With ASC...

mark-r · on June 20, 2016

It's worse. With UTF-8, if you're not processing it properly, it becomes obvious very quickly: the first accented character you encounter will break. With UTF-16 you probably won't notice any bugs until someone throws an emoji (or anything else outside the Basic Multilingual Plane) at you.
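To make that concrete, here is a minimal Python sketch. The helper name `utf16_code_units` is made up for illustration. It shows that an accented letter already takes two bytes in UTF-8, while in UTF-16 it still fits in one code unit; only a character outside the BMP needs a surrogate pair.

```python
import struct

def utf16_code_units(s: str) -> list:
    """Return the UTF-16 code units of s as integers (hypothetical helper)."""
    data = s.encode("utf-16-le")          # little-endian, no BOM
    return list(struct.unpack("<%dH" % (len(data) // 2), data))

accented = "\u00e9"      # e with acute accent, inside the BMP
emoji = "\U0001F600"     # grinning face, outside the BMP

# UTF-8: code that assumes one byte per character breaks on the accent.
print(len(accented.encode("utf-8")))                # 2

# UTF-16: code that assumes one unit per character still works here...
print([hex(u) for u in utf16_code_units(accented)]) # ['0xe9']

# ...and only breaks once a surrogate pair shows up.
print([hex(u) for u in utf16_code_units(emoji)])    # ['0xd83d', '0xde00']
```

So a test suite full of Western European text will flush out the UTF-8 bug but sail straight past the UTF-16 one.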

ridiculous_fish · on June 20, 2016

Unfortunately not. It's easy to process UTF-8 such that you mishandle certain ill-formed sequences that you are unlikely to encounter accidentally. IIS was hit [1], Apache Tomcat was hit [2], PHP was hit twice [3] [4].

UTF-16 has its own warts, but invalid code units and non-shortest forms are exclusive to UTF-8.

[1] http://www.sans.org/security-resources/malwarefaq/wnt-unicod...

[2] http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-2938

[3] https://www.cvedetails.com/cve/CVE-2009-5016/

[4] https://www.cvedetails.com/cve/CVE-2010-3870/
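For anyone unfamiliar with non-shortest forms, here is a rough Python sketch of the failure mode. Both decoder functions are hypothetical and handle only two-byte sequences; they are not taken from any of the affected products. The bytes C0 AF are a non-shortest (overlong) encoding of "/", which a lenient decoder will happily turn into a slash after any earlier check on the raw bytes has already passed.

```python
def naive_decode_two_byte(b: bytes) -> str:
    """Lenient decoder (illustration only): merges the payload bits
    without checking that the result needed two bytes at all."""
    lead, cont = b[0], b[1]
    if (lead & 0xE0) != 0xC0 or (cont & 0xC0) != 0x80:
        raise ValueError("not a two-byte sequence")
    return chr(((lead & 0x1F) << 6) | (cont & 0x3F))

def strict_decode_two_byte(b: bytes) -> str:
    """Same decoder, plus the shortest-form check."""
    ch = naive_decode_two_byte(b)
    if ord(ch) < 0x80:
        # Code points below U+0080 must be encoded as a single byte.
        raise ValueError("non-shortest form")
    return ch

def python_rejects(b: bytes) -> bool:
    """True if Python's built-in UTF-8 decoder refuses the input."""
    try:
        b.decode("utf-8")
        return False
    except UnicodeDecodeError:
        return True

overlong_slash = b"\xc0\xaf"

print(naive_decode_two_byte(overlong_slash))   # /
print(python_rejects(overlong_slash))          # True
```

A filter that scans the raw bytes for "/" or ".." sees nothing suspicious in C0 AF, so if the lenient decode happens after the filter, the slash gets through. That is not something you would trip over with ordinary text, which is why it tends to surface as a security bug rather than a visible glitch.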