
> The standard needed to specify those points

Yes, and the problem is what those points were specified as.

> The calling conventions, the binary format and UTF-16 are widely supported across platforms and known.

The calling conventions are W64's, which are by far the least common across platforms (especially since the System V AMD64 ABI already defines the calling conventions virtually every other platform uses).

> UTF-16

Is a shit encoding and there is no justification for such a technical low-level API to use it[0].

> UTF-16 is used in different places as well, notably in Java.

Not a justification (unless you also assert that most UEFI applications will be written in Java). It's explained by Java (and Windows NT) having standardized their Unicode implementations between Unicode 1.0 and Unicode 2.0, 20 years ago. UEFI was not created 20 years ago; that it bears the trace of that event is a disgrace.

Plus, technically Java does not use UTF-16, it's UCS2. Support for astral-plane codepoints and APIs taking surrogate pairs into account were only introduced in JDK 1.5, separately from the old UCS2 APIs (including char, which remains a code unit).
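To make the char-is-a-code-unit point concrete, here's a quick sketch using the codepoint APIs that arrived in JDK 1.5:

```java
public class CodeUnits {
    public static void main(String[] args) {
        // U+1F600 is an astral-plane codepoint, so in Java source it has to be
        // spelled as a surrogate pair of two chars
        String s = "\uD83D\uDE00";
        System.out.println(s.length());                      // 2: length() counts char code units
        System.out.println(s.codePointCount(0, s.length())); // 1: one actual codepoint (JDK 1.5+ API)
        System.out.println((int) s.charAt(0));               // 55357 (0xD83D): charAt() hands you a lone surrogate
    }
}
```

The old char-based APIs and the new int-codepoint APIs coexist on String to this day, which is exactly the UCS2/UTF-16 duality being complained about.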

> It also has the major advantage of being a fixed width encoding over UTF8.

You're high as a kite[2]: UTF-16 is no more fixed-width than UTF-8, and UTF-16 implementations are far more commonly broken than UTF-8 ones, as 1. astral-plane support is rarely exercised and 2. they have to contend with the UCS2/UTF16 duality.
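A two-liner makes the variable width obvious (Java here only because it's already on the table; any language with explicit encodings shows the same):

```java
import java.nio.charset.StandardCharsets;

public class VariableWidth {
    public static void main(String[] args) {
        // A BMP codepoint is one UTF-16 code unit; an astral codepoint is two (a surrogate pair)
        byte[] bmp = "A".getBytes(StandardCharsets.UTF_16BE);
        byte[] astral = new String(Character.toChars(0x1F600)).getBytes(StandardCharsets.UTF_16BE);
        System.out.println(bmp.length);    // 2 bytes: one code unit
        System.out.println(astral.length); // 4 bytes: two code units
    }
}
```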

> All in all: It's probable that Microsoft influenced the standard here, but there's also sufficient technical reasons for those choices.

I assert that there is no technical justification whatsoever for the choices of calling conventions and encoding beyond "it's how Windows works and we couldn't be arsed to do things correctly".

[0] the only potentially acceptable justification for UTF-16 is low-markup Asian scripts[1], with a vast majority of codepoints living in the upper segment of the BMP (U+0800 to U+FFFF) and thus encoded as a single UTF-16 code unit but as 3 UTF-8 code units. This is pretty much irrelevant to UEFI.

[1] mostly; some African and American scripts also live there

[2] no, seriously: personal attacks may be bad, but when you assert UTF-16 is a fixed-width encoding — even ignoring that Unicode makes the concept laughably useless — you're in a separate plane of reality from just about all of humanity.
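The size trade-off described in [0] is easy to check directly; a quick sketch (again in Java, for continuity):

```java
import java.nio.charset.StandardCharsets;

public class CjkSizes {
    public static void main(String[] args) {
        // "你好": two codepoints in the U+0800 to U+FFFF range of the BMP
        String cjk = "\u4F60\u597D";
        System.out.println(cjk.getBytes(StandardCharsets.UTF_8).length);    // 6: three bytes per codepoint
        System.out.println(cjk.getBytes(StandardCharsets.UTF_16BE).length); // 4: two bytes per codepoint
    }
}
```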




> Plus technically Java does not use UTF-16, it's UCS2.

Well then I've got good news! EFI also means UCS2 when it says UTF-16 in the spec.


... I've come to assume that anybody who says they're doing UTF-16 actually means UCS2 with the wrong endianness and just wanted the easiest way to support Chinese characters.
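For what it's worth, the endianness bit is trivial to demonstrate: the same codepoint serializes to reversed byte pairs depending on byte order (Java again, purely as a sketch):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ByteOrders {
    public static void main(String[] args) {
        // U+0041 serialized little-endian vs big-endian
        System.out.println(Arrays.toString("A".getBytes(StandardCharsets.UTF_16LE))); // [65, 0]
        System.out.println(Arrays.toString("A".getBytes(StandardCharsets.UTF_16BE))); // [0, 65]
    }
}
```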


Well I guess if it's going to be garbage anyway you might as well go whole-hog and ensure nothing is salvageable.



