This is misleading. UTF-16 doesn't actually provide random access, because codep...

fnord123 · on Aug 13, 2017

> random access by codepoint is generally a bad idea because codepoints and graphemes aren't synonymous.

That's a good point.

> If you need indices into UTF-8 strings, then you can record them by decoding the string.

Sure but the context is that I was responding to "If you use a rope or tree representation (like most editors do these days), random access is O(log n) at best, in many implementations typically O(sqrt n) or even O(n), whereas concatenation string is O(1).".

Decoding the string is O(n), not O(1). (That assumes GP meant "concatenation string" as a string whose text is concatenated into a single buffer. If GP meant "concatenating a string in a rope or tree is O(1)", then I'm off on a wild tangent.)

beagle3 · on Aug 13, 2017

But you only need to decode it once (or otherwise receive those indices without even decoding), whereas random access to a rope/tree is always O(log n) or O(n).
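A rough sketch of what "decode once, then index" could look like, assuming the buffer is valid UTF-8 held in one contiguous `bytes` object. The function names `codepoint_offsets` and `codepoint_at` are made up for illustration. The one-time scan is O(n); each lookup afterwards is O(1).

```python
def codepoint_offsets(data: bytes) -> list[int]:
    """One O(n) pass: byte offset where each codepoint starts.

    Assumes `data` is valid UTF-8. A byte starts a codepoint unless it is
    a continuation byte, i.e. unless its top two bits are 10.
    """
    return [i for i, b in enumerate(data) if (b & 0xC0) != 0x80]


def codepoint_at(data: bytes, offsets: list[int], k: int) -> str:
    """O(1) lookup of the k-th codepoint using the recorded offsets."""
    start = offsets[k]
    # The codepoint ends where the next one starts, or at the end of the buffer.
    end = offsets[k + 1] if k + 1 < len(offsets) else len(data)
    return data[start:end].decode("utf-8")


text = "naïve café 🙂".encode("utf-8")
offsets = codepoint_offsets(text)   # pay the O(n) cost once
third = codepoint_at(text, offsets, 2)
last = codepoint_at(text, offsets, len(offsets) - 1)
```

Note this indexes codepoints, not graphemes, so the earlier caveat still applies, and the offset table has to be rebuilt (or patched) whenever the buffer is edited.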

Use case is everything.

fnord123 · on Aug 15, 2017

> Use case is everything.

amen.