> *in cases where you know that the string contains only characters from some si...

mjw1007 · on June 26, 2022

That's true, but in most of those cases you don't need to be able to use numeric character-count-based indexes into the string (which is what the article is arguing that you don't need).

You'd typically be happy if the parsing function that you're using to find the location of (say) each comma in the string gives you an opaque token for each such location, with a way to use those tokens to get slices of the string back.

So in practice we can use byte offsets into a UTF-8 string as those tokens, and the programmer doesn't really have to care what they actually are.
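A minimal sketch of what I mean, in Python, using `bytes` to stand in for a UTF-8 string. The names `find_commas` and `fields` are made up for illustration, not from any library. It relies on the fact that the comma byte (0x2C) never appears inside a multi-byte UTF-8 sequence, so searching the raw bytes is safe.

```python
def find_commas(utf8: bytes) -> list:
    """Return one token per comma. The tokens happen to be byte offsets,
    but callers are meant to treat them as opaque."""
    tokens = []
    pos = utf8.find(b",")
    while pos != -1:
        tokens.append(pos)
        pos = utf8.find(b",", pos + 1)
    return tokens


def fields(utf8: bytes) -> list:
    """Use the tokens to slice the string into comma-separated fields."""
    out = []
    start = 0
    for token in find_commas(utf8):
        out.append(utf8[start:token].decode("utf-8"))
        start = token + 1  # a comma is exactly one byte in UTF-8
    out.append(utf8[start:].decode("utf-8"))
    return out


text = "naïve,日本語,🙂 ok".encode("utf-8")
print(find_commas(text))  # [6, 16]
print(fields(text))       # ['naïve', '日本語', '🙂 ok']
```

Note that the tokens (6 and 16) don't match the character-count positions of the commas (5 and 9), and nothing in `fields` needs them to.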