masklinn on Sept 30, 2016 | on: Why and how you ought to keep multibyte character ...

> then the textual output would actually be different, wouldn't it?

Yes. The provided algorithm doesn't break UTF-8, but it will break the encoded text, as it's not Unicode-aware. I'm pretty sure it'll also let through invalid UTF-8 (lone surrogates or non-shortest encodings).
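
To make the two failure modes concrete, here is a rough Python sketch. It is not the algorithm from the article: the helper `is_strict_utf8` and the sample byte strings are my own illustration, and code-point reversal merely stands in for any operation that works on code points without being Unicode-aware.

```python
import unicodedata


def is_strict_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if Python's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Invalid UTF-8 that a lenient byte-level scanner could let through.
overlong_slash = b"\xc0\xaf"         # non-shortest (2-byte) encoding of U+002F '/'
encoded_surrogate = b"\xed\xa0\x80"  # UTF-8-style encoding of the surrogate U+D800

# Valid UTF-8, broken text: 'e' + COMBINING ACUTE ACCENT, then 'a'.
text = "e\u0301a"
# Reversing code points stands in for a non-Unicode-aware operation:
# the accent ends up attached to the 'a' instead of the 'e'.
reversed_text = text[::-1]

print(is_strict_utf8(overlong_slash))                  # strict decoder rejects it
print(is_strict_utf8(encoded_surrogate))               # rejected as well
print(is_strict_utf8(reversed_text.encode("utf-8")))   # still well-formed UTF-8
print(unicodedata.normalize("NFC", text))
print(unicodedata.normalize("NFC", reversed_text))
```

The last two lines are the point: both strings encode to well-formed UTF-8, yet the reversed one carries its accent on a different letter, so the text itself has changed.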