
> Shouldn't rev(1) reverse graphemes instead of code points?

I honestly don't know. Is the intended purpose of this program to reverse bytes, or reverse characters, or reverse grapheme clusters? Or extended grapheme clusters?

There's no spec - this has never been in POSIX. What is your expected behavior? Is it mine?

For what it's worth, I needed rev recently, but forgot that it existed and did this:

    perl -ne 'chomp; print scalar reverse . "\n"'
If I need it to handle UTF-8 in a certain way, I can use pragmas to change its behavior. (I'm pretty sure that this, as it is, will ignore the surrounding locale.)



UTF-8 can also be handled in several ways. There is a lot of middle ground between software that handles bytes and typesetting software that is fully Unicode-aware. The small Unix utilities fall somewhere in there.


> UTF-8 can also be handled in several ways.

UTF-8 cannot be handled in several ways without breaking it; it's a pretty straightforward and strict encoding.


What would you expect the output to be when the input is:

    nôn
    nôn
Sending that through rev (with a UTF-8 locale), I get

    n̂on
    nôn
By the way, did you know that perl's -l flag removes newlines on input and adds them back on print? Your command could just be:

  perl -lne 'print scalar reverse'
And, for a Unicode-aware version:

  perl -CDS -lne 'print scalar reverse'
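
To make the bytes vs. code points vs. grapheme clusters question concrete, here is a rough Python sketch of the three kinds of "reverse" applied to the nôn input above. The function names are mine, and the cluster version is only an approximation: it glues combining marks (nonzero canonical combining class) onto the preceding character, which is not the full extended grapheme cluster algorithm from UAX #29 and will miss things like emoji ZWJ sequences and Hangul jamo.

```python
import unicodedata


def reverse_bytes(s: str) -> bytes:
    """Reverse the raw UTF-8 bytes; the result may not be valid UTF-8."""
    return s.encode("utf-8")[::-1]


def reverse_code_points(s: str) -> str:
    """Reverse code points; a combining mark ends up before its old base."""
    return s[::-1]


def reverse_clusters(s: str) -> str:
    """Reverse base+combining-mark groups (approximate grapheme clusters)."""
    clusters = []
    for ch in s:
        # A nonzero combining class means "attach to the previous character".
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return "".join(reversed(clusters))


decomposed = "no\u0302n"   # n, o, COMBINING CIRCUMFLEX ACCENT, n
precomposed = "n\u00f4n"   # n, LATIN SMALL LETTER O WITH CIRCUMFLEX, n

if __name__ == "__main__":
    for label, text in (("decomposed", decomposed), ("precomposed", precomposed)):
        print(label)
        print("  bytes:      ", reverse_bytes(text))
        print("  code points:", reverse_code_points(text))
        print("  clusters:   ", reverse_clusters(text))
```

Reversing code points moves the circumflex from the o onto the n for the decomposed input, while the cluster version leaves both inputs reading nôn. Reversing bytes puts the two bytes of the precomposed ô in the wrong order, so a strict UTF-8 decoder rejects the result.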



