
Alt-255 was/is ascii null if I remember correctly



No, Alt-255 corresponds to ASCII character 255, which is a non-breaking space character.


Alt-255 corresponds to character 255 in code page 437, which is ASCII 160.


Unicode U+00A0 (decimal code point 160). ASCII only defines 0..127.


No, ASCII stops at 127. Everything above that is extended ASCII specific to a code page. As you say, DOS's code page is 437 [0], and character 255 is a non-breaking space on code page 437.

The problem you're having is that you're using Windows, which uses either Windows-1252 or UTF-16-LE (Windows Unicode). In both of those encodings, non-breaking space is 0xA0 (160). Windows is converting extended ASCII 255 from code page 437 to either Windows-1252's [1] non-breaking space, which is 0xA0 or 160, or Unicode's [2] non-breaking space, U+00A0, which UTF-16-LE stores as the code unit 0x00A0. The character is silently translated.
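To make that translation concrete, here is a minimal Python sketch. It assumes Python's built-in cp437, cp1252 and utf-16-le codecs use the same mappings Windows does for this character; the variable names are just for illustration.

```python
# Byte 255, as stored by a program running under code page 437.
oem_byte = bytes([255])

# Decoding it as cp437 yields the no-break space character.
char = oem_byte.decode("cp437")
codepoint = ord(char)  # 160, i.e. U+00A0

# Re-encoding the same character in the encodings Windows prefers.
as_cp1252 = char.encode("cp1252")       # single byte 0xA0
as_utf16le = char.encode("utf-16-le")   # code unit 0x00A0, little-endian

print(f"U+{codepoint:04X}", as_cp1252.hex(), as_utf16le.hex())
```

The character stays the same throughout; only the byte that represents it changes from 255 to 160.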

However, even on Windows 10 you can still get to a place where you're using the original code page of 437.

Fire up cmd.exe. Run "chcp" and it should tell you that the active code page is 437. Run "copy con C:\text.txt" to create a new file from the console input. Type <Alt+255>. Press Enter. Hit F6 or Ctrl+Z and hit enter to finish the file. Now type "powershell.exe -Command "[io.file]::ReadAllBytes('C:\text.txt')"". Your output will be:

  255
  13
  10
That's code page 437's extended ASCII non-breaking space followed by carriage return and line feed.

Now run "type C:\text.txt". You'll get blank output lines.

Now run "powershell -Command "Get-Content C:\text.txt"". Your output will be "ÿ", which is Windows-1252 or Unicode character 255 or 0xFF. Even if you try "Get-Content C:\text.txt -Encoding ASCII" you won't get the same output as you do from cmd.exe, because PowerShell's ASCII encoding is actually code page 20127 (7-bit ASCII), not code page 437.
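The same three bytes can be decoded each way outside of Windows. This is a rough Python sketch, assuming Python's codecs are a fair stand-in for what cmd.exe and PowerShell do. One known difference: Python's strict ascii codec raises an error on byte 255, whereas PowerShell's ASCII encoding substitutes a replacement character instead.

```python
# The bytes that ReadAllBytes reported for the file.
raw = bytes([255, 13, 10])

# cmd.exe's view under code page 437: no-break space, CR, LF.
as_oem = raw.decode("cp437")

# Windows-1252 view: byte 255 is "ÿ" (U+00FF).
as_ansi = raw.decode("cp1252")

# 7-bit ASCII has no character 255 at all.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(repr(as_oem), repr(as_ansi), ascii_ok)
```

None of the three readings is wrong about the bytes; they just disagree about which code page the file was written in.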

Now try to run "powershell.exe -Command "[int][char]'<Alt+255>'"". You'll get 160.

That's also why you can fire up PowerShell and type this:

  '<Alt+255>' -eq '<Alt+0160>'
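And the result is true. (Note: Alt codes with a leading zero are looked up in the Windows ANSI code page, typically Windows-1252, rather than the OEM code page; for 160 through 255 that matches the Unicode code point.) Windows is translating the character from the alt code for you in the background. You have to use a program which doesn't try to do that for you.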
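A simplified model of that lookup, as a Python sketch. The helper alt_code_to_char is hypothetical, not a Windows API. It assumes the OEM code page is 437, that leading-zero codes go through Windows-1252, and it only handles codes 128 through 255.

```python
def alt_code_to_char(keys: str, oem: str = "cp437", ansi: str = "cp1252") -> str:
    """Hypothetical model of Alt codes: map typed digits to a character."""
    value = int(keys)
    if not 128 <= value <= 255:
        raise ValueError("this sketch only models codes 128..255")
    # A leading zero selects the ANSI code page; otherwise use the OEM one.
    codec = ansi if keys.startswith("0") else oem
    # Note: a few cp1252 bytes (e.g. 129) are unassigned and will raise
    # UnicodeDecodeError here.
    return bytes([value]).decode(codec)


same = alt_code_to_char("255") == alt_code_to_char("0160")
print(same, hex(ord(alt_code_to_char("255"))), alt_code_to_char("0255"))
```

Under those assumptions, Alt+255 and Alt+0160 land on the same character, while Alt+0255 gives "ÿ".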

[0]: https://en.wikipedia.org/wiki/Code_page_437

[1]: https://en.wikipedia.org/wiki/Windows-1252

[2]: https://www.fileformat.info/info/unicode/char/00a0/index.htm



