
Last time I checked you could create a directory with a name "like:this" from Linux or something, and Windows would display it but not allow you to access it in any way.

Have they fixed this? (Just pure curiosity)




Interestingly, when you create a file name from the WSL shell using reserved Windows characters, those characters get mapped to Unicode code points from the private use area when viewed in Windows. So, for example, a colon (\u003A) will show up in Windows as \uF03A.

This means you can create a filename in Windows using the character \uF03A, and that character will show up in the WSL shell as a colon. You can even do the same thing with "regular" characters, e.g. using \uF061 instead of "a", and produce a filename that appears to be ASCII in the WSL shell, but is not actually accessible.
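
A minimal sketch of that mapping, assuming it is a plain offset of 0xF000 as the colon example suggests. The `RESERVED` set, the function names, and the range accepted on the reverse path are illustrative assumptions, not WSL's actual implementation:

```python
# Characters Win32 rejects in file names. Treating exactly this set as
# "reserved" is an assumption made for illustration.
RESERVED = set('<>:"\\|?*')

# From the thread: ':' (U+003A) shows up in Windows as U+F03A.
PUA_OFFSET = 0xF000


def wsl_to_windows(name: str) -> str:
    """Shift reserved characters into the private use area."""
    return "".join(
        chr(PUA_OFFSET + ord(ch)) if ch in RESERVED else ch for ch in name
    )


def windows_to_wsl(name: str) -> str:
    """Shift U+F001..U+F07F back down to ASCII (range is an assumption)."""
    out = []
    for ch in name:
        cp = ord(ch)
        if 0xF001 <= cp <= 0xF07F:
            out.append(chr(cp - PUA_OFFSET))
        else:
            out.append(ch)
    return "".join(out)


print(ascii(wsl_to_windows("like:this")))
print(windows_to_wsl("\uf061bc"))
```

The second call shows the look-alike case from the comment: a name containing \uF061 comes back as what looks like a plain "a".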


Mac OS X swaps slash `/` and colon `:` in filenames to match UFS. The colon used to be the path separator on classic Mac OS, which HFS+ inherited.

Try it:

    $ touch foo:bar
    $ open .
    # Look at the filename in Finder.
    # Try it the other way around by saving a file
    # with / in TextEdit, then ls in Terminal.
[0]: https://stackoverflow.com/questions/13298434/colon-appears-a...
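
The swap described above can be modelled as a symmetric character translation. This is only a sketch of the observable behaviour; the function name is hypothetical and it says nothing about how macOS implements it:

```python
# Translation table that exchanges '/' and ':' in a single pass.
_SWAP = str.maketrans("/:", ":/")


def swap_slash_colon(name: str) -> str:
    """Map a name as seen in Terminal to the Finder view, or back."""
    return name.translate(_SWAP)


print(swap_slash_colon("foo:bar"))
```

Because the table is its own inverse, applying the function twice returns the original name.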


This is compatible with the mechanism Cygwin came up with: https://cygwin.com/cygwin-ug-net/using-specialnames.html#pat...


Oh dear. How fun!

These little sins are truly terrible.

I prefer openat(2) and friends for dealing with ADS. It means you have to have specialized programs to deal with them from shell scripts, as open(2) and friends provide no naming conventions for getting at ADS. This approach is much much safer than the WIN32 approach of using :$HACK_ME_PLEASE and :$THANK_YOU_MAY_I_HAVE_ANOTHER.

Note that Linux has openat(2), but it doesn't support ADS. Solaris/Illumos does. Linux has xattrs, which, like ADS in Solaris/Illumos, requires separate system calls to access -- no open(2) naming conventions there.

Both ADS and xattrs are extremely useful. Let's say you need to associate some metadata with a file, but you can't change its contents' format. You could use some separate file that goes with it, but now you can't atomically rename(2) the thing... But with ADS/xattrs you just attach those to the file, and then when you rename(2) the file the ADS/xattrs go with it atomically. (Yes, you could rename a directory, but it doesn't quite have the same semantics as renaming a file. In particular, rename(2) of a directory won't rm -rf the target if it exists.)
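
A rough sketch of the xattr case on Linux, using Python's `os.setxattr` and `os.getxattr`. The file names and the `user.review_state` attribute are made up for illustration, and the function returns `None` where user xattrs are unavailable (non-Linux platforms, or filesystems that reject them):

```python
import os
import tempfile


def xattr_survives_rename(directory):
    """Return the xattr value read back after rename, or None if unsupported."""
    if not hasattr(os, "setxattr"):  # os.setxattr only exists on Linux
        return None
    src = os.path.join(directory, "report.pdf")
    dst = os.path.join(directory, "report-final.pdf")
    with open(src, "wb") as f:
        f.write(b"contents whose format we cannot change")
    try:
        # Attach metadata to the file itself instead of a sidecar file.
        os.setxattr(src, "user.review_state", b"approved")
    except OSError:
        return None  # filesystem does not accept user xattrs
    os.rename(src, dst)  # the attribute travels with the file
    return os.getxattr(dst, "user.review_state")


with tempfile.TemporaryDirectory() as d:
    print(xattr_survives_rename(d))
```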


As a teenager in the 90s I remember using a trick where if you created a directory on the command line with the character inserted by pressing alt-255 (I think) in the name you couldn't access it from the Windows GUI. Very useful for hiding stuff from my parents. I imagine this was FAT32 but can't remember for certain.


Used to hide games on my high school's network using that trick.


Alt-255 was/is ascii null if I remember correctly


No, Alt-255 corresponds to ASCII character 255, which is a non-breaking space character.


Alt-255 corresponds to code-page 437 255, which is ascii 160


Unicode U+00A0 (decimal Unicode code point 160). ASCII only defines 0..127.


No, ASCII stops at 127. Everything above that is extended ASCII specific to a code page. As you say, DOS's code page is 437 [0], and character 255 is a non-breaking space on code page 437.

The problem you're having is that you're using Windows, which uses either Windows-1252 or UTF-16-LE (Windows Unicode). In those encodings, non-breaking space is 0xA0 (160). Windows is converting extended ASCII 255 from code page 437 to either Windows-1252's [1] non-breaking space, which is 0xA0 or 160, or UTF-16-LE's [2] non-breaking space, which is 0x00A0. The glyph is silently translated.

However, even on Windows 10 you can still get to a place where you're using the original code page of 437.

Fire up cmd.exe. Run "chcp" and it should tell you that the active code page is 437. Run "copy con C:\text.txt" to create a new file from the console input. Type <Alt+255>. Press Enter. Hit F6 or Ctrl+Z and hit enter to finish the file. Now type "powershell.exe -Command "[io.file]::ReadAllBytes('C:\text.txt')"". Your output will be:

  255
  13
  10
That's code page 437's extended ASCII non-breaking space followed by carriage return and line feed.

Now run "type text.txt". You'll get blank output lines.

Now run "powershell -Command "Get-Content text.txt"". Your output will be "ÿ" which is Windows-1252 or Unicode character 255 or 0xFF. Even if you try "Get-Content text.txt -Encoding ASCII" you won't get the same output as you do from cmd.exe because PowerShell's ASCII encoding is actually code page 20127 (7 bit ASCII), not code page 437.

Now try to run "powershell.exe -Command "[int][char]'<Alt+255>'"". You'll get 160.

That's also why you can fire up PowerShell and type this:

  '<Alt+255>' -eq '<Alt+0160>'
And the result is true. (Note: Alt codes with a leading zero indicate a Unicode alt code.) Windows is translating the glyph from the alt code for you in the background. You have to use a program which doesn't try to do that for you.
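
The same translation can be checked outside Windows with Python's built-in `cp437` and `cp1252` codecs. This is only a cross-check of the byte values discussed above, not a reproduction of what cmd.exe or PowerShell do internally:

```python
# The three bytes that ReadAllBytes reported for the file.
raw = bytes([255, 13, 10])

as_cp437 = raw.decode("cp437")    # how the DOS code page reads byte 255
as_cp1252 = raw.decode("cp1252")  # how Windows-1252 reads byte 255

print(ord(as_cp437[0]))   # code point of the cp437 interpretation
print(ord(as_cp1252[0]))  # code point of the cp1252 interpretation

# Going the other way: non-breaking space (U+00A0) in each encoding.
nbsp = "\u00a0"
print(nbsp.encode("cp437"), nbsp.encode("cp1252"))
```

Byte 255 decodes to U+00A0 under code page 437 but to U+00FF ("ÿ") under Windows-1252, which matches the difference between the `type` and `Get-Content` results.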

[0]: https://en.wikipedia.org/wiki/Code_page_437

[1]: https://en.wikipedia.org/wiki/Windows-1252

[2]: https://www.fileformat.info/info/unicode/char/00a0/index.htm


What is there to fix? The colon has a special meaning to the Windows NTFS driver - if you use that in a filename then that file is inaccessible with that driver.

If you put a null or forward slash in a filename on an ext2 filesystem then that file becomes inaccessible on Linux. Should they somehow "fix" that?


Of course, they should. It may be hard to fix it, too hard for it to be worth the disruption in stability, but it is a design error that breaks user expectations and can cause severe issues when interoperating with other filesystems.

And with NTFS, this is much more severe than your example with ext2.

For one, ext2 is a niche filesystem at this point; users practically never interact with an ext2 filesystem. It's not the main filesystem used on Linux.

And secondly, as a user, I have created files with '/', ':', '*', '?', '"', '<' or '>' in the name ('\' and '|' are also forbidden on Windows, but I admittedly have not yet needed those AFAIK).

Not being able to use these characters limits the ways I can express myself in what a file contains.

For example, I've had to rename a list with different levels of grouping from

    "List of members: City > Lastname > Firstname.pdf"
to

    "List of members - City, Lastname, Firstname.pdf".
And I'm still not convinced the recipients of that file actually understood that the levels of grouping were listed in the filename in the order from biggest to smallest grouping level.


Presumably, fsck-ing the NTFS volume (or the Windows equivalent) should canonicalize the colon into some escaped/mangled form, since it isn't valid for it to be there in the filename.


They do seem to have fixed this, at least as far as WSL is concerned. If I open a bash shell on Win10 and do `echo foo >/mnt/d/temp/like:this` then I can open the resulting file in Notepad and see the content `foo`. `mkdir like:this` works too. The file or directory name does appear a bit mangled in Explorer, but still works (and still looks like a colon from inside bash).

I think you are right that this used not to work in earlier versions of WSL. You could use colons within the WSL home directory, but not on the mounted NTFS drives. But it seems sorted now.


I can't view the post right now, so no idea about the content or if anything is fixed, but that colon syntax is supported by NTFS and called "alternate data streams". Granted, the feature isn't exactly prominent and tools have varying support for it, but it is getting better. PowerShell's Get-Item has a -Stream option, for example.


I think that was outlined as one of the very first tricks in the post. I don't have a system available at the moment to test that with. Also, it explores doing the same thing without a Linux system, and shows the various ways such files/folders can/cannot be accessed.


You can still put in backslashes from Linux as well. Technically NTFS allows for this (Win32 vs. Unix-y namespaces).



