This is a great example of something else about software. As software grows in usage and use cases, it starts bumping up against edge conditions which need to be handled for various reasons.
Cargo now is becoming stronger and more stable because of bugs like this being discovered. All software goes through this growth cycle. It's great to see these things worked out in the various projects that support Rust.
There is another point here though: anytime the question comes up to just rewrite a piece of software and throw out all the technical debt, it's not as straightforward as it seems. Remember, together with that technical debt lie a lot of valuable lessons written into the code. I haven't worked on Windows directly in years, and I never knew that NUL was a reserved word as a file name. I would have made this mistake, and probably still will in the future.
Which makes me wonder: has anyone written a file name validation crate that guarantees you're not writing to any reserved names on the host OS's filesystem? A quick search of crates.io doesn't turn anything up.
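For what it's worth, the core of such a check is small. Here's a hedged sketch (the list covers the classic DOS device names; the function name is my own invention, not an existing crate's API):

```rust
/// Reserved DOS device names are off-limits bare or with any extension
/// ("NUL.txt" is still reserved), compared case-insensitively.
fn is_windows_reserved(name: &str) -> bool {
    const RESERVED: [&str; 22] = [
        "CON", "PRN", "AUX", "NUL",
        "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9",
        "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9",
    ];
    // Windows matches against everything before the first dot.
    let stem = name.split('.').next().unwrap_or(name);
    RESERVED.iter().any(|r| r.eq_ignore_ascii_case(stem))
}

fn main() {
    assert!(is_windows_reserved("nul"));
    assert!(is_windows_reserved("NUL.txt"));
    assert!(is_windows_reserved("Com1.tar.gz"));
    assert!(!is_windows_reserved("nullable"));
    println!("ok");
}
```

A real crate would also want to reject trailing dots and spaces, which Windows silently strips.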
It also shows how necessary it is to have some sort of deprecation process. Maintaining nonsensical landmine features for compatibility with an operating system released 36 years ago is putting the interests of MS's lazy long-term users ahead of the interests of its current users. Even if MS maintained a policy of only removing functionality after a 10-year deprecation period, this "feature" would have been gone long ago. Transitions must be orderly, but they should still happen.
It's nice that Rust's toolchain is better able to live with Windows' crazy ecosystem, but that doesn't make Windows any less crazy.
If you think Microsoft supports features long-term out of laziness then you haven't been paying attention. It's a very deliberate choice that helped them grow their business and keep customers.
Transitions are nice from a development perspective, but I can guarantee you'll never hear someone who uses your library say they're happy that they need to rewrite parts of it.
Also, Windows doesn't have a monopoly on bizarre filenames/features/etc. You can find plenty of things in the *nix family as well.
Lastly, Rust is one of the few projects I've seen that has phenomenal Windows support. It's something that's really appreciated and is going to help them capture markets that other software won't.
In the order I happen to think of them: Filenames may be straightforward on the filesystem level, but a lot of UNIX programs do weird things with them. Many programs use "-" to mean STDIN or STDOUT as appropriate where it is used. Bash has a somewhat ill-conceived feature where it synthesizes /dev/tcp/$host/$port (and /dev/udp/$host/$port) paths that read from and write to TCP or UDP sockets. Most people don't know about this; a few people think it's a UNIX feature rather than a bash-ism.
The fact that multiple /s will be normalized to be the same as one sometimes trips up security code, or code trying to validate that some particular file isn't used (e.g., checking that the filename doesn't start with /dev, or a list of other blacklisted directories, will fail if the user passes //dev).
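A small Rust sketch of the pitfall, assuming the blacklist check is a plain string-prefix test (hypothetical code, not from any particular project); comparing path components instead of raw strings survives the slash normalization:

```rust
use std::path::{Component, Path};

// Naive blacklist: a raw prefix test, which "//dev/null" slips past even
// though the kernel treats it the same as "/dev/null".
fn naive_blacklisted(path: &str) -> bool {
    path.starts_with("/dev")
}

// Component-wise check: Path::components() collapses repeated separators,
// so "//dev/null" and "/dev/null" look identical here.
fn component_blacklisted(path: &str) -> bool {
    let mut comps = Path::new(path).components();
    matches!(comps.next(), Some(Component::RootDir))
        && matches!(comps.next(), Some(Component::Normal(c)) if c == "dev")
}

fn main() {
    assert!(naive_blacklisted("/dev/null"));
    assert!(!naive_blacklisted("//dev/null")); // bypassed!
    assert!(component_blacklisted("/dev/null"));
    assert!(component_blacklisted("//dev/null")); // caught
    println!("ok");
}
```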
Symlinks! Oh, gosh, symlinks. Were this not a stream-of-consciousness dump they probably should come first. You can do terrible things with symlinks, like upload a tarball or zip file that creates a symlink to an arbitrary location in the system, then use that symlink reference as a directory reference to plop a file down. (Some archivers prevent this, others don't.)
Also, /dev is just a convention, it's possible to place device nodes anywhere you want.
You can also pretty much mount arbitrary things in arbitrary places via bind mounts. Hard links can also cause some fun with code that assumes file systems aren't cyclic. Windows technically has a lot of these features, but they're harder to get to and less well known, whereas the various link types are used in base Linux installs and are readily available.
The code in Research Unix V8, V9, and V10 is available. Alcatel-Lucent made them public a couple of months ago[0]. Here are the relevant URLs and file paths within the archives. I already had them on my hard drive and it was easy to grep them. I removed a few columns from the output of tar.
Actually, the stronger case is that the feature should be removed from bash. While it's hard to point at a specific security guarantee that UNIX makes that bash violates by making TCP available via the pseudo-file system, it is a non-trivial ambient contribution to general insecurity for UNIX systems. (People itching to reply to that sentence, please parse it carefully first; I chose the adjectives quite carefully. In particular, I did not just call UNIX "generally insecure".)
Sometimes you don't get to "run bash", but just pass certain parameters, or add things on the end, or whatever other monstrosity an application programmer comes up with to use bash to do something. This allows you to do things like potentially redirect files to sockets of your choice, where you might exfiltrate data, or provide unexpected data to internal processes.
You would be correct in then pointing out that if you pass user parameters to bash without treating them as carefully as you'd treat radioactive waste, you're asking for trouble, and that /dev/tcp doesn't offer much that the various "nc"s don't. That's why I was sort of non-committal about condemning them; it's not like they are a massive breach of security. It's just one more thing that can surprise people if they're trying to lock a system down, and that's already a pretty long list. And since it's not clear to me that it could ever be a short list, that's why I wanted to emphasize I wasn't trying to condemn UNIX. It's just that it's a feature that doesn't add much but complexity to bash, while not really offering any functionality that isn't better done with nc or something, and on the balance, it probably ought to just be removed from an already complicated and security-sensitive program.
I don't know about radioactive waste, but surely allowing untrusted user input into /dev is unrealistically sloppy. (Famous last words?)
I agree that having this as a bash feature versus just using nc doesn't seem to buy much. But I think having these in the actual file system is useful. So why not do both: expunge them from bash, and get them into /dev (or maybe /net, or wherever they belong).
Symlinks are a poor example, IMO. Yes, they need to be carefully handled for security reasons. But they also offer great flexibility that is actually widely used, and that wouldn't be available through other mechanisms.
To paraphrase: Windows NUL is a poor example, IMO. Yes, it needs to be carefully handled for reasons. But it also offers great flexibility that is actually widely used, and that wouldn't be available through other mechanisms.
It doesn't offer great flexibility though. It has characteristics that made it useful on ancient versions of DOS and now it only offers annoyances that we have to deal with.
Just look at Mac OS X, which is also from the Unix family. It has the feature of decomposing precomposed characters in file names, so if your software writes a file named "café" (caf\xc3\xa9), and later lists the directory, it will find a file named "café" (cafe\xcc\x81). That tends to confuse software which expects to find a file with the same name after creating it, like for instance git.
For a while, if you were in a team in which some developers were on Linux and others were on Mac OS X, and someone on the Linux side checked in a file named with a diacritic, on the Mac OS X side the file appeared to have been deleted (and a new untracked file with the "same name" appeared). Later git grew special code to work around this misfeature.
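The two spellings are easy to demonstrate from any language, since the difference is plain bytes; a quick Rust illustration of the NFC and NFD forms of the same visible name:

```rust
fn main() {
    let nfc = "caf\u{e9}";   // precomposed: é is U+00E9 (bytes c3 a9)
    let nfd = "cafe\u{301}"; // decomposed: e + combining acute (bytes cc 81)

    assert_eq!(nfc.len(), 5); // 5 bytes
    assert_eq!(nfd.len(), 6); // 6 bytes
    assert_ne!(nfc, nfd);     // unequal as strings, though rendered identically

    // So a program that creates "café" (NFC) and then lists the directory
    // on HFS+ gets back a name that compares unequal to what it wrote.
    println!("ok");
}
```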
And yes, Linux has the "bizarre feature" of being way too permissive. A filename is a sequence of bytes of which only the null byte and the slash are forbidden, and only a single or double dot have special meaning; one can have files named with control characters, and/or with something which is not valid for the current character encoding (LC_CTYPE), leading to pain for languages which insist that a string must be always valid Unicode (this includes Rust).
But yeah, nothing compares to the madness that is forbidding simple names like "nul" or "con" or "aux" (alone or followed by any extension) in every single directory, made worse by the fact that you can create files with these names if you use a baroque escaping syntax (which is not available for every API), confusing every other program which does not carefully do the same.
And let's not forget about the fact that the file you just created might not be readable or writable the next instant, because some other process (usually some sort of "antivirus") decided to open it in an exclusive mode. I've seen several projects add retry loops when opening (or moving, or deleting) a file on Windows, to work around that issue.
> It has the feature of decomposing precomposed characters in file names
I was under the impression that the new APFS stopped trying to understand bytes in filenames at all, thereby switching from 'confusion' to [tableflip] as a policy (which is likely an improvement, but also amuses me on the basis it's nice to know [tableflip] is about the only response anybody has to certain unicode-isms)
(Note that Rust just requires the built-in string type to be valid Unicode; you are free to manipulate other kinds of strings, which is exactly how the OS string problem is solved. It also gives you a chance to explicitly handle the errors.)
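Concretely, on Unix an OsString can hold bytes that no &str can; a sketch using the Unix-only OsStringExt extension trait:

```rust
use std::ffi::OsString;
use std::os::unix::ffi::OsStringExt;

fn main() {
    // 0xff is never valid UTF-8, but it is a perfectly legal filename byte
    // on Linux, so OsString accepts it.
    let name = OsString::from_vec(vec![b'f', b'o', b'o', 0xff]);

    // Conversion to the built-in string type fails explicitly...
    assert!(name.to_str().is_none());

    // ...or degrades explicitly, substituting U+FFFD for the bad byte.
    assert_eq!(name.to_string_lossy(), "foo\u{fffd}");
    println!("ok");
}
```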
> And let's not forget about the fact that the file you just created might not be readable or writable the next instant, because some other process (usually some sort of "antivirus") decided to open it in an exclusive mode.
THIS. Spent quite a long time trying to reproduce a Windows-only bug with the old Rails 2 gem unpacker caused by exactly this; the code would create a directory "foo-1.2.3" and then immediately try to write files to it and fail because of an exclusive lock - on an empty directory.
Exclusive mode is useful when used for good reasons - i.e. to get snapshot semantics (no-one else can change this) while reading, or implement atomic changes (no-one else can see the change halfway) when writing.
The problem on Windows is that too many APIs decided that exclusive should be the default mode if none is specified - which is the safer choice in a sense that it gives the most guarantees (and the least surprise) to the caller, but arguably the adverse effects it causes on other apps are more surprising and harmful in the end.
Each OS has its set of weird, broken, and surprising behavior, most of it in the name of backwards compatibility. There is a group of people that finds one mess bearable while seeing all the others as totally brain-dead. There are other groups that have a somewhat different opinion.
Everything sucks. Which one sucks less? I pick the one that I know more about.
Well, Windows technically supports files with the reserved names - if you use the right APIs - but they break many programs including Explorer. You could make an analogy to Unix filenames with spaces or newlines, which can be created but don't work properly with some tools. (For spaces, try 'make CFLAGS="-I/path/with spaces/"' - there is no way to escape it or otherwise make it work. Newlines break a lot of stuff.)
That doesn't make a difference - regardless of where you put the opening quote, make gets the string "CFLAGS=-I/path/with spaces" as argv[1]. The quotes do help, as otherwise it gets split up into multiple arguments to make.
But actually, I was wrong - GNU make passes strings to execute to the shell, so you can use nested quotes: CFLAGS='"-I/path/with/spaces"'. Not sure why I thought differently. The shell itself doesn't work this way, though: when it splits a variable into multiple arguments, it just splits by spaces rather than doing any fancier processing. So there are issues with shell scripts.
The Windows command-line client for PostgreSQL used to produce confusing errors on my machine because my development source code directory happened to be called "C:\dev".
What constitutes "bizarre" depends a lot on what your prior assumptions are.
But it isn't just a 10-year-old feature no one uses. It seems if you write to the NUL file in any directory it still works the same as writing to /dev/null today. There might be scripts written yesterday that rely on that behavior.
Joel Spolsky famously praised this policy of backward-compatibility at all costs which he called "The Raymond Chen Camp"[1]. Many agreed with him, but I always thought that Microsoft compatibility ideals were too radical to be real wisdom. At some point the list of features you try to keep compatibility with grows large enough that the Raymond Chen Way becomes unmaintainable.
The received wisdom of the 90s is wrong. Most users don't care about compatibility, as Apple's success has clearly shown, and most companies are now out following the Apple road.
Large enterprises care about compatibility, and they pay a lot, but this is not a forward-looking market. They'll keep buying new versions of your software because of the compatibility, but if compatibility is the only story you have to offer, you'll slowly lose that market.
I completely agree with you that Microsoft should have had a strategy for deprecating these features back in the 90s, when they were already old.
In this specific case of outdated filename restrictions, you could start with what they already did:
Windows NT 3.5 - Allow accessing all filenames with a special prefix (which they already did).
Windows NT 4.0 - Make it easy to migrate to sane filenames by providing an opt-in per-process flag that makes all APIs use them by default.
At this point they can easily dogfood and migrate all Microsoft software to the new APIs, so you would be able to delete these pesky files in explorer.
Windows 2000 - Make the new API flag default for all versions compiled with the latest version of the Windows SDK.
Windows XP - Make the new API default for any app without a special entry in the compatibility database.
Somewhere along the road, batch files (which are the only place where compatibility with the old filenames was necessary) could easily have been made compatible by modifying the batch parser to replace redirections to NUL with redirections to \\?\devices\null or something akin to it. You might see some breakage in scripts which use NUL and CON in a non-standard way (e.g. as an argument), but the migration pain wouldn't be huge, and you could still save an old script with a compatibility flag.
Microsoft obviously didn't take that way, and yeah, all the batch files written back in 1981 may still work without a hitch, but newer things keep breaking in strange ways.
Newer things only break in strange ways because they're broken. So rather than break the old stuff, why not fix the new stuff?? - because after all, approximately the only criticism you can't level at the Windows NUL/PRN/COMx/etc. special names is that they're some kind of surprise that appeared suddenly out of nowhere! It's been this way for a very long time.
(I wonder if part of this is the rage of Unix fans discovering that portable means actually, you know, making an effort... and that there's more to it than just checking it builds on x86 Debian as well as x64 Ubuntu...)
You can't just say it's been that way for a long time so it's acceptable, because the industry (and for that matter the Internet) is getting fresh new people every day. You can't expect them not to be surprised, and you can't just arbitrarily require them to know something they haven't stumbled upon until after it caused problems.
> Most users don't care about compatibility, as Apple's success has clearly shown
Apple is not exactly big in the same markets where MS is big, e.g. enterprise. So while I agree that "most" users don't care, the very few who do care might be important customers for MS.
> I haven't worked on Windows directly in years, but I never knew that NUL was a reserved word as a file.
It's not. It's a reserved word through the MS-DOS file redirection facilities. If you use the newer file API or the \\?\[path] convention, the reserved words are not an issue and you can create files named for them.
You have to use both, actually. The Unicode API and the \\?\ path prefix. It also astonishes me sometimes how many applications nowadays still choke on Unicode paths.
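The \\?\ part is just string manipulation, so it's easy to sketch (the helper name is my own invention; the caveats in the comments matter in real code):

```rust
// Prefix an absolute drive path with \\?\ to get a "verbatim" path, which
// skips reserved-name parsing and the MAX_PATH limit. Sketch only: UNC
// paths need the \\?\UNC\server\share form instead, and "." / ".."
// components must be resolved before prefixing, since verbatim paths
// disable that parsing.
fn to_verbatim(path: &str) -> String {
    if path.starts_with(r"\\?\") {
        path.to_string()
    } else {
        format!(r"\\?\{}", path)
    }
}

fn main() {
    assert_eq!(to_verbatim(r"C:\data\NUL"), r"\\?\C:\data\NUL");
    assert_eq!(to_verbatim(r"\\?\C:\x"), r"\\?\C:\x"); // already verbatim
    println!("ok");
}
```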
It's swings and roundabouts. Say I create an app or tool that happily resides in c:\proc\whatever then I turn my attention to creating a Linux version and specify /proc/whatever then ... boom? Sure, it's maybe a convoluted example, but for the creator of this "nul" package they got burned by something that's actually common knowledge in the MS world.
I think you need to be a wee bit pro-active and take a look at your potential deployment targets and try and guard against these types of naming issues. Unix and Linux aren't the only (one true) operating systems in the world.
The right solution, actually, is to use a library that gives you the right path for the thing that you need to do depending on the conventions of the platform. For example QStandardPaths in Qt: http://doc.qt.io/qt-5/qstandardpaths.html
    QString appDataDir = QStandardPaths::writableLocation(QStandardPaths::AppDataLocation);
    // ~/Library/Application Support/<APPNAME> on macOS
    // C:/Users/<USER>/AppData/Roaming/<APPNAME> on Windows
    // ~/.local/share/<APPNAME> on Linux
Still doesn't solve the case where the developer just wants to slap something in the root of the C: drive on windows from the outset (Cygwin I seem to remember defaults to c:\Cygwin for example...again slightly convoluted).
Also those locations are user specific, there's nothing there to support the use-case of an app that's available to all users, or might just be a system service (/daemon).
And CON, as Macha mentions in a sister comment. An idiom I remember from old times in DOS, for quickly writing some contents into a file - equivalent to `cat > myfile.txt` on Linux: `copy con myfile.txt`, then type the contents and end the input with Ctrl+Z.
I've done con.py on a Linux system a few times for net code in different projects, and then realised I couldn't clone it on Windows. It comes up infrequently enough that you can forget.
Other magic aliases include CON, PRN, AUX, COM1-9 and LPT1-9. They are aliased to respective devices in the Win32 namespace "\\.\". COMs and LPTs above 9 don't have aliases in the global namespace and must be accessed explicitly in the Win32 namespace, e.g. "\\.\COM10" (which itself is a symlink to the NT native "\Device\Serial9").
In fact, it is possible to create files named NUL, COM1, etc. using the \\?\ prefix (e.g. "\\?\C:\NUL" is a valid path), which disables the parsing of arcane Win32 magic files. Unfortunately these files cause strange behaviour in applications that don't use that prefix, Explorer included.
As the blog post mentioned, we solve the issue by deleting the crate from the package repository and reserving these problematic names. The incident lasted about 2 and a half hours.
Crate names have to be one or more valid idents connected by hyphens, so no other clever names like `/home` would be possible to upload. We already had some crate names reserved and we just needed to add these to the list.
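That rule is simple enough to sketch (my re-statement of it, not crates.io's actual validation code):

```rust
// A crate name is one or more idents joined by single hyphens. Here an
// "ident" is approximated as ASCII alphanumerics/underscores starting
// with a letter or underscore.
fn is_ident(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => (),
        _ => return false,
    }
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
}

fn valid_crate_name(name: &str) -> bool {
    !name.is_empty() && name.split('-').all(is_ident)
}

fn main() {
    assert!(valid_crate_name("serde_json"));
    assert!(valid_crate_name("lazy-static"));
    assert!(!valid_crate_name("/home")); // path-like names are rejected
    assert!(!valid_crate_name("a--b"));  // empty segment between hyphens
    println!("ok");
}
```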
And because it was a weekend, much of that time involved me trying to figure out who had the proper credentials for crates.io, and then texting those people until one of them responded. :)
Reserving just the crate names won't cover your bases, though, no? I'm not clear on what exists as part of a crate—but if there's any user control over the filenames of the contents of the crate (e.g. if the crate's source code is in there) then any crate might contain a file named e.g. "nul.rs", triggering the same problem.
I think you're misunderstanding the problem described in the OP. When you build a project via cargo using the default settings, it fetches the git repository at https://github.com/rust-lang/crates.io-index to enable it to resolve dependencies locally. This git repository contains metadata for each library on crates.io, where the metadata for a given library is located in a file with the same name as the name of that library. When the OP uploaded a library whose name was an illegal filename on Windows, git unexpectedly choked when updating the local crate index repo, impacting all Windows users.
It sounds like the concern you're describing is a different matter. It's likely true that if the source of a crate contains a file named "nul.rs", cargo on Windows will fail if it attempts to git-fetch the source (unless you're using the Windows Subsystem for Linux, anyway). While this would indeed be a problem, it would only affect users who elect to use specific libraries, rather than serving as a denial-of-service for every Rust user on Windows.
Ah, I was just misunderstanding the format of the repo. I was assuming it was more similar to a ports tree, where each library is specified in the index using a directory which can have random files sitting in it, like Makefiles, .patch files, etc. along with a metadata spec file.
Looking at the repo you linked, there's no allowance for that, so at least in this case you should be safe.
I was going to ask how a remote package could do that. Not knowing how Rust works (or package managers, apparently), I didn't understand how it could be widespread. Makes sense, damn; that's substantial.
I'm not sure how other package managers do it (it should be noted that this approach was designed to avoid some problems that other package managers have encountered), but there is still room for improvement here: ideally, I think we'd be hashing crate names rather than storing them verbatim on the filesystem, to enforce more uniform distribution in the trie.
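A toy version of that idea (hypothetical scheme; a real index would want a stable hash like SHA-256 rather than Rust's DefaultHasher, whose output isn't guaranteed across compiler releases):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Derive the on-disk index path from a hash of the crate name, so no
// user-chosen name ever reaches the filesystem verbatim and entries fan
// out uniformly, git-object-store style.
fn index_path(crate_name: &str) -> String {
    let mut h = DefaultHasher::new();
    crate_name.hash(&mut h);
    let hex = format!("{:016x}", h.finish());
    // Two levels of fan-out to keep directories small.
    format!("{}/{}/{}", &hex[0..2], &hex[2..4], hex)
}

fn main() {
    // Reserved names like "nul" become harmless hex paths: 'n', 'u', and
    // 'l' aren't hex digits, so the output can never contain them.
    let p = index_path("nul");
    assert!(!p.contains("nul"));
    println!("{}", p);
}
```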
There was a bug in Windows 95 (98 too?) where if you tried to open 'nul\nul' or 'con\con' etc, it would BSOD instantly.
Provided lots of drive-by fun in computer labs... (got really good at typing win+r con\con)
What I don't understand is why cargo fetches the entire crate list and create a directory for every crate (even if you never install it). Why not just have a single file with the entire list? The issue mentions they use a trie, but why use the filesystem as the trie store? Why not have a single file?
The original authors of cargo, wycats and carllerche, aren't around today to ask (it's a weekend!) though IRC attempted to answer regardless:
<foo> to keep the number of files in a single directory down
<foo> tools become unhappy with hundreds of thousands+ of things in a single dir
<foo> as do filesytems
<bar> why not just a flat file
<bar> or sqlite or whatever
<qux> right now it uses git's deduplication feature
<qux> aka, when downloading updates you only download the objects that changed
<qux> but it mostly works on a per file basis
<qux> so git hashes each file and if the hash didnt change, it doesnt download an update
<qux> but if it did, it treats it as completely new file, even if its just a little change
<wycats> Because of this: https://github.com/CocoaPods/CocoaPods/issues/4989#issuecomment-193772935
<wycats> I ran some scenarios against huge repos when I first worked on cargo
<wycats> Trying to minimize the cost of operations
<wycats> I landed on the current strategy, and GitHub in the above thread more or less endorsed what we were already doing at that time
<wycats> Also see https://github.com/rust-lang/cargo/issues/2452
It's still fundamentally a waste of disk space. On my system, as of a minute ago, ~/.cargo/registry/index took up about 200MB for three different checkouts (for some reason). After deleting that and running `cargo update`, only one of them is recreated, 104MB. Out of that, 57MB is the JSON files and 47MB is git history. But if I just concatenate all the JSON files, the result is only 33MB, and after gzipping, 3MB. Hypothetically, a non-GitHub-based Cargo could store only those 3MB (using binary deltas to avoid resending it on every update), or even 0MB if it just relied on the server to resolve dependencies.
Once you've gzipped to achieve that 3MB storage, binary deltas are useless. Perhaps the data could be (almost certainly is) transferred gzipped, then expanded to the full 33MB size so binary diffs could be applied to it later, but setting up a system to do binary diffs is a lot of incidental complexity: xdelta is a surprisingly complex format, and bsdiff is really tuned for executables, not arbitrary content (and is pretty complex too).
It sounds like the biggest win would be for cargo to keep using git, but clone the crates.io index as a bare repository rather than checking out the plaintext content. Then it would only take 47MB by your count, which is pretty close to 33MB, and you could still get out the plain content with `git cat-file` and friends.
Technically, Cargo /already/ bundles a full copy of libxdelta as part of libgit2 (in addition to the separate git binary delta algorithm); I just checked using nm that it's actually included in the binary. It could probably be removed, but, well, it probably adds a lot less than 44MB to the binary size :)
Alternately, since JSON is text, I suppose you could just ensure that whatever emits this hypothetical merged JSON file puts newlines between different packages' entries, and then use a regular text diff (on the uncompressed version, of course). But reading 44MB of JSON isn't instant; it would probably be better to switch to either a binary format, or even something silly like a sorted list of JSON strings separated by newlines.
There would be some incidental complexity around generating and applying the diffs… you'd probably want to precalculate them on the server side, but it could be rather expensive to, on every change, calculate a diff between the current version and every previous change. Instead, you could have daily checkpoints: each day the server would make a checkpoint and calculate a diff to the last N checkpoints; on every update the server would recalculate the diff between the latest checkpoint and HEAD. The client would store both HEAD and a reverse diff to the latest checkpoint (or just store the checkpoint separately and waste a few MB), so when it updates, it could revert to that checkpoint and request the diff from there to the new latest checkpoint; it would also request the diff from the checkpoint to the new HEAD. If its checkpoint is too old then it would just redownload from scratch.
Overall, not a trivial change, but probably not too hard either.
apt-get does something vaguely similar with its pdiff files.
I know that some of the people who worked on cargo originally had experience with other package managers - mainly bundler - and I believe bundler used to use a single file, but ran into performance issues.
Windows is, for better or worse, fiercely proud of its backwards compatibility. So it's not so much a stupid Windows design choice as a 'stupid' DOS 1.0 design choice (and not even so much a choice as simply a quirk of how the DOS 1.0 file system worked) that Windows doesn't want to break backwards compatibility with.
I agree with the parent that it's a bit crazy, but I wouldn't be as critical. To your point, presumably even if they dropped DOS support, something between DOS and now likely relies on that. It's a fine line.
Stupid design choices on Windows are always justified by "backwards compatibility". What I don't understand is why an app for DOS doesn't work on Windows 10. Or a lot of Windows XP software that doesn't work anymore. Heck, anyone remember Windows Vista and all the mess with "Compatibility Mode" that never worked?
Hehe, Linux (and Unix in general, I guess) is just a bundle of text files and C programs, held together by shell scripts and eternal vigilance. It's very impressive and very disappointing at the same time...
People could also stop using CreateFile without a \\?\ prefix and all the problems would go away. There's not even a MAX_PATH limit on any NT-based Windows version if you do that.
Except that only works for absolute paths. And then also changes other semantics that you might expect to be there - e.g. it disables handling of .. and . to mean parent and current directory (which is valid even in absolute paths, and often useful).
It comes down to tutorials a lot of the time - does any tutorial in C, C#, Python, whatever mention that you should probably refer to paths in Windows like that?
C# can't because .NET has outright banned the usage. But now the length restriction has been lifted, at least in .NET Core. I think you still need Windows 10 with the path length limitation disabled for the full .NET Framework. It's actually a bit confusing, because the limitation was lifted in .NET Core first by using the \\?\ prefix internally, but then not long after, Windows 10 introduced the ability to remove the limitation for the API without the prefix, though it must be enabled by group policy or a registry patch...
Anyways, it shouldn't really have mattered that Microsoft didn't care much for this for a long time. If everybody else had just been using the prefix since Windows XP came out then Microsoft would soon have been forced to change their own software as well.
For example with Python and non-MS C (e.g. clang and gcc on mingw), they should simply have made the standard libraries implement the file api using the prefix. Of course if you have a need to call CreateFile directly you would still be on your own, but if everything else created files you couldn't interact with then you would probably fix the problem.
To me, it sounds more like a problem with Rust that a single misnamed package could bring down the whole system. It's essentially a SQL injection attack (without the SQL).
Yep, just not allowing (directly) user controlled file names seems ideal. Maybe just hash the crate names and use the hash as a file name? No more silly restrictions due to platforms. Eliminates issues with some file systems having a length limit too.
In the Mac System 7-ish days, people used to earnestly warn each other not to name a file '.Sony' (a special name reserved for the floppy driver) as it supposedly trashed your HD. Although I've never heard of anyone reproducing it.
Every MS-DOS programmer of old knows about nul, con, and the other reserved names. Those might come from CP/M actually (so are even older), and Atari TOS had them as well I believe.
I was working on a video project for a local comics convention, and named the project file "con.proj". That file hung around until I upgraded my hard drive because no file manager could delete it.
It's very tricky to do cross-platform file handling, and only the most mature projects have ironed this out. Just look at your pet project and see if it handles:
- Windows and unix line breaks in text files
- Windows and unix path separators
- BOM and non-BOM marked files if parsing UTF
- Forbidden filenames such as in this article
By "handling" I mean it should accept or fail nicely on unexpected input - e.g. say that line breaks should be unix style, or paths should be backslashes etc. Very few projects actually do this well. Even fewer will do even more complex things like handling too long paths with nice error messages etc.
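Two of the checklist items above (BOMs and line breaks) need only a tiny normalization layer; a Rust sketch with helper names of my own choosing:

```rust
// Strip a UTF-8 byte order mark, if present.
fn strip_bom(s: &str) -> &str {
    s.strip_prefix('\u{feff}').unwrap_or(s)
}

// Normalize Windows (CRLF) line endings to Unix (LF).
fn normalize_newlines(s: &str) -> String {
    s.replace("\r\n", "\n")
}

fn main() {
    let raw = "\u{feff}first line\r\nsecond line\n";
    let clean = normalize_newlines(strip_bom(raw));
    assert_eq!(clean, "first line\nsecond line\n");
    println!("ok");
}
```

Alternatively, a project can reject such input with a clear message instead of normalizing, which is the "fail nicely" option described above.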
Here is a character encoding issue that I ran into about a year ago.
git does not support UTF-16LE[0]. The result is that UTF-16LE encoded files will be mangled[1] by the line ending conversion. There is at least one generated Visual Studio file (GlobalSuppressions.cs) that is saved in UTF-16 by default.
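For what it's worth, newer git (2.18+) can transparently re-encode such files via the working-tree-encoding attribute; something like this in .gitattributes (the file pattern is an assumption about your layout) keeps the checkout as UTF-16 while storing UTF-8 in the repository, sidestepping the mangling:

```
GlobalSuppressions.cs text working-tree-encoding=UTF-16
```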
Since Windows 10 now comes with an official Linux subsystem, why not just use POSIX APIs and conventions everywhere, and not bother with Windows-specific code if possible?
For one, because the Linux subsystem is an optional install. If you're making anything user-facing you can't rely on it being there - it's really a tool for developers, not end-users.
Depends on what type of application you are making. For a library that can be used in a "real" graphical Windows application, you can't make a posix type shortcut.
I think the "if possible" is, at least for now, very rarely the case.
Because this leaks to the user. For example, you'll be dealing with paths like /mnt/c/..., which, if you surface it in your UI, will rather confuse someone who expects C:\...
And speaking of UI, that's one major hurdle right there.
While I know nothing of Rust, Diesel, or CrateDB, I do know that Windows uses a case-insensitive file system and this fix doesn't seem to take that into consideration. However, the author of the fix does note:
> I believe crates.io's namespace is case insensitive let me know if that's wrong
Not quite. Windows uses a case insensitive API on top of a case sensitive file system.
FUN FACT: As of 2017, Windows 10 is (partially) binary-compatible to Ubuntu Linux. Any application that was originally compiled for Linux will still be case sensitive when running on Windows 10.
I believe that git chokes by default on special file names on Windows, but I think there's a config variable that you can set to fix it. I don't know if libgit2 differs here.