Perhaps my biggest critique is that crates.io has no namespacing. Anyone can just claim a global and generic package name and we mostly have to deal with it (unless you avoid using the crates.io repository, but then you'll probably have more problems...). Some of these globally-claimed generic packages are not really the best package to use.
Maybe it was a reaction against the Java-style reverse DNS notation, which is verbose and annoying, but a more GitHub-style user/group namespace prefixing package names would have been the nice middle ground.
I did some analysis on crates.io to find the top name squatters. Then I did some calculations and found that the top name squatter created their crates at a rate of about one ever 30 seconds for a period of a week straight.
I send the analysis to the crates.io team and pointed that they have a no-automation policy.
They told me that it was not sufficient proof that someone was squatting those names. That's my problem with crates.io is that they have a clear policy and they don't enforce it so all the short/easy to remember names for crates are already taken and there is nothing you can do to get it.
> Using an automated tool to claim ownership of a large number of package names is not permitted.
And
- Hey, I found that someone created crates at a rate of about one every 30 seconds for a period of a week straight.
- That's not sufficient proof of squatting.
Whoever answered that, was either supporting the squatter or explicitly in favor of the practice. I cannot conceive that someone would get that evidence in their hands, and in their right mind think that the claim is bogus. Hell, I'd even be willing to suppress the squatter with evidence of one new crate created every 30 seconds for one hour!
The only reasonable conclusion to make is that they didn't really care. But then don't save face and claim that you do. That's hypocrisy.
There's a secret effort in the Rust community to supplant Crates.io and create an entirely new package ecosystem with proper namespacing, security, and much better community.
Not naming names, but I know several people working to put Crates.io out to pasture.
There's a level of playing nice with them for the time being (eg. build reproducibility), but it's only KTLO.
Crates.io needs to die for Rust to thrive. They're a bungled, mismanaged liability. New code, new leadership.
Crates.io doesn't need to die necessarily. It needs some competition as a wake-up call.
Once a better alternative is out there, crates.io will either wither and die or improve. If it matches its competition in terms of quality and reliability, everyone is better off. If not, the alternative solution will take over.
I'm eager for this crates.io alternative to land, assuming they don't break too many projects in their improvements.
Drama avoidance and avoiding bikeshedding seem obvious. Much easier to present a working system than a design that will get nitpicked into irrelevancy.
That quite funny. Just like when some people formed a violent militant group to take down a violent tyranical dictatorship. Of course they would promise that they would absolutely de-arm right after the dictatorship has been overthrown, and immediately establish a peaceful democratic government with fair election. They absolutely would, would they? They would never turn into what they were formed to replace, would they?
This was my first thought too. And there are a lot of questions that will get asked, like, will all crate library names start being prefixed as well? So you end up with
use::bar; // changing to
use::foo::bar;
I assume the library names that can be overridden in cargo would still be accepted, and then it all gets a little messy. The transition would be very messy.
Write some automated analysis that looks up popular packages on npm, pub.dev, rubygems, nuget. "Rustify" the package names. Add to it frequently used words, maybe popular names, etc. Then, write a script that creates an empty package, registers a name on crates.io every thirty seconds, and then you have about 20k package names after a week that nobody can use.
> Maybe it was a reaction against the Java-style reverse DNS notation
I suspect it was less a reaction against anything and more just following the norms established by most other package managers. NPM, PyPI, RubyGems, Elixir's Hex, Haskell's Cabal... I'm having a hard time thinking of a non-Java package manager that was around at the time Rust came out that didn't have a single, global namespace. Some have tried to fix that since then, but it was just the way package managers worked in 2014/2015.
> I'm having a hard time thinking of a non-Java package manager that was around at the time Rust came out that didn't have a single, global namespace
The implication here is that namespaces in package managers weren't a known concept. Outside Java, NPM - probably the biggest at the time - not only supported them but was actively encouraging them due to collective regret around going single-global in the beginning. Composer is another popular example that actually enforced them.
Not only was namespacing a known widespread option, with well documented benefits, it was one that was enthusiastically argued for within the Rust community, and rejected.
NPM added namespaces in version 2, which was released in Sep 2014, just 2 months before cargo was announced. I don't remember anyone making a big deal about using scopes in NPM for several years after that, it was just there as an option. The announcement blog post of v2 only gives two paragraphs to scoped packages and explicitly frames the feature as being for enterprises and private modules [0]:
> The most prominent feature driving the release of npm 2 didn’t actually need to be in a new major version at all: scoped packages. npm Enterprise is built around them, and they’ll also play a major role when private modules come to the public npm registry.
My memory is that the industry as a whole didn't really start paying attention to the risks posed by dependencies in package managers until the left pad incident.
To be clear, I'm not saying that it was a good idea to not have a better namespace system or that they were completely ignorant of better options, just that they were very much following the norms at the time.
The left pad issue was kind of wild coming from the enterprise Java space. Supply chain attacks against open source software were already being taken pretty seriously, my last company had it's own Maven repository manager running that was used to stage and vet packages before they could be used in production.
I don't think the left-pad problem wasn't about package namespacing it was about the ability to unpublish packages as well the prevalence of micropackages caused by lack of a decent standard library.
Also npm's bad policy/decision to transfer control of package in the name of predictability(this should probably be avoided for packages that aren't malicious. You could argue for seizing broken/trivial and unmaintained packages that have a good name but even then it might be best to leave well enough alone).
I suppose you're talking about the original dispute which led the developer to unpublish his libraries (which npm stupidly allowed, and cargo didn't). There's a smaller chance of a company wanting a random package namespace then a package name but its not impossible (think Mike Rowe Soft vs Microsoft)
> I don't think the left-pad problem wasn't about package namespacing it was about the ability to unpublish packages as well the prevalence of micropackages caused by lack of a decent standard library.
It was "about" cavalier approach to the dependency supply chain. A dependency disappearing outright is just one of many failure modes it has.
> The left pad issue was kind of wild coming from the enterprise Java space.
This may be a little off topic for this comment thread but this is a little misrepresentative. Hosted private repos for enterprise weren't exclusive to Java at the time of left pad - anyone doing enterprise node likely had one for npm too & were probably well prepared for the attack. Such enterprise setups are expensive though (or at least take a level of internal resources many companies don't invest in) leaving the vast majority exposed to both js & java supply chain attacks even today.
At the time Nexus was free to self host, and that is what many smaller teams did just that to archive known good packages for the CI pipeline, I'm not in the Java space anymore so I don't know if that's still the case.
Yeah it's all a bit of revisionist history here, or I guess a bit ignorant. I had a friend who worked at Sonatype from pretty early days and they were, as I understand it, specifically working in this area of infrastructure for vetting, signing, license checking, etc. for corporate environments that needed to be extra careful about this stuff.
That crates.io launched without explicitly acknowledging this whole problem is either naivety or worse: already by then Java wasn't "cool" and the "cool kids" were not paying attention to what happened over there.
It's not that the industry wasn't paying attention until the 'left pad incident' -- that only holds if one's definition of "the industry" is "full stack developers" under the age of 30; I remember when that happened and I was working in a shop full of Java devs and we all laughed at it...
Maven's biggest problem was being caked in XML. In other respects it was very well thought out. That and it arrived at the tail-end of the period in which Java was "cool" to work in.
It's not revisionist history, the wording I chose was meant to acknowledge that there were segments of the industry that did take dependencies seriously. I'm very much aware that the Java world had a much more robust approach to dependencies before this, but "the industry as a whole" includes all the Node shops that were hit by leftpad as well as all the Python and Ruby shops that were using equally lousy dependency management techniques.
Rust chose to follow the majority of languages at the time. Again, as I noted in my previous comment, I'm not defending that decision, just pointing out that most of the widely-used languages in 2014 had a similar setup with similar weaknesses.
mainly, you can trust that anything under the foo/ namespace is controlled by only a smaller group of people, as opposed to the current situation on cargo where people pseudo-namespace by making a bunch of packages called foo-bar, foo-baz, and you can't trust that foo-bin wasn't just inserted by someone else attempting to appear to be part of the foo project. It also helps substantially with naming collisions, especially squatting.
If you want to check your dependency tree for the number of maintainers you're dealing with instead of the number of dependencies, this can be done with cargo tree + checking the Cargo.toml/crates.io ownership info for each of the found packages. I don't know if there's a command written to do that already, but I've done that with a small script in the past.
Fair enough. As I noted to another commenter, I'm not trying to say there was no prior art (if nothing else there was Maven), just that they were following the overwhelming majority of mainstream languages at the time.
> just that they were following the overwhelming majority of mainstream languages at the time.
They were trying to do better than mainstream languages in other areas and succeeded. IIRC on this front they just decided Ruby's bundler was the bee's knees.
The same developer who worked on bundler also worked on the initial version of cargo. That’s why they’re similar.
And at that time, it was a good idea. Ruby was popular and bundler + gem ecosystem was a big reason for its popularity. No one was worried that Rust might become so popular that it might outgrow the bundler model. That was only a remote possibility to begin with.
A mistake that many programmers make, as if baking one more feature on top would have made any difference that wouldn't be amortized in just a few weeks... Sigh.
The people that are working on the project haven't implemented namespaces, or any other security feature really, so what they say is immaterial. What they do is the only thing that matters.
The point is that it's much easier to make a mistake typing "requests" than "
org.kennethreitz:requests" (as a pure hypothetical.)
It also means that more than one project can have a module called "utils" or "common", which once again reduces the risk of people accidentally downloading the wrong thing.
Just some random Cargo security-related issues I noticed:
- No strong link between the repo and the published code.
- Many crates were spammed that were just a wrapper around a popular C/C++ library. There's no indication of this, so... "surprise!"... your compiled app is now more unsafe C/C++ code than Rust.
- Extensive name squatting, to the point that virtual no library uses the obvious name, because someone else got to it first. The aformentioned C/C++ libraries were easy to spit out, so they often grabbed the name before a Rust rewrite could be completed and published. So you now go to Cargo to find a Rust library for 'X' and you instead have to use 'X-rs' because... ha-ha, it's actually a C/C++ package manager with some Rust crates in there also.
- Transitive dependencies aren't shown in the web page.
- No enforcement or indication of safe/unsafe libs, nostd, etc...
- No requirement for MFA, which was a successful attack vector on multiple package managers in the past.
DISCLAIMER: Some of the above may have been resolved since I last looked. Other package managers also do these things (but that's not a good thing).
In my opinion, any package manager that just lets any random person upload "whatever" is outright dangerous and useless to the larger ecosystem of developers in a hurry who don't have the time to vet every single transitive dependency every month.
Package managers need to grow up and start requiring a direct reference to a specific Git commit -- that they store themselves -- and compile from scratch with an instrumented compiler that spits out metadata such as "connects to the Internet, did you know?" or "is actually 99% C++ code, by the way".
> > "We're pretending security is not an issue." has been the feedback every time this is raised with the Cargo team.
> Do you have a specific link where I can read this response, because this is not at all the responses I have read.
Those aren't people saying security isn't an issue but examples of concerns you have which is different.
For some of those, there are reasonable improvements that can be made but will take someone having the time to do so. While the crates.io team might not be working on those specific features, I do know they are prioritizing some security related work. For instance, they recently added scoped tokens.
For some, there are trade offs to be discussed and figured out.
Sure, but all the drawbacks you enumerate are also advantages for gaining critical mass. A free-for-all package repository is attractive to early adopters because they can become the ones to plug the obvious holes in the standard library. Having N developers each trying to make THE datetime/logging/webframework/parsing library for Rust is good for gaining traction. You end up with a lot of bad packages with good names though.
> Extensive name squatting, to the point that virtual no library uses the obvious name, because someone else got to it first.
Maybe the obvious names should have been pre banned. But I don't see the issue with non-obvious names either way you're going to have to get community recommendation/popularity to determine if
In ASP.NET land, I regularly work on projects where there is an informal rule that only Microsoft-published packages can be used, unless there's good reason.
You don't want to be using Ivan Vladimir's OAUTH package to sign in to Microsoft Entra ID. That probably has an FSB backdoor ready to activate. Why use that, when there's an equivalent Microsoft package?
When any random Chinese, Russian, or Israeli national can publish "microsoftauth", you just know that some idiot will use it. That idiot may be me. Or a coworker. Or a transitive dependency. Or a transitive dependency introduced after a critical update to the immediate dependency. Or an external EXE tool deployed as a part of a third-party ISV binary product. Or...
Make the only path lead to the pit of success, because the alternative is to let people wander around and fall into the pit of failure.
Crates.io has publisher information-- namespacing is not required for that. For example, here are all the crates owned by the `azure` GitHub organization and published by the `azure-sdk-publish-rust` team: https://crates.io/teams/github:azure:azure-sdk-publish-rust
Sorta—it looks like they were mostly just using that system by convention until May 2015, when they finally become enforced [0]. Still, that's a good one that I hadn't thought of, and they at least had the convention in place.
I'm honestly astounded at how badly many languages have implemented dependency management, particularly when Java basically got this right almost 20 years ago (Maven) and others have made the mistakes that Java fixed. With Maven you get:
1. Flexible version (of requirements) specification;
2. Yes, source code had domain names in packages but that came from Java and you can technically separate that in the dependency declaration;
3. You can run local repos, which is useful for corporate environments so you can deploy your own internal packages; and
4. Source could be included or not, as desired.
Yes, it was XML and verbose. Gradle basically fixed that if you really cared (personally, I didn't).
Later comes along Go. No dependency management at the start. There ended up being two ways of specifying dependencies. At least one included putting github.io/username/package into your code. That username changes and all your code has to change. Awful design.
At least domains forced basically agreed upon namespacing.
> Later comes along Go. No dependency management at the start. There ended up being two ways of specifying dependencies. At least one included putting github.io/username/package into your code. That username changes and all your code has to change. Awful design.
"github.io/username/package" is using a domain name, just like Java. Changing the username part is like changing the domain name--I don't see how this is any worse in Go than in Java.
If you don't like that there's a username in there, then don't put one in there to begin with. Requiring a username has nothing to do with Go vs. Java, but rather is because the package's canonical location is hosted on Github (which requires a username).
I don't know why so many programmer's use a domain they don't control as the official home of their projects--it seems silly to me (especially for bigger, more serious projects).
Slight difference is that it wouldn't break existing builds if you changed namespaces in Java. The maven central repo does not allow packages to be rescinded once they are published.
So that old version of package xyz will still resolve in your tagged build years from now even if the project rebrands/changes namespaces.
Note that in Java it is merely a convention to use domain names as packages. There is no technical requirement to do so. So moving to a different domain has no impact whatsoever on dependency resolution. Many people use non-existent domain names.
To be honest I really like how Java advocated for verbose namespaces. Library authors have this awful idea that their library is a special little snowflake that deserves the shortest name possible, like "http" or "math" (or "_"...).
Java did a lot of things right beyond the language and VM runtime, both of which were "sturdy" by the standards of the early 1990s. Using domain names for namespaces was one nice touch. Having built-in collections with complete built-in documentation was another excellent feature that contributed to productivity.
It may be a convention but in practice if you want to publish your package to Maven Central you need to prove ownership of your group ID domain. (Or ownership of your SCM account, which is in essence another domain).
> I don't know why so many programmer's use a domain they don't control as the official home of their projects
Not only that, but a commercial, for-profit domain that actively reads all the code stored on it to train an AI. Owned and run by one of the worst opponents of the OS community in the history of computing.
At least move to Gitlab if you must store your project on someone else's domain.
Yes, #3 in particular is important for many large corps where one team develops a library that may be pulled in by literally thousands of other developers.
The dependency management side of Maven is great. OTOH, I was astounded to learn today that Maven recompiles everything if you touch any source file: https://stackoverflow.com/a/49700942
This was solved for C programs since whenever makedepend came out! (I'm guessing the 80s.)
(Bonus grief: Maven's useIncrementalCompilation boolean setting does the opposite of what it says in the tin.)
100% agree. It's unbelievable what a PITA it is dealing with pip or npm compared to Maven even 10 years ago. The descriptors could get convoluted but you could also edit them in an IDE that knew the expected tokens to make things happen.
No you see Java devs have stockholm-syndromed themselves into believe that a giant stack of XML, or some unhinged mini-language are actually good, and much better than something the humans involved can actually read and parse easily and now to compensate with other ecosystems providing 85% of the functionality, with 5% of the pain, they’ve got to find some reason to complain about them.
Is this a joke? XML is horrible to work with, more boilerplate than information. Compare your average maven file to a cargo.toml and tell me which is easier to work with...
"XML is more verbose" is a lazy criticism in the same veign as "Python is better than Java because you can do 'Hello World' in one line".
Maven files have a simple conventional structure and a defined schema. Once you learn it, it's a breeze to work with. It's not like you need to write hundreds of lines of SOAP or XLST — which is actually why people started to dislike XML, and not because XML inherently bad.
Edit: I'd also take XML over TOML any day, especially when it comes to nested objects and arrays.
For a descriptor verbose is superior. It's way clearer what you're looking at. Matching a named end tag is much easier than matching a }. Also, XSD means you can strictly type and enumerate valid values and you will instantly see if you've written something invalid.
Maven stores every version of every library you've ever needed in a central location. It doesn't pollute your working directory and it caches between projects. And this is more of a Java thing than a Maven, thing, but backwards compatibility between versions is way easier to manage. There's no incompatible binaries because you changed the node runtime between npm install and running your project.
The inverse-style domain name thing does a really good job of removing the whole issue of squatting from the ecosystem. You have to demonstrate some level of commitment and identity through control of a domain name in order to publish.
I would also say that this puts just enough friction so that people don't publish dogshit.
crates.io demonstrates quite clearly that you either have to go all the way and take responsibility for curation or you have to get completely out of the way. There is no inbetween.
and i dont particularly think that using xml is that bad. The schema is well defined, and gives you good autocompletion in any competent IDE (such as intellij).
It took some iterations before maven 3 became "good", so people forget that it wasn't as nice before now! Unfortunately, it seems that the lessons learned there is never really disseminated to other ecosystems - perhaps due to prejudice against "enterprisey" java. Yet, these package managers are now facing the sorts of problems solved in java.
I have no problem with XML in general and even think it's still the better format for many things. But it's not really appropriate for a build config. Thankfully Maven now offers polyglot but I've seen no use of it in the wild.
URLs for packages makes a lot of sense. It works well in the land of Go. It also conveniently eliminates the need for the language to have a global packages database. Upload your package to example.com/your-thing and it's released! (You can, of course, still offer a cache and search engine if you want to.)
No, URL's don't make sense because your application shouldn't care where on the internet your dependency happened to be hosted when you integrated it. It's location has nothing to do with what it is.
By the time you're going to production, your vetted and locked dependency should be living in your own cache/mirror/vendored-repo/whatever so that you know exactly what code you built your project around and know exactly what the availability will be when you build/instantiate your project.
Your project shouldn't need to care whether GitHub fell out of fashion and the project moved to GitLab, and definitely shouldn't be relying on GitHub being available when you need to build, test, deploy, or scale. That's a completely unnecessary failure point for you to introduce.
Systems that use URL-identified packages can work around some of this, but just reinforce terrible habits.
URLs are well structured and unique, with a sensible default - sourcing the file from the internet - and ubiquitous processes for easily mapping the URL to an alternative location.
I.e., when you're going to run the production build, the URLs are mapped to fetch from the vetted cache and not the internet.
I don't see any downsides to allowing them as a source, or making them the default approach
> and ubiquitous processes for easily mapping the URL to an alternative location.
This seems strange to me because the whole point of a Uniform Resource Locator is to specify where a resource can be located.
It's a bit like saying "My project depends on the binder on shelf 7 in Room 42, sixth binder from the left. Except when I go into production, then use...." Don't tell me what binder it's in, tell me what it is.
I can see a case made for URIs, which is basically what Java did.
This was a big annoyance for me back in the day when I was dealing with XML namespaces. URLs never made sense for that use case and too many tools tried to pull XSDs from the literal URL which was always generally out of date, some projects switch to URIs like tag uris or URNs and it was much better, imo.
Fully qualified domain names (java/maven) aren't URIs. The latter are far more transient. Maybe a form of permalink could work, but that likely places too great a burden on package maintainers. I don't see that working out honestly.
Isn’t that why GOPROXY exists though? Not sure why you would need an internet connection. URLs don’t necessarily equate to the internet. Our internal and external packages are all locally hosted and work regardless of the internet being available.
> By the time you're going to production, your vetted and locked dependency should be living in your own cache/mirror/vendored-repo/whatever so that you know exactly what code you built your project around and know exactly what the availability will be when you build/instantiate your project.
In the Go world this would be "vendored" dependencies, that is, the dependencies are within your source tree, and your CICD can build to its hearts content with no care in the world about the internet because it has the deps.
The URL is useful for determining which version of a specific project is being used - "Oh we switched to the one hosted on gitlab because the github one went stale"
The advantage of using gitlab, or github, or whatever public code repository is that you get to piggy back off their naming policies which ensure uniqueness.
But, at the same time, there's no reason that the repo being referred to cannot be in house (bob.local) or private.
Having said all of that, the Go module system is a massive improvement on what they did have originally (nothing) and the 3rd party attempts to solve the problem (dep, glide, and the prototype for modules, vgo), but it's not without its edge cases.
Isn't that just delegating the problem? URL dependencies do not replace what crates.io does, and a modern language will still want something like it. You'd just end up with most every dependency being declared as crates.io/foo.
[registries]
maven = { index = "https://rust.maven.org/git/index" }
[dependencies]
some-package = { index = "maven", version = "1.1" }
Obviously Maven doesn't host any Rust crates (yet?), this is just a theoretical example. Very few projects bother to host their own registry, partially because crates.io doesn't allow packages that load dependencies from other indices (for obvious security reasons). The registry definition can also be done globally through environment variables: CARGO_REGISTRIES_MAVEN="https://rust.maven.org/git/index". Furthermore, the default registry can be set in a global config file.
In theory, all you need to do is publish a crate is to `git push upstream master`, and your package will become available on https://github.com/username/crate-name (or example.com/your-package if you choose to host your git repo on there).
Personally, I don't like using other people's URL packages, because your website can disappear any moment for any reason. Maybe you decide to call it quits, maybe you get hit by a car, whatever the reason, my build is broken all of the sudden. The probability of crates.io going down is a lot lower than the probability of packages-of-some-random-guy-in-nebraska.ddns.net disappearing
It doesn't help with the failure mode of dependencies disappearing, which forces people that care about it to vendor, which in turn brings its own set of issues.
Cargo does support URLs to git repos for dependencies. But crates.io is the official platform and almost every search I do on it returns at least one generically named entry with an empty repository that someone snatched away and never used.
I don’t work with rust on the regular, but this is so annoying with package repositories in general. No don’t use http-server, it’s bad, instead you have to use MuffinTop, it’s better. And then you just have to know that. The concept of sanctioned package names would be interesting, but probably chaotic in practice as the underlying code behind this sort of alias changes over time. This will remain a part of being a domain expert in any given ecosystem forever I think, hooray!
Crates.io names are human-meaningful and everyone sees the same names, but it's vulnerable to squatting, spamming, and Sybil attacks.
You could tie a name to a public key, like onion addresses do, but it's unwieldy for humans. (NB, nothing stops you from doing this inside of crates.io if you really wanted)
You could use pet names where "http-server" and "http-client" locally map to "hyper" and "reqwest", but nobody likes those, because they don't solve the bootstrap problem.
It's a problem with all repos because when you say "http-server should simply be the best server that everyone likes right now", you have to decide who is the authority of "best", and "everyone", and "now". Don't forget how much useless crap is left in the Python stdlib marked as "don't use this as of 2018, use the 3rd-party lib that does it way better."
So yeah... probably will be a problem forever. As a bit of fun here are some un-intuitive names, and my proposed improvements:
- Rename Apache to "http-server"
- Rename Nginx to "http-server-2"
- Rename Caddy to "http-server-2-golang"
- Rename libcurl to "http-client"
- Rename GTK+ to "linux-gui"
- Rename Qt to "linux-gui-2"
- Rename xfce4 to "linux-desktop-3"
Then you only need to remember which numbers are good and which numbers are bad! Like how IPv4 is the best, IPv6 is okay, but IPV5 isn't real, and HTTP 1.1 and 3 are great but 2 kinda sucked.
Very simple. If a company as big as Apple can have simple names like "WebKit", "CoreGraphics", and "CoreAudio" then surely a million hackers competing in a free marketplace can do the same thing.
> Perhaps my biggest critique is that crates.io has no namespacing. Anyone can just claim a global and generic package name and we mostly have to deal with it (unless you avoid using the crates.io repository, but then you'll probably have more problems...). Some of these globally-claimed generic packages are not really the best package to use.
This is true that with no namespace anyone can end up squatting a cool name, but with namespace you end up in an even worse place: no-one end up with the cool names, and it makes discoverability miserable, because instead of having people come up with unique names like serde, everybody just name their json serialization/parsing library json and user now needs to remember if they should use "dtolnay/json" or "google/json" (and remember not to use "json/json" because indeed namespace squatting is now a thing), and of course this makes it completely ungoogleable.
We've had the namespace discussion for hundreds of time in the various Rust town squares, and the main reason why we still don't have namespace is because it doesn't actually answer the problem it's supposed to address and if you dig a little bit you realize that it even makes them worse.
Having a centralized public and permissionless repository opens tons of tricky questions, but namespaces are a solution to none of them.
> everybody just name their json serialization/parsing library json and user now needs to remember if they should use "dtolnay/json" or "google/json"
What's the problem with that? You will have to explicitly state intention which is always a good thing.
> and of course this makes it completely ungoogleable
You are aware that just pasting username/projectname gives you exactly what you are looking for in the several top search engines, correct?
> We've had the namespace discussion for hundreds of time in the various Rust town squares, and the main reason why we still don't have namespace is because it doesn't actually answer the problem it's supposed to address and if you dig a little bit you realize that it even makes them worse.
OK, if you say so. Is this worse state of things documented somewhere?
If you have a namespace, can't people just globally-claim namespaces instead? like serde/serde or something similar. I feel if you really don't want people claim whatever they want, you have to do the Java package style where namespaces are tied to domain names.
While Java made that notation famous, it was already used in NeXTSTEP and Objective-C, hence why you will see this all over the place in macOS and derived platforms, on configuration files and services.
CPAN has the best model IMHO. Hierarchy that starts with categories. You build on top of the base stuff and extend it, rather than reinvent/fork something with a random name. Result is a lot more improvement and reuse, and more functionality with less duplication. Plus it's easy to find stuff.
Perl's whole ecosystem is amazing compared to other languages. It's a shame nobody knows it.
And the horrible decision to not make library-level ("module") and code-unit-level ("package") namespacing orthogonal. The former was an afterthought tacked on since the package system was designed to be used only within Google's monorepo and little care was paid to how it would work when it was released to the public and used more generally.
I want to understand what you just said, but I fear watering your language down a bit might be a tall ask with some people. Would you be willing to eli5 what you believe Go did that was a horrible decision with regards to module/package namespacing?
I think what they mean is: if you see a line like `import git.example.com/foo/bar/baz`, that could be package `baz` inside module `git.example.com/foo/bar`, or it could be package `bar/baz` inside module `git.example.com/foo`.
Also, even if you know it's the latter, package namespacing isn't strictly related to directory structure, so `bar/baz` has no specific meaning outside of the context of a go import. They could have used any other separator for package components - `git.example.com/foo:bar:baz` - but instead they chose the slash, making the scheme both technically ambiguous and easy to confuse for an HTTP URL.
Ah that makes sense. I think Go did somewhat stumble a bit in the early days due to this, especially with repositories in GitLab, where GitLab allows essentially a directory tree where your repository can be nested indefinitely in directories like `https://gitlab.com/mygroup/subgroup1/subgroup2/repository`.
I still don't think this is a huge issue, to be honest. Not one big enough for me to complain about, for sure. But it's definitely not ideal.
This needs to be resolved by every damned language.
Just make signed dependencies a universal default, point to an https page for the package vendor and use the signing key from there.
Neither node nor maven ever bothered to solve this, so we end up wandering the Wild West wondering when it will be that HR, or legal, or architecture comes knocking on the door to ask what we were thinking having a dependency on a dynamic version of Left Pad.
I'd kinda like to see what Cloudflare and Let's Encrypt could come up with if they worked together on at very least a white paper and an MVP POC.
Pay somebody either internally or externally to maintain a repo of all your dependencies and point your code at that. You won't get a left-pad incident. You won't get a malicious .so incident (unless you mirror binaries instead of source code).
Like if you ran out of screws to make your product with do you walk around the street and scrounge up some? No, you go to a trusted vendor and buy the screws.
I'm suggesting that the guys who've repeatedly proven themselves technically competent around security at scale might have a couple of useful ideas regarding how the industry might go about crawling its way out of this little security clusterfuck.
And perhaps even stop treating something as simple as a BOM as an enterprise feature, given that the overhead on such things is damned near zilch and the security implications are staggering.
There's reasons why Google projects don't go out on the internet to get their 3rd party deps.
They're all checked into Google3 (or chromium, etc.). One version only. With internal maintainers responsible for bringing it in and multiple people vetting it, and clear ownership over its management. E.g. you don't just get to willy nilly depend on a new version -- if you want to upgrade what's there, you gotta put a ring on it. If you upgrade it, you're likely going to be upgrading it for everyone, and the build system will run through the dependent tests for them all, etc.
And the consequence is more responsible use of third party deps and less sprawling dependency trees and less complexity.
And additional less security concerns as the code is checked in, its license vetted, and build systems are hunting around on the Internet for artifacts.
Agreed. The other thing I don’t really like is that you can’t split up things in the rust namespace hierarchy between crates (something that’s natural with jars in JVM). I would have liked to have defined things so that I could have the unicode handlers for finl live in finl::unicode, the parser in finl::parser, etc. but because they’re in separate crates rust gets upset about finl being defined twice and there’s no workaround for it. There are likely pitfalls I don’t see in what I want, so I live with it.
As I recall, I could do something where I could have a common root crate that would import and re-export the other crates with the namespaces modified on the exports and then control which crates are exported through feature gates, but it just seemed more hassle than it was worth.
I don't think a lack of namespace is that much problem. Sure, it is often annoying, but people are creative enough to create a short enough and still available crate name for most cases. Namespacing only makes sense for a large group of related crates---and it wouldn't give much benefit over a flat namespace.
As other mentioned though, a typosquatting is a much bigger problem and namespacing offers only a partial solution. (You can still create a lookalike organization name, both in npm and in Github.)
I love this lack of namespacing personally, because it means that whatever crate you see in a project is going to be the same as the crate your see in another one. Never need to alias crate names. It happens in Golang all the time and I really think namespacing packages was a mistake there.
Golang's problems aren't due to using namespaces, they're due to delaying too many decisions until too late.
Go has namespacing mostly because for a long time it didn't have a package manager at all, so people just used a bunch of ad hoc URL-based solutions mostly revolving around GitHub, which happens to have namespaces and also happened to lend itself to aliasing (because a whole GitHub URL is too long).
If you want to look at an actual example of namespacing done well, Maven/Java is the place to look. There is no aliasing—the same imports always work across projects.
I don't see the problem. Even with namespaces you'll have
brandonq/xml vs parsers/xml with no clue if one is better then another.
Also possibly with some confusion over whether things with the same name are forks or not. May make it a little more difficult to Google. Why not just have brandon_xml vs xml-parser and have a community list of best and most popular libraries?
I guess the only issue is that some generic/obvious package names are bad packages. That can have been avoided if they banned/self-squated most of the names. I suppose if you use dns namespaces and actually tie it to ownership of the domain name it might make sense but that would also cause issues(what if you forgot to renew the domain?).
The advantage is one of trust. If the `abc` developers build well known library `abc.pqr` are well trusted then I know I can use `abc.xyz` and everything else under the same namespace without (much) vetting.
We could even have `rust.xyz` for crates that are decoupled from `std` but still maintained by rust core devs such as `regex`.
> brandonq/xml vs parsers/xml with no clue if one is better then another.
You will have this problem only once. After an initial research you settle on one and move on with life.
I don't get how having to vet a dependency is going to be more difficult than before. The process is 99% the same, you still have to do the research work initially in both cases.
Yet to see any proof that namespacing has made things better in other ecosystems, are go style links or other types of namespaced imports any less prone to supply chain risks?
It's definitely a good thing that people choose new unique names for crates rather than dijan/base64 vs dljan/base64
Do understand the desire of having a crate for audio manipulation called "audio" but at the same time how often do we end up with "audio2" anyway? It's an imperfect solution for an imperfect world and I personally think the crates team got it right on this one.
> Yet to see any proof that namespacing has made things better in other ecosystems
It's really as simple as this: many libraries are generic enough implementing something that already exists. Let's say you want a library to manage the SMTP protocol. On crates.io, of course someone has already taken the "smtp" crate (ironically, this one is abandoned, but has the highest download counts, because it's the most obvious name). Let's say you disagree with the direction this smtp crate has gone, and you make your own. What do you call it?
Namespaces solve this problem. You'd instead of have user1/smtp and user2/smtp competing in feature sets. You can even be user3/smtp if you don't like the first two.
This is precisely what Java enables too. The standard library is in com.java.*; if you don't like how the standard library does something, you can make com.grimburger.smtp and do it yourself. If you choose to publish to the world, all the more power to you. It doesn't conflict with the standard library's smtp implementation.
This is a common critique, and although I don't have insight into why the original decision to not have namespaces was made, the current outlook is that until issues related to continuity are resolved, it's a no go:
That article starts with the premise that “it’s a feature, not a bug” then goes on to describe a whole bunch of things I consider to be anti-features of a packaging system that has a flat namespace.
The first section says it discourages forking. I consider this to be bad. Nobody’s code should be more important purely because it squatted a better name.
The Identity section actually makes the case that flat registries make naming harder.
The section on Continuity is “we’ve tried nothing and we’re all out of ideas”. Make up an org name and grandfather all packages in the flat namespace into that special org. Also this is already a problem because packages in the flat namespace do get abandoned, then forked, and then we have the associated issues.
The section on Stability seems to take it as a given that crates.io should be the only registry. I don’t. It also seems to conflate cargo with rustc for the benefit of the argument.
The squatting section describes only anti-features and I don’t consider the author’s legitimate use cases to be legitimate reasons to squat.
I think the only legitimate problems that need addressing are the ergonomics of accessing namespaced packages throughout transient dependencies and backwards compatibility with non-namespaced code. But the fact that these are real problems does not, to me, make a flat namespace a “feature”. It’s just easier to implement.
It’s okay for it to be a mistake that takes effort and time to fix.
Another option would be to grandfather all packages into their own org. So serde becomes serde/serde. This way you don't need to manage permission rules in the legacy "all" namespace.
You get some oddities such as serde-derive/serde-derive but the package owners can choose if they want to move to serde/derive or leave it in a separate namespace.
Maybe it was a reaction against the Java-style reverse DNS notation, which is verbose and annoying, but a more GitHub-style user/group namespace prefixing package names would have been the nice middle ground.