Cargo: predictable dependency management

lilyball · on May 5, 2016

Something that I'm surprised this page doesn't talk about, and which is very important considering the recent hubbub over left-pad, is that any dependency you get on Cargo can be relied upon to continue to exist forever (well, as long as the crates.io site still exists, but if that goes away so does the Cargo index). The reason is because you can't ever remove a published version of your crate from crates.io. You can yank a version, which tells Cargo not to allow any projects to form new dependencies on that version, but the version isn't actually deleted and any projects that have existing dependencies on that version will continue to be allowed to use it. This is documented at http://doc.crates.io/crates-io.html#cargo-yank.

incepted · on May 5, 2016

> Something that I'm surprised this page doesn't talk about, and which is very important considering the recent hubbub over left-pad, is that any dependency you get on Cargo can be relied upon to continue to exist forever

Maybe the reason why it's hardly talked about is because it's common sense and pretty much all dependency managers support it?

Except node of course, because they have no idea what they're doing.

lilyball · on May 6, 2016

Except not all dependency managers support it. I'm not even sure if it's safe to say that a majority of dependency managers support it.

gnoway · on May 6, 2016

Package persistence is not really a dependency manager feature though, it's a package repository feature. npm didn't fail; npmjs.com did.

I'm just waiting for this to happen to bower next. AFAIK they're just a registry pointing to github. All it's going to take is someone doing a force push without thinking and we're in this same situation again.

bulkan · on May 5, 2016

> Except node of course, because they have no idea what they're doing.

NPM and node are two different things.

nathancahill · on May 6, 2016

..his point stands.

tatterdemalion · on May 5, 2016

This is incorrect and very rude. Rubygems has also implemented an automated delete feature. Rubygems and NPM have orders of magnitude more users than crates.io; once crates.io approaches that scale the team will have to allow users to delete crates or somehow find funding to field these kinds of support tickets.

riffraff · on May 6, 2016

the parent comment might have been worded better, but it has a point, i.e. removing published code generally has to be supported in some way to handle special situations, e.g.

* people publish stuff inadvertently (i.e. private information/keys) * people publish stuff they are not allowed to (i.e. copyright and trademark violations) * people publish stuff you do not want to see published (i.e. stuff related to breaking legal or ethical laws)

It can be argued that users must not do that or that cargo.io doesn't have to oblige, but if they, for example, get a DMCA notice they'll still have to.

gnoway · on May 6, 2016

I think the key is that crates.io (and rubygems.org, nuget.org, etc.) as repo owners have to own the removal operation themselves and not delegate that to the package owner/maintainer. The repo owner is better positioned to take package consumers' needs into account and make good decisions about when and how to remove a package.

As far as satisfying support requests: obviously exceptional stuff like DMCA gets handled, but package owners publishing keys, etc. is still their responsibility and shouldn't accelerate or even guarantee removal IMO. If you publish secrets, it's not crates.io's responsibility to help you hide that mistake, and you need to be changing those secrets anyway.

tatterdemalion · on May 6, 2016

> I think the key is that crates.io (and rubygems.org, nuget.org, etc.) as repo owners have to own the removal operation themselves and not delegate that to the package owner/maintainer. The repo owner is better positioned to take package consumers' needs into account and make good decisions about when and how to remove a package.

What does this mean? Rubygems and npm both allow a full unpublish, not only a yank. crates.io does not. Rubygems and npm are the same in this regard.

Remember that we are not talking about npm unpublishing someone's library without their approval (that's a different issue), we are talking about npm allowing a user to unpublish a library, which crates.io does not allow a user to do. I will repeat, because people seem not to believe it: rubygems allows this as well, specifically because the maintainers of rubygems could not handle the support tickets that resulted from not providing this feature.

The idea that this is some amateurish aberration from npm is a myth. If Rust is lucky, someday crates.io will have to choose between paying someone to field these tickets or letting users unpublish code.

riffraff · on May 6, 2016

the problem with saying "users should not publish their ssh keys" is that they will still do it and ping you with requests to remove them even if you have said it's not possible to do it, causing unnecessary support work.

That is, AFAIU, the reason the rubygems.org maintainers allow it now.

http://blog.rubygems.org/2015/04/13/permadelete-on-yank.html

gnoway · on May 6, 2016

Except they even state:

"If you’ve pushed a gem with internal code, you still need to reset API keys, URLs, or anything else sensitive despite the new behavior."

And:

"...we’ve been using an Amazon S3 bucket to store the gems for years now with versioning on - so if someone does remove gems that are necessary, we can easily restore them."

So what they've really done is given developers the illusion that the unwanted gem has been removed, while introducing the ability to break everyone's workflow just like npmjs. In some ways this is worse than before; devs still need to change secrets, and if it's non-secret sensitive code they are concerned about, it's still 'out there' and the dev still has to trust that the rubygems.org people don't do something unwanted with it.

technomancy · on May 6, 2016

> If you publish secrets, it's not crates.io's responsibility to help you hide that mistake

If you publish secrets, you're causing crates.io to perform copyright violations. It's not just about helping you hide the mistake, but about helping to stop further violations.

You shouldn't need to resort to a formal DMCA request to stop copyright violations.

gnoway · on May 6, 2016

How is publishing a secret (I mean passwords, ssh keys, etc.) a copyright violation?

technomancy · on May 9, 2016

Passwords are usually under the complexity threshold of copyrightability, but keypairs aren't.

But secrets can have many other kinds too, like accidentally publishing code which is under a restrictive license.

tatterdemalion · on May 6, 2016

How could my comment have been worded better? It is incorrect to claim that npm is alone in allowing automated unpublish, and rude to suggest the reason they allow it is because they do not know what they are doing, when in fact the reason they are doing it is because they have far more users than any other package manager. My comment was completely true and not at all aggressive.

waffle_ss · on May 5, 2016

> You can yank a version, which tells Cargo not to allow any projects to form new dependencies on that version, but the version isn't actually deleted

That's how RubyGems used to work, except you could contact the support team and ask them to permanently delete your gem version if something really sensitive and irrevocable was put into it. They had to change that due to their support log getting too big: http://blog.rubygems.org/2015/04/13/permadelete-on-yank.html

lilyball · on May 5, 2016

Cargo's policy is that if you upload secrets, you need to change the secrets because the code can't be deleted. From the aforelinked page:

> A yank does not delete any code. This feature is not intended for deleting accidentally uploaded secrets, for example. If that happens, you must reset those secrets immediately.

waffle_ss · on May 5, 2016

Even if they're given a court order to take a version down? It may be their policy, but I guarantee it will happen at some point.

steveklabnik · on May 5, 2016

We will comply with the law where required.

dkersten · on May 5, 2016

Court orders will be a lot less frequent than normal user requests so the overhead to the team won't be as high.

waffle_ss · on May 6, 2016

Of course, and I totally agree with their approach. I'm just saying the narrative eridius is pushing that somehow they can make assurances about things never being deleted totally false. For example, I'm sure if someone somehow put child pornography in a Rust crate it would rightly be taken down pretty fast (and not require a court order).

Also I just wanted to give a little history because eridius's original comment made it sound like it was a novel concept.

cfallin · on May 5, 2016

This seems a bit... inflexible. I definitely understand the arguments for not breaking builds and for reducing administrative overhead and such, but not every bit of secret data can be revoked like a key/credential can. What if you accidentally include user data, or proprietary business-logic code, or...? (Yes, with proper data hygiene and processes you'd never even come close to doing any of that, but it seems there should still be an escape hatch.)

kibwen · on May 5, 2016

And of course, a centralized site can still be compelled to remove a package via legal action if push comes to shove.

kibwen · on May 5, 2016

  > if that goes away so does the Cargo index

Not so, the location of the index is independent of crates.io. Currently the index itself is just hosted on Github, whereas all of crates.io is hosted on S3. Which is actually kind of a pain sometimes, since if Github goes down it means that Cargo won't be able to find the index and I don't know if there's an easy way to override the index check. In the future I expect Cargo will gracefully continue if the index can't be updated due to connection failure, though I'd think it would prompt the user to make sure they're aware that their local copy of the index might be out of date.

lilyball · on May 6, 2016

Ah, good to know. I assumed the index was hosted as part of crates.io but I didn't actually bother to check. The point I was trying to make (even if I didn't communicate it properly) was that if crates.io goes away for good, the index will too and so it won't really matter that the code is inaccessible because cargo won't know where to find it anyway. Of course, based on what you said, it's certainly possible for everyone involved in crates.io to get hit by a bus and crates.io vanish when the S3 bill goes unpaid while still leaving the index up on GitHub, but if crates.io is taken down intentionally then presumably so will the index.

jalami · on May 5, 2016

I know it wouldn't really fix issues like left-pad, but I would really like namespaced packages similar to the way Github does them. I think group ownership would be more explicit and understood. Top level packages encourages sqatting and small/old/unmaintained packages getting names that people would misunderstand for something else. I understand a lot of that is on the user of the library to research before downloading, but intuitiveness is a virtue. It's true packages could just be called eg. reactjs-node-bridge or something as opposed to reactjs/node-bridge, but anyone can prefix deceptively.

Cargo addresses the namespacing concern here http://internals.rust-lang.org/t/crates-io-package-policies/...

themartorana · on May 6, 2016

I think this is the scenario that has me feeling warm and fuzzy about vendoring. Sure, eventually even vendored dependencies will become stale, but I don't have to rely on a package manager past the initial delivery. It's not perfect, but I don't sweat code disappearing from a black box I don't own.

fzzzy · on May 5, 2016

It does mention it, briefly:

> This is enough information to uniquely identify source code from crates.io, because the registry is append only (no changes to already-published packages are allowed).

weinzierl · on May 5, 2016

> well, as long as the crates.io site still exists, but if that goes away so does the Cargo index

That makes me wonder: Is it easy or possible to replace crates.io with a self hosted repository?

Background of the question is that I know of a company where access to the standard public maven repo is forbidden. They use a commercial repository provider but I don't know if it is hosted on premises.

steveklabnik · on May 5, 2016

It's not easy, but it's possible. Everything is open source, and Cargo can easily be pointed at whatever host you want.

The feature that I'd like to see but haven't found the time to implement is delegation: "look at this index first, but if it's not here, go look at this other one". Right now, if you spin up your own crates.io and point Cargo at it, it won't have any packages... which works for some people, but not others.

weinzierl · on May 6, 2016

> "look at this index first, but if it's not here, go look at this other one"

I agree that this would be very helpful for a lot of people, but it is kind of the opposite of what I was asking about.

As far as I understand it the commercial repos try to solve two main concerns:

1. license compliance

2. security

Better performance and reliability are just additional benefits.

I only know the details for a certain Fortune 500 company. They don't want the builds to fetch packages from a site they don't control, and they certainly don't want "if it's not here, go look at this other one". The idea is more control about where the packages come from, not more flexibility.

I think if Cargo doesn't provide a way support alternative (possibly commercial) repos, it would be an obstacle to the adoption of Rust in the corporate world.

steveklabnik · on May 6, 2016

If you just want to ignore the broader OSS ecosystem, and run your own version of Crates.io behind a firewall, that's 100% supported today. The only real issue is that how to do so isn't particularly well-documented.

weinzierl · on May 6, 2016

Sorry for pestering again but I think this is kind of important and I haven't made myself entirely clear yet (English is not my first language).

In maven the repo URL is configurable in settings.xml. This URL can be different for different departments of even different projects.

From what I see in the cargo source the crates.io URL is hard coded. So the DNS is the only level of redirection we have. Using varying IP addresses for crates.io for different departments or even projects wouldn't fly, at least not in the world I live in.

>ignore the broader OSS ecosystem

It's not about that either because the commercial repos contain very much the same OSS packages as the standard repo but don't present all of them to everyone all the time. Take for example a car company: GPL3 for in-house projects that are just used by the employees are discouraged but somewhat tolerated. GPL3 for projects that run in the car are a big big no no. You want to be certain that no dev ever introduces GPL3 source into anything that is in the car. You want your build to fail if any of packages change their license to GPL3. You want your build to fail if any of you packages has a known vulnerability.

I know Cargo is not maven, but I believe this is a feature which is crucial for industry adoption. I think I will just add a feature request for this on GitHub.

steveklabnik · on May 6, 2016

  > Sorry for pestering again

No worries! This thread is a bit old but I'll try to pay attention to it.

  > From what I see in the cargo source the crates.io URL is hard coded.

It's not: http://doc.crates.io/config.html#configuration-keys

TL;DR:

  [registry]
  index = "URL_GOES_HERE"

and you're good.

  > It's not about that either because the commercial repos contain very
  > much the same OSS packages as the standard repo but don't present all 
  > of them to everyone all the time.

Ahh yeah. What I mean is, you'd have to set up the packages in that registry yourself. Which sounds like what they'd want to do, so seems fine.

pyre · on May 5, 2016

What would be useful is a sort of "caching proxy" that could have various knobs to handle situations like:

- crates.io is down

- crates.io says this cached package doesn't exist.

- etc...

ecnahc515 · on May 5, 2016

This is already possible today with existing caching proxies. This is a great way to make your CI/builds more reliable and quicker.

pyre · on May 7, 2016

I meant a caching proxy that functioned as a mini Crates.io in the absence of actual crates.io being up. Depending on the crates.io protocol, just caching HTTP requests might not be enough, but also acting as a (offline-able) middle man that knows the protocol gives rise to other knobs and such (e.g. a configurable blacklist of packages).

ybx · on May 5, 2016

What would be really nice is if there were a system like how debian, ubuntu, etc. does it, and allow for official (and unofficial) mirrors

viraptor · on May 6, 2016

But that's really a crates.io feature, not cargo's. You could easily deploy your own repository which serves everyone a random file when you pull "some_crate-1.2.3".

airless_bar · on May 6, 2016

Let's just see what happens when they get their first court order, which they certainly will (due to their global name-spacing).

steveklabnik · on May 6, 2016

As I said below, we will comply with the law.

(And I'm not sure what that has to do with namespacing.)

eyko · on May 6, 2016

I believe he means that packages aren't namespaced (owner/package-name), so if I publish a package called nike, Nike could come and want to take over the package name.

kibwen · on May 6, 2016

Global namespacing has nothing to do with that. Even with per-user namespacing, there's nothing to stop a user from making their username "nike" and publishing all their code under "nike/package-name", which would be subject to just as much potential legal action as a package named "nike" itself. Likewise there's also nothing to stop Bob from uploading "bob/nike".

airless_bar · on May 6, 2016

If course global name-spacing has something to do with it!

What's the difference between a "username" resulting in nike/package-name and a package called nike-package-name?

Not much in practice. Name-spacing doesn't mean "add another made up name entry somewhere".

That's why people who thought about these issues out-sourced this concern completely: Prove your ownership of nike.com and you can publish as nike.com.

Suddenly 99.9% of the trademark issues are gone, handled by registrars.

I think this approach is pretty obvious and I find the lack of thought coming from rust devs deeply concerning.

kibwen · on May 9, 2016

I'm not sure what you're trying to say. The problem exists independently of whether or not global namespacing is implemented by a package repository. Nike would have just as much authority to remove a "nike" package in a global namespace as it would to remove "bob/nike" (which is to say, almost no authority whatsoever).

I have no dog in the fight over whether or not a global namespace is a good idea, but it's tiresome to see irrelevant arguments being trotted out against global namespacing.

airless_bar · on May 11, 2016

Jesus Christ, just get on with times.

Even Node has overtaken Rust.

wyldfire · on May 6, 2016

It wouldn't likely meet the standard of "confusingly similar" unless you called it "Nike" in order to somehow leverage the actual shoe brand (logos 'n all). But even if it didn't, you could imagine an index operator succumbing to the (unreasonable) request of a lawyer in order to avoid a conflict requiring representation.

anp · on May 5, 2016

I like many things about Rust, but Cargo alone is what initially got me started with it. I was in the process of setting up a fresh C++ project with vendored dependencies, and it was just an absolute nightmare. In contrast, it took me about 5 minutes to get a Rust project set up with comparable dependency complexity.

akhilcacharya · on May 5, 2016

Speaking of which, is there any good package management system for modern C/C++?

ingve · on May 5, 2016

Conan is the most promising one, they just released a new version:

http://blog.conan.io/2016/05/03/New-conan-release-0-9.html

Arne Mertz has a Vagrant box for people who want to experiment with a "4C" development environment (Clang, Cmake, Conan, Clion).

https://github.com/arnemertz/Xubuntu1604_DevBox

anp · on May 5, 2016

I've had biicode recommended to me (https://github.com/biicode/biicode), but I haven't tried it yet.

EDIT: Also worth noting that a common response I've heard to "I need a better manager for my dependencies" is "you have a system package manager for a reason."

McP · on May 5, 2016

What do they say when you reply "Windows"?

anp · on May 5, 2016

I think it goes without saying that those with that sort of response tend not to use Windows, and also often seem to assume that a perfectly in-order Linux/Unix install is the only way to build software.

FreeFull · on May 5, 2016

And also that you're fine with creating a package for any dependency that doesn't have one.

anp · on May 5, 2016

Creating, maintaining, shepherding through distro processes, etc. All so you can then eventually build your own code.

McP · on May 5, 2016

Conan seems to have appeared recently but I haven't yet tried it myself. https://www.conan.io/

fungos · on May 6, 2016

Even if it doesn't has too much support, I highly recommend fips: https://github.com/floooh/fips

I have been using it, adding some libs and even contributing some changes. You should really give it a try!

rsaarelm · on May 5, 2016

The Meson build system http://mesonbuild.com/ seems to have a package manager like dependency management system.

imron · on May 5, 2016

It's also very quick and easy to get projects up and running.

For *nix based platforms Meson is now my go to build manager for c++ projects.

boris · on May 5, 2016

Check out https://build2.org

qznc · on May 5, 2016

I understand that old languages like C++ or Java have problems providing a single package manager. It seems that newer languages have roughly equivalent services.

Are there any big differences between Cargo and Python/Go/D/Ruby/Javascript/etc?

haddr · on May 6, 2016

I wouldn't say Java has this problem. Look at Maven for instance. Then look back at Crates, you will notice how many common solutions they share.

On the other hand I see that Crates solves the problem of dependency when 2 different version of the same library are used across the project. Maven doesn't solve that and it's painful. However this seems a bigger problem that just a package manager I suppose.

mavelikara · on May 6, 2016

Java has Maven Repository. This can be use from within Ant (Ivy), Maven, Gradle, sbt etc.

Both Ruby and Python are almost as old, if not older than Java. JavaScript is only slightly younger.

tatterdemalion · on May 5, 2016

I find cargo easier to use than bundler or npm. Specifically I never deal with "environment hell" where some config hasn't been picked up and I'm pulling in the wrong version of a dependency or the language or whatever. This is partly because of the way cargo is designed, but its also helped a lot by the fact that Rust generates statically linked binaries in one build instead of being a continuous interpreter process.

steveklabnik · on May 5, 2016

It depends on how you define "big". Cargo is closest to Bundler + rubygems, but with some aspects of npm.

riffraff · on May 6, 2016

notice that some of the really old ones (perl, tex) have had package managers for a really, really long time.

I think linux (*nix?) vs non-linux heritage might have something to do with it.

dastbe · on May 5, 2016

How does Cargo handle indirect dependency visibility? This article talks about visibility in terms of "do I need to manually include indirect dependencies? No!" but not in terms of "can I accidentally write code against an indirect dependency?" which isn't sufficiently answered.

If the answer is that indirect dependencies are still visible, I'd be interested in knowing if rust-lang/cargo plan to change that, similar to JDK9's module system for re-export.

SimonSapin · on May 5, 2016

Short answer: `extern crate foo;` doesn’t work if `foo` is an indirect dependency. So you cannot accidentally use something without declaring it in `Cargo.toml`.

Longer answer: when running rustc, Cargo passes individual `--extern bar=/path/to/bar.rlib` options for each direct dependency, not just a path to a directory with everything. You can see the exact command with `cargo build --verbose` or `cargo build -v`.

tatterdemalion · on May 5, 2016

As the sibling answer says, they are not visible.

However, crates can re-export symbols they have imported from their dependency, which couples their semver to that dependency's in a way that people have often not realized. This has been the source of some upgrade headaches so far, but its less of a problem than it has been with languages in my experience.

keeperofdakeys · on May 5, 2016

It's also relevant that rust can use multiple direct/indirect dependencies of different versions, due to the way that symbols are exported.

xkarga00 · on May 5, 2016

I have been working as a Go programmer for the past two years and I really love the language but dependency management has been one of the biggest pains for me. Godep really sucks (that's what we use in my current gig - we plan on switching to Glide, still haven't played around with it) and I am not sure I like the fact that I need to go out in the wild and choose between the rest of the existing solutions. I would prefer to have something that works ootb. The Rust team really got this right and Cargo is amazing. One of the main reason I have been dabbling with Rust lately.

dalailambda · on May 5, 2016

In my experience Glide is the closest to a modern package manager for Go.

gue5t · on May 5, 2016

Cargo assumes you want to cache downloaded/built dependencies at the granularity of a unix user account. It's very easy to break things if you try to force current cargo cache dependencies at the per-project level and there's been no interest in caching things at a per-machine or shared-between-machines level. If duplicating compilation work is desirable to ensure some amount of noninterference, it should be supported to have a package cache per project. If we can treat Cargo as a pure function from inputs to outputs, we should be sharing these build results across the Internet, at least within the set of machines used by one person.

Cargo also refuses to follow the XDG directory specification for its per-unix-user configuration; it should put things in $XDG_CONFIG_HOME and $XDG_CACHE_HOME, but instead dumps both configuration and downloaded/compiled stuff in ~/.cargo.

dbaupp · on May 5, 2016

> Cargo assumes you want to cache ... built dependencies at the granularity of a unix user account

This isn't true: actual build artifacts are cached in a per-project way (in `./target` by default). Cargo does download all packages to a user-shared directory, but this directory is essentially immutable, just a list of the source code.

Sharing more broadly than per-project is the form that is unreliable, things can change subtly between projects e.g. different targets, different compiler flags. Of course, cargo has essentially full information about everything involved in a build and so can track this---in the limit caching all the different configurations of each crate version---but I don't think it does currently (I recall an issue about it, but I cannot find it at the moment).

wyldfire · on May 5, 2016

> it should put things in $XDG_CONFIG_HOME and $XDG_CACHE_HOME

I was going to submit an issue but one already exists [1][2]

[1] https://github.com/rust-lang/cargo/pull/148

[2] https://github.com/rust-lang/cargo/issues/1734

anp · on May 5, 2016

There's an ongoing discussion of using build caches with Cargo:

https://internals.rust-lang.org/t/is-a-shared-build-cache-a-...

TLDR: There's interest, but some wrinkles need to be sorted out.

cm3 · on May 5, 2016

I like Nix-style caches myself, but a gc-cache command is needed in order to clean up only the stuff that isn't referenced anymore. I hope that's a design consideration.

yberreby · on May 5, 2016

> If we can treat Cargo as a pure function from inputs to outputs, we should be sharing these build results across the Internet, at least within the set of machines used by one person.

AFAIK, all it takes is a single `build.rs` to make rustc, and therefore Cargo, an impure 'function'. I'm not sure about compiler plugins, but I expect them to behave the same way.

steveklabnik · on May 5, 2016

You could also change $CARGO_HOME per project, to get a per-project cache.

BenTheElder · on May 5, 2016

I actually do this with one project, with a very small script that wraps cargo and sets $CARGO_HOME to a project local path before actually calling cargo. I keep a very short cargo.py and cargo.sh with my project.

I've experimented with using this to "vendor" dependencies in a local .cargo and so far that seems to work. An early example with discussion of the python version is here: https://www.reddit.com/r/rust/comments/3ea6je/is_there_a_bet...

SimonSapin · on May 6, 2016

Servo does this by default. (People who tried Servo but don’t otherwise develop in Rust didn’t like having random files in their home directory.)

e12e · on May 6, 2016

It sounds like there's two obvious work-arounds, and I'm not sure if I'd consider them all that bad: 1) use a caching http proxy for the downloads (in my experience cargo itself is pretty quick), or 2) use a shared user account for building ("sudo -s cargo-build; ... ").

I'm not sure concurrently writing to a group-writeable cache-folder is such a hot idea -- even things like apt use a write-lock when doing updates.

I suppose a third option is to mount (or probably symlink) ~/.cargo on a filesystem with deduplication -- but that wouldn't buy you caching for free (might work well coupled with a caching http proxy though).

masklinn · on May 5, 2016

> it should put things in $XDG_CONFIG_HOME and $XDG_CACHE_HOME

My unix does not follow XDG[0], neither of these are set and the XDG "fallbacks" are utter garbage, now what?

[0] hell, only a minority of linux distros do at all

jhasse · on May 5, 2016

Why are the fallbacks (~/.config and ~/.cache) utter garbage?

thristian · on May 6, 2016

How can a unix follow or not follow XDG? It's an application level concern, not a distro or kernel level one.

steveklabnik · on May 6, 2016

Distros distribute application packages. If those packages follow XDG or not, I would say the distro does or doesn't.

yazaddaruvala · on May 6, 2016

My Cargo wish list!

1. Like every enterprise build system I've ever gotten to use, I wish Cargo would manage the compiler version as well.

2. The compiler forces me to mark `unsafe` functions/blocks to use unsafe code. Why does Cargo not force me to mark `unsafe` dependencies to use unsafe code?[0]

[0] Only dependencies that explicitly use `unsafe` should need to be marked in my Cargo.toml. eg. Iron doesn't use `unsafe`, but Mio does. Iron depends on Hyper depends on Mio. If I have a dependence on Iron, I really want to be able to mark `#[unsafe_deep_dep] mio`, or else my build fails.

steveklabnik · on May 6, 2016

At the fundamental level, the machine is unsafe. This means that if unsafe were transitive, all code would need to be marked unsafe. Use a String? Your crate needs to be marked unsafe.

yazaddaruvala · on May 6, 2016

We agree. I think there was a bit of misunderstanding. I don't think transitive unsafe is appropriate.

Only crates which use `unsafe` explicitly should need to be whitelisted.

Fun Example: I need to depend on Iron. Iron boasts in its README that it doesn't use `unsafe`. Cool.

However, Iron depends on Hyper. Hyper is complicated, it needs to use `unsafe`, I can understand that. Hyper is also used by a lot of people, I trust it, and whitelist Hyper. My build passes, I know only `std` and Hyper use `unsafe`.

Why is this helpful?

Hasn't happened yet but @reem, being human, adds an `unsafe`[0] and lets Iron's README go out of date. It happens. However, as a user, I don't like to trust documentation, and I don't like to trust other developers[1]. When I update to this version of Iron, I'd rather my builds now failed. Then I can decide if this escalation of responsibility is appropriate.

This is especially true in an enterprise context where teams of new hires do whatever it takes to push features, and "accidentally" push some really unsafe code to a deep dependency[2].

[0] Lets pretend this `unsafe` was actually used in production code: https://github.com/iron/iron/commit/ba4d197030067d4347134c2e...

[1] Hence my other preferences for type checking, core reviews, and tests.

[2] Not everything is manually catchable in code reviews. Sometimes its 3am. Sometimes whoever it was, was really new and the CR is 14 pages long and needs to be deployed Thursday! Sometimes a reviewer is just having an off-day. It happens.

yazaddaruvala · on May 6, 2016

Another really cool thing is:

Even slightly discouraging the use of unsafe via Cargo will push crate owners to find already trusted "miro-crates" which abstract their usage of unsafe. Imagine something akin to left-pad existed solely to abstract unsafe. Most crates already use it, so most teams already whitelist it. Using finding and using a micro-crate might mean less friction for users to add your dependency into their application.

Benefits:

1. Smaller the crate, smaller the surface area, better the audit-ability.

2. The more people using the same crate, the less probable there is a bug.

3. If a relatively small crate exists, solely for its abstraction of `unsafe`, and it is relatively popular, then that could be a good indicator for moving that logic into `std`.

curun1r · on May 6, 2016

> 1. Like every enterprise build system I've ever gotten to use, I wish Cargo would manage the compiler version as well.

With rustup.rs becoming official, it's a much better way to handle compiler versions since it handles cargo versions as well and allows the two to be coupled together. This allows for tighter integration between cargo and rustc as well as making testing of new versions of cargo significantly easier (i.e. there's no permutation matrix of whether cargo A.B.C works flawlessly with rustc X.Y.Z)

yazaddaruvala · on May 7, 2016

To me, as a user (rather than, as a dev), I don't really care about the "Single Responsibility Principle", and the difference between Rustup and Cargo seems artificial.

They are both binaries I need to install that help me manage my Rust application's dependencies. The only arguable difference is that Rustup manages "build dependencies" where Cargo manages "code dependencies". However, the tools I've used, managed both those concepts, and did so seamlessly (I wish I could share more).

Meanwhile, I don't believe Rustup.toml currently exists (I'd love to be corrected). Since other people like the difference between Cargo and Rustup, I'm happy to change my Wishlist 1.

1. I wish Rustup had a Rustup.toml file, such that my git repo could manage the currently used version of rustc. Primarily for consistency across teams, and seamless transition when I `cd ../<other_application> && cargo build`.

steveklabnik · on May 7, 2016

Rustup supports your wish with "rustup override".

SimonSapin · on May 5, 2016

It was a happy day when Servo ditched a big hairy pile of makefiles + git submodules and switched to Cargo :)

Gankro · on May 5, 2016

Happier than Rust 1.0 and the end of The Rustup?

cm3 · on May 5, 2016

End of The Rustup? AFAICS, it's still downloading a very specific servo-only rustc.

mbrubeck · on May 5, 2016

It downloads a regular Rust nightly build these days. It depends on the specific nightly date (because we use compiler plugins that link to unstable rustc internals), but at least we no longer have to build our own Servo-specific snapshots.

kibwen · on May 5, 2016

Servo's efforts to upgrade their version of rustc used to be legendarily traumatic, these days it's comparatively tame. :)

cm3 · on May 5, 2016

Right, and if the binary toolchains would be built with musl (not targeting musl) statically, we could have a rustc+cargo that works on both glibc and musl, avoiding two versions.

dbaupp · on May 6, 2016

Could you explain what you're referring to? It seems like a total non sequitur, since as far as I know, there has never been any pain (or even rumour of pain) in a Rust upgrade caused by what libc the compiler uses, but rather breakage in the unstable features that servo uses.

cm3 · on May 6, 2016

If the official rust+cargo was built statically with musl, then a single download would work on glibc and musl distros. Right now, we don't have a download that works on musl, and one has to bootstrap on a glibc system, targeting musl, which makes it impossible to bootstrap rust+cargo on a musl distro without glibc because bootstrapping requires a snapshot.

dbaupp · on May 7, 2016

Ah, it is a total non sequitur, unrelated to what servo encounters when changing Rust versions?

cm3 · on May 7, 2016

I don't understand what you're saying. I suggested that linux builds of rust+cargo ought to be done as musl-static (not for targeting musl) so that a single download works on both Ubuntu and Alpine. Now we have only glibc builds. Yes, it's not related to servo, but it is relevant to you mentioning glibc, though this is off-topic so let's stop arguing.

dbaupp · on May 7, 2016

I think you should go back and re-read the conversation, particularly kibwen's comment above and your reply: they seem to be totally unconnected. I was trying to understand how you thought they (Servo upgrading Rust versions and a musl-compiled rustc) were related, I was not arguing. There is in fact no mention of glibc until your comments, nor is there any implication in my comments that I don't understand what musl does nor do I disagree that it is useful (in fact, I used Rust's easy ability to link binaries against musl just a few days ago). This discussion is purely prompted by me trying to understand why you thought it made sense to talk about musl on a seemingly unrelated comment... maybe there was some insightful connection I missed, but it seems not.

cm3 · on May 7, 2016

You're right.

cm3 · on May 6, 2016

I really wish downvotes would require a comment so that I knew why the two posts in this thread branch went negative after being positive for a long time. I feel like people treat it like a personal spam filter although it's the same view for everybody. Odd.

conradev · on May 5, 2016

I appreciate that predictability is a big priority with Cargo.

I feel like Rust and Cargo are in a great position to deterministically (per compiler version, per set of build flags) build libraries. That would be an amazing step forward in security if it could be enforced. Does anyone know if this is planned?

https://reproducible-builds.org

steveklabnik · on May 5, 2016

In general, we are very interested in reproducible builds, and have fixed bugs where accidental non-determinism has crept in. However, it can be easy to reintroduce with build.rs, which allows for executing arbitrary Rust code before a build. Syntax extensions are another problem here. But the scope is much reduced, it's true.

conradev · on May 6, 2016

That's awesome to hear. Would it be possible to make an attribute like

#[deny(non_deterministic)]

which would error if `build.rs` or similar is present? Also, for usage of things like the `file!` macro, which might mess up determinism based on the build directory.

Theoretically, syntax extensions could be written to be deterministic as well (for a given version, of course).

Is there a tracking issue on this currently?

steveklabnik · on May 6, 2016

There isn't, it might be an interesting idea! Not allowing build.rs would eliminate a lot of crates, and not all build.rs' are nondeterministic... So that's tough.

conradev · on May 6, 2016

Yeah. Perhaps developers could attribute functions in `build.rs` specifically with "deterministic", so you could see what in the dependency graph isn't deterministic.

Unfortunately, unlike with borrowck, reproducibility is inherently hard to verify, and you'd have to do it manually with a VM or some different build environment.

This is interesting: https://github.com/rust-lang/rust/pull/33296

cm3 · on May 5, 2016

It's great that Raph's line-breaking code has been reused, but in contrast to Mozilla projects any contribution to it will now require assigning copyright to Google (like Ubuntu or Microsoft's CLA require as well). At least it's a tiny piece, so may not hurt too much in terms of prevented contributions.

gpm · on May 5, 2016

Or just forking it, it's released under a open source license.

This wouldn't be the first time Servo has forked a dependency, for example they maintain a fork of glutin (library that handles opening windows) because they want certain features that upstream doesn't.

https://github.com/servo/glutin

cm3 · on May 6, 2016

That can work.

Curious what features servo/glutin has that are exclusive?

Perseids · on May 6, 2016

Something which is missing entirely in this concept (or, hopefully, only in the blog entry), are security updates. For example, let's assume I use an HTTP library whose TLS dependency has an error where it doesn't properly match the domain name in the certificate to the hostname of the server I'm connecting to. My Cargo.lock file references a TLS library of version 1.7.1, but the security patch is applied on 1.7.6 which has some breaking changes. Thus, the security update cannot happen silently (which would violate the predictability property of cargo anyway). Instead, we need a command to selectively update only packages with security updates and respectively we need a way to mark version bumps as security critical.

shadowmint · on May 6, 2016

    Libraries can be compiled as static libraries or dynamic libraries.
    even static libraries might want to do some dynamic linking (for 
    example, against the system version of openssl).

haha... I like cargo a lot, but I feel this is glossing over the biggest issue building rust code has: Building and linking to native libraries.

    Similarly, applications are often built for different architectures,
    operating systems, or even operating system version...

    Compiling winapi v0.2.6 
    Compiling libc v0.2.10

Oh? I'd love to see that compiling on a different operating system.

Linking against the system version of openssl? Are you sure it's the right version of openssl not to break your binding?

The C ecosystem is fundamentally problematic to get repeatable cross platform builds with; and trying to get repeatable cross platform builds with rust when it touches any c / c++ library is quite a pain.

Piston is a good example of how this is done... but people still struggle to compile it; the 'uses the compiler to explicitly build the dependency as a static library and links it' (eg. git binding afaik) is a much better solution, but it massively increases build times.

It's a rough edge point; and I still feel like there's been no real progress towards solving the issues.

Everyone just does something completely different in their native build (build.rs) step, and then your 'repeatable build' is largely, hit 'cargo build' and 'hope nothing goes wrong and all dependencies are installed'.

There are other more important priorities for rust at the moment (like recovering from panics...), but this is certainly an area I'd love to see improvement.

dbaupp · on May 6, 2016

> Oh? I'd love to see that compiling on a different operating system.

You'll note from the file paths in the blog post ("file:///Users/ykatz/...") that the builds are actually being run on a OSX machine: winapi is careful to not break builds when it is irrelevant.

It is possible and even necessary to write code that depends on operating system details (e.g. to implement the high-level cross-platform code), and indeed that code may not compile on different platforms. The point is that rustc and cargo together provide the tools to make it possible and easy to write code that builds and runs everywhere everywhere, they don't/can't/shouldn't guarantee it.

eyko · on May 6, 2016

As far as I can tell, the article isn't claiming that cross compilation in rust is easy, and it's definitely not saying that you can easily dynamically link and cross compile simultaneously. I develop on OSX but deploy to Linux, and that has always been an issue for me in Rust, compared to Go for example. The easiest approach for me so far has been building on the target platform with vagrant/virtualbox and calling it a day.

cm3 · on May 5, 2016

I see there is `cargo install` which will fetch and build an executable from crates.io, is there no `cargo update` for `install`?

steveklabnik · on May 5, 2016

https://github.com/rust-lang/cargo/pull/2405 landed, but https://github.com/rust-lang/cargo/issues/2082 is similar too and is still open.

cm3 · on May 5, 2016

Interesting, but I was thinking more of something like "update installed binaries with latest version from index". This doesn't look like it.

steveklabnik · on May 5, 2016

Ah yeah, it's per-thing, not every thing you've installed, currently. cargo install --list does list all the things you've installed, so it wouldn't be too big of a deal to add; the pieces are all there.

Siecje · on May 5, 2016

How does Cargo manage the case when dependencies X and Y both require a different version of dependency Z?

steveklabnik · on May 5, 2016

It attempts to unify them if it can, but if they cannot be unified, it will include both versions.

cm3 · on May 5, 2016

Include both how? Make renamed copies to have two and modify the code to use the new names?

steveklabnik · on May 5, 2016

It doesn't have to do any renaming or modifying of names. Rust has a module system, so each dependency will use the version they need.

The only time it doesn't Just Work is if one of those crates re-exports a type from the sub-crate in a public manner, and then you try to call something from the other crate with a value of that type. You'll get a compile-time error about mismatched types.

cm3 · on May 5, 2016

But it sounds like that's what's going on. Using foo-1.1 in bar.rs and foo-2.2 in rabbit.rs.

steveklabnik · on May 5, 2016

Sure. That will Just Work in every case _except_ if bar and rabbit both re-export a type from foo, and you try to use them together. If they don't re-export anything, then it all works just fine. No renaming or transformations needed.

kibwen · on May 5, 2016

Rust mangles symbols with version information, so it's no trouble to have two versions of the same library in the same binary if necessary.

cetra3 · on May 6, 2016

This article purports that Servo relies solely on Cargo for the build step.

If you look at the build step from github it looks like it's relying on mach also:

https://github.com/servo/servo#building

Shouldn't it instead be using cargo purely?

SimonSapin · on May 6, 2016

Mach is what you interact with when doing stuff in Servo, but all the build-* sub-commands end up calling Cargo, which is what does the heavy lifting.

The typical `./mach build` command will download known-good versions of Rust Nightly and Cargo (unless that’s already done), call `cargo build` with some flags, then show a system notification when it’s done. That’s it.

steveklabnik · on May 6, 2016

Servo relies on a specific version of Rust, and Mach helps with that.

Cargo isn't a general build system, it builds Rust code. I use Make to drive Cargo for my OS project, because I use nasm to build the asm and Cargo to build the Rust, for example.

indatawetrust · on May 6, 2016

the subject will be on the outside but what are examples of applications developed with rust?

steveklabnik · on May 6, 2016

There's a lot of OSS stuff as well, (well, "a lot" for how old Rust is) but we just recently added https://www.rust-lang.org/friends.html to the website to showcase production users.

dschiptsov · on May 6, 2016

So, dependencies in, say, cabal or go or CL are unpredictable?

The trend to put meaningless but catchy adjectives, like "safe" and "predictable", as if everything else is opposite, is misleading.

Designers of classic languages were not some arrogant punks (they were brilliant, like David Moon and other guys of his generation), and the most of findamental problems were well understood in times of Standard ML or Common Lisp.

So, in comparison to C or C++ it might be "predictable" and even more "safe", but in comparison with well-researched classic languages, these mere redundant adjectives.

dbaupp · on May 6, 2016

By all reports ("cabal hell"), Cabal is unpredictable, and Go's dependency management problems prompted countless external tools until the recent vendoring work: with the default `go get`, there is no guarantee that a build will be the same from day to day, as the best one can do is point at a given branch of a repository (a moving target), not tags or commits.

(I have no idea on CL's dependency management situation, maybe it is even better than Rust's, in which case it would be amazing.)

f2f · on May 6, 2016

minor factual adjustment: "go get" doesn't update a package once it's been downloaded. you need to force it.

"there is no guarantee that a build will be the same from day to day" is simply not true.

dbaupp · on May 6, 2016

Oh, indeed. I must've got mixed up: it doesn't guarantee that builds will the same from machine to machine (unless you run go get at the exact same instant).

dschiptsov · on May 6, 2016

So, in Golang there is no concept of package version?

For them different location (to import from) for an incompatible version is good-enough.

f2f · on May 6, 2016

go get is just a mechanism, not the tool. there are other tools build on top of go get that ensure proper versioning.

anp · on May 6, 2016

Rust isn't targeting all of the same use areas as the other languages you mentioned (and FWIW I've heard not great things about go dependency mgt). Many, if not most, of the rust materials in the wild are pitching it as an alternative to C and C++, because that's where the language has the clearest competitive advantage.