Hacker News
NPM Package Hijacking From the Hijacker's Perspective (medium.com/nm_johnson)
158 points by chatmasta on March 25, 2016 | 37 comments



Security in a complex interdependent software ecosystem is untenable.

Period.

NodeJS/npm is in no way unique here. Apt, yum, gem, Homebrew, Docker hub, etc. are just as bad if not worse, not to mention "git clone" followed by "./configure" or "make." Any time you bring down code onto your machine and execute it you are... well... bringing down code onto your machine and executing it.

I've been increasingly thinking that this could be a very fertile area for AI research. Security is really an AI hard problem. There is no combination of sandboxing, permissions, auditing, formulaic static code analysis, firewall hacks, etc. that will yield a system that is both (a) usable/convenient and (b) secure.

Take 'npm' for instance. It's amazingly convenient but it (and all other packagers like it) is a security nightmare. Operating securely would require one to set up a shadow mirror of the entire NodeJS ecosystem and then have someone ($$$$$) manually audit every single thing in there that you are going to use and every single change that comes down from above. That's not tenable for anyone but the most lavishly funded organizations, and anything like that is universally reviled by developers since it slows them down. I've worked in environments like that before (government), and we had people quit because it was just "impossible to do my work."


The real problem is that things you run have full access to anything that you as a user have. In a more ideal security scenario, one that I've been pondering for many years, every single program has its own sandbox to play in, and can't see or affect your other user data. I'm sure that mobile devices are doing very similar things.

An "Open File" dialog box would let the user see exactly what they're picking, but just return an opaque & reusable handle to the program, meaning there's no change to the user experience. There would be more permission requests to the user in other circumstances, but that's the result of distrusting code that's running on your box.
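A minimal sketch of that handle idea, assuming a hypothetical trusted `FileBroker` that stands in for the OS dialog (the class and method names are made up for illustration). Inside a single Node process this does not actually confine anything; it only shows the shape of the API, where the program holds a random handle and never the path:

```javascript
const fs = require('node:fs');
const crypto = require('node:crypto');

// Hypothetical trusted broker: stands in for the OS "Open File" dialog.
class FileBroker {
  constructor() {
    // opaque handle -> real path; the path is never handed to the program
    this.grants = new Map();
  }

  // Called by the trusted dialog after the user picks a file.
  grant(pathChosenByUser) {
    const handle = crypto.randomBytes(16).toString('hex');
    this.grants.set(handle, pathChosenByUser);
    return handle;
  }

  // Called by the untrusted program, which holds only the handle.
  read(handle) {
    const realPath = this.grants.get(handle);
    if (realPath === undefined) {
      throw new Error('no such grant');
    }
    return fs.readFileSync(realPath, 'utf8');
  }
}
```

The handle is reusable, so the program can reopen the same file later without another prompt, but guessing a path gets it nowhere because paths are not valid handles.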

I wouldn't expect an AI to try to preempt all the problems of a black-list style system, but rather a white-list as above would be far more manageable.


Isn't that basically what OS X app sandboxing is? Well, the results of Open File aren't an opaque blob, but the app doesn't have permission to access files outside its sandbox until the user uses the Open dialog, which grants the app access to the file in question.


This is basically how mobile apps have ended up, and it works relatively well

Maybe it's not too late to do the same on desktop!


> "git clone" followed by "./configure" or "make."

Actually, you don't even need to follow it with ./configure or make. An ext:: git URL to clone from will execute arbitrary code:

http://www.vuxml.org/freebsd/7f645ee5-7681-11e5-8519-005056a...


This is a recurring theme within the Unix world — amongst others, vulnerable programs have included tar(1), xterm(1), and vim(1), and they always stem from the eagerness of the authors to provide both the ability to run shell commands, and the ability to run commands from untrusted sources; the fix is to limit the extended ability to trusted sources. But when reported, it (1) gets fixed quickly, and (2) leads to an uproar within the community. Here it has been a wontfix: works as expected, and will likely stay that way.


> There is no combination of sandboxing, permissions, auditing, formulaic static code analysis, firewall hacks, etc. that will yield a system that is both (a) usable/convenient and (b) secure.

https://en.wikipedia.org/wiki/Object-capability_model contradicts this, or anyway offers an alternative to the pile of hacks. It's true that none of the systems listed there are familiar; it's very hard to supplant the basic assumptions we've built on and built on since the 1970s.


Seems like a complexity explosion.

Un-forgeable references can be created using crypto. In a way things like Bitcoin addresses or cryptographically authenticated network endpoint addresses might qualify.
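One way to picture a crypto-backed reference is a token carrying an HMAC over the resource name. This is only a sketch under assumptions: `mintCapability` and `checkCapability` are hypothetical names, and the secret is generated in-process purely for illustration. Anyone holding a token can use or pass on the reference, but cannot mint one for a different resource without the secret:

```javascript
const crypto = require('node:crypto');

// Secret known only to the issuing service (generated here for illustration).
const secret = crypto.randomBytes(32);

// Mint a token naming a resource; the MAC makes it infeasible to forge.
function mintCapability(resource) {
  const mac = crypto.createHmac('sha256', secret).update(resource).digest('hex');
  return `${resource}.${mac}`;
}

// Return the resource name if the token is authentic, otherwise null.
function checkCapability(token) {
  const dot = token.lastIndexOf('.');
  if (dot < 0) return null;
  const resource = token.slice(0, dot);
  const given = Buffer.from(token.slice(dot + 1), 'hex');
  const expected = crypto.createHmac('sha256', secret).update(resource).digest();
  // timingSafeEqual requires equal lengths, so reject mismatches first.
  if (given.length !== expected.length) return null;
  return crypto.timingSafeEqual(given, expected) ? resource : null;
}
```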


Those sorts of references support the capability model, yes, and they're the best you can do in a distributed system. They don't fit the object capability model because an overtly confined process can smuggle a reference out to a confederate via covert channels -- timing, power, etc.

So why did I bring up the less general model? Because the parent comment claimed usable security is impossible, even at the local level. If local machines have no security, the distributed case isn't going to be any better.

The complexity explosion I'm seeing is in the bail-and-patch approach.


I don't think a super detailed object permission model would be less complex than bail and patch. Bail and patch is after all just the ad hoc input and adjustment of rules. It's an AI hard problem because the complexity just is and you have to deal with it.


See "Admonition and designation" in this short paper "Aligning security and usability": http://people.cs.vt.edu/~kafura/cs6204/Readings/Usability/Al...

Security isn't a separable concern; the epicycles grow out of trying to treat it as separable -- you have your program, and then you have your rules restricting the program. ('By admonition' in the paper.)

Here's a modern example of the alternative: https://sandstorm.io/how-it-works#powerbox "Notice how in this example, the application never gains the ability to send spam. And yet, the user experience is no worse and arguably better than before. The user is never prompted with any sort of security questions, yet the app is only able to email them with their consent."

P.S. if my remarks came across as combative, I didn't mean them to. I'm just offering links about how some of us think we're not stuck with the current untenable situation -- life can get better.


Seems like you should just accept the risks and isolate your systems as much as possible, i.e. depend only on ephemeral data and restart frequently.


apt and yum are emphatically better since at least they verify their sources and maintainers and sign packages, and it doesn't really require much compared to the effort put into actually writing all that packaged software. These scripting language packaging systems have none of that.


I'm not a JS dev, but would it be possible for a malicious actor to create a pre/post install script that looks for packages published by the current user and "worms" its way into those pre/post install scripts?

If this were possible, something like this could have the potential to infect quite a few npm modules.


The post install script can be like any other script the user can run. There's no sandboxing so it can access anything the running user can access. So essentially, yes, it could infect any other module in there. I believe it could even run an "npm publish", which is pretty scary.
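For anyone who wants to see which packages would run something at install time, here is a rough sketch. It assumes you already have a parsed package.json object in hand; `installHooks` is a made-up helper name, not an npm API:

```javascript
// Lifecycle script names that npm runs automatically during `npm install`.
const INSTALL_HOOKS = ['preinstall', 'install', 'postinstall'];

// Given a parsed package.json object, return the install-time hooks it declares.
function installHooks(pkg) {
  const scripts = pkg.scripts || {};
  return INSTALL_HOOKS
    .filter((name) => typeof scripts[name] === 'string')
    .map((name) => ({ name, command: scripts[name] }));
}
```

Running it over each package.json under node_modules gives a list of commands worth reading before trusting. Installing with `npm install --ignore-scripts` skips these hooks altogether, at the cost of breaking packages that genuinely need a build step.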


Even worse, it could `scp` the `~/.ssh/id_rsa` private key and known hosts file. The attack surface just grew tenfold.


The recent NPM situation was a real eye opener for me. I never gave much thought to the attack vectors involving package managers.

> The post install script can be like any other script the user can run. There's no sandboxing so it can access anything the running user can access.

Wow. This just seems wrong that the script has such far reaching privileges.


Considering node.js is able to spawn subshells and execute whatever code it wants when running, install scripts pose no additional threat to just running the javascript.


The changes a package install makes ought to be limited to the source files within its subdirectory (and perhaps some precompiled binaries). I don't quite expect it to be installing rootkits, and the principle of least privilege dictates that it should not be allowed to.
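The check such a restriction implies could look roughly like this. It is a sketch only: `isInside` is a hypothetical helper, npm enforces nothing of the kind, and a real implementation would also have to resolve symlinks:

```javascript
const path = require('node:path');

// True if `target` (absolute, or relative to packageDir) stays within packageDir.
function isInside(packageDir, target) {
  const root = path.resolve(packageDir);
  const resolved = path.resolve(root, target);
  const rel = path.relative(root, resolved);
  if (rel === '') return true; // the directory itself
  // Escapes show up as a leading ".." segment (or an absolute path on Windows).
  const escapes = rel === '..' || rel.startsWith('..' + path.sep) || path.isAbsolute(rel);
  return !escapes;
}
```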

Once the package is installed, it is already too late for a code review, or any mitigation. A well-written worm will never be detected.

It is unexpectedly bad design (or, in case of the JavaScript community, an expectedly bad design).


I've not tried this, but a similar and easier exploit would be to register packages with names close to existing popular package names, and then have a post install script inject a modified version of the actual package into the node_modules directory that also does something malicious. So you register the "lodasj" package instead of "lodash" and then create a post install script to inject the malicious "lodash" package that re-exports all of the lodash API and does something nasty. If someone makes a typo with "npm i lodasj" and doesn't notice the mistake, the machine that installs it and anyone who depends on the package is infected. I wonder how well policed NPM is against these kinds of malware attacks.
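On the defensive side, one cheap heuristic is to flag any requested name that sits within a small edit distance of a well-known package. A sketch, where `editDistance` is plain Levenshtein distance and `lookalikes` and the list of popular names are assumptions for illustration (I don't know whether the registry does anything like this):

```javascript
// Levenshtein distance between two strings, keeping one row at a time.
function editDistance(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution or match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Popular names that `name` could be mistaken for (excluding exact matches).
function lookalikes(name, popularNames, maxDistance = 1) {
  return popularNames.filter(
    (p) => p !== name && editDistance(name, p) <= maxDistance
  );
}
```

With a popular list containing "lodash", asking for "lodasj" would be flagged, since the two differ by a single substitution.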


This one isn't doing anything scary at the moment, but shows the potential:

https://www.npmjs.com/package/uglifyjs

It's getting 30k+ downloads a month.

The actual package people are looking for is here: https://www.npmjs.com/package/uglify-js


They'd notice as soon as they tried to run their program, since "import * from 'lodash'" would fail (unless they were consistent with their typos...)


You can inject a version of whatever you want into the node_modules folder in the post install script though can't you? So you just copy your own malicious lodash into node_modules. I haven't tried it but I think it'd work.


what if it ran "npm install -g lodash"?


Scripts of dependencies don't run. Period. You'd have to fork someone else's repo and run npm install on that fork. If you're doing that, you ostensibly trust their code enough to inspect and work on it. If not, wtf are you doing?

If you run `npm install` on a project, you're simply installing its dependencies (and actually running any pre-publish hooks too, for some stupid reason).


This idea appears to be the subject of https://news.ycombinator.com/item?id=11364347 .


Yes, it basically means this, and it is not about npm in itself; it could just as well be a gem, a pod, a jar... Basically what he highlights is that these days there is a lot of trust in open source repositories towards non-certified/unverified contributors. Others have already found secure ways, such as the Linux kernel developers. I, not a developer on their level, working in the web world, put a lot of trust in the other person, with almost no idea what/who he is, mostly from a basic read of his username/GitHub profile. And the npm architecture, the "ultimate" code reuse repository, thanks to the "there's an npm for sum(a,b)" approach that they so madly promote, is extremely sensitive to exactly these types of maliciousness. Maybe npm will start to introduce policies, boards, advisers, commissions, and really start to act professionally as it should. Maybe this will work, but only up to a point; in the end, it is the dev's problem to use the proper shit and follow some basic "wash your hands before you eat" type of rules:

- Use as few dependencies as possible. It will bite you later, it has been proven, don't head down this road, fuck it!
- Ensure a back-up for that stuff: use an FTP account if you are still in that era, put them on S3 or even Dropbox. These are modern times, you have the means.
- When you finally decide on one, check its usage rate, its latest commit date, its main contributors, its image. Yes, be shallow, look at the clothes, it must look good to the eye too. Ensure at least that the dev tried his best in his available time there. Get involved, read the number of users, the magic stars, the votes, the ratings, the songs, the novels, the books. Have a problem that is not in the GitHub issues, or need a functionality change? Well, don't just beg for it by opening a new bloody issue: fork the stuff, fork it hard, add your new magic trick, and then create a PR so you can shine and let others pick it up from where you left off. Be wise, improve yourself, be polite, be nice. Look at the dependency's dependencies too; if they are shady, fuck it, roll your own!
- Most programmers these days, especially web ones, tend to rush to a solution with their new agile/kanban/shalmban methodologies, their golden paths to success and ideologies, without taking time to do some housekeeping. I don't blame them either; sometimes there really isn't any time at all because the Scrum Master 5000 expects deliveries, solutions, not problems. I am a human, I live, I eat, I sleep, I do not have time for your shit. If you want it that fast and don't accept my timeline after I've presented it to you 100 times in every daily stand-up shit, then fuck it: no housekeeping, I go shady, I start a dance with the wolves!
- There is something rotten in Denmark with this "there's a start-up for every shit" mentality from which everything flows.

Enjoy life!



One thing that bothers me about the Node community is how the first person to implement a protocol/API/wrapper tends to name their package something like 'protocol' or 'protocol-js' instead of something like 'a-clever-differentiable-brand-name' as is common in other software communities. That makes it hard to distinguish an unmaintained package with tons of bugs from a more polished version sponsored by a big-name contributor, and makes it easy - even encourages - consumers to place unwarranted trust in packages with legitimate-sounding names. As an example, consider if a new browser on Linux was packaged as 'browser-linux' in the system repositories-- how many downloads would it get?


Why are modules allowed to be unpublished?


A few reasons why a module might be unpublished:

1) Legal -- copyright violation in the code itself, patent infringement, full text of a novel somehow found its way into the docs, etc.

2) Bugs or security holes, particularly if a bug is being actively exploited in the wild.

Hopefully, neither of these comes up often, but when they do, you need for the mechanism to be available.


This is exactly what has made me extremely nervous about NPM. There is no security oversight at all and modules running on my dev machine with my full credentials could be uploading anything - Looks like I'll have to move node.js development to a sterile VM.


In the Python world, you can self-host a copy of PyPI [1] (our package repo). Though you might do it for a faster build process, this can also prevent the "disappearing public repo" problem [2].

Is there something analogous for npm?

1: https://pypi.python.org/pypi/pypiserver

2: http://schinckel.net/2013/04/15/my-own-private-pypi/



Why can't we have a packages.lock like composer has? At least we can then decide if we want to install updates and keep all environments in sync.


npm shrinkwrap


> Caveats

> If you wish to lock down the specific bytes included in a package, for example to have 100% confidence in being able to reproduce a deployment or build, then you ought to check your dependencies into source control, or pursue some other mechanism that can verify contents rather than versions.

https://docs.npmjs.com/cli/shrinkwrap
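"Verify contents rather than versions" can be as simple as pinning a digest of the package bytes and refusing anything that doesn't match. A minimal sketch, assuming you have the tarball bytes in a Buffer; `sha256Hex` and `verifyContents` are hypothetical helper names, not part of npm:

```javascript
const crypto = require('node:crypto');

// Hex-encoded SHA-256 digest of a Buffer or string.
function sha256Hex(bytes) {
  return crypto.createHash('sha256').update(bytes).digest('hex');
}

// True only if the bytes hash to the digest recorded when they were reviewed.
function verifyContents(bytes, pinnedHex) {
  return sha256Hex(bytes) === pinnedHex;
}
```

The pinned digests would live in source control next to the shrinkwrap file, so a republished tarball under the same version number fails the check instead of silently installing.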



