NPM Vulnerability Discussion on Twitter (solipsys.co.uk)
140 points by ColinWright on May 10, 2022 | 202 comments



Yes, I know, some of you will hate the layout with a passion. But this kind of diagram is the only way I can make sense of the back'n'forth that sometimes happens.

If you prefer Twitter's rendering then just click on a node and that tweet will open for you.

Edited to fix a typo ... thank you Jonn. Oh how I hate auto-corrupt.


It would greatly help if there were some text in the upper left saying that there is a large diagram on the page but you may have to zoom out or scroll to see it.

All that shows up on an iPad at normal zoom is a white page. I waited quite a while, thinking maybe it was just generating something client-side and that was taking a while, before I noticed that there were scroll bars (both vertical and horizontal) on the page and that their thumb size indicated I was only seeing a tiny fraction of it.

On my desktop it depends on what I was doing previously. About half the time my browser window won't be big enough for anything to show up in the initial view.


I agree completely, but I've tried many times to figure out how to do that, and I've had no luck. I don't have any front-end skillz. If you can tell me how to do it, I'd be grateful.

Thanks.


I don't know how you would put some text in an SVG, but if you are OK with linking to an HTML page instead of directly to an SVG, maybe just make an HTML page that has a line of text saying that one might need to zoom out or scroll to see the large image below, a line break or paragraph break, and then an image tag referring to the SVG?
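Roughly this shape, maybe (untested, and "thread.svg" is just a placeholder filename):

  <!DOCTYPE html>
  <html>
    <body>
      <!-- The note comes first, so a viewer who sees "nothing" knows to scroll or zoom. -->
      <p>There is a large diagram below; you may need to zoom out or scroll to see it.</p>
      <img src="thread.svg" alt="Diagram of the Twitter discussion">
    </body>
  </html>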


Working on it ... I have a couple of options, neither perfect, both require some time, but I may be able to do something.

Thanks.


If you can run a post-processing step on the SVG file before putting it on the server, the following might work. I took a look at the SVG, and although I don't know anything about SVG other than that it is an XML format, it looks pretty easy to add some text at a given location.

The white background seems to come from this:

  <polygon fill="white" stroke="transparent" points="-4,4 -4,-11651.5 3898,-11651.5 3898,4 -4,4"/>
We just need to find that in the SVG, look at the points defining the rectangle, find the left side and the top side from that list of points, then add some text. The left side is the minimum first coordinate in the set of points and the top is the minimum second coordinate in the set of points. That's -4 and -11651.5 in this case.

Then we just need to add text there:

  <text text-anchor="start" x="-4" y="-11637.5" font-family="Times,serif" font-size="14.00">Hello, World!</text>
Using the left side for x seems fine. For y, using the top results in clipped text, because apparently the y for text is the baseline. I don't know how font sizes work in SVG, but it looks like moving the baseline down by font-size works nicely.

Here's a little Perl script that can be used as a filter that takes your SVG and adds that text line, finding the top and left by looking at the points list in the first polygon line that has fill set to "white" and stroke set to "transparent".

    #!/usr/bin/env perl
    use strict;
    use warnings;

    my ($left, $top);
    while (<>) {
        print;
        # The first white/transparent polygon is the background rectangle.
        if (m{<polygon.*fill="white"} && m{stroke="transparent"} && m{points="(.*?)"}) {
            # Find the minimum x (left edge) and minimum y (top edge).
            foreach (split /\s+/, $1) {
                my ($x, $y) = split /,/;
                $left = $x if !defined $left || $x < $left;
                $top  = $y if !defined $top  || $y < $top;
            }
            $top += 14;    # drop the baseline by font-size so the text isn't clipped
            print qq{<text text-anchor="start" x="$left" y="$top" font-family="Times,serif" font-size="14.00">Hello, World!</text>\n};
            last;
        }
    }
    print while (<>);    # pass the rest of the file through unchanged


Yeah, I've done the post-processing SVG version and it works fine, until the chart itself does have a node near the top left, and then things overlap.

So I need to test whether they will overlap, and suddenly it's a bag-o-nails. Or at the least, I need to test if something is up there and not bother including the extra text, but that's ... inelegant.

The option of having an HTML template and including the SVG will work, and that's the one I'm probably going to go with. It needs tweaking.


Another way to save the post-processing approach that would probably be easier than testing for something already up there would be to lower the top coordinate of the polygon of the white background by subtracting font-size. That would give you some new space on top for the text that should be free of anything else.

(Probably need slightly more than font-size, in case parts of some characters descend below the baseline).
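An untested sketch of that variant, reusing the structure of the filter above:

    #!/usr/bin/env perl
    # Untested sketch: same filter as above, but stretch the white
    # background polygon upward so the caption gets guaranteed-empty space.
    # (A real version would also need to grow the <svg> height/viewBox.)
    use strict;
    use warnings;

    my $pad = 18;    # font-size (14) plus some room for descenders

    while (<>) {
        if (m{<polygon.*fill="white"} && m{stroke="transparent"} && m{points="(.*?)"}) {
            my ($left, $top);
            foreach (split /\s+/, $1) {
                my ($x, $y) = split /,/;
                $left = $x if !defined $left || $x < $left;
                $top  = $y if !defined $top  || $y < $top;
            }
            my $new_top = $top - $pad;
            s/\Q$top\E/$new_top/g;    # move the old top edge up
            print;
            # With the baseline at the old top, the text sits inside the new strip.
            print qq{<text text-anchor="start" x="$left" y="$top" font-family="Times,serif" font-size="14.00">Zoom out or scroll to see the diagram</text>\n};
            last;
        }
        print;
    }
    print while (<>);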


Something to experiment with ... I hope to have time in a few days.

Thanks.


I find the diagram makes it very easy to quickly understand everything. Well done! Haters gonna hate no matter what you do.


I enjoy this layout and I only wish it had some sort of check for background color contrast accessibility.


Yeah, the colours are a complete hack. There are so many things this needs to make it more usable, but I have neither the time nor the skillz.


Actually prefer this to Twitter. Easier to read.


It is until the discussion gets really big. When there are several hundred tweets there's no good way to lay out the graph. Then you need folding, but even then, folding in the 2D environment gives you a much better sense of the discussion.


To be fair, when there are several hundred tweets involved it's even more difficult to follow on Twitter. Your format is better.


I don't suppose the source that generated it is available? :)


Not yet ... working on a few issues.


I like it. The spatial element gives some pointers towards order in time too. I think ordering by importance (most quoted, replied) takes precedence.


This old dog is reminded of trn(1), "threaded read news".

See top right: https://upload.wikimedia.org/wikipedia/commons/d/d8/Trn_cons...


That does bring back memories... and would be a nice way to visualize Twitter.


What do you think trn/slashdot/HN-style threading is lacking?


Any kind of sense of where I am in a discussion.

If you find the trn/slashdot/HN-style threading better than the 2D graphical layout then I doubt I can explain to you why I find it (the threading) lacking.


Your algorithm to embed a graph in 2D is intriguing: https://i.imgur.com/7xhl7VY.png

What kind of curve is this?


It's just using GraphViz.


I love this visualisation. Is the script to generate it publicly available?


Currently not, there are significant problems that I'm trying to solve before making it available. No, I'm not trying to make it perfect, but I'd like it to be actually usable before opening the code.


Fixing security issues is nice; how about we go a step further and learn to write a one- or two-liner instead of inundating every project with 400,000 indirect dependencies?


I'm not sure how to avoid this sometimes.

I have a Rust project I've been working on.

It has dependencies for:

axum: the web framework in use

serde, serde_json: Serialization of API responses

sqlx: To talk to sqlite

time: Standard datetime format with localisation and serde support

tera: templating

tracing: async logging

uuid: UUID generation

Of these, I feel they're reasonable dependencies. You could maybe quibble about uuid (which itself only depends on rng + serde with the features I've enabled), or tracing maybe, but they provide clear value.

Anyway, those direct dependencies expand out to 300-odd transitive dependencies.
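(If you want to count yours, something like this should be close, assuming a reasonably recent cargo:)

    $ cargo tree --prefix none | sort -u | wc -l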

So what do I do? Do I write a templating engine, HTTP server implementation, SQLite driver and JSON library from scratch to build my little ebook manager?


300 is nothing compared to the typical Node project. It's typical for a basic React project to have over 14,000 dependencies.


Can you please source this? A react project can be so many things...do you mean create react app? A custom react app?


create-react-app starts with 1506 deps:

    $ npx create-react-app my-app
    $ find my-app/node_modules -name package.json | wc -l
        1506


90% of which are development tools - eslint, testing, typescript, webpack, etc.

The actual runtime dependencies of a react app are basically just react and react-dom.


Are dependencies that run on your development machine any less of a maintenance or security concern?


No, but the number being quoted is the sum of two different security concerns - and it’s attributing the concern to ‘react apps’, when actually react itself is pretty clean in terms of dependencies.


Yes, because they aren't running in prod.


300-odd seems reasonable (or at least unavoidable, given modern realities). JS projects typically have many more, and additionally, you don't even get deduplication of modules, so you may have three different versions of the same module on disk.


I don't think you can reason about quality in a modern JavaScript project. Keeping a close eye on thousands or tens of thousands of modules isn't practical.

If software engineers were more like "real" engineers (i.e. PEs), they could never sign off on a project built like this.


Browsers recently implemented uuid natively, not sure if that’s a js spec change but do check :)
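Presumably that's crypto.randomUUID(), which recent browsers ship (and, as far as I know, Node.js from 14.17 as well):

  // No dependency needed for a v4 UUID where crypto.randomUUID() exists:
  const id = crypto.randomUUID();
  console.log(id); // e.g. "36b8f84d-df4e-4d49-b662-bcde71a8764f"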


Doesn't help in this case, since it's server side code.


Hex/base64 encoding of some bytes from system random isn't sufficient? It needs to be a UUID-parseable format specifically?

Aside from that, I guess your other dependencies have the same problem. It's not enough for one person to be mindful if they need something fully reasonable, like a web server library, which then depends on a million packages to build from source. Often-used dependencies could:

- work like a regular package system (e.g. Debian's) and distribute (reproducible) binaries unless explicitly asked to build from source,

- only pull optional dependencies when they're used (a web server might depend on a logrotate dependency, but maybe you don't use on-disk logs at all),

- and/or be more selective in what they depend on.

None of these are quick and easy to do without downsides.


I think these packages are just holdovers from before ES6. I can see it being reasonable to pull in a package like this back before the functionality was implemented in native JS. Now they're only used by:

* Legacy projects that pulled them in way back when and never migrated to the native APIs

* Inexperienced developers not knowing this is part of standard JS now.

You can't really do much about the former. For the latter, though, NPM would be wise to plaster a link to MDN's docs on some of these packages.


But "isEven"???

That's one line of code in your utils module...


In TypeScript? Sure. In vanilla JS? It's more complicated. How do you handle someone inputting undefined/null/a string/a random object/NaN/etc.?

For isEven, you first need isNumber. And that’s where the complexity lies.

    ({}) % 2 == 0 // false
    ([]) % 2 == 0 // true
    "" % 2 == 0 // true
    "a" % 2 == 0 // false
    "1" % 2 == 0 // false
    "2" % 2 == 0 // true
    undefined % 2 == 0 // false
    null % 2 == 0 // true


> How do you handle someone inputting undefined/null/a string/a random object/NaN/etc?

Your isEven function should check that it's operating on a number. E.g. "const isEven = (i) => typeof i === 'number' && i % 2 === 0;" If it's not a number then calling isEven on it doesn't make sense. If the user wants to check if a string is even then they can convert it before making the call.


Now someone is going to define const isOdd = (i) => !isEven(i); and you’ll get trouble.

A proper solution would be along the lines of:

    const isRealNumber = (n) => {
      if (typeof n !== 'number') return false;
      if (isNaN(n)) return false;
      if (n === +Inf || n === -Inf) return false;
      return true;
    };

    const isEven = (n) => {
      if (!isNumber(n)) throw new Error("Numerical error: Invalid input");
      return n % 2 === 0;
    }

    const isOdd = (n) => {
      if (!isNumber(n)) throw new Error("Numerical error: Invalid input");
      return n % 2 !== 0;
    }
This is the additional complexity created by dynamically weakly typed languages. And that’s why we get all these BS tiny npm packages.


The irony of using isEven as an example of what you should just write yourself, and then providing an example that returns a truthy value for isEven(3).


No idea what you mean. ;)


You don't handle it. Garbage in, garbage out.


Or garbage in => throw an exception and crash the app. This stuff should be fixed as soon as possible instead of invalid data slowly spreading into every layer of the app.


Assuming you're importing it as a black box abstraction, then absolutely.

'isEven' really shouldn't be a dependency though, either through npm or internally. 'x % 2 === 0' works fine for integers and can be inlined, or if you're using it a bunch and want slightly cleaner code, you can define it as a lambda alongside the code that uses it. Then everyone can see exactly what it's doing with the full context.

The real issue is overabstraction. Even if you do everything "right" and have it as an internal dependency with all the correct error handling, it still makes the code a pain in the ass to read.


No.

If the goal of isEven is to handle user input, it is woefully misnamed.


Right, it should be "isUserInputEven" /s


typeof x !== "number" ? false : x % 2 === 0


Somebody in your codebase is going to use !isEven(x) to test for isOdd and introduce a bug. IMO, it should crash if passed a non-number.


Actually it's the other way around. The isEven package has one dependency, which is the isOdd package.

I am literally not joking. https://www.npmjs.com/package/is-even


https://www.npmjs.com/package/is-odd-and-even

I don't know what to say.

  /**
   * is-odd-and-even
   * Github: https://github.com/fabrisdev/is-odd-and-even
   */

  import isOdd from 'is-odd'
  import isEven from 'is-even'

  /**
   * @param {number | string} i The number to check if it's odd and even
   * @returns {boolean} True if the number is odd and even, false otherwise
   */
  export default function isOddAndEven(i){
      return isOdd(i) && isEven(i)
  }


IsOdd and IsEven are so famous that it would not surprise me at all if there are a bunch of joke modules using them.


Independent of the insanity that is the npm ecosystem, the API design skill of handling invalid input in a bug-resistant way is important. If somebody sent me a code review with isEven implemented as "typeof x !== 'number' ? false : x % 2 === 0" I'd tell them to fix it.
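Something along these lines would pass: fail loudly instead of quietly answering false for garbage (just a sketch):

  // Throwing on non-integers keeps !isEven(x) safe to use as an odd-check.
  const isEven = (x) => {
    if (!Number.isInteger(x)) {
      throw new TypeError(`isEven expects an integer, got ${typeof x}`);
    }
    return x % 2 === 0;
  };

  isEven(4);   // true
  isEven(3);   // false
  isEven("4"); // throws TypeError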


* Tutorials written in the pre-ES6 world being copied and used in the ES6 world.


No. We must move fast, break things, and trust the network (because it's reliable and all that).

Implementing good code with no dependencies is heresy! It's a relic of the past.

</rant>

On a serious note, you can implement a lot of things with the standard library of any language, and only time-consuming machinery should be incorporated as external dependencies, but somebody didn't get the memo, I guess.


NPM is a shitshow. At one point you could literally take over people's packages by asking the NPM staff nicely.

Source: https://twitter.com/sephr/status/1524080664106086400


One of the problems I see is the super-popular holdover packages from before ES6 was standardized. The thread mentions the `foreach` package. And remembering back, many previous supply-chain attacks were on these small packages: `iseven`, `isodd`, etc.

I don't fully understand why packages like this are so popular. My guess is it has to be a combo of legacy projects pulling those deps in and inexperienced developers not knowing that these features are now part of standard JS.

While this would cause controversy, I think that NPM should lock all these dependencies down and not allow any modifications. The modules would basically be passthroughs to native ES6 functionality. I'm not saying NPM should lock down all legacy packages, just the ones that implemented ES6 functionality (isBetween, isEven, isOdd, forEach, etc.).
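For instance, a locked-down `foreach` could become a hypothetical passthrough like this (a sketch only; the real package also iterates plain objects, so it wouldn't be quite this simple):

  // Hypothetical frozen replacement for the `foreach` package: delegate
  // straight to the native Array.prototype.forEach instead of a userland loop.
  module.exports = function foreach(arr, fn, thisArg) {
    return Array.prototype.forEach.call(arr, fn, thisArg);
  };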


I don’t get why anyone would introduce a dependency this small to their codebase.

It's a much better choice to just copy the source code (and possibly the license) into your own code and eliminate the overhead.

It’s not like any of these packages depend on being updated for security reasons or anything.


At risk of criticism, I'll bite. I used to think this way and include a lot of small dependencies in most projects I worked on.

The thinking was as follows: Of course you could just copy the code, but then that increases LOC in my codebase that I'm responsible for. More code is more work. Lines in a dependency are the responsibility of someone else. If there's a bug, even in a small function, the community can identify it and fix it. I can get new features I might not have known I needed. I can benefit from all of these fixes indefinitely into the future without ever having to have any mental overhead about that code. So can everyone else; it's good to maximize code reuse.

I don't think I've ever used something that could be an obvious one-liner like `isOdd`, but for lots of only slightly more complex stuff like left-pad, email format validation, GPS coordinate math functions -- all stuff that's really less than 30 lines -- it was really nice to just not have to think about the implementation details and get back to solving your problem. I could have reviewed the code or written it myself, but it's just more work, when staying at a high level with a `leftPad()` call lets me stay focused on my original task.

That said, I've since realized I was wrong of course. Trying to maintain projects that haven't been touched in more than a year led to hours of fixing dependency issues. We switched to using dependabot, which is better, but just makes it obvious how much work it actually is to keep dependencies up to date week-to-week. Then there's all of the security issues. These days, for small packages, I advocate for reviewing the code from these packages, ensuring we understand it, and then copying it in directly with a comment for attribution. We generally try to keep dependencies low; still more than in other languages but at least some thoughtfulness about whether it's "worth it". I think a lot of the community has shifted similarly, but there's still a lot of older projects with older dependencies.


> Of course you could just copy the code, but then that increases LOC in my codebase that I'm responsible for.

Once your company is owned by a supply chain attack or by an RCE in one of your dependencies, you will learn that you are in-fact very much responsible for the code in your external dependencies.


This drives me crazy. If your software is using a library that library is part of your software whether you originated its code or not. If you are responsible for the software then you're responsible for the LOC in every library your software depends on.


I think a lot of it is from a single engineer: sindresorhus

He's a prolific open source author who traditionally had a lot of modularization in his packages, though I think he has started to move away from it recently. He talks about it here: https://blog.sindresorhus.com/small-focused-modules-9238d977...

Most projects will include at least one package by him in the dependency tree.


Funnily enough, I was just digging through my current project's dependencies, and he's slipped in via a Rollup plugin which uses his "globby" package, which in turn uses his "slash" package. Slash is about 5 lines of code, which is roughly 4 more than most people need for what it does.

I vastly prefer to use libraries with 0 dependencies but I never quite manage it, so I end up with the same problems as everyone else.


For years, the bane of my existence was always lodash. Import one tiny utility and it brings along 15 of its closest friends. You can't get away from it. It was built so incredibly modularly that it has to import a bunch of other things. Now most folks will quickly rush to its defense and tell you that Webpack should be able to tree-shake it well enough. My actual experience is that it was a tremendous waste of my customers' bandwidth. Seeing lodash as a dependency of anything I use is an immediate NOPE for me. I've yet to see anything it provides that I couldn't just do myself.

It's an unpopular opinion for some silly reason. It's like when Guava or Apache Commons was included on every single Java project in the 2010s. "It's just so much easier to use a well-tested library". That line of thinking is what got us here.


I'm the opposite, I tend to be relieved when I see lodash as a dependency. In any normal sized project I'm probably already using it, or it's already a dependency of a dependency. I trust it a lot more than a hundred tiny libs by random authors, both in terms of sketchy behaviour and reliability/performance. If I need to target platforms which don't support modern JS I vastly prefer to stick to lodash than use a bunch of polyfills of unknown quality.

The individual functions are installable separately from NPM, and lodash-es should tree-shake quite well, but I do know what you mean about it dragging in its internal dependencies, so you end up with 15kb of lodash code for a single thing. I probably wouldn't love to use it client-side on an ordinary website, were I to make one.
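E.g., one way to keep it to a single function (lodash also publishes per-method packages like lodash.debounce):

  // Pulls in one function (plus its internal helpers) instead of all of lodash.
  import debounce from 'lodash-es/debounce';

  const onResize = debounce(() => console.log('resized'), 250);
  window.addEventListener('resize', onResize);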


> I don’t get why anyone would introduce a dependency this small to their codebase.

This might be controversial, but I think it has to do with less experienced developers, beginners, who don't know how to find out via "pure code" if a number is even or not, or even if something is a number (IIRC, there is an isNumber package out there as well).

I see this in other languages as well, but not on a "JavaScript scale"; that probably just means that a language's availability and popularity decide the number of "stupid packages."


Let's call a spade a spade. It's a bunch of noobs that have become the majority and created a culture of dependency-first development.

I haven't worked with anyone for years that doesn't start troubleshooting an issue by reaching for a random dependency. They just don't even bother learning Javascript or the Web APIs or CSS anymore. Without a solid base of fundamentals every trivial problem seems insurmountable so they just don't even try.


If we're going to call a spade a spade, then let's go ahead and say that the majority of the packages that are always referenced in these discussions (is-even, is-odd, etc.) are maintained and primarily used by a minority of developers who are very well known in the ecosystem.

This minority makes money out of their popularity in the NPM ecosystem. Having 1000 NPM packages is better for your reputation than having 100. And having all 1000 packages with lots of weekly downloads is better than having just a portion of that.

But how do they achieve that? Well, they have 1000 NPM packages, so each one depends on 5 to 10, which then depend on a handful more. You have packages for checking if an HTTP status is a certain number, you have packages that have colors as constants, you have is-even, is-odd and so on. All of that exists to maintain that closed ecosystem.

So out of the 1000 they basically have 20 useful tools and 980 garbage packages that exist only to maintain their own ecosystem.

Most people aren't using is-even or is-odd directly. They imported some other packages that are quite useful but often need 10-20 sub-dependencies. Another interesting thing is that those shitty packages aren't really that important in applications. They're often used in build tools, CI and testing, tools for making CLI tools, and the like.

The crazy thing is that a lot of people using is-even/is-odd aren't really "noobs": they're probably experienced developers who said "fuck it, I'll use some random tool from the web" when facing some random problem.


That's a more charitable (and more reassuring) interpretation than "developers don't know the modulo operator".

That said, it still leaves a sour taste as this effectively implies that a certain set of JS developers is very happy to abuse their (maybe initially rightfully earned) prestige to gain even more prestige while leaving behind a mess for the whole ecosystem. I don't understand why this is tolerated. The Node community needs to have a serious discussion about why certain packages are allowed to spread garbage, create forks of the relevant packages that rip out "is-even" etc. and then eventually converge to these forks. But to this day, I don't see the community taking this problem seriously enough.

Now, supply chain attacks and "too many dependencies" are a potential issue for every language with dependency management (see also log4j, etc.), but no other ecosystem seems to have such a high frequency of issues, and (widely used) "is-even" packages are simply not a thing in any other mainstream language (some languages, like Swift, include similar functionality in the standard library, which is totally fair).


The reason it is tolerated is because the philosophy of "thousands of small packages" has spread far and wide.

For every person calling it out like we're doing here, there are ten others praising maintainers able to whip up ten semi-useless packages per week.

It's not just random maintainers making small packages. The core infrastructure of Javascript is in it. Babel is made of hundreds of packages, which all live in the same repository (because of course the maintainers don't want the hassle of maintaining multiple things). Some of those packages don't even have anything of importance in them, just metadata, a couple of flags and some boilerplate [1]. The package is just a way of organizing code. Webpack, ESLint and others aren't exactly better.

EDIT: And of course I got downvoted :)

[1] https://github.com/babel/babel/blob/main/packages/babel-plug...


You're tacitly giving the people you despise leverage when you say things like:

> The core infrastructure of Javascript is in it. Babel[...]

Babel is not core JS infrastructure. It may be close to fundamental to the modern NodeJS development experience, but JS exists happily (and capably) without any of that stuff (including package.json, for that matter).


Fair enough! True, Babel is only "core" for a subset of JS developers, not for the language.

I don't really despise anyone in Babel, though, I'm only criticising their packaging method. Babel isn't doing the million-packages thing to gain popularity.


> Having 1000 NPM packages is better for your reputation than having 100. And having all 1000 packages with lots of weekly downloads is better than having just a portion of that.

Should we treat them as spammers and polluters, then? Because if what you describe is true, that deserves to be called out and mowed to the ground.


Definitely.

It is more akin to SEO spamming than black market spamming, though. They're polluting NPM in the same way SEO farms spam Google. It makes life difficult for everyone, but it's still a gray area in terms of legitimacy. Which is why nobody really talks about it.


I don't know if this is true, but if it were, you'd have a veritable "tragedy of the commons" [0] situation where privatization of (some) gains leads to the creation of negative externalities for the rest.

If incentives were aligned differently, different results might have followed. Probably with different externalities (or unintended consequences).

[0]: https://en.wikipedia.org/wiki/Tragedy_of_the_commons?wprov=s...


> Most people aren't using is-even or is-odd directly. They imported some other packages that are quite useful but often need 10-20 sub-dependencies.

IMO, that's even worse, because that means that a lot of people are using stupid and vulnerable packages without knowing it.

I wish there were some sort of better control over the NPM directory, where someone could block/downvote (or whatever) packages that don't deserve to live. How this would - or should - work in practice, I have no idea, but it's just getting scarier by the day to import a package into your application.


The problem is that those fame-chaser maintainers aren't the only ones doing it. Babel does it. Webpack does. ESLint does it. Before them others did.

If we ban those, we'd have to ban Babel and Webpack too... Oh, wait a minute, now that actually sounds interesting...


Wow. You must be a special kind of stupid?

What if we focused on fixing all the problems, instead of retreating and thinking "we can't solve this problem, because there are so many other problems related to it"?

Your thinking is literally the definition of the problem.


A few developers will register multiple packages that are essentially the same package with small tweaks. They choose the names in a way to get the right keywords to hit in npmjs's search.


If a developer doesn't know how to write their own "iseven", I don't want their product.

Maybe you could argue for other reasons to use these packages.


I suspect a lot of this use doesn’t go into products, it goes into projects.


Comparing to:

1. search for a package that does X

2. scan the search result and find the package that really does X

3. learn to use the package's API

4. import the package in the code

5. use it

Isn't it much easier to just copy-paste code from Stack Overflow? There's also a good chance that you can get some very good explanation and interesting discussion around the implementation there.


Well, yes.

There's also Github Copilot, which pretty much replaced almost my entire usage of Stack Overflow.

But the thing with a rando package is that one doesn't have to review the code. Sure, the code is also coming from somewhere else. Sure, it might never get updated. Sure, it might be full of bugs. Sure, it might be more dangerous than copying from StackOverflow. But out of sight, out of mind.

In the end the overuse of packages isn't about saving time or "doing the best for business" or "ensuring that the code is maintained by someone else". It's purely about covering our asses.


I feel it's more likely related to how often "don't reinvent the wheel" and "NIH syndrome" are thrown around. It's no surprise to me that people will first check to see if a library already exists before going off to write their own, probably lacking, function.


I find that I do that more often now than ever.

Most times I get the urge to pull in a rando package I find I really only need a few things it does. I check it out. Read it. Write my own.

I almost never need "all the things" outside the situations where I am using a big framework. So I read it, get inspired and get ideas from someone who did the thing, and then I write a much more narrowly focused version for myself.


And you can easily add the features you actually need. :)


4-5 years ago (when the popularity of node was skyrocketing but es6 was still relatively new) I remember some people arguing passionately that it was good to maximize code reuse, i.e. better to pull in a package with one utility than to write 5 lines of code in your own application. I think it was mainly a symptom of the gaping holes in javascript's standard library... thankfully that attitude seems to have mostly died off.


> possibly license into your own code

This can be complicated. Depending on another package is usually very safe, at least as safe as "dynamic linking", but including code in your own source tree needs licenses to be compatible. Even then, you might have to change your license to "BSD-3-Clause + ISC" or similar composite and you will get complaints from users.


Generally, I don't think they're particularly popular, but a package that depends on them is popular. Or in fact, a package that depends on that package (that depends...) is.


At some point I randomly stumbled over a GitHub account that had published many NPM packages with only one or two functions of source code each. They marketed themselves as a big contributor to the NPM ecosystem, based directly on the number of NPM packages published and the number of monthly downloads. Some packages seemed to take off because people just do an npm install instead of replicating a one-line function, while others, I think, were more popular nodejs projects of this user into which they forced as many of their own dependencies as they could.

TL;DR: "SEO" spam to market one's own nodejs expertise would not be out of the question.


ES6 really has nothing to do with isEven and isOdd


truly... those are just modulo operations


Hahahhahh isBetween, why

Compared to that, left-pad is rocket science


https://github.com/yefremov/isnan

An entire module for just `return value !== value;`
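(Which works because NaN is the only JavaScript value that is not equal to itself:)

  // NaN is the only value in JS for which x !== x, so:
  const isNan = (value) => value !== value;  // same result as Number.isNaN
  isNan(NaN);       // true
  isNan("cabbage"); // false: not a number, but also not NaN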



Which wasn't everywhere in 2017, when isnan was created: IE11 didn't (and still doesn't, though it's less relevant) have isNaN()

(Not supporting a culture of taking a dependency for this sort of thing, though)


Just to be clear, because your comment made no sense to me: what IE11 doesn't support is Number.isNaN(). It definitely does support (and has since IE3) plain isNaN(): https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...


Regular isNaN is very tricky, and almost never what you want: it first coerces its input to a Number, and then checks if that is NaN:

                isNaN   Number.isNaN
                -----   ------------
    {}           true          false
    [true]       true          false
    undefined    true          false
    "{}"         true          false


But none of those things you listed are numbers. The isNaN output is exactly what I want in pretty much every case.


isNaN does not tell you whether something is not a Number either. Applying isNaN to any of these returns false: null, true, false, [], "", "42".


I'm gonna create a package for isX() for each x in the set of integers. Please use my packages for all comparisons from now on.


is-thirteen is six years old: https://www.npmjs.com/package/is-thirteen


Nevermind on that one. I just checked NPM and thankfully that package does not exist.


> I don't fully understand why packages like this are so popular.

It actually works like this: Author X develops `iseven`, `isodd`, etc. No one really downloads such packages. Author X then develops `importantPackage`, which does something useful that developers out here download. The thing is, `importantPackage` relies on `iseven` and `isodd`. Now `iseven` and `isodd` are downloaded alongside `importantPackage`. Profit.

My point is, we should recognize certain NPM authors as toxic, but I guess "freedom of speech/code" stops us from doing so. Example of such an author: https://github.com/jonschlinkert/


This guy would lose his shit if npm locked down his packages. I’m probably just venting but he’s been nothing but rude to me any time I’ve opened a legitimate issue on any of the larger downstream libraries of his. Not to mention, his Twitter is something else.


>I don't fully understand why packages like this are so popular.

I consider 'iseven' and 'isodd' to be signs that Javascript is a hellaciously engineered piece of crap that should be avoided at all costs. They're popular because Javascript is garbage.


JavaScript supports the modulo operator, and the native syntax for this is:

  let isEven = theNumber % 2 == 0;
There are reasons to dislike JavaScript, but this isn't one of them. The native solution is about as standard as it gets.


Please explain to me why there are packages "iseven" and "isodd", if not for the fact that some Javascript kiddy was unable to work this out for themselves by applying the standard intrinsics of the language ...

Because I firmly believe that crap like this happens because the language itself is very, very dumb.


The old C bum in me says this should really be a bitwise AND with 1 but I'm sure modern compilers and JS JIT/V8 runtimes optimize MOD 2 to the same.


I spend most of my time writing C for embedded systems. You should focus first on making life easy for the people reading your code. Use `% 2` and `* 4` when those are the things you mean, rather than `& 1` and `<< 2`. Compilers have been able to do those strength reductions on their own for years, and in most cases even if the compiler didn't, the cycles you'd save aren't worth it.


You're not expressing what you mean by %. I'm all against these packages, but in Ruby we have a good standard library.


I’m a big fan of Ruby and yes, Ruby handles this elegantly.

However, some variation on n%2==0 is idiomatic in a large number of languages, both modern and old [0].

This includes Lua, Java, Python, Scala, Perl, PHP, Swift, and many others.

It is true that there are numerous languages with built in convenience methods in their standard libraries, but that does not by itself mean that n%2==0 is problematic and certainly this idiom is deeply entrenched.

There is nothing preventing a developer using those languages from wrapping this in a function, and indeed this is probably a good idea if it’s something you find yourself using repeatedly throughout your code base, just like any other often-repeated code snippets.

- [0] https://www.rosettacode.org/wiki/Even_or_odd


It's... the modulo operator


Yes. But you’re not expressing what you mean. You don’t want the remainder.

It's like doing a for(i=0; i<items.length; i++) { e=items[i] } to iterate over each element instead of something like foreach, or a lambda.


Of all the possible examples to choose, I would not have guessed a classic for/each would be at issue.

How is that for loop not expressing what it means? for is used to iterate over something. There are legitimate reasons to choose for even when forEach is available.

Does this not express what I mean?

  if (amount > balance)
    return "insufficient_funds";
Chances are, I’ll put this in a method anyway, and that’s what a language enables me to do - to create meaning that is contextual to my program.

Abstraction can help simplify complex problems, and that’s what it is good at. But when literally everything is abstracted away, it becomes harder to reason about what the program is doing without also learning about what the particular abstraction does (e.g. implicitly handling negative numbers, etc).

A well implemented standard library can be a joy to use, and some people choose languages for this reason. Some languages very intentionally provide a very lightweight standard library, and that’s just fine. All of this shouldn’t absolve the developer from understanding some of the very basic constructs of the language they choose - a lack of which will lead to bugs and vulnerabilities.

If I choose JavaScript (or it’s chosen for me), it behooves me to understand the basics vs. farming this out to some node package that just made my app harder to reason about and vulnerable to supply chain attacks for essentially no value.


I don't usually get drawn into this, but I have an answer to this because I've seen something similar in production.

jbverschoor said (edited for brevity):

> Yes. But you’re not expressing what you mean.

> It’s like doing a

  > for(i=0; i<items.length; i++)
  >   { e=items[i] }
> to iterate over each element instead of something like foreach ...

haswell replied:

> How is that for loop not expressing what it means?

It's true that the for loop is accomplishing the goal, but it is specifically running through the elements from 0 to length.

But sometimes what you want to do is, for example, apply a function to every element, and the fact that they are in a structure indexed from 0 to N-1 is not relevant. Your intent is to do a thing for every element, not to run down a list doing the thing.

There is a difference in intent, and using foreach instead of a for loop quite specifically expresses that.

Small examples like this are never convincing, but there really is a difference between "Run down this list" and "do this thing for every element". The output is the same, but the intent is different, and expressing that intent is important in larger projects.


That intent is implicit, and using for or forEach does not by itself provide enough information about intent. I agree that using one vs. the other can be a strong hint. The surrounding code provides the rest of the story.

It may be true that running from 0…length is not the most important part, but it may also be true that I need more flexibility and control while iterating, and may still choose to use for as a result.

But I think we’re getting a bit off track (and I helped get us here…).

jbverschoor's comment was a follow-up about the expressiveness of the modulo operator vs. built-in convenience methods.

At least in the case of JavaScript, there isn't an isEven available. If there were, an argument could be made that using it is more expressive. Since there isn't, the argument can be made that the most expressive option is the idiomatic one. Certain code fragments become idiomatic for a reason.

I interpreted the comments about for and the perceived lack of expressiveness in using it against this backdrop, and my point was just that for still has a time and a place. It may be that jbverschoor recognizes this, but I think the comparison doesn’t quite help when examining the use of modulo.


I suspect we are largely in agreement. The example is not necessarily a good one, but I think we agree that the intent expressed by "foreach" is different from the intent expressed by "for ()", and while that difference might not matter, and might not be large, and might be carried equally by the context, it is real.

Going back to the original, there is a difference between "isEven(n)" and "0==n%2". Sometimes (and this is definitely true in some code I write) there is a conceptual difference between asking if something is even, versus asking if it leaves a remainder of 0 upon division by 2. Yes, they can be proven to be equivalent, but the underlying intent is different.

BTW, I agree entirely that in these cases it's really mostly pointless, but the principle carries over to larger examples. Trying to see that there is a difference is probably worth it, even if we agree that it doesn't really matter in these trivial cases.

Oh, and for a compiler I used to use it really did matter that you used "foreach" when you could, versus using "for ()", because it could dispatch things in parallel in the former case, and couldn't in the latter. So expressing the intent in the code itself and not just in the context really can matter.


Yeah, I think we are on the same page.

The only nit I'll pick is re: the 2nd paragraph. I do agree that there is a difference in the expressions "isEven(n)" and "0==n%2" in isolation. But these expressions will always be surrounded by more code, and the degree of difference will entirely depend on that code and its structure. All of this can be very simply solved with a quick:

  function isEven(n) { return 0==n%2 } 
And let isEven = 0 == n % 2 provides about as much context as isEven(n).

Admittedly, I had to be conscious of how I wrote that expression, but choosing meaningful variable names is also just part of writing code, and isEven() by itself likely doesn't provide enough context about why I'm checking for even-ness to begin with. These considerations all need to go into the design of the class/method, but at no point can the developer wash their hands of the need to provide context through standard practices, just because the standard library provided more descriptive methods.

And I'm not saying this is what you're saying, but the original comment seemed to believe that more expansive standard libraries are somehow inherently better. I'd argue they're just different, and more factors to consider when choosing a language. Sometimes they help. Sometimes they're not worth the price of admission.

I think the discussion could be generalized as: how far should standard libraries go? No matter how good that standard library is, at some point, I must apply the fundamental skill of adding context and structure to the code I write.


Now I want this to be the default twitter ui so much.


I found it very confusing, is the chart trying to be a timeline?


No.

The chart in all forms (there's more than one) shows each tweet as a node, and arrows showing which tweets are replies, and which tweets are "Quote-Tweets".

If A->B, then B is a reply to A.

In this particular version I've taken the longest thread and laid it out on the left, allowing the other descendants to flow to the right. Another layout is to have the initial tweet at the top and lay it out as a simple tree (technically DiGraph), but that often leaves the top left corner empty, leading people who initially open it to think it's empty and close it without scrolling. Yes, they do, that's happened before here on HN, including in this discussion.

Does that help? Are you still confused? It's just a chart showing tweets, quotes, and replies.


Do you happen to know what tool made this? Is it just Graphviz?


I wrote a script to pull the thread recursively, reformat it into a DOT file, feed it to GraphViz, and upload it.
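The shape of it is roughly this (a toy sketch, not the actual code; the real thing has to deal with auth, paging, and rate limits):

  // Toy sketch: walk the thread breadth-first and emit a DOT digraph.
  // fetchReplies(id) stands in for the Twitter API lookups.
  async function threadToDot(rootId, fetchReplies) {
    const lines = ['digraph thread {', '  node [shape=box];'];
    const queue = [rootId];
    const seen = new Set(queue);
    while (queue.length > 0) {
      const id = queue.shift();
      for (const reply of await fetchReplies(id)) {
        lines.push(`  "${id}" -> "${reply.id}";`); // A -> B: B is a reply to A
        if (!seen.has(reply.id)) {
          seen.add(reply.id);
          queue.push(reply.id);
        }
      }
    }
    lines.push('}');
    return lines.join('\n');
  }
  // The result then goes through GraphViz: dot -Tsvg thread.dot > thread.svg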


Thanks! (I didn't realize you were the author).


You're welcome.

I have several tools that do this sort of thing, including a 'bot that can be invoked automatically on Mastodon. I don't do that on Twitter because popular discussions get seriously out of hand ... there was one that I stopped tracing after it got to 3500 tweets. I have some heuristics in mind to help control that, but it's all still very experimental.

I also have a pig-ugly, pre-alpha, bug-ridden discussion system that works directly on a DiGraph, but when it was submitted to HN some time ago reactions were ... (significant pause) ... mixed.


I think the main thread is on the left, with the reply/QT sub-trees on the right, so you can read just the first column. I quite like the layout; if there's only one main thread I need to be aware of, it's quite nice. It probably won't work if there are more than two people in a conversation.

The layout is actually not that dissimilar from what Twitter already has, just with more info from non-main threads and better visibility for orphan leaves.


It tries to map the entire tree in one view.


At the time the snapshot was taken that was the entire tree, so it doesn't just "try", it succeeds.


I am starting to think that what we need is package distributions. That is, forked popular packages that are hand-vetted by a group of maintainers, just to make sure that some random malware doesn't find its way into the dependencies. If you want the latest and greatest, or want your custom dependencies checked, you pay. Otherwise it's free.

I'm quite sure that the market for this is really big. Does something like this already exist?


That sounds like a Linux distro with the base OS factored out; the only difference between that and, say, Debian, is that Debian is a kernel + core userspace and not just application libraries. I'm not aware of anything quite like that already existing, but OS distros are sufficient prior art for me to think that it could work.

EDIT: Actually, there are package managers + package collections that intentionally can be installed on top of an existing OS; nix/nixpkgs and pkgsrc are probably the big ones. I'm not 100% sure that those are what you want today, but it's likely that if you could get maintainers to add your desired packages they'd be accepted there.


It also describes software ecosystems like the Apache Software Foundation, the .NET Foundation, or even just community vetting like PHP has with Tidelift (which is a pretty clever idea on its own), etc...


Yup, I knew I wasn't the first one to see an opportunity here. :) Tidelift sounds great, thank you!


Isn't that kind of what Artifactory does? Still, that also opens up the question of timely updates to fix possible security vulnerabilities.


I see a lot of comments here pointing to new/inexperienced developers. As one of those guys, I have to say the JS community is not that great at keeping blogs/docs/tutorials up to date. Take React, for example: the learning section is quite out of date, and no one develops using classes anymore.


Myles has a good point about public emails and account emails sometimes being different (https://twitter.com/MylesBorins/status/1523811989797093376). There was a way to resolve account usernames to email addresses through a quirk of the user interface up until last year. Fortunately, they're really good at dealing with bug bounties and had it fixed less than a week after I reported it. That said, domain-based accounts for something like this are still a big vulnerability.

Perhaps something akin to the UK OFCOM amateur radio license requirement would work, requiring people to update/confirm their details at least once every five years, or in this case, yearly?


> Buy expired NPM maintainer email domains

How do people manage selling or changing a domain they've been using long-term as their primary email address for sign-ins like this? To stop a new owner of that domain from taking over account sign-ins that use that email address, you'd need to reliably update all your sign-in details with anything you've signed up to?

You could use a password manager to track everything you've signed into before to help with this but that's error-prone.

Alternatively, this forces you to keep paying for the old domain indefinitely?


Yes, that's exactly what you have to do. There's no way to raze land on the internet.


This also highlights how often developers seem to ignore mortality. When a person expires, soon will their domain registration follow.


I know PGP isn't perfect but why aren't packages signed by their maintainers?


Overhead, and fear of losing the key and being locked out, feel like two valid reasons; I'm sure there are more I'm unaware of (probably should add lack of familiarity, hence the lack of knowledge).


Who will check the signatures when so few have signatures?

What dev thinks: oh, I can't upgrade because of this error; Stack Overflow says use this flag, --disable-signature-verification, so I do, and now I can develop again.


Any place with a devops team would not disable that.


For what it's worth, Debian packagers check signatures when downloading from PyPI.


To my knowledge NPM doesn't currently have a mechanism for signing by authors. Packages are signed by NPM itself on upload, which defends somewhat against repository compromise.


This page just shows an empty white page for me on mobile Safari. Is the site down? Does anybody have an archive link?


Try shrinking it, searching for text, or scrolling. It's a chart, and a small part of the top left corner is empty.


Thank you very much!


The chain of trust is a lie, external dependencies are as much a trade-off in security risks as they are in production cost, and your belief in the good intentions of a person/entity does not shield you from their unintentional mistakes or oversights.


What I would like is essentially a curated npm: ES/ESM modules only, plus some other rules, with better vetting of packages and their dependencies. At first almost no existing npm packages will be suitable to make the migration, but I expect that with just a bit of work and automation we can migrate over versions of the behemoths as well. Working with this package manager, your builds can be simpler because you only deal with and target ES modules. Goodbye CommonJS! Goodbye left-pad! See you in hell!!


Is there an easy tool, maybe something for ${bundler}, to take your package.json and rewrite everything to refer to static versions hosted on your own CDN?

At least that way upgrades, malicious or otherwise, are opt-in for production.

The other thing would be, ideally, a crowd-funded resource to vet particular versions of popular packages.


For NPM, use https://verdaccio.org/. It can proxy the public registry. Install your project and it will pull and cache the dependencies. Once cached, you can remove the uplink and it will only serve the cached versions.
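A minimal config sketch, from memory, so check the Verdaccio docs (config.yaml is the standard place):

  storage: ./storage

  uplinks:
    npmjs:
      url: https://registry.npmjs.org/

  packages:
    '**':
      access: $all
      proxy: npmjs   # drop this line once everything you need is cached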


The link just loads a blank white page for me on mobile. Would have preferred a direct Twitter link.


As it says elsewhere in this discussion, it's not just a blank white page.

Try scrolling. Or searching. Or scaling.

There's a chart there, and you can scan the entire thread, or just click on a node to go directly to the tweet you want to see.


Scroll ~1000 pixels to the right and then zoom out 10x so you can click a node, skip the weird diagram, and just use Twitter.


Non-essential dependencies are for the weak and every single company that I worked for.


Reminder: when the 'colors' and 'faker' packages printed stuff in the output in an endless loop, NPM and GitHub banned the author and took over the packages.

When 'node-ipc' overwrote all files on disk, NPM just waited while the author himself published an amendment and posted that the package 'only created a text file, no biggie'.


faker was a disease that I fought against in the places where I was unfortunate enough to inherit it in the codebase. Lots of devs, not knowing better, thought it really made our tests more robust -- it didn't; in fact, it made them more flaky. Also, faker's functionality can easily be shimmed or rolled yourself if you want it. I will continue the fight: there are many other super-popular npm packages whose value is dubious at best, which cause dependency-tooling headaches, and which are like viruses in the web ecosystem. And the fact that this faker guy sabotaged his own package is great, because even before that I was against it, and now it is just easier to convince people not to import this clown lib.


Vendor and audit your dependencies. You shouldn't rely on GitHub to protect you from security problems.


While I agree in principle, that usually means you'll 1) frequently blindly update all your dependencies, which doesn't really give you much security benefit, or 2) use old dependencies long after security vulnerabilities have been discovered in them. The second point does give you some security benefit, by increasing the chance that you'll hear about a malicious package before you upgrade to it, but it's not exactly great.

I don't know what the solution is. I don't think there is a solution. We can't have 5000 dependencies from 2000 random individuals on the internet and still be safe. But if you want to avoid that situation, you're locking yourself out of the vast majority of the NPM ecosystem.


Maybe the ecosystem needs to discourage unnecessary package imports.


[flagged]


Sorry, but I can't watch your youtube video until it has been thoroughly vetted to the same standard as commits to the linux kernel or peer review in science. As a matter of fact, I'd have to say your video is bad for society.

Surely you see how this is a self-defeating argument? Of course higher-stakes projects where inclusion of incorrect or malicious inputs in the end result could have significant impact on a lot of people will have more stringent checking than social media. Is your ideal world really one where all forums of human communication are peer reviewed?


What YouTube video? I haven't read that comment yet; still awaiting results from the Online Content Vetting Foundation. Yours flew through the process though. Congrats!

In all seriousness though, I do agree somewhat with the GP. Inherent vetting can be good. But it's also a tool used against us for manipulation. An example of this would be app stores. Heavily vetted, but you are force fed the apps they want you to see, and they have every right to deny you access to apps they find undesirable.

We need systems of reputation and trust to emerge, as they always have, and we need the freedom to choose them. Tools like Twitter allow largely unvetted content, sure, but they also allow users to curate their own circle of trust. This comes with risks of course, but I trust it more than some omniscient entity handing me content like manna from heaven.


That’s because they are vetted by ONE organization, that privately owns the store. It is actually an example of private capitalist ownership, again.

In science, no one “owns” physics. No one “owns” articles on wikipedia. The ideal is to have multiple reviews represented and all of their major legitimate concerns should be addressed before publishing something to the public. Think of concentric circles where the inner circle isn’t just Apple or Google but a good mix of experts with different views.

The hard part is resolving disputes between experts, or determining if a criticism is legitimate. This is what “edit wars” are like on wikipedia. Perhaps in this case, both points of view should be included side by side.

In a capitalist market system, though, explaining the nuances of WHY Trump’s or Biden’s administration did something, or including the entire context of a gaffe video, would make it uninteresting to their audience. Imagine if Fox News fairly reported the results of studies about single-payer healthcare around the world -- their audience would bounce. They face intense market pressures to cater to their audience; this is WHY they report as they do and WHY they tell their straight-news anchors to “rein it in”.

The one exception may be Tucker Carlson over there, who somehow manages to maintain a show while going against the entire military-industrial complex and establishment, unlike Sean Hannity. It is rare to see someone in such a highly promoted time slot take such a contrarian position, but if it sells to an audience, I guess it works. CNN totally jumped the shark when they discovered they could cover the Malaysia Airlines flight 24/7 for months and their ratings went up. They went from being a straight news network to pandering for ratings too. None of the open-source platforms, nor Wikipedia, would sell out like that.


It is a matter of scale.

In the video, I bring this up to Noam towards the end. He actually disagrees that audience is a form of capital; he says it is influence, but not capital. Even though one can perform the same operations with it as with any other capital, including exchanging it for other forms of capital, etc.

You are just USED to celebrity culture, and following what Johnny Depp or Will Smith does can even overshadow a war in Ukraine or a humanitarian crisis in Yemen. Four Libyan ambassadors matter more than millions of Libyans.

On Wikipedia or in nginx there are no celebrities, except maybe the founders. And each is far, far more useful and popular than its closed-source capitalist alternative: Britannica/Encarta and Internet Information Server. They power more machines and serve more people in more ways.

And you bet that there are a lot of operating system maintainers that put software like nginx and php through their paces before giving their members access.

If you don’t like a centralized organization vetting things, then have various distros, like Linux. The point is that the author shouldn’t also be the one vetting each new version and what changed. Security researchers should be.

Your argument for “free speech absolutism” would fail in science and any endeavor that actually matters to people. For humor, entertainment and idle discussion, maybe we can have celebrities.

But yes — my video would normally not matter if not for Noam Chomsky’s amassed reputation and audience. The content in my video would be far better published as part of a collaborative discussion rather than an off-the-cuff one. But a chat between two non-celebrities would be fine for society… people could watch it for entertainment, or whatever. It wouldn’t move markets or convince people of crazy theories about 9/11 or QAnon pedophiles.

There is a limit to how much power should be concentrated in one place.


You have grossly misunderstood my comment if you think I was arguing for "free speech absolutism", which I think is an insane idea.

To be completely honest, this entire comment reads to me as rambling largely unrelated to my first comment.


> We should treat things like operating systems do: have collaborative security assessments for each package BEFORE it is published.

This would not work for small open source projects, and every successful open source project starts out as a small project that nobody vetted yet.

I have smaller but popular open source packages that I know are used by many people, and most of those projects (even now, after a year of being public) get zero actual contributions. If my most popular packages don't receive pull requests, how would I find a trustworthy expert committee to judge all my other silly little packages for free?

Also, why would I purposefully make the projects I work on for fun and for free a painful, sluggish, bureaucratic experience for everyone involved?


It's a matter of scale, as I said :)

At some point, projects have to bring in more "adult supervision"


[flagged]


(not OP) Little things are taken seriously; serious stuff is taken lightly.


It’s like bikeshedding. It’s easier to deal with the smaller problems.


A random connection: bikeshedding was mentioned yesterday in a totally unrelated thread: https://news.ycombinator.com/item?id=31315883 (the interesting comment is the large block of text three replies deep)

A bit random, but my brain likes drawing connections between things. Not sure if it's useful.


The only thing I take away from this is that you should not use custom domains for your email, at least for important stuff.


I checked the password-reset flows for most major NPM contributors who use regular webmail accounts like Gmail, too.

Fewer than 10% had useful 2FA enabled. Most had just password-reset questions, or a backup email pointing at some old, expired, re-registerable, forgotten Earthlink account, etc.

I do this stuff all the time. I looked up the password reset questions controlling the zoom.us domain 2 years ago and collected an insulting $200 bug bounty for it. Zoom, like NPM, didn't do signed binaries, so this would have been brutal combined with other issues I found.

The solution is what all sane OS package managers do: code signing.

NPM has rejected this at least as far back as 2013 when they refused a PR by someone that implemented it for them: https://github.com/npm/npm/pull/4016

I don't know what else to do but keep publicly trolling them at this point until code signing is implemented. Unsigned code is a free pass for remote code execution when an account gets taken over.

Meanwhile Debian maintains hundreds of signed Node.js packages, proving it is very doable by a low-budget team. NPM seems to just want to reduce developer friction at all costs :/
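
To be clear about scope: the signing mechanics themselves are tiny. Here's a rough sketch using only Node's built-in crypto with Ed25519 (the filename is made up); the genuinely hard part, which none of this solves, is distributing and trusting the public keys:

    // Sketch only -- not npm's design. Node >= 12, built-in crypto.
    import { generateKeyPairSync, sign, verify } from "crypto";
    import { readFileSync } from "fs";

    // Publisher: generate a keypair once, sign each tarball at publish time.
    const { publicKey, privateKey } = generateKeyPairSync("ed25519");
    const tarball = readFileSync("my-package-1.0.0.tgz"); // hypothetical file
    const signature = sign(null, tarball, privateKey); // null: Ed25519 takes no digest

    // Installer: verify against the publisher's public key (obtained out of
    // band) before unpacking or running anything.
    if (!verify(null, tarball, publicKey, signature)) {
      throw new Error("signature mismatch -- refusing to install");
    }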


The question in that thread, and this later thread,[1] is how to know which keys are valid to sign a package.

For example: I go to release a new version and I've lost my private key, so I roll a new one -- this will happen often across npm's 1.3 million packages. Do I then ... log in with my email and update the private key on my account and go about my business? What process does npm use to make sure my new key is valid? Can a person with control over my email address fake that process? How are key rotations communicated to people updating packages -- as an almost-always-false-positive red flag, or not at all, or some useful amount in between? If you don't get this part of the design right -- and no one suggests how to in those threads -- then you're just doing hashes with worse UX. And the more you look at it, the more you might start to think (as the npm devs seem to) that npm account security is the linchpin of the whole thing rather than signing.

It's not just npm; that thread includes a PyPI core dev chipping in with the same view: "Lots of language repositories have implemented (a) [signing] and punted on (b) and (c) [some way to know which keys to trust] and essentially gained nothing. It's my belief that if npm does (a) without a solution for (b) and (c) they'll have gained nothing as well." It also has a link from a Homebrew issue thread deciding not to do signatures for the same reason -- they'd convey a false expectation without a solution for key verification.[2]

[1] https://github.com/node-forward/discussions/issues/29 [2] https://github.com/Homebrew/brew/pull/4120#issuecomment-4068...
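
To make the rotation problem concrete, here is a toy sketch of the pin-the-key idea, trust-on-first-use style (every field name here is invented; npm has no such metadata). Note how a legitimate lost-key rotation and a takeover look identical to the client:

    import { createHash } from "crypto";

    interface Lockfile {
      [pkg: string]: { keyFingerprint: string };
    }

    function fingerprint(publicKeyPem: string): string {
      return createHash("sha256").update(publicKeyPem).digest("hex");
    }

    function checkPublisherKey(lock: Lockfile, pkg: string, keyPem: string): void {
      const now = fingerprint(keyPem);
      const pinned = lock[pkg];
      if (!pinned) {
        lock[pkg] = { keyFingerprint: now }; // first use: trust blindly
      } else if (pinned.keyFingerprint !== now) {
        // Rotation or compromise? The client cannot tell, and since keys get
        // lost constantly across 1.3 million packages, almost every alarm is
        // a false positive -- the UX problem described above.
        throw new Error(pkg + ": publisher key changed -- verify out of band");
      }
    }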


Debian already maintains hundreds of signed Node.js packages using the classic PGP web of trust, with a team of volunteers that lacks NPM's Microsoft money. I don't understand how NPM has any excuses at this point.

The web-of-trust PGP signing approach has worked reasonably well protecting most Linux servers in the world since the '90s. You can complain about it and say there should be a better UX toolchain for it, and I would agree with you. Thankfully the Sequoia-PGP team has made huge progress here, and it is a shame they are not getting due support for their heroic and nearly thankless efforts to make this better.
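
At its core, the web-of-trust check is just a short graph search: is there a chain of cross-signatures from a key you trust to the key that signed the package? A toy sketch (not GnuPG's actual model, which also weighs full vs. marginal trust):

    type KeyId = string;

    function isTrusted(
      signer: KeyId,
      trustedKeys: Set<KeyId>,        // keys I have verified myself
      vouchedBy: Map<KeyId, KeyId[]>, // key -> keys that signed it
      maxHops = 3,
    ): boolean {
      const visited = new Set<KeyId>();
      let frontier: KeyId[] = [signer];
      for (let hop = 0; hop <= maxHops && frontier.length > 0; hop++) {
        if (frontier.some((k) => trustedKeys.has(k))) return true;
        frontier.forEach((k) => visited.add(k));
        frontier = frontier
          .flatMap((k) => vouchedBy.get(k) ?? [])
          .filter((k) => !visited.has(k)); // avoid signature cycles
      }
      return false;
    }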

Still, even with today's GnuPG tools, abandoning PGP for supply-chain integrity and replacing it with nothing is crazy. Imagine if we had abandoned TLS because early implementations sucked. Use the best tools we have, then fight to make them better. That's just good engineering.

The software engineering community at large basically said, "Look, we just stopped signing code and nothing bad happened... oh wait, bad things are happening. Too late to change now!"

This was a reasonably well solved problem, but entities like NPM will need to have the humility to admit that rejecting best effort cryptographic authorship attestation was a mistake.


> What process does npm use to make sure my new key is valid? Can a person with control over my email address fake that process?

It seems there should be some multi-factor process.

Developers need to register a password, an email address, and a YubiKey/TOTP token. If they lose access to the email address, they can log in to their account with the password and token. If they lose the token, they can be issued a new one with the email and password (or recovery codes).

As long as the account stays secure (i.e. an attacker doesn't manage to keylog the developer's NPM password and email password) then the NPM account can be trusted to add new package signing keys. The npm client then needs to trust metadata from NPM which vouches for these new keys.
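
For what it's worth, the TOTP factor in that scheme is small enough to sketch with nothing but Node's crypto (RFC 6238; the secret below is the RFC test vector, and a real server would also check adjacent time steps and rate-limit attempts):

    import { createHmac } from "crypto";

    function totp(secret: Buffer, step = 30, digits = 6): string {
      // Counter = number of 30-second intervals since the Unix epoch.
      const counter = Buffer.alloc(8);
      counter.writeBigUInt64BE(BigInt(Math.floor(Date.now() / 1000 / step)));
      const mac = createHmac("sha1", secret).update(counter).digest();
      // RFC 4226 dynamic truncation: 4 bytes at a hash-derived offset.
      const offset = mac[mac.length - 1] & 0x0f;
      const code = (mac.readUInt32BE(offset) & 0x7fffffff) % 10 ** digits;
      return code.toString().padStart(digits, "0");
    }

    const sharedSecret = Buffer.from("12345678901234567890"); // RFC 6238 test secret
    console.log("current code:", totp(sharedSecret));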


For all that people love to hate it, GPG's web of trust could cover this. Lost your key? That's fine; go convince a dozen peers that that's what happened and you're really you and they'll cross-sign your new key.


You could make it optional and never permit key updates. Then downstream users could decide to only depend on signed code. The downside of this is some people will lose their keys and never be able to update a project ever again. The upside of this is that businesses that prioritize this can be more confident that an account takeover isn't shipping them evil code.


> < 10% had useful 2FA enabled.

I expect this to change. NPM will roll out mandatory MFA for the most-downloaded packages[0] (RubyGems as well[1]), and I expect it will eventually become a 100% requirement, because GitHub's decision to require MFA by the end of 2023 will massively raise the waterline of folks who have the capability to use MFA and experience with it.

[0] https://github.blog/2021-11-15-githubs-commitment-to-npm-eco...

[1] https://github.com/rubygems/rfcs/issues/35


It doesn't even need to increase average developer friction -- they've proven a few times that they're happy to add extra security requirements just for publishers of high-visibility packages.

"Once you cross 100,000 weekly downloads, all new publishes must be signed" would keep the easy on-ramp for small packages but dramatically reduce the risk of major attacks against all the big targets (like the example here).

Not as good as signing everything, but a good start that can be iterated on later.
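
The registry-side gate would be about this small (threshold and field names invented for illustration):

    const SIGNING_REQUIRED_ABOVE = 100_000; // weekly downloads (hypothetical cutoff)

    interface PublishRequest {
      weeklyDownloads: number;
      hasValidSignature: boolean;
    }

    function canPublish(req: PublishRequest): boolean {
      // Small packages keep the frictionless on-ramp; big targets must sign.
      if (req.weeklyDownloads > SIGNING_REQUIRED_ABOVE && !req.hasValidSignature) {
        return false;
      }
      return true;
    }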


Is code signing really a solution here? It seems more like a bandaid to me, because someone without MFA on their account probably also doesn't have the best security around their code signing. In any case, isn't it often handled by CI/CD? If you get access to the developer's account, you can release malicious code, dutifully signed?


Technically what's needed is attestation: I don't care if the author signs it, I care that trustable third parties have reviewed it.

Most people would have no problem marking a couple of big firms and orgs as trusted reviewers, and only allowing packages to be installed if signed by them.


To be honest, rather than magical blessed keys held by a few orgs, I'd feel more comfortable with a slightly larger number of individual reviewers (and prosaic keys), each with some kind of visible distance metric related to the codebase they're signing artifacts for.

For example: "ah yes, a core contributor, a frequent code reviewer, and two downstream consumers of this library have signed off on the changes in this release. even if one of them is having a distracted day/week, that's good enough for our team to be comfortable upgrading it today since it's a minor version upgrade"
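
That policy sketches out naturally as a weighted quorum (roles, weights, and threshold all invented for illustration):

    type Role = "core-contributor" | "reviewer" | "downstream-user";

    const WEIGHT: Record<Role, number> = {
      "core-contributor": 3,
      "reviewer": 2,
      "downstream-user": 1,
    };

    function releaseTrusted(
      attestations: { signer: string; role: Role }[],
      quorum = 7, // tune per team and per upgrade size (major vs. minor)
    ): boolean {
      const score = attestations.reduce((sum, a) => sum + WEIGHT[a.role], 0);
      return score >= quorum;
    }

    // The example above: core contributor (3) + reviewer (2) + two
    // downstream consumers (1 + 1) = 7, just clearing the bar.
    releaseTrusted([
      { signer: "alice", role: "core-contributor" },
      { signer: "bob", role: "reviewer" },
      { signer: "carol", role: "downstream-user" },
      { signer: "dave", role: "downstream-user" },
    ]);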


Sure: I don't think any of this should go the way of TPM and EFI (magic primary keys held by those big enough to have legal departments for the committees) - but it's shorthand to acknowledge that almost everyone would mark "Google and Microsoft internal use approved" package keys as absolutely trusted and mostly not worry about it.


That seems like it'd be a way for those companies to attempt to capture the technical infrastructure used by packages, and hold their adoption potential hostage (soft pressure for packages to change their practices to become approvable, and latterly a hope that the wider community will tend to believe in and trust those approvals).

You could be correct, maybe this would work in practice, trading on the reputations of those companies. It doesn't feel particularly open or community-oriented, though. Why not present a trust graph built from a broad set of worldwide users instead?

(one of the benefits to a trust graph would be the volume of signers; perhaps you wouldn't want to weight each signing equally -- again referring back to something like the distance metric mentioned previously -- but for even mid-popularity packages, the detailed review possible by careful users of a package could, I expect, be more reliable than automated-and-manual review by one or two large companies)


What folks call signing is, if you step back, basically another kind of attestation. It's an attestation of authorship.


> I don't know what else to do but keep publicly trolling them at this point until code signing is implemented.

I don't think your trolling is doing you any favours anymore (sigh, that guy again). Beating a dead horse, etc. Maybe just let it go?


As long as negligence keeps getting people hurt, I will keep calling it out to prevent more of it. If that means being That Guy, then so be it.

In parallel I am writing specs and proposals for widely applicable tooling and improvements.


...or just make sure you keep them under your control...?

Losing access to an account at a major mail provider seems more probable than losing control of a domain you paid for.


Eh, it happens. I pay over $200/year for my domain + GSuite. It’s a luxury I enjoy, but a luxury nonetheless.

If I couldn’t afford that anymore, I’d probably go back to Gmail. I’d probably think to switch over many of my email addresses (for services that I use), but some might slip through the cracks when the domain expires.


How much does your domain cost? Most TLDs cost under $20, and GSuite costs like $60 a year... I pay like $60 for domain + mail combined (with fastmail)


My last month’s GSuite invoice was $17.63 (CAD). It has definitely increased in price, but it’s still the cheapest unlimited cloud drive. I think the cheaper plan only includes 30GB storage.


Yes, better to use Gmail and lose your whole account in a few months/years.


Yes, it's better that your account gets locked and everybody loses access to it than it is to allow someone else access it.


https://twitter.com/lrvick/status/1523827220023853057

If you're trying to protect people, why release an exploit straight to the public instead of responsibly disclosing?

Edit: Looks like it's already known, hard to know that from the tweet context.


It's not a new one. https://news.ycombinator.com/item?id=30403848 was exactly the same exploit.


The issue has been publicly known for years.

The more focus we can get, the better - hopefully NPM will add MFA at some point...



