A 7KB AWS lambda Node.js library with zero runtime dependencies (npmjs.com)
123 points by albertgao on Feb 15, 2020 | 59 comments



Not knowing this myself, could someone explain the attraction in using a web scripting framework for stuff like this? I get that this particular example makes the whole thing seem fairly straightforward, but have never really heard why people like adapting Node.js and similar to non-web environments beyond "you can". Is there anything more to it than that?


Main strengths:

* The ecosystem is pretty strong and constantly evolving. Lots of companies and individuals are putting a serious amount of work into making high-quality tooling and libraries. (Although with that also comes a lot of lower-quality stuff, and it's sometimes hard to tell the two apart)

* The language isn't owned by any single company and is well specified

* TypeScript provides a powerful and flexible type system with advanced, dependent-type-like features. It lets you gradually evolve from a messy JS prototype to well-organized code with fairly strict type checking

* Fastest dynamic language. The performance is also quite comparable to statically typed GCed languages. (For example, you can probably get to 50%-90% of Java performance on most single-threaded workloads)

* wasm support in the runtime is a promising escape hatch for integration with other languages, should that become needed. (Similarly not owned by a single company and well-specified)

* sharing code and types between client and server, including interfaces, validation and data models.

* particularly well suited to handling heterogeneous structured data due to how cheap it is to define new object types (even in typescript)

* Async-IO-first (in fact, async-only IO for the most part)

Main weaknesses:

* Poor standard library with an anemic set of classes, an anemic set of implementable protocols/interfaces (more stuff like Symbol.asyncIterator needed) and a lack of convenience functions. E.g. see https://api.dart.dev/stable/2.7.1/dart-async/Stream-class.ht... and compare with... well, nothing in the language! The official node streams have a terrible API.

* Combining lack of both protocols and standard library leads to pretty bad userspace library fragmentation. (What is the go-to stream library?)

* Restricted data sharing between threads (only SharedArrayBuffer) makes it quite limited for multi-threaded problems.

* (mainly TypeScript) - Insufficient reflection capabilities

* Rigid (non-configurable) node_modules resolution algorithm accepted as standard limits flexibility when organizing projects

Mythical weaknesses that don't matter that much:

* Implicit conversions - largely irrelevant since TypeScript

* DOM/Browser related weirdness - often attributed to the language but actually problems with browser APIs
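To make the protocol point above concrete: `Symbol.asyncIterator` is the hook that `for await` consumes. The following is only a minimal sketch; `countTo` and `collect` are hypothetical names invented for this example, not part of any library.

```javascript
// Hypothetical helper: an object implementing the async iteration protocol.
function countTo(limit) {
  return {
    [Symbol.asyncIterator]() {
      let current = 0;
      return {
        // next() returns a promise of { value, done }, as the protocol requires.
        async next() {
          if (current >= limit) return { value: undefined, done: true };
          current += 1;
          return { value: current, done: false };
        },
      };
    },
  };
}

// Hypothetical helper: drain any async iterable into an array.
async function collect(iterable) {
  const items = [];
  for await (const item of iterable) items.push(item);
  return items;
}

collect(countTo(3)).then((items) => console.log(items)); // logs [ 1, 2, 3 ]
```

Anything beyond this (map, filter, buffering) has to be written by hand or pulled from a userspace library, which is where the fragmentation complaint comes from.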


> The performance is also quite comparable to statically typed GCed languages. (For example, you can probably get to 50%-90% of Java performance on most single-threaded workloads)

Citation needed, please. I disagree with this. I recently was able to achieve a massive speedup in a Node application through writing a native C++ module... and implementing Garbage Collection is required for N-API.



Excellent source; thanks!

It’s probably worth clarifying that Node is ⅓ the speed of Java in some of these cases and that the Node implementations use both fixed size typed arrays as well as worker threads, features that aren’t common practice in most Node programs.
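For readers who have not used those two features, here is a rough sketch of what they look like together, assuming a Node version that ships `worker_threads`. `sumInWorker` is a hypothetical helper, and the worker body is inlined with `eval: true` only to keep the example in a single file; it is not taken from the benchmark implementations.

```javascript
const { Worker } = require("worker_threads");

// Source of the worker, inlined as a string so the sketch fits in one file.
const workerSource = `
  const { parentPort, workerData } = require("worker_threads");
  const view = new Float64Array(workerData); // views the same memory as the parent
  let total = 0;
  for (let i = 0; i < view.length; i++) total += view[i];
  parentPort.postMessage(total);
`;

// Hypothetical helper: sum a shared Float64Array on another thread.
function sumInWorker(sharedBuffer) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: sharedBuffer });
    worker.once("message", resolve);
    worker.once("error", reject);
  });
}

// A SharedArrayBuffer is shared with the worker rather than copied.
const shared = new SharedArrayBuffer(4 * Float64Array.BYTES_PER_ELEMENT);
new Float64Array(shared).set([1, 2, 3, 4]);

sumInWorker(shared).then((total) => console.log(total)); // logs 10
```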


I wanted to illustrate the fairly "wide dynamic range" available. A lot of typed, GCed languages require significant effort to write highly optimized code anyway, and the optimized code rarely looks like the idiomatic one.

But even with idiomatic code the performance is quite good - the JITs are very high quality.


I didn't downvote you – in fact, I upvoted you. But I should also make the point that worker threads do not support importing native Node modules that were built directly on the V8 API. For example, most popular sqlite libraries (better-sqlite3, sqlite3) are not usable with worker threads.

I'm mulling contributing N-API support to the libraries... but I haven't even done any research or planning work yet.


An interesting blog post on how that might be done: https://mrale.ph/blog/2018/02/03/maybe-you-dont-need-rust-to...


That was amazing to re-read, thanks for sharing. The news.yc discussion of TFA from back in the day is a pretty good read, too: https://news.ycombinator.com/item?id=16413917


For those situations where the GC becomes a performance (or more likely memory usage) problem, it seems reasonable to expect that the built-in WASM support of most engines will be able to provide adequate performance (it's already getting quite close): https://www.usenix.org/conference/atc19/presentation/jangda


Anything CPU-bound is a very bad fit for node. But most web services are IO-bound, and Node is excellent for them.


> But most web services are IO-bound

Node's still relatively slow for those workloads.

https://www.techempower.com/benchmarks/#section=data-r18&hw=...

See how far down in each section you have to scroll to find node, even for workloads that are purely "accept a request and respond with a static string". You'll see lots of Java and Go on your way down.

And most services will have far more compute than just shoving bytes in between services. There's request parsing, response encoding, usually at least a tiny bit of data manipulation.


This benchmark mainly measures the performance of the HTTP parsing and database libraries, often in a suboptimal default configuration. In node land, they're admittedly not amazing, but nothing about the language prevents them from being better.

For example, for a long time, the only reason that node was slow on these benchmarks was the built-in URL parser. Replacing it with this carefully written module https://www.npmjs.com/package/fast-url-parser resulted in 2x improvements on the benchmark. I haven't looked closely at the situation nowadays, but I imagine it's still quite similar, with lots of low-hanging fruit lying around, stalled due to backward-compatibility concerns.

For proof find "es4x" in the benchmark list, which basically replaces the entire stack of HTTP parsing and database libraries with the ones from vert.x and runs JS on Graal, even though Graal is currently at least 2.5x slower than V8 in terms of JS performance: https://github.com/graalvm/graaljs/issues/74

Node core (and the libraries around it) has unfortunately stalled in the "good enough" zone for quite a while. The good bit is that they stay in the good-enough zone after adding your own code.


I think your point is actually just proving something important that I ended my initial post with - that there is almost never a "pure IO" workload, but that compute is actually an extremely important part of any service. Given node's concurrency model it's even more important, as compute can block other operations.
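The blocking effect is easy to observe. The sketch below is a made-up measurement, not something from the benchmark: `measureTimerDelay` is a hypothetical helper that schedules a zero-delay timer and then does synchronous busy work, so the timer cannot fire until the work is finished.

```javascript
// Hypothetical helper: report how late a 0 ms timer fires when the same
// thread is kept busy with synchronous work for blockMs milliseconds.
function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const scheduledAt = Date.now();
    setTimeout(() => resolve(Date.now() - scheduledAt), 0);

    // Stand-in for CPU-bound work (parsing, encoding, data manipulation).
    const end = Date.now() + blockMs;
    while (Date.now() < end) {
      // busy loop: nothing else on the event loop can run meanwhile
    }
  });
}

measureTimerDelay(50).then((delay) => {
  console.log(`timer asked for 0 ms, fired after ${delay} ms`);
});
```

Every other pending request in the same process waits in the same way the timer does.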


I had another glance at the framework benchmark, and the quality is absolutely dreadful

I removed the unnecessary middleware bloat (pug html renderer middleware for an API server, really? body-parser and form parser even for endpoints where it's not being used?) and switched to standard pg instead of pg-promise (standard pg also supports promises, pg-promise hasn't been needed for quite a while now)

The performance went from 600 req/s to 5500 req/s on the db benchmark, a 9x improvement with 10 minutes of work. I think that's a pretty damning result for the quality of the TechEmpower framework benchmarks, at least when it comes to node. This is just standard libraries and practices, not even hacks like replacing the built-in URL parser with fast-url-parser.


Submit that to them then?


It might be a good idea, although I fear that if I only fix one node framework and keep the rest intact it will create a false impression that that framework is somehow amazing.

Still better than a false impression that nodejs is somehow slow.

They really need some QC though.


Maybe we'll have to disagree because I'm still pretty sure nodejs is somehow slow.


You don't have to trust me, here is a diff you can apply of the work I did: https://gist.github.com/spion/2779ae6dc9552c229c1eeacd90c03b...

you can run ./tfb --test express-postgres and compare.

If you can't wait for all the tests to complete, a representative one can be obtained more quickly by running

./tfb --test express-postgres --type db --query-levels 10 --concurrency-levels 128


I agree, although for server side rendering of web frontends, it is still the best choice even though producing vdom and rendering it to string is usually cpu bound.


> (mainly TypeScript) - Insufficient reflection capabilities

You can always use the compiler api to extract type information. Sure, a bit tricky but doable.

The compiler api exposes a lot of awesome things; I'm surprised there aren't more tools that use it.


With respect to AWS Lambda specifically, a real consideration is how fast the context loads. In the absence of warming schemes, any performance gains you get from using Java, for example, might be outweighed by the cost of loading a much larger (in size) context. In general, the zip for a Lambda implemented in Nodejs is going to be considerably smaller than the zip for one implemented in Java.


NodeJs, in particular, outperforms all other language runtimes in terms of AWS Lambda cold-start time (<200ms), and is second-best for warm starts (<11ms) behind Python (<9ms).

https://levelup.gitconnected.com/aws-lambda-cold-start-langu...


It’s handy having a single language and toolset for programming. There’s less friction when switching between codebases. I’ve been paid to program in a whole lot of languages over my career, and personally preferred the environments where I didn’t have to switch languages all the time when working in different layers of the stack.

These days, I’m mostly paid to write web applications. So, I default to JavaScript. That said, there are plenty of problems where JavaScript is such a poor fit that I’ll reach for something else.


Deno comes with TypeScript out of the box.


It's a Turing-complete, general-purpose programming language. What should stop you using it for anything?


Node is especially good at handling async stuff. It’s in the JavaScript DNA, whether through callbacks, Promises or async/await. As many backend apps are basically tying together 3rd-party and 1st-party APIs, it’s actually a very logical choice.


Kind of silly in a lambda though, as lambdas do not run concurrently. A bash script would suffice.


Lambda for NodeJs doesn't / won't run multiple requests in the same process but different V8 contexts at the same time?

That sounds wasteful, and imo makes Cloudflare's Serverless tech superior for strictly network-io bound workloads. Lambda, to be fair, supports way more event triggers and all sorts of runtime and user-space constructs, but still manages warm start times <10ms, which is really impressive.


Cloudflare uses the V8 Isolates feature, and they build a lot of the API backends to follow the WebWorker spec. It's very efficient and well-suited for logic running in network calls at the CDN edge, but limited to Node/JS code. [1]

AWS Lambda uses their Firecracker micro-vm tech which supports more runtimes, environments and triggers than just Node and also runs container workloads. [2]

1. https://www.infoq.com/presentations/cloudflare-v8/

2. https://firecracker-microvm.github.io/


But you can still have concurrency while handling a single invocation.


I’m new at this, and would love feedback if I’m wrong, but I think that a lambda instance could be reused if the system had need. As a result the best practice is to closure global-like data in the handler function itself. This would then get passed down through the layers much like Golang’s context.
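A minimal sketch of that idea, assuming (as AWS documents) that a warm instance keeps module-level state between invocations. `handler`, `loadProfile` and the fields of `ctx` are hypothetical names; `awsRequestId` is the request ID property on the Lambda context object.

```javascript
// Module scope: initialised once per instance and kept across warm invocations.
let invocationsOnThisInstance = 0;

// Hypothetical lower layer: it only sees what is passed in explicitly.
async function loadProfile(ctx, userId) {
  return { userId, requestId: ctx.requestId };
}

// Per-invocation data is created inside the handler and passed down the layers.
async function handler(event, lambdaContext) {
  invocationsOnThisInstance += 1;
  const ctx = { requestId: lambdaContext.awsRequestId, startedAt: Date.now() };
  const profile = await loadProfile(ctx, event.userId);
  return { profile, invocationsOnThisInstance };
}

exports.handler = handler;
```

Calling `handler` twice in the same process shows the split: the counter keeps growing, while each call gets its own `ctx`.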


I have many that call a handful of backend services, handle errors, process the data, and send it to the client.

Very async.
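As a rough sketch of what such a function can look like, assuming a runtime with `Promise.allSettled`: `fetchUser`, `fetchOrders` and `aggregateHandler` are hypothetical stand-ins, and the second backend is made to fail on purpose to show the error path.

```javascript
// Hypothetical backends; in a real function these would be HTTP or SDK calls.
const fetchUser = async (id) => ({ id, name: "Ada" });
const fetchOrders = async () => {
  throw new Error("orders service down");
};

async function aggregateHandler(event) {
  // Both calls are in flight at once, even within a single invocation.
  const [user, orders] = await Promise.allSettled([
    fetchUser(event.userId),
    fetchOrders(event.userId),
  ]);

  // The user lookup is treated as mandatory in this sketch.
  if (user.status === "rejected") {
    return { statusCode: 502, body: JSON.stringify({ error: "user lookup failed" }) };
  }

  // The orders backend is treated as optional: fall back to an empty list.
  const body = {
    user: user.value,
    orders: orders.status === "fulfilled" ? orders.value : [],
  };
  return { statusCode: 200, body: JSON.stringify(body) };
}
```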


I find that to be a mess compared to other languages like C# that have much more solid and clean async/await implementations.


A big part of it is "because you can". I will say, that in general, and acknowledging I have a solid JS understanding, doing projects with node tends to be considerably less friction, because it's so easy to work around friction.

With npm, there are lots of options... generally if one package doesn't provide an interface I like, or performs poorly, or just has odd dependencies it shouldn't need, there's another that's probably closer to what I want. Worst case, if there's a smaller change, I fork the project on github, update to a scoped name, and publish my fork.

It's not always pretty. That said, I tend to be considerably more productive with it.

I've also worked a lot with C#, and am recently learning Rust... I feel the first version of most things should be done in a scripted language, and JS/Node is just as valid as any other option in the space.


IMO, Node's superpower is not really in the language but in the NPM ecosystem. If that is not attractive to you (maybe you develop the whole system by yourself, or need fine-tuned performance), all Node has to offer is a ubiquitous language, and that's about it.


People overlook the fact that JS is a pretty nice modern language in its own right. "Because you can" is pointlessly dismissive and indicates to me that one still thinks JS is nothing more than a language to toggle a CSS class on <h1> because they haven't used it since people used the word DHTML.

- async-everything, not a blocking language where anything async relegates you to a subecosystem like you have it in basically every other language from Java to Rust to Python.

- simple stdlib Promise that everything uses. no fragmentation between competing byob futures and byob abstractions.

- async/await.

- single-threaded making it ideal for i/o workloads, a crawler being the perfect example.

For these reasons I think it's one of the best languages. Certainly didn't used to be this way.


> “ JS is a pretty nice modern language“

Are we talking about the same JS here? Because the JS I know is a dumpster fire of a language.


Also just the fact that JS runtimes have had a lot of hard work put into them to make them extremely fast (motivated by the fact that JS is so widely used on the web).


I think it's partly "you can", but surprisingly enough I also feel that node's single-threaded architecture makes code surprisingly easy to reason about.


Golang is about as far from single threaded as you can get, and I find it much easier to understand than Node. I actually switched from Node to Go on a Greenfield project, and I find myself much more productive now, and when things need to be done in parallel, Go is much easier to read, write, and maintain.

Edit: to respond to the GP, I think sharing a language between front end and back end can enable people on either side to go full stack with more ease. That's a benefit, I guess, but not one I considered of particular value.


I’ve seen people think they’re reasoning about it when in fact they are not. I was on a project where the developers had been Node pros for years. They were flummoxed when they couldn’t introduce transactions because they didn’t close over the database connection. When I said they had to hold the same connection through all the method calls, I was told I was wrong and that a global pool would be able to handle that without some external tracking mechanism.
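A sketch of what holding the same connection looks like in code. The pool here is a fake invented for illustration (its `connect`/`query`/`release` methods are loosely modelled on the shape of node-postgres's pool), and `withTransaction` is a hypothetical helper, not code from that project.

```javascript
// Hypothetical stand-in for a connection pool: each connect() hands out a
// distinct client that records the statements it was asked to run.
function createFakePool() {
  const clients = [];
  return {
    clients,
    async connect() {
      const log = [];
      const client = {
        id: clients.length + 1,
        log,
        released: false,
        async query(sql) {
          log.push(sql);
        },
        release() {
          client.released = true;
        },
      };
      clients.push(client);
      return client;
    },
  };
}

// Every statement of the transaction goes through the one client that is
// passed to `work`, instead of being sent to whatever the pool picks next.
async function withTransaction(pool, work) {
  const client = await pool.connect();
  try {
    await client.query("BEGIN");
    const result = await work(client);
    await client.query("COMMIT");
    return result;
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  } finally {
    client.release();
  }
}
```

If the inner calls went back to the pool for each query, `BEGIN` and the later statements could land on different connections, and the transaction would not cover them.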


Until you have to deal with async/callbacks/promises and mixing those to deal with various library dependencies.

I would love to see JavaScript turn into a less verbose, more consistent and less golf-oriented language.


I don't really see how JS could be less verbose in its concurrency-management side.

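    // Promise.map (with its concurrency option) is Bluebird's API;
    // the built-in Promise has no map method.
    const Promise = require('bluebird')

    const result = somePromise() // keeps running while the pages are crawled
    const pages = await Promise.map(urls, url => crawlUrl(url), {
      concurrency: 4
    })
    const allResults = await Promise.all([result, pages])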
is some of the simplest concurrency code there is. You can keep chucking in more async logic and it doesn't get much more difficult to understand and doesn't introduce much more code.

If you want verbosity, look at Go's equivalent (wait groups) or, god forbid, any concurrency management in Swift.

The only problem you run into with callbacks imo are event-emitters like streams which you need to actually understand. Though I don't think this is any more trivial in other languages with evented/callback APIs like Java and Rust, I think reasoning about event callbacks is always harder for humans but a useful construct and necessary evil in evented code.
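For anyone who wants the concurrency-limited map from the snippet above without an extra dependency, it can be approximated with the built-in Promise. This is only a sketch: `mapWithConcurrency` is a hypothetical helper, not a standard or Bluebird function.

```javascript
// Hypothetical helper: map over items with at most `concurrency` mapper
// calls in flight, using only the standard Promise.
async function mapWithConcurrency(items, mapper, concurrency) {
  const results = new Array(items.length);
  let nextIndex = 0;

  // Each worker keeps claiming the next unprocessed index until none remain.
  // No locking is needed because only one piece of JS runs at a time.
  async function worker() {
    while (nextIndex < items.length) {
      const index = nextIndex++;
      results[index] = await mapper(items[index], index);
    }
  }

  const workerCount = Math.min(concurrency, items.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results; // same order as the input
}
```

One difference to be aware of: if a mapper call rejects, the returned promise rejects right away, but calls already started by other workers are not cancelled.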


There is less friction when I write Node.js applications. The language and runtime keep evolving.

Sometimes I can remove dependencies and code because they are replaced by new features of the runtime, so maintenance becomes easier.


I use node.js in AWS Lambda because I run Clojurescript on top of it.

My application (https://operatr.io/) is Clojure (JVM / Back-end), Clojurescript (Browser / Front-end), Clojurescript (AWS Lambda) - there's enormous leverage in using one language in all cases.

I have at least one function that exists in all three environments, but more than that I have one common language for delivery.


I'm not sure if I understand your question. What do you mean by "web scripting framework"? NodeJS is a server-side runtime. It just so happens to use Javascript as the programming language. Its strength is handling async event-driven requests, so given that AWS Lambda's strength is existing within an event-driven architecture, it's a natural pairing.


> Not knowing this myself, could someone explain the attraction in using a web scripting framework for stuff like this?

Node is not a web scripting framework, though the fact that JS is used as the main language for web front end is a big part of the attraction; as it lets frontend and backend share code and be served by the same language competency.


It's mainly economic. There are plenty of JavaScript web developers. Now management can take them off the web silo and put them to work anywhere.


If you’re looking for more full-featured middleware functionality targeted at Lambda, also check out Middy:

https://github.com/middyjs/middy


Someone HAS to start flagging these sorts of posts.


@armatav why should this post be flagged? The recommendation was very helpful to me and lets me compare two solutions.


Sorry, why?


I'm sorry but why can't you just call a few functions in your normal handler and then just return the normal type of object that Lambda wants?

This all seems to be pretty useless indirection to me.


@ilaksh the middleware pattern is a common pattern in many large frameworks like ExpressJS and Ruby on Rails. It allows you to cleanly handle the request and response cycle. From a code maintenance perspective, using a well-established pattern that keeps your codebase modular and extensible is desirable.

If you have one or two lambda functions, then definitely this may be overkill, like you said just call a few functions. But as your system matures and grows, these few functions become a few modules, then your few modules become libraries, so on and so forth. You might want to consider this pattern before it's unclear what exactly is interacting with your requests and responses and in what order.
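To show what the pattern amounts to, here is a minimal sketch. `compose`, `parseJsonBody`, `handleErrors` and `greet` are hypothetical names for illustration only; this is not Middy's API or the API of the submitted library.

```javascript
// Hypothetical composition helper: wraps a handler with middlewares,
// with the first middleware in the list ending up outermost.
function compose(middlewares, handler) {
  return middlewares.reduceRight((next, middleware) => middleware(next), handler);
}

// Middleware: parse a JSON body before the handler sees the event.
const parseJsonBody = (next) => async (event, context) => {
  const body = event.body ? JSON.parse(event.body) : {};
  return next({ ...event, body }, context);
};

// Middleware: turn thrown errors into a 500 response.
const handleErrors = (next) => async (event, context) => {
  try {
    return await next(event, context);
  } catch (err) {
    return { statusCode: 500, body: JSON.stringify({ error: err.message }) };
  }
};

// The business logic stays a plain function returning the usual response shape.
const greet = async (event) => {
  if (!event.body.name) throw new Error("name is required");
  return { statusCode: 200, body: JSON.stringify({ hello: event.body.name }) };
};

const composedHandler = compose([handleErrors, parseJsonBody], greet);
```

The order in which requests and responses are touched is visible in one place, the array passed to `compose`, which is the maintenance argument made above.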


It wasn't obvious so I checked: zero runtime dependencies, since it's TypeScript and the @types/aws-lambda stuff is not needed in the resulting compiled JS. Still, I'm not sure what this does beyond until functions for events you receive in Lambda and would otherwise have to parse raw yourself?


I meant util functions. Silly mobile HN clients, bah!


I created and use this tiny helper: http://g14n.info/aws-lambda-res/

I think it is quite enough for my needs; it is nice to use no framework other than the one provided by AWS.


Thank you for this library, I am looking forward to reviewing your work. I didn't want to use a framework (like hapi or express) to achieve something similar.



