Announcing Go Support for AWS Lambda (amazon.com)
399 points by jitl on Jan 16, 2018 | 97 comments



What's the current state of the art for "I know I use AWS for my 9999-server production instance, but how can I test the majority of my services offline on an airplane w/o spending $199.99/mo to keep a fleet of test AWS servers/services at the ready"?

Last I looked, the "fake-aws" options, specifically "lambda", supported Python but not JS/Node, which didn't match my use case. Go is now on the radar as a usable programming language for this purpose, but what are the chances of also being able to use it with local test deployments?


I'm using this, and so far it's working just fine:

https://github.com/awslabs/aws-sam-local

I'm only testing Lambda functions with it, and not really testing APIG + Lambda integrations or anything fancier.

I prep the local db, mock a Lambda event, write the mocked event to a JSON file, then use command line options to run the event against AWS SAM local, then get the response and compare it to the expected response.
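
FWIW, the same mock-event-in, expected-response-out loop also works in-process with plain go test before reaching for SAM local. A rough sketch (the handler under test, event shape, and file path are made up for illustration):

    package main

    import (
        "context"
        "encoding/json"
        "io/ioutil"
        "reflect"
        "testing"

        "github.com/aws/aws-lambda-go/events"
    )

    func TestHandler(t *testing.T) {
        // Load the mocked Lambda event from a JSON fixture.
        raw, err := ioutil.ReadFile("testdata/mocked_event.json")
        if err != nil {
            t.Fatal(err)
        }
        var event events.APIGatewayProxyRequest
        if err := json.Unmarshal(raw, &event); err != nil {
            t.Fatal(err)
        }
        // Invoke the handler directly, then compare to the expected response.
        got, err := handler(context.Background(), event)
        if err != nil {
            t.Fatal(err)
        }
        want := events.APIGatewayProxyResponse{StatusCode: 200, Body: `{"ok":true}`}
        if !reflect.DeepEqual(got, want) {
            t.Errorf("got %+v, want %+v", got, want)
        }
    }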


I haven't used it yet but I've heard good things about localstack[0]. Originally it was in the Atlassian GitHub organization[1] but they moved it over to its own at some point.

0. https://github.com/localstack/localstack 1. https://github.com/atlassian/localstack/


When computing on an airplane you may already be in the cloud.


Unit tests are all you should really need. Mock the AWS services and make sure to assert, where you can, the arguments being sent to those services and you should have enough coverage to get the job done.
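
With the Go SDK, the usual pattern is to depend on the generated service interface rather than the concrete client, so a test can swap in a mock that records the arguments. A minimal sketch (the type names here are mine):

    package store

    import (
        "bytes"

        "github.com/aws/aws-sdk-go/aws"
        "github.com/aws/aws-sdk-go/service/s3"
        "github.com/aws/aws-sdk-go/service/s3/s3iface"
    )

    // Uploader depends on the SDK's interface, not the concrete
    // client, so tests can substitute a mock.
    type Uploader struct {
        S3 s3iface.S3API
    }

    func (u *Uploader) Save(bucket, key string, data []byte) error {
        _, err := u.S3.PutObject(&s3.PutObjectInput{
            Bucket: aws.String(bucket),
            Key:    aws.String(key),
            Body:   bytes.NewReader(data),
        })
        return err
    }

    // In tests: embed the interface, record the input, assert on it.
    type mockS3 struct {
        s3iface.S3API
        got *s3.PutObjectInput
    }

    func (m *mockS3) PutObject(in *s3.PutObjectInput) (*s3.PutObjectOutput, error) {
        m.got = in
        return &s3.PutObjectOutput{}, nil
    }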


Do you need to keep them running constantly? The beauty of platforms like AWS, GCP, ... is that provisioning capacity is just an API call away - so could you not have some way of provisioning those resources on demand to run the necessary tests?

(If you can't get internet on your plane then you might not be able to run tests, but that's the case with the full test suite on a fair few systems anyway)
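
For instance, a throwaway CloudFormation stack is only a couple of SDK calls. A hedged sketch with the Go SDK (the stack name and template are placeholders, and real code would wait for CREATE_COMPLETE before running tests):

    package main

    import (
        "log"

        "github.com/aws/aws-sdk-go/aws"
        "github.com/aws/aws-sdk-go/aws/session"
        "github.com/aws/aws-sdk-go/service/cloudformation"
    )

    const testTemplate = `{"Resources":{"TestBucket":{"Type":"AWS::S3::Bucket"}}}`

    func main() {
        cf := cloudformation.New(session.Must(session.NewSession()))
        name := aws.String("throwaway-test-stack")
        // Provision the test resources on demand...
        if _, err := cf.CreateStack(&cloudformation.CreateStackInput{
            StackName:    name,
            TemplateBody: aws.String(testTemplate),
        }); err != nil {
            log.Fatal(err)
        }
        // ... wait for CREATE_COMPLETE, run the test suite ...
        // ... then tear everything down again.
        if _, err := cf.DeleteStack(&cloudformation.DeleteStackInput{StackName: name}); err != nil {
            log.Fatal(err)
        }
    }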


The services are incredibly useful, incredibly powerful, but from a code + dev hygiene perspective it's safer to be able to completely test your code in isolation prior to deployment.

Many people are saying "just mock the AWS parts" or "provision from Amazon on demand", but there should be common mocks, or ideally "toy" implementations, which can be run locally to verify that the production system works how you expect before deploying to it.

Basically, I'm uncomfortable making AWS a hard dependency if I can't get something like it running locally for a "100 record use case".


It would be nice if AWS provided mocks, but the reality is that there is more than one lib for creating stubs and mocks in every language. Additionally, I would assume that AWS wants to steer clear of making those kinds of "what lib do I use for x" decisions for the developer.


Yeah, that's probably the worst part of 'serverless computing.'

If using the serverless framework, the serverless-offline plugin[0] supports JS/Node but not Python.

[0] https://github.com/dherault/serverless-offline


I guess I'm confused as to how the support of one compiled language differs from any other compiled language; couldn't you just replace "Go" with "Rust" at this point? I understand how Lambda would have to support scripting languages (NodeJS, Python, etc.) with built-in runtimes and such, but if you are going to compile down to a binary object anyway, why does it matter what language the binary object was written in?

Does it have to do with supporting libraries being pre-installed on the Lambda image or something? Couldn't they build a Lambda image that handles any language that can interface to common "C" headers?


The answer lies in understanding how Lambda truly works. Lambda does not start a new copy of your program every time the function is called. The real entry point to the program is somewhere else, hidden from customers. That program starts up and then waits for requests like a web server. On every request, it invokes your lambda function. It literally invokes the function; it does not spin up a new instance of the program. The program Lambda spawns keeps running until no new requests come in for a certain (undocumented) amount of time. This is also why the first invocation of a lambda function after a long idle period is slower.

Also, there are still other things to take care of: online code editor support, making sure everything works and there are no bugs, support for compiling programs edited directly in the browser, etc. Lambda support also means the language has access to its execution context through an API, and there is no better way to provide that than to implement it in the same language and just pass an object to your handler.

Lambda functions in reality are not one-off CLI commands that boot and terminate fast. They actually spin up a long-running service, and the same instance handles multiple requests until it is killed off for being inactive too long. Having Lambda support means AWS implementing a long-running server that integrates with your code, handles things like errors, exceptions, and logging, and provides you with an API to the execution context.
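
With the official Go library, that model is visible in the API: lambda.Start never returns, and the same process serves many invocations. A minimal sketch:

    package main

    import (
        "context"
        "fmt"

        "github.com/aws/aws-lambda-go/lambda"
    )

    type Event struct {
        Name string `json:"name"`
    }

    // Called once per request by the already-running runtime process.
    func handler(ctx context.Context, e Event) (string, error) {
        return fmt.Sprintf("Hello, %s!", e.Name), nil
    }

    func main() {
        // Blocks and serves invocations until the container is reaped;
        // not a one-shot CLI run.
        lambda.Start(handler)
    }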


Thanks for explanation, it's somehow clearer to me, now. There's one thing I still don't understand: doesn't this mean it's basically just a small server app with a standardized API (as in, the API must be implemented in a certain way which allows calling its endpoints as if we were calling functions directly)? What is the specificity of lambda? (Besides the pricing style, paying for requests rather than app execution time.)


Yes, in a way. You can also use it to run code on data in S3 or other aws services.

For example I need to munge some data somewhat regularly, which is stored as json on S3. I can trigger it, and have about a thousand processes running in a few seconds, and my large chunk of json processed very quickly. I don't need to maintain any systems for this to work and it's cheap enough it's not worth me worrying about.

Though S3 Select might supersede a bunch of my use cases.


Oh, I see: it's not just that it's a simple app with payment per endpoint access, it's that it can spawn tons of this app very quickly.

Am I correct to assume the ideal use case for that is to replace what is typically implemented as background workers for web services?


I've not used it much outside my own area, but yes, I think that's a very good use case, particularly if there's a "process some files" part. The classic example is image thumbnails: you can point a very simple bit of ImageMagick-calling code at a bucket and say "any time a file is dropped in here, create a thumbnail and put it over here", and it'll scale when you have a bunch of images dropped in and cost nothing when nothing is happening.
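
With the new Go runtime that skeleton might look something like this (the makeThumbnail helper is hypothetical; real code would download the object, shell out to ImageMagick, and upload the result):

    package main

    import (
        "context"
        "log"

        "github.com/aws/aws-lambda-go/events"
        "github.com/aws/aws-lambda-go/lambda"
    )

    // makeThumbnail is a hypothetical helper: fetch the object,
    // resize it, and write the thumbnail to the output bucket.
    func makeThumbnail(ctx context.Context, bucket, key string) error {
        return nil // stubbed for the sketch
    }

    func handler(ctx context.Context, evt events.S3Event) error {
        for _, rec := range evt.Records {
            bucket := rec.S3.Bucket.Name
            key := rec.S3.Object.Key
            log.Printf("new object: s3://%s/%s", bucket, key)
            if err := makeThumbnail(ctx, bucket, key); err != nil {
                return err
            }
        }
        return nil
    }

    func main() { lambda.Start(handler) }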

Not perfect for all use cases, although some people do seem to enjoy running as much stuff in lambda as possible, but when it matches what you need it can be a surprisingly simple solution to quite annoying problems.


Yes, you are right. That's almost it. There are even some open source function-as-a-service platforms out there that do this. The advantage with Lambda is that it is dealt with by Amazon, so it is virtually infinitely scalable. Also, it integrates tightly with a ton of AWS services, which makes it a great option if you are already invested in AWS.


I see, thanks for confirming this. I was initially skeptical about Lambda when it was first released, thinking "so after Heroku, which makes us pay per process, now we must pay per function call?", but I know better than to trust my doubts about a new tech. I've seen too many people focus on the parts of a new tech that resemble an old one, totally missing what is new.

Would you agree with this statement? => what sets Lambda apart is that it boots so fast that it's legit to start one just to call one function, which in turn, thanks to AWS infrastructure, allows you to massively parallelize short-lived tasks.


Yes, I'd agree with that, but to me Lambda is much more than that. It's not only suitable for very short-running tasks; in fact, it is suitable for everything other than very long-running tasks. We can build proper APIs and web services on top of Lambda and literally only pay for the exact resources we consume. We never have to worry about managing infrastructure, handling auto-scaling, etc. AWS recently also introduced a "serverless" database offering that likewise spins up when you need it and then goes to sleep. It can scale "infinitely" as well.

I see the majority of my workloads implemented on top of Lambda + ECS/Fargate/Kubernetes in the near future: Lambda for all short-running jobs (1-5 mins) like image processing, emails, notifications, web services, APIs, glue code, and state machines, and ECS for long-running tasks like data processing, DB syncs, backups, video encoding, etc.


> Thanks for explanation, it's somehow clearer to me, now.

Seems weird to completely disarm your "thanks" with "somehow".


He probably meant somewhat.

English is hard.


Indeed, TIL they are not synonymous :) (non native speaker)


So... anything that can be contorted into using Go's calling convention for that one API entry point and embedding something that looks like a symbol table could do the thing?


Probably. That's exactly what some folks did to use the Python runtime to run Go before it was officially supported: https://github.com/eawsy/aws-lambda-go


You can actually sort of do what you are asking. Before official language support, it was common to use Node and simply call out to a binary compiled for Amazon Linux. That does work. However, official support means that the Lambda system can call the specific binary directly, and that libraries are available to expose the data structures easily. There is nothing preventing you from calling a Rust program from the Node runtime, for instance, if you wanted to.


Your code is wrapped so you do not have to deal with networking and serialization. Your code must only satisfy the interface and feels native to the language of choice.


Rust support could be good, but would that mean a lot of work?

I mean, a lot of features that are natively supported in Go require crates or hand-rolled code in Rust; maybe that's why Go got supported first.


The reason is that Go is more mainstream.


And Go has supported cross compilation from the very beginning. Cross compiling in Rust has not been historically that good or accessible for beginners.


Xargo isn't that bad, and neither is using a Docker container to build. Plus, a lot of teams generate their build artifacts in CI environments that are already Linux. So there's a bunch of pretty easy ways to get Rust code running in Lambda.

Honestly the reason why official Rust support may never make sense is that it's so easy to use Neon to build native Node modules with Rust. Not one line of JS is needed and the impedance between the two languages is minimal. Go would require more hacks and copying thanks to both it and Node having their own GC. Basically Rust is flexible enough that official support isn't necessary whereas Go requires official support to avoid ugly hacks.

FWIW, I've never used Lambda to do what I'm suggesting above, but I've done it with GCP's equivalent and it works great.


> why official Rust support may never make sense is that it's so easy

I suspect the only thing Amazon will care about here is consumer demand, and potentially the ability of consumers to move from other platforms to theirs (e.g. if Google cloud supports <thing> they are more likely to do so in order to convince people to move).

Rust is a cool language, but its use in HTTP services is not as widespread as Go's, and there are a bunch of other languages I'd expect first (PHP, Ruby?).


I would assume that the Lambda team is more likely to fill in gaps in support for things that already have official SDKs before starting to officially support something that the rest of AWS isn't yet working on.


First Python 3, now Go, fantastic. Plus that X-Ray support looks great.

The only thing that surprised me is that the Body struct element is a String, not an io.Reader. Can we not stream large bodies through API Gateway into a Lambda? Does it always read the whole thing before passing it onto my Lambda? Or what's the reason behind the String? I haven't tried this yet, so I'm curious.
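
From reading the generated types, it looks like you get the whole body as an already-buffered string and have to recover a reader yourself, something like this sketch:

    package main

    import (
        "context"
        "encoding/base64"
        "fmt"
        "io"
        "io/ioutil"
        "strings"

        "github.com/aws/aws-lambda-go/events"
        "github.com/aws/aws-lambda-go/lambda"
    )

    func handler(ctx context.Context, req events.APIGatewayProxyRequest) (events.APIGatewayProxyResponse, error) {
        // Rewrap the string body as a stream after the fact.
        var body io.Reader = strings.NewReader(req.Body)
        if req.IsBase64Encoded {
            body = base64.NewDecoder(base64.StdEncoding, body)
        }
        n, err := io.Copy(ioutil.Discard, body) // stand-in for real processing
        if err != nil {
            return events.APIGatewayProxyResponse{StatusCode: 400}, err
        }
        return events.APIGatewayProxyResponse{StatusCode: 200, Body: fmt.Sprintf("read %d bytes", n)}, nil
    }

    func main() { lambda.Start(handler) }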


The maximum payload is 6MB for synchronous invocations and 128KB for asynchronous invocations.

More info here: https://docs.aws.amazon.com/lambda/latest/dg/limits.html


I'm still wondering why Lambda doesn't simply support Docker containers. That would put an end to all these "please support $my_favorite_language on Lambda" requests.


"simply" is one of those words that makes people twitch... See, the packaging format is not the problem. The protocol is where it gets tricky.

I previously imagined Go support would arrive with a JSON on stdin/out/err protocol. Instead, your binary runs as a service with RPC endpoints. Unfortunately, it's a bit Go-oriented, so you'd have to implement some Go things (go/rpc, gobs) in your language/framework of choice to use it.

But, you get a (much) lower warm function startup cost because it's just an RPC call, all of which is neatly abstracted away for the Go programmer in the runtime library. Great for Go, bit of a bummer for those of us hoping for something more agnostic.

Yes, Lambda could support "Docker containers", but the more interesting problem to solve is the runtime model and protocol.
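
For the curious, the protocol is small enough to poke at by hand. As I read the library source, the runtime starts a net/rpc (gob) server on the port given in _LAMBDA_SERVER_PORT, and the platform calls Function.Ping / Function.Invoke against it. A rough client-side sketch (treat the type and method names as my reading of the source, not documentation):

    package main

    import (
        "fmt"
        "log"
        "net/rpc"

        "github.com/aws/aws-lambda-go/lambda/messages"
    )

    func main() {
        // Port comes from _LAMBDA_SERVER_PORT in the real environment;
        // hard-coded here for the sketch.
        client, err := rpc.Dial("tcp", "localhost:8001")
        if err != nil {
            log.Fatal(err)
        }
        req := &messages.InvokeRequest{
            RequestId: "test-1",
            Payload:   []byte(`{"name":"world"}`),
        }
        var resp messages.InvokeResponse
        if err := client.Call("Function.Invoke", req, &resp); err != nil {
            log.Fatal(err)
        }
        if resp.Error != nil {
            log.Fatalf("function error: %v", resp.Error.Message)
        }
        fmt.Println(string(resp.Payload))
    }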


> bit of a bummer for those of us hoping for something more agnostic.

If you can imitate the go/rpc protocol, you can run any binary from any language on this, including LLVM-based systems.

A version of this is already in play in https://github.com/apex/up


You are looking for a different product: ECS+Fargate. That does exactly this. The difference is that with ECS+Fargate you are in charge of building the entire application from scratch. With Lambda you just implement one function, the "service" part of your app, and all the plumbing to support things like X-Ray, logging, error handling, and hot and cold boots is handled by AWS.


I think Lambda should do this. Lambda seems like a more managed version of the concepts in inetd + CGI: like inetd, Lambda will start a server, if needed, to handle a new connection, and like CGI, Lambda passes request data to user application code in a standardized way. You could implement a Lambda-like service using inetd + CGI to run functions in Docker containers, and then destroy the container once you get the result.

Spawning a process in a container is expensive. Not only do you need to wait for the container to init and mount its filesystems, but you then need to wait for the inner application to boot on every request, on top of that. So, allow containers to linger with applications booted, like Lambda does to save on those starts.

Since we aren't launching a new process for every call, we can't use CGI with its environment variables, etc, so we'll have to describe a new protocol for calling your functions, like... HTTP! The interface could be POST /call with JSON params.

I'm not sure how this model would differ semantically from the fully-managed, in-process system that Lambda uses. The management system can always kill containers that go over their memory limits or seem to be doing suspicious amounts of system calls when they're not executing a function. AWS can still go about providing in-process libraries for "supported" languages that implement these protocols, but if they also describe these protocols, then the community can go ahead and develop libraries for other languages that could eventually be adopted and officially supported.

In fact, the community is already producing libraries that use AWS's supported languages to allow hosting lambdas in other languages, via a CGI-like model! Just look at https://github.com/apex/apex - a NodeJS shim will spawn a command in the language of your choice, and then return the result back to Lambda.
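
Sketched out, the hypothetical POST /call protocol could be as small as this (all names made up):

    package main

    import (
        "encoding/json"
        "log"
        "net/http"
    )

    // myFunction stands in for the user's code.
    func myFunction(params map[string]interface{}) (interface{}, error) {
        return map[string]string{"hello": "world"}, nil
    }

    func main() {
        // The container keeps this tiny server booted; the platform
        // would POST each invocation to /call with JSON params.
        http.HandleFunc("/call", func(w http.ResponseWriter, r *http.Request) {
            var params map[string]interface{}
            if err := json.NewDecoder(r.Body).Decode(&params); err != nil {
                http.Error(w, err.Error(), http.StatusBadRequest)
                return
            }
            result, err := myFunction(params)
            if err != nil {
                http.Error(w, err.Error(), http.StatusInternalServerError)
                return
            }
            json.NewEncoder(w).Encode(result)
        })
        log.Fatal(http.ListenAndServe(":8080", nil))
    }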


Why would a container be slow to create? It's effectively a fork with namespacing options. The overlay file system is just a product of one or more layers that are already computed; there's not much complexity to its mounting compared to something like ext4 or NFS. I haven't timed it, but containers seem damn fast to start with Docker and Kubernetes.


> I haven't timed it, but containers seem damn fast to start with Docker and Kubernetes.

You should time it then, it's incredibly slow these days

    $ time docker run busybox /bin/true
    .388s total

    $ time docker run --net=none busybox /bin/true
    .310s total
The above times, of over 300ms, are with docker 17.06 and the overlay2 driver on a powerful computer with an ssd.

It's horrifically slow because the entire docker codebase is disgustingly bloated.

It has to fork something like 10 processes and synchronize between them (for everything from network initialization to runc hooks). It has to make ipc calls from docker-cli to dockerd to containerd and back again, allocating and freeing large chunks of memory for grpc along the way.

It's incredibly slow.

On the other hand, just forking and running something with namespacing options is effectively instant:

    $ dir=$(mktemp -d)
    $ docker run --name bbexport busybox /bin/true
    $ cd $dir
    $ docker export bbexport | tar x
    $ time nsenter --root=$dir /bin/true
    0.000s total
This is the reason they don't let you use arbitrary docker containers; docker containers are a ton more than a fork and some options. They're a bloated moving target.

What Lambda does allows them to optimize much more. In fact, Lambda doesn't even have to fork a process sometimes, because they can keep e.g. the JavaScript runtime running and just feed a new request into the existing process directly. They can freeze and thaw environments from known good states for request processing.

But most importantly, by not being compatible in any way with docker, they don't have to deal with the bloated mess that is docker and the ecosystem around it.

They don't have to load 3 daemons weighing 50MB into memory to run 100KB of JavaScript (which they can currently run more quickly than Docker can even start the simplest container).


Even though a container starts "damn fast" in Docker, you're still adding significant latency to each call of your function if it requires a new container.

    $ time docker run alpine:3.6 sh -c exit
    docker run alpine:3.6 sh -c exit  0.02s user 0.02s system 3% cpu 1.191 total
    $ time sh -c exit
    sh -c exit  0.00s user 0.01s system 94% cpu 0.007 total
Then add to that the cost of loading your function's code into memory. For an interpreted language, that is probably going to take a while as the runtime does a bunch of system calls and parses text files. Even for a compiled, statically-linked language like Go, the executable (a random binary I picked was 50MB) needs to be copied into RAM.

For this reason, Lambda tries to keep your containers/function-wrapper-applications booted up in RAM: https://aws.amazon.com/blogs/compute/container-reuse-in-lamb...


True, but that's Docker. There's nothing, that I know of, that would make Linux containers inherently much slower than a fork. Amazon could conceivably also write their own container runtime that keeps the container ready to run without actually starting the app (though I'm sure there's a lot more to it; I don't know how Lambda is implemented).


> I'm still wondering why Lambda simply doesn't support Docker containers? Would put an end to all this requests like "Please support $my_favorite_language on Lambda"

Because they can present that as a separate service (Fargate) and bill more for it.

Fargate isn't exactly what you're describing, but it's a close match. It's in Amazon's best interests to charge people based on the amount of "DIY" they're willing to put up with, and that's why they offer a number of overlapping services at different price points.


Amazon does not launch a brand new container for every invocation of lambda, only cold starts. Supporting arbitrary Docker containers is definitely more costly.


Because it’s a function execution system, not an application execution system. If you need the latter, you can use ECS or K8S.


Fargate is probably the closest AWS offering.


This is also the direction the industry is going, regardless of cloud provider.


Isn't that what Fargate is, basically?


Docker executes long-running processes that communicate over HTTP/TCP. AWS has ECS and Fargate, and has announced K8S support coming soon.

Lambda is for executing individual functions, which implies that the execution environment must be language-aware (you could get away with JVM dynamic invoke or the .NET equivalent as well, maybe). The way parameters are passed in, the concurrency model, etc. are all language-specific.


And yet, the golang runtime for Lambda is a long* running process that communicates over TCP. :-)

It would be entirely possible to create a language agnostic Lambda runtime model, be it stdin/out/err, or TCP.

* relatively so


That's already around https://github.com/apex/up


Not as a wrapper, but an official Lambda runtime model. This Go release points to a much more interesting way of doing so (RPC) than what I had previously imagined (stdin/out/err).


Yeah, the problem with RPC is that there's no one RPC to rule them all, like there is with JSON or HTTP. They'll have to build RPC wrappers for each language/VM, and choose the most popular RPC package if there isn't one in the stdlib.


Couldn't they pick an existing, established cross-language RPC system like Thrift or gRPC?


I don't know about the internals, but my guess is one of the following:

1) There's something about the Lambda platform that just doesn't lend itself well to customer-provided containers. A lot of people would at least have issues keeping image size reasonable.

2) They plan to add this. AWS is already investing in serverless container services; take a look at Fargate.


Like many people have stated here, Fargate will give you this. Lambda is actually a container system under the hood.


It seems that they benefit from lock-in, whether it's deliberate or not.


There is no lock-in... you can write an agnostic function that runs on both AWS Lambda and Azure Functions.


I'm aware of tools like Serverless. They help remove the lock-in in some situations. That doesn't mean there's no lock-in to speak of at all.


In regards to C# functions, there is no lock-in to AWS: it's a piece of code, and you can run that function on AWS Lambda, in an ASP.NET website, as an Azure Function, or in a console app...

It's literally a method that takes a single parameter class and returns a single class...

Even if you're using Node, Java, Go, or Python: other than the packaging, how is there any lock-in?


Aren't you spending resources setting up your CI/CD pipeline /deployment tooling to talk to Amazon's APIs?

Unless all of the providers have the same API to deploy functions, they are all trying to lock you in.

So not lock-in at the code level (unless your serverless app talks to S3 or whatever Amazon services), but lock-in of your whole devops process.


So first it's all "Oh you're locked into IIS" and "You're locked into Windows", then "You're locked into AWS" now it's all "You're locked into your CI/CD pipeline"

It sounds like, no matter what you do, you're locked in. Using Docker? Ah, you're locked into Docker. Wrote your app in PHP? Oh, you're locked into PHP now.

I just don't understand this whole 'lock-in' mindset, it's just absurd now.


>I just don't understand this whole 'lock-in' mindset, it's just absurd now.

If you don't understand it, then it's probably not absurd.

Look into the whole free software movement and the history behind it (e.g. the evils of Microsoft in the 90s).

If a vendor can change a price on something that forces you to pay up or lose your system, you are locked in.

This isn't just some open source ideology without impacts either. It should be immediately obvious to anyone with any business acumen why making yourself beholden to the decisions of a single vendor is a bad position to be in.


The vital difference being that Docker and PHP aren't subscription-based, and aren't controlled by a single corporation.


Replace Docker and PHP with anything and the same is true tho.

If you write your application in any language, or use any framework or library, you're locked in, by today's standards.


The difference is that you can't be screwed by a single company deciding to change prices. When people refer to "lock-in", they are referring to being able to be screwed by a single vendor.

Nobody cares if they are locked into x86, tcp/ip, and http. If you don't understand this difference as an engineer, you end up causing a big risk to your company (or clients) without even knowing it.


Oh, so now it's paying for something that's lock-in. This lock-in sure keeps changing.


If you look at the amount of work Serverless has to do, packaging can generate quite an amount of lock-in. It is hardly any lock-in in the simple cases of grabbing a single function and deploying it where you want.


The architecture of lambda may prevent this.


> The architecture of lambda may prevent this.

It doesn't; you've always been able to use any language you want, by embedding it within the runtime of a supported language. This is just announcing proper, official support for it instead of having to rely on a workaround.


Lambda's execution model is language-specific runtimes being fed events off a queue through a single well-defined RPC or RPC-like entry point, with the process frozen or discarded upon queue exhaustion.

So the architecture of Lambda may indeed be a severe mismatch for running arbitrary containers.


Go seems like a great fit for this application.

Would love to see stats comparing cold start times etc. for go vs. java/python/js. One would guess that it would be faster, but measurement trumps guessing.


Absolutely. I've done some work with Lambda, and the cold start times for Java and C# were horrendous. Node and Python were pretty fast. I would expect Go to be more the latter than the former, with the added advantage of running really fast as well (and not being JavaScript), so my advice to my former colleagues on implementation language for a Lambda-based solution would be different if that pans out...


In my case, when I gave Java a bigger allocation of RAM I got far better results (it usually got even faster than the equivalent Python/Node.js code).

At the time at least (I don't know about now; it's been a long while since I touched a lambda function) it seemed that a bigger RAM share meant more CPU, which was not written anywhere in the documentation. And JVM startup times with crappy CPUs are indeed horrible. :)


I think it mostly depends on whether it's always executed from source, or whether it's compiled and stored in fast access storage. If the latter, binary size becomes the main limiting factor (for pretty much all of those applications). Java is generally kinda slow to start up, but that may be due to all the frameworks. Python / JS IIRC are interpreted as scripts, not compiled.

Mind you the go compiler is also pretty damn fast, almost script language speed.


I created one yesterday when this announcement was made. I had to upload my compiled binary, not source.


For Go and/or C# Lambda support, would it be worth even running the garbage collector, or just allocating a block of memory and cleaning it up when the function ends?

Side note: I think that should be an option for web servers as well in languages with managed memory. Light, isolated, non-threaded API endpoints shouldn't be interrupted by garbage collection.
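
In Go you can approximate this today with runtime/debug; a sketch (whether it actually wins anything depends entirely on the allocation profile):

    package main

    import (
        "runtime"
        "runtime/debug"
    )

    // withGCSuspended runs f with the automatic collector turned off,
    // then does one explicit collection and restores the old setting.
    func withGCSuspended(f func()) {
        old := debug.SetGCPercent(-1) // -1 disables automatic GC
        defer func() {
            runtime.GC()            // single collection at the end
            debug.SetGCPercent(old) // re-enable for idle time
        }()
        f()
    }

    func main() {
        withGCSuspended(func() {
            // handle one request/invocation here
            _ = make([]byte, 1<<20)
        })
    }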


We do this during request execution in our Ruby services. We tell the GC to pre-allocate a few GBs of memory, and then explicitly suspend GC until the end of the request. Then we do a single GC run.


Very interesting! Have you experienced improvements from this approach?


Absolutely. It's a critical part of our runtime configuration, but we're hardly the first people to think of this approach.

I remember reading some discussion in an HN thread about FinTech developers who run Java with 100GB+ heaps and no garbage collector, and then reboot the application after the markets close. I can't find that specific thread, but I did find this which has a few nuggets on the same subject: https://news.ycombinator.com/item?id=6131786


While that does happen sometimes, where I've worked it is more common to write in a garbage-less style (basically not allocating, or pooling everything), since JIT time matters at startup, especially for certain classes of strategies.

I've seen applications run and never GC after initial startup until they get bounced for an update. Another problem with the big heap is that you still have TLAB issues if you keep allocating.

There is a new-ish PoC collector called the null collector that literally does nothing, not even, I believe, instrumentation of write barriers.

We always see idiomatic programs benched against each other, but I'd really like to see high-performance pure Java against high-performance pure Go and others. Some of the low-latency Java tricks we use I'm not sure can even be copied in other HLLs (not including C/C++).



That's it. I was looking all over for it. Thanks.


> Very interesting!

Yep. There's a proposal by a couple of folks to get this implemented in the Golang runtime. The feature is aptly named 'Request Oriented GC'.

https://news.ycombinator.com/item?id=11969740


Not needed for Go (IMHO) since any delay caused by GC pausing (1ms? 7ms?) will drown in the time it takes to make a network call.

https://making.pusher.com/golangs-real-time-gc-in-theory-and...


Generally an interesting idea, but Lambda containers are re-used across requests, so it might not be a good thing to do in this case. If there were no reuse, and they figured out sub-millisecond cold starts, they could do this and essentially create a disposable server for each request.

Ironically, I think that's what CGI did :D


This is an interesting question. I think it depends on the situation. You pay for memory, so if you are getting near the cap during the execution of your function, then managing the memory as you go may be beneficial; otherwise I don't see why you wouldn't just ignore it. I do have an open question whether a memory leak would affect subsequent runs of the function after it has been "warmed up": https://serverless.com/blog/keep-your-lambdas-warm/.


Lambda containers can run for hours at a time. It doesn't just fork a new process for every request:

https://docs.aws.amazon.com/lambda/latest/dg/running-lambda-...
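
Easy to see for yourself with the new Go runtime: package-level state survives warm invocations, and so would a leak. A minimal sketch:

    package main

    import (
        "context"
        "fmt"

        "github.com/aws/aws-lambda-go/lambda"
    )

    // Package-level state lives as long as the container does.
    var invocations int

    func handler(ctx context.Context) (string, error) {
        invocations++ // keeps counting across warm invocations
        return fmt.Sprintf("invocation %d in this container", invocations), nil
    }

    func main() { lambda.Start(handler) }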


Most lambdas would be reused many times; use-once-and-kill-the-VM is not the only plan...


I think that's arena allocation, and that's what compact regions do in Haskell.


I wish they'd do the same for Ruby... hopefully Ruby is next!


What about Ruby.. ?


An interesting service this lambda is. Amazon wants to bill people money for basically hosting xinetd scripts. I wonder how much money they make from it.


They make a lot of money, and I am only too happy to pay for what they give me. Countless hours, or months, of my life saved.


What kind of things do you run on Lambda?



The link there pointed to a GitHub repo with an almost entirely empty README, so I posted the official blog post with examples and a full walkthrough.



