In a direct reply to OP:
There is definitely a lot that goes into learning serverless tech like AWS Lambda.
I lead Developer Advocacy for Serverless at AWS and we're constantly looking at ways to help streamline and simplify this process for you. You can find a lot of information on a site we've launched to roll up lots of resources: https://serverlessland.com .
Right now the best dev process is a mix of using local emulation via AWS SAM CLI (http://s12d.com/sam) or tools like localstack. But it's true that you'll never be able to completely mimic all the various services and their functionality.
Would love to know more of what left you feeling underwhelmed by SAM. It has almost all the functionality of CloudFormation, which is quite a lot.
Thanks,
- Chris Munns - Lead of Serverless DA @ AWS
That is not at all what my words say, and I won't reply to that thread, which was started by a former competitor to troll this convo today.
The perceived lock-in is really no different than consuming other technologies. You make a trade-off on what you want to manage vs. handoff to a managed service. For many customers the benefits are well worth it.
It's fine that Lambda wasn't for you, but you aren't being clear here about what issues you saw, just waving the lock-in boogeyman that so many misunderstand.
Cost, lack of debugging capabilities, terrible developer tools, cryptic documentation that misses some key scenarios, I can go on. Now, I get it, you’re at AWS so you would never openly and bluntly come out and say that what you -really- want is to lock in your users, because that brings AWS money. But that’s the reality. See my earlier comment as to why it made sense for our company to move away from Lambda.
What even is your point here and in this thread? Acquiring customers and giving them an incentive to stay is a cornerstone of any enterprise.
Saying Lambda is bad because of lock-in is like saying their VPC offering is bad because of lock-in, or IAM is bad because of lock-in. A lambda is not a generic component that you can flip between providers; it usually responds to a specific event from an AWS service, and nobody really gives a rat's about deploying their specific lambda to a cloud they don't use.
So yeah, it seems like a real awesome idea to avoid vendor lock-in for a small Python function that responds to S3 change events from an SQS queue and updates a DynamoDB table with some values (sketched below).
If you want "generic" lambdas, go check out serverless.com.
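To be concrete about how little there is to "port" here, a function like that is typically just a few lines. A rough Python sketch (the handler wiring, table name, and item fields are made up for illustration):

    import json
    import boto3

    # Hypothetical table name, just for the example
    table = boto3.resource("dynamodb").Table("object-metadata")

    def handler(event, context):
        # Each SQS record's body carries the S3 event notification as a JSON string
        for record in event["Records"]:
            s3_event = json.loads(record["body"])
            for s3_record in s3_event.get("Records", []):
                bucket = s3_record["s3"]["bucket"]["name"]
                obj = s3_record["s3"]["object"]
                table.put_item(Item={
                    "pk": f"{bucket}/{obj['key']}",
                    "size": obj.get("size", 0),
                })
        return {"processed": len(event["Records"])}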
Enough that in 2019 it was the most popular topic at re:Invent (our big user conference), and today, per our re:Invent announcement, almost half of all new compute workloads in Amazon are based on it. Pretty heavily used across different industries and verticals.
Is there any potential for extending that limit? I work on a product that uses Fargate Spot as a kind-of lambda substitute to run longer-duration tasks consumed from SQS and being able to use lambda to do that would make life easier :)
Instances take longer to start up; Lambda processes requests in milliseconds or less. Lambda automatically manages a pool for you.
If you're running a predictable process and you know how long it'll take in advance, an instance may make sense. If you're running an unpredictable process, where you won't know how long it'll take until it's done, and it might be quite fast, the low startup time and fine-granularity billing help.
That's me. We've got some fun things we do behind the scenes to keep Lambda container image support snappy. So yes, up to 10 GB artifacts with container image support.
Not entirely. This is something we've been thinking about for years now, and it's really about tooling and developer workflow. That is where the biggest benefit is for this.
Hey everyone, we're really excited about this feature launch, and I wanted to come in to clarify any misconceptions.
With this capability you can now package Lambda functions using familiar container image tools (Dockerfile, CLI tools, build systems), but you still need to code them for the event model, have a handler, etc.
It's a big improvement, but it's not "run any container in Lambda".
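In other words, even inside a container image your code still exposes a handler that takes an event. A minimal sketch in Python (the file and handler names here are just placeholders):

    # app.py -- the image's CMD still points at a handler, e.g. "app.handler"
    def handler(event, context):
        # "event" is whatever invoked you (an API Gateway proxy event, an SQS
        # batch, an S3 notification, ...), not a raw HTTP request on a port.
        name = event.get("name", "world")
        return {"message": f"hello {name}"}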
Either way, hope you go and try it out and we'd love feedback on how it can be better.
- Chris Munns, Lead of Dev Advocacy - Serverless@AWS
My wishlist for this feature (and for AWS Lambda in general):
- Don't make me build the Docker image myself. I'd like to be able to deploy lambdas by sending you a Dockerfile and having you build it for me, with as few additional steps as possible.
- Make it as easy as possible to run containers that speak HTTP, like Google Cloud Run does. Having a wrapper that translates the AWS Lambda runtime into HTTP calls, maybe even as a proxy I can run inside my own container, would help a lot here.
- Make it as easy as possible for me to host my lambda as the "root" URL of a domain or subdomain that I own - without those nasty /api/ path prefixes.
Zeit Now (the Docker-based predecessor to Vercel) did this the best in my opinion - they had an HTTP API that you could POST a Dockerfile to and they would build it into an image and then start a container listening on an HTTP port, with an HTTPS proxy in front of it, running on a subdomain that I could easily customize.
I would LOVE to see AWS Lambda offer an equivalent developer experience. I'm super-excited about container image support but I'll admit I am absolutely dreading the several hours (to several days) it's going to take me to figure out how to actually get my code running on it.
One of the difficulties with this is that Dockerfiles are usually not very reproducible, mostly due to the design of the distributions they are based on. It could be very surprising to send the same Dockerfile to the endpoint twice and get very different runtime results.
I'd expect that I would send a Dockerfile once and have that compiled into a lambda for me, which would then run in perpetuity unless I sent over a fresh Dockerfile to overwrite that deployment, so non-reproducible Dockerfiles wouldn't bother me.
I'm OK with developers needing to understand that a Dockerfile may build differently each time if they don't take extra steps to prevent that (like pinning installed versions etc).
This doesn't seem to me like something that should be a core part of the lambda product, but maybe some sort of simplified build pipeline service that has Lambda as a target.
> but you still need to code them for the event model, have a handler, etc.
> It's a big improvement, but it's not "run any container in Lambda".
Well that’s... incredibly disappointing and makes this announcement much less exciting.
The entire point of containerization is portability across different services and platforms. It seems like a massive miss for the team to tout "container support" but then still require platform-specific customizations within the container. I pretty much don't care at all for this as-is.
Are there plans for “run any container on Lambda”? That’s what everyone was (falsely) excited about.
Your reading of "it's not 'run any container in Lambda'" may be a bit too pessimistic. From what I'm seeing, you can run any container (<10 GB), but it just has to implement the Lambda Runtime API[1]. You can't run a random container and expect Lambda to know how it should communicate with the world.
As others have noted, ECS or Fargate would be more appropriate for cases that fall outside the Lambda event model.
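The Runtime API is just a small HTTP interface exposed locally inside the execution environment that your container polls. A bare-bones loop looks roughly like this (Python sketch, error handling omitted):

    import json
    import os
    import urllib.request

    # The local Runtime API endpoint is provided via this environment variable
    base = "http://{}/2018-06-01/runtime/invocation".format(
        os.environ["AWS_LAMBDA_RUNTIME_API"])

    while True:
        # Long-poll for the next event
        with urllib.request.urlopen(base + "/next") as resp:
            request_id = resp.headers["Lambda-Runtime-Aws-Request-Id"]
            event = json.loads(resp.read())

        result = {"echo": event}  # the actual work happens here

        # Post the result back for this specific request id
        urllib.request.urlopen(urllib.request.Request(
            base + "/" + request_id + "/response",
            data=json.dumps(result).encode(),
            method="POST"))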
> you can run any container (<10 GB), but it just has to implement the Lambda Runtime API[1]
So in other words, you can’t run any container.
Again, the entire point of containers is portability across execution environments. If I have to build special containers specifically for Lambda because they require these special runtimes, that defeats the entire point.
> You can't run a random container and expect Lambda to know how it should communicate with the world.
Google Cloud Run, which everyone keeps comparing this with, works exactly like that. Upload any random container, tell it which port to open, and bam... running application. You don’t have to mess around with adding any “Cloud Run runtimes” or modifying your code to add special “Cloud Run handlers”. Because that would be silly.
It's not really a very good comparison to be honest, because Lambdas integrate with a whole bunch of AWS services that send them events that aren't HTTP requests via a port.
I had exactly the same thought you're expressing here when I first built a Lambda to serve as an HTTP API manually, after previously using Azure a tiny bit. In Azure you write their equivalent function, and one of the built-in trigger options is HTTP: you enable that and immediately get a URL. Lambdas are quite close to that now, I think, if you're using the console UI, but you used to have to go through API Gateway and set everything up manually.
But the point is that Lambdas aren't HTTP APIs that listen on a single port. They receive events that can be proxied HTTP requests via API Gateway, or messages on queues, or completely arbitrary invocations from step functions, or notifications that look nothing like HTTP from S3 buckets. I'd like the Cloud Run model when my lambda is acting like an HTTP API, but I don't think it'd be much fun to have to treat every AWS event as a full HTTP request for everything else.
I'm not very familiar with Google Cloud, but I think their equivalent is Cloud Functions which looks very similar to Lambda's pre-container model: https://cloud.google.com/functions/docs/writing
I think your integration comment is spot on but might be an apples and oranges thing. On GCP everything is an HTTP request with an id_token in the header and event in the body, almost surprisingly so. As a result services like Cloud Functions and Cloud Run containers are quite literally just generic HTTP handlers. I suspect the integration feel of Lambda, which I agree with, might be a consequence of inter-service communication on AWS being relatively proprietary, or if that's not the right word, custom? Maybe the loose feeling of GCP is because everything is an HTTP request? It's almost like all of GCP runs on a built-in HTTP API Gateway setup by default. Honestly most of the time I feel like I'm living in HTTP-Request-land anyway, so not having to context switch when working with the platform is kind of nice sometimes.
Thanks for mentioning that, it definitely sounds like you can get a lot further with an HTTP handler in Google Cloud than you can in AWS. I can definitely see why avoiding that context switch is nice. I can also imagine it making things like metadata a lot easier if they use common and well-documented headers, where in AWS every service will inevitably have its own way of putting stuff into the event. Everything in AWS feeling custom would be a pretty polite and fair way to put it IMO!
Do Google represent a JSON body HTTP event in a clean way? In AWS, when a lambda is acting as an HTTP API, it receives the body as a string inside the JSON representing the request, which you then have to JSON decode. Because not everything a Lambda can act on is JSON, and your API doesn't have to be either, what else can it really do, I guess. But it does make it a bit horrible to generate that data to test with - we actually have a small script we use locally to convert a nicely formatted JSON file with test data into their HTTP-request-as-JSON format, because it's so annoying to work with a large encoded JSON body. It's a small thing, but it means when you switch to working with a plain AWS event it's one less thing to think about. And I guess even if you're in HTTP-Request-Land and everything is nice JSON, it's still all going to be custom between services from there on by necessity.
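To give an idea of the shape of that conversion (not our exact script; the field values here are made up), it's basically just wrapping the payload and stringifying the body:

    import json
    import sys

    # Read a nicely formatted JSON test payload and wrap it in the
    # API-Gateway-proxy-style event shape that the lambda actually receives.
    with open(sys.argv[1]) as f:
        payload = json.load(f)

    event = {
        "httpMethod": "POST",
        "path": "/",
        "headers": {"content-type": "application/json"},
        "isBase64Encoded": False,
        "body": json.dumps(payload),  # the annoying bit: the body is a string
    }

    json.dump(event, sys.stdout, indent=2)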
You can run any container. It just wouldn't run as you expect if you don't implement the Lambda Runtime API.
> Google Cloud Run, which everyone keeps comparing this with, works exactly like that.
Everyone does keep comparing this to GCR, wrongly. Not just because Lambda is not "bring any container" whereas GCR is, but more importantly because GCR and Lambda have very different operational models. GCR is a serverless platform for hosting web servers. Lambda is a serverless platform for event handling. The latter is a super-set of the former and thus requires more specific tools.
Yes, theoretically there could be a setup whereby you expose a port, and some lambda intermediary translates the invocation payload to an HTTP message your container can read. But I am endlessly fascinated why anyone would want that. I laugh a bit at how we've hit peak AWS, where people here are legitimately asking for an HTTP request to hit an ALB, the ALB translates it into a lambda payload object, then a lambda component re-translates that payload back into an HTTP request, so your application can translate it into an object. Do you understand how insane that is?
The only tangible advantage Lambda has over Fargate is scale-to-zero. I think saving $7 per month is a pretty bad trade-off for the insane performance overhead and complexity a true docker-in-lambda solution would necessitate, and it would thus be unattractive to most consumers.
1. Sorry you are disappointed by this.
2. This is why I wanted to post here, to make it really clear. Andy only got to spend a few seconds on this and couldn't get into all the nuances. The launch post does, and we'll have more posts over the next 3 weeks just on this topic.
Containerization solves a few things. One big one was the container image as a packaging model. As customers struggle with dependency management or installing native packages (RPMs), we were basically faced with reinventing Dockerfile... or just using Dockerfile. This is an oversimplification of what's happening, but that was the initial spark of this, many many cycles ago.
I think this is still going to be really valuable for folks, but what you are looking for already exists I'd say, and is Fargate.
Btw, people shouldn't be downvoting this, it's all very valid.
> but what you are looking for already exists I'd say, and is Fargate.
IIUC though, Fargate doesn't have "scale to zero" like Lambda and API Gateway. Then again, IMO, scale to zero and the associated cold starts probably aren't the best fit for handling HTTP requests that are waiting for an answer right now.
Cloud Run on GCP is the “run any container” solution that this isn’t. It scales to zero, responds immediately to an incoming http request and can handle up to 50 concurrent requests out of the one invocation for no additional cost.
> but what you are looking for already exists I'd say, and is Fargate.
No, Fargate isn’t that at all. Google Cloud Run is what you meant to say.
This is of course still valuable to allow container-based workflows to adapt to Lambda, but it really seems like AWS missed the mark on identifying why people really want to use containers. Just one look at the number of people on HN threads or on Reddit excited about using "arbitrary containers on Lambda" should tell you what people really wanted - and now they have to come away disappointed.
One thing I'd like to see is buildpacks for that final function contract. It's been done before for other cloud-y respond-y things (we did it on Project riff, Google have done it for Cloud Run), so I am aware it's possible. The nice part is that you won't need to build all the buildpacks yourself -- just the small set that adds your specialisations.
Yeah, this is Day 1 for this and I think we've got a bunch of ideas to make it easier in the future. Most importantly we're looking for feedback just like this!
Thanks for coming here to answer questions. Much appreciated!
Does each new Lambda cold start pull the entire image from the repo? Or if I derive from Lambda base images am I likely to get some of the layers already cached?
I'm trying to think about how the data transfer costs are going to add up each time the Lambda is instantiated. I didn't take this into account with Fargate and got burned when trying to trigger images on demand in a similar way.
This could replace my whole SQS / CloudWatch Events / Fargate setup using Lambda if I can figure out cold start costs ($ not time).
There are a few things at play. Functions will still stay warm in between invocations and will keep local any data already in the worker. We also maintain a couple different levels of cache so as to not hit ECR often.
I know we've got a few blog posts coming out over the next couple weeks on this new feature, and each tells a few bits and pieces about the story.
Depending on volume you'll probably find that Lambda will be cheaper for that workload, especially with the new 1ms billing.
FWIW, a little experiment I just ran showed me that with simple layers the cold start time of my little 3MB Go app was <100ms; using the Docker image `amazon/aws-lambda-go:1` instead took ~1500ms.
- - - -
REPORT RequestId: f905d5fe-a64e-48c8-b1f2-6535640a6f82 Duration: 7.55 ms Billed Duration: 1309 ms Memory Size: 256 MB Max Memory Used: 49 MB Init Duration: 1301.10 ms
- - - -
REPORT RequestId: 89afb20d-bc49-4d89-91f0-f1ef62ac99aa Duration: 12.20 ms Billed Duration: 13 ms Memory Size: 256 MB Max Memory Used: 35 MB Init Duration: 85.37 ms
Just wanted to say thanks for this new feature. The ability to use images up to 10 GB is huge. Being able to customize the container image down to the base OS is also nice. Don't let the negativity around the Lambda-specific bits get you down. I know that Lambda is about more than just serving HTTP, and I for one plan to use this new feature for a non-HTTP use case soon.
I do have a question. I know that Lambda normally reuses a running container for multiple consecutive function invocations. What if I don't want to do that for a particular function? Suppose, for security, I don't want any leftovers from a previous invocation (in case it had data from a different user). Is there a way I could gracefully tell Lambda to create a fresh container instance for each invocation, and just live with the cold start penalty every time? Edit: I could just look to Fargate at this point, but it sounds like Lambda is doing some extra cold start optimization.
No good way to do this today. You are right in that you'd be forcing cold starts. You could use a Lambda Extension to provide some sort of after-processing cleanup of vars or /tmp space... but that's hypothetical; I haven't seen anyone do that yet.
I think what I'll do is write a container entry point that cleans up temporary files like you said, but also repeatedly spawns a new process for the main program, to minimize the findable leftover data in RAM. Just in case an attacker finds a Heartbleed equivalent in my application.
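Very roughly, I'm picturing an entry point along these lines (just a sketch; worker.py is a made-up script that reads the event on stdin and prints a JSON result on stdout):

    import os
    import shutil
    import subprocess
    import urllib.request

    base = "http://{}/2018-06-01/runtime/invocation".format(
        os.environ["AWS_LAMBDA_RUNTIME_API"])

    while True:
        # Pull the next event from the Runtime API
        with urllib.request.urlopen(base + "/next") as resp:
            request_id = resp.headers["Lambda-Runtime-Aws-Request-Id"]
            event = resp.read()

        # Fresh interpreter per event, so little survives in RAM between users
        result = subprocess.run(["python", "worker.py"], input=event,
                                capture_output=True, check=True).stdout

        # Best-effort scrub of anything the worker left in /tmp
        for entry in os.listdir("/tmp"):
            path = os.path.join("/tmp", entry)
            if os.path.isdir(path):
                shutil.rmtree(path, ignore_errors=True)
            else:
                os.remove(path)

        # Report the result for this invocation
        urllib.request.urlopen(urllib.request.Request(
            base + "/" + request_id + "/response", data=result, method="POST"))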
Just a heads up: we're trying to do that with Fargate right now, and there's a limit to how quickly Fargate can spin up new instances (like max 10 at a time); it's not well designed for single-execution-per-message right now, at least at a decent load.
If we push an image built on an AWS base image, do we need to constantly update, rebuild, and redeploy it, or do you handle base image updates (e.g. security patches) behind the scenes like normal Lambda?
So, apparently I am crazy for thinking that running zip is trivial, whereas the ridiculous stack of tooling that surrounds the awkward nested format of Docker containers is extremely annoying :/ ... With the new ability to accept large containers, can I now send you a larger zip, or, if I want a larger Lambda function, am I going to have to use the Docker container interface?
Why does the Lambda container have to pull events from the Lambda runtime using an HTTP GET request, instead of having the runtime push the events with an HTTP POST?
This is all based on how the Runtime API worked already (it pre-dates this launch by 2 years or so). We didn't want to change too many bits.
Since function code has no listening socket/port, as it were, you need something (like the bootstrap script in a custom runtime) to poll the local interface for the Runtime API. That interface runs on the underlying worker and communicates with the API for Lambda.
How would the runtime know for sure when the container was ready to accept the request? The lambda model--with the container pulling the events instead of them being pushed to it--seems like the correct way to model the concurrency.
I know it's not necessarily a container-related thing, but one thing that frustrates me a lot, and why deploying takes longer than I expect, is that zipping up and uploading a package is a bunch of unnecessary clicks. I just want to "npm install" a few third-party libraries in the online IDE. (repl.it has a nice experience here.) It's frustrating to shuffle between my desktop, AWS, and back again.
Yes, those docs will ship either today or sometime this week. They are doing everything in batches, but I've seen the SAM support pre-launch (which would require it).
Same as Lambda had before. The state you have is most likely short-lived, as we reap worker environments every so often, but in between invokes you could persist some data in memory or /tmp space.
> ...we really want you to be able to pay for what you use.
Cloudflare Workers has the right pricing model. They only charge for CPU time and not wall time. They also do not charge for bandwidth.
> Lots of sub 100ms workloads...
AWS Lambda (or Lambda at Edge), as it stands, is 10x more expensive for sub-50ms workloads (Workers does allow up to 100ms for the 99.9th percentile) that can fit in 128MB RAM.
That's because keeping track of request state is not free. Ask an edge router. If you have a request open, even though it's not doing CPU, that request has to be in a queue somewhere, tracked for a response that can be transmitted back.
I don't know the infra costs of operating lambda, but my guess is that it's far from CPU-dominated.
I would not be surprised if the Cloudflare pricing model is making a tradeoff to make CPU-bound workloads pay for more of the infra than the rest. It's a valid trade-off to make as a business offering, and it might be feasible given the mixture of workloads. Whether it's the right way is debatable. Whether this model can be tanked by an army of actors taking advantage of CPU-insensitive pricing remains to be seen, or is an acceptable risk that you can take (which you can observe and protect against).
Yet, if you're a Cloudflare user, all of your edges are there - so it doesn't matter. We use Workers extensively for "edge"-related things. Lambda, never - but for working with S3 buckets, sure. They feel similar, but differently specialized.
They're not easily comparable (I tried using Cloudflare Workers before going back to AWS). Lambda@Edge runs Node or Python. Cloudflare Workers runs V8 with "worker isolates" which has a few more caveats, an imperfect but improving dev experience, and doesn't work with a lot of npm packages.
What would be really useful for my use case (running browser tests on a schedule) is if Cloudflare workers actually supported running full headless chromium automation in addition to just V8 isolates. Right now I'm using puppeteer/playwright + Lambda, but would love to have more options.
Workers aren't the same as lambdas; they are super slim JS environments. At a 50ms max runtime most browsers won't even start, let alone fetch and process a page.
No, to be clear, I'm saying you are comparing things that are way more different than our friends at Cloudflare would like you to think. They aren't brought up in any of the convos I have with customers.
It's a quick Google: 128MB max memory, 6 concurrent outgoing connections max, 1MB code size limit. The use case here is a subset of what AWS Lambda can handle. The supported languages also differ (only things that have a JS / wasm conversion for Cloudflare Workers).
I haven't looked deeply, so please correct me if I'm wrong, but I understand there's also restrictions on the built-in APIs available [1] and npm packages supported for NodeJS.
I would assume some of the above contributes to the price difference.
It isn't about the products, it is about the pricing model in a similar market.
Second, for sub-50ms workloads [0], Workers is absolutely a superior solution to API Gateway + Lambda or CloudFront + Lambda at Edge if the workloads can fit in 128MB RAM and package/compile to 1MB JavaScript or WASM executables, in terms of cost, speed, latency, ease of development, etc.
[0] For Workers, 50ms is all CPU time, and that is definitely not the case with Lambda, which may even charge you for the time it takes to set up the runtime to run the code, plus time spent doing network IO, and bandwidth and RAM and vCPUs and whatnot.
Based. "That's just an edge case. Our customers love this service!"
It's like going to a restaurant that uses bottled water instead of tap water, and they don't provide an answer as to what the benefits of bottled water are.
But you're telling us that Lambda's prices are justifiably higher because of the strong vendor lock-in? AWS is starting to sound more like Oracle. Ironic. :)
Besides the fact that Cloudflare's part of the Bandwidth Alliance with GCP and other infrastructure providers from which AWS is conspicuously absent, Cloudflare's also slowly but surely building a portfolio of cloud services.
Lambda's pricing is indeed higher than Cloudflare Workers for sub 50ms workloads (that fit 128MB RAM).
Cloudflare's alliance with other infrastructure providers means Cloudflare's platform isn't really limited to "API" workloads. This is discounting the fact that Cloudflare recently announced Workers Unlimited for workloads that need to run longer (up to 30 mins), though then they do charge for bandwidth.
The question here isn't the price change (which is in some sense mainly about balancing short functions and long functions, removing the penalty for short functions); it's where the pricing is at overall vs Cloudflare.
This comment would be much more useful if you gave some clear examples of the difference (presumably something you get on Lambda that makes it worth more per ms than Cloudflare).
>> AWS Lambda (or Lambda at Edge), as it stands, is 10x more expensive for sub 50ms workloads
Not sure about this; most use cases of Lambda use other resources and do not exist in a vacuum. Comparisons should be made using complete systems, not only parts.
Not if you're actually taking up that much cache storage, but bandwidth has plenty of examples of high usage on low tiers. They usually allow it as long as you're not adversely affecting the rest of the network, since the lines are already paid for (which is the right approach IMO).
Chris, while I've seen the change in my accounts on regular Lambda, I don't yet see it on Lambda@Edge. I think Lambda@Edge is the place where we'd benefit from this change the most, because many L@E scenarios take single-digit milliseconds, and the cost of L@E is 3x regular Lambda.
Any word on whether we'll also see this change on L@E billing?
Yes, to be clear this change was just for Lambda. L@E is honestly a completely different service run by a different part of AWS that just happens to share parts of our core worker platform. I am not 100% aware of when they might adjust their own pricing on this, but also couldn't share any roadmap here (sorry).
How does that even work? Lambda seems like a challenge even with the entirety of the datacenter resources to work with. Running it in constrained edge environments with a VM per function seems like black magic.
The naming is a bit of a misnomer: today L@E doesn't run at the edge (in our PoPs); when you deploy, it copies to every region, and then CloudFront routes you to the lowest-latency region for your request.
Okay, nice. And if I would like, say, 32 vCPUs? I have an application today with a huge degree of parallelism, but I'm utilizing an external cloud provider that offers dedicated machines with very affordable pricing. Would really like to use lambdas instead, though.
I would love to see this as well: having 96-vCPU Lambda instances (or instances that match the biggest C-family instance you have) would solve a lot of problems for me. The execution model of Lambda (start a runtime, handle requests, AWS handles pool management) feels much easier to use than managing a pool.
Someone from AWS once commented to me that "if you're ever having to manage a pool rather than letting us manage it, that's a gap in our services".
A lot of this was based around the fact that we've seen languages become just so much more performant. This includes Go/Rust/etc, but a lot of Node.js workloads are also sub 100ms, or fast enough that they'd benefit from this pretty well.