AWS Lambda function URLs: Built-in HTTPS endpoints (amazon.com)
277 points by vvoyer on April 6, 2022 | 173 comments



Very pleased by this addition! :-) Note that it creates special .on.aws URLs, so if you want to use your own domain to future-proof the endpoint (against linkrot if you ever leave AWS, say), you'll want to set up a redirect/proxy for yourself (whereas API Gateway does custom domains).

Also an interesting note from the docs about how said URL is generated: "Because this process is deterministic, it may be possible for anyone to retrieve your account ID from the <url-id>." I don't know how much of a problem this could be, but it's worth being aware of.
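
For onlookers, wiring one of these up is a couple of API calls. A minimal sketch with boto3 (function name, region, and auth mode are placeholder assumptions):

    import boto3

    client = boto3.client("lambda", region_name="us-east-1")

    # Attach a public (unauthenticated) URL to an existing function.
    resp = client.create_function_url_config(
        FunctionName="my-function",  # hypothetical function name
        AuthType="NONE",             # or "AWS_IAM" to require SigV4-signed requests
    )
    print(resp["FunctionUrl"])       # https://<url-id>.lambda-url.us-east-1.on.aws/

    # Public URLs also need a resource policy allowing unauthenticated invokes.
    client.add_permission(
        FunctionName="my-function",
        StatementId="AllowPublicFunctionUrl",
        Action="lambda:InvokeFunctionUrl",
        Principal="*",
        FunctionUrlAuthType="NONE",
    )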


In my opinion, AWS account IDs are not sensitive information.



Yeah, account IDs aren't actually publicly listed, but should be treated as such. No part of your security should rest on an AWS account ID staying secret.


They are useful for social engineering and phishing, though.


To be fair, what isn’t?


You'd better not have a name, my friend!


Well he is bikingbismuth. He's doing things right here ;)


I believe this is why GCP added random hashes to their Cloud Run URLs, whereas App Engine gave you a deterministic one, e.g. https://{serviceID}-dot-{projectID}.appspot.com


You should be able to add these as origins behind CloudFront, since it supports multiple origins...

/lambda/* could be routed to your functions

/* everything else to your main app
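
A trimmed sketch of that two-origin layout with boto3 (origin IDs, domain names, and the managed cache policy ID are assumptions to verify, not a complete production config):

    import time
    import boto3

    cf = boto3.client("cloudfront")

    # Managed "CachingDisabled" policy ID from the CloudFront docs;
    # verify before use.
    CACHING_DISABLED = "4135ea2d-6df8-44a3-9df3-4b5a84be39ad"

    def custom_origin(origin_id, domain):
        return {
            "Id": origin_id,
            "DomainName": domain,
            "CustomOriginConfig": {
                "HTTPPort": 80,
                "HTTPSPort": 443,
                "OriginProtocolPolicy": "https-only",
            },
        }

    cf.create_distribution(DistributionConfig={
        "CallerReference": str(time.time()),
        "Comment": "main app + Lambda function URL behind one domain",
        "Enabled": True,
        "Origins": {"Quantity": 2, "Items": [
            custom_origin("main-app", "app.example.com"),                       # placeholder
            custom_origin("lambda-url", "abc123.lambda-url.us-east-1.on.aws"),  # placeholder
        ]},
        # /lambda/* is routed to the function URL ...
        "CacheBehaviors": {"Quantity": 1, "Items": [{
            "PathPattern": "/lambda/*",
            "TargetOriginId": "lambda-url",
            "ViewerProtocolPolicy": "redirect-to-https",
            "CachePolicyId": CACHING_DISABLED,
        }]},
        # ... and everything else goes to the main app.
        "DefaultCacheBehavior": {
            "TargetOriginId": "main-app",
            "ViewerProtocolPolicy": "redirect-to-https",
            "CachePolicyId": CACHING_DISABLED,
        },
    })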


It's super frustrating that AWS has no equivalent to GCP's Cloud Run, which offers fast startup, scales to zero, and has the flexibility of simply exposing HTTP to the container it runs.

Lambda has scale-to-zero and fast startup, but its custom RPC interface (presumably an outgrowth of its batch processing origins) does not support streaming responses, has awkward response size limits, and prevents multiple requests from being executed concurrently on the same instance (so caches cannot be shared).

Fargate provides the flexibility of simply running an HTTP server inside a container, but at the cost of slower startup and no ability to scale to zero.


The closest analogous service that runs containers in the simplest manner possible would be AWS App Runner: https://aws.amazon.com/apprunner/

It scales almost to zero (minimum cost is the memory dedicated to a single task).


Last time I looked, App Runner was unable to access RDS databases, but I see this was recently added, which makes it a viable option for me now! https://aws.amazon.com/about-aws/whats-new/2022/02/aws-app-r...


Though with memory limited to just 4 GB (vs 10 GB for Lambda and 16 GB for Cloud Run) it may not be viable for my app.

Somewhat frustrating that the ephemeral storage mentioned in the documentation provides no clue as to the amount available. https://github.com/aws/apprunner-roadmap/issues/112


Do you know how App Runner compares to Elastic Beanstalk?

There are so many AWS services these days I have no idea what to use half the time…


Worse still, because Cloud Run is basically just managed Knative, it's not some crazy proprietary technology that Amazon would have to develop in-house.


It's not managed KNative - https://ahmet.im/blog/cloud-run-is-a-knative/index.html

It implements the Knative API, but on GCP infra.


Fargate is what, 3 cents an hour? And of course it can scale to zero; it uses standard autoscaling groups. It will scale however you want; heck, use custom scaling logic with your control plane running in Lambda :)

https://docs.aws.amazon.com/autoscaling/application/userguid...


For 3 cents a _month_ Cloud Run can store over a gigabyte of containers and run them on-demand.


My understanding was that you would get error responses from the load balancer if you set Fargate to scale all the way down to zero: https://serverfault.com/a/951440


"Scale to zero" means without downtime.


My previous company had thousands of Lambda functions and API Gateway integrations, and it was near impossible to do anything with confidence once you started integrating with all the other cloud offerings. My current environment is similar in scale but all containers, and it's a night and day difference when it comes to confidence. We can move 100x faster when you can reproduce environments locally, or in a separate account, in seconds or minutes with everything baked in. I don't think I could move back, but hey, at least this might eliminate a few API Gateway integrations.


Can’t you run local lambda in Localstack or something like that?


My work is heavy serverless and nobody I work with has had any luck with Localstack, myself included. It's just too limited, fragile, and buggy to work for anything we do. Our stack isn't anything particularly unusual either, it's just that if you are using Lambdas heavily they are probably tied in with a whole bunch of other AWS services in ways that are hard to replicate locally; and Localstack just isn't up to the task.

While there are some nice benefits to serverless workloads on AWS, local development and reproing production bugs are major weak points.


SAM local?


SAM local isn't a perfect emulation of the cloud based lambda environment. This is why AWS SAM added the "Accelerate" feature, making it easier to deploy code to the cloud for testing purposes.

https://aws.amazon.com/blogs/compute/accelerating-serverless...


It's alright, but we continued to find bugs and edge cases with it. The challenge is when there are a bunch of integrations cross-accounts, EventBridge, CloudWatch, etc. that you just can't emulate well locally. Or once you do, it doesn't work a month from now because things change (e.g. a developer is using a new feature that isn't supported by Localstack or something). In container land, you don't have to worry about these cloud integrations. You just spin up the services you want in a Docker Compose file, k8s deployments, Helm charts, etc., and you basically have everything without having to worry about being an AWS-specific black-belt guru expert.


You just have to be a Kubernetes black belt guru expert. It is better but it’s not all rainbows and butterflies on this side of the fence either. Generally it boils down to the stack taking way too many resources to run locally, or still needing access to various persistent data stores, etc..


You don't have to run Kubernetes for containers, but even then, only 4 of our engineers know and operate Kubernetes. It allows us to enforce routing, authn, and authz standards everywhere (and test locally). Application engineers only need a simple command to run their stack and some code templates to build and test applications. Not much knowledge is typically needed on their part. Is it always perfect? No, but it's a lot simpler than wiring up a bunch of vendor-specific offerings.


I use https://github.com/rimutaka/lambda-debug-proxy to run Lambdas locally while still being part of the AWS pipeline. It eliminates the need to emulate the input/output. That tool is for Rust only, but there is no reason why it can't be ported to other languages.


Amazing, now I don't have to pay API Gateway to do just HTTP routing.


As far as I know, you can call the Lambda via the AWS SDK.
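
For example, a sketch with boto3 (the function name is a placeholder; the caller needs IAM credentials allowing lambda:InvokeFunction):

    import json
    import boto3

    client = boto3.client("lambda", region_name="us-east-1")

    # Synchronous invoke; the caller must hold IAM credentials
    # permitting lambda:InvokeFunction on this function.
    resp = client.invoke(
        FunctionName="my-function",        # hypothetical name
        InvocationType="RequestResponse",  # wait for the result
        Payload=json.dumps({"hello": "world"}),
    )
    print(json.load(resp["Payload"]))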


You can but that has security + protocol implications and is not as useful for general web consumers. This seems better IMHO.


Yeah, you need an AWS IAM identity to call Lambda Invoke; for public consumers this is better.


You could always use an ALB? ALB has some nice extensions as well.


Don't ALBs have a minimum hourly cost? Last time I looked into this, you couldn't run an ALB for less than ~$17/month.

Definitely cheaper than API gateways for even a moderate amount of traffic, but API gateway costs scale down to zero for unused or rarely used endpoints.


What ALBs don’t have is a maximum payload size. :)

Note: The “official” way to work around this is to write your large payload to S3 and then create a pre-signed S3 URL that you return to the caller instead.
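
That workaround is a couple of SDK calls. A sketch in Python with boto3 (bucket and key are placeholders):

    import boto3

    s3 = boto3.client("s3")
    big_payload = b"..." * 1_000_000  # stand-in for a response too large to return inline

    # Park the oversized payload in S3 ...
    s3.put_object(Bucket="my-bucket", Key="results/abc123.json", Body=big_payload)

    # ... and return a time-limited download link to the caller instead.
    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": "my-bucket", "Key": "results/abc123.json"},
        ExpiresIn=3600,  # link is valid for one hour
    )
    print(url)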


Given that we're talking about ALBs with Lambda targets, their limit is actually quite a bit lower than API Gateway: 1MB

https://docs.aws.amazon.com/elasticloadbalancing/latest/appl...


There are limits all the way down.

ALB with Lambda - 1MB

Lambda - 6MB


Using an ALB in front of a Lambda, while neat, is a pretty expensive way to do it. Something like $15 a month.


That’s not expensive if you have a money making app.


One way to keep your money-making app profitable is to not waste money, for instance on cloud services that can be worked around for cheaper.


Can anyone using Lambda at scale pitch in regarding costs? It seems companies are using it to build pipelines that could be much cheaper written as full services, as opposed to small functions that you pay for per invocation.


I use Lambda heavily, especially in new projects that haven't been proven yet. The cost savings are significant for two reasons:

- Easier to maintain, so fewer hours spent handling things like deployment and autoscaling. Payroll is likely the company's top expense, so this is not insignificant.

- If the traffic is sporadic or unpredictable you have 100% efficient resource utilization, which is very difficult with traditional servers.

I have some microservices that would cost $10 per AZ/datacenter per month (and you need at least three for high availability) but cost effectively $0 per month on Lambda.

At scale, it depends on how consistent your load is. If it is highly consistent a server may make sense. But even for high traffic apps the cost can be lower or -- if it is higher -- so negligibly higher that the saved maintenance cost pays for it.

I have many projects on Lambda, but for example: I run an entire small-startup of mine for < $1 per month and have high availability. The number of hours I would need to spend to get the cost that low would be way too expensive.


> If the traffic is sporadic or unpredictable you have 100% efficient resource utilization, which is very difficult with traditional servers.

"100% efficient resource utilization" is a bit over stating it, but it sure is a lot easier to ensure you don't over provision.


Also given that a lambda instance can only handle one request at a time, it generally gets very poor CPU utilization.

Google Cloud Run can handle multiple requests at a time, but still suspends the instance while no requests are being processed, and is billed to the nearest 100ms.


Lambda is billed to the nearest 1ms and you can always lower your RAM and CPU requirements per function. Though at some point you hit the minimum.

To flip it around (because I'm generally pro-Lambda): the one-call-per-instance model also encourages global state (since you don't need to worry about two calls running in parallel using the same memory), which is pretty bad coding practice.
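
For anyone who hasn't seen it, a minimal sketch of that pattern (the lookup helper is a hypothetical stand-in):

    import time

    # Module scope survives across warm invocations of the same instance,
    # and since Lambda runs one request at a time per instance, this cache
    # needs no locking: exactly the habit described above.
    _cache = {}

    def expensive_lookup(key):
        time.sleep(0.1)  # stand-in for a slow DB or API call
        return f"profile-{key}"

    def handler(event, context):
        key = event.get("user_id", "anonymous")
        if key not in _cache:
            _cache[key] = expensive_lookup(key)
        return {"statusCode": 200, "body": _cache[key]}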


A few years back at $DAY_JOB I was trying to optimize the cost for a serverless stack. To my surprise, small RAM lambda instances had extra latency of up to 100ms for DynamoDB queries!


Depends on if your workload is CPU- or I/O-bound. It's true that CPU and RAM are proportional in Lambda: raising or lowering the RAM also raises or lowers your CPU.

I'm not sure why, but I found with pre-compiled languages like Go it's not as big an issue (as long as your app is I/O-bound). With Node.js I've found that increasing the size of the Lambda helps even with I/O-bound functions. I assumed it was because the JIT compiling of the JS takes CPU, but it also seems slower on subsequent runs, and Lambda sleeps apps rather than stopping them (until you hit the end of the 5-minute reuse window). I give my Node.js a minimum of 1 gig of RAM, whereas I've been able to get some of my Go functions down to 128 MB with no performance hit.

Which, yes, means the Node.js functions are about twice as expensive per ms before you even consider that the Go function runs for less time. But in both cases I've still found it cheaper than servers due to efficiency, even though strictly speaking it costs more than a server per GB/GHz.


Lambda function code tends to be fairly single-threaded. As others have mentioned, using a lower memory size with Lambda also lowers the amount of CPU performance available.

However, it appears to lower not only the number of cores available, but also the performance of those cores.

So even if your code is very single-threaded and has low memory requirements, you still might want to provision a larger Lambda memory size.

The frustrating thing with this is that your single-threaded code might only be using one of up to 6 CPU cores made available to it.


Very fair point. You probably won't be saturating your lambda RAM and CPU.

But the granularity is much finer with Lambda, so you're closer to full efficiency. No more having a server with gigs of RAM and multiple cores sitting around for hours a day at 10% utilization.


> If the traffic is sporadic or unpredictable you have 100% efficient resource utilization, which is very difficult with traditional servers.

Kinda. Amazon is the one getting 100% efficient resource utilization (or close to it). You will be billed based on what's on their rate chart, not utilization.


> If it is highly consistent a server may make sense.

I'd argue that you should just package and deploy your Lambda as a Docker image, and when you need consistency head over to Fargate. Its costs are reasonably comparable with EC2, and you get rid of most Lambda limitations.


Very good point that the Lambda code can be written in such a way that it can run from both Docker and Lambda.

But I'm not a fan of Fargate. I find it easier to use vanilla EC2, and it's cheaper than Fargate. But I'm also very comfortable with EC2 and Fargate takes care of a lot of the ops stuff so to each their own.

What I didn't mention too is Lambda has "Provisioned Concurrency" pricing, which for extremely consistent workloads lowers the gap... I've never had a product with that consistent a workload though.


Care to share more about your small startup?


Well, I'd love to share specifics but I've kept "throwaway2016a" anonymous for 6 years so gonna keep it that way :).

But some general info:

- API based product

- Frontend using Next.js hosted on S3 behind CloudFront with some CloudFront Edge functions

- Gets a few hundred thousand API calls a month

- I wrote my non-edge Lambdas in Go. I found it to have much faster cold start times (about 100ms for me) and much faster runtimes (< 10ms) than Node. The Edge functions are Node though, because Edge on AWS only supports the Node.js runtime.

- I use DynamoDB for my database.

You get billed for what you use on Lambda and on average a single API call bills me for 8 - 12ms each. Plus the cost of API gateway. Actual response time is higher since SSL negotiation and API Gateway adds overhead but it's usually < 80 - 100ms which is within my SLA.

But could scale up to a few million API calls a month without much added cost... about 20 - 30 million API calls per month before I hit the cost of the (minimum) 3 servers + load balancer I would need to do High Availability using servers.


We aren't a company that uses Lambda at scale, but we have exposure to a few thousand AWS customers' cloud costs, and I can definitely say that the customers who are all-in on Lambda are saving a lot of money relative to their container/EC2 counterparts. That being said, I see very few companies "all in" on Lambda these days at an organizational level. It's still the exception; I think you need to be very intentional with its use at an organizational level, architecture-wise... but I see a lot of companies with sprinkles of Lambda usage here and there.

Source/Disclosure: I am Co-Founder and CEO at https://www.vantage.sh/ - a cloud cost platform tracking in the hundreds of millions of dollars of annualized cloud spend.


Off topic/feedback: I like the look of your service and it's something I've been thinking about looking into recently. I have even used the EC2 instance type list a ton without ever realising that it was attached to your site.

However, I'm afraid that I sadly would not touch your service, because SSO is only on your enterprise plan. I hope that eventually you might consider putting SSO on one of the priced tiers.

(https://sso.tax/ approximates my opinion)


Thanks for that feedback.

I've mentioned this publicly before on HN: we are happy to offer anyone SAML SSO even if you fit into our Pro/Business tiers. We have configured "Enterprise" tiers for folks below $200/month and it is no sweat on our end. It just takes some pre-provisioning on our end that usually comes with more direct engagements with Enterprise customers.

Anecdotally, typically _very_ few customers in the self-service tiers make mention of this. It's something we need to message better on our marketing site but for onlookers here, know the option is available to you.


Look into CloudZero. Great product, great service. Dunno bout cost.


An engineering blog from the BBC published an update a few days ago on their migration to a serverless architecture, titled "BBC Online – A year with serverless": https://medium.com/bbc-design-engineering/bbc-online-a-year-...

They report 3.3B function invocations for 2.3B requests served, a total of 61,000 hours of execution time and 1,500 concurrent executions at peak.

Using calculator.aws, simply entering the number of requests at 67ms each (61k hours/3.3B invocations) with 128 MB of memory each computes a cost of only $1,107.

Then of course there's bandwidth, databases, file storage, and a lot more… but 3.3B lambda executions for $1.1k is really cheap. Not to mention that a client like the BBC would benefit from negotiated pricing with significant discounts. Even without discounts, serverless is often pretty cheap.
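
A quick back-of-the-envelope in Python reproduces that figure from public on-demand pricing (assuming us-east-1 rates as of April 2022; free tier and discounts ignored):

    invocations = 3.3e9
    gb_seconds = 61_000 * 3600 * (128 / 1024)  # 61k hours at 128 MB

    request_cost = invocations / 1e6 * 0.20    # $0.20 per million requests
    compute_cost = gb_seconds * 0.0000166667   # ~$0.0000167 per GB-second

    # ~$660 + ~$458 = ~$1,118, in line with the ~$1,107 from calculator.aws
    print(round(request_cost), round(compute_cost), round(request_cost + compute_cost))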


The CPU throttling of the lower memory settings is often easily visible in a user-facing request, even for simple requests just doing some DB I/O. I rarely use less than 1024 MB for anything.

edit: the BBC review is horrifying:

> The page takes around 500ms to render and be delivered to the audience. In that timeframe we invoke around 30 functions. Around 150ms is spent running React to render the content to HTML

> we aim to personalise almost every page in some way — making it relevant for every user on every request

Good luck making perf numbers with all those cold cached personalized pages


I want to call out - performance numbers don't actually mean nearly as much as people have made them out to be.

This is a mostly read-only web page. Half a second to load? You're barely going to lose anyone, if you lose anyone at all. Hacker News routinely runs me ~300ms to load and has zero personalization; Facebook takes over a second and a half before anything displays, as does YouTube (on a refresh, no less, so things should reside in cache locally!). Hitting a random person's LinkedIn page (once I've passed the verification, which is a whole different issue) takes 1.2 seconds. Etc.

Now, admittedly those are including the latency on my end, but the point is, no one is so meth addled that a page loading after half a second (or even a full second!) is going to have much effect on engagement. Even the studies that have been done (that I have some major issues with) only really start measuring anything significant well after a second or two.


The BBC article is only talking about generation time, specifically not download time (including linked assets), and we have no idea about dwell time.

It only takes a 3G connection a few miles outside a city to add another 500ms to that opening request. Say we're up to a second before some readable text appears, now we'd like to know how long the user will actually spend reading the text or waiting for images to load before navigating again.

Intuitively, I think that load time/dwell time ratio probably captures what those studies talk about better than just raw numbers. 1 second between 10 second TikTok video loads would be extremely noticeable, but barely worth mention if the user instead was spending 10 minutes reading e.g. a feature length news article.

My personal BBC reading habit regularly involves clicking into an article just to catch the opening paragraph and seeing which opening image they used (they rarely use the same for the thumbnail). The average is probably not as low as 10 seconds, but it's certainly something much less than 2 minutes.

Dwell time probably isn't a great way to capture it either. My pattern is quite "flicky" but I bet there is a spectrum all the way from "reads every last word" to "literally just loves to click". I guess latency becomes increasingly important for folk further along that spectrum


Oh, I know; measuring is imperfect.

My point is just that there's a lot of room before it even hits the numbers quoted by the various 'studies', and given those studies all tend to be from the perspective of advertisers and click-through rates (i.e., giving people time to wait makes them go "wait, do I even care about this?", whereas this is someone explicitly looking to a news source), it hardly is the problem the parent presents.


That's only averaging 72 requests per second. Yes, they scale to 1,500 concurrent executions sometimes, which is an advantage of Lambda, but we are talking about pretty small traffic.

Still interesting though and glad it works for them.


And that's where serverless excels, highly variable loads.


You can certainly run a web server capable of 1,500 concurrent connections for under $100/mo.

Run it on Linode or something, they'll throw in 10TB of bandwidth which probably covers it.

Most of the serving is done by the CDN..


Right, but once you're in the territory of "everything for $100" or "everything for $1000", they're both the same so you want the thing with the organizational efficiencies of independent deploys and permissioning, resilience and scaling, high availability etc.


Two $60 Linodes in different data centers are highly availablish, considering each one could easily do 10x those numbers. The thing in the link sits behind a CDN.

Certain kinds of organizational efficiencies frequently lead to "I'm bored, let's use the new hotness that would look good on my resume for the next job", hence Lambda.

Actually, you don't need anything at all; this is completely a job for the CDN, which has enough unused features to do the same thing at no additional cost.


You seem to have missed that the aim of the game on their part is personalization. While it might be possible that they could do some client-side processing (split the page up sufficiently, then make multiple requests and assemble a "personalized" page), it's hard to say for certain. Certainly, it makes it less "personalized" and more "regionalized", which, though being the case they describe, is likely a decision they didn't want to be locked into. For it to be truly personalized it likely needs to pull in some backend data.

Given that...you can't rely on the CDN being an optimization. You could still make the personalized calls through the CDN, if you wanted and had a clear caching policy, but something like "on load, assemble a page with real time updates on the stories this person expressed interest in" isn't going to benefit from a CDN.


Heh. OK for fun:

They have a very finite number of new articles a day and a little box on the page with personalized headlines. All these articles, in fact all the articles on the BBC of all time ever, comfortably fit into RAM if you are so inclined.

Those little linodes have fast enough SSDs, try playing around and use one of countless stress loaders to get a feel.

I've served hundreds of thousands of requests a second from a single server using Varnish. A well tuned beefy VPS would do tens of thousands.

How to "personalize"?

For example Varnish has this thing: https://en.wikipedia.org/wiki/Edge_Side_Includes https://varnish-cache.org/docs/6.0/reference/vcl.html

It will happily read whatever cookie you set and stitch together a remixed personalized page for all the BBC users in the world without a performance penalty much faster than your lambda cold starts.

CDNs sometimes let you hook into Varnish directly or expose something similar to essentially do exactly this.

The aim of the game of the guy who wrote that blog post is to reinvent something from the 90s, but worse, and blog about it.


It's not just new articles; it's also updates to existing ones. The idea then being you could put together a feed of updates to articles you expressed interest in, plus recommendations and such; this isn't what they're doing now, but it may be their long-term goal and this is their path to it.

I will 100% agree that what they describe in the link can be done trivially without lambda (they describe a single use case of regionalization, which doesn't even take anything special on the part of the CDN), but the point is I can easily see a future use case they're building towards, that I don't think those standards you mention will support.


I don't get why not. How do you think this was achieved before Lambda? Or even before AWS?

It is a news site, not Facebook. The feed would only ever be a subset of all the articles in cache. It can itself be cached; there are only so many permutations. Different people will have different feeds, but there will be people with the exact same feed as you.

Crunch those recommendations however you want wherever you want and spew them unto S3 at your leisure. Have your varnish cache sit on top of that. Set your personalized subscriptions via a cookie. Varnish on each request will decide which one out of n (probably n<10000) feeds to include in your html response, in under 2ms.

This is all stateless. Have two servers in two regions. Done.

Maybe you'll reboot them every couple of years.


Been there, done that. Can confirm it works swimmingly.

It is much more fun though: don't be clever. Just cache-bust on change.


Hmm, I don't think that's the primary motivation. But the NomadList guy runs everything on one VPS and he makes $2m/yr so more power to you guys. I wouldn't do it because I'd want to be able to scale the team and keep their velocity high and I think clearer process-impact boundaries help.

For what it's worth, I've worked in ad tech at 2M QPS and now in HFT, so I'm not averse to efficiency, but org efficiency is (in my experience) very important if you're scaling the company to the $100+M revenue range.


I think if you're scaling the company to the $100+ m revenue range you'll be able to find a team that can handle either VPS, Lambdas, both or whatever else is required.

It really depends on what it is that you're actually, you know, doing.


I think that's a bit facile. What you're actually able to do is principally a function of your agility and I think a business's choice of tech impacts that. But far be it from me to prescribe. I think these things make us more agile and you think your things make you more capable. May the market favour the approach it does.


AWS has this way of making things more approachable. It is a subtle trap. Before lambda running a VPS was not considered rocket science. It may or may not come with scaling issues, depending on your actual application and the architecture.

Lambda lets you get going quicker but comes with its own idiosyncrasies; don't get lulled into a false sense of security. You're going to end up with a devops guy either way, and they'll be able to handle either scenario.

Which one you should pick really really depends. But, all things being equal, it is better to shoot yourself in the foot than having AWS shoot you in the foot.

You gave your own counter-example. NomadList has a simple app; Lambda is the wrong choice for him. The BBC link tells me their thing is so trivial it is moot; whatever floats their boat.

If I was doing some complicated EMR in BigCorp with many departments and stakeholders, slinging millions of files around S3 I'd pick lambda.

If I were setting up ad servers I would not pick Lambda, because it is latency-sensitive, etc.

Get yourself a devops guy and a proper architect and game it all out.

Think of your devops guy not as a server administrator but as an advanced AI that accelerates the velocity of your team to new peaks of productivity. They are not only debugging their code in production at 3am but also your org processes.


Doesn't have a universal answer.

Lambda doesn't charge for idle time or capacity, for example. Depending on your usage patterns and availability requirements, you may have a high cost in idle waste when using servers.

But I think where Lambda really shines is removing time from infra setup, maintenance, etc. These are hidden costs and, even when accounted for, frequently underestimated.

If you need even a part time DevOps or SRE engineer to make sure you'll keep things running smooth or will recover fast from a disaster, their salary alone already pays for shitloads of Lambda invocation and compute time.


Don’t you need an SRE or Devops focused person regardless?

You still need a way to test and deploy the code you're pushing. You also need to set up logging and networking that ties it into the rest of your systems.


I use Lambdas for nearly everything at a games startup. The "Serverless framework" - https://www.serverless.com - has built-in concepts of dev/stage/prod, routing from URLs -> functions, etc. Logging goes to CloudWatch, but you could use any logging library you wanted inside your functions.

We don't have an operations team, so each dev deploys their own services; this framework gives them all a consistent interface so we know how to test or deploy someone else's code.

Advantages from my POV:

   * zero or simplified ops; no load balancer setup etc. 
   * we don't pay for idle hardware
   * scales well by default (built with parallelism in mind)
If a certain game spikes in popularity then it COULD become expensive - but IMO that's better default behavior than a server falling over.


I think it's awesome that it works for you. I've built out all types of infrastructure setups and was simply pointing out that whether you need an Ops/SRE person is a cost that doesn't seem to depend on whether or not you go serverless.

You mention that serverless.com offers a sort of PaaS that simplifies your life. That's similar to Heroku, which does the same thing but with regular servers and similarly allows you to defer having a dedicated Ops person. Having worked at plenty of places that leveraged PaaS's for most of their needs, I'd say it's normal to not have an Ops team until you begin hitting issues around general maintenance and secure networking setups.


Xsmasher,

Would you be interested in sharing your Serverless Story with our Serverless community? We are looking for guest blog contributors, community call showcases and case-studies.

Please fill out the form in the link below and I will follow up with you! https://formfacade.com/public/112252076977537904120/all/form...

Cheers!


It's kind of a nice-to-have, but no, you don't "need" that type of person to get started, at least.

I was the fourth engineer at a smallish, currently Series A startup that used Lambdas for ~90% of their services, and we (the software devs) built the CI/CD, API Gateway integrations, all of that stuff. We figured out a good solution early and then maintained it, which made moving forward pretty trivial.

It only got a little bit more complicated as we wanted to use different authorization methods or VPC private link when we did eventually spin up a couple ECS clusters.

The company now has a full-time DevOps/SRE, but they don't really work on the code-deploy CI stuff. They deal more with IAM, audits, security and stuff...


I definitely didn't mean to imply that you needed an SRE/Devops for starting off. The OP just came across as sounding like you don't need SRE/Devops in general if you use Serverless, which from my experience at multiple orgs just seems idealist.

I'd argue that the need for someone focused on SRE/Devops comes from factors that are mostly unrelated to whether or not you go serverless.


> already pays for

huge understatement even as written :)


They are fast to set up and require zero care and feeding. That's the big "time to market" benefit.

Beyond that, depends on your use case. I love using Lambda for webhooks that may be called anywhere from a few times a day to thousands of times per day. Once they get to the point of being "oh wow, this one lambda is expensive", you can clearly afford the budget to move it to a real server. But below that $20/mo mark (which is more than 5 million invocations of a small function) - you're golden.


My lambda bill is $4,000 per month for 21,000,000 invocations per month.

Something seems off compared to your numbers.

Maybe the length of a single execution?


Lambda cost is invocation time * memory size pretty much.

If your lambda bill is $4k a month, either your functions run multiple seconds each or you're using multiple gigs of memory.
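
Running that formula backwards against the parent's numbers (assuming us-east-1 on-demand rates) shows why:

    bill, invocations = 4000.0, 21_000_000

    request_cost = invocations / 1e6 * 0.20     # ~$4.20, negligible
    compute_budget = bill - request_cost        # the rest must be compute
    gb_seconds = compute_budget / 0.0000166667  # ~240M GB-seconds

    print(gb_seconds / invocations)             # ~11.4 GB-seconds per invocation
    # e.g. ~7.6 s at 1.5 GB, or ~1.1 s at 10 GB: multiple seconds
    # of runtime, multiple gigs of memory, or both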


You can always start with a Lambda, then move to its own service if costs justify it. Develop as a Docker image and there's minimal, if any, conversion work.


If you use the Go runtime, you can also trivially wrap in a Docker later and your iteration will be faster.


There is absolutely no need to do that. You can run any standard MVC framework on Lambda using APIGW proxy integration and deploy the same code to Docker.


Oh I don't mean "switch to Go". I mean if you're already using Go, it's easier to use the Go runtime than the Docker runtime.


I've never seen (healthy) fleets with average CPU utilization above 70-80%. That includes several-thousand-machine fleets when I worked at Amazon. Most things I've seen in my career have 10-20% utilization at best. The Lambda "premium" over EC2 is only like 30-50%, so unless your workload is super consistent/predictable and your code/hardware is well optimized, Lambda will likely win on price. This is even more true now that you don't have to pay for API Gateway.


I have. AdTech frequently runs close to 100%. Cryptomining style.

I'm not sure I agree. Why Lambda and not AWS spot instances? Or heaven forbid, one beefy server / office warmer.

Lambda doesn't win on price it wins on flexibility.


As others have mentioned, one advantage is that lambda gives you the kind of uptime that would require at least 3 VMs spread across different AZs.

But then, I have been running my SaaS on a single, rented beefy bare metal server in a data center with unmetered data and have had no more than 15 minutes of downtime per year for 3 years now.


Exactly, you've answered yourself.

Also, nothing prevents you from using a multi-zone autoscaling group with a min/max of 1; you'll fail over to another AZ gracefully. You don't magically get 9 nines with Lambda in practice either, trust me.


This is relevant for CPU-bound loads. Lots of Lambdas do networking stuff where 90+% of the time is waiting on an HTTP download.


Well, enterprises that build pipelines often end up with Rube Goldberg-type horrors regardless, and Lambda is easier and more fun to prototype with than, say, Spark. If you know someone like that, wait until they click around and find "step functions".

There are more legit use cases. Let's say you are in charge of Eurovision.

You spike from 0 to xxx,xxx votes on an irregular basis during a few nights a year.

You run an auction or ticket sales website of some sort.

Your blog gets 3 clicks a month except for when you actually post something and it briefly gets noticed by HN.


For blogs there's no need for "serverless" (in the "classical" sense, FaaS/CaaS); static hosts exist, have an added security benefit, and are very cheap, often with generous free tiers. They're also serverless in the sense that you don't need to manage servers, and can scale from 0 to millions in the blink of an eye. (For reference, my blog costs me 0€/month. The one time it went over the Firebase Hosting free tier, it cost me the grand total of 0.03€, in a month where it was twice on the HN front page, including briefly at #1.)


Remember you don’t pay for invocations. You pay for memory, latency and traffic.

That constrains how you build architectures with Lambda. Anything that is bloated, written in a low-efficiency language, or does any amount of compute or latent network traffic will burn you. That includes dealing with slow clients and synchronous APIs.

As for rules, there aren’t really any catch-all ones. We build out on Kubernetes now though as that’s portable. The lambda developer story is quite horrible and that’s where our cost is.


I'm fairly certain you do pay per invocation; $0.20/million. It's not the bulk of the cost for sure, but it's non-zero nonetheless.


Yes that’s the point. The marketing points to that as the big cost but it’s insignificant compared to the real cost which is running workloads on it that aren’t optimised for price first. Which is probably most of them from experience.

Then again that’s how Bezos rolls.


I don’t work at a startup, but am working on a startup-like project where we are building a greenfield application that allows law enforcement officers to collaboratively edit reports such as accident reports. The entire API is built on Lambda, S3, and DynamoDB. The experience has been wonderful so far.

We have a large dev team (~12-16 devs) working on the project and our dev account costs are only $100-250/mo. We even deploy each PR to the cloud to run automated tests against. Our production costs have been very manageable too.

Lambda/serverless has been great for us. Our organization is relatively new to building SaaS cloud apps and doesn’t have the most mature devops practices (we’re growing there). Building this app on Lambda and DynamoDB and letting AWS help us with most of the scaling has really been a win for the team.


It all depends. Cost-wise, Lambda is the only real option for running tens of thousands of Monte Carlo-style simulations at once (which require high performance for a very short execution time). Containers are too slow to scale up for our target time, and if using regular instances, they'd need to be always-on which costs a ton for the performance tier we'd need.

With Rust on Lambda we found a good mix of start time and parallelism of each that made it much cheaper and faster.

EDIT: The workload times are only somewhat predictable so even doing something like scheduled scaling is a half-solution.


You're doing whatnow??

This may come off as arrogant, but it sounds like something that could be rewritten in Fortran to run on an old laptop with similar execution time.


I think they mean that they may need thousands of HPC cores for a few hundred milliseconds, and then zero cores right after; and (for some reason) absolutely cannot just do the work serially instead of in parallel.


Yeah, I also interpret it that way, but I cannot understand what kind of system requires such a scheme. Even if the state space is non-ergodic, you should be able to get by with running your sampling on a dozen larger systems in some sense. Or just run on a workstation for a few months to precompute and tabulate the result for all possible inputs.


I realize this was a week ago, but the parent comment is basically the case. The "some reason" is the expected time from request->response (which includes aggregation/interpretation of results) is 1-2 seconds with a sample size of ~50,000 for a pretty involved model that needs to be run on demand/in "real time" with no good, predictable time window.

We can definitely get by running the sampling on a dozen larger systems, but those systems still cost more than (one thousand Lambda executions * a few hundred times a week max).

Pre-computing is an interesting idea, but would certainly be an epic amount of work. The input set is also quite large (compiled motorsports data based on current state of an event).

I'm also totally willing to admit I'm probably in over my head in designing such a system, but I did try a few different ways of architecting this and tweaking the parallelization in the source and this was the best cross of time/money I could get! It's definitely been a learning process for me; although I've got some years experience with scaling out, the system and requirements were much different than I've been exposed to. I'm sure this first take will be changed or at least refined once it's been in use for awhile.


I run mini monoliths on lambda functions, instead of tiny microservices. It's highly cost effective for unpredictable spiky traffic with high availability requirements. With Fargate, I have to run a bunch of unused overhead capacity just in case load spikes suddenly. Fargate scales up uselessly slowly to respond to sudden capacity needs without causing downtime.


It’s best to do the math in advance, but not everyone is willing to put in the time required for a proper business case.


We trialed switching our EC2/ASG-based queue worker over; the price went up 10x-100x.


I'm confused; the whole news is that you can directly call Lambdas without having to go through API Gateway, like you do on Cloudflare?


I think the idea here is that you don't have to set up API gateway anymore to get HTTP/API access to your lambda.

You still can of course but it's one less thing you have to do.

Like another comment said, you can expose it through IAM/SDK, but then you're getting random permissions/credentials/whatever out to the world as well.


Yes, that's the whole news. It's exciting to those of us who had to live without that feature until now.


Today we are coincidentally releasing the beta for https://tinyfunction.com/ TinyFunction is the simplest NodeJS and Python function deployer. All functions are deployed in AWS.


Is it possible to front this with your own domain using a CNAME, or are the function URLs dynamically generated on each commit/upload/build?


You can't use your own domain/certificate. If you CNAME it the cert will be invalid.


It sounds like you can get a stable URL for a function, but it'll still be a Lambda URL.

I'd love to use this with a custom domain, without having to use an API Gateway.


Yes, but you'll have to either set up a proxy or do a client-side redirect.


You can always proxy it (e.g. Cloudflare).


For a lot of use cases if you proxy it through something external like Cloudflare, you might as well just write the code in Cloudflare Functions. Even faster as no proxying will be required.


You can't run Docker images on Cloudflare Functions, nor can you run code that's 10 GB in size.


Finally. Having to set up a gateway is so cumbersome.


Claudia.js is an absolute godsend when it comes to writing and deploying Lambda functions. Can't recommend it highly enough: https://www.claudiajs.com/


Are those function URLs backed by WAF and AWS Shield?

If not -> get prepared for a huge bill of DDoSed function invocations.

I hope we can at least attach something to those URLs.


You can and should limit the number of parallel invocations and as a result your maximum bill.

If you want DDoS protection and proper rate limiting, Amazon will happily charge you for it several different ways. Maybe you can hide these URLs behind Cloudflare if you're penny-pinching...


I was thinking the same. What we have here is a security nightmare. You have executable code on globally unique URLs with no protective mechanisms in front of it except for the ability to do IAM. Yikes.


Is it not possible to set the Lambda URL as an origin in CloudFront and get WAF protection that way?

For all production workloads we typically have cloudfront in front of everything.


Not for WAF. The original article states that you must use API Gateway if you want AWS WAF.


Really cool addition. I just moved my Lambdas from API Gateway to ALB (because of API Gateway's 30s limit). I also use the Serverless framework. It was a day of work, but developing with ALB is a bit more of a pain. Maybe this would be better. Are there any timeout or MB constraints on these URLs?


Yeah, I didn't see anything about timeouts or body size limits; it's a good question. Lambda + API GW vs invoke vs ALB vs the rest having different limits has got me a few times.


Could you elaborate a bit? I find this really interesting. My story is: I started with invoke (with boto3), then I wanted some more abstraction and to use normal requests. So I moved them to the Serverless framework with API Gateway. Then the 30 sec timeout became an issue (I do data engineering stuff). So I moved them to ALB. And now everything runs, but the ALB serverless support isn't great. (Why do I have to give everything a unique priority, aaahh :D)


So Lambda invoked directly via the API has a 6 MB limit: https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-...

But API GW is 10 MB (not for Lambda though, confusingly): https://docs.aws.amazon.com/apigateway/latest/developerguide...

Use S3 Object Lambda and have no limit pushing to S3 with logic; it adds a bit of latency though: https://aws.amazon.com/blogs/aws/introducing-amazon-s3-objec...

All historical, but hard to understand the limits as a poor user...

Ended up going with Cloudflare Workers for S3 uploads with extra logic, just to avoid the unknown. Workers are great, btw.


If you are using Python and looking for a framework to quickly get your AWS Lambda functions up and running, try out the framework I am developing. It is still in the early stages, but it has some optimizations that make it simple to do things like use 3rd-party packages. If you are interested, a good place to start is the docs on how to connect functions to API Gateway: https://staging.cdevframework.io/docs/examples/httpendpoints....

If you want the developer experience of Django with the benefits of Serverless Compute platforms check it out!


I wish that Azure Functions had this.

I have a function triggered by Cron once a day that goes wrong about once a month. I trigger it again using the debugging tools, but it would be nice if I could just hit a URL to trigger it again.


I use EasyCron (https://www.easycron.com) for some simple scheduled tasks. Basically just hit a URL on a set schedule. I include a specific API key in there and the URL is not discoverable, so it works well for me.

It's a nice way to do cron/scheduled tasks without any extra work. Just deploy another serverless endpoint and have something else hit that on a schedule. If anything goes wrong, EasyCron notifies me and I just hit the URL directly to re-run the task. Simple and crude, but highly effective and near zero effort.
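
The endpoint side of that pattern is only a few lines. A hedged sketch as a Lambda-style Python handler (the query parameter name, env var, and task entry point are all assumptions):

    import hmac
    import os

    def run_scheduled_task():
        print("doing the daily work")  # stand-in for the real job

    def handler(event, context):
        # Constant-time comparison of a key passed in the URL against
        # one kept in an environment variable.
        sent = (event.get("queryStringParameters") or {}).get("key", "")
        if not hmac.compare_digest(sent, os.environ["CRON_API_KEY"]):
            return {"statusCode": 403, "body": "forbidden"}
        run_scheduled_task()
        return {"statusCode": 200, "body": "ok"}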


Unless I'm misunderstanding your use case, Azure functions have had HTTP triggers[0] for years, the bindings documentation even calls out support for binding different trigger types to the same function. I was actually surprised that AWS has only just received support.

[0] https://docs.microsoft.com/en-us/azure/azure-functions/funct...


I don't believe you can have multiple triggers for the same function, so I can't bind it to both a time trigger and an HTTP trigger.

If you can find an example for doing otherwise I'd be delighted!


I dug a little further and it looks like you're correct, I had mixed up the concepts of triggers and input bindings after reading the documentation that claims:

> You can mix and match different bindings to suit your needs. Bindings are optional and a function might have one or multiple input and/or output bindings.[0]

However, there is some documentation explaining how to execute a function that does not have a HTTP trigger via HTTP[1]. The example uses the function app's master key though, it'd be interesting to see if that's a requirement or if you could use a key scoped only for invocation of the specific function.

[0]https://docs.microsoft.com/en-us/azure/azure-functions/funct...

[1]https://docs.microsoft.com/en-us/azure/azure-functions/funct...


Thank you!


This is fantastic; I'm making an iOS application that is 100% serverless. Having no servers feels great, but managing API Gateway endpoints is annoying. I don't know about performance, but Google Cloud Functions definitely had an edge there, because I believe they've had native function endpoints since launch.

I wonder if it's worth changing my current API Gateway endpoints to the built-in Lambda URLs, since I haven't launched yet.


Although it isn't mentioned in the blog post, the HTTPS endpoints are dual-stacked.

Seems like AWS is actually launching new endpoints with IPv6 support by default now.


So it's a public HTTPS endpoint, with no built-in throttling? This... doesn't seem like a DDoS vulnerability to anyone? All it would take is one script kiddie to rack up an unexpectedly large AWS bill, no?


There's a paragraph dedicated to this in TFA.


That wasn't very polite, mr. lemon.

Also, I'm not sure what you're referring to, having read it twice.


How is it that my UX looks completely different to the blog post?

I don't have advanced settings. Instead I have to go to "Configuration -> Function URL" to find this.


Where are at least ACL and green header filtering?! This is 2022. If this is not supported at the entry point the product should be sent back to design.


Nice! Of course this has always been possible to do, but removing the API Gateway dependency will make simple use cases a lot simpler.


Now if only you could add an EIP to a lambda function without a VPC NAT and the $20/mo minimum that comes with it.


It's not an official AWS offering, but there is this project that can automatically add EIPs to VPC-attached Lambda functions for you. Then you don't need the NAT gateway.

https://github.com/glassechidna/lambdaeip


Or just allow an IPv6 address to be attached.


Very happy to hear it has first-class alias support. Now if only they would allow per-alias environment variables...


For someone who has only worked with Cloud Functions on GCP, can someone explain to me how is this different?


I was wondering the same, I only know the GCP ecosystem and there I already consider cloud functions to be a “legacy” choice relative to Cloud Run where I no longer need to have a 1 to 1 relationship between requests and invocations along with a bunch of other advantages.

Every time I peek over into the AWS ecosystem I’m very glad I don’t have to work in it. This seems like it’s multiple years behind what GCP has unless I’m missing something obvious?


How is it different than cloudflare workers?


Cloudflare Workers is head and shoulders above AWS CloudFront Functions and Lambda@Edge [0], if you can fit your workloads in 50ms (CPU time) or 30ms (IO time). Workers is wayy cheaper, wayy faster [1].

Workers has 1MB script size limit (post compression), so that's there, too, and can run WASM or JS workloads (which CloudFront Functions can't, but Lambda@Edge can).

As for AWS Lambda Function URLs: Well, it isn't comparable to Workers at all. But if my use case fits Workers, then that's what I'd would prefer. In fact, I've gone many lengths to make my workload fit Workers. Deno Deploy is another viable alternative.

[0] https://docs.aws.amazon.com/AmazonCloudFront/latest/Develope...

[1] dated, but relevant: https://medium.com/@zackbloom/serverless-pricing-and-costs-a...


Just some small corrections. I think you mean 30s of CPU time for unbound workers? Time spent waiting for i/o can be “infinite”. I think maybe you’re referring to billing where we bill on wall clock time for unbound workers (vs CPU time for bundled workers). For proxied requests, billing even stops once you hand back the Response object.

If you need more than 1MB of script size, please reach out to Cloudflare support.


OP meant 30/50 ms under the guise of "Workers is wayy cheaper, wayy faster". You can have unbounded workers that do whatever you want. But the cheap Bundled workers need to stay under 50ms https://developers.cloudflare.com/workers/platform/limits/#w...


I was specifically trying to clarify the second half of "fit your workloads in 50ms (CPU time) or 30ms (IO time)". The only time IO time is relevant for Workers is billing of Unbound Workers, not whether your workload fits. The only time-based workload limits for Workers are 50ms of CPU time (Bundled), 30s of CPU time (Unbound), or 15 min (via Unbound Cron Triggers).

I thought our Unbound Workers are supposed to also be cheaper as well but I need to double-check that piece.

Bundled and Unbound Workers are equally fast.


From what I knew, you couldn't have a Bundled Worker wait on IO or sleep for more than 30s. May be it isn't true anymore?

For most workloads, I'd reckon that Unbound Workers are about the same cost as Bundled. In fact, Unbound will be ~2x cheaper than Bundled if your average workload completes within 50ms IO or 10ms CPU.


It's not true now AFAIK [1]. Not sure if it was ever true as it's kind of core to how Workers works, but I've only been here for just over a year. Can't find anything to suggest this was ever the case.

As for cost, a Bundled Worker definitely had a price advantage if you have CPU-light but IO-heavy workload. If my math is right then Unbound is cheaper up to roughly 220 ms of wall clock (I used 100M requests as an example). So if it takes > 220ms of time to send the response fully, Unbound will be the same price as Bundled and only get more expensive the longer the response takes. This isn't the RTT time to your origin. It's the total request time. So if you're doing lots of round-trips to origins, proxying WebSocket messages back and forth over a long time, proxying a large response body from somewhere else etc. This gets more complicated since we put in an important optimization that makes Unbound much cheaper if you're just proxying a response without modifying it since billing will stop once you return the Response so now the Unbound Worker has to actually be meaningfully involved in generating the Response body for it to bill until the response finishes sending to your client [2].

[1] https://stackoverflow.com/questions/68720436/what-is-cpu-tim... [2] https://blog.cloudflare.com/workers-optimization-reduces-you...


There's no time limit on requests, as long as the client is still connected. However, two things you might be thinking of:

* If you use waitUntil() to schedule async work that completes after the HTTP response has been sent, this work is limited to 30 seconds.

* In general, if a request runs longer than 30 seconds, the chance of random cancellation increases a lot. For example, when we upgrade the Workers Runtime to a new version, we will give in-flight requests 30 seconds to finish before the process exits, which will cancel all remaining requests. (Of course, any application that relies on long-running connections needs to handle random disconnects regardless, due to the general unreliability of networks.)


Thanks kentonv and vlovich123. I stand corrected.

I've been using Workers since 2019 and quite haven't kept up with Cloudflare's pace of innovation ever since. It has been dizzying. Looking forward to handling TCP and WebRTC workloads (announced last year) with Workers next.


Double-checked and Unbound Workers are indeed priced more economically than Lambda@Edge. I had misread their pricing to be 8x cheaper than what it was which is why I was confused.

So TLDR Workers is faster, can run longer, and costs less than Lambda@Edge.


Is this similar to a cgi-bin script?


Closer to fastcgi, but yes.


AWS Lambda has now gone full PHP. Never go full PHP.


Last step: https://bref.sh


Not sure I understand what you mean, can you elaborate?


Reference to the movie quote in "Tropic Thunder" from Robert Downey Jr.'s character.


We live in such a bizarre time; things that technology already solved in the past are now considered innovative.



