I'm terribly skeptical of AWS Lambda as the engine of a main-line REST API. There seems to be no way to avoid the cold-start time: not only when your function is first called from a dormant state, but also when request load rises, bringing additional workers online. (A timer that constantly pings your service only protects against the first, and I hope we all agree is a dirty hack.)
I'm hoping someone will correct me if I'm missing something. Otherwise it seems like Lambda isn't really suited to use-cases where random latency spikes would degrade user experience.
Fun fact, on the lowest memory available, cold start times are really bad in Python when you have a lot of libraries that need to be imported.
The Lambda runtime is exceedingly slow at that on the 128MB tier. This is in fact the case for Django apps, even in their default state.
Now here's where the fun fact starts: If a function times out during cold start, it hasn't successfully been warmed. That means upon its next trigger, it will cold start again.
Now let's say you deploy a tiny little function over WSGI. You see in your metrics that it takes an average of 200ms... now, you're a smart dev, so you set its global timeout to 3 seconds, down from the default of 30 seconds, because you don't want to get billed more than necessary. But as it turns out, the cold start takes on average 10 seconds. Your function now never successfully completes.
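To make that failure loop concrete, here's a toy simulation. It is only a sketch of the behaviour described above, not of how Lambda actually schedules workers; the function name and the numbers (10s cold start, 200ms warm run, 3s vs 30s timeout) are just the ones from this comment.

```python
def simulate_invocations(cold_start_s, warm_run_s, timeout_s, attempts):
    """Return "ok" or "timeout" for each invocation, in order.

    Assumption (taken from the comment above): a worker that times out
    during its cold start is never marked warm, so the next call is cold too.
    """
    warm = False
    outcomes = []
    for _ in range(attempts):
        duration = warm_run_s if warm else cold_start_s + warm_run_s
        if duration > timeout_s:
            outcomes.append("timeout")
            warm = False  # the cold start never finished
        else:
            outcomes.append("ok")
            warm = True  # later calls reuse the warm worker
    return outcomes


# 3 second timeout: every single call dies during the cold start.
tight = simulate_invocations(10.0, 0.2, 3.0, attempts=4)

# 30 second timeout: the first call is slow, the rest are warm.
generous = simulate_invocations(10.0, 0.2, 30.0, attempts=4)
```

With the tight timeout every entry in `tight` is a timeout, which is exactly the "never successfully completes" situation.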
Sure, but why would you do that? If you’re running Django, set up your EC2 instance, and run Django. If you’re running lambda, don’t spin up an entire framework in it to run for individual requests. That basically defeats the purpose. You’re not running a tiny little function, as you said. You’re running an entire web framework that’s running one tiny function. Instead of doing that, write an actual simple function that does the one thing it needs to do. It will take 200-300 ms to spin up and run the first time, as well as on requests when it scales up, but otherwise will run in 2ms. Keep it really simple and stateless. If some part of that doesn’t work for your use case, then don’t use lambda.
It was not meant to run an entire framework in. If you’re reaching for an entire framework to run a simple function in a lambda, I’d bail on one or the other. Don’t use a framework, or don’t use lambda.
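For what it's worth, this is roughly what I mean by "an actual simple function". It's a minimal sketch using only the standard library. The `handler(event, context)` signature is the usual Python Lambda entry point; the event shape (a JSON string under `body`) and the `prices` field are assumptions for illustration, modelled on a proxy-style HTTP integration.

```python
import json


def handler(event, context):
    """Stateless handler: sum the prices in the JSON request body."""
    try:
        payload = json.loads(event.get("body") or "{}")
        prices = [float(p) for p in payload.get("prices", [])]
    except (ValueError, TypeError, AttributeError):
        # Malformed JSON, a non-object body, or non-numeric prices.
        return {"statusCode": 400, "body": json.dumps({"error": "invalid input"})}
    return {"statusCode": 200, "body": json.dumps({"total": sum(prices)})}
```

No framework import, no global state, nothing to warm up beyond the interpreter itself.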
I mostly agree, but sometimes it's simply convenient to use the regular Django WSGI routines to serve parts of a Django app. The alternative there, if you want to serve stuff behind URLs, is using the API Gateway, and that is atrocious to work with. Also, whenever there's a discussion about lambda, people talk about "lock in", and the API Gateway is a much bigger lock-in than lambda.
Example: I have an API in Django which uses DRF, served on a classic web server instance behind load balancers etc. Parts of that API, which have the same authentication/business logic/etc, are much more suited to lambda, I'm sure it's not hard to imagine :)
Also, Lambda is just the glue code. You have to use many other AWS services that really lock you in deep, even if your functionality could "theoretically" be ported to different providers.
If there is a real need to port your infrastructure, you can do it in small steps. Using a single cloud provider is also not smart. If I need to do huge data processing, I will never do it with AWS, even though that is where the rest of the infrastructure is, as Google has better solutions. Just pick whatever is best.
You’re spot on. It’s really not good for any scenario where an occasional random long response time is a deal breaker. It gets particularly crappy when you have lambdas dependent on other lambdas. That being said, if an occasional long request is ok, it’s pretty great. It just depends on what you’re doing with it.
The advantage and disadvantage of lambda is that it decouples your economics from AWS's (or Azure's or GCP's) economics.
That is, the cost of idle time is shifted from your account to their account. This sets up a bit of a race: Amazon's goal is to have as few hidden warm instances for as little time as possible. The user's goal is to have as many hidden warm instances as possible for as long as possible.
This is where you start to see pinging to provoke AWS into keeping a warm copy. As time goes on they will have a growing incentive to prevent tricks like these from working.
At the moment, on the economics, hosted FaaSes will always have this tension. Self-hosted FaaSes open up the possibility of setting your own policy for trading latency against idle cost.
Disclosure: I am part of a team working on Project Riff, a FaaS.
> Self-hosted FaaSes open up the possibility of setting your own policy for trading latency against idle cost
If you already have a K8s cluster, it's probably cheaper to just create your application regularly and use it like that.
No need to use a FaaS which complicates your whole setup for no additional benefit.
Self-hosted isn't free either. You're still paying for resources when they're idle. The difference is that you can decide on whether to scale down to 1 or not.
As it happens Riff and most other FaaS projects are built on kubernetes.
I think we have different meanings of "idle". There's idle in the sense of "this resource is reserved but not in use" and idle in the sense of "this resource is not reserved and can be used elsewhere".
Most folk want cold starts for economic reasons and warm starts for performance reasons. In a hosted setting you can't truly set your policy, you can only try to encourage by external stimuli.
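A toy model of that trade-off, in case it helps. This is a sketch under loud assumptions, not how any real FaaS scheduler works: time is split into equal intervals, `demand` is the number of workers needed in each interval (made-up numbers), and anything above the warm floor is reclaimed at the end of every interval.

```python
def evaluate_policy(demand, min_warm):
    """Return (cold_starts, idle_slots) for a given warm floor.

    Assumption: workers above `min_warm` are thrown away after each
    interval, so every burst above the floor pays for cold starts again.
    """
    cold_starts = 0
    idle_slots = 0
    for needed in demand:
        cold_starts += max(0, needed - min_warm)  # latency cost
        idle_slots += max(0, min_warm - needed)   # money spent on idle workers
    return cold_starts, idle_slots


demand = [0, 0, 3, 1, 0, 2]  # hypothetical workers needed per interval

scale_to_zero = evaluate_policy(demand, min_warm=0)
scale_to_one = evaluate_policy(demand, min_warm=1)
scale_to_two = evaluate_policy(demand, min_warm=2)
```

Raising the floor swaps cold starts for idle slots. Self-hosted, you pick the floor; hosted, the provider picks it for you.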
IMHO serverless is a dangerous trend. It locks the developer to one of the large cloud vendors with a proprietary API, a problem I thought would have been solved with orchestration software (like k8s) that abstracts away service providers. It may be useful for minor quick-and-dirty trigger functions - but no way for business-level logic!
We decided to base our new platform on serverless - previously docker. Couldn't be any happier. Everything runs as expected and I don't need to worry about a ton of things.
People talk about vendor lock-in and all that. If you don't write things from scratch there is a vendor lock-in everywhere. Open source is also vendor lock-in. For example, we used to use xulrunner from Mozilla for some time and look where it is right now - nowhere.
I was also concerned initially with being locked into a particular platform, but then again, what I realised is that if your project is extremely successful you can always spend big to migrate when you have the funds to do it. As a startup, that is not the case. You start with nothing. So platforms like AWS are absolutely amazing because not too long ago you still had to buy your own datacenters.
> So platforms like AWS are absolutely amazing because not too long ago you still had to buy your own datacenters.
But just in terms of dollars, how much does it cost to own hardware? I have never worked in an "infrastructure" team, so I'm an outsider looking at it with no experience, but I hope someone will correct me.
Imagine a startup needs a 42U cabinet in four co locations. Each one costs a thousand to fifteen hundred dollars a month? Let's say that puts the colocation bill at $5k a month. Next, we need the actual computers and it seems this is the expensive part? I simply searched for servers. Just for kicks, I selected a 2U supermicro 2023US-TR4. With 2x epyc 7301 16 core processors, 16 x 4GB RAM, 2TB Intel P4600 SSD, 4TB Hitachi 7K6000 HDD, and a five year next business day on site service quotes about $10k.
I imagine you can fit up to 20 such computers in a full 42U cabinet. This means just buying the computers will set us back 4 * 20 * $10k? Spending almost $1M on admittedly low end server hardware sounds expensive.
Colocation cost over let's say five years will be about 5 * 12 * $5k or about $300k.
I guess just because you have space for 20 computers doesn't mean you have to fill them all up. Also, I guess you don't need to be in four locations.
If we scale down our ambitions to 10 computers each in two locations, initial cost is 2 * 10 * $10k or about $200k and colocation costs are 5 * 12 * < $2k? or about $100k? because we no longer need a full cabinet...
I know I'm making a lot of assumptions and I'm hand-waving a lot of important details, but this seems like the smallest possible cost of self hosting, no? About $300k over five years is about $5k per month. It won't make sense for everyone, but I guess if we have a thousand people paying us ten dollars a month, that pays for the hardware?
I welcome all corrections and insights. Like I said, I have no experience in these things.
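Here is the same back-of-envelope arithmetic as a small script, so the assumptions are easy to change. All figures are the rough guesses from above; for the scaled-down case I take the "< $2k" colocation figure as a flat $2k a month in total, which is my own simplification.

```python
SERVER_PRICE = 10_000  # quoted price per 2U server, USD
YEARS = 5
MONTHS = YEARS * 12


def five_year_cost(locations, servers_per_location, colo_per_month):
    """Return (hardware, colocation, total) in USD over five years."""
    hardware = locations * servers_per_location * SERVER_PRICE
    colocation = MONTHS * colo_per_month
    return hardware, colocation, hardware + colocation


# Four locations, 20 servers each, about $5k/month colocation in total.
full_build = five_year_cost(4, 20, 5_000)

# Two locations, 10 servers each, assumed $2k/month colocation in total.
small_build = five_year_cost(2, 10, 2_000)

monthly_cost = small_build[2] / MONTHS  # averaged over five years
monthly_revenue = 1_000 * 10            # a thousand users at $10/month
```

Using the full $2k a month, the small build comes out nearer $320k than $300k, so a bit over $5k a month, which still sits under the $10k a month of revenue in this hypothetical.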
You probably need a dedicated sysadmin as soon as you've got physical infrastructure that needs to stay online for your business to keep running - you honestly probably need this even if you're leasing dedicated hardware.
I used to work for a company that leased dedicated servers, and I talked to A LOT of customers that lost their business or lost a significant amount of their runway/savings/whatever due to mistakes caused by not having a real sysadmin around to handle their infrastructure. Long running issues they didn't have the experience to troubleshoot, bad (or nonexistent!) backups or restoration procedures, extended outages due to server related issues, bad update/firewall/other security practices, etc.
There are pitfalls with every option, but running your own infrastructure is a completely different skillset from developing an application, and is a professional trade in and of itself - you can try to force this on a developer and hope they have the time and expertise to handle all of these different things and run the risk that they don't, or you can tack on another ~100k/yr for a professional. And as you grow, so does the complexity, and maybe now your infrastructure spans a lot of racks, and you need network engineers, you need multiple sysadmins, etc, etc, etc.
You can't DevOps your way out of managing physical infrastructure
Lambda and dynamo are currently 600% cheaper than running our own instances. For us, we are making substantial savings. In fact, the biggest cost is not Lambda itself but dynamo because we had to create quite a few indexes. Frankly, these things can be avoided with better planning. That being said, it is still way cheaper in our case.
I’d argue that if you’re doing serverless right you should be abstracting away the particulars of the serverless provider and implementing your functions in a provider-agnostic way.
A good example would be using something like apex/gateway, which simply proxies lambda requests to golang standard http handlers, virtually eliminating vendor lock-in.
I think serverless has a place (yes even for business logic) but like anything you have to pick the right tool for the job.
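To illustrate the idea (in Python rather than Go, and not using apex/gateway itself), here's a minimal sketch: the business logic lives in one plain function, and thin adapters translate to and from each hosting environment. The names `greet`, `lambda_adapter`, and `wsgi_adapter` are made up for this example, and the Lambda event shape is an assumption modelled on a proxy-style HTTP event.

```python
import json
from http import HTTPStatus
from urllib.parse import parse_qs


def greet(params):
    """Provider-agnostic core: plain dict in, (status, payload) out."""
    name = params.get("name", "world")
    return 200, {"message": f"hello, {name}"}


def lambda_adapter(event, context):
    """Thin adapter for a Lambda-style (event, context) entry point."""
    status, payload = greet(event.get("queryStringParameters") or {})
    return {"statusCode": status, "body": json.dumps(payload)}


def wsgi_adapter(environ, start_response):
    """Thin adapter for any WSGI server (standard WSGI callable)."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    params = {key: values[0] for key, values in query.items()}
    status, payload = greet(params)
    body = json.dumps(payload).encode("utf-8")
    start_response(
        f"{status} {HTTPStatus(status).phrase}",
        [("Content-Type", "application/json"), ("Content-Length", str(len(body)))],
    )
    return [body]
```

Moving off the FaaS then means swapping the adapter, not rewriting `greet`.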
I disagree that it’s half the hassle. With serverless I don't have to keep the os’s updated and hardened, manage load balancing or scaling (yes I’m aware of the cold start latency cost). It does that all for me. I also don’t have to worry about my service going down. Yes, with good devops practices you could automate all of the above, but serverless is still easier imo.
I won’t argue that it’s cheaper because it’s not. Once you get to a certain point you should probably migrate onto real instances, which is why I think that the abstraction I talked about is important.
> With serverless I don't have to keep the os’s updated and hardened,
You can easily get this with CoreOS + docker images on GCE
> manage load balancing
Load balancers are one click since what, 2005?
> or scaling (yes I’m aware of the cold start latency cost).
Except you do. You run a 4 CPU database on the backend. How many lambdas can hit it at once before it dies? You need to manage this still.
Lambda is not the worst thing ever, but many many things using it would be far better off on something like kubernetes or a docker image behind a load balancer.
The example in the blog seems fairly straightforward -- low cost solution with minimal operations. Not sure why anyone would want to replace that with instances, unless they enjoy maintaining and patching large surface areas (or have an ops team that does it for them).
(You can even get ones that "autoscale" because the load is sharded on the hosters side, meaning the CGI will start up on a number of different servers - can be pretty high - when required, and only when required)
Implementations for pretty much every language are available.
Serverless does, of course, mean WAY less efficient (but you're charged for that by Amazon, so that's a plus, not a minus for cloud providers), far fewer language implementations than CGI, less support, closed source, much bigger memory usage, ...
Or even just having your provider distribute the load over a number of servers. The same FastCGI handlers will start up on many actual machines, providing a measure of scaling.
Sure, but then the burden is on you to build the Lua-specific tooling and framework around that. Pretty quickly it becomes a bespoke platform which, although fun to build, is rarely an economically sound investment.