IME the AWS GUI is not useful for all that much in a meaningful context. Thus this post is pretty misleading to a new person getting involved in cloud; it doesn’t qualify as “full” in my mind.
As a number of other comments have pointed out, there are packages that help with managing your infrastructure in code, which is critical in anything other than a play environment. Terraform, Serverless, CloudFormation, SAM, Zappa, etc. are all essentially requirements for cloud usage.
At Amazon they have tooling which makes it relatively easy to set up an AWS account to develop against; the tool primarily configures the billing to the correct team within Amazon. Additionally, your manager gets “rights” to the account, IAM accounts can be linked to AD, etc. This is all part of a push to get Amazon employees building infra on AWS proper. When you create the environment you have to denote whether it’s dev or production; in the former case it’s widely accepted that you use the GUI to set things up, but in the latter you’re strongly encouraged to use something like CloudFormation. It was always a shock to me that for prod accounts they don’t disable the GUI for anything other than viewing.
In my experience, while Lambda is capable of hosting micro-services, any coding pipeline that emphasizes continuous integration (CI) will be better served with one generic, monolithic Lambda function that dispatches to sub-modules behind the scenes.
However, this only scales up to Lambda's 50 MB (zipped) deployment package limit.
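The dispatch pattern described above can be sketched roughly like this; the routes, module functions, and event shape (API Gateway proxy integration) are illustrative assumptions, not taken from the article:

```python
# Minimal sketch of a single "monolithic" Lambda handler that dispatches
# to sub-modules by resource path and HTTP method. Names are hypothetical.

def list_users(event):
    # In a real app this would live in its own sub-module.
    return {"statusCode": 200, "body": "all users"}

def create_user(event):
    return {"statusCode": 201, "body": "created"}

# Route table: (resource, method) -> handler function
ROUTES = {
    ("/users", "GET"): list_users,
    ("/users", "POST"): create_user,
}

def handler(event, context):
    """Single entry point configured in Lambda; assumes the API Gateway
    proxy-integration event shape (resource + httpMethod keys)."""
    key = (event.get("resource"), event.get("httpMethod"))
    route = ROUTES.get(key)
    if route is None:
        return {"statusCode": 404, "body": "not found"}
    return route(event)
```

The appeal for CI is that every route ships and versions together as one artifact; the trade-off is that the whole bundle must stay under the package size limit.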
I am not sure whether 'Full Guide' in the title of the article made me smile, or whether it was the recollection of my experiences configuring AWS Lambda and AWS API Gateway in CloudFormation. There are a lot of things that have not been mentioned in this article, and they are what I am looking for, even trivial things that look easy at first glance, e.g.:
* How to configure caching and caching rules - note that there are several ways (query strings, headers, parameters, cache enabled only on a single method, etc.) to specify how responses are cached.
* Attaching your REST API to a custom domain as a regional endpoint (instead of edge-optimized, since regional endpoints are more configurable) - creating an ACM certificate (one per region, plus one for CloudFront), creating your own CloudFront distribution (think: multi-region deployment), adding a DNS record in Route53, and configuring WAF (some magic DDoS protection).
It took me a lot of time with CloudFormation to get to where I am today, and I would be grateful if someone shared the knowledge they gained on more sophisticated use cases than just 'Hello World'.
I was using SAM for a month or two and then switched back to CloudFormation, because I felt limited (not all features of API Gateway were implemented, duplicated stages, problems with intrinsic functions). However, I watch their GitHub repository for changes, and I noticed many missing features get implemented on an AWS re:Invent cadence (i.e. in the run-up to the next conference). The worth-noting feature of SAM is definitely aws-sam-cli (formerly aws-sam-local) [1], a tool that lets developers parse a SAM template and invoke a Lambda function in Docker on the local machine. It was great for testing simple APIs (start-api mode), but when an API started using custom authorizers, or the response was a compressed PNG image payload, it was not very helpful. Personally, I am working on a fork of aws-sam-cli to make it work with CloudFormation.
Yes, it adds new custom types, AWS::Serverless::* [1], and a new Globals section [2], and it is really pleasant to work with for basic, uncomplicated APIs. It is transformed into a CF template via samtranslator [3] (recently open-sourced), so it inherits a lot from CloudFormation, but that's also why I encountered differences between SAM and CF types.
Basically, caching allows you to cut your costs if you do it right. It is also worth noting that after 12 months API Gateway stops being free, and caching may decrease the amount on your bill. However, the caching cluster is billed hourly, so you must work out whether it will be worth using.
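As a rough back-of-envelope, you can compare the cache cluster's fixed hourly cost against what cache hits save you. All prices in this sketch are illustrative assumptions, not current AWS pricing:

```python
# Back-of-envelope break-even for an API Gateway cache cluster.
# All prices below are ILLUSTRATIVE assumptions, not current AWS pricing.

HOURS_PER_MONTH = 730

def cache_break_even(requests_per_month,
                     hit_ratio,
                     price_per_million_requests,
                     cache_price_per_hour):
    """Return (monthly cache cost, monthly savings from cache hits)."""
    cache_cost = cache_price_per_hour * HOURS_PER_MONTH
    # Assume each cache hit avoids backend work (e.g. a Lambda invocation)
    # priced here per million requests.
    saved = (requests_per_month * hit_ratio / 1_000_000) * price_per_million_requests
    return cache_cost, saved

cost, saved = cache_break_even(
    requests_per_month=50_000_000,
    hit_ratio=0.6,
    price_per_million_requests=3.50,   # assumed; check current pricing
    cache_price_per_hour=0.02,         # assumed; check current pricing
)
print(f"cache: ${cost:.2f}/mo, savings: ${saved:.2f}/mo, worth it: {saved > cost}")
```

The point of the exercise: at low traffic or a poor hit ratio, the hourly cluster fee dominates and caching costs you money rather than saving it.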
Personally, I can't imagine providing aaaaaaaaa.execute-api.eu-west-1.amazonaws.com/v1/transactions/XXXX/cancel as an API endpoint in the documentation of some product. Well... it'd definitely make my day to see such an endpoint in, say, the Stripe docs. Everyone uses a short address like api.example.com, so there is definitely a use for custom domain names as a feature of API Gateway. Custom domains allow you to separate an API by base paths (e.g. Transactions API, Notifications API, Refunds API), which makes it easier to upgrade, shut down... basically, maintain your API. Also, if you serve requests from two AWS regions and you care about latency, regional endpoints might be way better. Of course, if you use them with WAF and CloudFront you have a lot of things to configure, which for some companies might be important.
Going through the motions of setting up my first API Gateway and Lambda was eye opening to me.
There are a ton of options in these two services — rate limiting, authorizers, concurrency, monitoring, etc. — that point to how much power you gain building an app like this.
If you’d like to see an automated, infrastructure-as-code approach to setting up an API, check out my boilerplate app:
I wonder if I'm missing something. Given 10 minutes of load for pre-warming, I was getting 95th-percentile times of 1.6 seconds with a bare-minimum Node Lambda hello-world function. I threw 1,000 concurrent requests at it across 4 threads, with a 1,000 concurrent execution limit configured on AWS.
I was really excited about lambda, but that is flat out terrible performance for a lot of users. Is this not an issue for others? Why not?
Given that, I wonder why people keep writing about it? I get the appeal for an internal app or a background job, but for a public facing rest/graphql api? Seems like the performance isn't there.
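For anyone trying to reproduce this kind of measurement, here is how a 95th percentile can be computed from collected latencies; the sample numbers below are made up, not the commenter's data:

```python
# Compute the 95th-percentile latency from a list of measured request
# times. The sample latencies here are made up for illustration.
import statistics

def p95(latencies_ms):
    # quantiles(n=20) returns 19 cut points splitting the data into
    # 20 equal buckets; the 19th cut point is the 95th percentile.
    return statistics.quantiles(latencies_ms, n=20)[18]

# Hypothetical samples: mostly ~100-150 ms, with a few cold-start spikes.
samples = [120, 130, 95, 1600, 140, 110, 105, 125, 1500, 135,
           100, 115, 145, 128, 132, 98, 122, 1580, 138, 112]
print(f"p95 = {p95(samples):.0f} ms")
```

Note how a small fraction of cold starts is enough to drag p95 far above the median, which is exactly the complaint above.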
If using NodeJS, there is a package very similar to Zappa called Serverless: https://serverless.com/
One thing I would improve about the author's article is that he crammed a giant conditional statement on the http-method inside his one function. Serverless makes it easy to bundle a single function per HTTP method and even share common code between them. You can build out a whole application as a single npm package.
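The per-method layout might look like the following sketch (shown in Python rather than Node to match the rest of the thread; all file and function names are hypothetical, and the serverless.yml wiring is only described in comments):

```python
# Sketch of the "one handler per HTTP method" layout the Serverless
# Framework encourages: each function is wired to a single route in
# serverless.yml, while shared code lives in a common module.
# All names here are hypothetical.
import json

# --- common.py: code shared by every handler -------------------------
def response(status, body):
    return {"statusCode": status,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps(body)}

# --- get_user.py: wired to `GET /users/{id}` in serverless.yml -------
def get_user(event, context):
    user_id = event["pathParameters"]["id"]
    return response(200, {"id": user_id})

# --- create_user.py: wired to `POST /users` in serverless.yml --------
def create_user(event, context):
    payload = json.loads(event["body"])
    return response(201, {"created": payload["name"]})
```

Each handler stays small and independently deployable, and the shared `response` helper replaces the giant method conditional.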
Bonuses include plugins for Serverless that let you unit-test your handlers, run a simulated APIG/Lambda/Dynamo environment locally for development, and plugins that let you interact more deeply with AWS, like assigning a custom domain to your APIG.
Serverless Framework is also great for Python if you're not porting a WSGI application. Plug/disclaimer: I'm the creator/maintainer of the de facto Python dependency packaging plugin for Serverless[0]
I'm currently working on a project where -- in serverless.yml -- a single API gateway path for all HTTP methods leads to a Flask app... largely because this scheme seems much much easier and faster to test and develop locally.
Are there any particular benefits to exporting and reimplementing all of that routing and organizational logic into instructions that "program" the API Gateway instead?
In this particular case, the API is solely for external consumption by a third party
To me, the beautiful thing about this Zappa approach is that I still follow the spec of a WSGI-compliant app, and I get the benefit of quick deployment to Lambda without any vendor lock-in, since it's also straightforward to plug the same app into, say, nginx, without code changes.
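The portability point can be illustrated with a bare WSGI callable: the same `app` could be handed to Zappa for Lambda or served by any WSGI server, and below we drive it directly the way a server would. This is a minimal sketch of the WSGI contract, not Zappa's API:

```python
# A minimal WSGI application. The same `app` callable could be deployed
# to Lambda via Zappa or served by nginx/uwsgi/gunicorn; here we invoke
# it directly, as any WSGI server would, to show there is no lock-in.
def app(environ, start_response):
    body = b"hello from a plain WSGI app"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]

def call(wsgi_app, path="/"):
    """Drive a WSGI app with a fake environ, like a server adapter does."""
    captured = {}
    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = headers
    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}
    body = b"".join(wsgi_app(environ, start_response))
    return captured["status"], body

status, body = call(app)
print(status, body)
```

Because every deployment target speaks the same environ/start_response contract, swapping Lambda for nginx is a packaging change, not a code change.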
What is the group's thought on running these containers with access to (for example) a Postgres or Redis DB?
Sure, Zappa keeps the “container-function-thing” alive so you can cache it as a global variable, but this is a hacky approach whose consequences I'm not clear enough on to use.
The most obvious limitation is the scale-out concurrency model of Lambda (serial requests within a container). This is great for simplicity but means you can't have a connection pool.
Something like DynamoDB is a better fit if it works for you; otherwise most people seem to be forced to run a few servers with PgBouncer.
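The "cache the connection as a global" pattern the parent comments describe can be sketched like this; sqlite3 (stdlib) stands in for Postgres so the example is self-contained, and the point is the reuse pattern, not the database:

```python
# Sketch of caching a DB connection outside the handler so warm
# invocations of the same container reuse it. sqlite3 stands in for
# Postgres here; with Postgres you'd open the connection with a client
# library instead. Because a Lambda container handles requests serially,
# one connection per container suffices; a pool buys nothing here.
import sqlite3

_conn = None  # module-level: survives across warm invocations

def get_conn():
    global _conn
    if _conn is None:  # cold start: open the connection once
        _conn = sqlite3.connect(":memory:")
    return _conn

def handler(event, context):
    conn = get_conn()  # warm invocations reuse the cached connection
    (answer,) = conn.execute("SELECT 1 + 1").fetchone()
    return {"statusCode": 200, "body": str(answer)}
```

The catch the parent comments point out: each concurrently scaled-out container holds its own connection, so a traffic spike can exhaust the database's connection limit, which is where PgBouncer (or DynamoDB) comes in.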
There is no remark that an admin IAM user should not be used. This is highly critical and should never be done.
Also, using something like the Serverless Framework is better suited for the average developer to get this setup up and running. Clicking through the AWS console is not something professional AWS users do often (exceptions would be CloudWatch metrics and logs). The article should at least mention that infrastructure is better managed in code.
Is there a use case for AWS API Gateway if one is not using Lambda? Can it protect a web application against DDoS?
I am not convinced API Gateway + Lambda can substitute for a web application, at least for J2EE apps. The Lambda "boot" time is way too high unless we resort to tricks to keep the Lambda instance active. I find the extra $30 or so is well spent on AWS Elastic Beanstalk with auto scaling. The packaging and deployment is cleaner and simpler.
Yes, you can use API Gateway without Lambda. "API Gateway" is an industry term and not specific to Amazon. The point is exactly your line of reasoning: a funnel/gateway to your APIs. You can handle auth, throttling, etc. at that point. AWS does provide other products for use against DDoS: WAF, Shield, ELB, CloudFront, VPCs/SGs. Their idea is to be able to scale, deflect if possible, and absorb the attack. They also mention that with AWS API Gateway you get Layer 7 and Layer 3 DDoS protection and throttling for your backend. I've no practical experience, nor have I read any reports on how it holds up.
Might AWS Lambda behind AWS API Gateway (with caching, burst and rate limiting, custom authorizers) behind AWS CloudFront behind AWS WAF and/or CloudFlare work for you?
What is your issue with keeping a Lambda instance alive? They mostly stay in standby mode for some minutes, listening for requests, and then shut down if they stay idle - which means the next request will boot a Lambda function (container?; it usually takes ~1 second); after that, the time to receive a response is approx. ~100ms.
After seriously kicking the tires on AWS serverless, I still prefer running VMs as you do. Beanstalk is awesome; I like Heroku too. Other than the db, it's not too hard a move to Beanstalk if it gets popular.
The cold starts are really bad for the 95th-percentile latency, and you can't use a relational db with decent performance and throughput until Aurora Serverless rolls out.
It provides a fixed contract to the outside world that is independent of your code-level infrastructure, so both parties (provider and consumer) are required to conform to it as an integration. In that sense I find it a compelling thing.