Support for tagging of Lambda functions and for the Python 3.6 runtime (amazon.com)
183 points by muramira on April 18, 2017 | 66 comments



This is awesome. If you're deploying stuff with Python on Lambda, I highly recommend checking out Chalice (https://github.com/awslabs/chalice) and also an extension to it that I wrote for using Lambdas on SNS/S3/Cloudwatch events or scheduled events - https://github.com/kislyuk/domovoi (which I now need to go update to support this as well).


Zappa can do all of that and more out of the box:

https://github.com/Miserlou/Zappa

Comparison: https://blog.zappa.io/posts/comparison-zappa-verus-chalice

It also lets you build fully-fledged event-driven apps with a single line of code: https://blog.zappa.io/posts/zappa-introduces-seamless-asynch...


Zappa is amazing. I am currently using it to make an Alexa app.


AWS now has SAM, which I hadn't heard of before. It's an extension of CloudFormation specifically for deploying serverless code to Lambda, building API Gateways, etc. Their command line tool is pretty slick; I'm wondering if these other third party frameworks are needed anymore.


SAM solves a different set of problems from the frameworks like Chalice. It does not address the app request/response API, deployment package building process and dependency bundling, the test-development cycle, routing and other API Gateway configuration specifics.

Also, Chalice is a first party framework in that it's led by an AWS affiliated developer.


Not just an affiliated developer, one of the developers of the python SDK. It's in an official aws repo too (awslabs).


Chalice locks you into using AWS services - Zappa is cross-platform.[1] Also, lambda can get very, very expensive [2][3], just FYI.

1. https://blog.zappa.io/posts/comparison-zappa-verus-chalice

2. https://www.reddit.com/r/Python/comments/4hebys/cost_analysi...

3. https://news.ycombinator.com/item?id=14075634


Re: #3 – that's a huge claim to toss around without showing the numbers you used including the full cost to manage your own VMs. Yes, Lambda can get more expensive but staff time is also significant and many tasks don't get enough volume to outweigh saving even a developer hour or two per month.


Finally!! Thank you Amazon.

... I know what I'm doing tomorrow.

Edit: I know what I'm doing today.


Finally, Zappa (https://github.com/Miserlou/Zappa) will be able to support web services written in Python 3!


Yep!

Follow the progress here: https://github.com/Miserlou/Zappa/issues/793


I much prefer Chalice (https://github.com/awslabs/chalice) which just got a synchronized release supporting Python 3 as well.


Have you tried both? Can you explain what you prefer in Chalice over Zappa?


Yes. In my impression Chalice is designed to be a microframework (emphasis on micro) with the bare minimum of glue necessary to make deploying Lambda apps easy. Zappa is designed around an ill-fitting API (while WSGI is widespread, it's clunky and a layering violation of sorts when put in the Lambda runtime environment) and has a development philosophy that I don't agree with.


I wrote up a comparison of the two here: https://blog.zappa.io/posts/comparison-zappa-verus-chalice

Zappa has even more features now. Basically, you can't even build real server-less apps with Chalice, plus Zappa has support for tons of other magic features.


I stopped using Chalice in favor of Zappa because Zappa made it trivial to run a local dev environment (it's just Flask/Django/WSGI). Have I missed something about this with Chalice?


Chalice does provide a local mode, yes. It worked well for me.


Yeah, but in what region? As a European I've been disappointed before by the lack of Python support in European regions.

Edit: Just noticed in my AWS account that there's 3.6 in Europe. This will get me away from GCP.


Is there a particular reason for getting away from GCP? I am just curious.


Grass is greener and I already have two major projects that require me to use python 2, excl. ansible modules, so I don't want more.

I want to use european regions for my testing, not us.

Other than that I find the GCP interface easier to work with, when it's working.

I'm eager to try AWS.


Would like to know that as well :)


The page says "NOTE: These features will become available in all AWS Lambda regions within 24 hours."


No CloudFormation support for tagging and embedding python3.6 code in the templates - they really do their best to push us away from using CloudFormation!


CloudFormation has developed support for Lambda tagging, Python3.6 and Node6.10 inline code and it will be part of our next release. Keep an eye on http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuid... for more details.


They do get their tags automagically from the CF stack, right? (haven't tested it myself, but that's usually what happens)


No, they don't. You can't even specify template-level tags in the template.

I think they have very incapable developers on the CloudFormation project. This could have been a game changer, but it's been a source of pain.

For example, they introduced YAML and !Sub, but you can't nest tags, yet !ImportValue in many cases needs a nested !Sub. Also, you can't have "$", "{", and "}" characters in the exported name, but they didn't add string templates to functions such as !ImportValue. Total nonsense!

Also, as you've assumed logically, all stack resources need to inherit the tags of the owner stack, but no, you have to do tons of copypasta!

Last, but not least - it's all designed so that the templates are stored on S3, but most people use source control. Their other services already support Git - Elastic Beanstalk, CodeBuild, CodePipeline, etc. Why don't they allow Git-hosted templates?!

Anyway, when I see the complexity of my templates to have a basic Magento infrastructure running in VPC, which I've been working on, it's very disgusting. Lots of manual steps if you don't want to have a monolithic template, lots of CLI, and build steps. This is not how things like these should be implemented in 2017!

Lastly, they introduced CloudFormation exports. Okay, decent feature, but not in the real world! If you refactor your infrastructure, it becomes a huge pain, as you cannot delete exports for some reason - they belong to the stack. So, if I decide to rename or split an export, I need an intermediate step that duplicates the old and the new exports, then I update all importing stacks to use the new values, and so on. Most AWS resources have "retain" capability - S3 buckets, ECRs, Route 53 records, etc. - CloudFormation exports don't! Honestly, they need to put some more experienced and brighter developers on the team!


Good. Use Terraform instead :)


Can Terraform do that?


It was strange that it ever supported Python 2 given how recent the AWS Lambda Python product is - they could have avoided the support burden of Python 2 entirely.


Thank you AWS Team!! Can now look forward to moving ahead with python3 dev fully


Maybe tangential but how do people feel about implementing a typical RESTful database API in Lambda & API Gateway?

Most of the prototypical examples of Lambda I see are for things like data processing pipelines. I know in theory Lambda should be able to handle just about any kind of request from a browser or mobile app short of a websocket connection, and Amazon does have some sample code and a brief case study on their site. But I'm wondering if it's really ready for this or if people have experiences going near-100% serverless for their apps.


I find it to be an antipattern.

Lambda excels at taking in arbitrary amounts of long-running jobs and feeding you its output. For example: Upload a png image to convert it to jpg. Zip a directory of S3 objects. Etc.

Lambda gets very costly and inconvenient when you're just taking in requests you could handle by a couple of load-balanced web servers. The whole "running a whole website on Lambda" craze does not actually yield any benefit and is more complex, harder to play with than a simple ec2 instance (which, with a good setup, needs very little "server" management at all).

Also, API Gateway is just horribly inflexible imho.


I've heard API Gateway is painful to configure but that frameworks like serverless.com help a lot.

But I'm surprised to hear that Lambda gets costly. Is this from real experience or is it just theoretical? My impression was that Lambda saves you money by not having to pay for excess capacity. But I haven't done the math. I'll admit, it's also appealing to not even have to worry about configuring a web server cluster to scale up and down.


Lambda costs are two-fold: you pay a set cost per request as well as a cost per 100 ms chunk of execution time.
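For anyone who wants to plug in their own numbers, here is a rough sketch of that two-part pricing model. The prices are assumptions (the published 2017-era rates of $0.20 per million requests and $0.00001667 per GB-second, free tier ignored), and `lambda_monthly_cost` is a made-up helper, not an AWS API.

```python
import math

# Assumed list prices (2017-era, free tier ignored); check current pricing.
PRICE_PER_REQUEST = 0.20 / 1_000_000   # USD per invocation
PRICE_PER_GB_SECOND = 0.00001667       # USD per GB-second of billed duration


def lambda_monthly_cost(requests, avg_duration_ms, memory_mb):
    """Estimate a monthly bill from request count, duration and memory size."""
    # Duration is billed in 100 ms chunks, rounded up.
    billed_ms = math.ceil(avg_duration_ms / 100) * 100
    # Compute charge scales with configured memory, not memory actually used.
    gb_seconds = requests * (billed_ms / 1000) * (memory_mb / 1024)
    return requests * PRICE_PER_REQUEST + gb_seconds * PRICE_PER_GB_SECOND
```

For example, `lambda_monthly_cost(1_000_000, 100, 1024)` models one million 100 ms invocations at 1 GB. It only covers Lambda itself, not the surrounding services.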

On top of this, you can't increase one of CPU or memory allocation without increasing the other. This means if you're very memory-efficient and CPU-bound, you'll be eating extra runtime costs. You also kind of end up using the entire Amazon toolchain, which has costs embedded in every single bit of it. SNS, SQS, API Gateway, S3 requests, S3 network out, etc. all have costs.

And here's the thing: Lambda has a ton of layering on top of it, which you wouldn't have in an EC2 environment where you have full control. You can't optimize Lambda, you can optimize EC2.

My company is currently paying $4k/mo in Lambda costs, parsing log files in Python into XML and doing an S3 call at a peak of 40 requests / second. Back of the envelope, we can probably get this down to <$300/mo by overprovisioning a few m4.large instances. But now there's the question of having to manage a processing queue, reprocessing, etc so it's hard to tell how much would actually be saved. (On top of that, if a box goes down, that's a significant chunk of processing unable to be taken care of; with Lambda, that doesn't happen).

All in all, Lambda has been excellent for getting us started, but there's a point where we definitely want to investigate a dedicated system we have full control of. Lack of Python 3 support was the #1 reason I wanted to do that, so now there's a bit less motivation - it's a lot of work.

I'll certainly write a blog post about it if we decide to move our main processing off Lambda.

Edit: This looks like it has a lot of interesting numbers. https://www.reddit.com/r/Python/comments/4hebys/cost_analysi...


What sort of logs and processing are you doing? Do you mean 40 log entries per second, or 40 calls against your log processing service/infrastructure?


~40 500kb logs per second. The logs are Hearthstone games.


API Gateway needs less config since they released the proxy style integration. In Lambda's case this means you can just return a JSON object like:

{ "statusCode": 200, "headers": {}, "body": "<h1>Hello!</h1>" }

And API Gateway knows exactly what to do. Before this, yes there was some extra config and it was not fun.
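In Python, a handler for the proxy integration could look roughly like the sketch below. The name `handler` and the `Content-Type` header are illustrative choices, not anything the integration requires.

```python
def handler(event, context):
    """Minimal Lambda handler for an API Gateway proxy integration.

    With proxy integration, API Gateway expects a dict with statusCode,
    headers and a string body, and turns it into the HTTP response.
    """
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "text/html"},  # assumed header for illustration
        "body": "<h1>Hello!</h1>",
    }
```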

PS. Our AWS bill went down when we moved everything to Lambda but that was not the motivating factor for the switch


Is it possible that perhaps your EC2 instances were bigger than they should be, and perhaps to make Lambda work in a satisfactory manner you also placed CloudFront in front of it?

Because those sites you mentioned (bustle.com and romper.com) look like something that could benefit a great deal from using a CDN, which would then drastically reduce the need for large instances.


> I'm surprised to hear that Lambda gets costly. Is this from real experience or is it just theoretical? My impression was that Lambda saves you money by not having to pay for excess capacity.

I delve into this in detail here:

https://news.ycombinator.com/item?id=14075634

In short, using Amazon's own pricing example, yes, it's extremely expensive for production web app workloads compared to just running an autoscaling group with 10 (!) nodes on ELB/ALB, and the pricing disparity increases as load increases.

With that said, Lambda is a great fit for little tasks that will never need a whole server.


We did this at CloudSploit. Our entire service is written using Lambda-based APIs behind the API Gateway. The biggest challenge at first was the cold boot times. If our API went un-pinged for more than five minutes the next request would take > 5 seconds (!). This was solved with a CloudWatch event that invokes the function every 4 minutes (which is silly in my mind, but it works).

You can read more about our process here: https://blog.cloudsploit.com/we-made-the-whole-company-serve...
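A minimal sketch of the keep-warm trick described above, assuming the ping arrives as a scheduled CloudWatch event (those carry `"source": "aws.events"`). The `is_warmup_ping` helper and the return values are made up for illustration, not taken from the linked write-up.

```python
def is_warmup_ping(event):
    """Return True for scheduled CloudWatch events used only to keep the function warm."""
    return isinstance(event, dict) and event.get("source") == "aws.events"


def handler(event, context):
    if is_warmup_ping(event):
        # Return right away so the ping costs one short billed chunk.
        return {"warmed": True}
    # Placeholder for the real request handling.
    return {"statusCode": 200, "headers": {}, "body": "real work happens here"}
```

Returning early keeps each ping cheap while still keeping a container around for real requests.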


I've built a complete web application that is entirely serverless except for the database - I have an instance running Postgres. This is because Postgres is the only database I want to use, because it always meets my needs.

Everything else is serverless however.

My application has:

-- AWS Cognito as the user management and auth

-- AWS Cloudfront serving a Reactjs front end as static files

-- AWS EC2 running Postgres as the database

-- AWS Lambda Python functions for the back end

-- AWS SES serverless email inbound and outbound processing

-- AWS SQS for email processing coordination

Oh I do in fact have another server which does spam checking of emails, PDF conversion of emails, text extraction of emails and parsing of emails - this works best on a server rather than serverless.

The most interesting thing I found along the way is that the API gateway (which I liked a whole lot by the way, and found to be powerful and easy to configure) is completely unnecessary. My application simply directly calls the Lambda functions to get and set the data that it needs - dramatic simplification.

Of further note is that I only have one Lambda function for the entire back end - this further reduces the need for layers of APIs and parameters. All my Python code goes into the one function which is structured as a complete application. An additional benefit of this is that AWS Lambda can be slow to first run a function unless it is "warm". If you have lots and lots of small Lambda functions then any given function is less likely to be warm. With everything in the one Lambda function, then all parts of my Lambda function are more likely to be warm.

So no API gateway completely rips out an entire layer of complexity, and then only one single Lambda function rips out another layer of complexity. It's very nice to be able to write front end functions in ReactJS that just call the back end function that they need. Making changes means I just change the functions and don't need to fiddle with all sorts of REST API layers or anything to accommodate the change.
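To make the one-function idea concrete, here is a hedged sketch of how a single handler might dispatch on an `action` field sent by the front end. The event shape, the `action` decorator and `get_user` are all hypothetical; the real application may be structured quite differently.

```python
ACTIONS = {}  # maps action name -> function, filled by the decorator below


def action(name):
    """Register a function under an action name (hypothetical convention)."""
    def register(fn):
        ACTIONS[name] = fn
        return fn
    return register


@action("get_user")
def get_user(params):
    # Placeholder for a real database lookup.
    return {"id": params["id"], "name": "example"}


def handler(event, context):
    """Single Lambda entry point: look up the requested action and run it."""
    fn = ACTIONS.get(event.get("action"))
    if fn is None:
        return {"ok": False, "error": "unknown action"}
    return {"ok": True, "result": fn(event.get("params", {}))}
```

The front end would then invoke the one function with a payload such as `{"action": "get_user", "params": {"id": 7}}`, and adding a back end feature means registering one more function.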

I started with node.js as the Lambda back end but switched to Python because personally I find that Python with its synchronous programming model is much easier to reason about for back end stuff. I'm more than happy using ES2015 at the front end for ReactJS.


Is this a hobby project? What ballpark are your hosting costs? I left AWS for Heroku after they wanted over $50/mo for postgres+route53+ec2 small; if Heroku got expensive for me I'd hit something like Linode/DO before AWS at this point.

As a reference, this is for non-revenue generating hobby sites/projects.


Well it's intended to make money, although not finished yet.

So I don't know what the costs are, but I don't imagine they're much.


That's interesting but I was asking about a RESTful service. I thought API Gateway is the only way to invoke Lambdas over HTTP.

Incidentally, AWS Aurora now supports Postgres.



If you go to www.bustle.com or www.romper.com you get a JS app server side rendered by AWS Lambda and delivered through API Gateway (CDN in front of that). We do 50+ million unique visitors a month. Our team is super happy with the result and our AWS bill is smaller than it was on EC2. Cost was not a motivating factor for the switch but it is nice when the bill goes down.

It is most definitely ready. I gave a talk at Node Interactive last year that has some more detail https://www.youtube.com/watch?v=c4rvh_Iq6LE&index=2&list=PLf...


Very cool. At your scale have you noticed any problems with cold start latencies?

I know cold starts are an issue when services have a lull in usage, and that you can work around it. But I'm curious whether cold starts also happen when there's a surge in demand and AWS needs to spin up more containers or whatever it uses to host Lambdas.


We haven't seen any issues. Most of our latency critical functions are called a lot. We do get significant spikes in traffic but our CDN does handle some of that load.


I would be very interested in the answer to this question. If it can surpass Flask in terms of simplicity in creating a simple REST API it would be a major milestone.


Uh, the API Gateway can't handle binary requests/responses very well and it certainly cannot handle multipart requests/responses. Additionally it has a 30 second timeout so I would say it's definitely not something you could just replace your hosted web server with completely yet.


Looks like it does handle binary now, if not multipart yet.[1]

Granted the 30 second timeout is a constraint, but how bad a constraint is it? Ideally long-running requests like that should be rearchitected to return fast and deliver the results asynchronously, right? The bigger problem I see is the lack of websockets support, which makes delivering async results harder. Supposedly AWS IoT does it but that seems like an even more exotic usage than implementing a REST service in Lambda.

Basically I get that it's still early days and there are gaps here and there, but am wondering if it's actually an "antipattern" to use Lambda this way or just a little early.

[1] https://aws.amazon.com/about-aws/whats-new/2016/11/binary-da...
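The "return fast and deliver the results asynchronously" idea could be sketched as a submit/poll pair like the one below. Everything here is an assumption for illustration: the in-memory `JOBS` dict stands in for a durable store (Lambda memory does not reliably persist between invocations), and `work` stands in for whatever runs the slow part out of band.

```python
import json
import uuid

JOBS = {}  # stand-in for a durable store; NOT suitable for real Lambda use


def submit(event, context):
    """Accept the request, record a pending job and return immediately."""
    job_id = str(uuid.uuid4())
    JOBS[job_id] = {"status": "pending", "input": event.get("input"), "result": None}
    return {"statusCode": 202, "body": json.dumps({"job_id": job_id})}


def work(job_id):
    """The slow part, run outside the request/response cycle."""
    job = JOBS[job_id]
    job["result"] = str(job["input"]).upper()  # placeholder for real work
    job["status"] = "done"


def status(event, context):
    """Polling endpoint: report whether the job has finished."""
    job = JOBS.get(event.get("job_id"))
    if job is None:
        return {"statusCode": 404, "body": json.dumps({"error": "unknown job"})}
    return {"statusCode": 200,
            "body": json.dumps({"status": job["status"], "result": job["result"]})}
```

Polling avoids the 30 second limit without websockets, at the cost of extra requests and a slightly more awkward client.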


It handles binary, but not well. http://forum.serverless.com/t/returning-binary-data-jpg-from...

> Ideally long-running requests like that should be rearchitected to return fast and deliver the results asynchronously, right?

Ideally for whom? I can't think of an engineer who wouldn't rather do a simple request/response. It's certainly cheaper than spending the money to keep a connection open longer than 30 seconds, but far from ideal.


>Ideally for whom?

Anyone running on Heroku works with a hard-coded 30 second timeout on all web requests. It works fine for everyone there.


> It works fine

You must have a different definition of ideal. I'm sure no one using Heroku would be upset if that limit was removed.


We did a write-up last year about combining our WebSocket gateway with a FaaS backend: http://blog.fanout.io/2016/09/29/serverless-websocket-chat/

The FaaS used in the article is Microcule, but the same approach should work with Lambda now that API Gateway supports binary.


I'm running/wrote a Flask-RESTful app on Lambda via Serverless with the wsgi plugin behind API Gateway. Not a huge fan of Python, but it's all working well enough.


CloudFormation has developed support for Lambda tagging, Python3.6 and Node6.10 inline code and it will be part of our next release. Keep an eye on https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGui... for more details.


What about Python 3.6 support for ElasticBeanstalk?


At least you can use the Docker environment type and deploy your own image with Python+your code.


If you want to spend multiple weeks rewriting your deployment process. Elastic Beanstalk promised a new Python image a few months ago, and now that Django 2.0 is going to drop support for 3.4 by the end of the year it's getting to be time for them to deliver on that.


Well this will drastically clean up some code I've just written! I was using 2.7 for the lambda code and 3.x for the rest of the processing.


It's a shame boto3 doesn't support async...


Tangent: anyone find a satisfactory way to do blue/green or canary deployments with Lambda?


Create a new alias with a unique ID for each lambda deployment. Then manage your application to point at the proper alias. Roll out just becomes updating the apps however you update them normally.
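A rough sketch of the alias-per-deployment approach, assuming a boto3 Lambda client is passed in (`publish_version` and `create_alias` are real client methods; the `deploy-` prefix and the helper name are made up):

```python
def deploy_with_alias(lambda_client, function_name, deploy_id):
    """Publish the current code as a new version and give it its own alias."""
    # publish_version snapshots $LATEST and returns the new version number.
    version = lambda_client.publish_version(FunctionName=function_name)["Version"]
    alias_name = "deploy-{}".format(deploy_id)  # hypothetical naming scheme
    lambda_client.create_alias(
        FunctionName=function_name,
        Name=alias_name,
        FunctionVersion=version,
    )
    return alias_name, version
```

With boto3 this would be called as `deploy_with_alias(boto3.client("lambda"), "my-function", "2017-04-18-1")`, where the function name and ID are placeholders. Callers then reference the alias, so rolling forward or back is a matter of changing which alias the application uses.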


Don't bury the lede; we get tagging too!


Thanks, we updated the title from “AWS introduces lambda support for python3.6” a little while ago.



