New for AWS Lambda – Environment Variables and Serverless Application Model (amazon.com)
264 points by luhn on Nov 20, 2016 | 83 comments



The environment variables thing was desperately needed; I raised this as a feature request with AWS over a year ago (my work has enterprise support).

I remember them telling us that the way to make a Lambda function distinguish whether it is in a DEV/TEST or PROD environment was to do some sort of regex on the function name, which was suboptimal, especially if you have Lambdas created via CloudFormation.

We "got around" this problem by creating tables in DynamoDB as our DEV/PROD environments are in separate AWS accounts, so the dynamo tables contained simple key/value pairs that you would read once when the lambda container started up. Another option would be to have a file in S3 or something, but you still have to write code to manage the retrieval of those resources.

Looking forward to dropping all that infrastructure and using this feature instead.
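
For illustration, a minimal Python sketch of both patterns (the table, key, and bucket names are all hypothetical): the old read-config-from-DynamoDB-at-container-start workaround, and the new environment variables.

    import os
    import boto3

    # Old workaround: read key/value config once at container start.
    # Module-level state survives across warm invocations.
    _table = boto3.resource("dynamodb").Table("lambda-config")  # hypothetical table
    CONFIG = {item["key"]: item["value"] for item in _table.scan()["Items"]}

    # New approach: plain environment variables set on the function itself.
    OUTPUT_BUCKET = os.environ.get("OUTPUT_BUCKET", "my-dev-bucket")

    def handler(event, context):
        return {"bucket": OUTPUT_BUCKET}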


Did you happen to raise a feature request for Python 3 as well? With any luck, we'll get that before Python 4 is released... :)


We primarily use Java 8, but I'd imagine a lot of people have raised it


Why aren't you using aliases[1] for denoting dev, prod, etc?

[1] http://docs.aws.amazon.com/lambda/latest/dg/versioning-alias...


This was before AWS introduced aliases and versioning; think back to summer 2015.

Aliases are still a half-solution to this problem, though: you would still need logic in your code to say "if alias == PROD then ... else ..." for configuration, something I'd rather do without.

EDIT: by configuration I mean configuration that's specific to an environment. Imagine you had a Lambda that wrote data to an S3 bucket, you wouldn't want to accidentally mix TEST/LIVE data in the same location, so you would use a configuration property to inform the Lambda where to write.
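
As a rough Python sketch of that branching (the alias, when present, is the last segment of the invoked ARN; the bucket names are made up):

    def handler(event, context):
        # context.invoked_function_arn looks like:
        # arn:aws:lambda:us-east-1:123456789012:function:my-func:PROD
        parts = context.invoked_function_arn.split(":")
        alias = parts[7] if len(parts) > 7 else "DEV"
        bucket = "live-data" if alias == "PROD" else "test-data"  # made up
        ...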


Thanks for the explanation, that makes sense. We wanted to see aliases as well, because it's cumbersome setting a client ID for our Lambda monitoring service.


My team has been using Lambda/Dynamo/API Gateway/Cognito for a couple of web services this year and really loving it. The addition of CloudWatch Events to the mix was really the final piece of the puzzle for us, because now we can launch lambda functions on a schedule to perform asynchronous background tasks (think data crunching, email notifications, etc).
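
The wiring is just a scheduled rule plus a target; a rough boto3 sketch (the rule name and function ARN are placeholders):

    import boto3

    events = boto3.client("events")
    # Run the function every night at 03:00 UTC (schedule expressions are UTC-only).
    events.put_rule(
        Name="nightly-crunch",
        ScheduleExpression="cron(0 3 * * ? *)",
        State="ENABLED",
    )
    events.put_targets(
        Rule="nightly-crunch",
        Targets=[{
            "Id": "1",
            "Arn": "arn:aws:lambda:us-east-1:123456789012:function:crunch",
        }],
    )
    # The function also needs a resource policy (lambda add-permission)
    # allowing events.amazonaws.com to invoke it.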

Lambda is maturing so fast that I almost feel bad complaining about it, but of course we have run into our fair share of issues, too. One thing that's making our life more difficult right now is the fact that when you launch a Lambda function in a VPC via CloudFormation, the function's ENI doesn't get attached until the function runs the first time, so CloudFormation doesn't know anything about the ENI. Thus, when you tear down the CF stack, the ENI gets orphaned and hangs around in a detached state. Throw some automation into the mix and you can start eating up IP addresses in your subnet really fast. I have no doubt they will fix this soon.


Interesting. I want to see how Terraform fares in this aspect.


Do you use / know of any services like AWS Lambda that support Ruby out of the box?

I've been playing with AWS Lambda using Traveling Ruby and mruby, but have hit issues (with native gems, etc.).

I have used IronWorker previously, but they seem to be going upmarket and don't even display pricing on their site.

Thanks in advance for your input!


There are a few libraries for using JRuby (like https://github.com/c9katayama/aws-lambda-jruby), but they may also have issues with native gems.


I haven't used it, but the new (like, really new) IronFunctions may be what you're looking for. I've only used the Workers/Cache so far, but they're Dockerized, so pretty much language-agnostic.

I'm making an assumption that Functions is similar.


I wonder if you could use Opal?


Yeah, that was one of my very early thoughts...


I'm not normally interested in Opal per se, but it would be great for this. I'd much rather write Ruby for simple call/response APIs than JS.


You can probably use JRuby without too much effort.


My suggestion is to just use Python.


Can anyone share their experiences using AWS Lambda in Production?


I wrote the apex(1) tool and created https://apex.sh/ping/ with Lambda. In general it has been great and has scaled flawlessly since launch (granted, I'm only doing ~8M requests/day).

Conceptually I think it's great for pipelines or use-cases like this; VMs are generally a terrible level of abstraction for a lot of problems, and the Lambda style promotes better architecture because of this.

The connectivity with Kinesis/SNS and friends is great. I'd agree that Lambda is not currently a good fit for "regular" apps; APIs should be fine now that the proxy stuff is in there, though there's slight latency.

No need to worry about gracefully stopping or restarting daemons: just push new code and the old stuff goes away; it really is a great abstraction that way. Basically, replace anything you'd use a Go channel for with more Lambda, or SNS->Lambda if you need retries and backoff; it'll spare you a lot of code.

I find the workflow great as well; the slowest part for me is compiling the Go binaries, the rest is virtually instant. Especially now, with all this needlessly complex Docker stuff, it's refreshing to use something simple.

Cost is prohibitive for sustained use, so make sure you price things out properly; it sounds very cheap until you look at, say, a constant 100 requests/s behind API Gateway. It's easily 300-400% of what you'd pay on EC2.
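
Rough back-of-the-envelope math with late-2016 list prices (the 128MB/100ms per-request profile is an assumption):

    # A constant 100 req/s over a 30-day month
    requests = 100 * 60 * 60 * 24 * 30         # 259.2M requests
    apig    = requests / 1e6 * 3.50            # ~$907  API Gateway at $3.50/M
    lam_req = requests / 1e6 * 0.20            # ~$52   Lambda requests at $0.20/M
    gb_sec  = requests * 0.1 * (128 / 1024.0)  # ~3.24M GB-seconds (100ms @ 128MB)
    lam_cpu = gb_sec * 0.00001667              # ~$54   Lambda compute
    total   = apig + lam_req + lam_cpu         # ~$1,013/month, before free tier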

Cold-start is really a non-issue in most cases; it seems to take very little to keep a function warm, so unless you get zero traffic (which would be dirt cheap on a t2.micro anyway) you'll be fine.


Had never heard of apex. Looks really nice; giving it a try now!

Edit: Feedback: Really like it. One feature that would be very nice would be the ability to trigger a ping on demand, e.g. for testing the auth setup on a request. Runscope implements a similar feature. Otherwise great so far!


Thanks! Agreed, I have that on the list; a quick sanity check is always good. I'm taking a bit of a break to work on other products, but I'll keep adding to it.


I've looked at Apex a few times. I was going to ask you for a hacker plan but it sounds like you're already doing such numbers it might not be justified. It would be really nice for me to be able to have a master subscription for my contracting work for client projects.


I've thought about adding a smaller plan; I still might at some point, but it certainly reaches a level where it's not really worth it, especially since I want to provide equal support to everyone. I'll have to experiment with that.

I had limited free plans originally, but that went horribly wrong, haha: free users only attract other free users, and a few days later I had like 4000 free people. Maybe that works for startups, but not "real" companies.


I used Apex briefly a few months back and was impressed but I didn't fall in love with Lambda so I'm not using it anymore.


Heads up -- Chrome is blocking a lot of your site's requests to Lambda endpoints (net::ERR_INSECURE_RESPONSE) :)


If you need an API gateway that can work with Lambda, but with better performance and a bigger feature set, take a look at https://github.com/Mashape/kong - there is an open PR for Lambda support which will land in the next version (I am one of the core committers).


This is very interesting. Especially when you have a large volume of trivial requests, the cost-per-request of AWS API Gateway dominates overall cost (the Lambda functions themselves are cheap by comparison). Rolling one's own in EC2 can potentially be much cheaper.


Can you elaborate on the slight latency you've experienced?


I haven't tested it extensively but if you boot up API Gateway and a hello-world Lambda function, the function itself takes maybe 1ms to run, while API Gateway seems to add roughly 150-200ms on top of that.


We gave it a run in prod and abandoned it quickly. We were using it for parallel file processing (S3 -> Lambda for a CPU-heavy task -> return a few numbers as JSON).

Last I calculated, it's nearly 5x the cost of comparable t2 on-demand EC2 instances. That can be mitigated if you have spiky traffic where you'd need to turn on several EC2 instances for less than an hour but are stuck paying for the full hour, or if you can scale to zero for extended periods of time.

Critical to us: you don't have the ability to serve concurrent requests from a single lambda worker, so if you're waiting for async IO (e.g. downloading from S3) then you're wasting money you wouldn't be wasting with EC2. You end up with one file per lambda worker, whereas we could handle about five files concurrently on a normal VM because of async IO. -- For us lambda was prohibitively expensive at scale. I suspect a fair number of people reporting significant savings vs EC2 had over-provisioned instances.
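
To make the IO point concrete: a single Lambda container only ever handles one event at a time, while an ordinary VM process can overlap the downloads. A Python sketch (the bucket, keys, and crunch() step are hypothetical):

    from concurrent.futures import ThreadPoolExecutor
    import boto3

    s3 = boto3.client("s3")

    def process(key):
        path = "/tmp/" + key
        s3.download_file("my-bucket", key, path)  # IO-bound wait
        return crunch(path)                       # hypothetical CPU-bound step

    # One EC2 process overlaps five downloads; on Lambda this same
    # parallelism costs five separate (billed) workers.
    with ThreadPoolExecutor(max_workers=5) as pool:
        results = list(pool.map(process, ["a.bin", "b.bin", "c.bin", "d.bin", "e.bin"]))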


It would be interesting if someone made an EC2 image that implemented the Lambda APIs, and handled as many requests as it could, then optionally delegated the overflow to real Lambda. Like "reserved" Lambdas.


This is a _fascinating_ idea. You could make an image that queried the Lambda API, downloaded all Lambda functions (and their configuration) and served up exactly those functions. Make it completely turnkey.

Hm...


You might want to check out IronFunctions, which was released last week: https://github.com/iron-io/functions . You can run Lambda functions anywhere, and can even export/import them directly from Lambda. It doesn't have the burst-to-Lambda part you speak of though... yet.


I wrote a blog post about how we process Hearthstone log files into replays for https://hsreplay.net

https://hearthsim.info/blog/how-we-process-replays/

TLDR: AWS Lambda is awesome if you have an indeterminate-but-high number of small CPU-bound tasks you want to be able to run in parallel, as soon as they are needed. Awesome for file processing (e.g. image resizing), for example.

But being stuck on Python 2.7 sucks. PLEASE Amazon, announce Python 3 support already.


Using Python for an API call that is basically a thin wrapper around one of our supplier's (not very good) APIs, running ~500k requests per day. We've had problems twice with 500 errors for all requests at night, when they do maintenance on the API Gateway/Lambda hardware and don't tell us (in both cases they announced new features the next morning). We reported it to support both times and got "we're sorry, but since it's all good now there is nothing to do" as the response both times. It left a bad taste in our mouths, so we haven't deployed anything new to it and probably won't for a while, despite it working well otherwise.


Those 500s may mean you aren't deleting your ENIs. Make sure your role has privileges to delete them.
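
Concretely, the execution role needs the ENI permissions from the AWSLambdaVPCAccessExecutionRole managed policy; for example, via boto3 (the role name is a placeholder):

    import boto3

    iam = boto3.client("iam")
    # Grants ec2:CreateNetworkInterface, ec2:DescribeNetworkInterfaces,
    # and ec2:DeleteNetworkInterface, plus CloudWatch Logs access.
    iam.attach_role_policy(
        RoleName="my-lambda-role",  # placeholder
        PolicyArn="arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole",
    )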


We tried about a year ago and gave up after too much wasted effort on getting any kind of good dev/CI/CD pipeline going, mainly due to lack of tooling: things like environment variables, etc.

Taking another look now on some side projects, to get a sense of whether all of the issues have been addressed.

But the big ones were lack of env variables, and the tooling was atrocious.

I really like the concept, and the current Serverless (serverless.com) framework has been simplified: it runs on CloudFormation, allows the project to be run as an Express server, and with additional plugins can potentially target competing serverless platforms.


Don't use it if you require low latency: performance is abysmal, and the bottleneck is Lambda overhead that you cannot influence. Also, if you don't have a high-traffic function, cold start is an issue, at least when using Java. There is a "hack" to avoid this: create a CloudWatch rule that triggers your function every minute or so. This sounds weird, but if you consider that this counts as good practice, it gives you an overview of how production-ready Lambda is. The nice thing is that it's pretty cheap, though.
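
The usual shape of the hack: the scheduled rule passes a marker in its input, and the handler bails out before doing any real work. A Python sketch (the "warmup" field is just a convention, not anything AWS-defined):

    def handler(event, context):
        # Short-circuit scheduled keep-warm pings.
        if isinstance(event, dict) and event.get("warmup"):
            return "warm"
        return do_real_work(event)  # hypothetical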


Warming it every minute seems kind of excessive. Did you really have to do it that much? I haven't experienced this with Python but my jobs are more cron-ish so an extra 100 ms doesn't make a difference.


Loading the JVM from a cold start takes a lot more than 100ms, more like 5-10 seconds. Same problem Google App Engine had 7 years ago, before you could pay to keep an instance warm.


Ah, okay, an issue specific to Java.


I've used Lambda in production under pretty high-throughput conditions before. It's fairly stable, given that we were using Kinesis to feed into it (which was a pain to set up).

Recently, I've been using stdlib (https://stdlib.com) to do function services, since it's a lot easier to get started with and a bit more intuitive for newcomers to understand.


I agree with you and have found stdlib to be very intuitive for newcomers. I like stdlib so much that I created an stdlib intro article at http://thisdavej.com/creating-node-js-microservices-with-eas....


We use Lambda and API Gateway for all production[1] requests: HTML, server-side rendered JS, API calls, all through Lambda. There are some rough edges; we wrote our own framework/deployment tool[2] to fix some of them. AWS has been making lots of improvements too, many in just the last few months. This setup was much harder to run when we first started a year and a half ago. Overall our team is super happy. It is cheaper and simpler to operate than our previous EC2+OpsWorks setup. We get code to production faster and spend more time on actual business problems vs infrastructure problems. I also pretty much agree with everything Ben Kehoe said in this article: https://serverless.zone/serverless-vs-paas-and-docker-three-....

[1] www.bustle.com and www.romper.com do a combined XX million unique visitors per month.

[2] https://github.com/bustlelabs/shep


How do you guys deal with potential cold-start issues?


Mostly it is not a big problem for us. Our functions that need to be fast have very high traffic. Our other functions don't need to be that fast. We also have a CDN in front of most things.

That said, we use Node. I have heard cold starts are worse for Python (I have no data). I have heard they are near unbearable for Java (JVM startup and all that). We try to minimize external calls outside the main handler. Prior to this update we were packing configs with webpack at deploy time. Some people read them from S3 or Dynamo; I don't consider that a good solution. I'll be working this week on Shep 3.0, which will use the new env var features.

I have talked to other people who do various hacks to force their functions to be warm. I've looked at doing this and haven't found it necessary yet for our use case.


We use route53 health checks to invoke API gateway and thus the backend Lambda.


I started with a manually-packaged Python function and credstash [1] to store credentials for non-AWS services. It gets invoked nightly by CloudWatch and has been bulletproof for months. I recently built a microservice [2] for Election Day with Flask and deployed it behind API Gateway with Zappa [3]. Lambda isn't for every use case, but for those it is, I'm bullish.

[1] https://github.com/fugue/credstash

[2] https://topics.arlingtonva.us/2016/11/voter-registration-sea...

[3] https://www.zappa.io/


You can check out my talk at AWS E-Business Day. For ~50 minutes we talk about how we switched to serverless architectures for one of the largest online retailers in Europe, and what we learned.

https://www.youtube.com/watch?v=WYVamFcphxo


I use it for one of my clients to send an average of 400k browser push notifications; competitors send them in 40 minutes. With Lambda + Kinesis Firehose (16 shards), my solution is able to send them in less than 4 minutes.


I've been using Lambda in production for multiple things.

I'm not using Apex or any of the other frameworks, just plain Go code wrapped by the JS function.

It's really awesome to work with: you just deploy a function and quit worrying about it.

I use it for things like callbacks, message monitoring, and stuff like that.

Example:

Monitor SNS messages for auto-scaling and send a message to slack when something fails. https://github.com/KensoDev/sns-lambda-notifier-golang


Same here; I deployed two functions. One is a thumbnail converter for S3 uploads, used a couple of times per day and running for over a year. The other is a log-aggregation script that runs thousands of times per hour, running for a few months now. Both run flawlessly; I've never had to worry about them after deploying.


We use it at https://iotsky.io/ for all backend interaction with AWS IoT.

Wrote a blog post about it too: https://medium.com/@iotsky/how-we-built-the-iotsky-backend-u...


I wrote Gimel[0] - an A/B testing backend using Lambda and Redis (it can also work with Google BigQuery w/ Kinesis).

It's been used in production for a few months now and we're really happy with it. It provides a negligible-cost replacement for Optimizely for us.

[0] https://github.com/Alephbet/gimel


It has been good for us, although our use case has mainly been ETL pipelines which seems quite a good fit, especially when you connect it up to Kinesis Firehose/Kinesis or S3 notifications.

The main issue we've seen is that even at the "top" tier (1536MB) the CPU performance doesn't seem very good; it's difficult to tell, as they abstract you away from what is actually going on under the hood.

Main advantages: easy to deploy, no infrastructure worries if your problem fits, event-based triggers mean you don't have to write any of that code, and it fits well in the AWS ecosystem.

I've seen people making serverless websites and things like that, but I'm not convinced Lambda is a good fit for such a thing (happy to be proven wrong).


Didn't get to prod. Looked at it around Juneish 2016. High error rates, no SLA, slow boot times, no tooling, deeply inadequate logging.

My technical recommendation is as follows: "No".


What about the logging is inadequate? CloudWatch Logs seems to work pretty well.


Not the original poster, but IIRC CloudWatch improved a lot in recent months. Before that, the web interface was kinda hard to use with all the log streams; right now it's much easier to search for some event, for example (time brackets were added, and auto-loading of previous events, if I'm not mistaken).


Compared to, e.g., Kibana, Splunk, or `cat`, I find it effectively useless. The UI is ... ungood.


It works remarkably well if you use a development toolchain (Serverless and Claudia are both good, though I prefer the latter).

My only gripe is that API Gateway (APIG) charges $3.50 per million requests, and there is no other way to invoke Lambda functions over HTTP. I had to scale back my microservice ambitions because of this.


Hmmmm, I know it seems ridiculous, but I wonder if it would be cheaper to just spool up an EC2 machine and async-invoke the Lambda call, e.g. via boto?
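
The invoke itself is one boto3 call; a sketch (the function name and payload are placeholders):

    import json
    import boto3

    lam = boto3.client("lambda")
    # InvocationType="Event" queues the call and returns immediately,
    # so the EC2 box is just a thin HTTP front in place of API Gateway.
    lam.invoke(
        FunctionName="my-function",  # placeholder
        InvocationType="Event",
        Payload=json.dumps({"foo": "bar"}),
    )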

It seems like APIG has a boatload of features, which is great... but they are mostly unused.


Potentially. I'm looking for something like this (to proxy_pass to a Lambda function with nginx): https://github.com/washingtonpost/nginx-aws-lambda


I wrote an AWS Lambda plugin for the Caddy web server which can take the place of API Gateway.

https://caddyserver.com/docs/awslambda


I use Lambda in production as a replacement for a cron server. So far, all it does is keep my cache warm. So far, no complaints. It was actually TJ Holowaychuk's blog posts on his use of Lambda that got me to give it a shot.


Interesting. Slowly carving away at the reasons that serverless.com was created.


Very slowly; Serverless had this a year ago. By the time AWS gets something like an 'sls function deploy', it's pretty likely that Serverless will support multiple cloud providers, in which case they should be pretty safe.


With CloudFormation you can deploy full-stack serverless apps. IMHO you don't need Serverless anymore.


I beg to differ. CloudFormation is great, but crafting CF templates for non-trivial backends is not for the faint of heart. Iterating on CF templates and the corresponding logic in micro-services running on Lambda is also not the shortest/fastest feedback loop with the current AWS tooling, and things are pretty hard to test. After over a year of struggling with the current tooling, I've started working on a framework [0] that allows writing serverless infrastructures at a higher abstraction level.

[0]: http://qmu.li


Has anyone been able to get Chalice to work?

https://github.com/awslabs/chalice

Basically, I'm interested to know what the Swagger template file will look like. It'd be nice if you could use Swagger to quickly create a REST API on the Lambda stack.

Being able to quickly generate a scaffolded CRUD REST API on Lambda, behind authentication + DynamoDB, would be an absolute killer app.
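
For reference, the Chalice starting point is tiny; a minimal sketch (Chalice generates the API Gateway config and an IAM policy from this on deploy):

    from chalice import Chalice

    app = Chalice(app_name="helloworld")

    @app.route("/")
    def index():
        # Chalice maps this to an API Gateway resource behind the scenes.
        return {"hello": "world"}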

I'd imagine that Azure is not far behind. One thing that's killing it for Azure is the Visual Studio IDE: editing my ASP web app and one-click deploying to Azure from inside VS is a killer app as well.

I really wish AWS came up with their own IDE where it would be tightly integrated with AWS. Imagine if you could write code and deploy it instantly without setting things up yourself through the AWS console (not that it's bad or anything).


I'm biased saying this, but Chalice is a half-baked rip-off of Zappa[0], deliberately nerfed by Amazon to lock you into the AWS stack.

If you want to see an example of how easy it is to use Zappa + Lambda + DynamoDB, you can check out the Zappa BitTorrent Tracker, which can use S3 or DynamoDB as a back-end.[1]

As a bonus, Zappa has had both local and remote environment variables as a feature for months. It is pretty cool that this new announcement can use KMS, although that will of course mean more vendor lock-in if you choose to go that route.

[0] https://github.com/Miserlou/Zappa

[1] https://github.com/Miserlou/zappa-bittorrent-tracker


(I work for AWS) I've been livestreaming building a Zappa app on twitch.tv/aws! daysuntilreinvent.com is powered by Zappa!

I will say that Chalice wasn't meant to copy Zappa. Chalice and Zappa both started in Jan of 2016. Zappa has way more features and is a wonderful piece of software.

One thing Chalice does well that I'd love to see in Zappa is automatically figuring out what kind of IAM policy/permissions you need by analyzing the code.


I suspect an Amazon IDE is coming. They did acquire[1] Cloud9 not too long ago.

[1] https://www.google.com/amp/www.forbes.com/sites/janakirammsv...


Nice. This could also be useful if you could have "secret" variables, similar to [1].

[1] https://www.visualstudio.com/en-us/docs/build/define/variabl...


Does anyone know if the 256-character limit on values is in place like in ECS? Super annoying.


You could probably split longer data into multiple variables and concatenate them, but maybe having such large environment variables is a sign that you might be better served with a different mechanism.

For example, instead of passing in a big JSON blob, one could split the JSON keys into separate environment variables.
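
On the reading side, the concatenation would look something like this (the BIG_VALUE_* names are made up):

    import os

    # Reassemble a value that was split across BIG_VALUE_1, BIG_VALUE_2, ...
    parts = []
    for i in range(1, 20):
        chunk = os.environ.get("BIG_VALUE_%d" % i)
        if chunk is None:
            break
        parts.append(chunk)
    big_value = "".join(parts)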


I wanted to add a 2K public key; it could be split up, but that's an annoying solution. I used an IAM role with rights to an S3 object in the end.


If you can, you should switch to elliptic curve keys – they are less than 256 bytes.
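
For example, with the cryptography package, a P-256 public key serializes to well under the limit (a quick sketch):

    from cryptography.hazmat.backends import default_backend
    from cryptography.hazmat.primitives import serialization
    from cryptography.hazmat.primitives.asymmetric import ec

    key = ec.generate_private_key(ec.SECP256R1(), default_backend())
    pem = key.public_key().public_bytes(
        serialization.Encoding.PEM,
        serialization.PublicFormat.SubjectPublicKeyInfo,
    )
    print(len(pem))  # ~178 bytes for P-256, comfortably under 256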


I just tested and didn't run into a 256-character limit. Looks like you can use any values and any number of variables as long as the total size is <4KB.


This is a very welcome improvement to an awesome service that I use daily. Now if we could just get a smoother bundling / deployment workflow or a way to edit uploaded bundles on the AWS Lambda dashboard.


Has anyone been able to use Lambda for relatively high-memory load applications? (~2GB+ RAM)

That's our biggest constraint at the moment; so far I haven't seen any good options.


Lambda only supports up to a maximum of 1536MB of RAM right now.

I've been involved with developing Lambda functions that consume roughly 1.2GB of RAM each time, but the memory usage is easy to predict, as the function is triggered by files in S3 that are about the same size.

They say to break your problem down into smaller chunks to fit into memory; is that possible in your case?


Not easily! I'm doing speech recognition; I might be able to run a segment through two separate models, with half the lexicon available in each run, then just combine the results. However, I don't think the overall savings will be enough to justify running it twice. It's getting close though; once they get the RAM up more, I'll try it out.


Support for environments (dev, prod, etc.) would be great; currently you need to use apex to simulate them.


My current annoyance with Lambda is the lack of support for time zones in the scheduler. We use it for spinning EC2 instances up and down at particular times, and having to manually work around DST is a pain.



