Here are a few updated differences. (Disclosure: I work for Iron.io.)
Lambda: node, java
IronWorker: node, ruby, python, php, java, binaries, .NET, go (binaries), and more (specific language runtime versions available)
Lambda: allows 500 MB of local disk space to persist your task
IronWorker: 10 GB of local disk space available
Lambda: only current version of code
IronWorker: versions all uploads and allows you to revert
Lambda: no built in scheduler
IronWorker: a flexible scheduler out of the box to run tasks
Lambda: maximum execution time 60 seconds
IronWorker: maximum execution time 1 hour (customizable up to 24 hours for dedicated users)
Lambda: 100 maximum concurrent requests. (The maximum can be higher; can someone point me to the absolute maximum, if it exists?)
IronWorker: concurrency maximum can go much higher as needed (250 for a production plan). You can also limit concurrency per worker, i.e. set a worker to run at most 50 concurrent tasks (useful if you have bottlenecks).
Lambda: 90-day inactive code retention period
IronWorker: no limit to inactive code retention period
Lambda: available on AWS
IronWorker: choose your cloud provider: AWS, Rackspace, Microsoft Azure, private clouds, and more.
We've also enabled users to pause the execution of their task queue in case they need to patch or revise their worker code.
It looks like your pricing starts at $49/month. Do you offer on-demand or cost-per-task pricing, similar to what Amazon Lambda offers? I have some tasks that I'd love to try IronWorker for, but $49/mo for a side project is more than I'm willing to spend.
There's then pay-per-second pricing if you go outside of your plan (which I assume covers the free tier, but I don't know) [edit - if you have this enabled, you don't have to]
> Compute hours that exceed the rated plan will be billed at a rate of $0.075/hr. The free plan provides 10 hours/mo.
Edit 2 - Disclaimer: I don't work for them, I've just been looking at their pricing pages a lot.
Thank you, that's a lot closer to what I am looking for. Note to iron.io - you should make the free plan a lot clearer. I still can't find the details for it, and this is exactly what I am looking for.
Thanks for the comparison. Regarding Lambda maximum concurrency, I am currently configured for a limit of 20,000 concurrent invokes so perhaps your figure of 100 is not the maximum but simply the default maximum?
If your cloud provider is AWS, how do you handle so many concurrent requests with monthly pricing? It seems the pricing would incentivise you to use a single machine for all the requests.
I think you might find http://webtask.io interesting for similar use cases where the full spectrum of HTTP methods is supported.
The premise is a sort of RPC service running arbitrary JavaScript code within the life-cycle of a request. The huge benefit here is the ability to securely expose secrets to webtask code, allowing you to do things like connect to a database, Stripe, a third-party API, etc.
Let me know if you have any questions on this model.
IronWorker doesn't seem to fulfill the same promise of simplicity that Lambda does. It talks about scheduling and creating Docker images where Lambda says "just give us your code".
Docker support is a new feature to enable more complex jobs where you need to control the entire stack: for instance, maybe you need ImageMagick or Ghostscript installed on the core machine. IronWorker can support that. It also allows you to test your workers locally before uploading: http://blog.iron.io/2015/03/the-new-ironworker-development-w....
That said, you don't need to think about Docker if you don't want to and an IronWorker can be a few lines of code too, for example:
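For illustration only (not official Iron.io sample code), a minimal Python worker might look like the following. It assumes the convention where IronWorker invokes your script with a `-payload <file>` argument pointing at the task's JSON payload; treat that flag and the payload shape as assumptions here:

```python
import json
import sys

def read_payload(argv):
    """Load the task payload; assumed to arrive as a JSON file path
    following a -payload flag on the command line."""
    if "-payload" in argv:
        with open(argv[argv.index("-payload") + 1]) as f:
            return json.load(f)
    return {}

if __name__ == "__main__":
    payload = read_payload(sys.argv)
    # The whole worker: just act on whatever the task was queued with.
    print("processing", payload.get("url", "(no url)"))
```

The point being that the unit you upload can be a single small script plus whatever the task payload carries.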
I couldn't see anything on timings here compared to the node version. Does it keep the code 'hot' or is there the cold start overhead? I assume it has to be the former for this to be at all viable.
I tried out lambda to handle backend requests with a go executable. For a trivial request/response it took 100-300 ms for the first request and 3-100 ms for subsequent requests. Using better hardware tended to help with the subsequent requests but there was still a lot of variance.
This is too slow and has too much variance for my purposes, which is responding to user events. Even if I added an optimization of having a Go process which lives through multiple Lambda calls, the first-request latency sucks. (And I haven't verified that the optimization would work; unfortunately you can't use sockets, which would make it simpler.)
TL;DR: pls support go (with consistent performance)
Consistency is more of an issue for me than speed. That said, I would be happy if a trivial request (prints a short line of output) to a Go handler would consistently be <10ms.
I'm talking about handler duration, which doesn't include network latency. If you use Lambda you can look at the durations in the logs... a trivial Go handler often gets ~5ms if you're using good hardware (but sometimes jumps up to ~100ms).
Jenkins has a Lambda plugin now, so I've set up the typical GitHub branch > webhook > Jenkins > Lambda workflow. Push to a stage branch and the function is updated pretty quickly. Test, then push to prod.
Great to see it move on from just JS (although I know you can run binaries, it's a little awkward).
There was a 60s execution limit in preview; anyone know if that's still around / likely to change? I've got a task to solve at the moment which is basically:
"When a user puts a file on S3, hit an API for each line, store all responses in S3"
Lambda would be awesome for the low end of that; small files would generally be alright. But as the file size grows I'd need to set up workflows to split the file and rejoin the results (and guess how many requests I can make in 60s), which is starting to get into a more complicated setup than I'd really want, so I'd be likely to just rent a machine and set up my own queue processing (again more complicated, but more in my control). Maybe I'm actually just looking for simple grid software and I should just rent some servers.
Essentially, what I'd like to use Lambda for is "Run this code. It's not big or heavy and doesn't require much power, but run it and stop charging me when it finishes". I don't want to request machines, set up disks, compare multiple reserved costs, orchestrate them starting and stopping, or handle updating them, and I don't want to worry about starting 100 5-second jobs and that costing me 100 hours. I'm aware that's rather "moon on a stick", but I do feel like a lot of this is almost there.
I'm missing a big hole in my toolbox since PiCloud shut down. Millisecond pricing, tasks, queues, workflows, custom environments. Anyone here an old picloud user? What have you moved to?
GCE is interesting, with fast boot times and a 10-minute minimum, 1-minute increment pricing model. IronWorker is interesting, but with the potential delay for a standard job (15 min) and the cost vs. computing time, it's hard to justify for my tasks over just renting a machine from OVH ($50/mo would pay for a decent 32 GB machine, so I could run a lot more than the 25/500 limit for the smallest tier), and the 1-hour limits on the smaller plans cause the same issues as the 60s limit on Lambda.
I'm not knocking these services; they're all really impressive. They just unfortunately have various restrictions that limit their applicability to the problems I currently have. If I had slightly different problems, I'd already be using one of these options. If people are using other services, I'd love to hear what you're using!
Edit - offtopic, but every post I've made recently has gone through twice. Chrome 43.0.2357.124 on a mac here. Anyone else having the same problem?
Seems like it would be good to take the file, then dump each line onto a queue, then have lambda run for each thing on that queue. It should parallelize really nicely.
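A sketch of that split step, under the assumption that SQS (or any queue) is the fan-out mechanism; the queue URL and the boto3 send loop are illustrative, not a prescribed setup:

```python
import json

def lines_to_messages(body):
    """One queue message per non-empty line of the uploaded file."""
    return [json.dumps({"line": ln}) for ln in body.splitlines() if ln.strip()]

def enqueue(messages, queue_url):
    # Illustrative fan-out via SQS; a second Lambda would consume
    # each message and do the per-line API call.
    import boto3  # assumed available in the environment
    sqs = boto3.client("sqs")
    for m in messages:
        sqs.send_message(QueueUrl=queue_url, MessageBody=m)
```

Each message then becomes one short, independently-retried Lambda invocation, which is what makes the parallelism (and failure handling) easier to reason about.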
That does sound like quite a good approach, there are a few issues though:
1. Some vague throttling is useful, but I can only do this on an account level with Lambda rather than per function. Trying to call the API in bursts of 10000 requests may be problematic, which would limit our use of lambda for other tasks (for which we may be happy running that many in parallel). The default limit of 100 would probably serve well enough though.
2. I've now got to add splitting and recombining code on each end, with concerns around failed jobs being silently missed from the final output file. Although that extra work may leave me with a better approach to handling some failed jobs out of a large file. Hmm.
Part of the issue is that without the execution limit, this is an amazingly simple script. I've got a 10-20 line Python script which does the actual processing of a file (read, hit the API, store, with a pool of N concurrent requests). Lambda is impressive because it adds only a small level of complexity to a problem and gives you a lot in return; it's just that my use case is so simple that the small amount of added complexity is relatively quite a lot.
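The sort of short script being described might look roughly like this; `call_api` is a stand-in for the real endpoint call:

```python
from concurrent.futures import ThreadPoolExecutor

def process_lines(lines, call_api, workers=8):
    """Hit the API once per line with at most `workers` concurrent
    requests, preserving input order in the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(call_api, lines))

# Reading the file from S3 and writing results back would wrap this,
# e.g. results = process_lines(body.splitlines(), my_api_call, workers=20)
```

With no execution limit, this plus an S3 read and write is more or less the whole job.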
Currently the setup doesn't hit the API; it just creates a dedicated instance to process a batch of data locally, but I'm hoping to simplify things to send everything through the API and just scale and load-balance separately. Having code that automatically turns machines on and is responsible for turning them off makes me a bit nervous :) I've already missed that I'd deleted the shutdown command in a script and left a box running for a day while developing.
Thanks for the suggestion though; I'll try to work through it in more detail today and see if I can find a clean way of dealing with the recombination. I think that's the side I'm less clear on at the moment.
One of the larger pain points with running Hadoop/Spark jobs on Elastic MapReduce is around classpath conflicts, so it would be great to get a page similar to [1], showing what's available on the Java classpath for AWS Lambda functions.
Do you want different clients to connect to the same environment so that you can keep local state in your application? I'm thinking of a video game or a collaborative office tool.
Or do you simply want to send binary data to a function?
Lambda is great for doing lots of tasks in bulk. Say I have a binary that is able to process data quickly.
On a Lambda request, Node.js instructs a binary on what to do using a socket: it binds to a local port and exchanges instructions with the binary until the task is complete, all in under 1 second.
Lambda doesn't allow sockets, so its uses are very limited. Binaries also cut down the time a task takes and allow a greater variety of things to be done.
There are tasks to be done that require binaries, typically 1-second tasks.
This binary needs to interact with Java or Node.js, so it uses sockets for the request (it creates a server and Node.js connects to it to tell it what to do).
A container seems silly since it's a sub-1-second task.
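To make the pattern concrete, here is a self-contained sketch of that local round-trip, with Python standing in for both the binary and the Node.js controller (all names are illustrative):

```python
import socket
import threading

def start_worker():
    """The 'binary' side: bind an ephemeral localhost port and
    answer a single instruction, then exit."""
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)

    def serve():
        conn, _ = srv.accept()
        with conn:
            cmd = conn.recv(1024)
            conn.sendall(b"done:" + cmd)   # acknowledge the instruction
        srv.close()

    threading.Thread(target=serve, daemon=True).start()
    return srv.getsockname()[1]            # port for the controller

def send_instruction(port, cmd):
    """The controller side (Node.js in the comment above)."""
    with socket.create_connection(("127.0.0.1", port)) as c:
        c.sendall(cmd)
        return c.recv(1024)
```

The whole exchange happens over loopback in well under a second, which is why blocking socket use inside a Lambda invocation matters for this style of task.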
Java lambdas have nothing to do with this feature. This allows you to write a function that is evaluated by a Lambda handler; you write only the innermost bit (i.e. "{System.out.println(eachperson)}").
That's really interesting. The JRuby 9.0rc1 binary tarball is 41 MB, which means, theoretically, if you didn't package too many additional gems, you could write your lambdas in Ruby!
I would love to hear about some success stories that involve alternate languages on the JVM. If the 50 MB limit proves to be restrictive, please let me know!
As an example of a use case that fails with the 50MB limit, I'd love to be running Clojure code on Lambda that depends on the Stanford NLP toolkit[1], the minimal form of which is 260MB. (To examine student writing and make useful suggestions to teachers, if you're curious.)
I've just put together a quick Scala fatjar containing just Scala 2.11.5 and json4s (which has a fair few dependencies), and that came in at 27,369,865 bytes, so all looks promising for Scala lambdas!