Here are a few updated differences. (Disclosure: I work for Iron.io.)
Lambda: node, java
IronWorker: node, ruby, python, php, java, binaries, .NET, go (binaries), and more (specific language runtime versions available)
Lambda: allows 500 MB of local disk space to persist your task
IronWorker: 10 GB of local disk space available
Lambda: only current version of code
IronWorker: versions all uploads and allows you to revert
Lambda: no built in scheduler
IronWorker: a flexible scheduler out of the box to run tasks
Lambda: maximum execution time 60 seconds
IronWorker: maximum execution time 1 hour (customizable up to 24 hours for dedicated users)
Lambda: 100 maximum concurrent requests. (The maximum can be higher; can someone point me to the absolute maximum, if it exists?)
IronWorker: concurrency maximum can go much higher as needed (250 for a production plan). You can also limit concurrency per worker, i.e. set a worker to run at most 50 concurrent tasks (useful if you have bottlenecks).
Lambda: 90-day inactive code retention period
IronWorker: no limit to inactive code retention period
Lambda: available on AWS
IronWorker: choose your cloud provider: AWS, Rackspace, Microsoft Azure, private clouds, and more.
We've also enabled users to pause the execution of their task queue in case they need to patch or revise their worker code.
It looks like your pricing starts at $49/month. Do you offer on-demand or cost-per-task pricing, similar to what Amazon Lambda offers? I have some tasks that I'd love to try IronWorker for, but $49/mo for a side project is more than I'm willing to spend.
There's then pay-per-second pricing if you go outside of your plan (which I assume covers the free tier, but I don't know) [edit - if you have this enabled, you don't have to]
> Compute hours that exceed the rated plan will be billed at a rate of $0.075/hr. The free plan provides 10 hours/mo.
Edit 2 - Disclaimer: I don't work for them, I've just been looking at their pricing pages a lot.
Thank you, that's a lot closer to what I am looking for. Note to iron.io - you should make the free plan a lot clearer. I still can't find the details for it, and this is exactly what I am looking for.
Thanks for the comparison. Regarding Lambda maximum concurrency, I am currently configured for a limit of 20,000 concurrent invokes so perhaps your figure of 100 is not the maximum but simply the default maximum?
If your cloud provider is AWS, how do you handle so many concurrent requests with monthly pricing? It seems the pricing would incentivise you to use a single machine for all the requests.
I think you might find http://webtask.io interesting for similar use cases where the full spectrum of HTTP methods is supported.
The premise is a sort of RPC service running arbitrary JavaScript code within the life-cycle of a request. The huge benefit here is the ability to securely expose secrets to webtask code, allowing you to do things like connect to a database, Stripe, a third-party API, etc.
Let me know if you have any questions on this model.
IronWorker doesn't seem to fulfill the same promise of simplicity that Lambda does. It talks about scheduling and creating Docker images where Lambda says "just give us your code".
Docker support is a new feature to enable more complex jobs where you need to control the entire stack: for instance, maybe you need ImageMagick or Ghostscript installed on the core machine. IronWorker can support that. It also allows you to test your workers locally before uploading: http://blog.iron.io/2015/03/the-new-ironworker-development-w....
That said, you don't need to think about Docker if you don't want to and an IronWorker can be a few lines of code too, for example:
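For illustration only (not official Iron.io sample code), a minimal Python worker might look like the following. It assumes the convention where IronWorker invokes your script with a `-payload <file>` argument pointing at the task's JSON payload; treat that flag and the payload shape as assumptions here:

```python
import json
import sys

def read_payload(argv):
    """Load the task payload; assumed to arrive as a JSON file path
    following a -payload flag on the command line."""
    if "-payload" in argv:
        with open(argv[argv.index("-payload") + 1]) as f:
            return json.load(f)
    return {}

if __name__ == "__main__":
    payload = read_payload(sys.argv)
    # The whole worker: just act on whatever the task was queued with.
    print("processing", payload.get("url", "(no url)"))
```

The point being that the unit you upload can be a single small script plus whatever the task payload carries.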
I couldn't see anything on timings here compared to the node version. Does it keep the code 'hot' or is there the cold start overhead? I assume it has to be the former for this to be at all viable.
I tried out lambda to handle backend requests with a go executable. For a trivial request/response it took 100-300 ms for the first request and 3-100 ms for subsequent requests. Using better hardware tended to help with the subsequent requests but there was still a lot of variance.
This is too slow and has too much variance for my purposes, which is responding to user events. Even if I added an optimization of having a Go process which lives through multiple Lambda calls, the first-request latency sucks. (And I haven't verified that the optimization would work; unfortunately you can't use sockets, which would make it simpler.)
TL;DR: pls support go (with consistent performance)
Consistency is more of an issue for me than speed. That said, I would be happy if a trivial request (prints a short line of output) to a Go handler would consistently be <10ms.
I'm talking about handler duration, which doesn't include network latency. If you use Lambda you can look at the durations in the logs... a trivial Go handler often gets ~5ms if you're using good hardware (but sometimes jumps up to ~100ms).
Jenkins has a Lambda plugin now, so I've set up the typical GitHub branch > webhook > Jenkins > Lambda workflow. Push to a stage branch and the function is updated pretty quickly. Test, then push to prod.
Great to see it move on from just JS (although I know you can run binaries, it's a little awkward).
There was a 60s execution limit in preview; anyone know if that's still around / likely to change? I've got a task to solve at the moment which is basically:
"When a user puts a file on S3, hit an API for each line, store all responses in S3"
Lambda would be awesome for the low end of that; small files would generally be alright. But as the file size grows I'd need to set up workflows to split the file and rejoin the results (and guess how many requests I can make in 60s), which is starting to get into a more complicated setup than I'd really want, so I'd be likely to just rent a machine and set up my own queue processing (again more complicated, but more in my control). Maybe I'm actually just looking for simple grid software and I should just rent some servers.
Essentially, what I'd like to use Lambda for is "Run this code. It's not big or heavy and doesn't require much power, but run it and stop charging me when it finishes". I don't want to request machines, set up disks, compare multiple reserved costs, orchestrate them starting and stopping, or handle updating them, and I don't want to worry about starting 100 5-second jobs and that costing me 100 hours. I'm aware that's rather "moon on a stick", but I do feel like a lot of this is almost there.
I'm missing a big hole in my toolbox since PiCloud shut down. Millisecond pricing, tasks, queues, workflows, custom environments. Anyone here an old picloud user? What have you moved to?
GCE is interesting, with fast boot times and a 10-minute minimum, 1-minute increment pricing model. IronWorker is interesting, but with the potential delay for a standard job (15 min) and the cost vs. computing time, it's hard to justify for my tasks over just renting a machine from OVH ($50/mo would pay for a decent 32 GB machine, so I could run a lot more than the 25/500 limit for the smallest tier), and the 1-hour limits on the smaller plans cause the same issues as the 60s limit on Lambda.
I'm not knocking these services; they're all really impressive. They just unfortunately have various restrictions that limit their applicability to the problems I currently have. If I had slightly different problems, I'd already be using one of these options. If people are using other services, I'd love to hear what you're using!
Edit - offtopic, but every post I've made recently has gone through twice. Chrome 43.0.2357.124 on a mac here. Anyone else having the same problem?
Seems like it would be good to take the file, then dump each line onto a queue, then have lambda run for each thing on that queue. It should parallelize really nicely.
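A sketch of that split step, under the assumption that SQS (or any queue) is the fan-out mechanism; the queue URL and the boto3 send loop are illustrative, not a prescribed setup:

```python
import json

def lines_to_messages(body):
    """One queue message per non-empty line of the uploaded file."""
    return [json.dumps({"line": ln}) for ln in body.splitlines() if ln.strip()]

def enqueue(messages, queue_url):
    # Illustrative fan-out via SQS; a second Lambda would consume
    # each message and do the per-line API call.
    import boto3  # assumed available in the environment
    sqs = boto3.client("sqs")
    for m in messages:
        sqs.send_message(QueueUrl=queue_url, MessageBody=m)
```

Each message then becomes one short, independently-retried Lambda invocation, which is what makes the parallelism (and failure handling) easier to reason about.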
That does sound like quite a good approach, there are a few issues though:
1. Some vague throttling is useful, but I can only do this on an account level with Lambda rather than per function. Trying to call the API in bursts of 10000 requests may be problematic, which would limit our use of lambda for other tasks (for which we may be happy running that many in parallel). The default limit of 100 would probably serve well enough though.
2. I've now got to add splitting and recombining code on each end, with concerns around failed jobs being silently missed from the final output file. Although that extra work may leave me with a better approach to handling some failed jobs out of a large file. Hmm.
Part of the issue is that without the execution limit, this is an amazingly simple script. I've got a 10-20 line Python script which does the actual processing of a file (read, hit the API, store, with a pool of N concurrent requests). Lambda is impressive because it adds only a small level of complexity to a problem and gives you a lot in return; it's just that my use case is so simple that the small amount of added complexity is relatively quite a lot.
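The sort of short script being described might look roughly like this; `call_api` is a stand-in for the real endpoint call:

```python
from concurrent.futures import ThreadPoolExecutor

def process_lines(lines, call_api, workers=8):
    """Hit the API once per line with at most `workers` concurrent
    requests, preserving input order in the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(call_api, lines))

# Reading the file from S3 and writing results back would wrap this,
# e.g. results = process_lines(body.splitlines(), my_api_call, workers=20)
```

With no execution limit, this plus an S3 read and write is more or less the whole job.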
Currently the setup doesn't hit the API; it just creates a dedicated instance to process a batch of data locally, but I'm hoping to simplify things to send everything through the API and just scale and load-balance separately. Having code that automatically turns machines on and is responsible for turning them off makes me a bit nervous :) I've already missed that I'd deleted the shutdown command in a script and left a box running for a day while developing.
Thanks for the suggestion though; I'll try to work through it in more detail today and see if I can find a clean way of dealing with the recombination. I think that's the side I'm less clear on at the moment.
One of the larger pain points with running Hadoop/Spark jobs on Elastic MapReduce is around classpath conflicts, so it would be great to get a page similar to [1], showing what's available on the Java classpath for AWS Lambda functions.
Do you want different clients to connect to the same environment so that you can keep local state in your application? I'm thinking of a video game or a collaborative office tool.
Or do you simply want to send binary data to a function?
Lambda is great for doing lots of tasks in bulk. Say I have a binary that is able to process data quickly.
On a Lambda request, Node.js instructs a binary on what to do using a socket: it binds to a local port and exchanges instructions with the binary until the task is complete, all in under 1 second.
Lambda doesn't allow sockets, so its uses are very limited. Binaries also cut down the time a task takes and allow a greater variety of things to be done.
There are tasks to be done that require binaries, typically 1-second tasks.
This binary needs to interact with Java or Node.js, so it uses sockets for the request (it creates a server and Node.js connects to it to tell it what to do).
A container seems silly since it's a sub-1-second task.
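To make the pattern concrete, here is a self-contained sketch of that local round-trip, with Python standing in for both the binary and the Node.js controller (all names are illustrative):

```python
import socket
import threading

def start_worker():
    """The 'binary' side: bind an ephemeral localhost port and
    answer a single instruction, then exit."""
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)

    def serve():
        conn, _ = srv.accept()
        with conn:
            cmd = conn.recv(1024)
            conn.sendall(b"done:" + cmd)   # acknowledge the instruction
        srv.close()

    threading.Thread(target=serve, daemon=True).start()
    return srv.getsockname()[1]            # port for the controller

def send_instruction(port, cmd):
    """The controller side (Node.js in the comment above)."""
    with socket.create_connection(("127.0.0.1", port)) as c:
        c.sendall(cmd)
        return c.recv(1024)
```

The whole exchange happens over loopback in well under a second, which is why blocking socket use inside a Lambda invocation matters for this style of task.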
Java lambdas have nothing to do with this feature. This allows you to write a function that is evaluated by a Lambda handler; you write only the innermost bit (i.e. "{System.out.println(eachperson)}").
That's really interesting. The JRuby 9.0rc1 binary tarball is 41 MB, which means, theoretically, if you didn't package too many additional gems, you could write your lambdas in Ruby!
I would love to hear about some success stories that involve alternate languages on the JVM. If the 50 MB limit proves to be restrictive, please let me know!
As an example of a use case that fails with the 50MB limit, I'd love to be running Clojure code on Lambda that depends on the Stanford NLP toolkit[1], the minimal form of which is 260MB. (To examine student writing and make useful suggestions to teachers, if you're curious.)
I've just put together a quick Scala fatjar containing just Scala 2.11.5 and json4s (which has a fair few dependencies), and that came in at 27,369,865 bytes, so all looks promising for Scala lambdas!