
All the benchmarks were run from a single instance.

(Note that I have done some testing from AWS Lambda, where we had 1k Lambda jobs all pulling down files from S3 at once. That's a bit harder to benchmark...)
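To give a sense of the setup, here is a minimal sketch of how ~1k concurrent invocations could be fanned out. The function name and payload are hypothetical, and it assumes the AWS SDK for JavaScript v2 (TypeScript):

    // Sketch only, not the exact test harness: fire off `count` async
    // Lambda invocations and wait for the invoke calls to be accepted.
    import { Lambda } from "aws-sdk";

    const lambda = new Lambda();

    async function fanOut(count: number): Promise<void> {
      const invocations = Array.from({ length: count }, (_, i) =>
        lambda
          .invoke({
            FunctionName: "s3-download-bench", // hypothetical function name
            InvocationType: "Event",           // fire-and-forget async invoke
            Payload: JSON.stringify({ jobId: i }),
          })
          .promise()
      );
      await Promise.all(invocations);
    }

    fanOut(1000).catch(console.error);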




Hi OP, nice writeup! I hope my comment wasn't construed as dismissing the work; it was just a criticism of one small part.

It sounds like that wouldn't have been a factor, except for the cap on Amazon's side that you seem to have discovered and called out.

My only suggestion, then, is that you may want to state explicitly that you ran the benchmarks from a single instance.


Thanks! Not at all, it's a great point, and something I hadn't realized would play into the equation.


Any comments on how it worked out with Lambda?


Reluctant to say much because the benchmarks weren't formal. However...

The throughput correlated directly with how much RAM we allocated to the Lambda function (which presumably means we were sharing the VM with fewer other jobs).

     512 MB RAM -> 19.5 MB/s
     768 MB RAM -> 29.8 MB/s
    1024 MB RAM -> 38.4 MB/s
    1536 MB RAM -> 43.7 MB/s

Note that this also used the Node.js AWS SDK, which downloads files more slowly than some other SDKs.
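For context, here is a minimal sketch of the kind of handler involved: time a single S3 download and report MB/s. The bucket and key names are hypothetical, and it assumes the AWS SDK for JavaScript v2 (TypeScript):

    // Sketch only, not the exact benchmark code: download one object,
    // time it, and return the observed throughput in MB/s.
    import { S3 } from "aws-sdk";

    const s3 = new S3();

    export const handler = async (): Promise<{ mbPerSec: number }> => {
      const start = Date.now();
      const obj = await s3
        .getObject({ Bucket: "bench-bucket", Key: "large-test-file.bin" }) // hypothetical names
        .promise();
      const seconds = (Date.now() - start) / 1000;
      const mb = (obj.Body as Buffer).length / (1024 * 1024);
      return { mbPerSec: mb / seconds };
    };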


Thanks. I'd guess that larger RAM allocations land on larger host instance types, and hence get more network bandwidth. If this were my goal, I'd try gof3r to stream the data from S3.



