Hacker News new | past | comments | ask | show | jobs | submit login
Laptop to Lambda: Outsourcing Everyday Jobs to Thousands of Transient Containers (micahlerner.com)
147 points by mlerner on July 25, 2021 | hide | past | favorite | 22 comments



Thank you for this nice writeup! This paper was led by my student Sadjad Fouladi (https://sadjad.org/), part of a broader theme of coercing a "purely functional-ish" design onto everyday applications. There's a less academic-ese version with a few extended results that was published in ;login: magazine (https://www.usenix.org/system/files/login/articles/login_fal...). There was also a good analysis here (https://buttondown.email/nelhage/archive/papers-i-love-gg/) and don't miss https://buttondown.email/nelhage/archive/http-pipelining-s3-... .

Some of Sadjad's other work has included:

- ExCamera, which somewhat kicked off the trend of "fire up 4,000 lambda workers in a burst, all working on one job" -- for things like making a neural network search a video frame-by-frame, video compression in parallel at sub-GOP granularity, etc. (https://news.ycombinator.com/item?id=16197253)

- Salsify, which reused the "purely functional" video codec from ExCamera to improve WebRTC/Zoom-style live video (https://news.ycombinator.com/item?id=16964112 , https://news.ycombinator.com/item?id=20794541). Sadjad is giving an Applied Networking Research Prize talk about this work at IETF tomorrow.

- 3D ray-tracing (running PBRT on thousands of Lambdas, sending rays across the network), SMT/SAT solving, etc.

We're working to extend this line of work towards a more general, Wasm-based, "purely functional" operating system where most computations operate on content-addressed data and are content-addressed themselves, and determinism and reproducibility are properties guaranteed by the OS. Sort of analogous to how the operating systems of today (try to) guarantee memory isolation between processes. Imagine, e.g., a Git repository where you could represent the fact that "blob <x> is the result of running computation <y> given tree <z> as input," and anybody can verify that result, or rebase the computation to run on top of their own input. If you're interested in this general area, please consider doing a PhD at Stanford and/or get in touch -- I'm hiring.
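To make the Git analogy concrete, here is a toy sketch of the "blob <x> is the result of running computation <y> on tree <z>" idea. Everything here (the `trace` registry, the function names, the use of a Python source string as the "computation") is made up for illustration; the point is just that both code and data are addressed by hash, so any party can re-derive and check a recorded result:

```python
import hashlib

def digest(data: bytes) -> str:
    """Content-address a value by its SHA-256 hash."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical registry mapping (computation hash, input hash) -> output hash.
# In the proposed OS this record would live alongside the data itself.
trace: dict[tuple[str, str], str] = {}

def run_and_record(fn_src: str, tree: bytes) -> str:
    """Run a deterministic computation and record 'blob x = fn(tree)'."""
    out = eval(fn_src)(tree)  # toy "computation": a pure Python expression
    key = (digest(fn_src.encode()), digest(tree))
    trace[key] = digest(out)
    return trace[key]

def verify(fn_src: str, tree: bytes) -> bool:
    """Anyone holding fn and tree can re-run and compare content hashes."""
    key = (digest(fn_src.encode()), digest(tree))
    return trace.get(key) == digest(eval(fn_src)(tree))

fn = "lambda b: b.upper()"
run_and_record(fn, b"hello")
assert verify(fn, b"hello")            # recorded claim checks out
assert not verify(fn, b"other input")  # no recorded claim for this input
```

"Rebasing" a computation in this picture is just evaluating the same computation hash against a different input tree and recording the new triple.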


Hi, Keith! Glad to see you're still enjoying hipster compute :).

What do you think about Cloudflare Workers, fly.io, and similar "run pure-ish functions anywhere" platforms? I no longer have any skin in the game, but it seems to me that "ignoring locality" just means having to reinvent locality later on.


Heya, great to see you pop up here! I gotta be honest -- I think EC2 (and, in general, doing computation in units of VMware/Xen-style virtual PCs) is the actual hipster compute substrate. AWS Lambda feels closer to cgi-bin from 1995, i.e. back when things still made sense. (Have you ever joined a tech company and been handed a 10 gigabyte VM image that uses a Vagrant pipeline to provision itself so you can get a working dev environment, except the pipeline only works if 100% of its 10,000 downloads succeed, so the whole thing is super-flaky, but nobody at the company knows because they only ran it once when they first joined and have just kept the same local dev VM ever since? That's what hipster compute means to me.)

All that aside, Cloudflare Workers/fly.io/Fastly Compute@Edge/Lucet/Google Cloud Run seem really cool, and the resulting work on Wasm and its ecosystem is fantastic, but they're also not exactly what excites me. Deploying code close to the edge (or "anywhere" in particular) isn't very important if the application only makes one round-trip. Even if my code is pure, it's not like fly.io is willing to sign a certificate saying, "We evaluated function <y> on input <z> and the correct answer is <x>, signed Fly Inc., and if you can prove us wrong in the next 10 years, our insurance company will pay you $1 million from our E&O policy." Which would really be cool. And, I don't know of people spinning up 4,000 nodes on those systems in 100 ms to do a 1-second-long computation. I haven't seen any of the providers or outsiders benchmarking the "burst-to-N,000-nodes latency" numbers averaged over many trials at various times of day. (We measured GKE a small number of times in the gg paper [fig. 7] and found it to be... really slow at that particular metric.)

I don't think we want to ignore locality! But I do want the OS to be able to secure access to thousands of cores in <1 second for <10 second duration workloads, and I think many applications would be willing to compromise on locality, or accept heterogeneous/irregular locality, in exchange for that. I'd still love visibility into the locality I end up with, I'd love not to have to do flaky NAT-traversal hacks to get direct communication among nodes, and I could imagine the application bidding more to persuade the infrastructure owner to provide computation in larger units (i.e. more cores on fewer machines, machines in a placement group with full bisection bandwidth, etc.), which is sort of where Lambda seems to be heading already.

(Long term, I don't really think applications should be renting cores and RAM per unit time and thinking about locality; I'd love to be dealing with the infrastructure provider in terms of some higher-level abstraction, because then you could imagine the provider might be genuinely incentivized to discover better ways of computing the same answer, to our mutual benefit.)


I’m loving this train of thought, Keith.

What are your thoughts on program correctness and runaway cost? I’m a little uncomfortable running a workload that could scale unexpectedly into a denial of wallet.

For this research, how did you enforce bounds on your workload to prevent exceeding your funding budget? Is the whole compute graph calculated locally? The recursive workloads seem particularly anxiety-inducing.


You can set daily (and weekly, monthly, etc.) budgets in AWS now, which helps. I think other providers have something similar.


> but nobody at the company knows because they only ran it once when they first joined and have just kept the same local dev VM ever since?

Lol. I've got a blog post for exactly this, and it's the reason why I try to push the opposite: Dockerise everything and treat machines like git branches, i.e. disposable. Life's great when you don't have to tread on eggshells around a pristine environment where you're only a single `apt-get install` away from disaster!


I too have been thinking about <<general, Wasm-based, "purely functional">> content addressed computation.

I think it can support both legacy applications and purely functional uses. I really want to support both; the case for linking against any git commit and doing live differential testing is really enticing. I toyed with a serverless deployment system years ago where code was callable by githash and was run directly from git. One could execute any version at any time. This system would be able to automatically rerun executions against new code to track regressions across many dimensions. On failure, the system could fall back to older code paths. TBD how to manage modularity and coherence across sets of functions; restart might need to happen at a much higher level.
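The "rerun executions against new code" idea can be sketched in a few lines. Everything here is hypothetical (the commit hashes, the in-memory `versions` registry, the execution log): the essence is that because code is addressed by commit hash and past executions are recorded, any new commit can be replayed against the full history to surface divergences.

```python
# Code addressed by (fake) commit hash. In the real system these would be
# resolved from git and executed directly.
versions = {
    "a1b2c3d": lambda xs: sorted(xs),                # "old" commit
    "e4f5a6b": lambda xs: sorted(xs, reverse=True),  # "new" commit
}

# Log of past executions: (commit, input, observed output).
executions = [
    ("a1b2c3d", [3, 1, 2], [1, 2, 3]),
]

def rerun_against(new_commit: str) -> list:
    """Replay every logged execution on the new code; report divergences."""
    regressions = []
    for old_commit, inputs, expected in executions:
        got = versions[new_commit](inputs)
        if got != expected:
            regressions.append((old_commit, new_commit, inputs, expected, got))
    return regressions

print(rerun_against("e4f5a6b"))  # the reversed sort diverges from history
```

The fallback-on-failure behavior mentioned above would then amount to routing calls back to the old commit hash whenever the regression list is non-empty.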

For processing an input stream, I think the lambdas would need to be tail recursive so that the internal state could be externally checkpointed: stream_setup / process_chunk / stream_close. process_chunk would need to emit either a total copy of its internal state or a token linking to persistent storage.
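A minimal sketch of that shape, assuming the "total copy of internal state" variant (the function names follow the comment above; the in-memory checkpoint loop stands in for whatever the runtime would actually do):

```python
def stream_setup() -> dict:
    """Initial state for a running-average computation over int chunks."""
    return {"count": 0, "total": 0}

def process_chunk(state: dict, chunk: list[int]) -> tuple[dict, list[int]]:
    """Pure step: returns (new state, outputs) instead of mutating anything,
    so the runtime can checkpoint the returned state between invocations."""
    new_state = {"count": state["count"] + len(chunk),
                 "total": state["total"] + sum(chunk)}
    return new_state, [new_state["total"]]  # emit running total per chunk

def stream_close(state: dict) -> float:
    return state["total"] / state["count"] if state["count"] else 0.0

# The runtime persists `checkpoint` after each chunk; a failed lambda is
# restarted from the last checkpoint rather than from the stream's start.
checkpoint = stream_setup()
for chunk in ([1, 2], [3, 4, 5]):
    checkpoint, outputs = process_chunk(checkpoint, chunk)
print(stream_close(checkpoint))  # 3.0
```

The "token linking to persistent storage" alternative would just replace the returned dict with a content hash of it, pushing the bytes to a blob store.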

Curious what your current set of basis functions are and how failures are accounted for?


> content addressed computation

Isn't this what smart contracts were all about?


> We're working to extend this line of work towards a more general, Wasm-based, "purely functional" operating system where most computations operate on content-addressed data and are content-addressed themselves, and determinism and reproducibility are properties guaranteed by the OS. Sort of analogous to how the operating systems of today (try to) guarantee memory isolation between processes. Imagine, e.g., a Git repository where you could represent the fact that "blob <x> is the result of running computation <y> given tree <z> as input," and anybody can verify that result, or rebase the computation to run on top of their own input.

How different is this from NixOS (https://nixos.org/)? (This isn't a leading question; I don't know the intricacies of how NixOS works well enough to compare it with what you're proposing.)


This is a super interesting direction. I wrote a Wasm OS [0] a few years ago -- it didn't go quite that far, but it could have had I spent an additional couple of years on it.

[0]: https://github.com/nebulet/nebulet


My work uses GCP, not AWS, so I've been experimenting with Google Cloud Run (it's actually parallelizing R code, so I need the Docker container infra). My only problem is that I have very bursty usage and the auto-scaling is too slow. I made one attempt [1] to encourage larger allocation but don't know another way. Do people have experience with this?

[1] Slightly costly, but ~5 minutes before I need it, I set the minimum instance count to a larger number so it starts ramping up, then lower it when I'm done.


> Google Cloud Run (it's actually parallelizing R code, so I need the Docker container infra).

I'm not sure how the docker container support of Google Cloud differs from AWS. But FYI AWS Lambda supports containers as of Dec 2020.[1] The scaling on AWS [2] might be more to your liking.

[1] https://aws.amazon.com/blogs/aws/new-for-aws-lambda-containe...

[2] https://docs.aws.amazon.com/lambda/latest/dg/invocation-scal...


GKE with Autopilot is slightly faster than Cloud Run, but you end up paying for master nodes (if you have more than 1 cluster per region).



This could be very useful for quantum chemistry simulations, which are generally parallelizable and very CPU intensive. If gg gets tweaked to support MPI, this niche could have a breakthrough!


Hi Micah, I'd like to follow these posts but I don't like signing up. Would you mind adding an Atom/RSS feed?


There is a feed. It is titled “untitled”, though, so it may be hard to find in your feed reader after you add it:

https://www.micahlerner.com/feed.xml


Good find -- I searched in the page source and there was no reference to it.


I filed a GitHub issue asking the author to enable the RSS feed, which their web tool (Hugo) has built in.


Thanks!


Thanks for the feedback! Does this Atom feed work for you? https://www.micahlerner.com/feed.xml


Yep, thanks.



