
I don't know if that's a blocker. Ordinary people commonly rent a $40k machine for 38 hours from companies like Avis and Hertz.

If training a large model now costs the same as driving to visit grandma, that seems like a pretty good deal.




That's a great comparison. For a real number, I just checked Runpod and you can rent a system with 8x A100s for $17/hr, or about $650 for 38 hours. Not cheap, but also pretty close to the cost of renting a premium vehicle for a few days. I've trained a few small models by renting a 1x A5000 system, and that only costs $0.44/hr, which is perfect for learning and experimentation.
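Back-of-envelope, in case anyone wants to plug in their own numbers (these are just the rates I saw today, and they drift constantly):

    # Rough rental-cost math; hourly rates are the ones quoted above and will vary.
    configs = {
        "8x A100 (Runpod)": 17.00,  # $/hr
        "1x A5000 (Runpod)": 0.44,  # $/hr
    }
    hours = 38  # the rental window from the parent's car analogy
    for name, rate in configs.items():
        print(f"{name}: ${rate * hours:,.2f} for {hours} h")
    # -> 8x A100 (Runpod): $646.00 for 38 h
    # -> 1x A5000 (Runpod): $16.72 for 38 h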


It would be great if a tradeoff could be made, though. For example, train at 1/10th the speed for 1/10th of the cost.

This could correspond to taking public transport in your analogy, and would bring this within reach of most students.


Slower training tends to be only a little cheaper, because most modern architectures parallelize well and the cost mostly comes down to the total number of FLOPs, not how fast you burn through them.

If you want to reduce cost, you need to reduce the model size, and you'll get worse results for less money.
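To make that concrete: a common rule of thumb is that training FLOPs ≈ 6 × parameters × tokens, so halving your speed while halving the hourly rate leaves the bill unchanged; only shrinking the model (or the data) shrinks the FLOPs. A rough sketch, with made-up throughput and price numbers:

    # Back-of-envelope training cost from total FLOPs (6 * N * D rule of thumb).
    # The throughput, utilization and price below are illustrative assumptions.
    def training_cost(params, tokens, gpu_tflops, utilization, usd_per_gpu_hour, num_gpus):
        total_flops = 6 * params * tokens
        flops_per_second = num_gpus * gpu_tflops * 1e12 * utilization
        hours = total_flops / flops_per_second / 3600
        return hours, hours * num_gpus * usd_per_gpu_hour

    # 1B params on 20B tokens, 8 GPUs at ~150 TFLOPs, 40% utilization, ~$2/GPU-hour
    hours, usd = training_cost(1e9, 20e9, 150, 0.4, 2.0, 8)
    print(f"~{hours:.0f} hours, ~${usd:.0f}")  # roughly 69 hours, ~$1100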


The problem with that is that, currently, available memory scales with the class of GPU... and very large language models need 160-320GB of VRAM. So there sadly isn't anything out there you can load a model this large onto except a rack of 8x+ A40s/A100s.

I know there are memory-channel bandwidth limits and whatnot, but I really wish there was a card out there with a 3090-sized die but 96GB of VRAM, solely to make it easier to experiment with larger models. If it takes 8 days to train vs. 1, that's fine. Having only two of them to get 192GB, still fit on a desk, and draw normal power would be great.
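For a sense of where numbers like 160-320GB come from, here's the weights-only math for a 175B-parameter model; activations, optimizer state and KV cache all come on top, so treat it as a floor:

    # Lower bound on memory just to hold the weights of a 175B-parameter model.
    params = 175e9
    for dtype, bytes_per_param in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
        print(f"{dtype}: ~{params * bytes_per_param / 1e9:.0f} GB")
    # fp32: ~700 GB, fp16/bf16: ~350 GB, int8: ~175 GB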


Technically this is not true - there are a lot of techniques to shard models and store activations between layers, or even between smaller subcomponents of the network. For example, you can split the 175B-parameter BLOOM model into separate layers, load up one layer, read the previous layer's output from disk, and save this layer's output back to disk.
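A minimal sketch of that loop in PyTorch, with a hypothetical load_layer() helper standing in for whatever actually deserializes each shard (libraries like accelerate or DeepSpeed do a fancier version of this):

    import torch

    # Sketch: run a sharded model one layer at a time, keeping only the current
    # layer in GPU memory and streaming activations through disk.
    # load_layer(i) is a hypothetical helper that loads layer i's weights from disk;
    # activations_0.pt is assumed to hold the embedded input tokens.
    @torch.no_grad()
    def forward_through_shards(num_layers, load_layer, device="cuda"):
        for i in range(num_layers):
            layer = load_layer(i).to(device)                   # one layer resident at a time
            x = torch.load(f"activations_{i}.pt").to(device)   # previous layer's output
            y = layer(x)
            torch.save(y.cpu(), f"activations_{i + 1}.pt")     # hand off to the next layer
            del layer, x, y
            torch.cuda.empty_cache()                           # release that layer's VRAM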

And NVIDIA does make cards like you're asking for - the A100 is the fast-memory offering, the A40 the bulk, slower-memory one (though they added the 80GB A100 and did not double the A40 to 96GB, so this is less true now than in the P40 vs. P100 generation).

Oddly, you can get close to what you're asking for with an M1 Mac Studio - 128GB of decently fast memory with a GPU that is ~0.5x a 3090 for training.


Do you know if there's any work on peer-to-peer clustering of GPU resources over the internet? Imagine a few hundred people with 1-4 3080 Tis each, running software that lets them form a cluster large enough to train and/or run a number of LLMs. Obviously the latency between shards would be orders of magnitude higher than in a colocated cluster, but I wonder if that could be designed around?


Bloom-petals
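That's the Petals project for running BLOOM over a swarm of volunteer GPUs. From memory, the client side looks roughly like this, though the exact class and checkpoint names may have shifted between versions:

    from transformers import BloomTokenizerFast
    from petals import DistributedBloomForCausalLM

    # "bigscience/bloom-petals" is the public swarm's checkpoint name as I remember it.
    MODEL_NAME = "bigscience/bloom-petals"
    tokenizer = BloomTokenizerFast.from_pretrained(MODEL_NAME)
    model = DistributedBloomForCausalLM.from_pretrained(MODEL_NAME)  # layers served by remote peers

    inputs = tokenizer("A peer-to-peer GPU swarm is", return_tensors="pt")["input_ids"]
    outputs = model.generate(inputs, max_new_tokens=5)
    print(tokenizer.decode(outputs[0]))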


Amazing. Thank you.


No prob. I think it’s a great idea


I guess this would only become a reality if games started requiring these cards.


Well, if it used to cost you $1 for 1 hour at 1x speed, it will now take you 10 hours at 0.1x speed and, if my math checks out, still cost $1. You need to shrink the model.


But of course now you run it on your own computer instead of in the DC, which changes the numbers. Especially if your student dorm has a shared electricity bill :)


The good news is that, unlike vehicles, the rate for rented compute will continue to drop.


Let's not forget that rendering 3D animations in 3DSMAX or Maya used to take days for a single frame of a complex scene, and months for a few minutes of footage.


You have to gas it up and heaven help you if it gets a scratch or a scuff.


Great news! Cloud instances' energy usage is included in their price, and because they're remote and transient, it's impossible to permanently damage them.


I think the equivalent of not being careful and getting a dent, in this context, is leaving it open to the internet and having a bitcoin miner installed.


You free the instance and the miner is gone.


As you are paying for the resources you use, that's fine.

The closest equivalent would be using some kind of software bug to cause actual physical damage: certainly not impossible, but extremely unlikely compared with physically damaging a car.


A better fit would be if you had unlimited liability, like with AWS, and leaked your key pair. Then someone runs up a $100k bill spinning up mining instances.


But you still have to pay for network ingress/egress traffic.


Similarly, maybe we should only let people rent a NanoGPT box if they are over 25 and they have to get collision insurance.




