To the extent that systems like ChatGPT are valuable, I expect we'll have open source equivalents to GPT-7 within the next five years. The only "moat" will be training on copyrighted content, and OpenAI is not likely to be able to afford to pay copyright owners enough once the value of that content in the context of AI is widely understood.
We might see SETI-like distributed training networks and specific permutations of open source licensing (for code and content) intended to address dystopian AI scenarios.
It's only been a few years since we as a society learned that LLMs can be useful in this way, and OpenAI is managing to stay in the lead for now. But you could see from Satya's expression that he wants to own it outright, so I expect a Microsoft acquisition to close within the next year, and it will be the most Microsoft has ever paid to acquire a company.
MS could justify tremendous capital expenditure to get a clear lead over Google, both in terms of product and IP-related concerns.
Also, from the standpoint of LLMs, Microsoft has far, far more proprietary data that would be valuable for training than any other company in the world.
In retrospect, a lot of the comments you made could also have been said of Google search as it was taking off (open source alternative, SETI-like distributed version, copyright on data being the only blocker), but that didn't come to pass.
Granted, the internet and big tech were young then, and maybe we won't make the same mistakes twice, but I wouldn't bet the farm on it.
There's a ton of work in this area, and the reality is... it doesn't work for LLMs.
Moving from ~900 GB/s of GPU memory bandwidth with InfiniBand interconnects between nodes to 0.01-0.1 GB/s over the internet is brutal (roughly four to five orders of magnitude slower). This works for simple image classifiers, but I've never seen anything like a large language model trained in a meaningful amount of time this way.
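To make that gap concrete, here's a rough back-of-envelope calculation (the 7B-parameter model size and link speeds are my own illustrative assumptions, not figures from the comment above):

    # Time to ship one full set of fp16 gradients for a hypothetical
    # 7B-parameter model (~14 GB) at different link speeds.
    gradient_bytes = 7e9 * 2  # 7B params * 2 bytes each (fp16)

    links = {
        "HBM/NVLink-class (~900 GB/s)": 900,
        "fast internet link (~0.1 GB/s)": 0.1,
        "typical home upload (~0.01 GB/s)": 0.01,
    }

    for name, gb_per_s in links.items():
        seconds = gradient_bytes / (gb_per_s * 1e9)
        print(f"{name}: {seconds:,.1f} s per gradient exchange")

That's milliseconds per exchange on datacenter hardware versus minutes to tens of minutes over consumer links, and a full training run needs a very large number of such exchanges.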
Maybe there is a way to train a neural network in a distributed fashion by training subsets of it and then propagating the aggregated weight changes to adjacent network segments. It wouldn't recover a 1000x interconnect slowdown, but it might still be useful depending on the topology of the network.
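For what it's worth, the closest well-studied idea is local SGD / federated averaging, which amortizes communication by exchanging weights only once per round of many local steps rather than once per step. It's not the segment-wise scheme described above, just a minimal illustrative sketch in PyTorch:

    import copy
    import torch
    import torch.nn as nn

    def local_round(global_model, data_shard, steps=100, lr=1e-3):
        """Each worker trains a private copy of the model on its own data shard."""
        local = copy.deepcopy(global_model)
        opt = torch.optim.SGD(local.parameters(), lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        for step, (x, y) in enumerate(data_shard):
            if step >= steps:
                break
            opt.zero_grad()
            loss_fn(local(x), y).backward()
            opt.step()
        return local.state_dict()

    def federated_average(global_model, worker_states):
        """The only communication point: average the workers' weights once per round."""
        avg = {k: torch.stack([s[k].float() for s in worker_states]).mean(dim=0)
               for k in worker_states[0]}
        global_model.load_state_dict(avg)
        return global_model

Even then, every averaging round still has to move the full weight set over the slow link, so the bandwidth limit bites once per round instead of once per step rather than going away.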