Hacker News
llm.c: multi-GPU, bfloat16, flash attention, ~7% faster than PyTorch (twitter.com/karpathy)
121 points by tosh 5 months ago | 10 comments



It's much faster still compared to stable PyTorch 2.3 (46% faster on A100, per the tweet), and faster again compared to PyTorch 2.2, which was the stable version a couple of weeks ago. llm.c also pulls further ahead when the comparison is run on an H100 instead of an A100, or on multiple GPUs instead of a single one.
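For reference, here's a hedged sketch of how throughput comparisons like these are presumably made: time a fixed number of training steps and divide total tokens processed by wall-clock time. The step() function and the batch/sequence sizes below are placeholders, not llm.c's actual API:

    /* sketch: measuring training throughput in tokens/sec */
    #include <stdio.h>
    #include <time.h>

    #define BATCH 4      /* sequences per step (placeholder) */
    #define SEQ   1024   /* tokens per sequence (placeholder) */
    #define STEPS 10     /* number of steps to time */

    static volatile long sink;
    static void step(void) {
        /* stand-in for one real training step; does busywork
         * so the timed region is nonzero */
        for (long i = 0; i < 1000000; i++) sink += i;
    }

    int main(void) {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < STEPS; i++) step();
        clock_gettime(CLOCK_MONOTONIC, &t1);
        double secs = (t1.tv_sec - t0.tv_sec)
                    + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        printf("%.0f tokens/sec\n", BATCH * SEQ * (double)STEPS / secs);
        return 0;
    }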


I’d be happier with 93% of PyTorch's performance if it worked across multiple GPU manufacturers.


That... wasn't the original intention of the project. It was to create a C version of the PyTorch code that could train GPT-2.
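For a sense of that style, here's a minimal sketch (not llm.c's actual code) of the dependency-free plain-C approach: a single linear layer trained with SGD to drive its output toward zero. llm.c applies the same idea, scaled up to the full GPT-2 forward/backward pass:

    /* sketch: plain-C forward/backward/SGD with no dependencies */
    #include <stdio.h>
    #include <stdlib.h>

    #define IN  4
    #define OUT 2
    #define LR  0.01f

    int main(void) {
        float w[OUT][IN], x[IN], y[OUT], grad[OUT];
        /* initialize weights and a fixed fake input */
        for (int o = 0; o < OUT; o++)
            for (int i = 0; i < IN; i++)
                w[o][i] = (float)rand() / RAND_MAX - 0.5f;
        for (int i = 0; i < IN; i++)
            x[i] = (float)rand() / RAND_MAX;

        for (int t = 0; t < 100; t++) {
            /* forward: y = W x */
            for (int o = 0; o < OUT; o++) {
                y[o] = 0.0f;
                for (int i = 0; i < IN; i++) y[o] += w[o][i] * x[i];
            }
            /* loss = 0.5 * sum(y^2), so dL/dy = y */
            for (int o = 0; o < OUT; o++) grad[o] = y[o];
            /* backward + SGD update: dL/dw[o][i] = grad[o] * x[i] */
            for (int o = 0; o < OUT; o++)
                for (int i = 0; i < IN; i++)
                    w[o][i] -= LR * grad[o] * x[i];
        }
        printf("final y[0] = %f (driven toward 0)\n", y[0]);
        return 0;
    }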


Yeah, I'm sure that's what anyone trying to build some kind of AI startup that's managed to acquire a small handful of A100s or, even better, H100s thinks too. "Those cards sure were expensive, but ethically, I'd rather the software run slower to give me imaginary future options than get the most out of the hardware I just bought."


It’s pretty impressive that PyTorch is only 7% slower than this, given it can be used so generally.


How does it compare to GGML? That's what they must be comparing against, and yet I don't see any comparison made.


CPU or CUDA? If it's C, can it be used on Apple and Intel CPUs and GPUs?


What about the CPU / GPU difference? Is this an improvement across both, or only on GPU?


Created over a period of about 4 weeks by random people all over the internet.


Tinfoil hat time. The recent gpt2-chatbot that everyone thought was a new OpenAI product - could it be?

“You start with the gpt2.c pure CPU implementation, and see how fast you can make it by the end of the course on GPU, with kernels only and no dependencies.”

Remarkably similar nomenclature. I give it a 1% chance this is related. I did play with that chatbot, and whatever it was, it was smarter than GPT-4.



