
With the RAM, CPU, and GPU integrated on Apple silicon, however it's done, it yields perf results. I do think it probably costs more than separately produced RAM. And even apart from that, because they have that unified memory model, unlike every other consumer device, they can charge for it. So 64, 96, or 128 GB?



It's not done for perf results. Xbox doesn't have RAM on package and somehow does 560 GB/s.


The perf result I was referring to was the ability to run an LLM locally (e.g. with llama.cpp) that uses a giant amount of RAM on the GPU, like 40 GB. Without this unified memory model, you end up paging endlessly, so it's actually much faster for this application in this scenario. Unlike on a PC with a graphics card, you can use your entire RAM for the GPU. As far as I know, this isn't possible on the Xbox because it doesn't have unified memory. So having incredible throughput still won't match not having to page.

Edit: I found an example from HN user anentropic, pointing at https://github.com/remixer-dec/llama-mps : "The goal of this fork is to use GPU acceleration on Apple M1/M2 devices.... After the model is loaded, inference for max_gen_len=20 takes about 3 seconds on a 24-core M1 Max vs 12+ minutes on a CPU (running on a single core)."
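
To put rough numbers on the paging argument, here is a back-of-the-envelope sketch. It assumes every weight is read once per generated token, so time per token is bounded by how fast the weights can be streamed. Only the 40 GB model size and the 560 GB/s figure come from this thread; the 64 GB unified pool, the 400 GB/s unified bandwidth, the 24 GB of VRAM, and the 32 GB/s host link are illustrative assumptions, and `seconds_per_token` is a made-up helper, not anything from llama.cpp.

```python
def seconds_per_token(model_gb: float, gpu_visible_gb: float,
                      mem_gb_per_s: float, link_gb_per_s: float) -> float:
    """Rough lower bound on time per generated token.

    Assumes each weight is read once per token. Weights that fit in
    GPU-visible memory are read at mem_gb_per_s; the overflow has to be
    paged in over the slower host link at link_gb_per_s.
    """
    resident_gb = min(model_gb, gpu_visible_gb)
    overflow_gb = model_gb - resident_gb
    return resident_gb / mem_gb_per_s + overflow_gb / link_gb_per_s


MODEL_GB = 40.0          # from the thread
LINK_GB_PER_S = 32.0     # assumption: host-to-GPU link speed

# Assumption: 64 GB unified pool at 400 GB/s, whole model stays resident.
unified = seconds_per_token(MODEL_GB, 64.0, 400.0, LINK_GB_PER_S)

# Assumption: 24 GB of VRAM at the 560 GB/s quoted above; the other
# 16 GB must be paged in over the link for every token.
discrete = seconds_per_token(MODEL_GB, 24.0, 560.0, LINK_GB_PER_S)
```

Under these assumptions the overflow term dominates: the 16 GB that has to cross the link costs far more time than the 24 GB read from fast memory, which is the point about throughput not making up for paging. Real runtimes differ (caching, partial offload, quantization), so treat this as an illustration of the shape of the tradeoff, not a benchmark.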



