It isn't the hobbyists who are making sure that PyTorch and other frameworks run well on these chips. It's teams of engineers at NVIDIA, AMD, Intel, etc., doing it as their primary assigned job, in exchange for salaries their employers pay because they want to sell chips into the enormous demand for running PyTorch faster.
Hobbyist and open-source are definitely not synonyms.
Special mention to the Facebook and Google AI research teams that maintain PyTorch and TensorFlow respectively. And also to ptrblck on the PyTorch forums [1], who seems to have the answer to basically every question. He alone is probably responsible for hundreds of millions of dollars of productivity gains.
People don't usually get employed to make things with no demand, and people who work for companies with a budget line don't really care how big the Nvidia tax is. You can thank hobbyists for creating a lot of demand for compatibility with other cards.
There are so many billions of dollars being spent on this hardware that everyone other than Nvidia is doing everything they can to make competition happen.
I can only point you to cloud financial results and the huge cost of the AI race. Note also the recent story about OpenAI looking at building its own chips. Companies absolutely care immensely about the cost of GPUs. It's billions of dollars.
There is huge demand for AMD cards that can efficiently multiply matrices together. The issue is that while there are currently isolated cases where people can get them to do that, it doesn't yet seem possible at the scale it needs to happen.
AMD are being dragged along by the market. Willingly, in that they aren't fighting it, but their focus has been on other areas.
They've shifted a large pool of experienced engineers from legacy software projects to AI and moved the team under a veteran Xilinx AI director. Fingers crossed, we should see significant changes in 2024.
Didn't he do what he always does? Rake in a ton of money, fart around, and then cash out while exclaiming it's everyone else's fault?
The way he stole Fail0verflow's work with the PS3 security leak after failing to find a hypervisor exploit for months absolutely soured any respect I had for him at the time
To be fair, kernel crashes from running an AMD-provided demo loop aren't something he should have to work with them on. That's borderline incompetence. His perspective was around integration into his product, where every AMD bug is a bug in his product. They deserve criticism, and they responded accordingly (actual resources to get their shit together). It's not like GPU-accelerated ML is some new thing.
That's a tough issue to read through, thanks for the link. 'Your demo code on a system set up exactly as you describe dereferences null in the kernel and falls over.' Fuzz testing + a vaguely reasonable kernel debugging workflow should make things like that much harder to find.
> The way he stole Fail0verflow's work with the PS3 security leak after failing to find a hypervisor exploit for months absolutely soured any respect I had for him at the time
That sounds interesting. I tried googling it but can't really find much, other than that Fail0verflow found a key and didn't release it, and geohot subsequently released his own. I'd love to hear more about how directly he "stole" the work from the Fail0verflow team.
edit: Reading some sibling comments here, it seems you are either mistaken and/or were exaggerating your claim about the "theft" here. As far as I can tell, he simply took their findings and made his own version of an exploit that they had detailed publicly. That may be in poor taste in this particular community but it's certainly not theft. I do agree that his behavior there was lacking in decency, but not to the degree implied here where I was thinking he _literally_ stole their exploit by hacking them, or something similar to that.
People here generally try to bash people who are much smarter than them, throwing shade at their background. They will say that he abandoned his first company and gave up on tinygrad, but both are very much alive projects.
Marcan (of Asahi Linux fame) has talked about it many times before, but here's an abridged version:
Fail0verflow demoed how they were able to derive the private signing keys for the Sony PlayStation 3 console, at CCC I believe.
Geohot, after watching the livestream, raced into action to demo a "hello world!" jailbreak application and absolutely stole their thunder without giving any credit.
This apparently worked pretty well for him, as I still remember him primarily as "that guy who hacked PS3". Some people let someone else do the hard technical core, then do all the other easy but boring stuff and claim 100% credit.
I remember geohot as being one of the people who developed a fairly successful jailbreak for iPhone. I understand that iPhone jailbreaking is often standing on the shoulders of predecessors, but I believe he does deserve significant credit for at least one popular iPhone jailbreak.
Are you interested in being factually correct, or are you interested in hating? If it's the former, I think you should do some research. If it's the latter, :salute:
This is so far from accurate it should be considered libelous; from the link:
> PyTorch/XLA is set to migrate to the open source OpenXLA
So PyTorch on the XLA backend is set to migrate to OpenXLA instead of XLA, but basically everyone moved from XLA to OpenXLA because there is no standalone open-source XLA anymore. So that's it. In general, PyTorch has several backends, including plenty of homegrown CUDA and CPU kernels; in fact, the majority of your PyTorch code runs through PyTorch's own kernels.
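To illustrate that last point, a minimal sketch (assuming a stock PyTorch install; the torch_xla lines in the comments refer to the separate PyTorch/XLA package and are shown only for contrast):

    import torch

    # Ordinary PyTorch: this matmul runs through PyTorch's own ATen kernels
    # (cuBLAS-backed on an NVIDIA GPU, native CPU kernels otherwise). No XLA involved.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    a = torch.randn(1024, 1024, device=device)
    b = torch.randn(1024, 1024, device=device)
    c = a @ b

    # The XLA/OpenXLA path is opt-in and needs the separate torch_xla package,
    # typically for TPUs. Roughly (torch_xla API, not core PyTorch):
    # import torch_xla.core.xla_model as xm
    # c = a.to(xm.xla_device()) @ b.to(xm.xla_device())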
You can use OpenXLA, but it's not the default. The main use case for OpenXLA is running PyTorch on Google TPUs. OpenXLA also supports GPUs, but I'm not sure how many people use that. AFAIK JAX uses OpenXLA as its backend to run on GPUs.
If you use model.compile() in PyTorch, you use TorchInductor and OpenAI's Triton by default.
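For anyone who hasn't used it, roughly what that looks like (a sketch assuming PyTorch 2.x with a CUDA GPU; torch.compile is the functional form of model.compile()):

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))

    # The default backend is TorchInductor, which generates Triton kernels on GPU
    # (and C++/OpenMP code on CPU); backend="inductor" just spells out the default.
    compiled = torch.compile(model, backend="inductor")

    x = torch.randn(8, 64)
    y = compiled(x)  # first call triggers compilation; later calls reuse the compiled code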
Thank you for saying something useful here. I was vaguely under the impression that PyTorch 2.0 had fully flipped to defaulting to OpenXLA. That seems to not be the case.
Good to hear more than a cheap snub. OpenAI Triton as the reason other GPUs work is a real non-shit answer, it seems. And interesting to hear JAX too. Thank you for being robustly useful & informative.
Now what I'd like to see is real benchmarks for compute power. Might even get a few startups to compete in this new area.
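In the meantime, a crude matmul throughput microbenchmark is easy enough to run yourself (a rough sketch, not a substitute for real benchmarks; the matrix size, dtype, and iteration count are arbitrary):

    import time
    import torch

    def matmul_tflops(n=4096, iters=20, dtype=torch.float16):
        # Time n x n half-precision matmuls on the GPU and report TFLOPS.
        a = torch.randn(n, n, device="cuda", dtype=dtype)
        b = torch.randn(n, n, device="cuda", dtype=dtype)
        for _ in range(3):          # warm-up so one-time setup cost isn't timed
            a @ b
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            a @ b
        torch.cuda.synchronize()
        elapsed = time.perf_counter() - start
        return 2 * n**3 * iters / elapsed / 1e12   # 2*n^3 FLOPs per matmul

    print(f"{matmul_tflops():.1f} TFLOPS")

On a ROCm build of PyTorch the same code should work as-is, since that backend reuses the "cuda" device name.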