
Yep, did exactly that. IMO he threw a fit, even though AMD was working with him on squashing bugs. https://github.com/RadeonOpenCompute/ROCm/issues/2198#issuec...



To be fair, kernel crashes from running an AMD-provided demo loop aren't something he should have to work with them on. That's borderline incompetence. His perspective was about integration into his product, where every AMD bug is a bug in his product. They deserved criticism, and they responded accordingly (with actual resources to get their shit together). It's not like GPU-accelerated ML is some new thing.



That's a tough issue to read through, thanks for the link. 'Your demo code on a system set up exactly as you describe dereferences null in the kernel and falls over'. Fuzz testing + a vaguely reasonable kernel debugging workflow should make things like that much harder to find.



