Late reply, but it was a bug that showed up when using XLA GPU devices to add concurrency to the training process. Maybe someone has figured it out or fixed it since, but I've moved on already.



You shouldn't need XLA for multi-GPU training. Have you tried training without it?
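
A minimal sketch of what that could look like without XLA, assuming TensorFlow/Keras (the model, data, and hyperparameters below are purely illustrative): tf.distribute.MirroredStrategy handles synchronous multi-GPU data parallelism on its own, and jit_compile is never set, so XLA is not explicitly enabled anywhere.

    import numpy as np
    import tensorflow as tf

    # MirroredStrategy does synchronous data-parallel training across all
    # visible GPUs; no XLA/jit compilation is requested anywhere here.
    strategy = tf.distribute.MirroredStrategy()
    print("Replicas in sync:", strategy.num_replicas_in_sync)

    with strategy.scope():
        model = tf.keras.Sequential([
            tf.keras.Input(shape=(32,)),
            tf.keras.layers.Dense(64, activation="relu"),
            tf.keras.layers.Dense(1),
        ])
        # jit_compile is left unset, so XLA is not explicitly enabled.
        model.compile(optimizer="adam", loss="mse")

    # Toy data just to make the sketch runnable.
    x = np.random.rand(1024, 32).astype("float32")
    y = np.random.rand(1024, 1).astype("float32")
    model.fit(x, y, batch_size=256, epochs=1)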



