
Yes. We 100% need to scale the number of GPU workers up and back down based on request queue size, which is bursty. Otherwise we could spend five figures a month on GPUs that sit idle for half the day and still have unacceptable waits during traffic spikes.
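
A rough sketch of that kind of queue-driven scaling rule, in Python. Everything here is an assumption for illustration: the function names, the `per_worker_capacity` figure, the min/max limits, and the choice to scale up immediately but scale down one worker at a time. It only computes a target count; actually starting or stopping GPU instances would depend on whatever orchestrator is in use.

```python
import math


def desired_workers(queue_depth, per_worker_capacity, min_workers=0, max_workers=8):
    """Return how many GPU workers the current queue depth calls for.

    per_worker_capacity is a hypothetical tuning knob: the number of queued
    requests one worker can absorb while keeping wait times acceptable.
    """
    if per_worker_capacity <= 0:
        raise ValueError("per_worker_capacity must be positive")
    needed = math.ceil(queue_depth / per_worker_capacity)
    # Clamp to the configured bounds so a spike cannot blow the budget.
    return max(min_workers, min(max_workers, needed))


def next_worker_count(current, queue_depth, per_worker_capacity,
                      min_workers=0, max_workers=8, max_step_down=1):
    """Scale up right away on a burst, but shed workers gradually.

    Stepping down slowly is one assumed way to avoid thrashing when
    traffic is bursty and the queue briefly empties between spikes.
    """
    target = desired_workers(queue_depth, per_worker_capacity,
                             min_workers, max_workers)
    if target >= current:
        return target
    return max(target, current - max_step_down)
```

Called on a timer with the latest queue depth, `next_worker_count` would let the pool drop toward zero during quiet hours and jump back up as soon as requests pile up.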


