
It's for KV caching. In most conversations that will mean inference. But you can do reinforcement learning on sampled sequences, and KV caching can speed up that sampling too, so that's one case where training gets a slight boost.
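For illustration, here's a minimal sketch of what KV caching does during autoregressive sampling (the kind of rollout generation RL would reuse). All names here are made up for the example, not any particular library's API:

    import torch
    import torch.nn.functional as F

    def attend_with_cache(q, new_k, new_v, cache):
        """Append this step's key/value to the cache and attend over the full past."""
        if cache is None:
            k, v = new_k, new_v
        else:
            k = torch.cat([cache[0], new_k], dim=1)  # (1, t, d)
            v = torch.cat([cache[1], new_v], dim=1)
        scores = q @ k.transpose(-2, -1) / k.shape[-1] ** 0.5  # (1, 1, t)
        out = F.softmax(scores, dim=-1) @ v                    # (1, 1, d)
        return out, (k, v)

    # During sampling, each new token only needs its own projections; earlier
    # keys/values come from the cache, so per-step cost stays O(t) instead of
    # re-running attention over the whole prefix from scratch.
    d = 16
    cache = None
    for step in range(4):
        x = torch.randn(1, 1, d)       # embedding of the newly sampled token
        q, new_k, new_v = x, x, x      # stand-ins for learned projections
        out, cache = attend_with_cache(q, new_k, new_v, cache)

The gradient update itself still recomputes activations over full sequences; the speedup is only in generating the rollouts, which is why the boost to training is slight rather than dramatic.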


