Hacker News

That's an interesting idea. Just as LLMs are simply "text predictors" that end up having to learn a model of language and the world in order to predict coherent text, it makes sense that "video predictors" would also have to learn a consistent model of the world. I wonder how many orders of magnitude further they have to evolve to be similarly useful.


