In October I wrote a blogpost on this subject: https://hlfshell.ai/posts/llms-an...

newswasboring · 2024-03-25T08:57:12 1711357032

Working my way through your blog post and it is so refreshing. Unfortunately, my algorithm is currently showing me takes that are extreme on either end (like the ones you describe in your blog post).

> Technology’s largest leaps occur when new tools are provided to those that want to make things.

I love this sentence. And the general attitude of curiosity of your post.

hlfshell · 2024-03-25T16:05:39 1711382739

Thanks! Appreciate the kind words. In the next month or so I should have a follow-up (I'm interviewing and finishing my Master's, so there have been delays) that covers more advancements in router-style VLAs, sensorimotor VLMs, and embedding-enriched vision models in general.

If you want a great overview of what a modern robotics stack would look like with all this, https://ok-robot.github.io/ was really good and will likely make it into the article. It's a VLA combined with existing RL methods to demonstrate multi-tasking robots, and serves as a great glimpse into what a lot of researchers are working on. You won't see these techniques in robots in industrial or commercial settings yet - the field is still too new for them to be reliable or capable enough to deploy on real tasks.