Hi all, excited to share our latest work, OK-Robot, an open and modular framework for navigation and manipulation with a robot assistant in practically any home, without having to teach the robot anything new! You can simply unbox the target robot, install OK-Robot, give it a "scan" (think a 60-second iPhone video), and start asking the robot to move arbitrary things from A to B. We have already tested it in 10 home environments in New York City, and one environment each in Pittsburgh and Fremont.
We built everything on top of the current best machine learning models, so things don't quite work perfectly all the time, and we are hoping to improve it together with the community! Our code is open: https://github.com/ok-robot/ok-robot and we have a Discord server for discussion and support: https://discord.gg/wzzZJxqKYC If you are curious what works and what doesn't, take a quick look at https://ok-robot.github.io/#analysis or read our paper for a detailed analysis: https://arxiv.org/abs/2401.12202
P.S.: while the code is open, the project unfortunately isn't fully open source, since one of our dependencies, AnyGrasp, has a closed-source, educational license. Apologies in advance, but we used it because it was the best grasping model we could get access to!
Would love to hear your thoughts and feedback on this project!