There seems to be a bunch of work in this area, but I have no idea how you measure progress in it; it's not like you can run evaluations on a shared task.
And it's clearly not solved yet either: 76% grab success doesn't really seem good enough to actually use, and that's with 100k real runs.
I don't really know how to compare the difficulty of sim-to-real transfer research with that of sample-efficient RL research, and it's good to have both as viable research directions. But neither seems solved, so I'm not really convinced that "just scaling up PPO" is that practical.
I'm hoping gdb will be able to tell me I'm missing something though.