I agree that 2500x48hrs is probably a reasonable cost to pay for these kinds of sweet results. But it's prohibitively expensive for an ML hobbyist to try to replicate in their own free time. I wonder if there is some way to do this w/o all the expensive compute. Pre-trained models are one step towards this, but so much of the learning (for the hobbyist) comes from struggling to get your RL model off the ground in the first place.
It'd be interesting to see in the graphs (when the OpenAI team gets to them) how good the agent is at X hours in. Because if it's already pretty good at X=4, that's still amazing.
Transfer learning is about the best we can do right now. Using a fully trained ResNet / Xception and then tacking your own layers onto the end is within reach for hobbyists with just a single GPU on their desktop (see the sketch below). There's still a decent amount of learning for the user even with pre-trained models.
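For anyone curious what that looks like in practice, here's a rough sketch in Keras, assuming an ImageNet-pretrained Xception backbone and a binary classification task (the head layers and hyperparameters are just placeholders): freeze the backbone and train only the new layers on your single GPU.

```python
import tensorflow as tf
from tensorflow.keras.applications import Xception

# Pre-trained backbone, minus its original classification head.
base = Xception(weights="imagenet", include_top=False, pooling="avg",
                input_shape=(299, 299, 3))
base.trainable = False  # freeze the expensive-to-train part

# Tack your own task-specific layers onto the end.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=5)  # train_ds/val_ds: your own data
```

Only the small head gets trained, so this fits comfortably on a desktop GPU; you can later unfreeze the top few backbone layers and fine-tune with a tiny learning rate if you need more accuracy.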
+1, this is what I do for my at-home (non-work) experiments using word embeddings and RNNs for generative text summarization. Using transfer learning makes this affordable as a hobby project.
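The same pattern works on the text side. Here's a minimal sketch, assuming pre-trained GloVe vectors (the file path, vocab size, and `word_index` mapping are placeholders for whatever your own tokenizer produces): load the embeddings into a frozen Embedding layer so only the RNN and output head get trained.

```python
import numpy as np
import tensorflow as tf

VOCAB_SIZE, EMBED_DIM = 20_000, 100
word_index = {"the": 1, "summary": 2}  # stand-in for your real token -> id mapping

# Load pre-trained GloVe vectors (hypothetical local path) into an embedding matrix.
embedding_matrix = np.zeros((VOCAB_SIZE, EMBED_DIM), dtype="float32")
with open("glove.6B.100d.txt", encoding="utf-8") as f:
    for line in f:
        word, *vec = line.split()
        idx = word_index.get(word)
        if idx is not None and idx < VOCAB_SIZE:
            embedding_matrix[idx] = np.asarray(vec, dtype="float32")

# Frozen embeddings + a small trainable RNN: only the LSTM/Dense weights update.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(
        VOCAB_SIZE, EMBED_DIM,
        embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
        trainable=False),
    tf.keras.layers.LSTM(128),
    tf.keras.layers.Dense(VOCAB_SIZE, activation="softmax"),  # next-token head
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

Freezing the embeddings cuts the trainable parameter count way down, which is what keeps this kind of project affordable on hobbyist hardware.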