Perhaps I'm stating the obvious here, but why not train with a bunch of images recorded in front of an actual green screen? That way, you can insert any random background and generate as many new training images as you like.
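To illustrate the idea, here is a minimal sketch of that kind of augmentation: chroma-key the green pixels and paste in a random background. All names and the distance threshold are my own invention, and real pipelines would handle soft edges and green spill rather than a hard per-pixel mask.

```python
import numpy as np

def composite_green_screen(fg, bg, key=(0, 255, 0), tol=60):
    """Replace near-green pixels in `fg` with the matching pixels from `bg`.

    fg, bg: uint8 RGB arrays of the same shape.
    key:    chroma-key color to remove.
    tol:    per-channel color-distance threshold for "green enough".
    """
    fg16 = fg.astype(np.int16)
    # Euclidean distance of each pixel from the key color
    dist = np.linalg.norm(fg16 - np.array(key, dtype=np.int16), axis=-1)
    mask = dist < tol  # True where the pixel is screen background
    out = fg.copy()
    out[mask] = bg[mask]
    return out
```

Each green-screen foreground can then be composited onto arbitrarily many backgrounds to multiply the dataset, which is exactly where the objections below come in.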
The network is being trained for photographs, not CGI. I suspect composited images carry different cues than real photos, and training on them would produce a wildly different network. But the green screen idea is still an interesting and worthy proposal.
1. I don't think they have enough images taken in front of a green screen. Just swapping the background has diminishing returns, because the network may start to memorize the foreground subjects.
2. The network may key on lighting differences and other compositing artifacts, and fail to generalize to real photographs.