The site appears to be overloaded at the moment, so I can't read it.
My question (which may or may not be answered in the post): what made you embark on this project?
I don't mean "because I can" stuff - it's an awesome hack. What I mean is: what made the Gameboy Camera culturally relevant for you to hack on? What ideas started this? Did you own a Gameboy Camera in your youth? Did you have a fantasy of something like this then?
> Back in 1998, the Gameboy Camera earned the Guinness world record as the "smallest digital camera". One accessory you could buy was a small printer for printing your images. When I was 10 years old we had one of these cameras at home and used it a lot. Although we did not have the printer, taking pictures, editing them, and playing minigames was a lot of fun. Unfortunately, I could not find my old camera (so no colorized pictures of young Roland, sadly), but I did buy a new one so I could test my application.
> As you can see, the random noise added on top of the image creates the "gradients" you see in Gameboy Camera images, giving the illusion of more than 4 colors.
The general approach being described is called "dithering", and it usually involves more than adding random noise to the image. One common algorithm is Floyd-Steinberg (https://en.wikipedia.org/wiki/Floyd%E2%80%93Steinberg_dither...), which tracks the quantization error at each pixel and diffuses it to neighboring pixels, adjusting their values as the error accumulates.
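For the curious, here's a minimal sketch of Floyd-Steinberg error diffusion, quantizing to the camera's four shades. The function name and the 4-level palette are my own illustrative choices, not code from the post:

```python
import numpy as np

def floyd_steinberg(gray, levels=4):
    """Quantize a grayscale uint8 image to `levels` shades,
    diffusing each pixel's quantization error to its neighbors."""
    img = gray.astype(np.float64)        # work in float, copy the input
    step = 255.0 / (levels - 1)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            old = img[y, x]
            new = np.round(old / step) * step   # snap to the nearest shade
            img[y, x] = new
            err = old - new
            # classic 7/16, 3/16, 5/16, 1/16 error distribution
            if x + 1 < w:
                img[y, x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    img[y + 1, x - 1] += err * 3 / 16
                img[y + 1, x] += err * 5 / 16
                if x + 1 < w:
                    img[y + 1, x + 1] += err * 1 / 16
    return np.clip(img, 0, 255).astype(np.uint8)
```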
> A big problem in trying to create color images of my own face was getting them off the Gameboy Camera. ... What was left was the great method of taking pictures of the screen.
There is another option! You can get a backup device called the "Mega Memory Card" and an EMS 64M Game Boy flash cart. Back up the GB Camera's SRAM with the Mega Memory Card and restore it to the EMS cart. Then you can use the flash cart to transfer the save to a PC and dump the images with software.
I regularly use this method together with a little utility I wrote[1] to get GB Cam images onto my website. The Game Boy Camera is a cool little gadget!
Yeah, I saw the same thing with WGAN. ConvTranspose2d is not great for upscaling because it creates checkerboard artifacts. That said, the post actually recommends a 'subpixel' convolution for upscaling (roughly: convolve out to r² times the target channel count, then use PixelShuffle to rearrange those channels into a 3-channel image at higher resolution), rather than a bilinear/nearest-neighbor resize followed by Conv2d(3, 3).
(A GAN would probably also deliver better colorization results in general.)
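For concreteness, a rough PyTorch sketch of that sub-pixel upscaling idea; the module name, channel counts, and scale factor are my own illustrative choices, not the post's code:

```python
import torch
import torch.nn as nn

class SubpixelUpscale(nn.Module):
    """Upscale by factor r: a plain conv emits C * r^2 channels, and
    PixelShuffle rearranges them into a C-channel image r times larger."""
    def __init__(self, in_channels=64, out_channels=3, scale=2):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels * scale ** 2,
                              kernel_size=3, padding=1)
        self.shuffle = nn.PixelShuffle(scale)  # (B, C*r^2, H, W) -> (B, C, r*H, r*W)

    def forward(self, x):
        return self.shuffle(self.conv(x))

# e.g. a 64-channel 56x56 feature map becomes a 3-channel 112x112 image
x = torch.randn(1, 64, 56, 56)
print(SubpixelUpscale()(x).shape)  # torch.Size([1, 3, 112, 112])
```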
I wonder which is better, actually. On the one hand, a lot of people will swear that what their internal pattern matcher produces is reality (although people with training tend to know that you can't put that much stock in it). So maybe the output of a computer will feel a little less real to laymen.
But I kind of doubt it. I have a feeling that, for example, it's going to be difficult to explain to a jury that the image that the computer spat out from eight pixels on a security feed is not reliable – that any resemblance they see to the defendant is simply not relevant. If an artist took those eight pixels and drew a picture, they'd be laughed out of the room. If a computer does it, people primed by shows like CSI might think that it's actually valid.
Maybe you can appeal to people's common sense, and show them the original input. A crafty defense might show alternative "enhancements" based on non-face training sets to drive the point home. But in the end, we're probably going to need to ban this kind of technology as evidence to avoid confusing jurors.
I wish they had compared against a JPEG with its quality reduced to match the file size of their NN-compressed image. It would be useful to compare what artifacts each introduces and to see just how much better the NN does at the same size.
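One rough way to set up that comparison is to search for the JPEG quality whose output fits the target size. A hedged Pillow sketch; the function name and the walk-down-from-95 search are my own guesses at a sensible approach:

```python
import io
from PIL import Image

def jpeg_matching_size(img: Image.Image, target_bytes: int):
    """Return (quality, jpeg_bytes) for the highest JPEG quality
    whose encoded size does not exceed target_bytes."""
    for quality in range(95, 0, -1):       # walk down from near-max quality
        buf = io.BytesIO()
        img.save(buf, format="JPEG", quality=quality)
        if buf.tell() <= target_bytes:
            return quality, buf.getvalue()
    return None  # even quality=1 exceeds the target size
```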
Any and every machine learning algorithm can be used for compression. But most ConvNets cannot run in true real time in any way, shape, or form, even on a GPU: latencies are on the order of 500 ms to 1 s, which is noticeable even if you have more datacenters than God, like Google does.
What would be really interesting to see in these kinds of articles is how well you could do with just a batch Photoshop operation hand-tuned on a handful of photos, and then a comparison of how much better the NN does.
No, he generated "fake" Gameboy Camera images from the full-color shots by applying a crude form of dithering to grayscale versions. I think that's perfectly reasonable.
What I'm saying is that I would like to see a comparison with a fourth image, produced from the first (black-and-white) image by a batch Photoshop operation. I would guess a suitably tuned application of "brown tint, then selective Gaussian blur, then unsharp mask" would get you pretty close.
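For what it's worth, a quick Pillow approximation of that pipeline. The tint colors, blur radius, and unsharp parameters are guesses, and Pillow has no true "selective" (surface) blur, so a plain Gaussian blur stands in:

```python
from PIL import Image, ImageFilter, ImageOps

def faux_colorize(path):
    """Brown tint -> Gaussian blur -> unsharp mask, roughly mimicking
    the suggested Photoshop batch operation. All parameters are guesses."""
    gray = Image.open(path).convert("L")
    tinted = ImageOps.colorize(gray, black="#2b1a0d", white="#f5e6cf")  # brown tint
    blurred = tinted.filter(ImageFilter.GaussianBlur(radius=1.5))
    return blurred.filter(ImageFilter.UnsharpMask(radius=2, percent=130, threshold=3))
```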
That's pretty much what I thought. The concept is neat, and it does a pretty good job of smoothing the gradients while maintaining detail, which is probably the hardest part.
It would be hard for it to get skin color right, since the camera only sees skin in grayscale, and different tones all get leveled to a similar lightness. So I'm not surprised that skin color was hard, but the report still probably should not have included the line "Note that even skincolor is accurate most of the times". I guess "most" could almost be considered accurate, if the majority of the samples had the same pinkish skin to start with. Almost all the results from the celebrity dataset ended up with the exact same tone, despite wildly different input tones: http://imgur.com/a/daJUa
It would be interesting to see what the real people looked like, and maybe even compare that against what's generated to gauge the accuracy of the generated versions.