The Four Painters: A Video Work Created with Deep Learning (odoruinu.net)
59 points by noradaiko on Dec 23, 2015 | 18 comments



I produced something similar to this shortly after the release of https://github.com/karpathy/char-rnn:

http://i.imgur.com/rb0GJvQ.gifv

Basically, you create a video, dump the video's frames with ffmpeg, run each frame through the RNN, and stitch them back together. It took me several hours to produce just ten seconds of video. Unfortunately, unless you have a Titan X GPU, the maximum size of each image is quite small (certainly less than 1080p), which may be why the frames in this video are split into four quadrants.
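
For anyone curious, a minimal sketch of that pipeline in Python (assuming ffmpeg is on your PATH and an input.mp4 exists; stylize_frame is a placeholder for whatever per-frame model you actually run):

    import glob
    import os
    import shutil
    import subprocess

    def stylize_frame(src: str, dst: str) -> None:
        # Placeholder for the per-frame network pass; here it just copies.
        shutil.copy(src, dst)

    os.makedirs("frames", exist_ok=True)
    os.makedirs("styled", exist_ok=True)

    # 1. Dump the video's frames as numbered PNGs.
    subprocess.run(["ffmpeg", "-i", "input.mp4", "frames/%05d.png"], check=True)

    # 2. Run each frame through the model (the slow part on a small GPU).
    for src in sorted(glob.glob("frames/*.png")):
        stylize_frame(src, os.path.join("styled", os.path.basename(src)))

    # 3. Stitch the processed frames back together into a video.
    subprocess.run(["ffmpeg", "-framerate", "24", "-i", "styled/%05d.png",
                    "-pix_fmt", "yuv420p", "output.mp4"], check=True)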


Nice result! About 5 years ago, I used the same frame dump -> process -> join technique, but with a custom algorithm instead of an NN for the process stage. It was a more manual effort to tweak the filter, but it was fun and came out nice.

I got a new GPU recently and have been doing mostly text RNN these days, but it's fun to look back on this occasionally:

http://binarymax.com/reality_remix.html


I really like your result. The colors especially make it hard for me to believe a computer made that.


> "Split image into 4 parts

My iMac has 2GB of GPU memory because it’s for home-use. It’s insufficient size for larger image output. It can output a little bit larger result when its display resolution set down to 640px x 480px.

It still needed to split images into 4 parts for 720p high quality video.

It caused a side effect which each frame has a border at the joint of the parts."

You seem to be correct about why the frames are split into four quadrants.
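
For the curious, a minimal sketch of that workaround (assuming Pillow; process is a stand-in for the per-tile network pass):

    from PIL import Image

    def process_in_quadrants(frame: Image.Image, process) -> Image.Image:
        # Split the frame into 4 tiles, run each through `process`
        # independently, then paste the results back together.
        w, h = frame.size
        boxes = [(0, 0, w // 2, h // 2), (w // 2, 0, w, h // 2),
                 (0, h // 2, w // 2, h), (w // 2, h // 2, w, h)]
        out = Image.new(frame.mode, (w, h))
        for box in boxes:
            out.paste(process(frame.crop(box)), box[:2])
        return out

Because each tile is stylized without seeing its neighbors, the output is discontinuous at the tile edges, which is exactly the border artifact described in the quote above.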


Take one meeee.....


Something about the current deep learning hype reminds me of fractals and cellular automata back in the day.

They're all very important and useful scientific advances that can be used to produce visuals that seem, almost automatically, to capture something strikingly natural. That seems to make people overreach about exactly how much of nature can be represented by this one method alone.

The world is probably not a fractal, the universe is probably not a cellular automaton, and the mind is probably not a "deep learning system" either (of the specific type currently implied by the term).


Deep learning shines when there is a lot of data and computational power. A lot of important problems don't have that much data, and for those problems there are much better algorithms right now [1]. Deep learning is definitely faddish, though it has its uses.

[1] http://www.sciencemag.org/content/350/6266/1332.full.pdf


The trouble with neural networks is that they're just good enough to keep us from continuing to question our assumptions about human intelligence.

A thousand years ago, even Ptolemy's geocentric model was good enough. After all, it offered an effective means of simulating and predicting planetary motion and solar eclipses. But it was complex. It took men of "great learning" to understand the solar system according to Ptolemy's model, something most grade schoolers today understand intuitively.

In all likelihood, concepts like string theory and neural networks are comparable traps for us today. This is not a pleasant thought, so we're not motivated to search for better alternatives until the current model exhibits an insurmountable flaw.

I'm not saying that string theory or the neural network approach is wrong. Obviously it's not. Just remember that Ptolemy's model wasn't wrong either, at least with regard to the moon.


> Just remember that Ptolemy's model wasn't wrong either, at least with regard to the moon.

Yes, it was, even with regard to the moon. Though, since the Earth-Moon barycenter lies within the Earth (about 4,700 km from Earth's center, inside Earth's ~6,400 km radius), it may have managed to be tolerably wrong with regard to the Earth-Moon system. But still wrong.


I am making jellyfish, lily pads, whales and flowers via Fractal Flame software. All expressionistic and/or impressionistic. [samples: https://twitter.com/SCAQTony/media ]

However, there is something about digital that degrades the art by allowing it to be mass-produced. One print is equal to another, and therefore it almost becomes trash. Example: imagine an original Mickey Mouse drawing versus a rendering of Woody and Buzz from Toy Story.


As a fellow fractal artist, I've been ranting about this to myself lately. I've been thinking that the problem is inherent to algorithmic art in particular; I guess I can actually see some worth in renderings of e.g. Pixar characters, but maybe that's because I have kids who appreciate them, or because I've browsed Pixar art books.

What I've noticed is that in order to make fractal art that typical art-gallery types find worth their time, you have to take on a kind of Simon Cowell role and inject your own intuition or library of culturally attuned sensation memories into the creation phase, turning it into more of an audition phase. "This one just sucks... this one is OK... this one needs a little tweak and it'd look great, kind of like a rainy summer day." In order to make your product useful to other humans, you have to typologize, sometimes really brutally, especially since your computer can just keep pumping this stuff out.

If you go the other way and shun the story, shun the critique, and say "typology is for the closed-minded," you can end up in scientism, completely free of such hyperbolic typology but lacking a compelling presentation of a set of convergences. Lacking a story, lacking any depth with which humans really identify.

No matter how much machinery goes into computing scientific results, the results are still somewhere along a normal curve, and unfortunately that's not super useful to humans. But psychologically we find extreme typologies very useful. As a group, we'd rather say "ooh, a sports car" or "oh wow, dripping ice cream!" than "oh, it's a blob that could be any number of things."

At its most vain, the typology-free approach can end up as a sort of pornography, an obsession with process to the exclusion of context, and to the deep satisfaction of barely anybody.

That's why I really get frustrated with fractal art or generative art. After a point I have to put on my designer hat, and I might as well have painted the thing from scratch anyway. Or I might as well have used the software more like an artist would use Alchemy, as a sort of imagination cue.

https://www.youtube.com/watch?v=zYYSxZZzgjc


I am going straight to the public. I am also going to diversify (a Catwoman painting is in the works). Going to rent a gallery, have a party, get photos of people staring at a few giant Chromaluxe prints on aluminum, and then set up a site. The art will feature anchor pricing: 60"x80" limited-edition Chromaluxe prints for $4,000, or a 24"x36" lithograph for $35. Hoping I sell 10 posters a month of each. I wish you well.


Hey, I really appreciate you sharing that. Those seem like great ideas to me. Best of luck to you as you get that into motion!


It would be more interesting to apply this photo transformation given, as input, not only the painting the artist produced but also a photograph of the thing the artist was trying to represent. Otherwise, all you are doing is recreating the artist's visual tools, not the way the artist translates what they see into that visual palette.

But then, if you tell the algorithm that The Scream is just a painting of a man on a bridge... you're not going to get a very good result.


Does anyone have any idea what style of painting he was actually doing? (In the first one.)


It looks like colored paper covered with black pigment, then the black was scraped away to make the lines.


Thanks.


It says at the end what the four styles are: 1. Vincent van Gogh, 2. Pablo Picasso, 3. Kiyoshi Yamashita, 4. Katsushika Hokusai.



