Artbreeder – Extend Your Imagination with GANs (artbreeder.com)
211 points by etaioinshrdlu on May 11, 2020 | 73 comments



Hey! This is Joel, the maker of Artbreeder (originally called Ganbreeder). The project was inspired by interactive evolutionary algorithms and novelty search (https://www.youtube.com/watch?v=dXQPL9GooyI&t=1s).

The goal was to make a tool that reflected the collaborative and explorative aspects of creativity, and to make GANs (and high-dimensional spaces more generally) accessible to everyone in a fun way.

Let me know if you have any questions :)


Awesome work! I love this concept, but I'm a writer who's interested in collaborative generative writing. Do you know of any efforts to create something similar for writing, such as poetry or micro-fiction? If not, I'm curious if you could point me in the right direction for putting something like that together.


If you'd like to write poetry or fanfiction, I provide GPT-2-1.5b models for both on https://www.gwern.net/GPT-2 - I don't provide a slick interface for writing (although Astralite has an upcoming web interface which is very snazzy), but they could be plugged into any such interface.


Hey, thanks! I don't know of anything that's both collaborative and for writing. https://aidungeon.io/ is maybe collaborative to play. And then there are a lot of writers exploring fine-tuning text models like GPT-2 for personal use. https://twitter.com/MagicRealismBot is one of many examples. https://towardsdatascience.com/how-to-fine-tune-gpt-2-so-you...

Maybe you should make it!


Thank you!


If you check out the archives of NaNoGenMo over at GitHub, you'll likely find what you're looking for:

https://nanogenmo.github.io/

Check out the yearly resources.


A direct transfer of this concept to text isn't possible because the state of the art for text generation is based on transformers, which do not work by sampling from a latent space. The closest analogue of StyleGAN in the NLP world would be VQ-VAE, but currently it doesn't match transformers in generation quality.

There are other ways to do generative writing though, mostly with prompts.



How did you make it so fast? Are the multiple combinations generated in parallel ahead of time once you breed images? Seems like the GPU power must be immense.


The 1024x1024 StyleGAN2 network can generate an image in around 200ms - 500ms. And Artbreeder may be powered by some beefy GPUs, so that might reduce the latency to something like 100ms. Throw in 100ms of server lag time, and it feels almost instantaneous. It's quite a technical achievement!
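
Easy enough to sanity-check. A rough timing sketch in PyTorch, with load_generator standing in for however you load a pre-trained 1024x1024 StyleGAN2 checkpoint (the call convention differs between the official code and its various ports):

    import time
    import torch

    # `load_generator` is a placeholder for loading a pre-trained StyleGAN2
    # generator (e.g. an FFHQ checkpoint); details vary by implementation.
    G = load_generator('ffhq-1024.pkl').cuda().eval()

    z = torch.randn(1, 512, device='cuda')    # one random latent vector

    torch.cuda.synchronize()
    start = time.time()
    with torch.no_grad():
        img = G(z)                            # [1, 3, 1024, 1024] image tensor
    torch.cuda.synchronize()                  # wait for the GPU before stopping the clock
    print(f"generation took {time.time() - start:.3f}s")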


Nothing fancy, just running on AWS :) Specifically, https://aws.amazon.com/ec2/instance-types/g4/ - The uploading is actually the expensive part since it can take ~2 minutes per image.


How much are you paying to keep this up? Is it expensive to run GPU instances like this?


It's not cheap, but there are ways to make it manageable, e.g. an AWS spot g4 instance can be as low as 15 cents an hour.
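
For reference, you can check current spot prices programmatically. A small boto3 sketch (the region and instance type are just examples):

    import boto3

    # Recent spot prices for a g4dn.xlarge in one region (example values only).
    ec2 = boto3.client('ec2', region_name='us-east-1')
    resp = ec2.describe_spot_price_history(
        InstanceTypes=['g4dn.xlarge'],
        ProductDescriptions=['Linux/UNIX'],
        MaxResults=5,
    )
    for entry in resp['SpotPriceHistory']:
        print(entry['AvailabilityZone'], entry['SpotPrice'], entry['Timestamp'])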


How do you handle spot interruptions? Do you keep multiple instances in different zones?


Joel, heads up: the images that aren't supposed to be visible allow directory browsing, e.g. https://s3.amazonaws.com/artbreederpublic-shortlived/ https://s3.amazonaws.com/artbreederpublic/ https://s3.amazonaws.com/ganbreederpublic/ . An S3 CLI can navigate and download the images. Cool project!
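
For anyone wanting to check their own buckets for the same issue, a minimal boto3 sketch using anonymous (unsigned) requests; the bucket name is one of those above:

    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    # An anonymous client: if this listing succeeds, the bucket allows
    # public directory browsing.
    s3 = boto3.client('s3', config=Config(signature_version=UNSIGNED))
    resp = s3.list_objects_v2(Bucket='artbreederpublic', MaxKeys=10)
    for obj in resp.get('Contents', []):
        print(obj['Key'], obj['Size'])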


I had a similar idea inspired by JWildfire but never got enough momentum to make it. You made it! And you have so much content on it already. I appreciate your effort. Spent half a day having fun on your website :)


Thanks! Glad you liked it. JWildfire is super cool, I would love to add a fractal flame category to Artbreeder.


I recall you had 'a font one' in the works, but I can't find anything about it now. Is that still coming? The demo in the header image is really neat!


I still have the model! I started working on the image -> vector conversion (which is more complicated than I expected) and then got distracted. I should finish that up, thanks for the reminder :)


It's such a cool idea; imagine a webfont that changes slightly on each page load.

I could make a document that slowly became more illegible after each new visitor, just for fun!


Great work mate! Any plans of opening up an API for a subscription fee or something? :))


Thanks! No plans at the moment but it’s possible!


Hi Joel,

Given that nVidia has a pretty clear license saying that their model and all derivatives may only be used non-commercially (https://github.com/NVlabs/stylegan2/blob/cec605e0834de5404d5...) and Artbreeder is clearly commercial (https://artbreeder.com/pricing) but is using StyleGAN (https://artbreeder.com/i?k=755521d9fe07456286a06f06), do you feel that Artbreeder is intentionally violating nVidia's license?

For what it's worth, I fully support your project and I nearly launched something similar myself. But many people choose not to do projects such as yours due to licensing considerations, and I'm wondering in general how to reconcile the moral dilemma. On one hand, nVidia almost certainly doesn't care. On the other hand, they did say very clearly "thou shalt not use this model, or any derivative thereof, commercially."

(It's tempting to believe the license only covers the code, not the model. But the same license appears in their download drive: https://drive.google.com/drive/folders/1QHc-yF5C3DChRwSdZKcx...)


Hi there! Good question, it's one I get a lot and don't mind.

First off, I am not a lawyer but have consulted them. A lot of this is a gray area and not well defined. As you mention, there is the code, the given models, models fine-tuned on those models, models trained from scratch, using StyleGAN reimplementations, etc. Even with the license being included, it does not follow that the output of a model trained from scratch counts as derivative work. And technically I just sell things like upscaling, keeping images private, Google sync, etc., and provide unlimited image creation for free (not sure if that would hold up in court, but I think it's relevant). So no, I do not think I am violating the license. But some lawyers may feel differently.

Also, I try hard to cite everything involved. Google was thrilled by my use of the BigGAN model and how it showed off the model. With the amount of money I have spent on Nvidia cards, I hope they are happy too :)

And then image ownership is a whole other question. Everything in Artbreeder is CC0 to keep things simple: https://artbreeder.com/terms.pdf

Best, Joel


Are you saying that Artbreeder is using a model trained from scratch?

If so, was that a recent change? When I checked several months ago, your latent space sliders were compatible with nVidia's FFHQ model: https://news.ycombinator.com/item?id=23149739

The only explanation seemed to be that Artbreeder was fine-tuned from nVidia's released FFHQ model.

To rephrase the question: It seems like you were using nVidia's model (or a fine-tuned version) at some point in the past. At the time, you were also selling subscriptions via your pricing page. Do you just sort of ignore the legal implications and hope that it never becomes an issue?

(That's sometimes an effective strategy.)


^ IANAL, but I could imagine it being a slightly less effective strategy if he replies here in the affirmative.


How does that image link tell you which specific model they use?


Answering that requires a bit of technical detail.

If you click "Edit Genes" (e.g. on the upper-right of https://artbreeder.com/i?k=755521d9fe07456286a06f06) you'll find a bunch of sliders. StyleGAN uses sliders like this to control various facial features. For any given StyleGAN model, you can find a slider for a given attribute as follows: create a classifier for that attribute (for example, "blond hair" probability), generate 50,000 or so outputs from the model, classify each image, then fit a line (a separating hyperplane) through all 50k latents. You now have a blond hair slider for your specific model.
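
In sketch form (sample_latent, generate, and classify_blond are placeholders for the model's latent sampler, its generator, and an attribute classifier):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Label ~50k random latents with the attribute classifier, then fit a
    # separating hyperplane through latent space; its normal is the slider.
    n = 50_000
    latents = np.stack([sample_latent() for _ in range(n)])            # [n, 512]
    labels = np.array([classify_blond(generate(z)) > 0.5 for z in latents])

    clf = LogisticRegression(max_iter=1000).fit(latents, labels)
    direction = clf.coef_[0] / np.linalg.norm(clf.coef_[0])

    # Moving a latent along the direction is the same as turning the slider up.
    edited = latents[0] + 2.0 * direction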

Ok, but sliders aren't unique to nVidia's licensed model. How do we check?

Suppose Artbreeder used a model that was trained from scratch. Since it has a random initialization, its latent space wouldn't be compatible at all with nVidia's FFHQ model. Sliders on Artbreeder wouldn't work with nVidia's model, and vice-versa. A "blond hair" slider for Artbreeder would result in nonsense when applied to nVidia's model.

Artbreeder has an API, through which you can get the latent vector of any given image. You can use this to extract Artbreeder's sliders as follows: Extract the latent vector for a given Artbreeder image; move a slider to the maximum setting; extract the latent vector for the result; subtract the two latent vectors. Presto, you now have the slider.

If you apply the resulting slider vector to latents in nVidia's FFHQ model, you will find that it controls the same facial features on both Artbreeder and nVidia's FFHQ model. They don't match exactly (Artbreeder appears to be fine-tuned from nVidia's base model), but it's very close. So it's clearly derivative, and thus falls under nVidia's "no commercial usage for derivative work" restriction.
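
In sketch form, with get_artbreeder_latent and nvidia_generate as hypothetical stand-ins for the Artbreeder API call and nVidia's FFHQ generator:

    import numpy as np

    # Extract Artbreeder's "blond hair" slider as the difference of two latents:
    # the image as-is, and the same image with the slider at its maximum.
    base = get_artbreeder_latent('755521d9fe07456286a06f06')
    blond_max = get_artbreeder_latent('755521d9fe07456286a06f06', blond=1.0)
    slider = blond_max - base

    # If the models shared no lineage, adding this slider to a latent in nVidia's
    # model would change the image in unrelated ways. If it makes the face
    # blonder, the latent spaces are aligned, i.e. one model derives from the other.
    z = np.random.RandomState(0).randn(*base.shape)
    before = nvidia_generate(z)
    after = nvidia_generate(z + slider)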


[flagged]


I feel bad for asking the question, so maybe I deserved that.

But I asked it because I've spent many nights wondering the same thing about my work in AI, in the United States. Should I pursue something knowing it's legally questionable? How about knowing it's illegal? Big companies seem to walk the tightrope. Joel isn't a big company, yet he's guiding Artbreeder to commercial success. Should I be more ambitious?

These questions affect us all, not just Joel. So I ask it with the sole intention that it might help us navigate the strange waters we often find ourselves in, both as AI researchers and as individuals. I suspect in the coming years that we'll see many more lone-wolf type projects (https://thisfursonadoesnotexist.com/ has recently been making headlines) and legal issues are becoming more and more of an elephant in our shared room.

There are broader ethical questions at play here, too. nVidia trained their model using people's faces, without paying any of them, then claims to hold the rights to limit how people are allowed to make money on the result. Legal, perhaps, but ethical? And if it's unethical, what does that mean we should do, if anything?

Joel's project is a concrete example to point to when someone says "This project has license X, so get real and stop dreaming about starting a business." (The sentiment is shockingly common with almost any AI project, since most AI work comes from big companies with licenses like nVidia's.) Personally, I find it inspiring that someone is doing it anyway.


Don't feel bad that you asked the question. I can tell from content and context that you were asking in good faith. The reply you got, well, let's just say it seemed less so.


It's sad you even had to ask; don't be sad yourself. Your contribution to the conversation taught me several important things I didn't know before about this area, so thanks a bunch!


In a previous career, I was an IP lawyer in the United States, so I've honed a decent sense of what can and can't be said without running afoul of bar guidelines.

> Can we safely take your comment as legal advice in making business decisions about the use of the nVidia model?

What advice were they giving? It seemed like they were mostly asking questions. I don't think any bar in the United States would take those comments as advice, legal or otherwise.

Maybe I missed something. In the interests of quality comments and constructive feedback, maybe you could walk us through your line of reasoning?


FWIW I read sillysaurusx's comment as someone being friendly and pointing out a potential bug, not as snark or arrogance.


He's not nVidia's lawyer either, so what does it matter either way? Please don't do this here.


This is really interesting and fun!

One bit of UX feedback - I played about with children, cross breeding and editing genes and never figured out a mental model for what interactions caused a derivative image to be previewed vs being used to update the current seed image.

Clicking on images often saved them, though I'm not sure what that means for me, the user. Do I need to save a child image in order to derive from it? Even if it's just an intermediate result I want to play with? When I breed images, I see the image on the left is updated, but it doesn't seem to update the seed image being used in the 'Children' tab, or the genes available for editing. I ended up saving and then re-opening images as a relatively safe workflow, but it would have been a lot clearer if there was more UI feedback showing me what's a preview and what's an input at any point.

All those questions were rhetorical btw, just talking through my user experience :)


This is a super-cool project. GANs are extremely interesting to experiment with in a creative manner. One thing I often miss on these websites, however, is what I think is one of the most interesting aspects of GAN visualization - namely, a video feature for showing latent interpolations. I guess this is often skipped because inference is done server-side and there are (naturally) limits enforced to avoid excessive use of resources.

In that vein, I've created a webapp [1] that runs entirely client-side using TensorFlow.js and lets you generate videos of latent interpolations and images, and tweak parameters, using a less-conventional generator trained on many different datasets.

[1]: https://www.thispicturedoesnotexist.com
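
For anyone curious, the interpolation itself is just a walk between two latent vectors. A minimal sketch (generate is a placeholder for whatever generator you use):

    import numpy as np

    # Linear interpolation between two latents; rendering the frames in order
    # gives the morph video. Spherical interpolation is a common refinement,
    # since Gaussian latents concentrate near a hypersphere.
    z0, z1 = np.random.randn(512), np.random.randn(512)
    frames = []
    for t in np.linspace(0.0, 1.0, num=60):
        z = (1.0 - t) * z0 + t * z1
        frames.append(generate(z))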


Very cool! Artbreeder does have an animation tool; just go to create -> category -> animate.

best, joel


As a game master, it's basically an infinite source of NPC portraits and some are even PC-worthy.


Right?! If only I could reliably get elves, orcs, dwarves, etc.


You simply need to find a relatively big dataset of those and train a GAN on it!


This was much more fun than I expected and the results were really impressive. I read the site title and I was like, ehh, it's probably another style-transfer-like generator. Nope, this is probably one of the best generators I've played with!



This is impressive, but at the same time I can't help but notice that art is kind of on its way out when no skill nor time spent doing anything is needed; it's all exchanged for just one click that generates art. I watched the intro video and the narrator was mentioning "no skills are needed", "anybody can do this", etc., and I can't stop thinking about how making art is at anybody's fingertips and yet means so much less. I looked through the generated images and wow, one here and there is not bad looking, but after a few hundred images I said OK and closed the tab. I had a quantity overload. So where is the human if novels can be auto-generated and have a plot and be good novels, and if movies can be made in such a way too, with auto-generated music and dialogue and so on? You get my drift. It's cool, but what is it for, art or entertainment? Is this just a phase that will go away?


Hey! So I feel a lot of creativity is 'combinatorial', i.e. knowing what two things might go well together. Artbreeder kind of gamifies that by making it very easy. So many images may look similar, but some people can really develop their own style with time.

Also, often my favorite part of Artbreeder is when artists take what they save as the inspiration or building blocks for full works. It's really an inspiration tool, but saying inspiration-breeder is a mouthful.

More generally, I think computation can meaningfully augment human creativity by providing surprise and breaking us out of our loops.

Best, Joel


How was the training data acquired though?


Humans generate art by selection. There's no reason you couldn't have an art ecosystem where AIs do all the generation, humans do the selection, and that's how it bootstraps. AI Dungeon 2 and 15.ai are already taking steps to close the loop by using human interactions to score outputs and train on them ("preference learning").


> where is the human if novels can be auto generated...

Not needed. That's where. Better hope you have ownership of some shares in the companies that are generating these things!

> I can't stop thinking how making art is at anybody's fingertips and yet means so much less.

Art has always meant something, but only to those who feel it. There's no intrinsic value, no utility to art except in the eyes of the beholder.


In a sense I agree, but I think that some people used to make a living out of it and that is becoming less and less possible. In the long run fewer people are going to be artists and more will steer toward engineering. I wonder if that is a good thing or not.


It is a good question relevant to the big three creative mediums today (visual, music, and text), which are all seeing very interesting results with regard to neural net generation. Remember that we are still in the very early stages of the development of these techniques. The quality will end up phenomenal and you will be able to generate works perhaps on par in quality with a human's, but in infinitely more volume and at lower cost. There may even be artistic techniques that are simply out of reach without these methods. (Think of trying to make movies in the year 1000... it's not quite the same as today.)

I think we will also see that art is not entirely about communication, but is also to a large degree simply created in the eye of the beholder.


But is it about quantity and speed, though? I think that, as humans, lingering on a subject is actually working/brewing an idea subconsciously. If everything becomes possible with these tools and they become indispensable, we're going to end up with poorer writers. The effort and the hard stuff make good writers...

I've had this vision that one day you may write a simple text and, when done with it, be able to choose a command so that the same content transforms into the style of Italo Calvino or Hemingway or GG Marquez or whatever you want. At the end of the vision I asked myself what it would mean. My answer was the complete vulgarization of everything. This general trend we're heading toward, where anybody can do anything effortlessly, reminds me of Arthur C. Clarke's Childhood's End. Great book btw.


This is really wild and sort of causes my brain some distress, in the same way as that scene from Alien Resurrection when Ripley is in the clone lab with all the failed human-alien hybrids.


This is amazing! What a clever concept to use these for inspiration.


It's a pity that you cannot upload your own image(s) as starting points.

My theory is that because this is built using pre-trained BigGAN models, they do not have access to an encoder to convert an image to the latent space, but only to the generator that converts the latent space to an image.

Accordingly, they can only remix parameters in latent space, but not locate the point in latent space where a real-world photo would reside.
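
If that theory is right, the usual workaround without an encoder is to invert by optimization: search latent space for a vector whose output matches the photo. A minimal PyTorch sketch (G and target are placeholders, and a real projector would use a perceptual loss rather than plain pixel MSE):

    import torch

    # G(z) is a placeholder generator; `target` is the uploaded photo as a
    # [1, 3, H, W] tensor in the generator's output range.
    z = torch.randn(1, 512, requires_grad=True)
    opt = torch.optim.Adam([z], lr=0.01)

    for step in range(1000):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(G(z), target)
        loss.backward()
        opt.step()

    # `z` now approximates the photo's location in latent space, at the cost
    # of many generator passes per upload, which would explain slow uploads.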


> It's a pity that you cannot upload your own image(s) as starting points.

What do you mean? Click for example "create" -> "portrait" -> "upload".


Apparently it works for portraits. I tried the general case, which doesn't offer an upload.


It sure is quite slow, at ~1 hour per uploaded portrait. That looks more like an exhaustive parameter search to me than a trained encoder network.


Sorry, but this site has a terrible, incomprehensible user interface. No help, no tooltips on mysterious icons, no UX flow.


I always thought it would be amazing to have something like this for composing music.



Pretty bizarre: this site reliably causes graphics glitches for me. MacBook Pro 16" + Chrome.

https://imgur.com/a/TJnRSGY


Restart your machine and report if the problem still exists.


How many images do I need to mark as interesting before it generates something? I did about 20 on iOS but nothing happened. Really interested in the project though.


Marking them interesting doesn't update the model. If you click on any image you can 'breed' it with others, or go to the create page to compose or upload something!


I had the same idea for dogs: https://need4breed.github.io


Weird and wild and fantastic!


They forgot one more use case: nightmares! ;D


This is a cool project. Generating 'new' and creative images with computation is a very interesting topic. I was particularly curious about the architectural and interior explorations from this user [1].

I signed up and explored what the free account permits [3,4]; however, I have to say I feel the same about this project as I did about "NLP Tool for Technological Research" (30 days ago) [2]. Unless you can upload an image you're familiar with, something you've worked on, I just don't understand what I'm seeing.

Personally, I enjoy studying composition. I enjoy the abstract works of the Suprematists, Cubists, Futurists, and Abstract Expressionists, all of which I think could be more or less reimagined with computational means.

So for my WOW, I want to see what Artbreeder does to an image I'm already familiar with, or how it can imitate other visual styles.

In the last case I really need to see the input, because if the results appear similar to the input there will be copyright issues. In one of my experiments [5] I used the mixed images feature. This provides a graph which gives the appearance of showing how the image was created. Then again, not seeing any code, I don't know for sure what else was used which may or may not be subject to copyright. I'm not saying the project should be open source, but maybe there is some detail that can be exposed, and used to recreate the effect with a different image. If you can do that, then QED the result is not cribbed from un-cited sources. It's just computational pixel-pushing (heck, is it?).

I did some graphic design work early in my career, and when we "appropriated" an image to be part of a Photoshop composition, the rule of thumb was always that you should not be able to identify the source from what you sampled. Some people spoke about 5-10% of an image; however, if you were doing something with any significant visibility and didn't pay for the source, the first rule was to make it unidentifiable.

I don't have any particular need for images right now. This is just my first impression based on my experience and what I can't get enough of: more abstract art, or something that takes an image I've studied and have intimate familiarity with, and says, "This will look much better upside down!" haha

[1]: https://artbreeder.com/lineage?k=03b855cd8c852c56b338

[2]: https://news.ycombinator.com/item?id=22847429

[3]: https://artbreeder.com/lineage?k=ee55675001b83a64f71f

[4]: https://artbreeder.com/lineage?k=14366daddf2f12087211

[5]: https://artbreeder.com/lineage?k=a8ec4e77d27dc33f92f3


really cool.


This kills the arts


Not really, it can only make derivative art.


Art should not be easy


My little brother is making music and was quite depressed when I showed him AIVA; I can understand the mixed feelings of 'real' artists.


-4 Karma. I'm going for a PB



