Machine Learning Music Composed by Fragments of 100s of Terabytes of Recordings (nwn.blogs.com)
84 points by Kroeler on Jan 16, 2019 | 24 comments



Computer-generated art will very soon have to face the problem that the sheer novelty of a computer generating (at least apparently) meaningful content like photos will run out, and people will start expecting the results to actually be good.


As someone who has played roguelikes for nearly 2 decades now, I'm not holding my breath. For a niche genre with a small developer community, roguelikes punch well above their weight in novelty, complexity, and replay value.

Computer generated art isn't really about computers, it's about what the human artist can achieve by tuning the parameters. Like any other medium, there is an enormous range to operate within. At the same time, there are suitable constraints which tend to enhance (rather than limit) creativity.


> Computer generated art isn't really about computers, it's about what the human artist can achieve by tuning the parameters

That's not "computer generated art", that's "art generated with computers".

I understand that there's a lot of nontrivial tuning to current generative NN models (I spent the downtime between Christmas and New Year's learning to train image-making GANs on several simultaneous DigitalOcean droplets; the major issue for me as a non-expert was balancing the expressive power of the networks against the complexity of training them in a short span of time). But the novelty in the results of a GAN isn't "whoa, a computer enabled me to make this surreal photo-like bitmap", but "whoa, the computer learned how to draw photorealistic noses".

This is partly due to an anthropomorphization that we indulge even for relatively dumb programs, but also due to the fact that when we mess with hyperparameters we're not learning a correspondence with (aesthetic) results, we're at best learning a correspondence with network performance, training time, discriminator vs. generator learning curves, etc.

This will be a poor analogy because I'm typing in a hurry, but a painter of fine representational art learns a precise relation between brush movement (something that he does with muscles) and brush stroke (a result); but the aesthetic result is the brush stroke itself. He eventually learns to think in brush strokes; he exists in very close relation to the canvas while making brush movements.

What we're looking at with computer-generated art is a situation where (1) what is sought is a surprise, not a system-and-method for obtaining finely-chosen aesthetic results from expertly-wielded tools and (2) we seek it by becoming masters at the thing that produces the surprise -- we learn to think in terms of online optimization first and then with the higher order abstractions of dense and convolutional layers and training regimens. We're not making paintings, we're making brushes.

And in the beginning it's awesome that paintings are even possible. But if they fail to continue to attract and capture attention, well, brush-making becomes a money-loser.


> That's not "computer generated art", that's "art generated with computers".

I don't think such a distinction exists.

> whoa the computer learned how to draw photorealistic noses

It didn't, though. You just fed it a whole bunch of photos of people's noses and it learned how to combine them in a way that was pleasing to you, after you spent a long time tuning the parameters and adjusting the algorithm. I fail to see how this is any different from tuning a roguelike's dungeon generating algorithm to produce natural caves and other structures that are pleasing to you. Heck, lots of roguelikes already include datasets of pre-made dungeon pieces (sometimes called vaults) that they can snap together like Lego. This is exactly analogous to brushes in your claim.
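
Here's a toy sketch of that vault idea in Python (all names hypothetical; the pre-made pieces are just small ASCII grids stamped into a larger map at random positions):

    import random

    # A couple of hypothetical pre-made vaults, as ASCII grids.
    VAULTS = [
        ["###",
         "#.#",
         "###"],   # sealed room
        ["...",
         ".#.",
         "..."],   # lone pillar
    ]

    def stamp(grid, vault, top, left):
        # Copy the vault's characters into the map grid.
        for r, row in enumerate(vault):
            for c, ch in enumerate(row):
                grid[top + r][left + c] = ch

    def generate(width=20, height=10, n_vaults=4):
        grid = [["." for _ in range(width)] for _ in range(height)]
        for _ in range(n_vaults):
            v = random.choice(VAULTS)
            top = random.randrange(height - len(v) + 1)
            left = random.randrange(width - len(v[0]) + 1)
            stamp(grid, v, top, left)
        return "\n".join("".join(row) for row in grid)

    print(generate())

A real roguelike adds constraints on top of this (no overlaps, connectivity checks, rotations), but the Lego-snapping structure is the same.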

The only real difference between a roguelike's dungeon generator and a neural network is that the developer has a better understanding of what's going on so that it doesn't always seem like magic. Having said that, roguelikes still tend to surprise their developers every day.


That problem is easily solved by producing enough machine-generated content to drown out all actual human creativity, so people will come to expect and demand that sheen of bland meaninglessness.

(For a less dramatic version of this, consider the role of Auto-Tune.)


Regarding machine learning in general: do you think websites that help people find chords could run into cloud storage problems as well? I'm an amateur asking this, but I'd like to know: if machine learning is used on a website like, say, https://www.guitaa.com/, would it affect the efficiency or responsiveness of the site?

Because if the website is covering the whole YouTube library, it must need a vast amount of cloud storage as well. This question may be a little removed from the main one, but I'd still like to know the answer!


Reminds me of some of the 60's-70's era electro-acoustic experimenting, with the exception of some of the more modern synth sounds that crop up here and there. Actually, during the first couple of minutes Berio (https://en.wikipedia.org/wiki/Luciano_Berio) came to mind, though I'm not exactly sure how fair that reference really is.


Rather abrasive music in my opinion, although I suppose "abrasive music" is still music, so maybe this is qualified praise?

How does "100s of Terabytes" compare to the amount of music the average human composer/musician listens to over the course of their career?


> I suppose "abrasive music" is still music

I recently read that three distinct legal definitions of "art" have developed (in German courts, at least). The details are interesting, but the key notion is that in the strictest sense, art is an expression of emotions, so in some sense this isn't music. Jarringly, under this definition the expression must be recognizable. Of course that's debatable, and I don't mean to review the different philosophies. The only insight is that it depends, and whether something elicits emotions or not is subjective. It's a useful attempt at a definition to differentiate intention and accident.

Edit: Actually, watching it now, I have to say it's pretty cool, and I wonder how much human input was given to control and curate the creation. NN synths over wavetables or granular synthesis are actual instruments one can play, for comparison. And I hold that whatever AI generates, it's still due to the authors.

> How does "100s of Terabytes

It's more than a lifetime.

If a minute of music is 1 MB as MP3, then that's a whopping n × 100 × 1,000,000 minutes. With n = 1, that's over a million albums.

Even if the algorithm works with uncompressed PCM data, the roughly tenfold size difference can be made up for by the n factor.
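
A quick sanity check of that arithmetic in Python (assuming 1 TB = 10^6 MB and ~60-minute albums):

    mb_per_minute = 1                  # ~128 kbps MP3 is roughly 1 MB/minute
    terabytes = 100                    # "100s of terabytes", with n = 1
    minutes = terabytes * 1_000_000 / mb_per_minute
    albums = minutes / 60              # ~60-minute albums
    print(f"{minutes:,.0f} minutes, ~{albums:,.0f} albums")
    # 100,000,000 minutes, ~1,666,667 albums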


Very roughly, a 3 minute song is ~5MB, and so 100 TB ~ 60M minutes ~ 114 years; needless to say, this is many times the amount of music one person could listen to in their lifetime.

Edit: additionally, it's possible (likely?) that the music is downsampled before training, so this figure may be something of a lower bound.
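
The same estimate as a throwaway Python snippet, using the figures above:

    song_mb, song_minutes = 5, 3       # ~5 MB per 3-minute song
    minutes = 100 * 1_000_000 / song_mb * song_minutes
    years = minutes / (60 * 24 * 365)  # nonstop, 24/7 listening
    print(f"{minutes/1e6:.0f}M minutes, ~{years:.0f} years")
    # 60M minutes, ~114 years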


Maybe they should have used less data of 'higher' quality then?


It's probably hard to train a computer on music that has been MP3 or AAC encoded since that throws away a lot of the acoustic information. Amazon's Alexa APIs for example upload WAV audio.


It's probably not much more abrasive than the music it's sampled from, if the source is mostly modern classical music.

I'm really surprised by how good this is. At times it sounds a lot like something from certain popular experimental electronica artists, e.g. Daniel Lopatin, Tim Hecker.


If you start from terabytes of some of the highest quality orchestral recordings available, you have to work quite hard to make the output sound terrible.

It's a very nice demo piece, and vastly better than most of the NN efforts around.

But it still has the usual problem that it's sort-of-random orchestral wallpaper. Even if you create a workable grammar - by NN accident or design - you still have to solve the problem of intention and agency to generate music that isn't just mimicry or wallpaper.

I've heard some very, very good classical pastiche writing now - piano, chorales, orchestra - so it's essentially a solved problem technically.

Not so much artistically though.


The Rite of Spring is pretty abrasive in everyone's opinion.


Not to mention the opening of Beethoven's Op. 130


Little did we know that an AI studying over 100 years' worth of music would approximate Merzbow.


Too mellow for Merzbow. And not as interesting.


"Inner AI Mystique"!


Fast forward to the 7-minute mark. The start of Stravinsky's Le Sacre du printemps, which continues on to become something else. Nice remix.


Potentially important question: is the ML system a "derivative work" of the input music? How about its output?


The bit I randomly skipped to sounds pretty terrible. Barely "music".

I've definitely heard much better AI-generated music, e.g. stuff from WaveNet.


Thanks for sharing. I have always found machine learning fascinating; to see it put to use for something beautiful is a wonderful experience.


Nothing impressive, but it's a start.



