I agree, the samples sound very natural. I do wonder, though, how similar they are to the training data, since it would be trivial to rearrange individual pieces of a large training set in ways that sound good (especially if a human selects the good samples for presentation afterwards).

What I'd really like to see, therefore, is a systematic comparison of the generated music to the training set, ideally using some measure of similarity.
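To make that concrete, here is a minimal sketch of one such comparison, using mean MFCC vectors over fixed windows as the similarity measure. The choice of features, the sample rate, window length, and file paths are all my own assumptions for illustration, not anything specified by the WaveNet work:

```python
# Embed short audio windows as mean MFCC vectors and report, for each
# generated clip, the distance to its nearest neighbor in the training set.
import numpy as np
import librosa

def mfcc_windows(path, sr=16000, win_seconds=1.0):
    """Split a clip into fixed-length windows; embed each as a mean MFCC vector."""
    y, _ = librosa.load(path, sr=sr)
    win = int(win_seconds * sr)
    feats = []
    for start in range(0, len(y) - win + 1, win):
        m = librosa.feature.mfcc(y=y[start:start + win], sr=sr, n_mfcc=20)
        feats.append(m.mean(axis=1))  # average over time frames
    return np.array(feats)

def nearest_training_distance(generated_path, training_paths):
    """Smallest Euclidean distance from any generated window to any training window."""
    train = np.vstack([mfcc_windows(p) for p in training_paths])
    gen = mfcc_windows(generated_path)
    # Pairwise distances between every generated window and every training window
    d = np.linalg.norm(gen[:, None, :] - train[None, :, :], axis=-1)
    return d.min()
```

A generated sample whose windows all sit unusually close to training windows would point toward memorization rather than novel composition.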




A nice property of the model is that it is easy to compute exact log-likelihoods for both training data and unseen data, so one can actually measure the degree of overfitting (which is not true for many other types of generative models). Another nice property is that it seems to be extremely resilient to overfitting, based on these measurements.
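As a sketch of how that measurement works, assuming a WaveNet-style autoregressive model over 8-bit quantized samples whose forward pass returns per-step logits under teacher forcing (the `model` interface and batch format are my assumptions, not the actual DeepMind code):

```python
import torch
import torch.nn.functional as F

def mean_log_likelihood(model, batches):
    """Average exact per-sample log-likelihood (in nats) over a dataset.

    Each batch is an integer tensor of shape (B, T) with values in [0, 256).
    Assumed convention: model(x) returns logits of shape (B, T, 256), where
    position t predicts sample t given samples < t (teacher forcing).
    """
    total_ll, total_count = 0.0, 0
    model.eval()
    with torch.no_grad():
        for x in batches:
            logits = model(x)                          # (B, T, 256)
            log_probs = F.log_softmax(logits, dim=-1)  # exact per-step log-probs
            ll = log_probs.gather(-1, x.unsqueeze(-1)).squeeze(-1)  # (B, T)
            total_ll += ll.sum().item()
            total_count += ll.numel()
    return total_ll / total_count

# train_ll = mean_log_likelihood(model, train_batches)
# heldout_ll = mean_log_likelihood(model, heldout_batches)
# A small gap (train_ll - heldout_ll) is evidence against overfitting.
```

Because the model factorizes the joint distribution into a product of per-sample conditionals, this log-likelihood is exact, which is what makes the train/held-out comparison meaningful.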


Good point! Are (some of) the chords completely made up, for example, or is it only using chords it has heard before?


Filtering out certain notes from a piano chord can be done by e.g. Melodyne, but that seems far from what's necessary to generate speech, so it would surprise me if WaveNet could do that.



