So char-by-char models are the next Word2Vec then. Pretty impressive results.
It would be interesting to see how it performs on other NLP tasks. I'd be pretty interested to see how many neurons it uses to attempt something like stance detection.
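For what it's worth, the paper finds its "sentiment neuron" by fitting an L1-regularized logistic regression probe on the model's hidden states and counting which units get nonzero weight, so you could ask the stance-detection question the same way. A minimal sketch assuming scikit-learn and precomputed hidden states; the file names and the C value here are made up:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Assumed inputs: the char model's final hidden state for each text
# (n_examples x n_units) plus stance labels. File names are hypothetical.
hidden_states = np.load("hidden_states.npy")
labels = np.load("stance_labels.npy")

# L1 regularization pushes most unit weights to exactly zero, so the
# surviving nonzero weights show how many units the probe really needs.
probe = LogisticRegression(penalty="l1", solver="liblinear", C=0.25)
probe.fit(hidden_states, labels)

n_used = np.count_nonzero(probe.coef_)
print(f"probe uses {n_used} of {hidden_states.shape[1]} hidden units")
```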
> Data-parallelism was used across 4 Pascal Titan X GPUs to speed up training and increase effective memory size. Training took approximately one month.
Every time I look at something like this I find a line like that and go: "ok that's nice... I'll wait for the trained model".
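For the curious, here's roughly what that data-parallel setup looks like in code. This is a minimal sketch assuming PyTorch, not whatever framework they actually used; the paper's model was a 4096-unit multiplicative LSTM, a plain LSTM stands in here, and the class and sizes are illustrative:

```python
import torch
import torch.nn as nn

class CharRNN(nn.Module):
    """Stand-in for the paper's char model (which was a 4096-unit mLSTM)."""
    def __init__(self, vocab_size=256, hidden_size=4096):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.rnn = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, vocab_size)

    def forward(self, x):
        h, _ = self.rnn(self.embed(x))
        return self.head(h)

model = CharRNN()
if torch.cuda.is_available() and torch.cuda.device_count() > 1:
    # DataParallel replicates the model on each GPU and splits every batch
    # across them, which is what buys the larger effective batch/memory.
    model = nn.DataParallel(model.cuda())
```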
Yeah, part of what let word2vec make such a splash, to the point that it became the one word embedding model everyone has heard of, is that the word2vec team released their trained model.
This is a really cool example from OpenAI, but I don't know why I should ultimately care about their character model more than anyone else's if all we've got is their description of how cool it is.
I hope OpenAI defies their reputation for closedness and releases the model.
> I don't know why I should ultimately care about their character model more than anyone else's if all we've got is their description of how cool it is.
Well, an unsupervised technique that learns this much meaning from text is amazing! I meant it when I said this might supplement word2vec, and that would make it one of the most important breakthroughs in years.
The comments critical of OpenAI don't make a lot of sense. They have always been very good at releasing stuff, and my comment about waiting for a trained model should be read as jealousy over not being able to train it myself.