Can you elaborate on this, please? My goal is to work in text processing professionally, and my intuition is that text should 'learn' its own features. I'm learning Python, reading about machine learning, and practicing my coding all the time.
Recently I had to build an algorithm that did a form of substring detection. As part of that, I had to generate feature vectors for the model that classified each token of the string as part-of-substring or not. Before that, though, I had to do several passes of tokenization, normalization, and preprocessing to get the raw text into a form the substring classifier could use effectively.
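To give a flavor of what I mean, here's a toy sketch of that per-token featurization step. The tokenizer, the normalization, and the features themselves (casing, length, digit-ness, position) are all made-up stand-ins, not the actual ones I used:

```python
import re

def tokenize(text):
    # crude whitespace/punctuation tokenizer (illustrative only)
    return re.findall(r"\w+|[^\w\s]", text)

def normalize(token):
    # minimal normalization: lowercase
    return token.lower()

def featurize(tokens):
    # one hand-built feature vector per token, for a downstream
    # part-of-substring classifier to consume
    feats = []
    for i, tok in enumerate(tokens):
        norm = normalize(tok)
        feats.append([
            int(tok[0].isupper()),       # starts with a capital letter
            len(norm),                   # token length
            int(norm.isdigit()),         # purely numeric token
            i / max(len(tokens) - 1, 1)  # relative position in the string
        ])
    return feats

tokens = tokenize("Order ID 12345 shipped.")
vectors = featurize(tokens)
```

The point is that every one of those features is something I had to decide on and compute by hand before the model ever saw the data.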
Ideally, if you have a good parse of the text from step 0, you don't need to do nearly as much munging/processing yourself, and can just focus on the thrust of your specific algo rather than on cleaning and generating the feature data that drives it.
I'm personally very skeptical as to whether this will make feature engineering disappear entirely, if for no other reason than that we do a lot to tweak our features beyond just _getting_ them, whether that's second-order processing, aggregation, smoothing, or transformation. That said, like your parent post, I am VERY hopeful for new techniques that can cut into that overhead at least a little.
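Concretely, by "tweaking beyond getting" I mean steps like these, sketched with hypothetical helpers (a moving-average smoother over a per-token score, and mean-pooling per-token vectors into one aggregate vector). Even with perfect raw features, you'd still write code like this:

```python
def smooth(values, window=3):
    # centered moving average over a per-token score,
    # shrinking the window at the edges
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def aggregate(vectors):
    # second-order step: mean-pool per-token feature vectors
    # into a single document-level vector
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

smoothed = smooth([0, 0, 1, 0, 0])
pooled = aggregate([[1, 2], [3, 4]])
```

Neither of those is feature *extraction*; they're the downstream massaging that I don't see learned representations eliminating on their own.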