Deep learning for NLP resources (github.com/andrewt3000)
91 points by turkdt on Oct 26, 2015 | 14 comments



I'm quite happy to see this, although I can't speak to the quality of the resources (it's not my area at all). What makes me happy is that this is a reasonably sized set of up-to-date resources mixed with some well-known names (Ng and Hinton). I wish my research area had something like this!


Agreed! I've been trying to put a list like this together for Big Data use cases (Hadoop, Storm... ??) and have had trouble finding up-to-date resources, as the space moves so quickly. Now what we need is a list of these lists.


Ion Stoica's Big Data Systems research class at Berkeley has a pretty solid reading list to get you started:

http://www.cs.berkeley.edu/~istoica/classes/cs294/15/class.h...


While the Coursera course by Hinton looks amazing, it also appears the last offering was in 2012. I haven't used Coursera before; is there a way to access the class material after a class has finished?


The link goes straight to all the videos and materials.


Thanks, I didn't click through the enroll button.


It's surprising that there's no mention of any CNN-based methods for NLP. It's a different approach that seems to provide results on par with the state of the art on some NLP tasks [1].

[1] http://arxiv.org/abs/1408.5882
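
For readers who haven't seen the approach, here's a minimal sketch of a CNN sentence classifier in the spirit of [1]: filters of a few widths slide over the word embeddings, max-over-time pooling collapses each feature map, and a linear layer classifies. This is written in modern PyTorch purely for brevity, and the layer sizes are illustrative rather than the paper's exact configuration.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TextCNN(nn.Module):
        """Sketch of a CNN sentence classifier in the spirit of [1]; sizes are illustrative."""
        def __init__(self, vocab_size=10000, embed_dim=300,
                     num_filters=100, filter_sizes=(3, 4, 5), num_classes=2):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            # One 1-D convolution per filter width; each filter acts as an n-gram detector.
            self.convs = nn.ModuleList(
                nn.Conv1d(embed_dim, num_filters, k) for k in filter_sizes)
            self.fc = nn.Linear(num_filters * len(filter_sizes), num_classes)

        def forward(self, token_ids):                  # (batch, seq_len)
            x = self.embed(token_ids).transpose(1, 2)  # (batch, embed_dim, seq_len)
            # Max-over-time pooling: keep only each filter's strongest response,
            # so the output size no longer depends on sentence length.
            pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
            return self.fc(torch.cat(pooled, dim=1))   # (batch, num_classes)

    logits = TextCNN()(torch.randint(0, 10000, (8, 20)))  # 8 sentences, 20 tokens each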


I think there are two major reasons why CNNs might not be a good model here. The first is that CNNs assume translation invariance, which is pretty common in images, but sentences in natural language don't have that structure. The other is that in NLP the outputs usually have varying length, which is why RNNs and LSTMs are so popular these days.
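
On the varying-length point, here's a small hypothetical PyTorch sketch of why recurrence sidesteps it: the same LSTM cell is applied at every step, so input length is unconstrained, and at generation time the output simply runs until an end-of-sequence token.

    import torch
    import torch.nn as nn

    vocab_size, hidden_size, EOS = 20, 32, 0   # toy sizes; EOS id chosen arbitrarily
    embed = nn.Embedding(vocab_size, 16)
    cell = nn.LSTMCell(16, hidden_size)
    readout = nn.Linear(hidden_size, vocab_size)

    h, c = torch.zeros(1, hidden_size), torch.zeros(1, hidden_size)
    token = torch.tensor([3])                  # arbitrary start token
    generated = []
    for _ in range(50):                        # safety cap; real length is set by EOS
        h, c = cell(embed(token), (h, c))
        token = readout(h).argmax(dim=1)       # greedy choice of the next token
        if token.item() == EOS:
            break
        generated.append(token.item())
    print(len(generated))                      # output length is not fixed by the architecture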


I think you're right when thinking about CNNs on words. It's the max-pooling usually combined with CNNs that helps with translational invariance, less so the CNN filters themselves (which, if it were a full convolution, would show up in the complex phase).
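
A tiny numpy illustration of that point (mine, not the commenter's): with max-over-time pooling, a filter produces the same feature whether its matching n-gram appears at the start or the end of the sentence.

    import numpy as np

    def ngram_feature(sentence, filt):
        # Slide one filter over the word embeddings and keep only the max response.
        width = filt.shape[0]
        responses = [np.sum(sentence[i:i + width] * filt)
                     for i in range(len(sentence) - width + 1)]
        return max(responses)

    rng = np.random.default_rng(0)
    filt = rng.normal(size=(3, 4))              # one filter over 3-grams of 4-dim embeddings
    pattern = 5 * filt                          # a trigram that strongly matches the filter
    noise = lambda n: 0.1 * rng.normal(size=(n, 4))

    early = np.vstack([pattern, noise(7)])      # matching trigram at the start
    late = np.vstack([noise(7), pattern])       # same trigram at the end
    print(ngram_feature(early, filt) == ngram_feature(late, filt))   # True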

I think it makes more sense when CNNs are applied at the character level. The filter banks then activate for specific n-gram patterns of characters, like certain prefixes, suffixes, and root words. The higher-level LSTMs are then relieved of having to understand that level of structure. Also, tokenization is hard and might be especially wrong for media with grammatical abuse like Twitter; working at the character level avoids that janky preprocessing. See: http://arxiv.org/abs/1508.06615
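
A rough sketch of that idea, loosely following the linked paper: character embeddings go through n-gram filters with max pooling to build each word's vector, and the sequence of word vectors feeds an LSTM. The paper's actual model has more to it (e.g. highway layers); everything below is illustrative only.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CharWordEncoder(nn.Module):
        def __init__(self, num_chars=100, char_dim=16, num_filters=64, widths=(2, 3, 4)):
            super().__init__()
            self.char_embed = nn.Embedding(num_chars, char_dim)
            # Each filter bank fires on specific character n-grams (prefixes, suffixes, roots).
            self.convs = nn.ModuleList(nn.Conv1d(char_dim, num_filters, w) for w in widths)
            self.out_dim = num_filters * len(widths)

        def forward(self, char_ids):                       # (batch, word_len)
            x = self.char_embed(char_ids).transpose(1, 2)  # (batch, char_dim, word_len)
            return torch.cat([F.relu(c(x)).max(dim=2).values for c in self.convs], dim=1)

    class CharCNNLSTM(nn.Module):
        def __init__(self, hidden=256):
            super().__init__()
            self.word_encoder = CharWordEncoder()
            self.lstm = nn.LSTM(self.word_encoder.out_dim, hidden, batch_first=True)

        def forward(self, char_ids):                       # (batch, seq_len, word_len)
            b, s, w = char_ids.shape
            words = self.word_encoder(char_ids.view(b * s, w)).view(b, s, -1)
            outputs, _ = self.lstm(words)                  # (batch, seq_len, hidden)
            return outputs

    h = CharCNNLSTM()(torch.randint(0, 100, (2, 12, 9)))   # 2 sentences, 12 words, 9 chars each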


CNNs are used in NLP all the time for a range of problems; see Collobert and Weston's work, "Natural Language Processing (Almost) from Scratch".

Even with images, you're able to zero-pad the input to deal with varying sizes.
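
For instance (a hypothetical snippet, not from the thread), padding a batch of variable-length sentences to a common length so a CNN sees fixed-size input looks like this:

    import torch
    from torch.nn.utils.rnn import pad_sequence

    sentences = [torch.tensor([4, 8, 15]),           # token ids of different lengths
                 torch.tensor([16, 23, 42, 7, 2])]
    batch = pad_sequence(sentences, batch_first=True, padding_value=0)
    print(batch)
    # tensor([[ 4,  8, 15,  0,  0],
    #         [16, 23, 42,  7,  2]])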


Since it's on GitHub, you could fork the repo, add any resources, and submit a pull request! Or create an issue that brings the omission to everyone's attention.


Here's one by LeCun: http://arxiv.org/abs/1502.01710


It's strange, because CNNs are so popular in image processing. Are there any more papers besides that one?


Also see https://github.com/robertsdionne/neural-network-papers which includes a lot of research applicable to NLP.



