I took a stab at trying to interpret the topics output by this run of LDA. Green is one of the clearest: generally convolutional deep nets, image classification, empirical work.
Brown seems to have picked up on linear algebra. "Vector", "matrix", "tensor" and "decomposition" all get consistently labeled brown, as do "eigenvalues", "orthogonal" and "sparse".
The rest are not as useful. Black almost always has "number", "set", "tree" and "random", but little else. Purple at times seems to signify topic modeling, but also contains "neural" and "feedforward". Blue seems to be the stats topic, containing "Bayes", "regression", "Gaussian", and "Markov" processes. But it also contains random words like "university" and "international".
Overall, very interesting. I wonder if these topics would be even better defined with a higher setting of k.
Karpathy had a different interpretation (in the green bar at the top of the page). For example, purple would be neuroscience.
In addition to adjusting k, another change that might be interesting would be to also include previous years' papers in the model estimation. Changes in component (topic) weights year-over-year could perhaps reveal something about the topics, or the papers.
Yup, it seems k has been fixed since these scripts were first made for NIPS 2012 (?). Some of the more well-established advances since LDA would also likely help, like HDP.
It's only in the last 12 months that it became clear this was possible. The Ng "Zero Shot Learning" paper came out at NIPS 2013, and given the lead time for a paper like that I think they must have started work at about that time.
There are many machine learning libraries that have good implementations of LDA (e.g. Gensim), so it should be "relatively" straightforward to create the topics and clustering based on the abstracts of the papers.
I think there might be confusion about what nl was referring to. Yes, the link is to a list (produced by Karpathy) of papers on which LDA has been performed.
But one of the listed papers is also by Karpathy ("Deep Fragment Embeddings for Bidirectional Image Sentence Mapping"), and I think this might be what nl is complimenting as being done quickly.
When the papers mention that code will be released, is that right now, or when the conference happens? I couldn't find any links to the code in any of the papers, including the Karpathy one.