List of review articles on ML and AI that are on arXiv

Uehreka · on Jan 26, 2019

This is at once an awesome and overwhelming list. Major kudos to whoever took the time to put it together. I wonder if there’s a way to tag these or group these into categories so that they’d be easier to bite into.

etiam · on Jan 26, 2019

"Arxiv Sanity Preserver: Because things are seriously getting out of hand" --Andrej Karpathy

:)

painful · on Jan 30, 2019

An experimental RSS feed for it is now available: https://us-east1-ml-feeds.cloudfunctions.net/arxiv-ml-review...

It's configured to be updated once a day.

tomrod · on Jan 26, 2019

I agree. Though auto-generated, that someone chose to do this and share their curation with the world is wonderful.

painful · on Jan 26, 2019

Categorization would definitely be good to have, but it requires the use of good quality ML to discern them accurately :) Meanwhile, please use Ctrl+F.

A higher priority is to serve an RSS feed for the results.

OkGoDoIt · on Jan 26, 2019

RSS feed would be great! Even better would be to pull the paper’s content out into the content body of the feed, so I could read it directly in an RSS reader. Probably no easy task, but I can dream.

painful · on Jan 30, 2019

An experimental RSS feed for it is now available. Please find the link to it at the new project repo: https://github.com/ml-feeds/arxiv-ml-reviews

painful · on Jan 26, 2019

The feed is in the works, but its content body will only contain the abstract, not the full paper.
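As a rough illustration of a feed whose item body carries only the abstract, here is a minimal sketch using Python's standard library. The function name `build_feed`, the entry fields, and the example.org URLs are assumptions made for this example; this is not the project's actual code.

```python
import xml.etree.ElementTree as ET


def build_feed(entries):
    """Build a minimal RSS 2.0 document; each item body is the abstract only."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # Channel metadata below is placeholder text, not the real feed's values.
    ET.SubElement(channel, "title").text = "arXiv ML reviews (example)"
    ET.SubElement(channel, "link").text = "https://example.org/feed"
    ET.SubElement(channel, "description").text = "Example feed"
    for entry in entries:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "link").text = entry["link"]
        # Only the abstract goes into the body, not the full paper.
        ET.SubElement(item, "description").text = entry["abstract"]
    return ET.tostring(rss, encoding="unicode")


# Hypothetical entry used only to exercise the function.
feed_xml = build_feed([
    {
        "title": "Example survey",
        "link": "https://example.org/abs/0000.00000",
        "abstract": "Example abstract text.",
    }
])
```

A feed reader would then show the abstract inline and link out to the paper itself.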

mlevental · on Jan 26, 2019

It's auto-generated; it says so at the top. I don't see the point of this at all, since you could reproduce it by simply searching for "survey" or "introduction". At minimum, a citation count would've been helpful to distinguish well-written ones from poorly written ones.

yorwba · on Jan 26, 2019

The list of terms to include/exclude looks like it has taken some trial and error to compile: https://github.com/impredicative/arxiv-ml-reviews/blob/maste...

For example, it excludes "aerial survey", "peer review" and similar false positives.
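The include/exclude idea can be sketched roughly as follows. The term lists here are short and hypothetical, taken only from terms mentioned in this thread; the project's real lists are longer, and its matching logic may well differ.

```python
import re

# Hypothetical term lists for illustration only.
INCLUDE_TERMS = ["survey", "review", "introduction"]
EXCLUDE_TERMS = ["aerial survey", "peer review"]


def _contains(term: str, text: str) -> bool:
    """Return True if `term` occurs in `text` as a whole word or phrase."""
    return re.search(r"\b" + re.escape(term) + r"\b", text) is not None


def looks_like_review(title: str) -> bool:
    """Keep a title if it matches an include term and no exclude term."""
    text = title.lower()
    # Exclusions are checked first so that a false positive such as
    # "aerial survey" is rejected even though it contains "survey".
    if any(_contains(term, text) for term in EXCLUDE_TERMS):
        return False
    return any(_contains(term, text) for term in INCLUDE_TERMS)
```

With these lists, a title like "A Survey of Deep Learning" passes, while one containing "aerial survey" or "peer review" is dropped.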

joker3 · on Jan 26, 2019

It'd be easy to split them by year or subject, as that's provided by the arXiv.

painful · on Jan 26, 2019

Just how is the subject provided by arXiv? By subject, do you mean the categorization such as stat.ML, cs.AI, etc.?
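Assuming each paper record carries a primary category such as stat.ML or cs.AI and a publication date, splitting by subject or year could look roughly like this. The records and the field names `primary_category` and `published` are hypothetical placeholders, not the project's data model.

```python
from collections import defaultdict

# Hypothetical records; real ones would come from arXiv metadata.
papers = [
    {"title": "Example survey A", "primary_category": "cs.AI", "published": "2019-01-20"},
    {"title": "Example survey B", "primary_category": "stat.ML", "published": "2018-11-02"},
    {"title": "Example review C", "primary_category": "cs.AI", "published": "2018-06-15"},
]


def group_by(records, key_func):
    """Group paper titles under the key computed by `key_func`."""
    groups = defaultdict(list)
    for record in records:
        groups[key_func(record)].append(record["title"])
    return dict(groups)


# Split by subject (primary category) and by year (first 4 characters of the date).
by_subject = group_by(papers, lambda p: p["primary_category"])
by_year = group_by(papers, lambda p: p["published"][:4])
```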

mlevental · on Jan 26, 2019

Yeah, sure, why not? Just a tiny bit more curation would've made this actually useful. As it stands now, it's just an unordered list.

painful · on Jan 26, 2019

It's currently ordered by date, with most recent on top. It's not unordered.

theblackcat1002 · on Jan 26, 2019

I recently wrote a service that predicts the future citation counts of articles from arXiv and IEEE and ranks them accordingly, which saves you the time of reading all the articles. It's still a work in progress, especially the keyword filtering part. Link: https://www.notify.institute/

iamantee · on Jan 26, 2019

As Panoramix asked, could you share more details about your project? From your website, I only gathered that you are analyzing papers based on author info and the citation history of the author's previous works, and filtering papers with techniques like topic modeling. Though the project is still in progress, will you share what you have done so far and what you plan to do next?

theblackcat1002 · on Jan 27, 2019

My model is based on these papers [1,2,3]. I found that adding paper metadata such as table count and page count improves my model's performance (R^2 score for the citation count 2 years later). For now, I am working on a better filtering method using word embeddings, so that the keyword "CNN" would also match papers about convolutional neural networks.

1. Xiao, Shuai et al. “On Modeling and Predicting Individual Paper Citation Count over Time.” IJCAI (2016).

2. Dong, Yuxiao et al. “Can Scientific Impact Be Predicted?” IEEE Transactions on Big Data 2 (2016): 18-30.

3. Yan, Rui et al. “Citation count prediction: learning to estimate future citations for literature.” CIKM (2011).
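The embedding-based keyword filtering mentioned above might be sketched like this. The 3-dimensional vectors are invented toy values, and `expand_keyword` and its threshold are hypothetical; a real system would use trained word or phrase embeddings, and this is not the service's actual method.

```python
import math

# Toy vectors invented for illustration; not real embeddings.
TOY_EMBEDDINGS = {
    "cnn": [0.9, 0.1, 0.0],
    "convolutional neural network": [0.85, 0.15, 0.05],
    "rnn": [0.1, 0.9, 0.0],
    "aerial survey": [0.0, 0.1, 0.95],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def expand_keyword(keyword, embeddings, threshold=0.95):
    """Return all known terms whose vector is close to the keyword's vector."""
    query = embeddings[keyword.lower()]
    return sorted(
        term for term, vec in embeddings.items()
        if cosine(query, vec) >= threshold
    )
```

With these toy vectors, expanding "CNN" returns both "cnn" and "convolutional neural network", so a filter on the expanded set would catch papers that use either form.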

Panoramix · on Jan 26, 2019

How does it work? Do you take into account the authors' previous success? You have a typo at the top of your page btw.

theblackcat1002 · on Jan 27, 2019

1) It predicts the citation count 2 years later using a mix of features from the article, its authors, and the venue where it was published.

2) I guess H-Index and previous citation count stats (mean, max, min). But I find the most influential factors are the author's H-Index, publication venue, and author rank.

3) Thanks, I just fixed it.
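For readers unfamiliar with the R^2 score used earlier in this thread to evaluate the 2-year citation predictions, here is a minimal sketch of the standard coefficient of determination. It shows only the metric, not the author's model, and the function name is chosen for this example.

```python
def r2_score(actual, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

A score of 1.0 means the predicted citation counts match the actual ones exactly, while 0.0 means the model does no better than always predicting the mean citation count.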

gumby · on Jan 26, 2019

Someone needs to work on the deep learning problem of automatically curating these things and surfacing the important ones.

I suspect it's harder than the self-driving car problem.