R packages I wish I'd known about earlier (yhathq.com)
104 points by glamp on Feb 12, 2013 | 15 comments



For some reason, I find it harder to find what I'm looking for in R's help than in just about any other language. That's why you end up not knowing about useful packages for a very long time. Is that just me?


There's an R-specific search engine called Rseek: http://www.rseek.org/

It's still not perfect, because if you don't know the right keyword for a concept you might not get what you're looking for. But it's pretty good.


When looking for an R package that does something specific, I usually go to the R package list [1] and search for what I am looking for ("SQL" if I want an SQL connector, or "sparse" if I want something for sparse matrices). I generally end up finding a package that does what I need.

[1] http://cran.r-project.org/web/packages/available_packages_by...


It's not just you. This was a huge problem for me at first. When you're googling, add "R cran" to your query.

You'll get way better results.


This article should really just be called "Everyone should just follow Hadley's Github". Actually, someone should write that.


If only all such lists could include example code and visualizations, very nicely done. Thanks!


Thanks! If you like the visuals, check out Hadley's talk at Google: http://www.r-bloggers.com/engineering-data-analysis-with-r-a...


If you want to do any sort of stock analysis, quantmod is really great, as it bundles together a number of related packages for financial applications: http://www.quantmod.com/.
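
For a taste, here's a minimal sketch of the quantmod workflow (the "AAPL" ticker is just an example):

    library(quantmod)

    getSymbols("AAPL")   # pulls OHLC price data into an xts object named AAPL
    chartSeries(AAPL)    # candlestick chart with volume
    addBBands()          # overlay Bollinger bands on the chart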

If you have to deal with directed graphs, there's iGraph: http://igraph.sourceforge.net/.
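
A hedged sketch of building a directed graph, using the current igraph API (the edges are invented for illustration):

    library(igraph)

    # "-+" points the edge at the vertex on the right
    g <- graph_from_literal(A -+ B, B -+ C, A -+ C)
    degree(g, mode = "out")   # out-degree of each vertex
    plot(g)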

Related Blog Posts

http://www.r-chart.com/2010/06/stock-analysis-using-r.html

http://www.r-chart.com/2010/06/analyze-twitter-data-using-r....


ddply can be very slow. I strongly recommend getting your data into the form you want with something map-reduce-y, _then_ throwing it into R for analysis and graphing.
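
For readers who haven't seen it, the ddply pattern in question looks roughly like this (the data frame and column names here are made up):

    library(plyr)

    # split by carrier, summarise each group, combine the results
    ddply(flights, .(carrier), summarise,
          avg_delay = mean(delay, na.rm = TRUE))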


I never cease to be amazed at how people who work with data large enough to make tools like plyr slow assume that everyone works with data like that.

I have been using R daily for 7-8 years now and have only occasionally turned to something like data.table for performance reasons. "Big data" receives waaay more attention and hype than there are actual human beings working on data of that scale. I can assure you that for the vast majority of R users worldwide, plyr is plenty fast enough for their needs.


Sounds like you may have some experience that could assist me. I have six 3.6GB CSV files that I can't even get into R (reading them in just never finishes; the program freezes), much less manipulate the data. I've not used any map-reduce type tools before. Is there a tool I can use to leverage that and get the data into R?


data.table deserves a mention here too. It simplifies and speeds up data frame operations.

See ?data.table.
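
A hedged sketch of the basics (column names invented); fread() in particular helps with the giant-CSV problem mentioned above:

    library(data.table)

    dt <- fread("big_file.csv")   # much faster than read.csv on large files
    # grouped aggregation with data.table's DT[i, j, by] syntax
    dt[, list(avg = mean(value)), by = group]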


Absolutely. The improvements to code and performance are really spectacular.


Caret (http://caret.r-forge.r-project.org/) is another package that should make the list if you are doing machine learning in R.

Caret gives you a common interface to a huge list of classifiers. Plus, it has some nice functions for data preparation.
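
A minimal sketch with the built-in iris data (the method and resampling scheme are just illustrative; "rf" needs the randomForest package installed):

    library(caret)

    fit <- train(Species ~ ., data = iris, method = "rf",
                 trControl = trainControl(method = "cv", number = 5))
    print(fit)   # cross-validated accuracy for each tuning value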


The list of database connectors should also include RSQLite:

http://cran.r-project.org/web/packages/RSQLite/index.html
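
The DBI workflow is pleasantly uniform across these connectors; here's a hedged sketch with RSQLite (the file and table names are invented):

    library(RSQLite)

    con <- dbConnect(SQLite(), "example.db")
    dbWriteTable(con, "mtcars", mtcars)   # load a data frame into SQLite
    dbGetQuery(con, "SELECT cyl, AVG(mpg) FROM mtcars GROUP BY cyl")
    dbDisconnect(con)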

A really useful article though. I'm going to investigate randomForest.





