`dplyr`, the data manipulation library in R.

Some prefer base R, data.table, or Python's pandas, but for me, dplyr is essentially a _perfect_ data manipulation library for small to medium data, and nothing else comes close.

Given that data cleaning is “80% of the work” for a data scientist, and that science is increasingly data science, dplyr has done a LOT of good.

Honorable mention goes to ggplot2, the plotting library, and the rest of the tidyverse.




I use OpenRefine for that (I don't use R though). How comparable is it, in your view, apart from the fact that one is a library for a language and the other is a GUI?



