
> data.table is a completely separate implementation of a data structure that is functionally similar to a data frame but designed from the ground up for efficiency, though with some compromises, such as eschewing R's typical copy-on-modify paradigm.

This is totally false. data.table inherits from data.frame. Sure, it has some extra attributes that a tibble doesn't, but the way classing works in R is so absurdly lightweight that this is meaningless in comparison. Both tibble and data.table are data.frames at their core, which are just lists of equal-length vectors. You can pass a data.table wherever you pass a data.frame.
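
A quick way to check the inheritance claim, as a minimal sketch that assumes the data.table and tibble packages are installed (the variable names are only illustrative):

```r
# Assumption: data.table and tibble are installed.
library(data.table)
library(tibble)

dt <- as.data.table(iris)
tb <- as_tibble(iris)

# Both class vectors end in "data.frame".
class(dt)
class(tb)

# Underneath, both are ordinary R lists of column vectors.
typeof(dt)
typeof(tb)

# Code that only asks "is this a data frame?" accepts either one.
is.data.frame(dt)
is.data.frame(tb)
```

This only shows that the class and storage type match; it says nothing about whether every data.frame method behaves identically on them.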




Thank you for the correction. I knew that tibbles were essentially just data frames with an extra class attribute, but for some reason I didn't realize this was also true of data.table. I think I assumed that data.table's reference semantics couldn't be implemented on top of the existing data frame class, but I guess I was wrong about that. Unfortunately it's too late for me to edit my original comment.
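
For what it's worth, here is a minimal sketch of reference semantics sitting on top of the data frame class, assuming the data.table package is installed. The helper names `add_col_dt` and `add_col_df` are made up for illustration:

```r
# Assumption: data.table is installed.
library(data.table)

dt <- data.table(x = 1:3)
df <- data.frame(x = 1:3)

# Hypothetical helper: `:=` adds the column by reference,
# so the caller's data.table is modified.
add_col_dt <- function(d) d[, y := x * 2L]

# Hypothetical helper: ordinary copy-on-modify,
# so the caller's data.frame is left alone.
add_col_df <- function(d) {
  d$y <- d$x * 2L
  d
}

invisible(add_col_dt(dt))
invisible(add_col_df(df))

names(dt)  # now includes "y"
names(df)  # still just "x"

# The object modified in place is still a data frame by class.
inherits(dt, "data.frame")
```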


Tibbles are not just data frames with an extra class attribute. For one, they don't support row names. Second, consider this example, which shows how treating tibbles as data frames can be dangerous:

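    library(tibble)

    df_iris <- iris
    tb_iris <- as_tibble(iris)

    # x[, colname] drops to a vector for a data frame but stays a tibble for a tibble
    nunique <- function(x, colname) length(unique(x[, colname]))

    nunique(df_iris, "Species")
    #> [1] 3

    # length() of a one-column tibble is its number of columns
    nunique(tb_iris, "Species")
    #> [1] 1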

The R-package-devel mailing list had a long discussion about this too: https://stat.ethz.ch/pipermail/r-package-devel/2017q3/001896...
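
One way to sidestep the difference, as a sketch rather than a general recommendation, is to extract the column with `[[`, which returns a plain vector for both classes. This assumes the tibble package is installed; `nunique_safe` is a made-up name:

```r
# Assumption: tibble is installed.
library(tibble)

# Hypothetical variant of nunique(): `[[` always yields the column vector,
# so unique() and length() see the same thing for both inputs.
nunique_safe <- function(x, colname) length(unique(x[[colname]]))

res_df <- nunique_safe(iris, "Species")
res_tb <- nunique_safe(as_tibble(iris), "Species")

res_df  # 3
res_tb  # 3
```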


Ok, fine, to be more precise: tibbles, data frames, and data tables are all implemented as R lists whose elements are the vectors that form the columns of the table. Also, `is.data.frame` currently returns TRUE for all of them, whether or not that is ultimately correct.



