Hacker News

If the data can fit on a thumb drive it's not big data.



I think I read this somewhere here a few months ago (paraphrasing, obviously): "When the indices for your DB don't fit into a single machine's RAM, then you're dealing with Big Data, not before."


And following up: Your laptop does not count as a "single machine" for purposes of RAM size. If you can fit the index of your DB in memory on anything you can get through EC2, it's still not Big Data.


There's still a roughly 25x difference between the biggest EC2 instance and a maxed-out Dell server (244 GB for EC2 vs 6 TB for an R920). Not to mention non-PC hardware like SPARC, POWER and SGI UV systems that fit even more.
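
For anyone who wants to check the arithmetic, here is a quick sketch using only the two memory figures quoted above. The binary conversion (1 TB = 1024 GB) and the variable and function names are my own assumptions, not anything from a vendor spec sheet.

```python
# Memory figures as quoted in the comment; 1 TB = 1024 GB is an assumption.
GB_PER_TB = 1024

ec2_max_gb = 244               # largest EC2 instance cited above
r920_max_gb = 6 * GB_PER_TB    # maxed-out Dell R920 cited above


def memory_ratio(big_gb, small_gb):
    """Return how many times larger big_gb is than small_gb."""
    return big_gb / small_gb


print(round(memory_ratio(r920_max_gb, ec2_max_gb), 1))  # 25.2
```

With decimal units (1 TB = 1000 GB) the result shifts only slightly, to about 24.6.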


This is true, but at the upper end the "it isn't Big Data if it fits in a single system's memory" rule starts to get fuzzy. If you're using an SGI UV 2000 with 64 TB of memory to do your processing, I'm not going to argue with you about using the words "Big Data". ;-) I figured using an EC2 instance was a decent compromise.


Would it be fair to approximate it as, "if you can lift it, it's not big data"?


If a single file of the data can fit on the biggest single disk commonly available, it's not big data.



