
As I pointed out a while back (https://news.ycombinator.com/item?id=22974882), the "SHARD" system described in that paper didn't actually have anything to do with "sharding" as the term is currently used. It was designed to replicate data, but it didn't do any kind of partitioning; each replica stored a copy of the entire dataset.

For that reason (in addition to the low number of citations), I think it's very likely that the name is a total coincidence. Pretty much any word you can think of has been used by somebody as an acronym for some project.




I'm the person you replied to in that thread, and in support of your point: after that discussion, I spent some time crawling through the proceedings of Very Large Data Bases (VLDB) and the ACM Digital Library, and I could find no instances of "shard" used to mean the partitioning of a database prior to 2001. (The 2001 paper is "Minerva: An automated resource provisioning tool for large-scale storage systems" in Transactions on Computer Systems, free-to-read at https://dl.acm.org/doi/abs/10.1145/502912.502915.)

On the other hand, I found many papers citing the SHARD paper - more than the official count. That's a difficulty with citation counts of old papers: a lot of the papers citing it are also old papers, and we're not consistent at tracking the citations of old papers. Personally, I don't have a conclusion. The SHARD paper is decently cited, and its usage is close to the modern one. On the other hand, I can't find any smoking gun pre-1997 usage of "shard" in the modern meaning.


Interesting, thanks for putting a lot more effort into answering this question than I did!



