
> "smart storage" essentially labeled blocks with an MD5 hash when you went to store them

Asking out of curiosity - isn't this similar to what Venti (from Plan 9) did? Of course, Venti was content-addressed, and in this case I'm guessing this system sat above WAFL (which is definitely not content-addressed).

* http://doc.cat-v.org/plan_9/4th_edition/papers/venti/




Many of the same pieces, solving a slightly different problem. In the NetApp case it was pushing the edge of the de-duplication problem for efficient archival storage. EMC had discovered that MD5 hash collisions made deduping at the document level dangerous (you could think you had a document when you didn't). Those collisions came from documents of dissimilar sizes, and indeed you could "attack" MD5 signatures that way. With a fixed document size, the probabilities went back to the actual MD5 collision probabilities, which were acceptably small. On the "fast" part of the archival server, instead of storing 8K blocks you could store 16-byte "block identifiers" (while still using all of the standard WAFL file system layout; it thinks it is using 16-byte blocks). Those could be stored in "fast" storage (think SSD) and the actual data on slow "cold" storage. The back-end server does do "content addressable" kinds of things in terms of hash->block location services, but it is simple, with only three operations: "does block <x> exist?", "give me block <x>", and "store this block as <x>".
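To make the three-operation back end concrete, here is a rough Python sketch of the idea as I understand it. It is not NetApp's design: the names (`ColdBlockStore`, `archive`, `restore`) are made up for illustration, the "cold" store is just an in-memory dict, and zero-padding the last block to the fixed size is my own assumption.

```python
import hashlib

BLOCK_SIZE = 8 * 1024  # fixed 8K blocks, as described above


class ColdBlockStore:
    """Hypothetical back end exposing only the three operations."""

    def __init__(self):
        self._blocks = {}  # 16-byte MD5 digest -> 8K block (stand-in for cold storage)

    def exists(self, block_id):
        # "does block <x> exist?"
        return block_id in self._blocks

    def get(self, block_id):
        # "give me block <x>"
        return self._blocks[block_id]

    def store(self, block_id, block):
        # "store this block as <x>"
        self._blocks.setdefault(block_id, block)


def archive(data, cold):
    """Split data into fixed-size blocks and return the list of 16-byte
    identifiers that the "fast" tier would keep in place of the blocks."""
    ids = []
    for offset in range(0, len(data), BLOCK_SIZE):
        # Assumption: pad the final short block so every hashed block is 8K.
        block = data[offset:offset + BLOCK_SIZE].ljust(BLOCK_SIZE, b"\0")
        block_id = hashlib.md5(block).digest()  # always 16 bytes
        if not cold.exists(block_id):
            cold.store(block_id, block)  # duplicate blocks are stored once
        ids.append(block_id)
    return ids


def restore(ids, cold, length):
    """Rebuild the original bytes from identifiers; length trims the padding."""
    return b"".join(cold.get(block_id) for block_id in ids)[:length]


if __name__ == "__main__":
    cold = ColdBlockStore()
    payload = b"A" * (3 * BLOCK_SIZE) + b"tail"
    ids = archive(payload, cold)
    print(len(ids), "identifiers,", len(cold._blocks), "unique blocks stored")
    assert restore(ids, cold, len(payload)) == payload
```

Three of the four blocks in the example are identical, so the cold store holds two blocks while the fast tier holds four 16-byte identifiers.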

For a write-mostly, fabric-attached archival device it had some benefits over the SATA-based filers: higher density, lower watts/terabyte, less CPU load on the filer head (spread out to the storage retrieval unit, which could have many of the hash->block translators), etc. I don't believe NetApp ever built a complete one though. Just "too many degrees off their bow," as an engineer I knew would say.


Thank you, this is fascinating to read! This is a piece of NTAP history I was completely unaware of :)



