Hacker News

Would hyperscalers jump at a 72 TB 5.25" HDD? Maybe, though the larger issue is that de-duplicated, layered large-file storage (e.g., VMs, whole-disk images) still requires warm storage for the base layers.

Might be an excellent choice for the Iron Mountains of the world, especially for long-form media storage, though I think the majority of personal long-term storage is actually slowing in its growth rate.




You might be surprised at how much of hyperscalers' data is really cold, and also has very weak access-time requirements. For every user-visible gigabyte, there are many gigabytes of secondary data like old logs or analytics datasets that are highly unlikely to ever be accessed again - and if they are, waiting minutes for them to be loaded is OK. And the trend has only accelerated since the days when Facebook kept the entire Blu-ray market afloat to deal with the deluge. I think there's quite significant appeal in the hyperscale/HPC markets for super-high-capacity disks, even if they're really slow (so long as they're not slower than tape).

Background: I used to work on a very large storage system at Facebook, though the one most relevant to this discussion belonged to our sibling group. I've also missed any developments in the year-plus since I retired.


Would be relevant for folks like Backblaze and the Internet Archive, where you write once, read many, but rarely delete. Sixty 72TB drives get you ~4.3PB per chassis/pod, and assuming 10 pods to a rack, ~43PB racks. For comparison, 3 years ago, the Internet Archive had about 50PB of data archived and 120PB of raw disk capacity.

https://news.ycombinator.com/item?id=18118556
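Back of the envelope (drive size from the hypothetical above; pod and rack counts are assumptions, not any vendor's spec):

```python
# Rack capacity for a hypothetical 72TB drive, Backblaze-pod style.
DRIVE_TB = 72
DRIVES_PER_POD = 60   # pod-sized chassis, as in the comment
PODS_PER_RACK = 10    # assumed

pod_pb = DRIVE_TB * DRIVES_PER_POD / 1000   # TB -> PB
rack_pb = pod_pb * PODS_PER_RACK

print(f"{pod_pb:.2f} PB per pod")    # 4.32 PB per pod
print(f"{rack_pb:.1f} PB per rack")  # 43.2 PB per rack
```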


And as "rarely" approaches zero (think legal-hold-type "you surely may not delete"), there are cost savings to warm-ish storage in terms of replication and maintenance. Ensuring that your tape archive is good is a pain unless you have huge tape robots - https://www.youtube.com/watch?v=kiNWOhl00Ao


Note he speculated about a 5.25" form factor drive. You're not going to fit 60 of those in something Backblaze-pod sized.


Looks like 60 5.25" drives from their website?

https://www.backblaze.com/blog/open-source-data-storage-serv...

https://www.backblaze.com/blog/wp-content/uploads/2016/04/bl...

Edit: My mistake! I was confusing 5.25 form factor with 3.5 :/ much shame.


Those are 3.5"


Hyperscalers use a blend of storage flavours covering the whole spectrum, and for most data-heavy purposes can mix hot and cold bytes on the same device to get the right IO/byte mix. At that point the question simplifies to _"are they currently buying disks to get more bytes or more IO?"_ - if the HDD mix skews far enough that they're overall byte-constrained, yes, they'll be looking to add byte-heavy disks to the pool. If they've got surplus bytes already, they'll keep introducing colder storage products and mix those bytes onto disks bought for IO instead.
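The purchase decision described above, as a toy sketch (all names and thresholds are mine, not any real fleet model):

```python
# Toy version of the "bytes vs. IO constrained" purchase decision.
# Illustrative only - real capacity planning is far more involved.

def next_drive_purchase(needed_tb, needed_iops, fleet_tb, fleet_iops):
    """Decide which drive flavour the next buy should optimise for."""
    byte_deficit = max(0, needed_tb - fleet_tb)
    iops_deficit = max(0, needed_iops - fleet_iops)
    if byte_deficit and not iops_deficit:
        return "capacity-optimised (e.g. a hypothetical 72TB drive)"
    if iops_deficit and not byte_deficit:
        return "IO-optimised (more spindles, or flash)"
    if byte_deficit and iops_deficit:
        return "mixed purchase"
    return "no purchase; re-mix hot/cold bytes on existing disks"

# Fleet short on bytes but flush on IO -> byte-heavy disks make sense.
print(next_drive_purchase(needed_tb=500_000, needed_iops=1_000_000,
                          fleet_tb=400_000, fleet_iops=1_200_000))
```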


> Hyperscalers use a blend of storage flavours covering the whole spectrum

Probably including tape, which many non-enterprise folks are surprised still exists.

There's an upfront cost for the infrastructure (drives and, usually, robotic libraries), but once you reach certain volumes tape becomes quite economical because of the automation it allows.


Tapes are awkward though, since they can't directly satisfy the same random-access use cases. E.g. even GCS's 'Archive' storage class, for the coldest of the cold, offers sub-second retrieval, so there's at least one copy on HDD or similar at any time.

Tapes are suitable for tape-oriented async-retrieval products (not sure if any clouds have one?), or for putting _some_ replicas of data on tape as an implementation detail, if the TCO is lower than achieving replication/durability guarantees from HDD alone. But that still puts a floor on the non-tape cold bytes, which is where this sort of drive might help.



