
To be fair, there's a good chance I'm missing something here, but 25 GB of data (and not super-redundant low-entropy ASCII; a lot of it is executable machine code) being compressed into 11 GB seems pretty amazing. Have any third parties verified these numbers?

Edited to add: Beyond that, it seems like all this tech would make random lookups abhorrently expensive, because a seek(30) doesn't necessarily jump to the (30 * A + B)'th place. But I'm not a file systems/storage guy at all, so I'd love some education here.




re: seeking

SSDs are block devices -- the operating system sees them as a pile of uniformly sized blocks on which it lays the filesystem. When you seek() and then read(), the OS always fetches the entire block containing that offset into memory and takes it from there. Any compression on the SSD will necessarily happen per-block, so seeking is not really any more difficult than on more normal systems.
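To make the per-block point concrete, here is a rough Python sketch. It is not how any particular SSD controller works: the 4096-byte block size, the use of zlib, and the names compress_blocks and read_at are all assumptions for illustration. Each logical block is compressed on its own, so a read at an arbitrary offset only has to decompress the block(s) it touches.

```python
import zlib

BLOCK_SIZE = 4096  # assumed logical block size; real devices vary


def compress_blocks(data, block_size=BLOCK_SIZE):
    """Compress each fixed-size logical block independently."""
    return [zlib.compress(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


def read_at(blocks, offset, length, block_size=BLOCK_SIZE):
    """Read `length` bytes at logical `offset`, decompressing only the blocks touched."""
    out = bytearray()
    end = offset + length
    while offset < end:
        # The logical offset maps straight to a block index; no scan is needed.
        index, within = divmod(offset, block_size)
        if index >= len(blocks):
            break  # read runs past the end of the data
        block = zlib.decompress(blocks[index])
        chunk = block[within:within + (end - offset)]
        if not chunk:
            break  # short final block, nothing left to read
        out += chunk
        offset += len(chunk)
    return bytes(out)
```

The lookup is still a plain divmod on the logical offset. The compressed blocks have different sizes, which is why a real device also needs a table saying where each one lives.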


> it seems like all this tech would make random lookups abhorrently expensive

SSDs are virtualized (like virtual memory), so there is a mapping table from virtual to physical addresses. This is needed for wear-leveling, so you might as well also use it for thin provisioning (via TRIM), dedupe, etc. In the worst case, one read command may require reading several pages of metadata from flash to find the data. Because flash is fundamentally so fast, an SSD can tolerate a lot of overhead and still be faster than a hard disk.
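A toy version of that mapping table in Python, to show how one level of indirection covers all three features. The class name ToyFTL, the SHA-256 dedupe index, and the zero-filled read of unmapped blocks are assumptions for illustration, not a description of any real controller, and there is no garbage collection here.

```python
import hashlib


class ToyFTL:
    """Toy flash translation layer: logical block address -> physical page."""

    def __init__(self):
        self.l2p = {}      # logical block address -> physical page number
        self.pages = []    # append-only stand-in for flash pages
        self.by_hash = {}  # content hash -> physical page, used for dedupe

    def write(self, lba, data):
        digest = hashlib.sha256(data).digest()
        page = self.by_hash.get(digest)
        if page is None:
            # Never overwrite in place: new content always goes to a fresh page.
            page = len(self.pages)
            self.pages.append(data)
            self.by_hash[digest] = page
        # Identical content shares one physical page (dedupe).
        self.l2p[lba] = page

    def trim(self, lba):
        # TRIM just drops the mapping; the logical block no longer uses flash.
        self.l2p.pop(lba, None)

    def read(self, lba, page_size=4096):
        page = self.l2p.get(lba)
        if page is None:
            return bytes(page_size)  # assumption: unmapped blocks read as zeros
        return self.pages[page]
```

Every read costs one extra table lookup. On a real device that table is itself stored in flash, which is where the "several pages of metadata" in the worst case comes from.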


I just compressed the 1.6MB cgo application into 400kB using standard Mac OS X zip. A few other applications I tried compressed similarly well.


Standard zip is far more CPU-intensive than anything you could afford to run on a device where you (a) have very limited CPU resources and (b) can afford to add very little latency.


tar+gzip performed similarly well (1.6MB -> 450kB) and is a lot faster than normal zip. gzip was similarly able to compress my Mac's System folder from four gigabytes to two. Anyway, the point stands that executable files are quite compressible. If you'd like me to try it with an algorithm you consider more appropriate, please direct me to the download page.
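For anyone who wants to repeat the experiment under conditions closer to the per-block case, here is a rough Python sketch. The function name ratio_per_block and the 4096-byte block size are assumptions, and zlib's DEFLATE is only a stand-in since nobody in this thread knows what the drive actually runs. It compresses a file one block at a time and reports sizes and elapsed time, so fast and slow levels can be compared.

```python
import time
import zlib


def ratio_per_block(path, level, block_size=4096):
    """Compress a file block by block; return (raw_bytes, packed_bytes, seconds)."""
    raw = packed = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while block := f.read(block_size):
            raw += len(block)
            # Each block is compressed on its own, with no shared dictionary.
            packed += len(zlib.compress(block, level))
    elapsed = time.perf_counter() - start
    return raw, packed, elapsed
```

Calling ratio_per_block(path, 1) and ratio_per_block(path, 9) on the same executable gives a rough feel for how much ratio the fastest level gives up. Expect per-block numbers to be worse than whole-file gzip, because small independent blocks give the compressor less context.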



