I'm pretty skeptical of the performance of a billion files.

I'm sure lookups will run at an okay speed once you've actually constructed it, since it's basically a tree, but with that many entries I'd expect it to be much less pleasant than a database in many other ways. The "preprocessing" step of actually creating a billion files is going to be especially awful.
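
The kind of layout I'm picturing is something like this (rough Python sketch; the 3 levels of 2 hex characters each are just an assumption on my part, not anything from the article):

    def hash_to_path(hex_hash, levels=3, chars_per_level=2):
        # 'f3bbbd66...' -> 'f3/bb/bd/f3bbbd66...'
        parts = [hex_hash[i * chars_per_level:(i + 1) * chars_per_level]
                 for i in range(levels)]
        return "/".join(parts) + "/" + hex_hash

    print(hash_to_path("f3bbbd66a63d4bf1747940578ec3d0103530e21d"))
    # f3/bb/bd/f3bbbd66a63d4bf1747940578ec3d0103530e21d

With that split you'd have 256^3 ≈ 16.7M leaf directories averaging ~60 files each for a billion hashes, and "preprocessing" means actually creating every one of them.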

There's probably a large storage overhead as well; the overall on-disk footprint would likely end up much larger than the 37GB. And I agree the preprocessing would be painful.

But I'm curious how lookup speeds would compare to the author's 1ms.
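
For comparison, I assume the author's 1ms lookup is essentially a binary search over fixed-width records in the sorted file, something like the sketch below (the 20-byte record width is a placeholder, not necessarily the author's actual format):

    import mmap

    RECORD = 20  # assumed fixed record width (raw 20-byte hashes)

    class SortedHashFile:
        """Binary search over a memory-mapped file of sorted, fixed-width records."""

        def __init__(self, path):
            self._f = open(path, "rb")
            self._mm = mmap.mmap(self._f.fileno(), 0, access=mmap.ACCESS_READ)
            self._n = len(self._mm) // RECORD

        def _record(self, i):
            return self._mm[i * RECORD:(i + 1) * RECORD]

        def contains(self, h):
            lo, hi = 0, self._n          # lower-bound binary search
            while lo < hi:
                mid = (lo + hi) // 2
                if self._record(mid) < h:
                    lo = mid + 1
                else:
                    hi = mid
            return lo < self._n and self._record(lo) == h

Usage would just be SortedHashFile("hashes.bin").contains(h), whatever the file is actually called; mmap lets the OS page cache keep the frequently probed upper levels of the search in memory, while the filesystem version pays for directory traversal on every lookup.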

I'm also curious how adding a new hash would compare against adding one to the single sorted file used by the author.

Leveraging any database is probably better in any case :)


If we're running on a SATA SSD, we can probably chain 5 or fewer dependent accesses to the drive before we go over a millisecond, and each level of directory depth likely requires 2 of those accesses.
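
Back-of-envelope with those numbers, assuming a fanout of 256 (two hex characters) per directory level:

    import math

    TOTAL_FILES = 1_000_000_000
    FANOUT = 256            # two hex characters per level (an assumption)
    ACCESS_BUDGET = 5       # ~5 chained drive accesses per millisecond, as above
    ACCESSES_PER_LEVEL = 2  # per the estimate above

    # Directory depth needed to keep leaf directories down to ~100 files each.
    levels = math.ceil(math.log(TOTAL_FILES / 100, FANOUT))  # -> 3
    files_per_leaf = TOTAL_FILES / FANOUT ** levels           # -> ~60

    accesses = levels * ACCESSES_PER_LEVEL + 1  # +1 to read the file itself
    print(levels, round(files_per_leaf), accesses, ACCESS_BUDGET)  # 3 60 7 5

So even a fairly shallow layout is already over the ~5-access budget, before any other filesystem overhead.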

> I'm also curious how adding a new hash would compare against adding one to the single sorted file used by the author.

In a fair fight with that requirement, the sorted file would be allowed to keep a few percent of blank spare entries scattered through it, and then it could insert in a millisecond too, since each insert only has to shift entries as far as the nearest gap.
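
Roughly this, sketched on an in-memory list rather than the real file (the gap handling here is my own guess at an implementation, not anything from the article):

    BLANK = None  # pre-allocated spare slot

    def find_slot(entries, new_hash):
        # Lower-bound binary search over a sorted list that contains BLANK gaps;
        # when a probe lands on a gap, step right to the nearest real entry.
        lo, hi = 0, len(entries)
        while lo < hi:
            mid = (lo + hi) // 2
            probe = mid
            while probe < hi and entries[probe] is BLANK:
                probe += 1
            if probe == hi or entries[probe] >= new_hash:
                hi = mid
            else:
                lo = probe + 1
        return lo

    def insert_with_gaps(entries, new_hash):
        # Shift entries right only as far as the nearest gap, then drop the hash in.
        pos = find_slot(entries, new_hash)
        gap = pos
        while gap < len(entries) and entries[gap] is not BLANK:
            gap += 1
        if gap == len(entries):
            entries.append(BLANK)  # ran out of gaps here; a real file would rebalance
        entries[pos + 1:gap + 1] = entries[pos:gap]
        entries[pos] = new_hash

    entries = [b"aa", BLANK, b"cc", BLANK, b"ee", BLANK]
    insert_with_gaps(entries, b"bb")
    print(entries)  # [b'aa', b'bb', b'cc', None, b'ee', None]

As long as the blanks stay reasonably evenly spread, each insert moves a handful of entries instead of rewriting the whole 37GB file; once a region runs out of gaps you rebalance it, much like a packed-memory array.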
