
Tangentially related - what's a good option today for a cross-platform (portable to all platforms with a filesystem) read-only virtual file system, like e.g. Quake's pk3 file format? Let's say I want to access a few tens of thousands of small files fast, much faster than what e.g. NTFS allows, since I know that I'll likely have to read more or less all of the files and I can mmap the whole thing. What are my options? My prime concern is having an API such as

    handle = vfs_fopen("/my/file1.txt")
    pointer_to_the_file_bytes = vfs_map(handle, <start offset>)
which would be as fast as possible. Compression and encryption aren't needed.



ZIP is the closest to an "industry standard" portable filesystem. It's directly comparable to Quake's PK3 format because that's all that PK3 was, a ZIP with a custom file extension.

It's also what "powers" a wide range of "portable filesystem in a single file" formats, such as DOCX and ODT and quite a few other modern Office and Office-adjacent file formats.
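Since compression isn't needed, one way this could look is a ZIP written with stored (uncompressed) entries, mapped once and sliced in place. This is only a rough sketch in Python, not a tuned implementation: `map_stored_entries` is a made-up name, and it assumes a plain archive whose entries are stored and unencrypted. The data offset is read from each local file header (30 fixed bytes, then the name and extra fields whose lengths sit at byte offsets 26 and 28).

```python
import mmap
import struct
import zipfile


def map_stored_entries(path):
    """Map a ZIP archive and index its uncompressed entries.

    Returns (mm, index) where index maps entry name -> (offset, size)
    into the mapping. Hypothetical helper, for illustration only.
    """
    with zipfile.ZipFile(path) as zf:
        # Only stored (uncompressed) entries can be used directly from the map.
        infos = [
            info for info in zf.infolist()
            if info.compress_type == zipfile.ZIP_STORED and not info.is_dir()
        ]

    with open(path, "rb") as f:
        # The mapping stays valid after the file object is closed.
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

    index = {}
    for info in infos:
        # Name and extra-field lengths are two little-endian uint16 values
        # at offset 26 of the local file header.
        name_len, extra_len = struct.unpack_from("<HH", mm, info.header_offset + 26)
        data_offset = info.header_offset + 30 + name_len + extra_len
        index[info.filename] = (data_offset, info.file_size)
    return mm, index
```

With that, `offset, size = index["my/file1.txt"]` followed by `memoryview(mm)[offset:offset + size]` gives a view of the bytes without copying them; the view has to be released before calling `mm.close()`. Whether this actually beats reading the files individually is something to benchmark on the target platforms.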


Would a RAM disk fit the bill? Just read all the contents from the copy in non-volatile storage at boot. It is cross-platform then by virtue of using whatever RAM-based filesystem or block-device options are commonly available on the target OS.

For “as fast as possible” you'll need to experiment and benchmark with your workload. Which filesystem is optimal may depend on how you are laying out the data and where the latency/throughput sensitivities are in your use case and the given filesystems.

> My prime concern is having an api such as...

Having an API that doesn't look like a filesystem would make cross-platform support more of a concern, as you then have a data access library, not a general filesystem. It will likely be necessary for best performance though: any filesystem is going to have significant overheads (orders of magnitude) compared to being able to map chunks of the data directly into your process's address space.

If abandoning a generic filesystem, perhaps something like SQLite with an in-memory table/db (https://www.sqlite.org/inmemorydb.html)? Again, like the RAM disk option, just load up the content from permanent storage on first use.
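A rough sketch of that idea in Python, using the standard `sqlite3` module: copy an on-disk database into a `:memory:` connection on first use, then look files up by path. The `files(path, data)` table and the two function names are assumptions made up for this example, not something SQLite prescribes.

```python
import sqlite3


def load_into_memory(db_path):
    """Copy an on-disk SQLite database into a fresh in-memory connection."""
    disk = sqlite3.connect(db_path)
    mem = sqlite3.connect(":memory:")
    try:
        # Connection.backup copies the whole database page by page.
        disk.backup(mem)
    finally:
        disk.close()
    return mem


def read_file(mem, path):
    """Return the bytes stored for `path`, or None if there is no such row.

    Assumes a hypothetical table: files(path TEXT PRIMARY KEY, data BLOB).
    """
    row = mem.execute(
        "SELECT data FROM files WHERE path = ?", (path,)
    ).fetchone()
    return None if row is None else row[0]
```

Note that each lookup hands back a copy of the blob rather than a pointer into mapped memory, so this trades the zero-copy `vfs_map` style of access for a ready-made index and query layer.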


> Would a RAM disk fit the bill?

A normal unprivileged app cannot easily create RAM disks on any OS as far as I know; also, it wouldn't really work on e.g. WASM.

> as you then have a data access library not a general filesystem.

That's fine for me - although I don't see a particular difference between the two: a filesystem is just a system to access files, whatever that means.


> although I don't see a particular difference

The distinction I'd make is that a filesystem provides a common generic API that practically all processes on the OS understand and share access to. It is pretty much always implemented out-of-process (in the kernel, or in another userland process via kernel stubs/hooks like FUSE).

A data access library is usually much more specific to a particular data set or set of applications, and likely doesn't follow the filesystem abstraction (at least not in the same way).


SQLite?

There's also a vfs module for it that imitates a filesystem on top of a single SQLite DB.


ended up biting the bullet and started https://github.com/celtera/uvfs



