Hacker News
DwarFS: A fast high compression read-only file system (github.com/mhx)
166 points by metadat on July 24, 2022 | hide | past | favorite | 64 comments



Favorite part: “I had several hundred different versions of Perl that were taking up something around 30 gigabytes of disk space”. That’s a lot of Perl.


I have this same kind of thing, for running https://perl.bot/ and related services. I'm using BTRFS however (due to a long story and actual reasons) and use it to dedupe, compress and discard a file-backed filesystem for them. I'm at a logical store of about 46GB with an on-disk size of 25GB. Most of that is the fact that I have several hundred (550ish last I counted) libraries and modules installed into each install of Perl. This means they dedupe incredibly well, since they're basically perfect copies, and most of it is text-based, so it compresses well too.
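For anyone wanting to reproduce that setup on btrfs, the rough shape is something like this (paths are hypothetical; `duperemove` and `compsize` are separate packages, not part of btrfs-progs):

```shell
# Mount the file-backed image with transparent compression
mount -o loop,compress=zstd /srv/perls.img /mnt/perls

# Deduplicate identical extents across the installed perl trees
# (-r recursive, -d actually dedupe, -h human-readable sizes)
duperemove -rdh /mnt/perls

# Report logical vs. on-disk size after compression + dedupe
compsize /mnt/perls
```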

I've looked at DwarFS before for this same use case, but the fact that it's read-only makes it more difficult to handle, since I'd have to have an uncompressed version sitting out elsewhere. Though I've now got the hardware to actually put that all into a CI/CD pipeline that generates the image. I might actually work on that once I get my Turing Pi 2 boards, since I want to port this whole setup to ARM as well as x86_64. I might put it on my RISC-V board too, but it's too slow, so I think it won't be as useful.

EDIT: fixed the tbd size, it was taking a while to calculate.


Can you have OverlayFS on top of it to make it writable and periodically generate new base DwarFS from that overlay?


People do something similar with e.g. Raspberry Pi machines with a microSD card; they make an overlayfs setup where the OS logs et al. get written to memory and are only synced to the main microSD card once an hour or so -- to avoid wearing the card out too quickly.

So I'd imagine making something like that but with DwarFS below would be quite easy although it'd require you to set it up by hand. Still, once done it'll likely be a rock-solid setup for a long time.
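A minimal sketch of that layering, assuming a DwarFS image at `image.dfs` and made-up mount points (the `dwarfs` FUSE tool and `mkdwarfs` ship with the project; the overlay mount is standard Linux overlayfs):

```shell
# Read-only base layer from the DwarFS image
dwarfs image.dfs /mnt/base

# Writable upper layer (could live in tmpfs)
mkdir -p /tmp/upper /tmp/work /mnt/merged
mount -t overlay overlay \
      -o lowerdir=/mnt/base,upperdir=/tmp/upper,workdir=/tmp/work \
      /mnt/merged

# Periodically fold the accumulated changes into a fresh image
mkdwarfs -i /mnt/merged -o image-new.dfs
```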


I'd imagine so; that might be a good strategy for doing it, and you could measure the newly set-up stuff fairly easily.


It'd be very cool to have something like virtualenv or Anaconda that uses DwarFS. Python environments take up so much of my harddrive.


"DwarFS compression is an order of magnitude better than SquashFS compression, it's 6 times faster to build the file system, it's typically faster to access files on DwarFS and it uses less CPU resources."

Credit and thanks to coldblues for alerting me about this!

https://news.ycombinator.com/context?id=32212870


"DwarFS compression is an order of magnitude better than SquashFS compression, it's 6 times faster" : I suspect this is on a highly specific test-case and no generalities can be made...

N.B. there is also EroFS https://www.kernel.org/doc/html/latest/filesystems/erofs.htm...


I thought I was the one mentioning it?

https://news.ycombinator.com/item?id=32211651


You did mention it, although with less context and somehow it didn't stick in my brain. Regardless, it wasn't intentional to not give you credit, and I am happy to give you all the credit sir, madame, or they!


Well, don't get me wrong -- not that I care that much but I found it impossible to find the other reference to it that you mentioned so I was wondering if you (or me) made a mistake. Thanks for clarifying!


Tangentially related -- what's a good option today for a cross-platform (portable to all platforms with a filesystem) read-only virtual filesystem, like e.g. Quake's pk3 file format? Say I want to access a few tens of thousands of small files fast -- much faster than what e.g. NTFS allows -- since I know I'll likely have to read more-or-less all the files and I can mmap the whole thing. What are my options? My prime concern is having an API such as

    handle = vfs_fopen("/my/file1.txt")
    pointer_to_the_file_bytes = vfs_map(handle, <start offset>)
which would be as fast as possible. Compression and encryption aren't needed.


ZIP is the closest to an "industry standard" portable filesystem. It's directly comparable to Quake's PK3 format because that's all that PK3 was, a ZIP with a custom file extension.

It's also what "powers" a wide range of portable filesystem-in-a-single-file formats, such as DOCX, ODT, and quite a few other modern Office and Office-adjacent file formats.
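As a sketch of how little is needed to treat a ZIP as a read-only VFS -- this uses Python's stdlib with made-up file paths; note that only ZIP_STORED (uncompressed) members could be mapped in place, since compressed members have to be inflated first:

```python
import io
import zipfile

# Build a tiny archive in memory. ZIP_STORED keeps members uncompressed,
# which is what you'd want if you plan to mmap the whole archive and
# hand out pointers into member bytes directly.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("my/file1.txt", "hello")
    zf.writestr("my/file2.txt", "world")

# Read-only "VFS" access, roughly analogous to vfs_fopen + read
with zipfile.ZipFile(buf) as zf:
    data = zf.read("my/file1.txt")

print(data)  # b'hello'
```

The central directory at the end of the archive plays the role of the filesystem's index, which is why lookup stays cheap even with tens of thousands of members.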


Would a RAM disk fit the bill? Just read all the contents from the copy in non-volatile storage at boot. Cross-platform then by virtue of using what-ever RAM-based filesystem or block-device options are commonly available on the target OS.

For “as fast as possible” you'll need to experiment and benchmark with your workload. Which filesystem is optimal may depend on how you are laying out the data and where the latency/throughput sensitivities are in your use case and the given filesystems.

> My prime concern is having an api such as...

Having a different API, rather than it looking like a filesystem, would make cross-platform support more of a concern, as you then have a data access library, not a general filesystem. It will likely be necessary for best performance though: any filesystem is going to have significant overheads (orders of magnitude) compared to being able to map chunks of the data directly into your process' address space.

If abandoning a generic filesystem, perhaps something like sqlite with an in-memory table/db (https://www.sqlite.org/inmemorydb.html)? Again like the ramdisk option just load up the content from permanent storage on first use.
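A minimal sketch of that idea (the table and column names here are made up for illustration):

```python
import sqlite3

# In-memory "file store": load blobs once, then serve all reads from RAM.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (path TEXT PRIMARY KEY, data BLOB)")
db.execute("INSERT INTO files VALUES (?, ?)", ("/my/file1.txt", b"hello"))
db.commit()

row = db.execute(
    "SELECT data FROM files WHERE path = ?", ("/my/file1.txt",)
).fetchone()
print(row[0])  # b'hello'
```

The PRIMARY KEY on `path` gives you indexed lookups, which is the part a flat directory of thousands of small files on NTFS struggles with.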


> Would a RAM disk fit the bill?

A normal unprivileged app cannot create RAM disks easily on any OS as far as I know; also, it wouldn't really work on e.g. WASM.

> as you then have a data access library not a general filesystem.

That's fine for me -- although I don't see a particular difference between the two; a filesystem is just a system to access files, whatever that means.


> although I don't see a particular difference

The distinction I'd make is that a filesystem provides a common generic API that practically all processes on the OS understand and share access to. It's pretty much always implemented out-of-process (in the kernel, or in another userland process via kernel stubs/hooks like FUSE).

A data access library is usually much more specific to a particular data set or set of applications, and likely doesn't follow the filesystem abstraction (at least not in the same way).


SQLite?

There's also a vfs module for it that imitates a filesystem on top of a single SQLite DB.


I ended up biting the bullet and started https://github.com/celtera/uvfs


Maybe this is a really dumb question but how does one use a read-only filesystem? Can you mount it as writable temporarily or something?

Or is it that you create a compressed 'file' that you can mount as a file system? Like a zip file kinda I guess?


You mount it as a read-only file system at a mount point. It's like putting a CD (remember those?) into a drive.

The source data can be a raw block device or more likely a local file. It doesn’t matter as either way the kernel is just reading blocks of bytes.


> remember those?

Not really tbh. I haven't used a CD in my adult life. I remember "burning a CD" was a thing and that's about it.


How could you make us all feel so old. You monster.


I was born in the early 2000s and I've used CDs a lot.


As an adult? For what?

I was born in 1991 and I haven't used one since I was in 9th grade and made a girl a 'mix tape'.


We won't ask about 3.5" drives; funny thing is, they have their special uses too.


Or 5 1/4” for that matter! I remember my cousin’s old PC used those. Very flimsy but seemed to work remarkably well!


IMHO they weren't flimsy, just a bit floppy


Another option is to use a read-only filesystem with overlayfs to provide a writeable filesystem (like JFFS2) that also has a solid static FS underneath in case of error. OpenWrt does, or at least did, this.


You got a few good “how” answers but as to “why”, it’s common for containers to use an overlay file system to handle writes. So in a container situation, you’ve already paid the overlay tax. Adding compression is a smaller incremental cost.


Just like a read-only file or service. There's some kind of a construction step, and thereafter it's read-only. One might do that to make it explicit that updates are expensive, to grant read-only privileges to a less trusted process, or whatever.

Somebody else can chime in with the exact mechanism by which this one is written, but common solutions include being writable sometimes or having a program to build the filesystem from known data. That might be filesystem-as-a-file, filesystem on a separate partition, or what have you.


Right, I want to know what the construction step is in this case, but it makes sense that there are multiple approaches.


Just like creating any sort of archive. You (or a program of sorts) create the fs structure in some directory, then invoke something like `mkdwarfs -i /path/to/that/directory -o /path/to/output/file.dfs`


And squashfs works the same way - mksquashfs takes a directory as input and writes a file as output. That file can then be loopback-mounted to present the readonly filesystem.
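Putting the two side by side (paths hypothetical; `dwarfs` is the FUSE mount tool that ships with DwarFS):

```shell
# squashfs: build from a directory, then loopback-mount read-only
mksquashfs /path/to/dir image.sqsh
mount -o loop image.sqsh /mnt/ro

# DwarFS equivalent: build with mkdwarfs, mount via FUSE
mkdwarfs -i /path/to/dir -o image.dfs
dwarfs image.dfs /mnt/ro
```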


Got it, yeah so that tracks then, thanks.


Slightly OT.

What resources would people here suggest for learning about file systems? I see a lot of new file systems like zfs, btrfs, etc. I looked up for resources but couldn't find anything substantial.

I want to learn how they work so that I can appreciate projects like this and compare them.

I looked into the Build Your Own X repo but didn't find anything. I found a book called Practical File System Design: The BeOS file system but it's apparently dated, and I'm not sure I want too much of a deep dive.


The BeFS book is seriously good. I would start there.


I remember learning about filesystem design from the classic "demon" book (The Design and Implementation of the 4.4 BSD Operating System by Marshall Kirk McKusick) which has a lot of details about the original Fast File System.


Is ZFS still considered new?


A few years ago, I designed an incredibly fast write-only file system, but for some reason couldn't leverage it into a commercial product that anyone was willing to buy.


http://www.supersimplestorageservice.com/

> The Super Simple Storage Service (S4) is a new innovation in cloud storage. Our advanced write-only storage provides the highest security, lowest cost, and simplest management available.



I didn’t dig into the details, but unless DwarFS is a joke, I assume they mean the system supports create, read and delete operations.


There's plenty of read-only media out there. I can see this being useful for container deployment -- you've got a database somewhere that you have read/write access to, but the deployed code doesn't need to be updated in the container. A readonly filesystem is a great way to enforce principle of least privilege.


I'd guess you build a fs image from an existing filesystem, so you'd only have read operations.


It looks good for NixOS (or Guix) store.


> Clustering of files by similarity using a similarity hash function.

Does that mean that I can store all those 100s of photos taken at slightly different angles, in an efficient way?


That depends on whether they are bytewise similar, which in turn depends on the details of the image encoding of the format you’re using. I would expect that, by default, no.


That does lead to the question whether, with enough files, storing a lot of similar photos in uncompressed png is more space efficient in DwarFS than it is storing them in jpg.


How does this compare to WoF compression in NTFS (which uses LZX)?


Sounds like the main benefit is hash-based dedup. You can do this with ZFS and borgbackup - they'd likely be a better comparison - but most other compressed filesystems won't do it.

Most compressed filesystems fail to dedupe because, for various performance-related reasons, the compression window is usually quite small (maybe up to 128 KiB), and the files are often not sorted in the archive, so small duplicate files rarely end up close enough together to compress against each other within that window.

EDIT: Apparently squashfs does include file de-dupe; however, DwarFS is fixing what I basically said above. It sorts files in order of similarity so that you can compress across file boundaries as well as just removing complete duplicates. That's pretty cool.

"Clustering of files by similarity using a similarity hash function. This makes it easier to exploit the redundancy across file boundaries."

This kind of thing is more difficult to do on a writable filesystem, though not impossible.
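A toy illustration of the window problem and why ordering helps -- this uses zlib and a hand-arranged ordering rather than a real similarity hash, so it only sketches the principle: zlib's back-reference window is 32 KiB, so a near-duplicate placed more than 32 KiB later in the stream can't be matched, while the same duplicate placed adjacently compresses to almost nothing.

```python
import os
import zlib

# Two near-duplicate "files" plus two large unrelated files.
a = os.urandom(8_000)
a_dup = a[:-10] + os.urandom(10)                 # near-identical copy of a
spacer1, spacer2 = os.urandom(40_000), os.urandom(40_000)

# Unsorted: the duplicate starts ~48 KiB into the stream, far outside
# zlib's 32 KiB window, so it can't be encoded as back-references.
unsorted_stream = a + spacer1 + a_dup + spacer2

# "Similarity-sorted": the near-duplicates sit adjacent in the stream.
clustered_stream = a + a_dup + spacer1 + spacer2

n_unsorted = len(zlib.compress(unsorted_stream, 9))
n_clustered = len(zlib.compress(clustered_stream, 9))
print(n_unsorted, n_clustered)  # clustered is several KiB smaller
```

DwarFS automates that ordering with a similarity hash over all input files, so redundancy between similar-but-not-identical files lands inside the compressor's window.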


Seems like this is a pretty nice read-only filesystem.

Shame that it can't be used much commercially in many cases due to GPLv3 license. This could improve so many embedded systems currently using SquashFS.


Well, if you want to build a commercial product using dwarfs, you could always (shock, horror) contact the dev and *pay* him to license it to you under a bsd-like license (or whatever).

(the dwarfs dev would have to work it out amongst all who contributed to his repo)


Who said I'd want (or need) to use it in a product?

What I'm saying is that due to the license fewer companies want to use it, and thus we get somewhat worse products as a consequence.

Of course I completely agree the developer should get licensing income (heck, that's in my self interest as well!), but what's going to really happen is that the cheapo companies just make do with less and use SquashFS or whatever instead.

It's a tragedy of the commons that there's so little will to donate to or crowdfund open source projects, and that they're taken for granted.


> It's a tragedy of the commons that there's so little will to donate to or crowdfund open source projects, and that they're taken for granted.

Which is not made better by your sentiment. I mean, the GPL at least forces you to either try negotiating a different license with the author (ideally for money), or suck it up and use something inferior. Using a permissive license that allows multi-billion-dollar companies to use your code for free certainly doesn't help change the mindset of "open source is other people working for me for free".


"I mean GPL at least forces you to either try negotiating a different license with the author (for money ideally)"

Sounds great in theory, but even for a popular open source library developer this happens so rarely that it hardly pays the bills, regardless of license.

"Using a permissive license that allows multi-billion dollar companies"

Most embedded devices by far are not developed by multi-billion dollar companies. Their legal departments are also actively steering away from GPL licenses.

"try negotiating a different license"

In my experience, by far most companies don't want to negotiate anything. If there's a product with a set price, yes, then it might be purchased. They typically also want some kind of product support with it.

"use your code for free certainly doesn't help changing the mind set of "open source is other people working for me for free"."

How to solve this? Ideally the library developer would get paid AND the consumers get more value for their money. This would encourage more developers to write useful libraries that provide great value in the big picture, but are uneconomical or otherwise too much trouble to deal with on an individual company or product level.

As it is, everyone seems to lose.


This is all true, but how do you think a more permissive license would improve things for open source authors? Maybe there is a miscommunication here, but how is a company more likely to give you money if you release under MIT or BSD?


By "can't be used" do you mean by unjustified whiners, or is there a real problem?

If you're including this existing filesystem you shouldn't have any relevant patents to worry about, and the clause about letting the user write their own firmware shouldn't be an issue for 99% of products.

Is there anything else I missed in the differences between v2 and v3?


The clause about letting the user write their own firmware is a huge issue for 99% of products.

Use some vendor blob -> problem. Have some NDA for registers used in some hardware -> problem. Hardware does not have user firmware flashing -> problem. Legal compliance requirements for hardware bearing your name -> problem.


> Use some vendor blob -> problem. Have some NDA for registers used in some hardware -> problem.

That's the same as GPLv2. If you don't mix the filesystem into your proprietary code, you don't have to reveal any of those things.

> Hardware does not have user firmware flashing -> problem.

GPLv3 only requires you to let the user have the same flashing ability you have. If it can't be flashed, you don't have to do anything.

> Legal compliance requirements for hardware bearing your name -> problem.

That's a 1% of products situation.


It's not in the standard Linux kernel (unlike squashfs), plus it's less field-tested compared to squashfs. More importantly, I too avoid GPLv3 in any commercial product. Anything GPLv3 is off my check-out list immediately, because I honestly don't know what GPLv3 really means for me. I use GPLv2 cautiously wherever it fits, as I feel I at least know what it implies.


Doesn't it "just" say that on top of providing the sources of the binaries you distribute, you must provide/document a way to update them?


I think the author overlooked that a read-only file system is of absolutely no practical use. If you can't write to it, then there will never be any data in it to be read.


There are plenty of data archival projects that can benefit from DwarFS. Stuff like e.g. old magazines or ROM collections or scanned books/comics that don't have copyright attached etc., don't need modification and if the collection is big enough then the deduplication can reduce the final size of the archive by a lot.


Read-only means you can't change it _after_ you initially put all your files in it.



