Show HN: Dut – a fast Linux disk usage calculator (codeberg.org)
396 points by 201984 6 months ago | 148 comments
"dut" is a disk usage calculator that I wrote a couple months ago in C. It is multi-threaded, making it one of the fastest such programs. It beats normal "du" in all cases, and beats all other similar programs when Linux's caches are warm (so, not on the first run). I wrote "dut" as a challenge to beat similar programs that I used a lot, namely pdu[1] and dust[2].

"dut" displays a tree of the biggest things under your current directory, and it also shows the size of hard-links under each directory as well. The hard-link tallying was inspired by ncdu[3], but I don't like how unintuitive the readout is. Anyone have ideas for a better format?

There are installation instructions in the README. dut is a single source file, so you only need to download it, copy-paste the compiler command, and then copy the binary somewhere on your path like /usr/local/bin.

I went through a few different approaches writing it, and you can see most of them in the git history. At the core of the program is a data structure that holds the directories that still need to be traversed, and binary heaps to hold statted files and directories. I had started off using C++ std::queues with mutexes, but the performance was awful, so I took it as a learning opportunity and wrote all the data structures from scratch. That was the hardest part of the program to get right.

These are the other techniques I used to improve performance:

* Using fstatat(2) with the parent directory's fd instead of lstat(2) with an absolute path. (10-15% performance increase)

* Using statx(2) instead of fstatat (perf showed fstatat running statx code in the kernel). (10% performance increase)

* Using getdents(2) to get directory contents instead of opendir/readdir/closedir. (also around 10%)

* Limiting inter-thread communication. I originally had fs-traversal results accumulated in a shared binary heap, but giving each thread a binary-heap and then merging them all at the end was faster.

I couldn't find any information online about fstatat and statx being significantly faster than plain old stat, so maybe this info will help someone in the future.
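
For reference, here's a minimal single-directory sketch of the getdents64/statx combination described above (illustrative only, not dut's actual code; assumes glibc 2.28+ for the statx(3) wrapper):

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/stat.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* glibc doesn't expose the getdents64 record layout, so define it here. */
    struct linux_dirent64 {
        uint64_t       d_ino;
        int64_t        d_off;
        unsigned short d_reclen;
        unsigned char  d_type;
        char           d_name[];
    };

    int main(int argc, char **argv)
    {
        const char *path = argc > 1 ? argv[1] : ".";
        int dirfd = open(path, O_RDONLY | O_DIRECTORY);
        if (dirfd < 0) { perror("open"); return 1; }

        char buf[64 * 1024];
        long long blocks = 0;
        for (;;) {
            long n = syscall(SYS_getdents64, dirfd, buf, sizeof(buf));
            if (n <= 0) break;
            for (long off = 0; off < n;) {
                struct linux_dirent64 *d = (struct linux_dirent64 *)(buf + off);
                off += d->d_reclen;
                if (!strcmp(d->d_name, ".") || !strcmp(d->d_name, ".."))
                    continue;
                struct statx stx;
                /* fd-relative lookup, like fstatat; don't follow symlinks */
                if (statx(dirfd, d->d_name, AT_SYMLINK_NOFOLLOW,
                          STATX_BLOCKS, &stx) == 0)
                    blocks += stx.stx_blocks;   /* 512-byte blocks */
            }
        }
        close(dirfd);
        printf("%s: %lld bytes on disk (one level only)\n", path, blocks * 512LL);
        return 0;
    }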

[1]: https://github.com/KSXGitHub/parallel-disk-usage

[2]: https://github.com/bootandy/dust

[3]: https://dev.yorhel.nl/doc/ncdu2, see "Shared Links"




Nice work. Sometimes I wonder if there's any way to trade away accuracy for speed? Like, often I don't care _exactly_ how many bytes the biggest user of space is, I just want to see some orders of magnitude.

Maybe there could be an iterative breadth-first approach, where first you quickly identify and discard the small unimportant items, passing over anything that can't be counted quickly. Then with what's left you identify the smallest of those and discard, and then with what's left the smallest of those, and repeat and repeat. Each pass through, you get a higher resolution picture of which directories and files are using the most space, and you just wait until you have the level of detail you need, but you get to see the tally as it happens across the board. Does this exist?


Something like that exists for btrfs; it's called btdu. It has the accuracy/time trade-off you're interested in, but the implementation is quite different. It samples random points on the disk and finds out what file path they belong to. The longer it runs the more accurate it gets. The readme is good at explaining why this approach makes sense for btrfs and what its limitations are.

https://github.com/CyberShadow/btdu


Damn, `ext4` is organized entirely differently. You can't get anything useful from:

    sudo debugfs -R "icheck $RANDOM" /dev/nvme1
    sudo debugfs -R "ncheck $res" /dev/nvme1
and recursing. That's a clever technique given btrfs structs.


That's so cool.


That is so cool!!! I have always wanted something like this! Arg I wish other filesystems supported a strategy like this!


Thanks!

What you described is a neat idea, but it's not possible with any degree of accuracy AFAIK. To give you a picture of the problem, calculating the disk usage of a directory requires calling statx(2) on every file in that directory, summing up the reported sizes, and then recursing into every subdirectory and starting over. The problem with doing a partial search is that all the data is at the leaves of the tree, so you'll miss some potentially very large files.

Picture if your program only traversed the first, say, three levels of subdirectories to get a rough estimate. If there was a 1TB file down another level, your program would miss it completely and get a very inaccurate estimate of the disk usage, so it wouldn't be useful at all for finding the biggest culprits. You have the same problem if you decide to stop counting after seeing N files, since file N+1 could be gigantic and you'd never know.


Yeah, maybe approximation is not really possible. But it still seems like if you could do, say, up to 1000 stats per directory per pass, then running totals could be accumulated incrementally and reported along the way.

So after just a second or two, you might be able to know with certainty that a bunch of small directories are small, and then that a handful of others are at least however big has been counted so far. And that could be all you need, or else you could wait longer to see how the bigger directories play out.


You would still have to getdents() everything, but this way you might indeed save on stat() operations, which access information that is stored separately on disk; eliminating those would likely help uncached runs.

You could sample files in a directory or across directories to get an average file size and use the total number of files from getdents to estimate a total size. This does require you to know if a directory entry is a file or directory, which the d_type field gives you depending on the OS, file system and other factors. An average file size could also be obtained from statvfs().

Another trick is based on the fact that the link count of a directory is 2 + the number of subdirectories. Once you have seen the corresponding number of subdirectories, you know that there are no more subdirectories you need to descend into. This could allow you to abort a getdents for a very large directory, using eg the directory size to estimate the total entries.


> Another trick is based on the fact that the link count of a directory is 2 + the number of subdirectories.

For anyone who doesn't know why this is, it's because when you create a directory it has 2 hard links to it which are

    dirname
    dirname/.
When you add a new subdirectory it adds one more link which is

    dirname/subdir/..
So each subdirectory adds one more to the original 2.
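
To make the early-exit idea concrete, here's a rough sketch (my own illustration, assuming a filesystem that follows this convention; btrfs, for one, reports a link count of 1 for directories, and d_type can be DT_UNKNOWN on some filesystems):

    #include <dirent.h>
    #include <string.h>
    #include <sys/stat.h>

    /* Stop reading directory entries once all st_nlink - 2 subdirectories
     * have been seen (useful when you only care about subdirectories). */
    int count_subdirs(const char *path)
    {
        struct stat st;
        if (stat(path, &st) != 0 || !S_ISDIR(st.st_mode))
            return -1;

        long remaining = (long)st.st_nlink - 2;   /* expected subdirectories */
        DIR *dir = opendir(path);
        if (!dir)
            return -1;

        int found = 0;
        struct dirent *d;
        while (remaining > 0 && (d = readdir(dir)) != NULL) {
            if (d->d_type == DT_DIR &&
                strcmp(d->d_name, ".") != 0 && strcmp(d->d_name, "..") != 0) {
                found++;
                remaining--;   /* when this hits 0, no more subdirs exist */
            }
        }
        closedir(dir);
        return found;
    }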


This seems difficult since I'm not aware of any way to get approximate file sizes, at least with the usual FS-agnostic system calls: to get any size info you are pretty much calling something in the `stat` family and at that point you have the exact size.


I thought files can be sparse and have holes in the middle where nothing is allocated, so the file size is not what is used to calculate usage; it's the sum of the extents or some such.


Yes, files can be sparse but the actual disk usage information is also returned by these stat-family calls, so there is no special cost to handling sparse files.


I wish modern filesystems maintained per-directory usage as a directory attribute instead of forcing tools to do this basic job.


CephFS does that.

You can use getfattr to ask it for the recursive number of entries or bytes in a given directory.

Querying it is constant time; updates propagate with a few seconds' delay.

Extremely useful when you have billions of files on spinning disks, where running du/ncdu would take a month just for the stat()s.
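
For anyone curious, the same numbers are reachable from C through the xattr interface. A small sketch (assuming CephFS's ceph.dir.rbytes virtual xattr, which as far as I know is returned as a decimal string):

    #include <stdio.h>
    #include <sys/xattr.h>

    int main(int argc, char **argv)
    {
        if (argc != 2) { fprintf(stderr, "usage: %s <dir>\n", argv[0]); return 1; }
        char buf[64];
        /* CephFS answers this from precomputed recursive statistics */
        ssize_t n = getxattr(argv[1], "ceph.dir.rbytes", buf, sizeof(buf) - 1);
        if (n < 0) { perror("getxattr"); return 1; }
        buf[n] = '\0';
        printf("recursive bytes under %s: %s\n", argv[1], buf);
        return 0;
    }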


This is an excellent point and I wholeheartedly agree!


Is it? That would require any update to any file to cascade into a bunch of directory updates amplifying the write and for what? Do you “du” in your shell prompt?

Not to mention it would likely be unable to handle the hardlink problem so it would consistently be wrong.


> That would require any update to any file to cascade into a bunch of directory updates amplifying the write and for what?

You can be a little lazy about updating parents and have O(1) updates and O(1) amortized reads with an O(n) worst case (same as now anyway).


This is probably the right solution, but you need to rebuild on an unclean unmount if you do it lazily.


Disks have improved substantially in I/O and write speed, to the point where Windows will literally index your file system so you can search faster, and antivirus will scan files in the background before you open them. I don’t think maintaining size state on directories would be all that much of a challenge.


I expect performance would suffer quite a lot. In a system with high I/O, there would be a lot of contention on updating the size of such directories as /home or /tmp, let alone /.

Also, are you going to update a file’s size for every write (could easily be a thousand times if you’re copying over a 10MB file) or are you going to coalesce updates to file sizes? If the latter, how do you recover after a crash?

Virtual directories such as /dev and /proc would require special-casing.

Mounting and unmounting disks probably would require special-casing.


Haven’t many similar issues been solved in journaled file systems and/or things like database transaction logs and indexes? Real-time, high-precision accuracy is not required; knowing how big a directory is is a frequent use case of directories. Hell, ‘df’ tracks this at the partition level, including edge cases, as does ‘du’.


As far as I am aware, neither of those cascade sizes up.

Also, doing that in databases isn’t a solved problem. count(*) can be slow in databases. See for example

- PostgreSQL: https://dba.stackexchange.com/questions/314371/count-queries..., https://wiki.postgresql.org/wiki/Count_estimate

- Oracle: https://forums.oracle.com/ords/apexds/post/select-count-very...

(Both databases use MVCC (https://en.wikipedia.org/wiki/Multiversion_concurrency_contr...) to ensure that concurrent queries all see the database in a consistent state. That makes it necessary to visit each row and check its timestamp when counting rows.)


I have a "du" command currently running that has been running for ~50 hours. I'd much rather have it update a half-dozen directory entries on each write.


> but I don't like how unintuitive the readout is

The best disk usage UI I ever saw was this one: https://www.trishtech.com/2013/10/scanner-display-hard-disk-... The inner circle is the top-level directories, and each ring outwards is one level deeper in the directory hierarchy. You would mouse over large subdirectories to see what they were, or double-click to drill down into a subdirectory. Download it and try it - it is quite spectacularly useful on Windows (although I'm not sure how well it handles terabyte-size drives - I haven't used Windows for a long time).

Hard to do a circular graph in a terminal...

It is very similar to a flame graph, though; perhaps look at how flame graphs are drawn by other terminal performance tools.


I've used graphical tools very similar to this, and I always come back to this:

   du -h | sort -rh | less
(You might have to sudo that du depending on the current folder. On macOS, use gsort.)

You just immediately see exactly what you need to delete, and you can so quickly scan the list. I'm not a terminal die-hard "use it for everything" kinda guy, I like GUIs for lots of stuff. But when it comes to "what's taking up all my space?" this is honestly the best solution I've found.


I like to use:

    du -shx *
I used to pipe that to:

    | grep G
to find anything gig sized, but I like your:

    | sort -rh
Thanks!


Good tip! Yeah, the `sort -rh` is what makes it sing, it's such a cool feature of coreutils that `sort` knows how to sort human-readable output from `du` or `df`


"Disk Usage Analyser" / "Baobab" on Linux is awesome with the same UI: https://apps.gnome.org/en-GB/Baobab/


And Filelight from KDE - https://apps.kde.org/filelight/


Also Filelight (KDE)


I don't like radial charts because the outer rings have a larger area, which makes it look like files deep in the hierarchy take more space than in reality. And also it leaves the majority of screen space unused.

I prefer the more classic treemap view, my personal favorite being the classic version of SpaceMonger but it's Windows only and very slow.


WizTree also uses a treemap view and is very fast. It's also Windows-only though.


Linux and MacOS have QDirStat: https://github.com/shundhammer/qdirstat


Thank you for this, I've always looked for a WinDirStat alternative for Linux.


DaisyDisk on mac does that. Also it's blazing fast, it seems to even beat "du" so I don't know what tricks they're pulling.


I think they're reading some of the info from Spotlight metadata already collected by the OS for indexing, but I could be wrong.


That’s probably it. It’s likely powered by whatever thing gives you quick directory sizes in Finder after you do View Options (cmd+j), and select “Show All Sizes”. I have that setting always on for all directories and pretty sure it’s cached as it’s fast.


That’s called a ring chart or a sunburst chart.


duc http://duc.zevv.nl/ does this


On Windows I always used to use Windirstat but it was slow, then I found Wiztree which is many orders of magnitude faster. I understand it works by directly reading the NTFS tables rather than spidering through the directories laboriously. I wonder if this approach would work for ext4 or whatever.


NTFS is pointlessly slow, so bypassing the VFS provides a decent speedup in exchange for the ridiculous fragility.

Linux doesn’t have the same issue, and I’d be quite concerned if an application like this needed root access to function.


I think you underestimate how much of a speedup we're talking about: it can pull in the entire filesystem in a couple of seconds on a multi-TB volume with billions of files. I have yet to see anything in the Linux world (including the OP) that comes anywhere near this performance level via tree walking.


I want to take this opportunity to recommend the talk "NTFS isn't that bad" (https://www.youtube.com/watch?v=qbKGw8MQ0i8). NTFS prefers a different access pattern than most usual file systems. I remember that a part of the talk was about speed-ups on Linux as well. So even if it doesn't sway your opinion it should enhance your perspective on how file systems work.


The issue usually isn't NTFS, but the other layers in the I/O stack.

NTFS-the-on-disk-structure by itself can easily provide setup comparable to XFS realtime extensions.


If you do like WinDirStat, there's a good Linux equivalent called QDirStat: https://github.com/shundhammer/qdirstat


There is a fork of WinDirStat that reads the NTFS MFT as well: https://github.com/ariccio/altWinDirStat


> it works by directly reading the NTFS tables rather than spidering through the directories

Maybe I'm just ignorant of linux filesystems, but this seems like the obvious thing to do. Do ext and friends not have a file table like this?


> I don't know why one ordering is better than the other, but the difference is pretty drastic.

I have the suspicion that some file systems store stat info next to the getdents entries.

Thus cache locality would kick in if you stat a file after receiving it via getdents (and counterintuitively, smaller getdents buffers make it faster then). Also in such cases it would be important to not sort combined getdents outputs before starting (which would destroy the locality again).

I found such a situation with CephFS but don't know what the layout is for common local file systems.


It's also interesting that the perf report for running dut on my homedir shows that it spends virtually all of the time looking for, not finding, and inserting entries in dentry cache slabs, where the entries are never found again, only inserted :-/ Great cache management by the kernel there.

ETA: Apparently the value in /proc/sys/vm/vfs_cache_pressure makes a huge difference. With the default of 100, my dentry and inode caches never grow large enough to contain the ~15M entries in my homedir. Dentry slabs get reclaimed to stay < 1% of system RAM, while the xfs_inode slab cache grows to the correct size. The threads in dut are pointless in this case because the access to the xfs inodes serializes.

If I set this kernel param to 15, then the caches grow to accommodate the tens of millions of inodes in my homedir. Ultimately the slab caches occupy 20GB of RAM! When the caches are working the threading in dut is moderately effective, job finishes in 5s with 200% CPU time.


Are you referring to the kmem_cache_alloc calls in the profile? If so, that's all in kernel space and there's nothing I can do about it.

https://share.firefox.dev/3XT9L7P


No, see how your profiles have `lookup_fast` at the leaves? Mine has `__lookup_slow` and it is slow indeed.


I just saw your edit. You have WAY more stuff under your home directory than I do. I only have ~2.5M inodes on both my laptop drives combined. The difference in the buff/cache output of `free` before and after running `dut` is only 1 GB for me.

Also, TIL about that kernel parameter, thanks!


Yeah I have a TB of bazel outputs in my cache directory. Unfortunately automatically deleting old bazel outputs is beyond the frontier of computer science and has been pushed out to future releases for 6 years and still going: https://github.com/bazelbuild/bazel/issues/5139


Reminds me of someone's script I have been using for over a decade.

    #!/bin/sh
    du -k --max-depth=1 "$@" | sort -nr | awk '
         BEGIN {
            split("KB,MB,GB,TB", Units, ",");
         }
         {
            u = 1;
            while ($1 >= 1024) {
               $1 = $1 / 1024;
               u += 1
            }
            $1 = sprintf("%.1f %s", $1, Units[u]);
            print $0;
         }
        '


I don't understand the point of the script, it's nothing more than:

  du -h --max-depth=1 "$@" | sort -hr


`-h` is not available in all `sort` implementations


Even the busybox port has it. The only sort implementation I know of that doesn't have -h is toybox (I guess older busybox versions are missing it as well), but I've been using -h for well over a decade and have seldom had it missing.


I was actually curious when busybox's sort added it, but didn't search too hard. It was certainly easy to see that GNU got it in 2009, I think (but even then, if the dude set up their bashrc long ago and that func/alias works, there's likely no reason to change it immediately).

I can say that `BusyBox v1.35.0 (2022-08-01 15:14:44 UTC)` did not have -h, so it having it now is kind of a shock to me (it looks like BusyBox v1.36.1 has it, at least as of 2023-06-22). Good, too! It's always frustrating when a dev tries using GNU args and it blows up and I have to explain the difference between macOS shell commands, GNU, and BusyBox.


I found this online a long time ago, and it's been with me across BSD, Macintosh and Linux. So I can't say why it is that way, and I didn't know about sort -h before today.


The point is that it is faster.


A bash script for postprocessing the sorting is certainly slower than just having sort do it correctly in the first place.


Any particular reason for doing the human readable units "manually"? `du -h | sort -h` works just fine.


Nice!


I will definitely try this one and compare with my daily stuff

`du -s -k * | sort -r -n -k1,1 -t" "`


I'm surprised statx was that much faster than fstatat. fstatat looks like a very thin wrapper around statx, it just calls vfs_statx and copies out the result to user space.


Out of curiosity, I switched it back to fstatat and compared, and found no significant difference. Must've been some other change I made at the time, although I could've sworn this was true. Could be a system update changed something in the three months since I did that. I can't edit my post now though, so that wrong info is stuck there.


I have this in my bashrc:

    alias duwim='du --apparent-size -c -s -B1048576 * | sort -g'
It produces a similar output, showing a list of directories and their sizes under the current dir.

The name "duwim" stands for "du what I mean". It came naturally after I dabbled for quite a while to figure out how to make du do what I mean.


> Anyone have ideas for a better format?

Hi, how about a flamegraph? I've always wanted to display the file hierarchy in a flamegraph-like format.

- previous discussion: https://x.com/laixintao/status/1744012609983295816

- my work, displaying flamegraphs in the terminal: https://github.com/laixintao/flameshow


I'm away from my Linux machine now, but I'm curious whether/how you handle reflinks. On a supported file system such as Btrfs, which I use, how does `cp --reflink` get counted? Similar to hard links? I'm curious because I use this feature extensively.


I've actually never heard of --reflink, so I had to look it up. `cp` from coreutils uses the FICLONE ioctl to clone the file on btrfs instead of a regular syscall.

I don't handle them specifically in dut, so it will total up whatever statx(2) reports for any reflink files.
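
For anyone unfamiliar with FICLONE, a rough sketch of what `cp --reflink=always` boils down to (assumes a filesystem with reflink support such as btrfs or XFS). Note that statx on both files afterwards still reports the full size and blocks, which is why the copies total up as if they were independent:

    #include <fcntl.h>
    #include <linux/fs.h>      /* FICLONE */
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        if (argc != 3) { fprintf(stderr, "usage: %s src dst\n", argv[0]); return 1; }
        int src = open(argv[1], O_RDONLY);
        int dst = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (src < 0 || dst < 0) { perror("open"); return 1; }
        /* dst now shares src's extents instead of duplicating the data */
        if (ioctl(dst, FICLONE, src) != 0) { perror("ioctl(FICLONE)"); return 1; }
        close(src);
        close(dst);
        return 0;
    }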


You’ll probably end up with dupes (and removing these files won’t have the effect you intend) but I don’t know that there’s a good way to handle and report such soft links anyway.


Btdu will be your friend.


I often want to know why there was sudden growth in disk usage over the last month/week/etc., and what suddenly took space. In those cases I find myself wishing that du and friends would cache their last few runs and offer a diff against them, thus easily listing the new disk-eating files or directories. Could dut evolve to do something like that?


  du[t] > .disk-usage-"`date +"%d-%m-%Y"`"
And then use diff later?


Almost all of them will have some difference. What is needed is to parse the previous state, calculate the difference in size, and show only the "significant" difference.


btdu's extra (or expert?) mode with snapshots kinda does that: you can see what's only in the new version and not in a snapshot, and vice versa. It also offers attributing size to folders only for extents that aren't shared with a different folder (snapshots are essentially just special folders), to kinda get a diff between the two (stuff only present in the old snapshot is shown there; stuff only present in the new version is shown there).



That looks perfect, thanks!


Looks nice, although a feature I like in ncdu is the 'd' key to delete the currently highlighted file or directory.


This isn't an interactive program, so ncdu would be better for interactively going around and freeing up space. If you just want an overview, though, then dut runs much quicker than ncdu and will show large files deep down in subdirectories without having to go down manually.


Nice job. I've been using dua[0] and have found it to be quite fast on my MacBook Pro. I'm interested to see how this compares.

[0] https://github.com/Byron/dua-cli


I benchmarked against dua while developing, and the results are in the README. Note that dut uses Linux-specific syscalls, so it won't run on MacOS.

TL;DR: dut is 3x faster with warm caches, slightly faster on SSD, slightly slower on HDD.


What I need is a du that caches the results somewhere and then does not rescan the 90% of dirs that have not changed when I run it again a month later...


And it would know they did not change without scanning them because how?


Maybe it could run in the background and use inotify to just update the database all the time, or at least keep track of what needs rescanning?


Thinking about this some more, does this system not already exist for the disk quota calculation in the kernel? How does that work? Would it be possible for a tool to scan the disk once, and then get information about file modifications from the system that's used to update quota info?


It could hash the contents of a dir. Along the lines of git


Except hashing requires... reading.

There is not much to be done here. Directory entries are just names; there are no guarantees that the files were not modified or replaced.

The best you could do is something similar to the strategies of rsync: rely on metadata (modified date, etc.) and cross your fingers that nobody did `cp -a`.


I would be fine with the latter, the program could display a warning like "Results may be inaccurate, full scan required" or something.

I guess I'm just annoyed that for Windows/NTFS really fast programs are available but not for Linux filesystems.


And hashing something requires reading all of its data. I think deducing the file size would actually be faster in some file systems and never slower in any.


Faster in all file systems I'd guess, stat is fast, opening the file and reading its contents and updating a checksum is slow, and gets slower the larger the file is.


I've been using my own function with `du` for ages now, similar to others here, but I appreciate new tools in this space.

I gave `dut` a try, but I'm confused by its output. For example:

  3.2G    0B |- .pyenv
  3.4G    0B | /- toolchains
  3.4G    0B |- .rustup
  4.0G    0B | |- <censored>
  4.4G    0B | /- <censored>
  9.2G    0B |- Work
  3.7G    0B |   /- flash
  3.8G    0B | /- <censored>
   16G  4.0K |- Downloads
  5.1G    0B | |- <censored>
  5.2G    0B | /- <censored>
   16G    0B |- Projects
  3.2G   42M | /- <censored>
   17G  183M |- src
   17G    0B | /- <censored>
   17G    0B |- Videos
  3.7G    0B | /- Videos
   28G    0B |- Music
  6.9G    0B | |- tmp
  3.4G    0B | | /- tmp
  8.8G    0B | |- go
  3.6G    0B | |   /- .versions
  3.9G    0B | | |- go
  8.5G    0B | | |     /- dir
  8.5G    0B | | |   /- vfs
  8.5G    0B | | | /- storage
  8.5G    0B | | /- containers
   15G  140M | /- share
   34G  183M /- .local
  161G    0B .
- I expected the output to be sorted by the first column, yet some items are clearly out of order. I don't use hard links much, so I wouldn't expect this to be because of shared data.

- The tree rendering is very confusing. Some directories are several levels deep, but in this output they're all jumbled, so it's not clear where they exist on disk. Showing the full path with the `-p` option, and removing indentation with `-i 0` somewhat helps, but I would almost remove tree rendering entirely.


It is being sorted by the first column, but it also keeps subdirectories with each other. Look at the order of your top-level directories.

  3.2G    0B |- .pyenv
  3.4G    0B |- .rustup
  9.2G    0B |- Work
   16G  4.0K |- Downloads
   17G  183M |- src
   28G    0B |- Music
   34G  183M /- .local
If you don't want the tree output and only want the top directories, you can use `-d 1` to limit to depth=1.


Ah, I see, that makes sense.

But still, the second `Videos` directory of 3.7G is a subdirectory of `Music`, so it should appear below it, no? Same for the two `tmp` directories, they're subdirectories of `.local`, so I would expect them to be listed under it. Right now there doesn't seem to be a clear order in either case.


Subdirectories cannot be larger than the directory that they are within, so they cannot be sorted _below_ their parent. Thus, the tree branches upwards, not downwards. The root is at the bottom, where a tree’s root should be!

Incidentally, dust sorts things the same way but presents it with a nicer tree:

     db48x  ~  1  dust -Db
    2.6G     ┌── saves
    2.6G   ┌─┴ .factorio
    3.4G   │   ┌── Steam
    3.4G   │ ┌─┴ share
    3.4G   ├─┴ .local
    1.8G   │ ┌── EgoSoft
    3.5G   ├─┴ .config
    7.2G   │ ┌── Amadeus (1984) DC (1080p BluRay x265 HEVC 10bit AAC 5.1 Tigole)
     16G   │ ├── Amadeus.1984.DC.INTERNAL.REPACK.1080p.BluRay.x264-CLASSiC[rarbg]
     23G   ├─┴ video
    1.9G   │   ┌── build
    2.0G   │ ┌─┴ notcurses
    2.2G   │ ├── wezterm
    2.1G   │ │ ┌── master
    2.6G   │ │ │   ┌── tiles
    2.6G   │ │ │ ┌─┴ obj
    4.1G   │ │ ├─┴ missions
    2.6G   │ │ │   ┌── tiles
    2.6G   │ │ │ ┌─┴ obj
    4.1G   │ │ ├─┴ iteminfo
    5.6G   │ │ ├── follower_rules
    8.5G   │ │ │     ┌── pack
    8.6G   │ │ │   ┌─┴ objects
    8.6G   │ │ │ ┌─┴ .git
     10G   │ │ ├─┴ uilist
     26G   │ ├─┴ cataclysm
     33G   ├─┴ src
     72G ┌─┴ .


Gotcha, thanks for explaining.

Yeah, I guess my confusion was with how the tree is rendered in dut. The pipe rendering of dust makes this clearer.


Neat, a new C program! I get a little frisson of good vibes whenever someone announces a new project in C, as opposed to Rust or Python or Go. Even though C is pretty much a lost cause at this point. It looks like it has some real sophisticated performance optimizations going on too.


I have been using diskonaut; it's fast enough given that it also produces a nice visual output.


Did you consider the fts[0] family of functions for traversal? I use that along with a work queue for filtered entries to get pretty good performance with dedup[1]. For my use case I could avoid any separate stat call altogether, the FTSENT already provided everything I needed.

0 - https://linux.die.net/man/3/fts_read

1 - https://github.com/ttkb-oss/dedup/blob/6a906db5a940df71deb4f...


Those are single threaded, so they would have kneecapped performance pretty badly. 'du' from coreutils uses them, and you can see the drastic speed difference between that and my program in the README.


fts is just wrapper functions.

You cannot get around getdents and stat-family syscalls on Linux if you need file sizes.


Nice work! There is also gdu[1], where the UI is heavily inspired by ncdu and somehow feels way faster...

1: https://github.com/dundee/gdu


> https://dev.yorhel.nl/doc/ncdu2

I wasn't aware that there was a rewrite of ncdu in Zig. That link is a nice read.


This looks handy. Do you have any tips for stuff like queued ‘mv’ or similar? If I’m moving data around on 3-4 drives, it’s common that I’ll stack commands where the 3rd command may free up space for the 4th to run successfully. I use && to ensure a halt on failure, but I need to mentally calculate the free space when I’m writing the commands, as the free space after the third mv will be different from the output of ‘df’ before any of the commands have run.


I haven't run into a situation like that, but if I did, I'd be doing mental math like you. `dut` would only be useful as a (quicker) replacement for `du` for telling you how large the source of a `cp -r` is.


This looks awesome!

One comment: I find the benchmark results really cumbersome to read. Why don't you make a graph (e.g. a barplot) that would make the results obvious at a quick glance? I'm a strong believer in presenting numerical data graphically whenever possible; it avoids many mistakes and misunderstandings.


I think that 'ls' should also be evaluating the size of the files contained within. The size and number of contained files/folders really does reveal a lot about the contents of a directory without peeking inside. The speed of this is what would be most concerning though.


GPLv3, you love to see it. Great work.


You should include the "How to build" instructions near the beginning of the main.c file.


Done


Not as featureful, but what I've been using. If you can't install this tool for some reason, it's still useful. I call it usage:

    #!/bin/bash

    du -hs * .??* 2> /dev/null | sort -h | tail -22


dut looks very nice.

One small surprise I found came when I have a symlink to a directory and refer to it with a trailing "/": dut doesn't follow the link in order to scan the real directory. I.e., I have this symlink:

    ln -s /big/disk/dev ~/dev
then

    ./dut ~/dev/
returns zero size while

    du -sh ~/dev/
returns the full size.

I'm not sure how widespread this convention is to resolve symlinks to their target directories if named with a trailing "/" but it's one my fingers have memorized.

In any case, this is another tool for my toolbox. Thank you for sharing it.


Does it depend on linux functionality or can I use it on macos?

Well I can just try :)


From the author:

"Note that dut uses Linux-specific syscalls, so it won't run on MacOS."


Great app, and very fast at scanning nested dirs. I often need recursive disk usage when I suddenly run out of space and scramble to clean up while everything is crashing.


I always want treemaps.

- console (Rust): cargo install diskonaut

- console (Python): pip install ohmu

- GUI: gdmap

- Windows: WinDirStat

- Mac: GrandPerspective (I seem to recall)


Would be great to have a TUI interface for browsing like ncdu.


Ncdu is easy to remember and use, clicking through, etc. It would be cool to find a faster replacement with the same usage, instead of a new tool with parameters to remember.


Nice work. I really miss the simplicity of C. One file, one Makefile, and that's it. Has anyone tested it with a node_modules folder?


Someone, please create a Gdut, a fork that will produce graphs for a quick and easy way to read; it’s almost impossible to read on small vertical screens.


If this accurately shows hidden stuff, such as docker build cache and old kernels, then it will become my go-to!


As long as it has permissions, it totals up everything under the directory you give it including names that start with a '.'. It won't follow symlinks though.


Uh, even basic du "shows hidden stuff" accurately doesn't it?

dot files are just a convention on unix.


I get boatloads of "undefined reference" errors. Where's the list of dependencies?


The only dependency is a recent Linux C standard library. What are the missing symbols? On older versions of glibc you do have to add -pthread.


The author should have included a Makefile. You need to add -lpthread to the command provided in README.md.


Did you consider using io_uring? If not, was there a reason other than portability?


io_uring doesn't support the getdents syscall, so there's no way to traverse the filesystem with it. I considered using it for statx(2) to get the disk usage of each file, but decided not to because (a) it would be complicated to mix normal syscalls and io_uring and (b) perf showed the kernel spending most of its time doing actual work and not syscall boilerplate.


Are you sure the perf data isn't misleading?

E.g. memory accesses might show up as slower due to CPU caches being flushed when switching between user and kernel space.

I would be extremely interested in a quick (standalone?) benchmark of e.g. 1M stats with vs without uring.

Also https://github.com/tdanecker/iouring-getdents reports big uring speedups for getdents, which makes it surprising to get no speedups for stat.

If uring turns out fast, you might ignore (a), just doing the getdents first and then all the stats afterwards, since getdents is a "batch" syscall covering many files anyway, but stat isn't.
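
If someone does try that, here's a rough standalone sketch of the io_uring side using liburing (hypothetical benchmark scaffolding, not from dut; assumes liburing is installed, linking with -luring, and at most 256 paths per batch):

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <liburing.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/stat.h>

    int main(int argc, char **argv)
    {
        int n = argc - 1;                     /* paths to stat, capped at 256 */
        struct io_uring ring;
        if (n > 256 || io_uring_queue_init(256, &ring, 0) < 0)
            return 1;

        struct statx *stx = calloc(n, sizeof(*stx));
        for (int i = 0; i < n; i++) {
            struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
            io_uring_prep_statx(sqe, AT_FDCWD, argv[i + 1],
                                AT_SYMLINK_NOFOLLOW, STATX_BLOCKS, &stx[i]);
            io_uring_sqe_set_data(sqe, &stx[i]);
        }
        io_uring_submit(&ring);               /* one syscall for the whole batch */

        long long blocks = 0;
        for (int i = 0; i < n; i++) {
            struct io_uring_cqe *cqe;
            if (io_uring_wait_cqe(&ring, &cqe) < 0) break;
            if (cqe->res == 0) {
                struct statx *s = io_uring_cqe_get_data(cqe);
                blocks += s->stx_blocks;
            }
            io_uring_cqe_seen(&ring, cqe);
        }
        printf("total: %lld bytes on disk\n", blocks * 512LL);
        io_uring_queue_exit(&ring);
        return 0;
    }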


I appreciate the explanation!


Neat tool. Congrats on the release, and thank you for this and the analysis/comparison.



ncdu has been my go to for years. Pleased to have a modern alternative.


Ideas for a better format: do what xdiskusage does.


What specifically do you feel xdiskusage does well?


It graphically displays the relative sizes of things, and allows you to interactively zoom into any particular subdirectory to see the relative sizes of the things inside it


Why C and not Rust or even Zig?


Because (a) I felt like it, (b) it would be more difficult to make raw syscalls in Rust, and (c) I use[1] flexible array members to minimize allocations, whereas Rust's support for those is quite bad[2]. It doesn't even look possible to allocate one with a size known only at runtime.

[1]: https://codeberg.org/201984/dut/src/branch/master/main.c#L16...

[2]: https://doc.rust-lang.org/nomicon/exotic-sizes.html#dynamica...


> it would be more difficult to make raw syscalls in Rust

Would you like to expand on this? Is it because of type conversions that you'd have to do?

> I use[1] flexible array members to minimize allocations

I was under the impression that FAM was a non-standard extension, but alas it is part of C99.

From what I'm seeing you have an intrusive list where each `entry` points to the previous and next element, and the path itself is a bag of bytes as part of the entry. I'm assuming that what you'd want from Rust is something akin to the following when it comes to the path?

    struct Entry<const NAME_LEN: usize> {
        ..
        mode: Mode,
        name: InlineName<NAME_LEN>,
    }

    struct InlineName<const NAME_LEN: usize> {
        value: [u8; NAME_LEN],
    }


For syscalls, I would have needed to either pull in dependencies or write FFI bindings, and neither of those options are appealing when I could simply write the program in Linux's native language.

For the FAM, your example looks like it requires a compile-time constant size. That's the same as hardcoding an array size in the struct, defeating the whole point. Short names will waste space, and long ones will get truncated.
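
For readers unfamiliar with flexible array members, a tiny illustration of the runtime-sized allocation being described (not dut's actual struct, which also carries list links and size fields):

    #include <stdlib.h>
    #include <string.h>

    struct entry {
        unsigned mode;
        char name[];                 /* C99 flexible array member */
    };

    struct entry *entry_new(const char *name, unsigned mode)
    {
        size_t len = strlen(name) + 1;
        /* a single allocation holds both the header and the name,
         * sized exactly for this name at runtime */
        struct entry *e = malloc(sizeof(*e) + len);
        if (!e)
            return NULL;
        e->mode = mode;
        memcpy(e->name, name, len);
        return e;
    }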


> For the FAM, your example looks like it requires a compile-time constant size. That's the same as hardcoding an array size in the struct, defeating the whole point. Short names will waste space, and long ones will get truncated.

You made me realize that the const generics stabilization work hasn't advanced enough to do what I was proposing (at least not in as straightforward way): https://play.rust-lang.org/?version=nightly&mode=debug&editi...

Those are const arguments, not const values, which means that you can operate on values as if they didn't have a size, while the compiler does keep track of the size throughout.

I'd have to take a more detailed look to see if this level of dynamism is enough to work with runtime provided strings, which they might not be, unless you started encoding things like "take every CStr, create a StackString<1> for each char, and then add them together".


This is not an appropriate question to ask, though I see it sometimes in these threads. "Because the author wanted to" is good enough a reason for them to write a program in C. It being a new project written in C can also be a good enough reason for you not to use it: dust already exists and is written in Rust, which you can use instead.


Why Rust or Zig and not C?


Better for the author's resume, if they want to make it hype driven.

Also some nebulous "being more secure". Never mind that this tool does not have elevated privileges. You gotta watch out for those remote root exploits even for a local only app, man.


I mean you could extract an archive you've downloaded to your filesystem and said archive could have funky file names and then you use this tool..

But I suppose it's not a very likely bug to have in this kind of tool.


Or “they” could enter your house at 2 am, drug you and hit you with a $5 wrench until they get access to your files :)


Never understood this line of thought. Not everything needs to be super secure. Not everything is going to be an attack vector. No one is going to deploy this onto a production server where this program specifically is going to be the attack vector.

Memory safety is cool and all, but a program that effectively sums a bunch of numbers together isn't going to cause issues. Worst case the program segfaults


Because C is like a cult Toyota Supra with twin turbo and Rust or Zig is like another cool boring Corvette roadster.



