I took a brief look at the issue and it's very strange. People reporting the issue and piling on clearly don't understand sparse files (which is fine in and of itself), and their claims that "the rest of the system treats the sparse file as a regular file" are just not very credible to me.
The thing is, if you want to look at the physical size of a file or directory, you don't use ls, you use du. Similarly, if you want to look at disk usage and free space on a volume, you use df, not a sum of the sizes ls reports for every file. Using df for free space is especially important on APFS, since APFS volumes share the space in a container. And of course, du and df correctly report actual disk usage. If you check Docker.raw in Finder, it says something like "63,999,836,160 bytes (10.25 GB on disk)", too. Btw, even `ls -l` reports the correct total disk usage:
$ ls -lh ~/Library/Containers/com.docker.docker/Data/vms/0/data
total 9.6G
-rw-r--r-- 1 <omitted> staff 60G Jun 17 02:01 Docker.raw
There's just no way "the rest of the system treats the sparse file as a regular file". I suspect people actually have disk space tied up in APFS snapshots (Time Machine takes a lot of those, and I have to clear them constantly with tmutil on machines with meager SSDs), but they're not aware of that; instead they ran some crappy script they found online, located this "huge file", and just blamed it for the disk usage. Any proper tool, e.g. Disk Utility (built-in) or DaisyDisk, won't make this obvious mistake. I thought maybe people were using Finder search to find large files and that's the problem, but I just tried, and Finder search also distinguishes between logical size and physical size, so it's not even that.
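If anyone wants to check this themselves, here's roughly how to compare the two sizes (the path is just where Docker.raw usually lives; adjust as needed, and your numbers will obviously differ):
$ f=~/Library/Containers/com.docker.docker/Data/vms/0/data/Docker.raw
$ ls -lh "$f"    # logical (declared) size
$ du -h "$f"     # space actually allocated on disk
$ df -h /        # free space, as reported by the filesystem itself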
Edit: In case it's not clear, sparse files don't "reserve" their space. Try this:
for i in {000..999}; do dd if=/dev/zero of=/tmp/sparse-$i bs=1024 count=0 seek=104857600; done
You just created a thousand 100GiB files totaling zero bytes in disk usage (not counting metadata).
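And you can confirm that none of it counts against free space (assuming the files were created as above; output will differ per machine):
$ du -ch /tmp/sparse-* | tail -1    # total space actually allocated: effectively zero
$ df -h /tmp                        # free space on the volume is unchanged
$ rm /tmp/sparse-*                  # clean up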
I understand what a sparse file is, and so do many of the people in that thread. The important point is: do all the apps on the Mac understand that sparse files don’t take up all that space? Some clearly don’t, which means their developers either don’t understand them or didn’t account for them. It’s not the fault of the users in that thread.
They don't need to. If an application needs to know how much free disk space there is, it uses an API like statvfs(3) that queries the filesystem, which obviously knows how to compute its own disk usage. No application is gonna ask for the logical size of every single file on the disk; why the hell would it even care about this particular Docker.raw file?
As I said, reports like "the macOS installer says there's not enough disk space because of a sparse file somewhere" and "macOS tells me disk space is low every two minutes because of a sparse file somewhere" just aren't credible if you understand how it works. Otherwise, how would those tools behave if you created a thousand "100GB files" like I demonstrated? Would they go all crazy? Would your free disk space be negative?
> just aren't credible if you understand how it works.
Yes, your theoretical knowledge, which several of the people in that thread obviously have, beats actual experience.
Not only is that not credible, it's about as useful and insightful as "it works for me" - and who on Earth is using statvfs on a Mac? Is the installer using statvfs? You've looked, right? And you believe every developer uses the correct APIs and never introduces a bug…
Try reading and not skimming; it's especially important when dealing with bug reports.
The alternative theory is that the installer spends a very, very long time (on my computer I bet that would take at least ten minutes) scanning every file on my disk, looking up each file's declared size and totaling them, to check whether the size of the disk minus that number is enough disk space. That premise is so absolutely ridiculous that I don't even know how it is being taken seriously.
That's an alternative theory, among many other possible ones that wouldn't lead to the dismissal of an entire thread of developers' comments based on a quick skim and the strange idea that not one of them knows what a sparse file is.
I have no trouble creating a 1TiB file this way on a 512GB volume. Trying to create a 2TiB file this way does error out. MacBookPro11,5, macOS 10.15.5.
It is a sparse file: copying it is very fast when it's empty. I don't know if the OS recognizes the space claims, but it would be logical if it did so.
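Concretely, that's just the same dd trick scaled up (size and path here are only an example):
$ dd if=/dev/zero of=/tmp/huge.img bs=1024 count=0 seek=1073741824    # 1 TiB logical size, nothing written
$ ls -lh /tmp/huge.img    # shows 1.0T
$ du -h /tmp/huge.img     # shows ~0, nothing is actually allocated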
I’m sorry, but am I being downvoted for being factual? Docker sets up a HyperKit VM (which brings with it a fair amount of overhead, since it includes a complete Linux kernel and a sizable userland). overlayfs runs inside that VM image.
I’m not sure why you’re being downvoted, but your comment doesn’t address the issue. Of course Docker will need some pre-allocated space, but such a high default seems excessive and ignores the real problems it’s causing for the people in that issue thread.
How do I view server logs in real time while developing if the server is daemonized (while preserving all the colorization and stuff)? Do I just tail some kind of log stream or something?
For local dev, I run all my docker stuff via minikube (`minikube docker-env` sets things up so your shell uses the docker daemon running in minikube, allowing you to run docker containers outside of k8s) and then from time to time I just run `minikube delete`, which deletes the entire VM - docker images and cruft included.
It’s a bit of a scorched earth approach, and it won’t help if you want to preserve volumes, but it’s a good way to ensure no docker cruft is left behind on your local machine.
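In case the exact commands help (the image name is just an example):
$ eval $(minikube docker-env)    # point this shell's docker CLI at the daemon inside the minikube VM
$ docker build -t myapp:dev .    # the image now lives inside the VM, not on the host
$ minikube delete                # throw away the whole VM, images and cruft included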
The client/server model is one of the leakiest parts of Docker. Try mounting a relative path, running docker inside docker, or inheriting the current user’s security privileges.
To this day I don’t understand what problems the client/server model solved, and why it was worth all the problems it created.
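The relative path case illustrates the leak nicely; the exact behavior and error text vary by version, but roughly (paths here are made up):
$ docker run --rm -v ./data:/data alpine ls /data          # historically rejected: the daemon has no idea what the client's working directory is
$ docker run --rm -v "$(pwd)/data:/data" alpine ls /data   # the usual workaround: resolve the path on the client side first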
I guess - and I might be very wrong here - that the reason for the client/server architecture was to be able to schedule docker containers on several hosts without needing to ssh into them. And I guess something like docker-in-docker, or docker containers accessing the Unix socket, would be more complicated without it.
But why the need for Unix sockets or anything like that? Creating a container is a fancy fork(), and executing that through a foreign process (especially when on the same server) makes no sense to me.
Remember, containers are just Linux namespaces and cgroups; there is nothing “special” about a container that requires a client/server model.
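A rough illustration on a Linux host, using plain util-linux and no daemon at all (cgroup limits would be added separately):
$ sudo unshare --uts --pid --fork --mount-proc /bin/sh    # new hostname and PID namespace, fresh /proc
# inside, `hostname foo` doesn't touch the host, and `ps` only sees this process tree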
For me and my coworkers, the leakiness is mostly around networking. If you start the wrong VPN, docker doesn’t work. If your iptables rules aren’t set up just so, docker doesn’t work. Yet the documentation would lead you to believe that how docker does networking should be considered an implementation detail.
Off topic:
I was surprised by how amused I was by this. It looks like normal humor doesn't cut it for me anymore.
> Because I cannot be bother to remember this, and don't want to google how it's done only to end up at my own blog, I did a quick realias to add a new bash alias
Between that and Electron apps taking up every smidgen of available RAM, I'm surprised my machine is able to run anything else at all.
[1] https://github.com/docker/for-mac/issues/2297