Is there ever a good reason to disable swap?
29 points by warrenm 6 months ago | 82 comments
https://serverfault.com/a/684800/2321 indicates it is "not recommended" (at least on Linux), and I have always set up at least some swap on every machine I have had since moving off DOS in the early 90s.

Is there ever a good reason to not enable it (on any OS)?




Most people think of swap as "emergency memory in case I run out of memory" and while it's true that it can get used in this way, it usually serves a much more critical purpose in your OS's ability to reason about and use memory.

For a good article on why this is true for Linux: https://chrisdown.name/2018/01/02/in-defence-of-swap.html

I believe that most operating systems are going to make use of memory in a similar manner.

With that said, I'll turn off swap on devices that have unreliable storage (anything using an SD card).


I've always hated one bit of that article. It doesn't address the "low" and "no" memory contention cases separately, even though they're quite different and it's possible to get a system to stay in the "no memory contention" case all the time (unless there's a memory leak, but swap just fills up then and provides no benefit).

> Under no/low memory contention

> With swap: We can choose to swap out rarely-used anonymous memory that may only be used during a small part of the process lifecycle, allowing us to use this memory to improve cache hit rate, or do other optimisations.

> Without swap: We cannot swap out rarely-used anonymous memory, as it's locked in memory. While this may not immediately present as a problem, on some workloads this may represent a non-trivial drop in performance due to stale, anonymous pages taking space away from more important use.

This is under the low/no memory contention heading. For the low-contention case this can make sense, but for the no-contention case it's nonsense: there is no "more important use", all uses are fulfilled and there's still free memory (that's what "no contention" means)!

So clearly that very page has presented a case where swap is useless: when you have enough RAM to ensure there's never contention.

Swap also only extends the amount of memory by the size of the swap partition. If you've got 64GiB of memory and an 8GiB swap partition, you could just as well have 96 or 128GiB of memory and reserve an 8GiB zram for swap.

Indeed, Fedora changed to use zram instead of swap-on-disk by default in Fedora 33[1].

Swap does allow some fancier optimizations to memory layout, but again swap-on-zram is better for this than on a disk if you've got enough RAM.

The big benefit of swap is on laptops where hibernation may actually be desirable (assuming encrypted swap & disk) and RAM is harder to come by. A laptop with 96GiB+ of RAM is a LOT more of an expense than a desktop with the same.

One huge disadvantage of swap-on-disk that the article neglects to mention is that sensitive data from RAM can be written to persistent storage (swap) and thereby leaked more easily (assuming unencrypted swap). Swap must be encrypted if it's disk-backed.

[1] https://fedoraproject.org/wiki/Changes/SwapOnZRAM


For a couple of decades I have run my desktop without swap, for the simple reason that all it does in practice is slow the system to a crawl when a rogue process is gobbling all the RAM. For my taste, a straightforward failure is better than an unresponsive system. Not that it happens very often, anyway.


I prefer to auto-kill the biggest memory eaters (which is always Chrome/Chromium/Electron) once the system is actually well into swapping. Usually it comes back without any killing, and I hardly notice the unresponsiveness; otherwise it comes back fast once Chrome has been killed. This way I don't risk the system OOM-killing an actually important process (like a compile I am doing), which will easily and without issue continue after Chrome is gone. Anything important, in my case, is never a non-terminal process.


Maybe it isn't that bad if the swap file / partition is on an SSD? I actually haven't run out of RAM since I moved to SSDs, even when I still had 16 GiB. And now there are some ultra fast NVMe drives out there.


Last time I checked, which was long ago to be fair, SSDs had a latency that was still orders of magnitude higher than RAM, so the system would still slow down to a crawl regardless.


Just to put some numbers on it.

DDR5 Latency = 16.67ns

DDR5 Bandwidth = 64GB/s

Now, keep in mind, this doesn't account for things like dual or quad channel, which bump up the bandwidth even further, though they don't really affect latency.

The fastest and most reasonably priced consumer NVMe I could find:

NVMe Latency = 200000ns

NVMe Bandwidth = ~3GB/s

The fastest consumer (or low-end enterprise) grade persistent storage is dramatically slower, and takes far longer to respond, than the cheapest DDR5 RAM.
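
To put those figures side by side, a back-of-the-envelope sketch using the numbers above (exact values vary by part):

  # Rough ratios between the DDR5 and NVMe figures quoted above.
  ddr5_latency_ns = 16.67
  ddr5_bandwidth_gbs = 64
  nvme_latency_ns = 200_000
  nvme_bandwidth_gbs = 3

  print(f"latency:   NVMe is ~{nvme_latency_ns / ddr5_latency_ns:,.0f}x slower")    # ~12,000x
  print(f"bandwidth: DDR5 is ~{ddr5_bandwidth_gbs / nvme_bandwidth_gbs:.0f}x faster")  # ~21x

So a page fault served from even a fast NVMe drive costs roughly four orders of magnitude more than a RAM access.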


Isn't Optane latency more like 8 µs, or 8000 ns[1]?

Still waaaaaay slower than RAM, but getting closer. Before giving up on Optane, Intel used to make memory modules with Optane on them (as a form of persistent backup, I think).

[1]: https://www.intel.com/content/www/us/en/products/docs/memory...


I didn't know about those Optane SSDs; they are very interesting, and I will look into acquiring one if they are available for consumer hardware.

From the mentioned article:

“Bandwidth problems can be cured with money. Latency problems are harder because the speed of light is fixed; you can't bribe God.” (David Clark)


My understanding is that the slowness comes from program code being removed from memory to free up space, forcing the system to constantly re-read it from disk. I still get the slowness without swap.


If you're running a Raspberry Pi-type machine with storage on an SD card or similar, yes. It's too easy to kill the card with heavy writes.

Also on dedicated systems where you have full control of the stack and the apps, and have resource limits in place anyway. If you know your embedded system will never need more than X GB, you can use physical RAM as the resource limit of last resort (and hopefully let the watchdog reboot it after the required app fails to check in).


Nowadays it is the inverse: is there ever a good reason to enable it? Most devices run on TLC or other fast-wearing flash, and swapping there is both expensive in terms of durability loss and still much slower than just having enough RAM.

I think my only device with swap is my Mac laptop and it is relatively conservative when it swaps, unlike Linux with default settings.


> Is there ever a good reason to enable it?

Yes. It's a rare system on which it shouldn't be enabled.

RAM is a precious resource. It's highly likely programs will allocate RAM that won't be used for days at a time. For example, if you are using Docker, once the containers are started the management daemon does nothing. If you have ssh enabled only for maintenance, it's unlikely to be used for days if not weeks on end. If your system starts gettys, they are unlikely to ever be used. The bulk of the init system probably isn't used once the system is started.

All those things belong in swap. The RAM they occupied can be used as disk cache. Extra disk cache will speed up things you are doing now. Notice this means most of the posts here are wrong - just having swap actually gives you a speed boost.

One argument here is that disabling swap gives you an early warning that you need more RAM. That's true, but there is a saner option. Swap is only a problem if the memory stored there is being used frequently. How do you monitor that? You look at reads from swap. If the steady state of the system shows no reads from swap, you aren't using what's there, so it can't possibly have a negative speed impact. And if swap is being used but isn't being read from, it has freed some RAM, so it is having a positive speed impact.
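
In concrete terms, on Linux the pswpin/pswpout counters in /proc/vmstat count pages swapped in and out since boot, so "reads from swap" can be watched with something like this sketch:

  # Sample /proc/vmstat twice and report swap-in/out rates in pages/sec.
  import time

  def swap_counters():
      counters = {}
      with open("/proc/vmstat") as f:
          for line in f:
              key, value = line.split()
              if key in ("pswpin", "pswpout"):
                  counters[key] = int(value)
      return counters

  INTERVAL = 10
  before = swap_counters()
  time.sleep(INTERVAL)
  after = swap_counters()

  # pswpin rising steadily  -> the system is actually RAM starved
  # pswpout only, no pswpin -> swap is just freeing RAM for disk cache
  for key in ("pswpin", "pswpout"):
      print(key, (after[key] - before[key]) / INTERVAL, "pages/sec")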

One final observation: the metric "swap should be twice the size of RAM" isn't so helpful nowadays. There isn't a lot of memory tied up in programs that sit around doing nothing - maybe 1GB or so, and it's more or less fixed regardless of what the system is doing. Old systems didn't have 2GB of RAM, so the "twice RAM" rule made sense. But now a laptop might have 16GB. Using 32GB of swap and not reading from it would be a very, very unusual setup. And if you are reading from that 32GB of swap, your system is RAM-starved and will be dog slow. You need to add RAM until those reads drop to 0.


The best modern reason to have as much swap as RAM is to make hibernation to disk more reliable, but a lot of people don't use that anymore. It's more reliable because the kernel doesn't have to work as hard to find space to write the system image to.


> One final observation: the metric "swap should be twice the size of RAM" isn't so helpful nowadays

I remember when I thought "if double is recommended, four times should be even better!". It was not.

Nowadays I don't use swap because I rarely run out of RAM, it sits there eating a few precious GB of SSD, largely unused. The rare cases when I run out of RAM have been buggy Steam games on Proton. In 2024, it has been only "The Invincible", and that game has reports of running out of memory on Windows too.


But why do you have swap on with default settings?


They don't. They said their "only device with swap is my Mac laptop", so they have swap off in their Linux machines.

If your question was "why not keep swap on and change the default swappiness settings": if keeping it totally off works fine for them, why bother?


Keep in mind that executable code on Linux is mapped in from disk, either from an executable or from a shared library. So every application's performance on Linux is heavily dependent on the disk cache.

If you have no swap, anonymous pages (stacks, heaps) cannot be evicted to disk and the thrashing is forced onto the disk cache. So the hard lock-up occurs earlier.

If you want to delay the lock-up as much as possible, enable swap and set swappiness high.
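
For reference, a minimal sketch of checking and raising it on Linux (vm.swappiness lives in /proc/sys/vm/swappiness; writing needs root, and 100 here is just an example value):

  # Read the current vm.swappiness, then (as root) raise it.
  with open("/proc/sys/vm/swappiness") as f:
      print("current swappiness:", f.read().strip())   # typically 60 by default

  # Equivalent to `sysctl vm.swappiness=100`: higher values make the kernel
  # more willing to evict anonymous pages in favour of keeping the disk cache.
  with open("/proc/sys/vm/swappiness", "w") as f:
      f.write("100")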


Is this really from real world experience? And is that only in certain conditions?

My experience is really the opposite of this. Thrashing was a normal occurrence on my desktop, where it would fail to recover and need a manual hard reset.

And also, with 16GB RAM and 4GB swap, my running applications got moved to swap. Switching tabs in Firefox would be slow because the pages had to come back from swap. My swappiness was set to 1 so that it shouldn't swap, but it always swapped anyway.

Now, without swap and using earlyoom, everything is fine. When I see in /proc/vmstat that there has been a kill, it is time to reboot.
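
(For what it's worth, reasonably recent kernels expose an oom_kill counter directly in /proc/vmstat, so that check can be a short sketch like this:)

  # Print the number of OOM kills since boot (counter present on kernels 4.13+).
  with open("/proc/vmstat") as f:
      for line in f:
          if line.startswith("oom_kill "):
              print("OOM kills since boot:", line.split()[1])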

On my laptop, though, my use case is different. It only has 2GB of RAM, so I prefer swap over a hard kill. And I reboot it more than once a day if I am using it.


Yes, I learned it the hard way when debugging production outages. GitLab's recommended VM sizes for Praefect were too small for our use case, and per our provisioning defaults we had no swap on any machine. 150 MB of binaries in virtual memory, only 50 MB of disk cache left: this is where it clicked for me.

Whether you want a hard OOM kill, I don't know. I'm only talking about the I/O lockup that happens in these situations.


Thank you. So RAM was quite minimal: just barely enough to run the applications, with almost none left for disk IO and cache. On my laptop that is the same situation. On my desktop, however, I have way more RAM than needed to run the applications. So I assume whether you want (need) swap or not depends on the situation.


This is the correct answer that needs to be at the top. No swap doesn't mean the OOM killer magically kicks in earlier. It just means the anonymous memory has nowhere to go, your executable pages get evicted instead, and then you are really hosed.


And the machine crashes, which in production environments is far preferable to dog slow.


Unfortunately, no crash. This is the dog-slow case: too slow for an SSH session to even start. But the machine might catch itself and get back on track without an OOM ever happening.


If you turn off swap (which I do for large fleets of highly uniform and tightly managed systems) you should also mlock your executable pages.
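
For the curious, the blunt way to do that from inside a process on Linux is mlockall(MCL_CURRENT | MCL_FUTURE), which pins all current and future mappings (code included) into RAM; locking only the executable mappings would mean walking /proc/self/maps instead. A rough sketch (needs CAP_IPC_LOCK or a generous RLIMIT_MEMLOCK):

  # Pin every current and future mapping of this process into RAM.
  import ctypes, os

  MCL_CURRENT = 1   # lock pages currently mapped
  MCL_FUTURE = 2    # lock pages mapped from now on

  libc = ctypes.CDLL("libc.so.6", use_errno=True)
  if libc.mlockall(MCL_CURRENT | MCL_FUTURE) != 0:
      err = ctypes.get_errno()
      raise OSError(err, os.strerror(err))
  print("all pages locked; executable pages can no longer be evicted")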


I went with enabling swap and monitoring for page pressure. At the end of the day, the disk cache for the application data is also highly performance-critical.


How the lock-up looks in practice: RAM is mostly full of heaps/stacks, there are only a few MB available for disk cache, and all processes fight each other to have their own code mapped into those remaining MB. Disk read I/O is fully saturated at this point.


Kubernetes insists that you disable swap. The reason is that swap messes with the accounting of memory usage. Kubernetes expects a host's memory reservations to map directly onto actual RAM. Containers have a memory request; if they are over this request when memory pressure hits, they will be killed. Because swapped-out pages are not included in the memory accounting[1], you can end up in states where the node is under pressure but nothing is over its request, even though you theoretically haven't overcommitted. That forces a fallback to essentially random killing, which is to be avoided.

It's also based on observed performance: for many moons, the performance hit of swapping was bad enough that it was never worth running two jobs concurrently that didn't both fit into RAM together. The exceptions weren't even exceptions. Disk thrashing was a serious impediment and a sure way to slow your whole fleet down.

Now with fast flash being so common, swap is probably actually a good thing for many workloads again, but only "many", and SREs would prefer you make that explicit by using memory mapped files for data that's of random utility, so that the OS can manage that pressure for you, understanding that those files don't need to be fully resident.

[1] This is an oversimplification that I don't remember the real truth of off the top of my head
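
To illustrate the point above about memory-mapped files, a tiny sketch ("data.bin" is just a hypothetical file name): map a large read-only data set instead of reading it into anonymous memory, and the kernel can evict and re-read those pages under pressure without any swap involved.

  # Map a large read-only data file instead of read()ing it into the heap.
  import mmap

  with open("data.bin", "rb") as f:                    # hypothetical data file
      m = mmap.mmap(f.fileno(), 0, prot=mmap.PROT_READ)
      # Pages are faulted in on first access; under memory pressure the kernel
      # can simply drop them and re-read from the file later - no swap needed.
      header = m[:64]
      print(len(m), "bytes mapped, first bytes:", header[:4])
      m.close()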


At a previous cloud hosting provider I worked for, swap was disabled everywhere.

Every instance was designed to:

  - have ~1GB for basic server requirements
  - have X GB for whatever the server was hosting: database? web server? proxy? All have known memory consumption
  - if required, have some extra GB for IO cache
So we had some known requirements (the "app" line) and some variable requirements (IO cache and "basic server requirements").

Some extra information:

  - one instance = one service (this is the way to handle technical debt, all management issues, but also risk management and security-related stuff)
  - storage was backed by hardware delivering half a million effective IOPS
No OOM and no waste.


It allows the "fail fast" philosophy, where things break quickly and noticeably (in this case, when you run out of RAM), rather than risking a silent degradation in performance.


Without swap you have two modes of operation: working fine, and not working at all (killed by the OOM killer). With swap, you introduce a third mode: working very slowly. The more modes of operation, the harder to reason about.


This is sort of fair, but working very slowly occurs in more than one dimension.

Swap on means applications may be slower due to page faults pulling in swapped-out memory. Swap off may mean file system operations are slower due to less opportunity to keep frequently used objects in the file system cache.

There is still a trade-off here: performance balanced against use case.


Disabling swap is recommended on k8s nodes, since the idea there is to schedule workloads to use up the actual RAM/CPU of the box.


I think it is generally a good thing to do on any dedicated machine.

But for some things, especially with some k8s uses, overprovisioning can be okay. Some stuff just needs to run somewhere where the cost of swapping in or out is lower than the cost of provisioning more/bigger machines, or the cost of scaling from zero to one and back.


Yup, first-class support has been coming: https://kubernetes.io/blog/2023/08/24/swap-linux-beta/


I have a special Linux ISO I use to boot a secure enclave. It also runs the whole file system entirely in RAM, so nothing persists across reboots. I disabled swap as part of this strategy. However, I must make sure it's run on systems with enough physical memory for all of this.

That said, I run a swap partition in the encrypted portion of this laptop, which I think obviates the problem the original ServerFault poster was trying to solve.


I generally want my applications to be oom killed when they run out of physical memory, not when they run out of swap. The latter is much slower and more painful.


I've seen the sentiment expressed widely that disabling swap lets machines fail fast when the workload runs out of memory. I would be partial to such behaviour as well, but in my experience, without swap, I've more often seen systems run into livelock, with the OOM killer unbearably slow in reaping its victims while I can neither ssh in nor view logs; such situations basically disappear on systems where I do have swap enabled.

Of course, with or without swap, I can manually obtain the fail-fast behaviour by using systemd-oomd, oomd, or earlyoom, but I wonder why the reputation of swap differs so much from my experience (for context, I've mostly run systems for small and medium businesses, on machines mostly under twenty gigabytes of RAM).


This is the problem I explained in my other post: https://news.ycombinator.com/item?id=40697479

Enabling swap also solved the problem for me.


Yes. I had swap fully disabled on Windows, since there was 64GB of RAM and the system was still swapping by default, which was creating a lot of unnecessary disk activity and writes. In modern Windows (10+) the swapping behaviour is much more sane and that doesn't happen as much, but if you have a ton of RAM you can safely turn off swapping and force the OS to manage memory more aggressively.


Disabling swap on Windows means all of your applications' virtual memory has to fit in your physical memory. In Windows, any unused virtual memory still needs space in RAM, or it has to have space reserved in the page files.

Right now, if I look at my Firefox processes, they have a commit size (virtual memory) 10-20% larger than their private working set, sometimes a lot more. With page files, the unused virtual memory portion is simply reserved on the page file with minimal overhead. Without any page files, you are just wasting memory.

RE: unnecessary writes, it might be Windows proactively writing out the contents of memory (I think this happens, but I cannot confirm right now). But in general that's very low-priority I/O and it shouldn't affect your performance.


Firefox does not request private working set beyond what is required. Windows has had memory-request mechanisms for MANY generations that let applications specify whether they need high-priority private working set (generally "uninterruptible") or opportunistic commit. The understanding is that commit can be reclaimed after notification, but private working set cannot; instead you will be OOM-killed if that happens. It is totally fine to have 80% of your system RAM "used", but it should not be 80% private working set unless some processes are leaking or legitimately using it. There are also priority flags for memory use, and the kernel memory manager will notify and kill in that order.

Check out RAMMap from the absolutely excellent Sysinternals suite to see how your system memory is currently allocated, even per-process.

As an example, I have 64GB of RAM, showing ~40GB "used" as per Task Manager. Of that, however, only 22GB is active private process memory. The rest is memory-mapped files, standby, paged pool, nonpaged pool, shared, etc.

The issue with a page file - say there is also a 64GB page file - is that Windows notifies processes, and the memory manager considers the system to have "128GB" of memory, which many processes will take as a sign they can reserve more, causing an inflation of actual reserved RAM.

It is less of an issue now that the memory manager is tier aware and applications have ABIs to check and request memory in a more informed way.

Writes on an SSD are always an issue unless you're running SLC or similar high endurance flash.


Fedora changed to zram instead of disk swap by default back in 2020 with Fedora 33[1]. So no more swap-on-disk for one of the biggest Linux distributions. There was no major wailing nor gnashing of teeth. Most users didn't even notice.

For most users, swap is unnecessary and zram does the job better. For users with 8GiB or less RAM, swap is more likely to be useful (except for embedded systems running from SD cards like Raspberry Pis where swap will kill the card).

[1] https://fedoraproject.org/wiki/Changes/SwapOnZRAM#Benefit_to...


I'm old school. I usually maintain a swap that is 2x the size of RAM, unless RAM exceeds 16GiB, in which case I keep swap between 1x and 2x of RAM. This is particularly the case for mission-critical servers. Swap gives me half a chance if something hoses memory.


Where there is no disk. For example, a computer with no HDD booted from a read-only USB stick: a RAM disk embedded in the kernel for rootfs, writable directories mounted as tmpfs, everything running from memory.


Have not - personally - run into a situation (other than rescue disks, like old live Linux CDs) where there was no disk at all in the consumer space.

Only seen it for ESXi hosts in the enterprise space (and even there ... there was still a [small] local storage for each host)


I make systems like this for myself. Not suggesting they are popular or would be familiar to all readers. (Although I am aware of others who make them, too.)


I've seen so many good reasons to leave it enabled, and they're correct and well reasoned and many times I agree, enable swap ...

... but on my personal computer I'm not optimizing for long term stability or aggregate behavior. I'm optimizing for "it is fast when I use it, or it throws identifiable issues I can fix to get back to fast".

In that context, most software I use does not thrash with no swap (definitely not all! Java is particularly thrashy). It simply runs fine until it OOMs, which I can immediately see and address (by going and killing a Docker container I forgot about, 95% of the time).

I've gone back and forth quite a few times, and no-swap consistently gets me MUCH closer to the behavior I want. With swap enabled, I occasionally get thrashing that takes time to notice, fight through a slow UI, and fix, all of which I very much dislike while I'm doing it. With swap disabled, I get crashes, say "oh right", fix it without losing my focus, and almost never see a visible slowdown.

It's not just "emergency memory", there are definitely benefits I'm losing by doing this. But "emergency memory" is something it allows, and avoiding that behavior is worth losing everything else for what I want.


On audio workstations, swapping can lead to an interruption of the audio stream, which can manifest as garbled sound up to some very loud noise, not unlike an explosion in a game, and is potentially damaging to the ear depending on the listening level. So: limiters on all busses, and swap off. If your samples don't fit in memory, try another route.


On Windows you will eventually run into a program that expects page file to be enabled and available even if you have the RAM capacity.


I'm curious, what kind of software would care about that?


Windows doesn't overcommit memory, so without a page file the available virtual memory (the commit limit) is greatly reduced; you can "run out" of memory quite easily with certain applications that allocate but don't use the memory.
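
You can see this directly: the commit limit Windows reports via GlobalMemoryStatusEx is roughly physical RAM plus the page files, so with no page file it collapses to about the size of RAM. A small sketch (Windows only):

  # Query the system commit limit with the documented GlobalMemoryStatusEx call.
  import ctypes

  class MEMORYSTATUSEX(ctypes.Structure):
      _fields_ = [("dwLength", ctypes.c_ulong),
                  ("dwMemoryLoad", ctypes.c_ulong),
                  ("ullTotalPhys", ctypes.c_ulonglong),
                  ("ullAvailPhys", ctypes.c_ulonglong),
                  ("ullTotalPageFile", ctypes.c_ulonglong),   # commit limit: RAM + page files
                  ("ullAvailPageFile", ctypes.c_ulonglong),   # remaining commit
                  ("ullTotalVirtual", ctypes.c_ulonglong),
                  ("ullAvailVirtual", ctypes.c_ulonglong),
                  ("ullAvailExtendedVirtual", ctypes.c_ulonglong)]

  stat = MEMORYSTATUSEX()
  stat.dwLength = ctypes.sizeof(stat)
  ctypes.windll.kernel32.GlobalMemoryStatusEx(ctypes.byref(stat))
  print(f"physical RAM: {stat.ullTotalPhys / 2**30:.1f} GiB")
  print(f"commit limit: {stat.ullTotalPageFile / 2**30:.1f} GiB")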


I've let Windows auto-manage the pagefile size for my current system. It picked 19GiB total. It's empty. With 128GiB of RAM, that's 15% more memory, but the vast majority of that RAM is free anyway. 15% certainly isn't "greatly reduced" when you max out the RAM capacity in a desktop motherboard.

Lots of the arguments for swap/paging seem to ignore the possibility of just buying "overkill" amounts of RAM.

On the other hand, if you have truly enormous data sets you can RAID four M.2 SSDs to max out a PCIe 6.0 x16 slot at 121GiB/s of bandwidth for a multiple-TiB swap file. It'll be a while until SSDs get big enough for this to max out the 256TiB virtual address space of x86_64, but you can get 8TiB M.2 SSDs now...


I've got 128GB of RAM, and with a smaller page file I've run out of memory when doing LLM stuff mixed with other background workloads. System-managed is the way to go.


Do you really have to "disable" it to get rid of it?

I can't remember when I dabbled with a swap partition for the last time.

Now that I saw this post, I ran "swapon -s" on my laptop and on my servers. It comes back empty everywhere.

Why would I ever create a swap partition or file in the first place? Isn't it something from the past when RAM was scarce?


It's like a seat belt: if you never crash your car, you can live without one, but you're accepting instant death if you do crash anyway. I prefer seat belts, even though I also prefer not to crash my car.


Adding 10% more memory via swap will not save you in most cases.


I’ve always disabled swap on my servers. There’s a severe performance penalty when swap is used for a real workload, and that’s far more difficult to diagnose than OOM messages from the kernel. And in every case, swap or not, the resolution is the same. You either get a bigger box, or fix the application.


A few times in my career I've had a narrow use case for disabling swap on some systems. The most notable was trying to limit the interaction between garbage-collected runtimes and the latency introduced by page faults.

The TL;DR is that GC pauses were high enough to be causing problems in a particular service; at the time the swap was on rotating media, so latencies on a page fault were high, and this was especially a problem during mark-and-sweep operations, leading to long pauses. Due to other GC tuning and the overall IO workload, disabling swap made sense for those particular servers.

But if I had to do it today the situation might be very different: I might have NVMe drives to drop swap onto, with much better access latencies, or be able to use mlock to lock critical pages and prevent swapping, etc.

Also, there are some very clear problems introduced by disabling swap, especially on NUMA systems. Again, the particular times I disabled swap, we were able to lock processes onto their relevant NUMA nodes and do other tuning to make this worthwhile.

So as a general rule, especially with modern hardware, I would agree that it isn't recommended. However, you can probably find narrow use cases where amongst a number of other tunings it makes sense to drop swap. Also, there are plenty of other things you likely want to tune before removing swap.


Not really. If there’s no swap, then at some point programs will just die when you run out of RAM. Once things start going to swap regularly you might notice a slowdown and have a chance to do something about it rather than kill a process unexpectedly.

For server workloads, you probably never want to use the swap, but it’s safest to have it enabled because you don’t know which process is going to get killed when memory runs out. You can monitor for swap usage and tweak your settings appropriately.


I'd rather just monitor the memory usage and react (= get alerts) if it goes too high. The performance impact of actually swapping is frequently more visible than an occasional OOM kill.



Sometimes I've not bothered to set it up and usually I don't notice.


I tend to leave a small swap volume (1-4GB) on my systems where RAM is much, much bigger (32-64GB, I mean), more out of habit than for any reasoned technicality. Normally it is 100% unused 99+% of the time.


Only if you have burst of requests that need to be replied with constant latency and therefore want to avoid any risk of having to read back swapped out memory.

It's a very rare corner case.


If you are running a fleet of thousands of hosts in an environment with auto-scaling capabilities, your swap will never do anything except under extraordinary burst load, e.g. DDOS.

In that situation, swap provides exactly the wrong thing - it attempts to keep up with memory demand by doing a more expensive and slower version of regular processing. That’s a key ingredient in the recipe for congestion collapse.


Depends on the workload characteristics. If there’s a chance one request could balloon into a huge RAM consumer, you want to be able to survive that without running out of RAM. But if you are hitting swap regularly across your fleet, it’s time to tweak your scaling parameters. But if you never ever hit it, you may be overscaled.


Thank you!


Depends on how much swap you have. If you have a little (0.5-1 GB), you'll catch the extraordinary bursts that are handleable, without getting to the point where it thrashes too hard and finally dies. And you have a handy-dandy thing to monitor that shows when you're actually under memory pressure.


Sure swap isn't helping, but having random processes being OOM-killed isn't either …


It does, if it helps you highlight an issue somewhere: sizing, rate limiting, or whatever.


If you are relying on failures and crashes instead of proactively monitoring your workload, you are already taking the wrong approach, swap or no swap.


Yeah, but shit happens. The OOM killer plus the load balancer ejecting the host is better than 0.03% of requests mysteriously failing for weeks.


On servers, unless you have a very fast SSD dedicated to swap, all swap gets you is waiting an extra long time to figure out that the operation you're trying to do is going to fail anyway. The number of tasks that need just a little more RAM after exhausting the dozens to thousands of gigabytes modern computers have are vanishingly few.


> all swap gets you is waiting an extra long time to figure out that the operation you're trying to do is going to fail anyway

No, it gives you progressive performance degradation instead of a hard failure and this is a feature and not a bug.


> the dozens to thousands of gigabytes modern computers have

Hello time traveler, what year do you live in to have regular servers with “dozens of thousands” of GB…


"to" not "of"

Though if you have $400/hour to spend, you can get an u7in-32tb.224xlarge in AWS us-east-1 region.


If you need constant latency you already need to avoid allocation altogether after the start of the program (among other things) so swapping should never occur in the first place.


> swapping should never occur in the first place

No. 99.9% of servers are running more processes than the main workload. There are all sorts of security, monitoring, auditing, inventory tools that might run occasionally.

As I wrote, it's a rare corner case.


I usually turn it off to save disk space. I dunno, with 32GB of RAM I just don't see how the OS should run out of memory, and I don't really want Windows burning out my SSD.


Yes, if you absolutely crave CPU performance for tasks that don’t fill all the RAM.

You can set swappiness to 0 on Linux so you’ll only swap in emergencies. Better than crashing.


Swap is required for hibernate.




