> A database has much more information than the operating system about what, how and when to cache information
Yes, on a dedicated server. However, many DB engine instances run on non-dedicated servers, for example alongside a web server flanked with various processes that read the local filesystem or use RAM (Varnish, memcached...), plus frequently-run tasks (tempfile purges, log aggregation, monitoring probes, an MTA...). In such a case, letting the DB engine grab too much RAM shrinks the OS buffer cache and thereby reduces global efficiency, which may (all other things being equal) imply more 'read' operations, hurting overall performance.
Great point. Selecting the RDBMS page cache size is a key performance parameter that is nearly impossible to get right on a mixed-use host, whether a non-dedicated server or a client desktop/laptop. SQL Anywhere, which emphasizes zero administration, has long supported Dynamic Cache Sizing [1] specifically for this mixed-use case, which is/was its bread and butter. I don't know whether any other RDBMSes do the same (MS SQL?).
As a side note, Apache Arrow's main use case is similar: a column-oriented data store shared by one or more client processes (Python, R, Julia, Matlab, etc.) on the same general-purpose host. This is also now a key distinction between the Apple M1, with its big.LITTLE-style ARM SoC, and the Amazon Graviton, built for server-side virtualized/containerized instances. We should not conflate the two use cases, and should understand that the best solution for one may not be the best for the other.
Enabling some coopetition among caches by periodically adjusting their sizes, granting more RAM to the most efficient one (highest hit/miss ratio) while shrinking the least efficient one, may let the whole system 'walk' toward the sweet spot where all caches stay near their peak efficiency, or at least alleviate avoidable thrashing.
A first model might rely on memory ballooning and a PID loop. I could not find any pertinent software.
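To make the idea concrete, here is a toy sketch of the proportional ('P') part of such a loop, assuming two caches that expose hit/miss counters and a resizable RAM budget. All names here (`Cache`, `rebalance`, the budgets) are hypothetical illustrations, not an existing API:

```python
class Cache:
    """Tracks a RAM budget and hit/miss counters for one cache."""
    def __init__(self, name, budget_mb):
        self.name = name
        self.budget_mb = budget_mb
        self.hits = 0
        self.misses = 0

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

    def reset_counters(self):
        self.hits = self.misses = 0


def rebalance(a, b, gain=0.1, min_mb=64):
    """One proportional step: shift RAM toward the cache with the
    higher hit ratio, in proportion to the ratio gap. This is only
    the 'P' term of a PID loop; 'I' and 'D' are omitted for brevity."""
    error = a.hit_ratio() - b.hit_ratio()   # positive: a is more efficient
    transfer = gain * error * (a.budget_mb + b.budget_mb)
    # Clamp so neither cache drops below a floor (avoids starving one side).
    if transfer > 0:
        transfer = min(transfer, b.budget_mb - min_mb)
    else:
        transfer = max(transfer, min_mb - a.budget_mb)
    a.budget_mb += transfer
    b.budget_mb -= transfer
    # Start a fresh measurement window for the next period.
    a.reset_counters()
    b.reset_counters()
    return transfer


# Example period: the DB buffer cache is hitting 90%, the OS page
# cache only 60%, so RAM flows toward the DB side.
db = Cache("db_buffer", 1024)
fs = Cache("os_pagecache", 1024)
db.hits, db.misses = 900, 100
fs.hits, fs.misses = 600, 400
moved = rebalance(db, fs)
print(f"moved {moved:.0f} MB toward {db.name}; "
      f"budgets now {db.budget_mb:.0f}/{fs.budget_mb:.0f} MB")
```

A real controller would add the integral and derivative terms to damp oscillations, and would actually enforce the budgets, e.g. through memory ballooning or cgroup limits.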
> We should not conflate the two use-cases
Exactly the point! Most hints/principles aren't generic, yet they are stated with insufficient description of their pertinent contexts, especially on online boards and chat channels. This affects most 'best practices' and code snippets.