Optimize PostgreSQL Server Performance Through Configuration
By design, the out of the box configuration for PostgreSQL is defined to be a "Jack of All Trades, Master of None". The default configuration for PostgreSQL is fairly painstakingly chosen to ensure that it will run on every environment it is installed, meeting the lowest common denominator resources across most platforms.
Because of this, it's always recommended that one of the first actions performed once an install of PostgreSQL is completed, would be to tune and configure some high-level settings.
There are four high-level settings which will be discussed here: shared_buffers
, wal_buffers
, effective_cache_size
, and maintenance_work_mem
.
Let's begin with shared_buffers
.
shared_buffers
PostgreSQL uses 'double buffering', meaning that PostgreSQL uses its own internal buffer as well as kernel buffered IO. In short, this means that data is stored in memory twice.
The PostgreSQL buffer is named shared_buffers
and it defines how much dedicated system memory PostgreSQL will use for cache.
Because of PostgreSQL's design choice to ensure compatibility on all supported machines and operating systems, this value is set conservatively low by default. As such, updating the shared_buffers
is one of the settings which will be the most effective in improving overall performance on most modern operating systems.
There is not one specific recommended value for shared_buffers
, but the calculation to determine the value for a particular system is not especially difficult.
Generally speaking, the value for shared_buffers
should be roughly 25% of the total system RAM for a dedicated DB server. The value for shared_buffers
should never be set to reserve all of the system RAM for PostgreSQL. A value over 25% of the system RAM can be useful if, for example, it is set such that the entire database working set of data can fit in cache, as this would greatly reduce the amount of time reading from disk.
Alternately, while a larger shared_buffers
value can increase performance in 'read heavy' use cases, having a large shared_buffer
value can be detrimental for 'write heavy' use cases, as the entire contents of shared_buffers
must be processed during writes.
wal_buffers
Write-Ahead Logging (WAL) is a standard method for ensuring integrity of data. Much like in the shared_buffers
setting, PostgreSQL writes WAL records into buffers and then these buffers are flushed to disk.
The default size of the buffer is set by the wal_buffers
setting- initially at 16MB. If the system being tuned has a large number of concurrent connections, then a higher value for wal_buffers
can provide better performance.
effective_cache_size
effective_cache_size
has the reputation of being a confusing PostgreSQL settings, and as such, many times the setting is left to the default value.
The effective_cache_size
value provides a 'rough estimate' of the number of how much memory is available for disk caching by the operating system and within the database itself, after taking into account what's used by the OS itself and other applications.
This value is used only by the PostgreSQL query planner to figure out whether plans it's considering would be expected to fit in RAM or not. As such, it's a bit of a fuzzy number to define for general use cases.
A conservative value for effective_cache_size
would be 1/2 of the total memory available on the system. Most commonly, the value is set to 75% of the total system memory on a dedicated DB server, but can vary depending on the specific discrete needs on a particular server workload.
If the value for effective_cache_size
is too low, then the query planner may decide not to use some indexes, even if they would help greatly increase query speed.
work_mem
The value of work_mem
is used for complex sort operations, and defines the maximum amount of memory to be used for intermediate results, such as hash tables, and for sorting.
When the value for work_mem
is properly tuned, then the majority of sort actions are performed in the much-faster memory, rather than being written and read to disk.
However, it's important to ensure that the work_mem
value is not set too high, as it can 'bottleneck' the available memory on the system as the application performs sort operations. In this case, for example, the system will try to allocate. work_mem
several times over for each concurrent sort operation.
Because of this important caveat, it's ideal to set the global value for work_mem
at a relatively low value, and then alter any specific queries themselves to use a higher work_mem
value:
SET LOCAL work_mem = '256MB';
SELECT * FROM db ORDER BY LOWER(name);
maintenance_work_mem
While work_mem
specifies how much memory is used for complex sort operations, maintenance_work_mem
specifies how much memory is used for routine maintenance tasks, such as VACUUM, CREATE INDEX, and similar.
Unlike work_mem
, however, only one of these maintenance operations can be executed at a time by a database session. As a result, most systems do not have many of these processes running concurrently, so it's typically safe to set this value much larger than work_mem
, as the larger amounts of available memory could improve the performance of vacuuming and database dump restores.
The default value for maintenance_work_mem
is 64MB.
Wrapping Up
Using a tool such as https://pgtune.leopard.in.ua/#/ to craft an initial configuration is worthwhile, but the key to getting the absolute best performance is benchmarking your workload and comparing against a known baseline.
It's also important to remember that even the most well tuned database cannot salvage poorly formed queries. Developers creating applications which interface with the database need to be mindful of how queries are written.
If a query performs heavy joins or other expensive aggregate operations, or if a query is performing a full table scan where an index could be used, it will nearly always perform poorly, no matter how well the database settings are tuned.
We hope that the brief explanations above provide enough insight to allow you go forth and tune your PostgreSQL installs! We're also here to help with PostgreSQL support and to help troubleshoot any PostgreSQL performance issues you may come across.
Related Articles
- Sidecar Service Meshes with Crunchy Postgres for Kubernetes
12 min read
- pg_incremental: Incremental Data Processing in Postgres
11 min read
- Smarter Postgres LLM with Retrieval Augmented Generation
6 min read
- Postgres Partitioning with a Default Partition
16 min read
- Iceberg ahead! Analyzing Shipping Data in Postgres
8 min read