Postgres Parallel Query Troubleshooting

Brian Pace

Postgres' ability to execute queries in parallel is a powerful feature that can significantly improve query performance, especially on large datasets. However, like all resources, parallel workers are finite. When there aren't enough available workers, Postgres may downgrade a parallel query to serial (non-parallel) execution. This is a reasonable trade-off, unless the downgraded query's response time far exceeds what the application requires.

While helping our clients with Oracle to PostgreSQL migrations, we find that query downgrading is a common challenge. Oracle 11.2 introduced a feature called "Parallel Statement Queuing," which prevents downgrades by queuing parallel queries until enough PX (parallel execution) servers are available to handle the request.

This post explores how parallel queries work, what triggers downgrades, and how you can monitor and optimize parallel worker usage to prevent performance bottlenecks. We'll also explore a sample solution that mirrors Oracle's Parallel Statement Queuing feature.

How Parallel Queries Work

When PostgreSQL executes a query in parallel, it divides the work of one or more query nodes (tasks) across multiple processes called parallel workers. These workers cooperate to process parts of the data simultaneously, reducing query time for operations like scans, joins, and aggregations. The database allocates parallel workers up to the maximum defined by the max_parallel_workers configuration setting. If the planned workers cannot be allocated at execution time, the query runs with fewer workers than planned, possibly none, which effectively downgrades it to serial execution.
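
To see the worker-related settings in effect on your instance, you can query pg_settings (a quick check; the values shown will vary by configuration):

-- Inspect the settings that govern parallel worker allocation.
SELECT name, setting
FROM pg_settings
WHERE name IN ('max_worker_processes',
               'max_parallel_workers',
               'max_parallel_workers_per_gather');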

Causes of Parallel Query Downgrades

There are a few key reasons why a parallel query may be downgraded:

  • Exhausted Worker Pool: PostgreSQL has a limit on the total number of parallel workers it can spawn, controlled by the max_parallel_workers parameter. If this limit is reached, new parallel queries cannot get the workers they need and may fall back to serial execution.
  • Per-Query Worker Limit: Even if workers are available in the pool, each Gather node in a query plan is capped by the max_parallel_workers_per_gather setting. A low value reduces or eliminates parallelism for an individual query regardless of how many workers are free.
  • Busy Workload: In a busy system where many queries are requesting parallel workers, competition for resources may lead PostgreSQL to downgrade some queries to avoid overloading the system.
  • Optimizer Stats: Stale or inaccurate statistics on the table and indexes can lead the optimizer to choose a serial execution path over a parallel one (a quick way to test this is shown below).
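
If you suspect the planner is rejecting parallelism on cost grounds rather than running out of workers, temporarily lowering the parallel cost parameters in a session is a quick experiment (session-level settings for testing, not recommended production defaults; your_table is a placeholder):

-- Make parallel plans cheaper in this session so the planner is more
-- willing to choose them, then re-examine the plan.
-- Note: your_table is a placeholder for the table under investigation.
SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;
EXPLAIN SELECT * FROM your_table;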

Simulating the Issue

Create a Large Table

The following SQL creates a large table that we will query to demonstrate the benefit of parallel execution and the impact of a downgrade.

CREATE TABLE large_table AS
SELECT generate_series(1, 10000000) AS id,
       random() * 1000 AS value;
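
Depending on autovacuum timing, the new table may not have up-to-date statistics yet, so it can help to collect them manually before testing, making the plans reproducible:

-- Collect planner statistics for the new table.
ANALYZE large_table;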

Run a Parallel Query

SET max_parallel_workers_per_gather = 4;

EXPLAIN (ANALYZE)
SELECT * FROM large_table WHERE value > 500 ORDER BY id DESC;

The query above, with 4 parallel workers, runs in less than 3 seconds. Below is the execution plan returned by EXPLAIN (ANALYZE):

Gather Merge  (cost=233180.33..827680.90 rows=4965151 width=12) (actual time=1683.415..2367.503 rows=5000337 loops=1)
  Workers Planned: 4
  Workers Launched: 4
  ->  Sort  (cost=232180.27..235283.49 rows=1241288 width=12) (actual time=1653.855..1743.384 rows=1000067 loops=5)
        Sort Key: id DESC
        Sort Method: external merge  Disk: 25640kB
        Worker 0:  Sort Method: external merge  Disk: 25136kB
        Worker 1:  Sort Method: external merge  Disk: 25616kB
        Worker 2:  Sort Method: external merge  Disk: 25496kB
        Worker 3:  Sort Method: external merge  Disk: 25536kB
        ->  Parallel Seq Scan on large_table  (cost=0.00..85327.28 rows=1241288 width=12) (actual time=0.014..191.271 rows=1000067 loops=5)
              Filter: (value > '500'::double precision)
              Rows Removed by Filter: 999933
Planning Time: 0.215 ms
Execution Time: 2511.247 ms
(15 rows)

Now execute the query again, this time with parallel query disabled to simulate a downgrade.

SET max_parallel_workers_per_gather = 0;

EXPLAIN (ANALYZE)
SELECT * FROM large_table WHERE value > 500 ORDER BY id DESC;

With parallel query disabled, the query response time is now just over 10 seconds. In this simple example, the extra 8 seconds may not seem like a big deal. However, imagine that the difference in a real-world workload is not 8 seconds but 8 minutes. This is the type of performance degradation we want to prevent. The output below shows the non-parallel execution plan.

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Sort  (cost=816132.49..828545.37 rows=4965151 width=12) (actual time=10314.880..10699.870 rows=5000337 loops=1)
  Sort Key: id DESC
  Sort Method: external merge  Disk: 127368kB
  ->  Seq Scan on large_table  (cost=0.00..179069.14 rows=4965151 width=12) (actual time=0.012..691.074 rows=5000337 loops=1)
        Filter: (value > '500'::double precision)
        Rows Removed by Filter: 4999663
Planning Time: 0.199 ms
Execution Time: 10872.894 ms
(8 rows)

How to Detect Downgraded Parallel Queries

When a parallel query is downgraded to a serial execution, it can result in longer query times. Fortunately, PostgreSQL provides several ways to identify such downgrades.

Using EXPLAIN (ANALYZE)

Reviewing the execution plans above, notice that the downgraded query's plan lacks parallel nodes.

If the query is serial, the output will look like this:

Seq Scan on large_table  (cost=0.00..179069.14 rows=4965151 width=12) (actual time=0.012..691.074 rows=5000337 loops=1)

But a parallel query would have entries like:

Parallel Seq Scan on large_table  (cost=0.00..85327.28 rows=1241288 width=12) (actual time=0.014..191.271 rows=1000067 loops=5)
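
There is one more distinction worth noting: when the planner chooses a parallel plan but the worker pool is exhausted at execution time, the plan still contains parallel nodes, yet fewer workers launch than were planned. In EXPLAIN (ANALYZE) output this shows up as a mismatch between the two worker counters (an illustrative fragment, not from the runs above):

Gather Merge  (...)
  Workers Planned: 4
  Workers Launched: 0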

Monitoring Parallel Workers in Use

You can query the pg_stat_activity view to see the number of running parallel workers:

SELECT COUNT(1) AS running_workers
FROM pg_stat_activity
WHERE backend_type = 'parallel worker';

Additionally, you can compare it against the total allowed workers:

SELECT current_setting('max_parallel_workers')::int AS max_workers;

If running_workers approaches max_workers, new parallel queries may be downgraded.
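
The two checks can be combined into a single query (a simple sketch):

-- Compare running parallel workers against the configured ceiling.
SELECT COUNT(*) AS running_workers,
       current_setting('max_parallel_workers')::int AS max_workers
FROM pg_stat_activity
WHERE backend_type = 'parallel worker';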

Solutions for Parallel Query Downgrades

One solution that we have implemented with customers is a retry function that mirrors the behavior of Oracle's Parallel Statement Queuing. The function checks that a certain percentage of parallel workers is available before executing the desired statement. If not enough workers are available, the function sleeps and rechecks for a specified period of time. An example of this procedure can be found here.
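
As an illustration, below is a minimal sketch of such a retry procedure. The name run_when_workers_free and its parameters are hypothetical, and the thresholds are examples; a production version would need error handling and logging appropriate to your environment:

-- A minimal sketch: wait until a given percentage of max_parallel_workers
-- is free, then run the supplied statement; stop waiting after a timeout.
CREATE OR REPLACE PROCEDURE run_when_workers_free(
    p_sql           text,
    p_free_pct      numeric DEFAULT 50,  -- required percentage of free workers
    p_max_wait_secs numeric DEFAULT 60,  -- give up waiting after this long
    p_poll_secs     numeric DEFAULT 1    -- delay between checks
)
LANGUAGE plpgsql
AS $$
DECLARE
    v_max    integer := current_setting('max_parallel_workers')::integer;
    v_busy   integer;
    v_waited numeric := 0;
BEGIN
    LOOP
        -- Count parallel workers currently running across the cluster.
        SELECT count(*) INTO v_busy
        FROM pg_stat_activity
        WHERE backend_type = 'parallel worker';

        -- Proceed once enough workers are free, or once we time out.
        EXIT WHEN (v_max - v_busy) >= ceil(v_max * p_free_pct / 100.0)
               OR v_waited >= p_max_wait_secs;

        PERFORM pg_sleep(p_poll_secs);
        v_waited := v_waited + p_poll_secs;
    END LOOP;

    EXECUTE p_sql;
END;
$$;

Call it with the statement to run, for example:

CALL run_when_workers_free('SELECT count(*) FROM large_table WHERE value > 500');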

Conclusion

Parallel queries in PostgreSQL offer great performance benefits, but they rely on the availability of parallel workers. When workers are in short supply, the database gracefully downgrades queries to ensure system stability, though at the cost of performance. By understanding how parallel query downgrades occur, you can better manage parallel workloads and minimize the impact on your system.

Careful tuning, monitoring, and query optimization will help you get the most out of PostgreSQL's parallel query feature—without running into unexpected slowdowns.
