Explaining Your Postgres Query Performance
In a previous post, I talked about pg_stat_statements as a tool for helping direct your query optimization efforts. Now let's say you've identified some queries you want to look into. The EXPLAIN command helps you look even closer into an individual query. If you're already proficient in EXPLAIN, great! Read on for an easy refresher. If you're less familiar with it, this will be a (hopefully) gentle introduction on what insights it might help provide.
I'm going to demonstrate simple EXPLAIN usage with a Geonames country dataset for the United States. There are over 2 million records in this set, and I'm hoping that's a good table size to clearly demonstrate how EXPLAIN is useful.
For those newer to SQL, it's considered a declarative language (although modern DBMS's have enhanced SQL to accommodate procedural programming too). You tell the database server what you want, and it does the job of determining the best way to return you those results. In PostgreSQL, the planner puts together a plan for executing a query based on "query structure and the properties of the data," among other factors. EXPLAIN lets you see that plan. While EXPLAIN is a Postgres-specific command, other RDBMS's will have similar tools.
Let's get to some examples to better see what EXPLAIN does.
Geonames for United States
If you want to follow along with the example, I'm using Postgres 13 and connecting to the database via psql
on WSL2. We'll only be using simple query examples, so even if you're still starting out with Postgres you'll be able to easily understand what the queries are meant to return.
You can grab the US.zip archive from their download server.
Here's the list of data attributes, but I've grabbed only a few of them to load into my database. (If you need to work with a smaller dataset, there are more countries available in the Geonames server.) I've imported the data into a table defined like so:
CREATE TABLE us_geonames (
id integer,
name text,
lat float8,
lon float8,
type text
);
How to use EXPLAIN
To use the EXPLAIN command, you tack on EXPLAIN
before the statement you want to run:
EXPLAIN SELECT name FROM customers WHERE id = 1;
This will return the estimated plan and cost, in plain text by default.
You can also add some options that return slightly different output. A common one is ANALYZE
(which I'll use throughout this post). In addition to the estimated plan and statistics, it will go ahead and run the query and give you the actual run statistics. Or, you can also change the output format to, say, JSON.
(Tip: Other Postgres clients such as pgAdmin can also show you the query plan in a graphical format.)
EXPLAIN ANALYZE will actually run the query, so be careful with updates or deletes! In those cases, consider not using ANALYZE, or you might perhaps wrap the entire statement in a transaction that you can roll back.
Querying with text
columns
I'm starting out with a query that has a WHERE clause (in this case I know this should return exactly one row):
us=# EXPLAIN ANALYZE SELECT * from us_geonames WHERE name = 'Tampa International Airport';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Gather (cost=1000.00..30062.35 rows=7888 width=84) (actual time=37.164..192.388 rows=1 loops=1)
Workers Planned: 3
Workers Launched: 3
-> Parallel Seq Scan on us_geonames (cost=0.00..28273.55 rows=2545 width=84) (actual time=93.085..130.232 rows=0 loops=4)
Filter: (name = 'Tampa International Airport'::text)
Rows Removed by Filter: 559758
Planning Time: 0.052 ms
Execution Time: 192.413 ms
(8 rows)
I won't go into the details of every specific thing we see in the EXPLAIN output, but let's walk through some general concepts:
- The tree structure. The indented statements that start with an
→
mean that the following operation is a child node of the outer/parent node. For my query, the topmost plan node is a Gather, and Parallel Sequential Scan is a child node of Gather.
The output from the child node is fed as input to the outer node. For our purposes, don't worry too much about how that works exactly. The point is that the tree structure helps you immediately tell that there are multiple steps or levels to the execution plan. - Planning and execution times. I used the ANALYZE option, so we also see how much time it took to generate the query plan, and how much time was spent executing the query (this doesn't include the planning time).
- Estimated statistics. The first set of parentheses on each node line or summary provides estimates made from the query plan.
The topmost or root node gives totals that include those from the child nodes. Each child node or step will have its own statistics, including any child nodes it may also have.
Thecost
estimate isn't associated with actual units, so don't think of it in terms of time or any specific resources. The plan also provides the estimated number ofrows
that will be returned or match the condition. - Actual run statistics. With ANALYZE, I get a second set of parentheses. This provides the actual
time
spent (in milliseconds) in that node, as well as the actualrows
returned. Additionally,loops
will show you if that particular node was executed more than once.
In our example, you may have noticed that the estimated7888
rows is a little off from the actual rows returned. This is something you might look out for or might be a flag.
If you were curious about the "Workers" part of the output: this is when the server is able to carry out parallel queries to help speed things up. There are settings that can be controlled that affect parallel queries (for example, I have max_parallel_workers
set to 4). You'll see Gather or Gather Merge plan nodes with parallel queries.
So the gist of the above plan is, there are sequential scans (i.e. scan the table, row by row from first to last) that are happening in parallel, searching for matches on the filter criteria name = 'Tampa International Airport'
. The work to scan the entire table is essentially divvied up between 3 asynchronous threads. The output from the parallel scans is gathered to produce the result set.
There's only one Tampa International Airport, so it seems like a waste of time to scan each row to find the matching record (and we might not know if that record is towards the top of the table, or nearer the end). The planner at least determined that parallel queries would help. I didn't try it "nonparalleled", though I'd imagine that would go even slower.
We'll try a few more examples. How about pattern matching?
us=# EXPLAIN ANALYZE SELECT * FROM us_geonames WHERE name like '%Tampa%';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Gather (cost=1000.00..31962.05 rows=217 width=44) (actual time=22.871..209.757 rows=467 loops=1)
Workers Planned: 3
Workers Launched: 3
-> Parallel Seq Scan on us_geonames (cost=0.00..30940.35 rows=70 width=44) (actual time=56.135..164.358 rows=117 loops=4)
Filter: (name ~~ '%Tampa%'::text)
Rows Removed by Filter: 559641
Planning Time: 0.141 ms
Execution Time: 209.810 ms
This turned out largely the same as the above, except this time the Filter criteria is for name values that contain the string "Tampa". It also looks like the estimated vs. actual rows is less skewed.
Let's see a sort:
us=# EXPLAIN ANALYZE SELECT * FROM us_geonames ORDER BY type;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Gather Merge (cost=122640.75..379045.89 rows=2166804 width=44) (actual time=669.975..1752.805 rows=2239032 loops=1)
Workers Planned: 3
Workers Launched: 3
-> Sort (cost=121640.71..123446.38 rows=722268 width=44) (actual time=597.224..747.053 rows=559758 loops=4)
Sort Key: type
Sort Method: external merge Disk: 37016kB
Worker 0: Sort Method: external merge Disk: 34768kB
Worker 1: Sort Method: external merge Disk: 28584kB
Worker 2: Sort Method: external merge Disk: 26688kB
-> Parallel Seq Scan on us_geonames (cost=0.00..29134.68 rows=722268 width=44) (actual time=0.025..80.148 rows=559758 loops=4)
Planning Time: 0.049 ms
Execution Time: 1883.467 ms
(12 rows)
Looks similar to the first plan, except there is now a Sort node with the sequential scans as child node. Gather Merge is used with sorted data.
And let's try changing things up a bit so that we're grouping, using an aggregate function, and sorting:
us=# EXPLAIN ANALYZE SELECT type, COUNT(*) FROM us_geonames GROUP BY 1 ORDER BY 2;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=33818.27..33818.66 rows=156 width=12) (actual time=529.629..532.138 rows=416 loops=1)
Sort Key: (count(*))
Sort Method: quicksort Memory: 44kB
-> Finalize GroupAggregate (cost=33753.31..33812.59 rows=156 width=12) (actual time=528.715..531.991 rows=416 loops=1)
Group Key: type
-> Gather Merge (cost=33753.31..33808.69 rows=468 width=12) (actual time=528.705..531.654 rows=1250 loops=1)
Workers Planned: 3
Workers Launched: 3
-> Sort (cost=32753.27..32753.66 rows=156 width=12) (actual time=464.698..464.732 rows=312 loops=4)
Sort Key: type
Sort Method: quicksort Memory: 40kB
Worker 0: Sort Method: quicksort Memory: 39kB
Worker 1: Sort Method: quicksort Memory: 39kB
Worker 2: Sort Method: quicksort Memory: 39kB
-> Partial HashAggregate (cost=32746.02..32747.58 rows=156 width=12) (actual time=464.298..464.383 rows=312 loops=4)
Group Key: type
Batches: 1 Memory Usage: 77kB
Worker 0: Batches: 1 Memory Usage: 77kB
Worker 1: Batches: 1 Memory Usage: 77kB
Worker 2: Batches: 1 Memory Usage: 77kB
-> Parallel Seq Scan on us_geonames (cost=0.00..29134.68 rows=722268 width=4) (actual time=0.004..117.760 rows=559758 loops=4)
Planning Time: 0.066 ms
Execution Time: 532.207 ms
(23 rows)
As you might have guessed, this plan is more complex than the previous ones. Basically, the query has to aggregate and sort rows multiple times to return the desired results. We won't worry about HashAggregate or GroupAggregate in this post, but check out this glossary with some quick explanations for operations or other terms you might see with EXPLAIN. This is just another demonstration of the tree structure and how each step relates to the next.
Use EXPLAIN ANALYZE to see query performance after indexing
So there were at least a couple of things that we saw which indicate that the table might benefit from an index or two: 1) that sequential scan to return just one row, and 2) estimated number of rows being quite a bit off from actual returned rows.
Figuring out when to index, and which things to index is an entire discussion on its own. For this post let's just go ahead and add two indexes, and then check to see if EXPLAIN ANALYZE tells us anything new.
CREATE INDEX us_geonames_name_idx on us_geonames (name);
CREATE INDEX us_geonames_type_idx on us_geonames (type);
us=# EXPLAIN ANALYZE SELECT * from us_geonames WHERE name = 'Tampa International Airport';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on us_geonames (cost=4.52..47.91 rows=11 width=44) (actual time=0.035..0.036 rows=1 loops=1)
Recheck Cond: (name = 'Tampa International Airport'::text)
Heap Blocks: exact=1
-> Bitmap Index Scan on us_geonames_name_idx (cost=0.00..4.51 rows=11 width=0) (actual time=0.030..0.030 rows=1 loops=1)
Index Cond: (name = 'Tampa International Airport'::text)
Planning Time: 0.214 ms
Execution Time: 0.074 ms
(7 rows)
We can see right away that after adding the index, the planner pretty much went "Oh, let's actually not scan the table from top to bottom." With the Bitmap Index Scan, Postgres uses the index we added for the name
column to first find out the location of any matching rows, and then retrieves those rows in the Bitmap Heap Scan.
You might notice that the planning time is longer than the execution time. Still, it's that many times faster than the original plan.
us=# EXPLAIN ANALYZE SELECT * FROM us_geonames WHERE name like '%Tampa%';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Gather (cost=1000.00..31962.05 rows=217 width=44) (actual time=25.249..229.884 rows=467 loops=1)
Workers Planned: 3
Workers Launched: 3
-> Parallel Seq Scan on us_geonames (cost=0.00..30940.35 rows=70 width=44) (actual time=33.100..183.968 rows=117 loops=4)
Filter: (name ~~ '%Tampa%'::text)
Rows Removed by Filter: 559641
Planning Time: 0.123 ms
Execution Time: 229.944 ms
(8 rows)
Isn't that interesting? This plan is essentially the same, meaning that the planner chose not to use the index. Just because we added an index doesn't mean it'll be used! In this scenario, it has to do with the pattern we're trying to search on -- we just created a B-Tree index (the default kind), but it's not actually helpful for this kind of query.
us=# EXPLAIN ANALYZE SELECT * FROM us_geonames ORDER BY type;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using us_geonames_type_idx on us_geonames (cost=0.43..128729.96 rows=2239032 width=44) (actual time=0.054..898.211 rows=2239032 loops=1)
Planning Time: 0.137 ms
Execution Time: 1007.902 ms
(3 rows)
This plan doesn't contain any child nodes: it straight up uses the index (whew) for the type
column to do the sorting.
Finally:
us=# EXPLAIN ANALYZE SELECT type, COUNT(*) FROM us_geonames GROUP BY 1 ORDER BY 2;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=28286.27..28286.66 rows=156 width=12) (actual time=269.117..269.259 rows=416 loops=1)
Sort Key: (count(*))
Sort Method: quicksort Memory: 44kB
-> Finalize GroupAggregate (cost=1000.49..28280.58 rows=156 width=12) (actual time=59.070..269.029 rows=416 loops=1)
Group Key: type
-> Gather Merge (cost=1000.49..28275.90 rows=624 width=12) (actual time=59.057..268.624 rows=619 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Partial GroupAggregate (cost=0.43..27201.52 rows=156 width=12) (actual time=1.189..181.774 rows=124 loops=5)
Group Key: type
-> Parallel Index Only Scan using us_geonames_type_idx on us_geonames (cost=0.43..24401.17 rows=559758 width=4) (actual time=0.036..90.309 rows=447806 loops=5)
Heap Fetches: 0
Planning Time: 0.080 ms
Execution Time: 269.316 ms
(14 rows)
This one also looks similar to the pre-index plan, although it does at least use the index, and we see that the execution time is basically cut in half.
EXPLAIN (ANALYZE) is a friend that tells it like it is
EXPLAIN can get kind of intimidating, especially if you're like me (not a DBA nor an advanced Postgres user). But if you stick to some core ideas, you'll eventually become more adept at processing this information to understand the potential prickly bits in your queries:
- The query plan is a tree structure that shows you each inner or child node that feeds into an outer or root node.
- EXPLAIN alone shows estimates. With ANALYZE, you'll actually run the query and see the time it took to create the query plan, plus the time it took to execute the query according to that plan.
- Easy things to look out for to try and diagnose a query are estimated vs actual rows, and perhaps table scans. Sequential scans are not always bad nor slow (it depends on the query, e.g. how restrictive a WHERE clause is), but there might be an opportunity there to optimize.
I should of course note that the query planner relies on statistics collected by Postgres about the system. Those might need looking at and updating if you're not really seeing what you'd expect.
This post showed you how to jump into the EXPLAIN command with some simple examples. For an even more hands-on experience, we have a 15-minute lesson in our Learning Portal--this time with geospatial data and indexes. And finally, I also recommend checking out my colleague Jonathan Katz's post on BRIN indexes for another look into how EXPLAIN helps shine a light on query performance.
Related Articles
- Postgres Tuning & Performance for Analytics Data
19 min read
- Running an Async Web Query Queue with Procedures and pg_cron
6 min read
- Name Collision of the Year: Vector
9 min read
- Sidecar Service Meshes with Crunchy Postgres for Kubernetes
12 min read
- pg_incremental: Incremental Data Processing in Postgres
11 min read