Red Green Repeat Adventures of a Spec Driven Junkie

Postgresql's EXPLAIN and EXPLAIN ANALYZE

As much as I love specs, they do not help much in finding performance problems. Specs ensure functionality is correct, performance tuning requires different tools.

I learned how using postgresql’s EXPLAIN function can highlight database performance bottleneck. These are some of the things I learned.

Postgresql’s EXPLAIN

EXPLAIN function show how the database will optimize a query’s plan and its estimated costs are.

EXPLAIN in Rails

In Rails, the EXPLAIN function can be used on any query (i.e. Model.all => Model.all.explain) and the output can be printed to the console using: puts Model.all.explain.

A simple query EXPLAIN result:

Sort  (cost=4768.180..4807.030 rows=15539 width=385)
    Sort Key: seven
          ->  Seq Scan on sierra  (cost=0.000..921.390 rows=15539 width=385)

prettier version

The most interesting part to me are the cost values:

  • Seq Scan is a basic sequential scan of items from external storage. This example has an estimated cost ranging from 0 to 921.
  • Sort is the database sorting the results of the previous step. Its estimated cost ranges from 4768 to 4807.

From this, I learned: lower everything is better. The lower cost, fewer rows, smaller width (which refers to how wide the data is, like the data type.)

The Bottleneck Query

Now, the slow query I looked at:

Sort  (cost=19530.680..19598.090 rows=26963 width=385)
    Sort Key: sierra.seven
      ->  Unique  (cost=11335.830..12481.750 rows=26963 width=385)
              ->  Sort  (cost=11335.830..11403.230 rows=26963 width=385)
                              Sort Key: sierra.quebec, sierra.echo, sierra.five, xray.kilo
                                            ->  Hash Join  (cost=1905.630..4556.520 rows=26963 width=385)
                                                                  Hash Cond: (mike.charlie = sierra.quebec)
                                                                                        ->  Seq Scan on echo  (cost=0.00..1040.23 rows=2123 width=
                                                                                                      ->  Hash  (cost=921.390..921.390 rows=15539 width=385)
                                                                                                                          ->  Seq Scan on sierra  (cost=0.000..921.390 rows=1539 width=385)

prettier version

The topline cost is much higher than the previous example, and this query is much deeper as well.

When I look through the costs, I can follow how Postgresql’s explain is calculating them:

  • Seq Scan on sierra - cost = 0 - 921 (scan on table 1)
    • Hash - cost = 921 - 921 (create a hash for the join)
    • Seq Scan on echo - cost = 0 - 1040 (scan on table 2)
      • Hash Join - cost = 1905 - 4556

Cost Heirarchy

From this I can see the child costs increases the parent costs. Costs that start at 0 are initial operations, a parent’s cost range are always greater than its children’s cost range.

The one area I could not reconcile this difference was this:

  • Hash Join - cost = 1905 - 4556
    • Sort - cost = 11335 - 11403

The sort cost range is almost five times the child’s hash join, where is this additional cost coming from?!

Postgresql’s EXPLAIN ANALYZE

I joined the #postgresql IRC channel on freenode to get some help.

One thing I learned: do not use EXPLAIN, use EXPLAIN ANALYZE (Thanks depesz!)

Rails & EXPLAIN ANALYZE

Rails does not have a simple way to get the EXPLAIN ANALYZE output so I had to use Rails’ .to_sql to generate the ActiveRecord SQL call and manually paste that into the Postgresql console prepended with: EXPLAIN ANALYZE <Model.all.to_sql statement>

EXPLAIN ANALYZE output

The EXPLAIN ANALYZE output of the first query is:

Sort  (cost=4768.180..4807.030 rows=15539 width=385) (actual time=20.340..23.449 rows=15539 loops=1)
    Sort Key: seven
        Sort Method: external merge  Disk: 5680kB
          ->  Seq Scan on sierra  (cost=0.000..921.390 rows=15539 width=385) (actual time=0.014..2.993 rows=15539 loops=1)

prettier version

EXPLAIN ANALYZE provides the estimate from the EXPLAIN function, but also the actual results of running the query. One interesting part of the EXPLAIN ANALYZE output not in the EXPLAIN output:

  • Sort Method: external merge Disk: 5680kB - this tells us that the work_mem is set too low and the function had to use external storage to complete the function, in this case another 5MB (thanks to Berge from #postgresql for explaining this!)

The EXPLAIN ANALYZE output of the second query is:

Sort  (cost=19530.680..19598.090 rows=26963 width=385) (actual time=1128.695..1131.622 rows=15539 loops=1)
    Sort Key: sierra.seven
        Sort Method: external merge  Disk: 5688kB
          ->  Unique  (cost=11335.830..12481.750 rows=26963 width=385) (actual time=1095.016..1115.935 rows=15539 loops=1)
                  ->  Sort  (cost=11335.830..11403.230 rows=26963 width=385) (actual time=1095.016..1103.221 rows=26950 loops=1)
                                  Sort Key: sierra.quebec, sierra.echo, sierra.five, sierra.yankee_zulu, sierra.zulu_lima, sierra.papa, sierra.seven, sierra.romeo_hotel, sierra.golf, sierra.zulu_whiskey, sierra.romeo_oscar, sierra.six_kilo, sierra.yankee_kilo, sierra.kilo, sierra.bravo, sierra.six_mike
                                                  Sort Method: external merge  Disk: 10528kB
                                                                ->  Hash Join  (cost=1905.630..4556.520 rows=26963 width=385) (actual time=13.038..45.847 rows=26950 loops=1)
                                                                                      Hash Cond: (mike.charlie = sierra.quebec)
                                                                                                          ->  Seq Scan on mike  (cost=0.000..1040.230 rows=27123 width=4) (actual time=0.004..6.287 rows=27088 loops=1)
                                                                                                                              ->  Hash  (cost=921.390..921.390 rows=15539 width=385) (actual time=13.017..13.017 rows=15539 loops=1)
                                                                                                                                                          Buckets: 1024  Batches: 2  Memory Usage: 3054kB
                                                                                                                                                                                    ->  Seq Scan on sierra  (cost=0.000..921.390 rows=15539 width=385) (actual time=0.003..3.754 rows=15539 loops=1)

prettier version

The big cost jump of the sort is apparent, the external merge is adding to the cost dramatically.

In Summary

Postgresql’s EXPLAIN ANALYZE function is very beneficial to improving web application performance by seeing the cost of a database query.

This is one tool I will reach for every time I need to look at any database performance issues.

Resources Used