お気楽スロークエリ(JOIN & GROUP編) « LANCARD.LAB｜ランカードコムのスタッフブログ

お気楽スロークエリ(JOIN & GROUP編)

2016/05/10
murave
PostgreSQL
SQL

運用中の業務系システムがあるんですが、お客様から「遅い」とご指摘を頂いた機能を調べたところ台帳からデータを取得し、集計しているクエリが非常に遅い事が判明しました。

調査している環境のデータベースは PostgreSQL 9.3 です。こんな時は EXPLAIN ANALYZE。

ちなみにhaisou_daityouが台帳で現在2850万件程のデータがあり日々増え続けています。
syouhin_detailはマスタで380件程度です。

EXPLAIN ANALYZE 

SELECT 
    haisou_daityou.syouhin_detail_id, 
    syouhin_detail.code as syouhin_detail_code,
    haisou_daityou.haisou_bi,
    date_part('day'::text, haisou_daityou.haisou_bi)::int as day,
    sum(haisou_daityou.haisou_suu) AS haisou_sum

  FROM haisou_daityou
  LEFT OUTER JOIN syouhin_detail ON haisou_daityou.syouhin_detail_id = syouhin_detail.id

  WHERE 
      "YearMonth"(haisou_daityou.haisou_bi) = "YearMonth"('2016-03-01'::date)
  
    GROUP BY 
        haisou_daityou.syouhin_detail_id, 
        syouhin_detail.code, 
        haisou_daityou.haisou_bi
    
    ORDER BY
        syouhin_detail.code, 
        haisou_daityou.haisou_bi

結果

GroupAggregate  (cost=1141834.38..1173281.22 rows=1048228 width=18) (actual time=30736.728..35349.575 rows=9393 loops=1)
  ->  Sort  (cost=1141834.38..1144454.95 rows=1048228 width=18) (actual time=30735.447..34864.067 rows=1024085 loops=1)
        Sort Key: syouhin_detail.code, haisou_daityou.haisou_bi, haisou_daityou.syouhin_detail_id
        Sort Method: external merge  Disk: 30048kB
        ->  Hash Left Join  (cost=650914.54..994020.09 rows=1048228 width=18) (actual time=1812.137..5063.999 rows=1024085 loops=1)
              Hash Cond: (haisou_daityou.syouhin_detail_id = syouhin_detail.id)
              ->  Bitmap Heap Scan on haisou_daityou  (cost=650901.04..979593.45 rows=1048228 width=12) (actual time=1811.261..4238.580 rows=1024085 loops=1)
                    Recheck Cond: ((((date_part('year'::text, (haisou_bi)::timestamp without time zone))::integer * 100) + (date_part('month'::text, (haisou_bi)::timestamp without time zone))::integer) = 201603)
                    Rows Removed by Index Recheck: 191609
                    ->  Bitmap Index Scan on haisou_daityou_tokuisaki_yearmonth_updated  (cost=0.00..650638.98 rows=1048228 width=0) (actual time=1808.060..1808.060 rows=1024085 loops=1)
                          Index Cond: ((((date_part('year'::text, (haisou_bi)::timestamp without time zone))::integer * 100) + (date_part('month'::text, (haisou_bi)::timestamp without time zone))::integer) = 201603)
              ->  Hash  (cost=8.78..8.78 rows=378 width=10) (actual time=0.847..0.847 rows=379 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 17kB
                    ->  Seq Scan on syouhin_detail  (cost=0.00..8.78 rows=378 width=10) (actual time=0.007..0.697 rows=379 loops=1)
Total runtime: 35426.470 ms

うわぁ。おーそーいー。開発環境なので本番と比べると非常に非力ではありますが…

特に

「Hash LEFT JOIN (cost=650914.54..994020.09 ROWS=1048228 width=18) (actual TIME=1812.137..5063.999 ROWS=1024085 loops=1)」

お客さんからの「初回が特に遅い」という話にも納得(初回はハッシュテーブルが作られますが以降はキャッシュされますので)。

ちなみに前記の結果は初回ではありません。やばい。

INDEXを追加してみたり、集計されている中間テーブルを使っての再実装を検討したり(実際にクエリを書いてみたり)、WINDOW関数での再実装を検討したりしましたが、結局、次のような書き換えとなりました。

EXPLAIN ANALYZE 

SELECT  syouhin_detail.code as syouhin_detail_code, sum_tmp.*
  FROM
    (
    SELECT 
      haisou_daityou.syouhin_detail_id, 
      haisou_daityou.haisou_bi,
      date_part('day'::text, haisou_daityou.haisou_bi)::int as day,
      sum(haisou_daityou.haisou_suu) AS haisou_sum

      FROM haisou_daityou

      WHERE 
        haisou_daityou.haisou_bi BETWEEN '2016-03-01'::date AND '2016-03-31'::date
  
      GROUP BY 
        haisou_daityou.syouhin_detail_id, 
        haisou_daityou.haisou_bi
    ) as sum_tmp

  LEFT OUTER JOIN syouhin_detail ON sum_tmp.syouhin_detail_id = syouhin_detail.id

  ORDER BY
    syouhin_detail.code, 
    sum_tmp.haisou_bi

結果

Sort  (cost=87961.52..87988.41 rows=10756 width=26) (actual time=1640.387..1641.442 rows=9393 loops=1)
  Sort Key: syouhin_detail.code, haisou_daityou.haisou_bi
  Sort Method: quicksort  Memory: 971kB
  ->  Hash Left Join  (cost=86797.56..87241.25 rows=10756 width=26) (actual time=1516.623..1529.725 rows=9393 loops=1)
        Hash Cond: (haisou_daityou.syouhin_detail_id = syouhin_detail.id)
        ->  HashAggregate  (cost=86784.06..86972.29 rows=10756 width=12) (actual time=1516.362..1524.763 rows=9393 loops=1)
              ->  Index Scan using haisou_daityou_haisou_bi on haisou_daityou  (cost=0.44..78691.12 rows=1079058 width=12) (actual time=0.028..758.021 rows=1024085 loops=1)
                    Index Cond: ((haisou_bi >= '2016-03-01'::date) AND (haisou_bi <= '2016-03-31'::date))
        ->  Hash  (cost=8.78..8.78 rows=378 width=10) (actual time=0.245..0.245 rows=379 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 17kB
              ->  Seq Scan on syouhin_detail  (cost=0.00..8.78 rows=378 width=10) (actual time=0.005..0.135 rows=379 loops=1)
Total runtime: 1642.233 ms

集計した後にJOINすることで20倍くらい速くなりました。初回ではもっともっと差がでるものと思われます。

集計のためにお気楽にJOINした後にGROUP BYしたのが悪かった模様。

お気楽に書いたクエリがスロークエリ化して泣いたお話でした。「JOIN & GROUP編」とはしたが続編はない！(といいな)