2015年6月19日金曜日

現代のMySQLはもう不要にINをEXISTSに書き換えない

漢(オトコ)のコンピュータ道: なぜMySQLのサブクエリは遅いのか。

この記事は 2009/3/25 に書かれたもののようである。
2009年3月といえばMySQL 5.1がGAになってわずか半年、MySQL 6.0.10-alphaがリリースされた頃で、MariaDBもまだ姿を見せていない頃だ。


時は流れて2015年、MySQL 5.6がGAになって早2年半、5.7のGAマダァ-? (・∀・ )っ/凵⌒☆チンチン

な頃なので、もういい加減誰か言ってくれてもいいんじゃないかと思う。


もうMySQL(5.6)は不要にINをEXISTSに書き換えたりしないんだよって


mysql51> EXPLAIN SELECT * FROM Country WHERE Continent = 'Asia' AND Code IN
    -> (SELECT CountryCode FROM City WHERE Population > Country.Population / 2);
+----+--------------------+---------+----------------+---------------+-------------+---------+------+------+-------------+
| id | select_type        | table   | type           | possible_keys | key         | key_len | ref  | rows | Extra       |
+----+--------------------+---------+----------------+---------------+-------------+---------+------+------+-------------+
|  1 | PRIMARY            | Country | ALL            | NULL          | NULL        | NULL    | NULL |  222 | Using where |
|  2 | DEPENDENT SUBQUERY | City    | index_subquery | CountryCode   | CountryCode | 3       | func |    9 | Using where |
+----+--------------------+---------+----------------+---------------+-------------+---------+------+------+-------------+
2 rows in set (0.00 sec)


mysql55> EXPLAIN SELECT * FROM Country WHERE Continent = 'Asia' AND Code IN
    -> (SELECT CountryCode FROM City WHERE Population > Country.Population / 2);
+----+--------------------+---------+----------------+---------------+-------------+---------+------+------+-------------+
| id | select_type        | table   | type           | possible_keys | key         | key_len | ref  | rows | Extra       |
+----+--------------------+---------+----------------+---------------+-------------+---------+------+------+-------------+
|  1 | PRIMARY            | Country | ALL            | NULL          | NULL        | NULL    | NULL |  226 | Using where |
|  2 | DEPENDENT SUBQUERY | City    | index_subquery | CountryCode   | CountryCode | 3       | func |    7 | Using where |
+----+--------------------+---------+----------------+---------------+-------------+---------+------+------+-------------+
2 rows in set (0.02 sec)


mysql56> EXPLAIN SELECT * FROM Country WHERE Continent = 'Asia' AND Code IN
    -> (SELECT CountryCode FROM City WHERE Population > Country.Population / 2);
+----+-------------+---------+------+---------------+-------------+---------+-----------------+------+----------------------------------+
| id | select_type | table   | type | possible_keys | key         | key_len | ref             | rows | Extra                            |
+----+-------------+---------+------+---------------+-------------+---------+-----------------+------+----------------------------------+
|  1 | SIMPLE      | Country | ALL  | PRIMARY       | NULL        | NULL    | NULL            |  239 | Using where                      |
|  1 | SIMPLE      | City    | ref  | CountryCode   | CountryCode | 3       | d1.Country.Code |    9 | Using where; FirstMatch(Country) |
+----+-------------+---------+------+---------------+-------------+---------+-----------------+------+----------------------------------+
2 rows in set (0.00 sec)

mysql57> EXPLAIN SELECT * FROM Country WHERE Continent = 'Asia' AND Code IN
    -> (SELECT CountryCode FROM City WHERE Population > Country.Population / 2);
+----+-------------+---------+------------+------+---------------+-------------+---------+-----------------+------+----------+----------------------------------+
| id | select_type | table   | partitions | type | possible_keys | key         | key_len | ref             | rows | filtered | Extra                            |
+----+-------------+---------+------------+------+---------------+-------------+---------+-----------------+------+----------+----------------------------------+
|  1 | SIMPLE      | Country | NULL       | ALL  | PRIMARY       | NULL        | NULL    | NULL            |  239 |    14.29 | Using where                      |
|  1 | SIMPLE      | City    | NULL       | ref  | CountryCode   | CountryCode | 3       | d1.Country.Code |   18 |    33.33 | Using where; FirstMatch(Country) |
+----+-------------+---------+------------+------+---------------+-------------+---------+-----------------+------+----------+----------------------------------+
2 rows in set, 2 warnings (0.00 sec)

mysql57> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1276
Message: Field or reference 'd1.Country.Population' of SELECT #2 was resolved in SELECT #1
*************************** 2. row ***************************
  Level: Note
   Code: 1003
Message: /* select#1 */ select `d1`.`Country`.`Code` AS `Code`,`d1`.`Country`.`Name` AS `Name`,`d1`.`Country`.`Continent` AS `Continent`,`d1`.`Country`.`Region` AS `Region`,`d1`.`Country`.`SurfaceArea` AS `SurfaceArea`,`d1`.`Country`.`IndepYear` AS `IndepYear`,`d1`.`Country`.`Population` AS `Population`,`d1`.`Country`.`LifeExpectancy` AS `LifeExpectancy`,`d1`.`Country`.`GNP` AS `GNP`,`d1`.`Country`.`GNPOld` AS `GNPOld`,`d1`.`Country`.`LocalName` AS `LocalName`,`d1`.`Country`.`GovernmentForm` AS `GovernmentForm`,`d1`.`Country`.`HeadOfState` AS `HeadOfState`,`d1`.`Country`.`Capital` AS `Capital`,`d1`.`Country`.`Code2` AS `Code2` from `d1`.`Country` semi join (`d1`.`City`) where ((`d1`.`City`.`CountryCode` = `d1`.`Country`.`Code`) and (`d1`.`Country`.`Continent` = 'Asia') and (`d1`.`City`.`Population` > (`d1`.`Country`.`Population` / 2)))
2 rows in set (0.00 sec)

クエリーは先頭にリンクを張った、御大のブログ記事のものと一緒。world_innodb.sqlを流してから、3つのテーブルに対してANALYZE TABLEをかけてある。5.1.73, 5.5.44, 5.6.25, 5.7.7の比較だ。

最後の5.7のワーニングは、5.7のデフォルトのEXPLAINがEXPLAIN EXTENDEDになったので、オプティマイズした結果をワーニングバッファに突っ込んでいる。SHOW WARNINGSでオプティマイザーがどうクエリーを書き換えたかが見えるのでよく見ると。

select .. from `d1`.`Country` semi join (`d1`.`City`) where ..


semi-join最適化は5.6.5からの実装なので、MySQL 5.6のGAバージョンならこれは5.6でも同じように最適化できる。
MySQL :: MySQL 5.6 Reference Manual :: 8.2.1.18 Subquery Optimization

もちろんすべてのINが良しなにされるとは限らない(本当に相関サブクエリーが必要なケースは相関サブクエリーになるし、5.7になっても相関サブクエリーは遅い)が、MySQLのオプティマイザーがバカなせいでつらい目を見た過去の俺とか俺とか俺とかにはとても嬉しい。

日本よ、これがMySQL 5.6だッ!! (c) nippondanji


ちなみに、

MariaDB [d1]> SELECT @@version;
+----------------+
| @@version      |
+----------------+
| 5.3.12-MariaDB |
+----------------+
1 row in set (0.00 sec)

MariaDB [d1]> EXPLAIN EXTENDED SELECT * FROM Country WHERE Continent = 'Asia' AND Code IN
    -> (SELECT CountryCode FROM City WHERE Population > Country.Population / 2);
+----+-------------+---------+------+---------------+-------------+---------+-----------------+------+----------+----------------------------------+
| id | select_type | table   | type | possible_keys | key         | key_len | ref             | rows | filtered | Extra                            |
+----+-------------+---------+------+---------------+-------------+---------+-----------------+------+----------+----------------------------------+
|  1 | PRIMARY     | Country | ALL  | PRIMARY       | NULL        | NULL    | NULL            |  269 |   100.00 | Using where                      |
|  1 | PRIMARY     | City    | ref  | CountryCode   | CountryCode | 3       | d1.Country.Code |    1 |   100.00 | Using where; FirstMatch(Country) |
+----+-------------+---------+------+---------------+-------------+---------+-----------------+------+----------+----------------------------------+
2 rows in set, 2 warnings (0.00 sec)

MariaDB [d1]> SHOW WARNINGS\G
*************************** 1. row ***************************
  Level: Note
   Code: 1276
Message: Field or reference 'd1.Country.Population' of SELECT #2 was resolved in SELECT #1
*************************** 2. row ***************************
  Level: Note
   Code: 1003
Message: select `d1`.`Country`.`Code` AS `Code`,`d1`.`Country`.`Name` AS `Name`,`d1`.`Country`.`Continent` AS `Continent`,`d1`.`Country`.`Region` AS `Region`,`d1`.`Country`.`SurfaceArea` AS `SurfaceArea`,`d1`.`Country`.`IndepYear` AS `IndepYear`,`d1`.`Country`.`Population` AS `Population`,`d1`.`Country`.`LifeExpectancy` AS `LifeExpectancy`,`d1`.`Country`.`GNP` AS `GNP`,`d1`.`Country`.`GNPOld` AS `GNPOld`,`d1`.`Country`.`LocalName` AS `LocalName`,`d1`.`Country`.`GovernmentForm` AS `GovernmentForm`,`d1`.`Country`.`HeadOfState` AS `HeadOfState`,`d1`.`Country`.`Capital` AS `Capital`,`d1`.`Country`.`Code2` AS `Code2` from `d1`.`Country` semi join (`d1`.`City`) where ((`d1`.`City`.`CountryCode` = `d1`.`Country`.`Code`) and (`d1`.`Country`.`Continent` = 'Asia') and (`d1`.`City`.`Population` > (`d1`.`Country`.`Population` / 2)))
2 rows in set (0.00 sec)

MariaDBはもっと前からできるんだけどな! :)
(5.3で試してるけど、もとは5.2くらいだったような気がする。ただし裏は取ってない)

ところでsemi joinって聞くたびに蝉ジョインとか脳内変換されるんだけどこれなんとかなりませんか。

0 件のコメント :

コメントを投稿