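The demonstrations below are based on MySQL 5.7.27.
I. Overview of MySQL subquery optimization strategies
Subquery optimization strategies
For different types of subqueries, the optimizer chooses among different strategies.
1. For IN and =ANY subqueries, the optimizer can choose among the following strategies:
- semijoin
- Materialization
- exists
2. For NOT IN and <>ALL subqueries, the optimizer can choose among the following strategies:
- Materialization
- exists
3. For derived tables, the optimizer can choose among the following strategies:
- derived_merge: merge the derived table into the outer query (introduced in 5.7);
- materialize the derived table into an internal temporary table, then use it in the outer query.
Note: subqueries in update and delete statements cannot use the semijoin or materialization strategies.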
II. Creating data for the demonstration
To make the analysis easier, first create two tables and insert sample data:
CREATE TABLE `test02` (
`id` int(11) NOT NULL,
`a` int(11) DEFAULT NULL,
`b` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `a` (`a`)
) ENGINE=InnoDB;
drop procedure if exists idata;
delimiter ;;
create procedure idata()
begin
declare i int;
set i=1;
while(i<=10000)do
insert into test02 values(i, i, i);
set i=i+1;
end while;
end;;
delimiter ;
call idata();
create table test01 like test02;
insert into test01 (select * from test02 where id<=1000);
III. Analyzing a SQL example
Subquery example:
SELECT * FROM test01 WHERE test01.a IN (SELECT test02.b FROM test02 WHERE id < 10);
Most people would simply assume that this SQL executes like this:
SELECT test02.b FROM test02 WHERE id < 10;
Result: 1,2,3,4,5,6,7,8,9
SELECT * FROM test01 WHERE test01.a IN (1,2,3,4,5,6,7,8,9);
But that is not what MySQL actually does. MySQL pushes the correlated outer table down into the subquery, because the optimizer considers this more efficient. In other words, the optimizer rewrites the SQL above as:
select * from test01 where exists(select b from test02 where id < 10 and test01.a=test02.b);
Note: this applies to MySQL 5.5 and earlier.
The execution plan below shows that this SQL performs a full table scan of test01 (1000 rows), which is inefficient:
root@localhost [dbtest01]>desc select * from test01 where exists(select b from test02 where id < 10 and test01.a=test02.b);
+----+--------------------+--------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+--------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
| 1 | PRIMARY | test01 | NULL | ALL | NULL | NULL | NULL | NULL | 1000 | 100.00 | Using where |
| 2 | DEPENDENT SUBQUERY | test02 | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 9 | 10.00 | Using where |
+----+--------------------+--------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
2 rows in set, 2 warnings (0.00 sec)
Yet when the SQL below is actually executed, it is not slow at all. That looks like a contradiction, so let's keep analyzing:
SELECT * FROM test01 WHERE test01.a IN (SELECT test02.b FROM test02 WHERE id < 10);
The execution plan for this SQL is:
root@localhost [dbtest01]>desc SELECT * FROM test01 WHERE test01.a IN (SELECT test02.b FROM test02 WHERE id < 10);
+----+--------------+-------------+------------+-------+---------------+---------+---------+---------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+-------------+------------+-------+---------------+---------+---------+---------------+------+----------+-------------+
| 1 | SIMPLE | <subquery2> | NULL | ALL | NULL | NULL | NULL | NULL | NULL | 100.00 | Using where |
| 1 | SIMPLE | test01 | NULL | ref | a | a | 5 | <subquery2>.b | 1 | 100.00 | NULL |
| 2 | MATERIALIZED | test02 | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 9 | 100.00 | Using where |
+----+--------------+-------------+------------+-------+---------------+---------+---------+---------------+------+----------+-------------+
3 rows in set, 1 warning (0.00 sec)
The optimizer used the MATERIALIZED strategy, so I looked up and studied the documentation on it:
https://dev.mysql.com/doc/refman/5.6/en/subquery-optimization.html
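The reason is that, starting with MySQL 5.6, the optimizer gained new optimization strategies: materialization=[off|on] and semijoin=[off|on] (off disables the strategy, on enables it).
You can run show variables like 'optimizer_switch'; to see which optimizer strategies MySQL is using. These strategies can also be changed dynamically at runtime:
set global optimizer_switch='materialization=on,semijoin=on'; enables the materialization and semijoin strategies (the global value applies to new connections; a session that is already open keeps its own value unless you use set session).
Default optimizer strategies in MySQL 5.7.27: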
root@localhost [dbtest01]>show variables like 'optimizer_switch';
+------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Variable_name | Value |
+------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| optimizer_switch | index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,duplicateweedout=on,subquery_materialization_cost_based=on,use_index_extensions=on,condition_fanout_filter=on,derived_merge=on |
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
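The switch value shown above is a comma-separated list of flag=on|off pairs. As a minimal sketch (the helper name parse_optimizer_switch is made up for illustration and is not part of MySQL or any client library), a script that has already fetched this string could check the two subquery-related flags like this:

```python
def parse_optimizer_switch(value):
    """Split 'flag=on,flag=off,...' into a {flag: bool} dictionary."""
    flags = {}
    for pair in value.split(","):
        name, _, state = pair.strip().partition("=")
        flags[name] = (state == "on")
    return flags

# Subquery-related part of the value from the SHOW VARIABLES output above.
switch = ("materialization=on,semijoin=on,loosescan=on,firstmatch=on,"
          "duplicateweedout=on,subquery_materialization_cost_based=on")

flags = parse_optimizer_switch(switch)
print(flags["materialization"], flags["semijoin"])
```

With both flags on, the optimizer is allowed to consider the semijoin and materialization strategies discussed in this article.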
So on MySQL 5.6 and later, the SQL below is not slow, because the materialization and semijoin optimizer strategies optimize it:
SELECT * FROM test01 WHERE test01.a IN (SELECT test02.b FROM test02 WHERE id < 10);
However, if we turn off the materialization and semijoin strategies and test again, the SQL really does scan the whole of test01 (1000 rows):
set global optimizer_switch='materialization=off,semijoin=off';
The execution plan below confirms that test01 is fully scanned:
root@localhost [dbtest01]>desc SELECT * FROM test01 WHERE test01.a IN (SELECT test02.b FROM test02 WHERE id < 10);
+----+--------------------+--------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+--------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
| 1 | PRIMARY | test01 | NULL | ALL | NULL | NULL | NULL | NULL | 1000 | 100.00 | Using where |
| 2 | DEPENDENT SUBQUERY | test02 | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 9 | 10.00 | Using where |
+----+--------------------+--------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
Let's analyze this execution plan.
Reminder: on MySQL 5.5 and earlier, or on MySQL 5.6 and later with materialization=off,semijoin=off, the execution plan is the same as the one below:
root@localhost [dbtest01]>desc select * from test01 where exists(select b from test02 where id < 10 and test01.a=test02.b);
+----+--------------------+--------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+--------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| 1 | PRIMARY | test01 | NULL | ALL | NULL | NULL | NULL | NULL | 1000 | 100.00 | Using where |
| 2 | DEPENDENT SUBQUERY | test02 | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 9 | 10.00 | Using where |
+----+--------------------+--------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
2 rows in set, 2 warnings (0.00 sec)
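The uncorrelated subquery has become a correlated subquery (select_type: DEPENDENT SUBQUERY). The subquery has to match test02.b against the outer table test01, and because it needs a column from test01, it cannot be executed first. The execution flow is:
1. Scan test01 and fetch one row R;
2. Take column a from row R and run the subquery with it; if the result is TRUE, add row R to the result set;
3. Repeat steps 1 and 2 until test01 is exhausted.
The total number of rows scanned is 1000+1000*9=10000. (This is the theoretical value. The actual value is somewhat lower than 10000; I have not worked out exactly why, but the pattern is that every additional row in the subquery result lowers the total by a few rows.)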
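The flow above can be imitated with a small Python simulation. This is only a sketch over in-memory tuples, not MySQL's actual executor. It also tests one possible explanation for the lower real-world count: if the subquery stops as soon as it finds a match (an assumption here, not something the plan output proves), the matching outer rows examine fewer than 9 inner rows each.

```python
# Toy data mirroring the article: rows are (id, a, b) with a = b = id.
test02 = [(i, i, i) for i in range(1, 10001)]
test01 = [(i, i, i) for i in range(1, 1001)]

def exists_strategy(outer, inner, stop_at_first_match):
    """Run the subquery once per outer row and count the rows examined."""
    # The range condition id < 10 restricts every subquery run to the same 9 rows.
    inner_range = [row for row in inner if row[0] < 10]
    examined = 0
    result = []
    for r in outer:                  # step 1: scan test01 row by row
        examined += 1
        matched = False
        for s in inner_range:        # step 2: evaluate the subquery for this row
            examined += 1
            if s[2] == r[1]:         # test02.b = test01.a
                matched = True
                if stop_at_first_match:
                    break
        if matched:
            result.append(r)
    return result, examined

rows, worst_case = exists_strategy(test01, test02, stop_at_first_match=False)
_, early_exit = exists_strategy(test01, test02, stop_at_first_match=True)
print(len(rows), worst_case, early_exit)
```

Without early exit the simulation examines exactly 1000+1000*9 rows; with early exit the count drops slightly, which is at least consistent with the observation that the real number is a little below 10000.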
Semi-join optimization:
The flow above has a problem: if the outer table is very large, the subquery has to be executed once for every outer row, and performance is very poor. An obvious idea is to rewrite the query as a join to improve efficiency:
select test01.* from test01 join test02 on test01.a=test02.b and test02.id<10;
# View the execution plan of this SQL:
desc select test01.* from test01 join test02 on test01.a=test02.b and test02.id<10;
root@localhost [dbtest01]>EXPLAIN extended select test01.* from test01 join test02 on test01.a=test02.b and test02.id<10;
+----+-------------+--------+------------+-------+---------------+---------+---------+-------------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+-------+---------------+---------+---------+-------------------+------+----------+-------------+
| 1 | SIMPLE | test02 | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 9 | 100.00 | Using where |
| 1 | SIMPLE | test01 | NULL | ref | a | a | 5 | dbtest01.test02.b | 1 | 100.00 | NULL |
+----+-------------+--------+------------+-------+---------------+---------+---------+-------------------+------+----------+-------------+
2 rows in set, 2 warnings (0.00 sec)
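With this rewrite, test02 becomes the driving table and test01 has an index on the join column, so the lookups are very efficient.
But there is a catch: a join can return duplicate rows, whereas the semantics of an in(select ...) subquery never produce duplicates.
semijoin is a special kind of join that solves exactly this duplicate problem.
For a subquery, the optimizer can recognize that the in clause only needs to return one value per group. In that case it can use semijoin to optimize the subquery and improve query efficiency.
This feature was added in MySQL 5.6. Before MySQL 5.6, the optimizer had only the exists strategy to "optimize" subqueries.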
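To illustrate the difference in semantics, here is a minimal Python sketch over hypothetical rows (the data and function names are invented for this example). The inner table deliberately contains the value 1 twice, so a plain join returns the matching outer row twice while a semi-join returns it once.

```python
# Hypothetical rows: (id, a) for the outer table, (id, b) for the inner table.
outer = [(1, 1), (2, 2), (3, 3)]
inner = [(10, 1), (11, 1), (12, 2)]   # b = 1 appears twice

def inner_join(outer_rows, inner_rows):
    """One output row per matching (outer, inner) pair, like a regular join."""
    return [o for o in outer_rows for i in inner_rows if o[1] == i[1]]

def semi_join(outer_rows, inner_rows):
    """Each outer row at most once, like WHERE a IN (SELECT b ...)."""
    inner_values = {i[1] for i in inner_rows}
    return [o for o in outer_rows if o[1] in inner_values]

print(inner_join(outer, inner))
print(semi_join(outer, inner))
```

In the article's test data the column b is unique for id < 10, which is why the join rewrite happened to return the same rows as the IN query.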
The execution plan and the SQL after semijoin optimization are:
root@localhost [dbtest01]>desc SELECT * FROM test01 WHERE test01.a IN (SELECT test02.b FROM test02 WHERE id < 10);
+----+--------------+-------------+------------+-------+---------------+---------+---------+---------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+-------------+------------+-------+---------------+---------+---------+---------------+------+----------+-------------+
| 1 | SIMPLE | <subquery2> | NULL | ALL | NULL | NULL | NULL | NULL | NULL | 100.00 | Using where |
| 1 | SIMPLE | test01 | NULL | ref | a | a | 5 | <subquery2>.b | 1 | 100.00 | NULL |
| 2 | MATERIALIZED | test02 | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 9 | 100.00 | Using where |
+----+--------------+-------------+------------+-------+---------------+---------+---------+---------------+------+----------+-------------+
3 rows in set, 1 warning (0.00 sec)
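select
`test01`.`id`,`test01`.`a`,`test01`.`b`
from `test01` semi join `test02`
where
((`test01`.`a` = `<subquery2>`.`b`)
and (`test02`.`id` < 10));
## Note: this is the SQL as rewritten by the optimizer; the semi join syntax cannot be used from a client.
The semijoin implementation is fairly complex and is split into strategies such as FirstMatch and Materialize. In the execution plan above, select_type=MATERIALIZED means that the semijoin was implemented with the Materialize strategy.
The execution flow after semijoin optimization is:
1. Execute the subquery first and save its result in a temporary table; this temporary table has a primary key that removes duplicates;
2. Fetch one row R from the temporary table;
3. Take column b from row R and look it up in the driven table test01; rows that satisfy the condition go into the result set;
4. Repeat steps 2 and 3 until the temporary table is exhausted.
The subquery returns 9 rows, so the temporary table also has 9 rows (there are no duplicates here), and the total number of rows scanned is 9+9+9*1=27, far fewer than the original 10000.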
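The four steps can be sketched in Python as follows. This is a simulation under simplifying assumptions, not MySQL's implementation: a dict stands in for the temporary table's unique key, and another dict stands in for the secondary index on test01.a.

```python
# Toy data mirroring the article: rows are (id, a, b) with a = b = id.
test02 = [(i, i, i) for i in range(1, 10001)]
test01 = [(i, i, i) for i in range(1, 1001)]

# Stand-in for the index on test01.a: value of a -> rows with that value.
index_a = {}
for row in test01:
    index_a.setdefault(row[1], []).append(row)

examined = 0

# Step 1: run the subquery once and materialize the distinct b values.
materialized = {}                 # dict keys deduplicate, like the temp table's key
for row in test02:
    if row[0] >= 10:              # rows are ordered by id, so the range scan ends here
        break
    examined += 1
    materialized[row[2]] = None

# Steps 2-4: read each temp-table row and probe test01 through the index.
result = []
for b in materialized:
    examined += 1                 # one read from the temporary table
    for match in index_a.get(b, []):
        examined += 1             # one row fetched from test01 via the index
        result.append(match)

print(len(result), examined)
```

The counter reproduces the 9+9+9*1 breakdown: 9 rows read from test02, 9 rows read from the temporary table, and 1 indexed lookup in test01 for each of them.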
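materialization, another optimization added in MySQL 5.6, materializes the subquery result into a temporary table and then uses that table for lookups in the outer query to speed up execution. The in-memory temporary table has a primary key (hash index) that eliminates duplicate rows and keeps the table smaller.
If the subquery result is too large and exceeds tmp_table_size, it falls back to an on-disk temporary table. Either way, the subquery only has to be executed once rather than once for every row of the outer query.
Note, however, that in this case the outer query still cannot use an index to find the matching rows quickly; it can only do a full table scan or a full index scan.
semijoin and materialization are switched on and off through the semijoin={on|off} and materialization={on|off} flags of the optimizer_switch variable.
The different execution plans shown earlier were produced by turning semijoin and materialization on and off.
In summary, for a subquery the optimizer first checks whether the conditions for each strategy are met (for example, a subquery containing union cannot use the semijoin optimization),
then chooses among the eligible strategies by cost. If nothing else is available, it falls back to the exists strategy to "optimize" the subquery; the exists strategy has no flag to turn it on or off.
Below is an example of a subquery in a delete statement:
Fill the two test tables above with about 3.5 million and 500,000 rows respectively to test the delete statement.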
root@localhost [dbtest01]>select count(*) from test02;
+----------+
| count(*) |
+----------+
| 3532986 |
+----------+
1 row in set (0.64 sec)
root@localhost [dbtest01]>create table test01 like test02;
Query OK, 0 rows affected (0.01 sec)
root@localhost [dbtest01]>insert into test01 (select * from test02 where id<=500000);
root@localhost [dbtest01]>select count(*) from test01;
+----------+
| count(*) |
+----------+
| 500000 |
The delete statement took more than 4 seconds:
root@localhost [dbtest01]>delete FROM test01 WHERE test01.a IN (SELECT test02.b FROM test02 WHERE id < 10);
Query OK, 9 rows affected (4.86 sec)
The execution plan shows an almost full table scan of test01:
root@localhost [dbtest01]>desc delete FROM test01 WHERE test01.a IN (SELECT test02.b FROM test02 WHERE id < 10);
+----+--------------------+--------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+--------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
| 1 | DELETE | test01 | NULL | ALL | NULL | NULL | NULL | NULL | 499343 | 100.00 | Using where |
| 2 | DEPENDENT SUBQUERY | test02 | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 9 | 10.00 | Using where |
+----+--------------------+--------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
2 rows in set (0.00 sec)
So rewrite the delete statement above as a join:
root@localhost [dbtest01]>desc delete test01.* from test01 join test02 on test01.a=test02.b and test02.id<10;
+----+-------------+--------+------------+-------+---------------+---------+---------+-------------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+-------+---------------+---------+---------+-------------------+------+----------+-------------+
| 1 | SIMPLE | test02 | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 9 | 100.00 | Using where |
| 1 | DELETE | test01 | NULL | ref | a | a | 5 | dbtest01.test02.b | 1 | 100.00 | NULL |
+----+-------------+--------+------------+-------+---------------+---------+---------+-------------------+------+----------+-------------+
2 rows in set (0.01 sec)
It runs very fast:
root@localhost [dbtest01]>delete test01.* from test01 join test02 on test01.a=test02.b and test02.id<10;
Query OK, 9 rows affected (0.01 sec)
root@localhost [dbtest01]>select test01.* from test01 join test02 on test01.a=test02.b and test02.id<10;
Empty set (0.00 sec)
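The duplicate rows that make a join differ from IN in a select do not matter for a delete, because a matched row is removed once no matter how many times it matches. The Python sketch below illustrates this with hypothetical rows (the data and function names are invented for this example); it models only which rows survive, not how MySQL executes the statements.

```python
# Hypothetical rows: (id, a) for the target table, (id, b) for the source table.
target = [(i, i) for i in range(1, 21)]
source = [(1, 5), (2, 5), (3, 7), (50, 9)]   # b = 5 appears twice; id 50 fails id < 10

def delete_in_subquery(target_rows, source_rows):
    """DELETE ... WHERE a IN (SELECT b FROM source WHERE id < 10): return surviving rows."""
    wanted = {b for (sid, b) in source_rows if sid < 10}
    return [row for row in target_rows if row[1] not in wanted]

def delete_join(target_rows, source_rows):
    """DELETE target.* FROM target JOIN source ON a = b AND source.id < 10."""
    doomed = set()                       # ids of target rows matched by the join
    for sid, b in source_rows:
        if sid < 10:
            for tid, a in target_rows:
                if a == b:
                    doomed.add(tid)      # a row matched twice is still deleted once
    return [row for row in target_rows if row[0] not in doomed]

remaining_in = delete_in_subquery(target, source)
remaining_join = delete_join(target, source)
print(len(remaining_in), remaining_in == remaining_join)
```

Both forms leave the same rows behind, which is why the join rewrite is a safe replacement for the IN form in the delete statements shown here.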
The statement below needs a full table scan and is very slow; it scans essentially the whole of test01:
root@localhost [dbtest01]>desc delete FROM test01 WHERE id IN (SELECT id FROM test02 WHERE id='350000');
+----+--------------------+--------+------------+-------+---------------+---------+---------+-------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+--------+------------+-------+---------------+---------+---------+-------+--------+----------+-------------+
| 1 | DELETE | test01 | NULL | ALL | NULL | NULL | NULL | NULL | 499343 | 100.00 | Using where |
| 2 | DEPENDENT SUBQUERY | test02 | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | Using index |
+----+--------------------+--------+------------+-------+---------------+---------+---------+-------+--------+----------+-------------+
2 rows in set (0.00 sec)
With a join, however, it is very efficient:
root@localhost [dbtest01]>desc delete test01.* FROM test01 inner join test02 WHERE test01.id=test02.id and test02.id=350000 ;
+----+-------------+--------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
| 1 | DELETE | test01 | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | NULL |
| 1 | SIMPLE | test02 | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | Using index |
+----+-------------+--------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
2 rows in set (0.01 sec)
root@localhost [dbtest01]> desc delete test01.* from test01 join test02 on test01.a=test02.b and test02.id=350000;
+----+-------------+--------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| 1 | SIMPLE | test02 | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | NULL |
| 1 | DELETE | test01 | NULL | ref | a | a | 5 | const | 1 | 100.00 | NULL |
+----+-------------+--------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
2 rows in set (0.00 sec)
References:
https://www.cnblogs.com/zhengyun_ustc/p/slowquery1.html
https://www.jianshu.com/p/3989222f7084
https://dev.mysql.com/doc/refman/5.6/en/subquery-optimization.html