01問題
前期開發反饋在使用獨立子查詢時,不論子查詢中結果集有幾個,語句都會卡死遲遲得不到返回結果。但是如果去掉子查詢,直接賦值查詢很快得到返回結果。聽到這個情況第一反應很可能DBLE在獨立子查詢上沒有做相關優化,真的是這樣么?下面我們將問題復現以及優化方式進行展示。
02 演示及優化
環境檢查
DBLE版本:2.19.11.5
MySQL版本:5.7.28
涉及分片表:src_biz_filelist
分片鍵:batch_no
涉及垂直表:src_image_txn_jnl
分片拆分規則:stringhash
節點數量:4
2.1 原始語句及執行計劃
獨立子查詢語句如下:
select caption,batch_no,image_no from src_biz_filelist where batch_no in (SELECT imageid FROM src_image_txn_jnl WHERE front_seq_num = '001780568097' and txn_dt = '2029-08-20')
執行計劃如下:
mysql> explain select caption,batch_no,image_no from src_biz_filelist where batch_no in (SELECT imageid FROM src_image_txn_jnl WHERE front_seq_num = '001780568097' and txn_dt = '2029-08-20');
+-------------------+-----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| DATA_NODE | TYPE | SQL/REF |
+-------------------+-----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| dn1_0 | BASE SQL | select `src_biz_filelist`.`ysyid`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no`,`src_biz_filelist`.`caption`,`src_biz_filelist`.`create_time`,`src_biz_filelist`.`update_time`,`src_biz_filelist`.`img_seql`,`src_biz_filelist`.`data_date` from `src_biz_filelist` ORDER BY `src_biz_filelist`.`batch_no` ASC |
| dn2_0 | BASE SQL | select `src_biz_filelist`.`ysyid`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no`,`src_biz_filelist`.`caption`,`src_biz_filelist`.`create_time`,`src_biz_filelist`.`update_time`,`src_biz_filelist`.`img_seql`,`src_biz_filelist`.`data_date` from `src_biz_filelist` ORDER BY `src_biz_filelist`.`batch_no` ASC |
| dn3_0 | BASE SQL | select `src_biz_filelist`.`ysyid`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no`,`src_biz_filelist`.`caption`,`src_biz_filelist`.`create_time`,`src_biz_filelist`.`update_time`,`src_biz_filelist`.`img_seql`,`src_biz_filelist`.`data_date` from `src_biz_filelist` ORDER BY `src_biz_filelist`.`batch_no` ASC |
| dn4_0 | BASE SQL | select `src_biz_filelist`.`ysyid`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no`,`src_biz_filelist`.`caption`,`src_biz_filelist`.`create_time`,`src_biz_filelist`.`update_time`,`src_biz_filelist`.`img_seql`,`src_biz_filelist`.`data_date` from `src_biz_filelist` ORDER BY `src_biz_filelist`.`batch_no` ASC |
| merge_and_order_1 | MERGE_AND_ORDER | dn1_0; dn2_0; dn3_0; dn4_0 |
| shuffle_field_1 | SHUFFLE_FIELD | merge_and_order_1 |
| dn1_1 | BASE SQL | select `autoalias_src_image_txn_jnl`.`autoalias_scalar` from (select distinct `src_image_txn_jnl`.`imageid` as `autoalias_scalar` from `src_image_txn_jnl` where (`src_image_txn_jnl`.`front_seq_num` = '001780568097') AND (`src_image_txn_jnl`.`txn_dt` = '2029-08-20') order by autoalias_scalar ASC) autoalias_src_image_txn_jnl order by `autoalias_src_image_txn_jnl`.`autoalias_scalar` ASC |
| merge_1 | MERGE | dn1_1 |
| join_1 | JOIN | shuffle_field_1; merge_1 |
| shuffle_field_2 | SHUFFLE_FIELD | join_1 |
+-------------------+-----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
10 rows in set (0.00 sec)
**執行計劃可見,DBLE對語句進行了拆分。首先分別在4個數據節點全表掃描分片表src_biz_filelist,將各自返回的結果集在DBLE層合并排序。然后將子查詢結果去重排序后返回。最后在DBLE層進行匹配操作。當分片表數據量較大時,全部提取數據動作,即使不與子查詢中結果集匹配,效率也不會很好。
**
按照開發描述改寫子查詢語句賦予具體值,進行查看:
mysql> explain select caption,batch_no,image_no from src_biz_filelist where batch_no in ('0250002403989000IMMEDA200630095447WV','1650002432532000IMMEDA200630095233WS');
+-----------+----------+------------------------------------------------------------------------------------------------------------------------------------------------------+
| DATA_NODE | TYPE | SQL/REF |
+-----------+----------+------------------------------------------------------------------------------------------------------------------------------------------------------+
| dn4 | BASE SQL | select caption,batch_no,image_no from src_biz_filelist where batch_no in ('0250002403989000IMMEDA200630095447WV','1650002432532000IMMEDA200630095233WS') |
+-----------+----------+------------------------------------------------------------------------------------------------------------------------------------------------------+
1 rows in set (0.00 sec)
因為分片表src_biz_filelist的分片鍵為batch_no,語句直接下壓到了數據所在的dn4節點。驗證了開發描述的在使用獨立子查詢時,不論子查詢中結果集有幾個,語句都會卡死遲遲得不到返回結果。如果去掉子查詢賦予具體值,很快得到返回結果。程序進行語句拆分傳入具體值查詢,也是我們的優化方法之一。那是不是DBLE在獨立子查詢上真的沒有做相關優化呢?我們繼續操作。
**2.2 **獨立標量子查詢語句及執行計劃
如果明確子查詢結果單個值情況下,可以改寫使用獨立標量子查詢,執行計劃如下:
mysql> explain select caption,batch_no,image_no from src_biz_filelist where batch_no = (SELECT imageid FROM src_image_txn_jnl WHERE front_seq_num = '001780568097' and txn_dt = '2029-08-20');
+--------------------+-----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| DATA_NODE | TYPE | SQL/REF |
+--------------------+-----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| dn1_0 | BASE SQL | select `src_image_txn_jnl`.`imageid` as `autoalias_scalar` from `src_image_txn_jnl` where (`src_image_txn_jnl`.`front_seq_num` = '001780568097') AND (`src_image_txn_jnl`.`txn_dt` = '2029-08-20') limit 0,2 |
| merge_1 | MERGE | dn1_0 |
| scalar_sub_query_1 | SCALAR_SUB_QUERY | merge_1 |
| dn1_1 | BASE SQL(May No Need) | scalar_sub_query_1; select `src_biz_filelist`.`caption`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no` from `src_biz_filelist` where `src_biz_filelist`.`batch_no` = '{NEED_TO_REPLACE}' |
| dn2_0 | BASE SQL(May No Need) | scalar_sub_query_1; select `src_biz_filelist`.`caption`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no` from `src_biz_filelist` where `src_biz_filelist`.`batch_no` = '{NEED_TO_REPLACE}' |
| dn3_0 | BASE SQL(May No Need) | scalar_sub_query_1; select `src_biz_filelist`.`caption`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no` from `src_biz_filelist` where `src_biz_filelist`.`batch_no` = '{NEED_TO_REPLACE}' |
| dn4_0 | BASE SQL(May No Need) | scalar_sub_query_1; select `src_biz_filelist`.`caption`,`src_biz_filelist`.`batch_no`,`src_biz_filelist`.`image_no` from `src_biz_filelist` where `src_biz_filelist`.`batch_no` = '{NEED_TO_REPLACE}' |
| merge_2 | MERGE | dn1_1; dn2_0; dn3_0; dn4_0 |
| shuffle_field_1 | SHUFFLE_FIELD | merge_2 |
+--------------------+-----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
9 rows in set (0.00 sec)
執行計劃可見:dn1_0部分先將子查詢結果返回,出現了SCALAR_SUB_QUERY標量子查詢的標識,被引用到外部查詢中,語句直接下壓到了全部數據節點。這種情況下依據分片鍵進行等值查詢,效率是OK的。
****03 解決
通過查看原始語句在DBLE的執行計劃,下壓到MySQL時,由于DBLE對語句進行了重寫,并沒有使用到MySQL優化器改寫語句為semi join方式。在MySQL中需要使用子查詢完成的任務,很多時候可以使用連接方式來實現,那么我們在上層就重寫為join語句實驗一下。
查看DBLE文檔server.xml 系統參數配置部分有一個useJoinStrategy參數,該參數作用是開啟之后會嘗試判斷join兩邊的where來重新調整查詢SQL下發的順序,默認關閉。添加true至server.xml的標簽中,并重啟DBLE。
查看改寫之后語句執行計劃:
mysql> explain select t1.caption,t1.batch_no,t1.image_no from src_biz_filelist t1 inner join src_image_txn_jnl t2 on t1.batch_no=t2.imageid WHERE t2.front_seq_num = '001780568097' and t2.txn_dt = '2029-08-20';
+-------------------+-----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| DATA_NODE | TYPE | SQL/REF |
+-------------------+-----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| dn1_0 | BASE SQL | select `t2`.`id`,`t2`.`evt_id`,`t2`.`front_seq_num`,`t2`.`org_id`,`t2`.`teller_id`,`t2`.`txn_dt`,`t2`.`txn_tm`,`t2`.`txn_type_cd`,`t2`.`txn_cd`,`t2`.`sub_seq_num`,`t2`.`bcs_seq_num`,`t2`.`user_ccy_cd`,`t2`.`acct_num`,`t2`.`acct_ccy_cd`,`t2`.`acct_amt`,`t2`.`txn_ccy_cd`,`t2`.`txn_amt`,`t2`.`exch_rate`,`t2`.`resp_num`,`t2`.`proc_info`,`t2`.`auth_teller_id`,`t2`.`terminal_num`,`t2`.`cntrpty_acct_num`,`t2`.`cntrpty_acct_name`,`t2`.`vchr_type_encode`,`t2`.`vchr_num`,`t2`.`vchr_dt`,`t2`.`chk_teller_id`,`t2`.`vchr_main_file_desc`,`t2`.`vchr_accessory_desc`,`t2`.`biz_status_cd`,`t2`.`fin_txn_ind`,`t2`.`txn_complt_dt`,`t2`.`ident_chk_chker_id`,`t2`.`ident_chk_result_cd`,`t2`.`veri_seal_chker_id`,`t2`.`veri_seal_result_cd`,`t2`.`tel_che_chker_id`,`t2`.`tel_che_result_cd`,`t2`.`tel_che_comment`,`t2`.`pay_acct_doubt_type_cd`,`t2`.`rcv_acct_doubt_type_cd`,`t2`.`job_center`,`t2`.`imageid`,`t2`.`pay_name`,`t2`.`txn_desc`,`t2`.`origin_txn_seq_num`,`t2`.`txn_start_tm`,`t2`.`txn_memo`,`t2`.`bcs_or_back_txn_num`,`t2`.`bea_intrfc_num`,`t2`.`txn_auth_teller`,`t2`.`send_back_or_back_err_info`,`t2`.`data_date` from `src_image_txn_jnl` `t2` where (`t2`.`front_seq_num` = '001780568097') AND (`t2`.`txn_dt` = '2029-08-20') order by `t2`.`imageid` ASC |
| merge_1 | MERGE | dn1_0 |
| nest_loop_1 | NEST_LOOP | merge_1 |
| dn1_1 | BASE SQL(May No Need) | nest_loop_1's RESULTS; select `t1`.`ysyid`,`t1`.`batch_no`,`t1`.`image_no`,`t1`.`caption`,`t1`.`create_time`,`t1`.`update_time`,`t1`.`img_seql`,`t1`.`data_date` from `src_biz_filelist` `t1` where t1.batch_no IN ('{NEED_TO_REPLACE}') ORDER BY `t1`.`batch_no` ASC |
| dn2_0 | BASE SQL(May No Need) | nest_loop_1's RESULTS; select `t1`.`ysyid`,`t1`.`batch_no`,`t1`.`image_no`,`t1`.`caption`,`t1`.`create_time`,`t1`.`update_time`,`t1`.`img_seql`,`t1`.`data_date` from `src_biz_filelist` `t1` where t1.batch_no IN ('{NEED_TO_REPLACE}') ORDER BY `t1`.`batch_no` ASC |
| dn3_0 | BASE SQL(May No Need) | nest_loop_1's RESULTS; select `t1`.`ysyid`,`t1`.`batch_no`,`t1`.`image_no`,`t1`.`caption`,`t1`.`create_time`,`t1`.`update_time`,`t1`.`img_seql`,`t1`.`data_date` from `src_biz_filelist` `t1` where t1.batch_no IN ('{NEED_TO_REPLACE}') ORDER BY `t1`.`batch_no` ASC |
| dn4_0 | BASE SQL(May No Need) | nest_loop_1's RESULTS; select `t1`.`ysyid`,`t1`.`batch_no`,`t1`.`image_no`,`t1`.`caption`,`t1`.`create_time`,`t1`.`update_time`,`t1`.`img_seql`,`t1`.`data_date` from `src_biz_filelist` `t1` where t1.batch_no IN ('{NEED_TO_REPLACE}') ORDER BY `t1`.`batch_no` ASC |
| merge_and_order_1 | MERGE_AND_ORDER | dn1_1; dn2_0; dn3_0; dn4_0 |
| shuffle_field_1 | SHUFFLE_FIELD | merge_and_order_1 |
| join_1 | JOIN | nest_loop_1; shuffle_field_1 |
| shuffle_field_2 | SHUFFLE_FIELD | join_1 |
+-------------------+-----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
11 rows in set (0.00 sec)
執行計劃可見:dn1_0部分先將子查詢結果返回,出現了NEST_LOOP嵌套循環標識,嵌套循環的結果集被引用到外部查詢中,語句直接下壓到了全部數據節點。
開啟MySQL general_log 日志查看語句實際在MySQL層運行情況如下:
2021-05-12T08:58:09.324017+08:00 1080 Query select `t2`.`id`,`t2`.`evt_id`,`t2`.`front_seq_num`,`t2`.`org_id`,`t2`.`teller_id`,`t2`.`txn_dt`,`t2`.`txn_tm`,`t2`.`txn_type_cd`,`t2`.`txn_cd`,`t2`.`sub_seq_num`,`t2`.`bcs_seq_num`,`t2`.`user_ccy_cd`,`t2`.`acct_num`,`t2`.`acct_ccy_cd`,`t2`.`acct_amt`,`t2`.`txn_ccy_cd`,`t2`.`txn_amt`,`t2`.`exch_rate`,`t2`.`resp_num`,`t2`.`proc_info`,`t2`.`auth_teller_id`,`t2`.`terminal_num`,`t2`.`cntrpty_acct_num`,`t2`.`cntrpty_acct_name`,`t2`.`vchr_type_encode`,`t2`.`vchr_num`,`t2`.`vchr_dt`,`t2`.`chk_teller_id`,`t2`.`vchr_main_file_desc`,`t2`.`vchr_accessory_desc`,`t2`.`biz_status_cd`,`t2`.`fin_txn_ind`,`t2`.`txn_complt_dt`,`t2`.`ident_chk_chker_id`,`t2`.`ident_chk_result_cd`,`t2`.`veri_seal_chker_id`,`t2`.`veri_seal_result_cd`,`t2`.`tel_che_chker_id`,`t2`.`tel_che_result_cd`,`t2`.`tel_che_comment`,`t2`.`pay_acct_doubt_type_cd`,`t2`.`rcv_acct_doubt_type_cd`,`t2`.`job_center`,`t2`.`imageid`,`t2`.`pay_name`,`t2`.`txn_desc`,`t2`.`origin_txn_seq_num`,`t2`.`txn_start_tm`,`t2`.`txn_memo`,`t2`.`bcs_or_back_txn_num`,`t2`.`bea_intrfc_num`,`t2`.`txn_auth_teller`,`t2`.`send_back_or_back_err_info`,`t2`.`data_date` from `src_image_txn_jnl` `t2` where (`t2`.`front_seq_num` = '001780568097') AND (`t2`.`txn_dt` = '2029-08-20') order by `t2`.`imageid` ASC
2021-05-12T08:58:09.326400+08:00 1091 Query select `t1`.`ysyid`,`t1`.`batch_no`,`t1`.`image_no`,`t1`.`caption`,`t1`.`create_time`,`t1`.`update_time`,`t1`.`img_seql`,`t1`.`data_date` from `src_biz_filelist` `t1` where t1.batch_no IN ('0250002403989000IMMEDA200630095447WV', '0250002401013000IMMEDA200914092919JX') ORDER BY `t1`.`batch_no` ASC
語句執行順序變化,獨立子查詢被賦予了具體值,并按分片鍵進行數據路由的特性,下發到指定分片節點進行查詢。
需要注意的是關于nestLoop還有兩個參數nestLoopConnSize /nestLoopRowsSize 若臨時表行數大于這兩個值乘積,DBLE則會報告一個后端連接錯誤。
綜上優化方式有以下幾種:
1、程序進行語句拆分傳入具體值進行查詢。
2、如果明確子查詢結果單個值情況下,可以改寫使用獨立標量子查詢。
3、添加 true 參數,并改寫為連接查詢。
-
開發
+關注
關注
0文章
370瀏覽量
40843 -
數據庫
+關注
關注
7文章
3799瀏覽量
64393 -
MySQL
+關注
關注
1文章
809瀏覽量
26569
發布評論請先 登錄
相關推薦
評論