[TOC] # MERGE MERGE 算子用于将源表中的数据行以更新或插入的方式合并到目标表中。 OceanBase 数据库支持的 MERGE 算子包括 MERGE 和 MULTI PARTITION MERGE。 ## MERGE MERGE 算子用于合并数据表单个分区中的数据。 如下例所示,Q1 查询将 src\_tbl 表中的数据行合并到 t1 表中,对于 src\_tbl 中的每一条数据行按照如下方式进行合并: * 当 t1 中存在满足`t1.c1=src_tbl.c1`条件的数据行: * 如果满足`src_tbl.c2 > '100'`,则执行更新操作,将`t1.c2`的值置为`src_tbl.c2`的值(目标表中的每一行只会更新一次)。 * 如果不满足`src_tbl.c2 > '100'`,则不执行更新操作。 * 当 t1 中不存在满足`t1.c1 = src_tbl.c1`条件的数据行: * 如果满足`src_tbl.c1 > 10`,则执行插入操作,向 t1 中插入`(src_tbl.c1,src_tbl.c2)`。 * 如果不满足`src_tbl.c1 > 10`,则不指定插入操作。 ~~~ obclient>CREATE TABLE src_tbl (c1 INT PRIMARY KEY, c2 VARCHAR2(10)); Query OK, 0 rows affected (0.12 sec) obclient>CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 VARCHAR2(10)); Query OK, 0 rows affected (0.12 sec) obclient>CREATE TABLE t2 (c1 INT PRIMARY KEY, c2 VARCHAR2(10)) PARTITION BY HASH(c1) PARTITIONS 10; Query OK, 0 rows affected (0.12 sec) Q1: obclient>EXPLAIN MERGE INTO t1 USING src_tbl ON (t1.c1 = src_tbl.c1) WHEN MATCHED THEN UPDATE SET t1.c2 = src_tbl.c2 WHERE src_tbl.c2 > '100' WHEN NOT MATCHED THEN INSERT (t1.c1, t1.c2) VALUES (src_tbl.c1,src_tbl.c2) WHERE src_tbl.c1 > 10\G; *************************** 1. row *************************** Query Plan: =============================================== |ID|OPERATOR |NAME |EST. ROWS|COST | ----------------------------------------------- |0 |MERGE | |100001 |100001| |1 | MERGE OUTER JOIN| |100001 |219005| |2 | TABLE SCAN |SRC_TBL|100000 |61860 | |3 | TABLE SCAN |T1 |100000 |61860 | =============================================== Outputs & filters: ------------------------------------- 0 - output([column_conv(DECIMAL,PS:(38,0),NOT NULL,SRC_TBL.C1)], [column_conv(VARCHAR,utf8mb4_bin,length:10,NULL,SRC_TBL.C2)]), filter(nil), columns([{T1: ({T1: (T1.C1, T1.C2)})}]), partitions(p0), update([T1.C2=column_conv(VARCHAR,utf8mb4_bin,length:10,NULL,SRC_TBL.C2)]), match_conds([T1.C1 = SRC_TBL.C1]), insert_conds([SRC_TBL.C1 > 10]), update_conds([SRC_TBL.C2 > '100']), delete_conds(nil) 1 - output([SRC_TBL.C1], [SRC_TBL.C2], [T1.C1], [T1.C1 = SRC_TBL.C1], [T1.C2]), filter(nil), equal_conds([T1.C1 = SRC_TBL.C1]), other_conds(nil) 2 - output([SRC_TBL.C1], [SRC_TBL.C2]), filter(nil), access([SRC_TBL.C1], [SRC_TBL.C2]), partitions(p0) 3 - output([T1.C1], [T1.C2]), filter(nil), access([T1.C1], [T1.C2]), partitions(p0) ~~~ 其中,OUTER JOIN 是合并功能实现时依赖的一次联接操作,使用 MERGE 算子时,一定会在`source_table`和`target_table`上做一次外联接,目的是为了区分哪些行是匹配的,哪些是不匹配的。 执行计划展示中的 outputs & filters 详细列出了 MERGE 算子的输出信息如下: <table data-tag="table" id="table-029-rn7-72h" class="table"><colgroup span="1" width="200" data-tag="col" id="col-sib-egh-omi" colwidth="1*" colnum="1" colname="col1" style="width:50%" class="col"></colgroup><colgroup span="1" width="540" data-tag="col" id="col-j0b-f4r-36c" colwidth="1*" colnum="2" colname="col2" style="width:50%" class="col"></colgroup><thead id="thead-1cd-o30-wqv" class="thead"><tr id="tr-jwy-6jk-ov8"><th id="td-xh8-kvg-uow"><p id="p-ib4-jx4-7pp">信息名称</p></th><th id="td-f55-ex2-576"><p id="p-9uu-nxr-lr5">含义</p></th></tr></thead><tbody data-tag="tbody" id="tbody-wo3-2he-mel" class="tbody"><tr data-tag="tr" id="tr-y8k-vzv-xyh" class="tr"><td data-tag="td" id="td-hww-gv4-8im" class="td"><p id="p-iik-18a-k3j">output</p></td><td data-tag="td" id="td-ws5-pat-cdc" class="td"><p id="p-bib-p96-tzr">该算子输出的表达式。</p></td></tr><tr data-tag="tr" id="tr-y9k-018-50d" class="tr"><td data-tag="td" id="td-nk4-ymn-ywo" class="td"><p id="p-bxm-wpr-dbo">filter</p></td><td data-tag="td" id="td-ph4-xwz-don" class="td"><p id="p-rec-4v1-y3e">该算子上的过滤条件。</p><p id="p-a2j-erw-mhh">由于示例中 MERGE 算子没有设置 filter,所以为 nil。</p></td></tr><tr data-tag="tr" id="tr-4tq-j63-5c3" class="tr"><td data-tag="td" id="td-w0s-nga-lqq" class="td"><p data-tag="p" id="p-0cf-riv-gfq" class="p">columns</p></td><td data-tag="td" id="td-u14-lo7-p35" class="td"><p data-tag="p" id="p-iwk-icl-ndk" class="p">插入操作涉及的数据表的列。</p></td></tr><tr data-tag="tr" id="tr-jik-izx-jrg" class="tr"><td data-tag="td" id="td-b9q-1fn-rdk" class="td"><p id="p-yjq-puf-53c">partitions</p></td><td data-tag="td" id="td-3q8-u0a-34m" class="td"><p id="p-v0e-lt2-k0t">插入操作涉及到的数据表的分区。</p></td></tr><tr data-tag="tr" id="tr-avl-rmz-6z8" class="tr"><td data-tag="td" id="td-qhy-l7d-4sq" class="td"><p data-tag="p" id="p-nam-toj-vo7" class="p">update</p></td><td data-tag="td" id="td-4rg-nff-p7r" class="td"><p data-tag="p" id="p-vxs-5ji-sis" class="p">更新操作中所有的赋值表达式。</p></td></tr><tr data-tag="tr" id="tr-bmf-1vm-010" class="tr"><td data-tag="td" id="td-922-t9r-5cy" class="td"><p data-tag="p" id="p-ern-k0r-0ax" class="p">match_conds</p></td><td data-tag="td" id="td-m8f-zwj-wd3" class="td"><p data-tag="p" id="p-uqc-xnk-wzk" class="p">源表和目标表进行匹配的条件。</p></td></tr><tr data-tag="tr" id="tr-4o1-pyt-e74" class="tr"><td data-tag="td" id="td-w3j-8o6-liq" class="td"><p data-tag="p" id="p-7po-x7y-yw1" class="p">insert_conds</p></td><td data-tag="td" id="td-r1c-thp-39o" class="td"><p data-tag="p" id="p-q8s-uqg-bw8" class="p">插入操作需要满足的条件。</p></td></tr><tr data-tag="tr" id="tr-egu-w7h-eou" class="tr"><td data-tag="td" id="td-vag-l6z-zfr" class="td"><p data-tag="p" id="p-97l-q6e-e6p" class="p">update_conds</p></td><td data-tag="td" id="td-88a-vpc-rll" class="td"><p data-tag="p" id="p-bxm-lmu-m5h" class="p">更新操作需要满足的条件。</p></td></tr><tr data-tag="tr" id="tr-xip-6c9-agq" class="tr"><td data-tag="td" id="td-8a1-zcr-92g" class="td"><p data-tag="p" id="p-l8m-7lm-7om" class="p">delete_conds</p></td><td data-tag="td" id="td-31w-joc-q78" class="td"><p data-tag="p" id="p-kvi-w3j-9ye" class="p">删除操作需要满足的条件。</p></td></tr></tbody></table> ## MULTI PARTITION MERGE MULTI PARTITION MERGE 算子用于合并数据表多个分区中的数据。 如下例所示,Q2 查询将 src\_tbl 表中的数据行合并到分区表 t2 表中,对于 src\_tbl 中的每一条数据行按照如下方式进行合并: * 当 t2 中存在满足`t2.c1 = src_tbl.c1`条件的数据行: * 执行更新操作,将`t2.c2`的值置为`substr(src_tbl.c2, 1, 5)`的值(目标表中的每一行只会更新一次)。 * 更新完成后,如果满足`t2.c2 > '80000'`,则删除对应的数据行。 * 当 t2 中不存在满足`t2.c1 = src_tbl.c1`条件的数据行,执行插入操作,向 t2 中插入`(src_tbl.c1, src_tbl.c2)`。 ~~~ Q2: obclient>EXPLAIN MERGE INTO t2 USING SRC_TBL ON (t2.c1 = src_tbl.c1) WHEN MATCHED THEN UPDATE SET t2.c2 = SUBSTR(src_tbl.c2, 1, 5) DELETE WHERE t2.c2 > '80000' WHEN NOT MATCHED THEN INSERT (t2.c1, t2.c2) VALUES (src_tbl.c1,src_tbl.c2)\G; *************************** 1. row *************************** Query Plan: ============================================================ |ID|OPERATOR |NAME |EST. ROWS|COST | -------------------------------------------------------------- |0 |MULTI PARTITION MERGE | |100000 |100000 | |1 | PX COORDINATOR | |100000 |956685 | |2 | EXCHANGE OUT DISTR |:EX10001|100000 |899889 | |3 | MERGE OUTER JOIN | |100000 |899889 | |4 | EXCHANGE IN DISTR | |100000 |90258 | |5 | EXCHANGE OUT DISTR (PKEY)|:EX10000|100000 |61860 | |6 | TABLE SCAN |SRC_TBL |100000 |61860 | |7 | SORT | |1000000 |5447108| |8 | PX PARTITION ITERATOR | |1000000 |618524 | |9 | TABLE SCAN |T2 |1000000 |618524 | ============================================================== Outputs & filters: ------------------------------------- 0 - output([column_conv(DECIMAL,PS:(38,0),NOT NULL,SRC_TBL.C1)], [column_conv(VARCHAR,utf8mb4_bin,length:10,NULL,SRC_TBL.C2)]), filter(nil), columns([{T2: ({T2: (T2.C1, T2.C2)})}]), partitions(p[0-9]), update([T2.C2=column_conv(VARCHAR,utf8mb4_bin,length:10,NULL,SUBSTR(SRC_TBL.C2, 1, 5))]), match_conds([T2.C1 = SRC_TBL.C1]), insert_conds(nil), update_conds(nil), delete_conds([T2.C2 > '80000']) 1 - output([SRC_TBL.C1], [SRC_TBL.C2], [T2.C1], [T2.C1 = SRC_TBL.C1], [T2.C2]), filter(nil) 2 - output([SRC_TBL.C1], [SRC_TBL.C2], [T2.C1], [T2.C1 = SRC_TBL.C1], [T2.C2]), filter(nil), dop=1 3 - output([SRC_TBL.C1], [SRC_TBL.C2], [T2.C1], [T2.C1 = SRC_TBL.C1], [T2.C2]), filter(nil), equal_conds([T2.C1 = SRC_TBL.C1]), other_conds(nil) 4 - output([SRC_TBL.C1], [SRC_TBL.C2]), filter(nil) 5 - (#keys=1, [SRC_TBL.C1]), output([SRC_TBL.C1], [SRC_TBL.C2]), filter(nil), is_single, dop=1 6 - output([SRC_TBL.C1], [SRC_TBL.C2]), filter(nil), access([SRC_TBL.C1], [SRC_TBL.C2]), partitions(p0) 7 - output([T2.C1], [T2.C2]), filter(nil), sort_keys([T2.C1, ASC]), local merge sort 8 - output([T2.C1], [T2.C2]), filter(nil) 9 - output([T2.C1], [T2.C2]), filter(nil), access([T2.C1], [T2.C2]), partitions(p[0-9]) ~~~ 上述示例的执行计划展示中的 outputs & filters 详细列出了 MULTI PARTITION MERGE 算子的输出信息,字段的含义与 MERGE 算子相同。