feat: import more join strategies for merge into (new distributed and standalone strategy) #13950
Pull request description must contain CLA like the following:
2274b1f to 91d7fef
3260e7f to 78ae91f
…import_upper_optimizer_for_merge_into
Waiting for #14011 to merge. @SkyFan2002 @dantengsky Let me run the long-run test and the wizard check first.
Pass mergeinto2 wizard in standalone mode:
Preparing to run MERGE-INTO-C1...
Executing command: bendsql --query=-- MERGE-INTO-C1: Asset Types Distribution
SELECT asset_type, COUNT(*) AS count
FROM assets
GROUP BY asset_type
ORDER BY count DESC, asset_type
LIMIT 13 -D mergeinto
Command executed successfully. Output:
BTC 104817
ETH 104808
NEW_ASSET 100000
XRP 90375
Executing command: snowsql --query -- MERGE-INTO-C1: Asset Types Distribution
SELECT asset_type, COUNT(*) AS count
FROM assets
GROUP BY asset_type
ORDER BY count DESC, asset_type
LIMIT 13 --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
BTC 104817
ETH 104808
NEW_ASSET 100000
XRP 90375
OK - MERGE-INTO-C1
BTC 104817
ETH 104808
NEW_ASSET 100000
XRP 90375
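Each check in this log sends the same SQL to `bendsql` and `snowsql` and compares the text they print. The sketch below shows how the two command lines seen above could be assembled. The helper names `bendsql_argv` and `snowsql_argv` are hypothetical, only flags that appear in the log are used, and the wizard's real implementation is not shown in this thread.

```python
from typing import List


def bendsql_argv(query: str, database: str) -> List[str]:
    """Build a bendsql command line using the flags seen in the log."""
    return ["bendsql", f"--query={query}", "-D", database]


def snowsql_argv(query: str, database: str,
                 schema: str = "PUBLIC",
                 warehouse: str = "COMPUTE_WH") -> List[str]:
    """Build a snowsql command line using the flags seen in the log."""
    return [
        "snowsql", "--query", query,
        "--dbname", database,
        "--schemaname", schema,
        # Plain TSV output without header, timing or banner lines,
        # so that both tools print comparable text.
        "-o", "output_format=tsv",
        "-o", "header=false",
        "-o", "timing=false",
        "-o", "friendly=false",
        "--warehouse", warehouse,
    ]


query = "SELECT asset_type, COUNT(*) AS count FROM assets GROUP BY asset_type"
print(bendsql_argv(query, "mergeinto"))
print(snowsql_argv(query, "mergeinto"))
```

Passing the arguments as a list (for example to `subprocess.run`) avoids shell quoting problems with multi-line SQL.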
Preparing to run MERGE-INTO-C2...
Executing command: bendsql --query=
-- MERGE-INTO-C2: Aggregated Quantity and Value Statistics
SELECT SUM(quantity) AS total_quantity,
AVG(quantity) AS average_quantity,
SUM(value) AS total_value,
AVG(value) AS average_value
FROM assets -D mergeinto
Command executed successfully. Output:
160342725.20941540 400.856813023538 1603427252.09414242 4008.568130235356
Executing command: snowsql --query
-- MERGE-INTO-C2: Aggregated Quantity and Value Statistics
SELECT SUM(quantity) AS total_quantity,
AVG(quantity) AS average_quantity,
SUM(value) AS total_value,
AVG(value) AS average_value
FROM assets --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
160342725.20941540 400.856813023539 1603427252.09414242 4008.568130235356
DIFFERENCE FOUND
MERGE-INTO-C2:
-- MERGE-INTO-C2: Aggregated Quantity and Value Statistics
SELECT SUM(quantity) AS total_quantity,
AVG(quantity) AS average_quantity,
SUM(value) AS total_value,
AVG(value) AS average_value
FROM assets
Differences:
bendsql:
160342725.20941540 400.856813023538 1603427252.09414242 4008.568130235356
snowsql:
160342725.20941540 400.856813023539 1603427252.09414242 4008.568130235356
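The two results differ only in the last digit of `AVG(quantity)`. One plausible explanation, which this thread does not confirm, is that one engine truncates the decimal quotient at 12 places while the other rounds it. The sketch below reproduces both values from the `SUM(quantity)` printed above and the row count implied by MERGE-INTO-C1; the function name `avg_at_scale` is hypothetical.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

# 12 decimal places, matching the AVG column printed in the log.
SCALE = Decimal("0.000000000001")


def avg_at_scale(total: str, count: int, rounding: str) -> Decimal:
    """Divide total by count and cut the quotient to 12 decimal places."""
    return (Decimal(total) / Decimal(count)).quantize(SCALE, rounding=rounding)


total_quantity = "160342725.20941540"          # SUM(quantity) from MERGE-INTO-C2
row_count = 104817 + 104808 + 100000 + 90375   # per-type counts from MERGE-INTO-C1

truncated = avg_at_scale(total_quantity, row_count, ROUND_DOWN)
rounded = avg_at_scale(total_quantity, row_count, ROUND_HALF_UP)

print(truncated)  # 400.856813023538 (the bendsql value)
print(rounded)    # 400.856813023539 (the snowsql value)
```

The exact quotient is 400.8568130235385, so the thirteenth decimal digit is a 5 and the two rounding modes land on different twelfth digits.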
Preparing to run MERGE-INTO-C3...
Executing command: bendsql --query=
-- MERGE-INTO-C3: Assets Counts by User
SELECT user_id, COUNT(*) AS count
FROM assets
GROUP BY user_id
ORDER BY count DESC, user_id
LIMIT 13 -D mergeinto
Command executed successfully. Output:
0 5
2 5
4 5
6 5
8 5
10 5
12 5
14 5
16 5
18 5
20 5
22 5
24 5
Executing command: snowsql --query
-- MERGE-INTO-C3: Assets Counts by User
SELECT user_id, COUNT(*) AS count
FROM assets
GROUP BY user_id
ORDER BY count DESC, user_id
LIMIT 13 --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
0 5
2 5
4 5
6 5
8 5
10 5
12 5
14 5
16 5
18 5
20 5
22 5
24 5
OK - MERGE-INTO-C3
0 5
2 5
4 5
6 5
8 5
10 5
12 5
14 5
16 5
18 5
20 5
22 5
24 5
Preparing to run MERGE-INTO-C4...
Executing command: bendsql --query=
-- MERGE-INTO-C4: Date Range Analysis of Last Update
SELECT CASE
WHEN last_updated < '2022-01-01' THEN 'Before 2022'
ELSE 'After 2021-12-31'
END AS date_range,
COUNT(*) AS count
FROM assets
GROUP BY date_range
ORDER BY date_range
LIMIT 13 -D mergeinto
Command executed successfully. Output:
After 2021-12-31 360014
Before 2022 39986
Executing command: snowsql --query
-- MERGE-INTO-C4: Date Range Analysis of Last Update
SELECT CASE
WHEN last_updated < '2022-01-01' THEN 'Before 2022'
ELSE 'After 2021-12-31'
END AS date_range,
COUNT(*) AS count
FROM assets
GROUP BY date_range
ORDER BY date_range
LIMIT 13 --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
After 2021-12-31 360014
Before 2022 39986
OK - MERGE-INTO-C4
After 2021-12-31 360014
Before 2022 39986
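MERGE-INTO-C1 and MERGE-INTO-C4 each split the `assets` table into disjoint groups, so their counts should add up to the same total. A quick cross-check using the numbers printed above (the variable names are illustrative only):

```python
# Counts per asset_type, from MERGE-INTO-C1.
by_asset_type = {"BTC": 104817, "ETH": 104808, "NEW_ASSET": 100000, "XRP": 90375}

# Counts per date_range, from MERGE-INTO-C4.
by_date_range = {"After 2021-12-31": 360014, "Before 2022": 39986}

total_by_type = sum(by_asset_type.values())
total_by_range = sum(by_date_range.values())

print(total_by_type, total_by_range)  # 400000 400000
assert total_by_type == total_by_range
```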
Preparing to run MERGE-INTO-C5...
Executing command: bendsql --query=
-- MERGE-INTO-C5: General Status Distribution
SELECT status, COUNT(*) AS count
FROM orders
GROUP BY status
ORDER BY count DESC, status
LIMIT 13 -D mergeinto
Command executed successfully. Output:
completed 526479
pending 522450
cancelled 451071
above_avg 100418
below_avg 100075
Pending 100000
avg 49532
Executing command: snowsql --query
-- MERGE-INTO-C5: General Status Distribution
SELECT status, COUNT(*) AS count
FROM orders
GROUP BY status
ORDER BY count DESC, status
LIMIT 13 --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
completed 526479
pending 522450
cancelled 451071
above_avg 100418
below_avg 100075
Pending 100000
avg 49532
OK - MERGE-INTO-C5
completed 526479
pending 522450
cancelled 451071
above_avg 100418
below_avg 100075
Pending 100000
avg 49532
Preparing to run MERGE-INTO-C6...
Executing command: bendsql --query=
-- MERGE-INTO-C6: General Quantity Statistics
SELECT SUM(quantity) AS total_quantity,
AVG(quantity) AS average_quantity,
MIN(quantity) AS min_quantity,
MAX(quantity) AS max_quantity
FROM orders -D mergeinto
Command executed successfully. Output:
176507520.66613014 95.408181330592 0.00005807 435.53925916
Executing command: snowsql --query
-- MERGE-INTO-C6: General Quantity Statistics
SELECT SUM(quantity) AS total_quantity,
AVG(quantity) AS average_quantity,
MIN(quantity) AS min_quantity,
MAX(quantity) AS max_quantity
FROM orders --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
176507520.66613014 95.408181330593 0.00005807 435.53925916
DIFFERENCE FOUND
MERGE-INTO-C6:
-- MERGE-INTO-C6: General Quantity Statistics
SELECT SUM(quantity) AS total_quantity,
AVG(quantity) AS average_quantity,
MIN(quantity) AS min_quantity,
MAX(quantity) AS max_quantity
FROM orders
Differences:
bendsql:
176507520.66613014 95.408181330592 0.00005807 435.53925916
snowsql:
176507520.66613014 95.408181330593 0.00005807 435.53925916
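A last-digit difference like this one is reported as a failure because the outputs are compared as text. If such differences are considered acceptable, the comparison could allow a small numeric tolerance. The following is a minimal sketch of that idea, not the wizard's actual comparison logic; `fields_match` and `rows_match` are hypothetical helpers and the default tolerance of one unit in the twelfth decimal place is an assumption.

```python
from decimal import Decimal, InvalidOperation


def fields_match(a: str, b: str, tolerance: Decimal) -> bool:
    """Compare two fields numerically when possible, else as plain text."""
    try:
        return abs(Decimal(a) - Decimal(b)) <= tolerance
    except InvalidOperation:
        # Not numeric (for example an asset type such as "BTC").
        return a == b


def rows_match(left: str, right: str,
               tolerance: Decimal = Decimal("0.000000000001")) -> bool:
    """Compare two whitespace-separated output rows field by field."""
    l_fields, r_fields = left.split(), right.split()
    if len(l_fields) != len(r_fields):
        return False
    return all(fields_match(x, y, tolerance) for x, y in zip(l_fields, r_fields))


bend = "176507520.66613014 95.408181330592 0.00005807 435.53925916"
snow = "176507520.66613014 95.408181330593 0.00005807 435.53925916"
print(rows_match(bend, snow))                 # True
print(rows_match(bend, snow, Decimal("0")))   # False
```

Using `Decimal` rather than `float` keeps the comparison exact at the printed scale.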
Preparing to run MERGE-INTO-C7...
Executing command: bendsql --query=
-- MERGE-INTO-C7: New Orders vs Existing Orders Count
SELECT CASE
WHEN order_id > 500000 THEN 'New Order'
ELSE 'Existing Order'
END AS order_category,
COUNT(*) AS count
FROM orders
GROUP BY order_category
ORDER BY count DESC
LIMIT 13 -D mergeinto
Command executed successfully. Output:
New Order 1099996
Existing Order 750029
Executing command: snowsql --query
-- MERGE-INTO-C7: New Orders vs Existing Orders Count
SELECT CASE
WHEN order_id > 500000 THEN 'New Order'
ELSE 'Existing Order'
END AS order_category,
COUNT(*) AS count
FROM orders
GROUP BY order_category
ORDER BY count DESC
LIMIT 13 --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
New Order 1099996
Existing Order 750029
OK - MERGE-INTO-C7
New Order 1099996
Existing Order 750029
Preparing to run MERGE-INTO-C8...
Executing command: bendsql --query=
-- MERGE-INTO-C8: Order Type Distribution
SELECT order_type, COUNT(*) AS count
FROM orders
GROUP BY order_type
ORDER BY count DESC, order_type
LIMIT 13 -D mergeinto
Command executed successfully. Output:
buy 976729
sell 873296
Executing command: snowsql --query
-- MERGE-INTO-C8: Order Type Distribution
SELECT order_type, COUNT(*) AS count
FROM orders
GROUP BY order_type
ORDER BY count DESC, order_type
LIMIT 13 --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
buy 976729
sell 873296
OK - MERGE-INTO-C8
buy 976729
sell 873296
Preparing to run MERGE-INTO-C9...
Executing command: bendsql --query=
-- MERGE-INTO-C9: Date Range Analysis
SELECT CASE
WHEN created_at < '2022-01-01' THEN 'Before 2022'
WHEN created_at BETWEEN '2021-01-01' AND '2021-06-30' THEN 'First Half 2021'
ELSE 'After 2021-06-30'
END AS date_range,
COUNT(*) AS count
FROM orders
GROUP BY date_range
ORDER BY date_range
LIMIT 13 -D mergeinto
Command executed successfully. Output:
After 2021-06-30 100000
Before 2022 1750025
Executing command: snowsql --query
-- MERGE-INTO-C9: Date Range Analysis
SELECT CASE
WHEN created_at < '2022-01-01' THEN 'Before 2022'
WHEN created_at BETWEEN '2021-01-01' AND '2021-06-30' THEN 'First Half 2021'
ELSE 'After 2021-06-30'
END AS date_range,
COUNT(*) AS count
FROM orders
GROUP BY date_range
ORDER BY date_range
LIMIT 13 --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
After 2021-06-30 100000
Before 2022 1750025
OK - MERGE-INTO-C9
After 2021-06-30 100000
Before 2022 1750025
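Neither engine returns a 'First Half 2021' row here. That follows from how `CASE` is evaluated: branches are tested in order and the first match wins, so every date in the first half of 2021 is already caught by the `< '2022-01-01'` branch. The sketch below mirrors the expression in Python to make this visible. It assumes `created_at` can be treated as a plain date, and the function name `date_range` is illustrative.

```python
from datetime import date


def date_range(created_at: date) -> str:
    """Mirror the CASE expression of MERGE-INTO-C9, branch by branch."""
    if created_at < date(2022, 1, 1):
        return "Before 2022"
    if date(2021, 1, 1) <= created_at <= date(2021, 6, 30):
        # Unreachable: any such date already matched the first branch.
        return "First Half 2021"
    return "After 2021-06-30"


print(date_range(date(2021, 3, 15)))  # Before 2022
print(date_range(date(2023, 5, 1)))   # After 2021-06-30
```

The query still serves its purpose as a consistency check, since both engines are expected to apply the same evaluation order.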
Preparing to run MERGE-INTO-C10...
Executing command: bendsql --query=
-- MERGE-INTO-C10: Price Analysis After Adjustments
SELECT SUM(price) AS total_price,
AVG(price) AS average_price,
MIN(price) AS min_price,
MAX(price) AS max_price
FROM orders -D mergeinto
Command executed successfully. Output:
925670420.59100828 500.355627946113 0.00058074 999.99408411
Executing command: snowsql --query
-- MERGE-INTO-C10: Price Analysis After Adjustments
SELECT SUM(price) AS total_price,
AVG(price) AS average_price,
MIN(price) AS min_price,
MAX(price) AS max_price
FROM orders --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
925670420.59100828 500.355627946113 0.00058074 999.99408411
OK - MERGE-INTO-C10
925670420.59100828 500.355627946113 0.00058074 999.99408411
Preparing to run MERGE-INTO-C11...
Executing command: bendsql --query=
-- MERGE-INTO-C11: Transaction Types Distribution
SELECT transaction_type, COUNT(*) AS count
FROM transactions
GROUP BY transaction_type
ORDER BY count DESC, transaction_type
LIMIT 13 -D mergeinto
Command executed successfully. Output:
trade 740367
withdrawal 359266
deposit 358898
Executing command: snowsql --query
-- MERGE-INTO-C11: Transaction Types Distribution
SELECT transaction_type, COUNT(*) AS count
FROM transactions
GROUP BY transaction_type
ORDER BY count DESC, transaction_type
LIMIT 13 --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
trade 740367
withdrawal 359266
deposit 358898
OK - MERGE-INTO-C11
trade 740367
withdrawal 359266
deposit 358898
Preparing to run MERGE-INTO-C12...
Executing command: bendsql --query=
-- MERGE-INTO-C12: Aggregated Quantity Statistics
SELECT SUM(quantity) AS total_quantity,
AVG(quantity) AS average_quantity,
MIN(quantity) AS min_quantity,
MAX(quantity) AS max_quantity
FROM transactions -D mergeinto
Command executed successfully. Output:
112084766.86573024 76.847709692649 0.00017860 664.06985910
Executing command: snowsql --query
-- MERGE-INTO-C12: Aggregated Quantity Statistics
SELECT SUM(quantity) AS total_quantity,
AVG(quantity) AS average_quantity,
MIN(quantity) AS min_quantity,
MAX(quantity) AS max_quantity
FROM transactions --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
112084766.86573024 76.847709692650 0.00017860 664.06985910
DIFFERENCE FOUND
MERGE-INTO-C12:
-- MERGE-INTO-C12: Aggregated Quantity Statistics
SELECT SUM(quantity) AS total_quantity,
AVG(quantity) AS average_quantity,
MIN(quantity) AS min_quantity,
MAX(quantity) AS max_quantity
FROM transactions
Differences:
bendsql:
112084766.86573024 76.847709692649 0.00017860 664.06985910
snowsql:
112084766.86573024 76.847709692650 0.00017860 664.06985910
Preparing to run MERGE-INTO-C13...
Executing command: bendsql --query=
-- MERGE-INTO-C13: Transaction Counts by User and Asset Type
SELECT user_id, asset_type, COUNT(*) AS count
FROM transactions
GROUP BY user_id, asset_type
ORDER BY count DESC, user_id, asset_type
LIMIT 13 -D mergeinto
Command executed successfully. Output:
804 ETH 15
1216 ETH 15
216 BTC 14
425 ETH 14
844 BTC 14
1231 ETH 14
1539 ETH 14
1603 ETH 14
1926 ETH 14
2609 BTC 14
2704 ETH 14
2827 ETH 14
2841 BTC 14
Executing command: snowsql --query
-- MERGE-INTO-C13: Transaction Counts by User and Asset Type
SELECT user_id, asset_type, COUNT(*) AS count
FROM transactions
GROUP BY user_id, asset_type
ORDER BY count DESC, user_id, asset_type
LIMIT 13 --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
804 ETH 15
1216 ETH 15
216 BTC 14
425 ETH 14
844 BTC 14
1231 ETH 14
1539 ETH 14
1603 ETH 14
1926 ETH 14
2609 BTC 14
2704 ETH 14
2827 ETH 14
2841 BTC 14
OK - MERGE-INTO-C13
804 ETH 15
1216 ETH 15
216 BTC 14
425 ETH 14
844 BTC 14
1231 ETH 14
1539 ETH 14
1603 ETH 14
1926 ETH 14
2609 BTC 14
2704 ETH 14
2827 ETH 14
2841 BTC 14
Preparing to run MERGE-INTO-C14...
Executing command: bendsql --query=
-- MERGE-INTO-C14: Date Range Analysis of Transactions
SELECT CASE
WHEN transaction_time < '2022-01-01' THEN 'Before 2022'
ELSE 'After 2021-12-31'
END AS date_range,
COUNT(*) AS count
FROM transactions
GROUP BY date_range
ORDER BY date_range
LIMIT 13 -D mergeinto
Command executed successfully. Output:
After 2021-12-31 433526
Before 2022 1025005
Executing command: snowsql --query
-- MERGE-INTO-C14: Date Range Analysis of Transactions
SELECT CASE
WHEN transaction_time < '2022-01-01' THEN 'Before 2022'
ELSE 'After 2021-12-31'
END AS date_range,
COUNT(*) AS count
FROM transactions
GROUP BY date_range
ORDER BY date_range
LIMIT 13 --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
After 2021-12-31 433526
Before 2022 1025005
OK - MERGE-INTO-C14
After 2021-12-31 433526
Before 2022 1025005
Preparing to run MERGE-INTO-C15...
Executing command: bendsql --query=
-- MERGE-INTO-C15: asserts
SELECT asset_type, SUM(quantity) AS total_quantity, SUM(value) AS total_value
FROM assets
GROUP BY asset_type ORDER BY asset_type ASC -D mergeinto
Command executed successfully. Output:
BTC 54537998.34092175 545379983.40921303
ETH 50387362.73309448 503873627.33094612
NEW_ASSET 10000000.00000000 100000000.00000000
XRP 45417364.13539917 454173641.35398327
Executing command: snowsql --query
-- MERGE-INTO-C15: asserts
SELECT asset_type, SUM(quantity) AS total_quantity, SUM(value) AS total_value
FROM assets
GROUP BY asset_type ORDER BY asset_type ASC --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
BTC 54537998.34092175 545379983.40921303
ETH 50387362.73309448 503873627.33094612
NEW_ASSET 10000000.00000000 100000000.00000000
XRP 45417364.13539917 454173641.35398327
OK - MERGE-INTO-C15
BTC 54537998.34092175 545379983.40921303
ETH 50387362.73309448 503873627.33094612
NEW_ASSET 10000000.00000000 100000000.00000000
XRP 45417364.13539917 454173641.35398327
Preparing to run MERGE-INTO-C16...
Executing command: bendsql --query=
-- MERGE-INTO-C16: orders
SELECT asset_type, SUM(quantity) AS total_quantity, AVG(price) AS average_price
FROM orders
GROUP BY asset_type ORDER BY asset_type ASC -D mergeinto
Command executed successfully. Output:
BTC 63828821.79126546 518.585135705821
ETH 58825213.47902498 482.677325160081
NEW_ORDER 5000000.00000000 500.000000000000
XRP 48853485.39583970 499.634938778944
Executing command: snowsql --query
-- MERGE-INTO-C16: orders
SELECT asset_type, SUM(quantity) AS total_quantity, AVG(price) AS average_price
FROM orders
GROUP BY asset_type ORDER BY asset_type ASC --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
BTC 63828821.79126546 518.585135705821
ETH 58825213.47902498 482.677325160081
NEW_ORDER 5000000.00000000 500.000000000000
XRP 48853485.39583970 499.634938778945
DIFFERENCE FOUND
MERGE-INTO-C16:
-- MERGE-INTO-C16: orders
SELECT asset_type, SUM(quantity) AS total_quantity, AVG(price) AS average_price
FROM orders
GROUP BY asset_type ORDER BY asset_type ASC
Differences:
bendsql:
BTC 63828821.79126546 518.585135705821
ETH 58825213.47902498 482.677325160081
NEW_ORDER 5000000.00000000 500.000000000000
XRP 48853485.39583970 499.634938778944
snowsql:
BTC 63828821.79126546 518.585135705821
ETH 58825213.47902498 482.677325160081
NEW_ORDER 5000000.00000000 500.000000000000
XRP 48853485.39583970 499.634938778945
Preparing to run MERGE-INTO-C17...
Executing command: bendsql --query=
-- MERGE-INTO-C17: transactions
SELECT transaction_type, SUM(quantity) AS total_quantity
FROM transactions
GROUP BY transaction_type ORDER BY transaction_type ASC -D mergeinto
Command executed successfully. Output:
deposit 20055537.48778317
trade 73372088.64123127
withdrawal 18657140.73671580
Executing command: snowsql --query
-- MERGE-INTO-C17: transactions
SELECT transaction_type, SUM(quantity) AS total_quantity
FROM transactions
GROUP BY transaction_type ORDER BY transaction_type ASC --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
deposit 20055537.48778317
trade 73372088.64123127
withdrawal 18657140.73671580
OK - MERGE-INTO-C17
deposit 20055537.48778317
trade 73372088.64123127
withdrawal 18657140.73671580
Pass mergeinto in distributed mode:
```sql
Preparing to run MERGE-INTO-C1...
Executing command: bendsql --query=-- MERGE-INTO-C1: Asset Types Distribution
SELECT asset_type, COUNT(*) AS count
FROM assets
GROUP BY asset_type
ORDER BY count DESC, asset_type
LIMIT 13 -D mergeinto
Command executed successfully. Output:
NEW_ASSET 50000
BTC 34939
ETH 34936
XRP 30125
Executing command: snowsql --query -- MERGE-INTO-C1: Asset Types Distribution
OK - MERGE-INTO-C1
Preparing to run MERGE-INTO-C2...
-- MERGE-INTO-C2: Aggregated Quantity and Value Statistics
Executing command: snowsql --query
-- MERGE-INTO-C2: Aggregated Quantity and Value Statistics
DIFFERENCE FOUND
MERGE-INTO-C2:
-- MERGE-INTO-C2: Aggregated Quantity and Value Statistics
bendsql:
snowsql:
Preparing to run MERGE-INTO-C3...
-- MERGE-INTO-C3: Assets Counts by User
Executing command: snowsql --query
-- MERGE-INTO-C3: Assets Counts by User
OK - MERGE-INTO-C3
Preparing to run MERGE-INTO-C4...
-- MERGE-INTO-C4: Date Range Analysis of Last Update
Executing command: snowsql --query
-- MERGE-INTO-C4: Date Range Analysis of Last Update
OK - MERGE-INTO-C4
Preparing to run MERGE-INTO-C5...
-- MERGE-INTO-C5: General Status Distribution
Executing command: snowsql --query
-- MERGE-INTO-C5: General Status Distribution
OK - MERGE-INTO-C5
Preparing to run MERGE-INTO-C6...
-- MERGE-INTO-C6: General Quantity Statistics
Executing command: snowsql --query
-- MERGE-INTO-C6: General Quantity Statistics
DIFFERENCE FOUND
MERGE-INTO-C6:
-- MERGE-INTO-C6: General Quantity Statistics
bendsql:
snowsql:
Preparing to run MERGE-INTO-C7...
-- MERGE-INTO-C7: New Orders vs Existing Orders Count
Executing command: snowsql --query
-- MERGE-INTO-C7: New Orders vs Existing Orders Count
OK - MERGE-INTO-C7
Preparing to run MERGE-INTO-C8...
-- MERGE-INTO-C8: Order Type Distribution
Executing command: snowsql --query
-- MERGE-INTO-C8: Order Type Distribution
OK - MERGE-INTO-C8
Preparing to run MERGE-INTO-C9...
-- MERGE-INTO-C9: Date Range Analysis
Executing command: snowsql --query
-- MERGE-INTO-C9: Date Range Analysis
OK - MERGE-INTO-C9
Preparing to run MERGE-INTO-C10...
-- MERGE-INTO-C10: Price Analysis After Adjustments
Executing command: snowsql --query
-- MERGE-INTO-C10: Price Analysis After Adjustments
DIFFERENCE FOUND
MERGE-INTO-C10:
-- MERGE-INTO-C10: Price Analysis After Adjustments
bendsql:
snowsql:
Preparing to run MERGE-INTO-C11...
-- MERGE-INTO-C11: Transaction Types Distribution
Executing command: snowsql --query
-- MERGE-INTO-C11: Transaction Types Distribution
OK - MERGE-INTO-C11
Preparing to run MERGE-INTO-C12...
-- MERGE-INTO-C12: Aggregated Quantity Statistics
Executing command: snowsql --query
-- MERGE-INTO-C12: Aggregated Quantity Statistics
OK - MERGE-INTO-C12
Preparing to run MERGE-INTO-C13...
-- MERGE-INTO-C13: Transaction Counts by User and Asset Type
Executing command: snowsql --query
-- MERGE-INTO-C13: Transaction Counts by User and Asset Type
OK - MERGE-INTO-C13
Preparing to run MERGE-INTO-C14...
-- MERGE-INTO-C14: Date Range Analysis of Transactions
Executing command: snowsql --query
-- MERGE-INTO-C14: Date Range Analysis of Transactions
OK - MERGE-INTO-C14
Preparing to run MERGE-INTO-C15...
-- MERGE-INTO-C15: asserts
Executing command: snowsql --query
-- MERGE-INTO-C15: asserts
OK - MERGE-INTO-C15
Preparing to run MERGE-INTO-C16...
-- MERGE-INTO-C16: orders
Executing command: snowsql --query
-- MERGE-INTO-C16: orders
DIFFERENCE FOUND
MERGE-INTO-C16:
-- MERGE-INTO-C16: orders
bendsql:
snowsql:
Preparing to run MERGE-INTO-C17...
-- MERGE-INTO-C17: transactions
Executing command: snowsql --query
-- MERGE-INTO-C17: transactions
OK - MERGE-INTO-C17
```
Docker Image for PR
…b.com/JackTan25/databend into import_upper_optimizer_for_merge_into
Pass mergeinto2 in distributed mode:
Preparing to run MERGE-INTO-C1...
Executing command: bendsql --query=-- MERGE-INTO-C1: Asset Types Distribution
SELECT asset_type, COUNT(*) AS count
FROM assets
GROUP BY asset_type
ORDER BY count DESC, asset_type
LIMIT 13 -D mergeinto
Command executed successfully. Output:
BTC 104817
ETH 104808
NEW_ASSET 100000
XRP 90375
Executing command: snowsql --query -- MERGE-INTO-C1: Asset Types Distribution
SELECT asset_type, COUNT(*) AS count
FROM assets
GROUP BY asset_type
ORDER BY count DESC, asset_type
LIMIT 13 --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
BTC 104817
ETH 104808
NEW_ASSET 100000
XRP 90375
OK - MERGE-INTO-C1
BTC 104817
ETH 104808
NEW_ASSET 100000
XRP 90375
Preparing to run MERGE-INTO-C2...
Executing command: bendsql --query=
-- MERGE-INTO-C2: Aggregated Quantity and Value Statistics
SELECT SUM(quantity) AS total_quantity,
AVG(quantity) AS average_quantity,
SUM(value) AS total_value,
AVG(value) AS average_value
FROM assets -D mergeinto
Command executed successfully. Output:
160342725.20941540 400.856813023538 1603427252.09414242 4008.568130235356
Executing command: snowsql --query
-- MERGE-INTO-C2: Aggregated Quantity and Value Statistics
SELECT SUM(quantity) AS total_quantity,
AVG(quantity) AS average_quantity,
SUM(value) AS total_value,
AVG(value) AS average_value
FROM assets --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
160342725.20941540 400.856813023539 1603427252.09414242 4008.568130235356
DIFFERENCE FOUND
MERGE-INTO-C2:
-- MERGE-INTO-C2: Aggregated Quantity and Value Statistics
SELECT SUM(quantity) AS total_quantity,
AVG(quantity) AS average_quantity,
SUM(value) AS total_value,
AVG(value) AS average_value
FROM assets
Differences:
bendsql:
160342725.20941540 400.856813023538 1603427252.09414242 4008.568130235356
snowsql:
160342725.20941540 400.856813023539 1603427252.09414242 4008.568130235356
Preparing to run MERGE-INTO-C3...
Executing command: bendsql --query=
-- MERGE-INTO-C3: Assets Counts by User
SELECT user_id, COUNT(*) AS count
FROM assets
GROUP BY user_id
ORDER BY count DESC, user_id
LIMIT 13 -D mergeinto
Command executed successfully. Output:
0 5
2 5
4 5
6 5
8 5
10 5
12 5
14 5
16 5
18 5
20 5
22 5
24 5
Executing command: snowsql --query
-- MERGE-INTO-C3: Assets Counts by User
SELECT user_id, COUNT(*) AS count
FROM assets
GROUP BY user_id
ORDER BY count DESC, user_id
LIMIT 13 --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
0 5
2 5
4 5
6 5
8 5
10 5
12 5
14 5
16 5
18 5
20 5
22 5
24 5
OK - MERGE-INTO-C3
0 5
2 5
4 5
6 5
8 5
10 5
12 5
14 5
16 5
18 5
20 5
22 5
24 5
Preparing to run MERGE-INTO-C4...
Executing command: bendsql --query=
-- MERGE-INTO-C4: Date Range Analysis of Last Update
SELECT CASE
WHEN last_updated < '2022-01-01' THEN 'Before 2022'
ELSE 'After 2021-12-31'
END AS date_range,
COUNT(*) AS count
FROM assets
GROUP BY date_range
ORDER BY date_range
LIMIT 13 -D mergeinto
Command executed successfully. Output:
After 2021-12-31 360014
Before 2022 39986
Executing command: snowsql --query
-- MERGE-INTO-C4: Date Range Analysis of Last Update
SELECT CASE
WHEN last_updated < '2022-01-01' THEN 'Before 2022'
ELSE 'After 2021-12-31'
END AS date_range,
COUNT(*) AS count
FROM assets
GROUP BY date_range
ORDER BY date_range
LIMIT 13 --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
After 2021-12-31 360014
Before 2022 39986
OK - MERGE-INTO-C4
After 2021-12-31 360014
Before 2022 39986
Preparing to run MERGE-INTO-C5...
Executing command: bendsql --query=
-- MERGE-INTO-C5: General Status Distribution
SELECT status, COUNT(*) AS count
FROM orders
GROUP BY status
ORDER BY count DESC, status
LIMIT 13 -D mergeinto
Command executed successfully. Output:
completed 526479
pending 522450
cancelled 451071
above_avg 100418
below_avg 100075
Pending 100000
avg 49532
Executing command: snowsql --query
-- MERGE-INTO-C5: General Status Distribution
SELECT status, COUNT(*) AS count
FROM orders
GROUP BY status
ORDER BY count DESC, status
LIMIT 13 --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
completed 526479
pending 522450
cancelled 451071
above_avg 100418
below_avg 100075
Pending 100000
avg 49532
OK - MERGE-INTO-C5
completed 526479
pending 522450
cancelled 451071
above_avg 100418
below_avg 100075
Pending 100000
avg 49532
Preparing to run MERGE-INTO-C6...
Executing command: bendsql --query=
-- MERGE-INTO-C6: General Quantity Statistics
SELECT SUM(quantity) AS total_quantity,
AVG(quantity) AS average_quantity,
MIN(quantity) AS min_quantity,
MAX(quantity) AS max_quantity
FROM orders -D mergeinto
Command executed successfully. Output:
176507520.66613014 95.408181330592 0.00005807 435.53925916
Executing command: snowsql --query
-- MERGE-INTO-C6: General Quantity Statistics
SELECT SUM(quantity) AS total_quantity,
AVG(quantity) AS average_quantity,
MIN(quantity) AS min_quantity,
MAX(quantity) AS max_quantity
FROM orders --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
176507520.66613014 95.408181330593 0.00005807 435.53925916
DIFFERENCE FOUND
MERGE-INTO-C6:
-- MERGE-INTO-C6: General Quantity Statistics
SELECT SUM(quantity) AS total_quantity,
AVG(quantity) AS average_quantity,
MIN(quantity) AS min_quantity,
MAX(quantity) AS max_quantity
FROM orders
Differences:
bendsql:
176507520.66613014 95.408181330592 0.00005807 435.53925916
snowsql:
176507520.66613014 95.408181330593 0.00005807 435.53925916
Preparing to run MERGE-INTO-C7...
Executing command: bendsql --query=
-- MERGE-INTO-C7: New Orders vs Existing Orders Count
SELECT CASE
WHEN order_id > 500000 THEN 'New Order'
ELSE 'Existing Order'
END AS order_category,
COUNT(*) AS count
FROM orders
GROUP BY order_category
ORDER BY count DESC
LIMIT 13 -D mergeinto
Command executed successfully. Output:
New Order 1099996
Existing Order 750029
Executing command: snowsql --query
-- MERGE-INTO-C7: New Orders vs Existing Orders Count
SELECT CASE
WHEN order_id > 500000 THEN 'New Order'
ELSE 'Existing Order'
END AS order_category,
COUNT(*) AS count
FROM orders
GROUP BY order_category
ORDER BY count DESC
LIMIT 13 --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
New Order 1099996
Existing Order 750029
OK - MERGE-INTO-C7
New Order 1099996
Existing Order 750029
Preparing to run MERGE-INTO-C8...
Executing command: bendsql --query=
-- MERGE-INTO-C8: Order Type Distribution
SELECT order_type, COUNT(*) AS count
FROM orders
GROUP BY order_type
ORDER BY count DESC, order_type
LIMIT 13 -D mergeinto
Command executed successfully. Output:
buy 976729
sell 873296
Executing command: snowsql --query
-- MERGE-INTO-C8: Order Type Distribution
SELECT order_type, COUNT(*) AS count
FROM orders
GROUP BY order_type
ORDER BY count DESC, order_type
LIMIT 13 --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
buy 976729
sell 873296
OK - MERGE-INTO-C8
buy 976729
sell 873296
Preparing to run MERGE-INTO-C9...
Executing command: bendsql --query=
-- MERGE-INTO-C9: Date Range Analysis
SELECT CASE
WHEN created_at < '2022-01-01' THEN 'Before 2022'
WHEN created_at BETWEEN '2021-01-01' AND '2021-06-30' THEN 'First Half 2021'
ELSE 'After 2021-06-30'
END AS date_range,
COUNT(*) AS count
FROM orders
GROUP BY date_range
ORDER BY date_range
LIMIT 13 -D mergeinto
Command executed successfully. Output:
After 2021-06-30 100000
Before 2022 1750025
Executing command: snowsql --query
-- MERGE-INTO-C9: Date Range Analysis
SELECT CASE
WHEN created_at < '2022-01-01' THEN 'Before 2022'
WHEN created_at BETWEEN '2021-01-01' AND '2021-06-30' THEN 'First Half 2021'
ELSE 'After 2021-06-30'
END AS date_range,
COUNT(*) AS count
FROM orders
GROUP BY date_range
ORDER BY date_range
LIMIT 13 --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
After 2021-06-30 100000
Before 2022 1750025
OK - MERGE-INTO-C9
After 2021-06-30 100000
Before 2022 1750025
Preparing to run MERGE-INTO-C10...
Executing command: bendsql --query=
-- MERGE-INTO-C10: Price Analysis After Adjustments
SELECT SUM(price) AS total_price,
AVG(price) AS average_price,
MIN(price) AS min_price,
MAX(price) AS max_price
FROM orders -D mergeinto
Command executed successfully. Output:
925670420.59100828 500.355627946113 0.00058074 999.99408411
Executing command: snowsql --query
-- MERGE-INTO-C10: Price Analysis After Adjustments
SELECT SUM(price) AS total_price,
AVG(price) AS average_price,
MIN(price) AS min_price,
MAX(price) AS max_price
FROM orders --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
925670420.59100828 500.355627946113 0.00058074 999.99408411
OK - MERGE-INTO-C10
925670420.59100828 500.355627946113 0.00058074 999.99408411
Preparing to run MERGE-INTO-C11...
Executing command: bendsql --query=
-- MERGE-INTO-C11: Transaction Types Distribution
SELECT transaction_type, COUNT(*) AS count
FROM transactions
GROUP BY transaction_type
ORDER BY count DESC, transaction_type
LIMIT 13 -D mergeinto
Command executed successfully. Output:
trade 740367
withdrawal 359266
deposit 358898
Executing command: snowsql --query
-- MERGE-INTO-C11: Transaction Types Distribution
SELECT transaction_type, COUNT(*) AS count
FROM transactions
GROUP BY transaction_type
ORDER BY count DESC, transaction_type
LIMIT 13 --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
trade 740367
withdrawal 359266
deposit 358898
OK - MERGE-INTO-C11
trade 740367
withdrawal 359266
deposit 358898
Preparing to run MERGE-INTO-C12...
Executing command: bendsql --query=
-- MERGE-INTO-C12: Aggregated Quantity Statistics
SELECT SUM(quantity) AS total_quantity,
AVG(quantity) AS average_quantity,
MIN(quantity) AS min_quantity,
MAX(quantity) AS max_quantity
FROM transactions -D mergeinto
Command executed successfully. Output:
112084766.86573024 76.847709692649 0.00017860 664.06985910
Executing command: snowsql --query
-- MERGE-INTO-C12: Aggregated Quantity Statistics
SELECT SUM(quantity) AS total_quantity,
AVG(quantity) AS average_quantity,
MIN(quantity) AS min_quantity,
MAX(quantity) AS max_quantity
FROM transactions --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
112084766.86573024 76.847709692650 0.00017860 664.06985910
DIFFERENCE FOUND
MERGE-INTO-C12:
-- MERGE-INTO-C12: Aggregated Quantity Statistics
SELECT SUM(quantity) AS total_quantity,
AVG(quantity) AS average_quantity,
MIN(quantity) AS min_quantity,
MAX(quantity) AS max_quantity
FROM transactions
Differences:
bendsql:
112084766.86573024 76.847709692649 0.00017860 664.06985910
snowsql:
112084766.86573024 76.847709692650 0.00017860 664.06985910
Preparing to run MERGE-INTO-C13...
Executing command: bendsql --query=
-- MERGE-INTO-C13: Transaction Counts by User and Asset Type
SELECT user_id, asset_type, COUNT(*) AS count
FROM transactions
GROUP BY user_id, asset_type
ORDER BY count DESC, user_id, asset_type
LIMIT 13 -D mergeinto
Command executed successfully. Output:
804 ETH 15
1216 ETH 15
216 BTC 14
425 ETH 14
844 BTC 14
1231 ETH 14
1539 ETH 14
1603 ETH 14
1926 ETH 14
2609 BTC 14
2704 ETH 14
2827 ETH 14
2841 BTC 14
Executing command: snowsql --query
-- MERGE-INTO-C13: Transaction Counts by User and Asset Type
SELECT user_id, asset_type, COUNT(*) AS count
FROM transactions
GROUP BY user_id, asset_type
ORDER BY count DESC, user_id, asset_type
LIMIT 13 --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
804 ETH 15
1216 ETH 15
216 BTC 14
425 ETH 14
844 BTC 14
1231 ETH 14
1539 ETH 14
1603 ETH 14
1926 ETH 14
2609 BTC 14
2704 ETH 14
2827 ETH 14
2841 BTC 14
OK - MERGE-INTO-C13
804 ETH 15
1216 ETH 15
216 BTC 14
425 ETH 14
844 BTC 14
1231 ETH 14
1539 ETH 14
1603 ETH 14
1926 ETH 14
2609 BTC 14
2704 ETH 14
2827 ETH 14
2841 BTC 14
Preparing to run MERGE-INTO-C14...
Executing command: bendsql --query=
-- MERGE-INTO-C14: Date Range Analysis of Transactions
SELECT CASE
WHEN transaction_time < '2022-01-01' THEN 'Before 2022'
ELSE 'After 2021-12-31'
END AS date_range,
COUNT(*) AS count
FROM transactions
GROUP BY date_range
ORDER BY date_range
LIMIT 13 -D mergeinto
Command executed successfully. Output:
After 2021-12-31 433526
Before 2022 1025005
Executing command: snowsql --query
-- MERGE-INTO-C14: Date Range Analysis of Transactions
SELECT CASE
WHEN transaction_time < '2022-01-01' THEN 'Before 2022'
ELSE 'After 2021-12-31'
END AS date_range,
COUNT(*) AS count
FROM transactions
GROUP BY date_range
ORDER BY date_range
LIMIT 13 --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
After 2021-12-31 433526
Before 2022 1025005
OK - MERGE-INTO-C14
After 2021-12-31 433526
Before 2022 1025005
Preparing to run MERGE-INTO-C15...
Executing command: bendsql --query=
-- MERGE-INTO-C15: asserts
SELECT asset_type, SUM(quantity) AS total_quantity, SUM(value) AS total_value
FROM assets
GROUP BY asset_type ORDER BY asset_type ASC -D mergeinto
Command executed successfully. Output:
BTC 54537998.34092175 545379983.40921303
ETH 50387362.73309448 503873627.33094612
NEW_ASSET 10000000.00000000 100000000.00000000
XRP 45417364.13539917 454173641.35398327
Executing command: snowsql --query
-- MERGE-INTO-C15: asserts
SELECT asset_type, SUM(quantity) AS total_quantity, SUM(value) AS total_value
FROM assets
GROUP BY asset_type ORDER BY asset_type ASC --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
BTC 54537998.34092175 545379983.40921303
ETH 50387362.73309448 503873627.33094612
NEW_ASSET 10000000.00000000 100000000.00000000
XRP 45417364.13539917 454173641.35398327
OK - MERGE-INTO-C15
BTC 54537998.34092175 545379983.40921303
ETH 50387362.73309448 503873627.33094612
NEW_ASSET 10000000.00000000 100000000.00000000
XRP 45417364.13539917 454173641.35398327
Preparing to run MERGE-INTO-C16...
Executing command: bendsql --query=
-- MERGE-INTO-C16: orders
SELECT asset_type, SUM(quantity) AS total_quantity, AVG(price) AS average_price
FROM orders
GROUP BY asset_type ORDER BY asset_type ASC -D mergeinto
Command executed successfully. Output:
BTC 63828821.79126546 518.585135705821
ETH 58825213.47902498 482.677325160081
NEW_ORDER 5000000.00000000 500.000000000000
XRP 48853485.39583970 499.634938778944
Executing command: snowsql --query
-- MERGE-INTO-C16: orders
SELECT asset_type, SUM(quantity) AS total_quantity, AVG(price) AS average_price
FROM orders
GROUP BY asset_type ORDER BY asset_type ASC --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
BTC 63828821.79126546 518.585135705821
ETH 58825213.47902498 482.677325160081
NEW_ORDER 5000000.00000000 500.000000000000
XRP 48853485.39583970 499.634938778945
DIFFERENCE FOUND
MERGE-INTO-C16:
-- MERGE-INTO-C16: orders
SELECT asset_type, SUM(quantity) AS total_quantity, AVG(price) AS average_price
FROM orders
GROUP BY asset_type ORDER BY asset_type ASC
Differences:
bendsql:
BTC 63828821.79126546 518.585135705821
ETH 58825213.47902498 482.677325160081
NEW_ORDER 5000000.00000000 500.000000000000
XRP 48853485.39583970 499.634938778944
snowsql:
BTC 63828821.79126546 518.585135705821
ETH 58825213.47902498 482.677325160081
NEW_ORDER 5000000.00000000 500.000000000000
XRP 48853485.39583970 499.634938778945
Preparing to run MERGE-INTO-C17...
Executing command: bendsql --query=
-- MERGE-INTO-C17: transactions
SELECT transaction_type, SUM(quantity) AS total_quantity
FROM transactions
GROUP BY transaction_type ORDER BY transaction_type ASC -D mergeinto
Command executed successfully. Output:
deposit 20055537.48778317
trade 73372088.64123127
withdrawal 18657140.73671580
Executing command: snowsql --query
-- MERGE-INTO-C17: transactions
SELECT transaction_type, SUM(quantity) AS total_quantity
FROM transactions
GROUP BY transaction_type ORDER BY transaction_type ASC --dbname mergeinto --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
deposit 20055537.48778317
trade 73372088.64123127
withdrawal 18657140.73671580
OK - MERGE-INTO-C17
deposit 20055537.48778317
trade 73372088.64123127
withdrawal 18657140.73671580 |
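The two DIFFERENCE FOUND entries in this run (MERGE-INTO-C12 and MERGE-INTO-C16) differ only in the last printed digit of an AVG column. That looks like a display rounding difference between the two clients, not a merge into correctness problem. A comparison that tolerates such differences could be sketched roughly as below. This is not the wizard's code: `parse_cell`, `rows_match`, and the `1e-9` tolerance are hypothetical names and values chosen for illustration.

```python
from decimal import Decimal, InvalidOperation


def parse_cell(cell):
    """Return a Decimal for numeric cells, otherwise the original string."""
    try:
        return Decimal(cell)
    except InvalidOperation:
        return cell


def rows_match(left, right, tol=Decimal("1e-9")):
    """Compare two whitespace-separated result rows cell by cell.

    Numeric cells match when they differ by at most `tol` (an assumed
    tolerance); all other cells must be identical strings.
    """
    a, b = left.split(), right.split()
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        x, y = parse_cell(x), parse_cell(y)
        if isinstance(x, Decimal) and isinstance(y, Decimal):
            if abs(x - y) > tol:
                return False
        elif x != y:
            return False
    return True


# Rows copied from the MERGE-INTO-C12 output above.
bendsql_c12 = "112084766.86573024 76.847709692649 0.00017860 664.06985910"
snowsql_c12 = "112084766.86573024 76.847709692650 0.00017860 664.06985910"
print(rows_match(bendsql_c12, snowsql_c12))
```

With a tolerance like this, C12 and C16 would be reported as matching, while count mismatches such as `trade 740367` versus `trade 740368` would still be flagged.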
This PR is on hold until #14011 is merged. I'll first start on the new hash table with block info task, which this PR's target-build strategy needs. cc @dantengsky |
deploy@(new_join_stragety_merge_into)/test> explain merge into small_target as t1 using (select * from target_table_cluster) as t2 on t1.l_partkey = t2.l_partkey and t1.l_orderkey = t2.l_orderkey and t1.l_suppkey = t2.l_suppkey and t1.l_linenumber = t2.l_linenumber and t1.l_shipdate = t2.l_shipdate when matched then update *;
EXPLAIN MERGE INTO small_target AS t1 USING (
SELECT
*
FROM
target_table_cluster
) AS t2 ON t1.l_partkey = t2.l_partkey
AND t1.l_orderkey = t2.l_orderkey
AND t1.l_suppkey = t2.l_suppkey
AND t1.l_linenumber = t2.l_linenumber
AND t1.l_shipdate = t2.l_shipdate
WHEN matched THEN
UPDATE
*
-[ EXPLAIN ]-----------------------------------
MergeInto:
target_table: default.test.small_target
├── matched update: [condition: None,update set l_suppkey = l_suppkey (#2),l_extendedprice = l_extendedprice (#5),l_linenumber = l_linenumber (#3),l_tax = l_tax (#7),l_shipdate = l_shipdate (#10),l_commitdate = l_commitdate (#11),l_shipmode = l_shipmode (#14),l_discount = l_discount (#6),l_comment = l_comment (#15),l_shipinstruct = l_shipinstruct (#13),l_quantity = l_quantity (#4),l_returnflag = l_returnflag (#8),l_receiptdate = l_receiptdate (#12),l_linestatus = l_linestatus (#9),l_orderkey = l_orderkey (#0),l_partkey = l_partkey (#1)]
└── Exchange(Merge)
└── HashJoin: INNER
├── equi conditions: [and(and(and(and(eq(t2.l_partkey (#1), t1.l_partkey (#17)), eq(t2.l_orderkey (#0), t1.l_orderkey (#16))), eq(t2.l_suppkey (#2), t1.l_suppkey (#18))), eq(t2.l_linenumber (#3), t1.l_linenumber (#19))), eq(t2.l_shipdate (#10), t1.l_shipdate (#26)))]
├── non-equi conditions: []
├── Exchange(Random)
│ └── EvalScalar
│ ├── scalars: [target_table_cluster.l_orderkey (#0), target_table_cluster.l_partkey (#1), target_table_cluster.l_suppkey (#2), target_table_cluster.l_linenumber (#3), target_table_cluster.l_quantity (#4), target_table_cluster.l_extendedprice (#5), target_table_cluster.l_discount (#6), target_table_cluster.l_tax (#7), target_table_cluster.l_returnflag (#8), target_table_cluster.l_linestatus (#9), target_table_cluster.l_shipdate (#10), target_table_cluster.l_commitdate (#11), target_table_cluster.l_receiptdate (#12), target_table_cluster.l_shipinstruct (#13), target_table_cluster.l_shipmode (#14), target_table_cluster.l_comment (#15)]
│ └── LogicalGet
│ ├── table: default.test.target_table_cluster
│ ├── filters: []
│ ├── order by: []
│ └── limit: NONE
└── Exchange(Broadcast)
└── LogicalGet
├── table: default.test.small_target
├── filters: []
├── order by: []
└── limit: NONE
21 rows explain in 1.229 sec. Processed 0 rows, 0 B (0 rows/s, 0 B/s)
deploy@(new_join_stragety_merge_into)/test> merge into small_target as t1 using (select * from target_table_cluster) as t2 on t1.l_partkey = t2.l_partkey and t1.l_orderkey = t2.l_orderkey and t1.l_suppkey = t2.l_suppkey and t1.l_linenumber = t2.l_linenumber and t1.l_shipdate = t2.l_shipdate when matched then update *;
MERGE INTO small_target AS t1 USING (
SELECT
*
FROM
target_table_cluster
) AS t2 ON t1.l_partkey = t2.l_partkey
AND t1.l_orderkey = t2.l_orderkey
AND t1.l_suppkey = t2.l_suppkey
AND t1.l_linenumber = t2.l_linenumber
AND t1.l_shipdate = t2.l_shipdate
WHEN matched THEN
UPDATE
*
┌────────────────────────┐
│ number of rows updated │
│ Int32 │
├────────────────────────┤
│ 500000 │
└────────────────────────┘
1 row read in 81.576 sec. Processed 3.62 billion row, 656.57 GiB (44.42 million row/s, 8.05 GiB/s) |
deploy@(remove_static_filter)/test> explain merge into small_target as t1 using (select * from target_table_cluster) as t2 on t1.l_partkey = t2.l_partkey and t1.l_orderkey = t2.l_orderkey and t1.l_suppkey = t2.l_suppkey and t1.l_linenumber = t2.l_linenumber and t1.l_shipdate = t2.l_shipdate when matched then update *;
EXPLAIN MERGE INTO small_target AS t1 USING (
SELECT
*
FROM
target_table_cluster
) AS t2 ON t1.l_partkey = t2.l_partkey
AND t1.l_orderkey = t2.l_orderkey
AND t1.l_suppkey = t2.l_suppkey
AND t1.l_linenumber = t2.l_linenumber
AND t1.l_shipdate = t2.l_shipdate
WHEN matched THEN
UPDATE
*
-[ EXPLAIN ]-----------------------------------
MergeInto:
target_table: default.test.small_target
├── matched update: [condition: None,update set l_commitdate = l_commitdate (#11),l_comment = l_comment (#15),l_receiptdate = l_receiptdate (#12),l_suppkey = l_suppkey (#2),l_linenumber = l_linenumber (#3),l_shipinstruct = l_shipinstruct (#13),l_linestatus = l_linestatus (#9),l_orderkey = l_orderkey (#0),l_quantity = l_quantity (#4),l_extendedprice = l_extendedprice (#5),l_tax = l_tax (#7),l_partkey = l_partkey (#1),l_discount = l_discount (#6),l_shipdate = l_shipdate (#10),l_shipmode = l_shipmode (#14),l_returnflag = l_returnflag (#8)]
└── Exchange(Merge)
└── HashJoin: INNER
├── equi conditions: [and(and(and(and(eq(t1.l_partkey (#17), t2.l_partkey (#1)), eq(t1.l_orderkey (#16), t2.l_orderkey (#0))), eq(t1.l_suppkey (#18), t2.l_suppkey (#2))), eq(t1.l_linenumber (#19), t2.l_linenumber (#3))), eq(t1.l_shipdate (#26), t2.l_shipdate (#10)))]
├── non-equi conditions: []
├── LogicalGet
│ ├── table: default.test.small_target
│ ├── filters: []
│ ├── order by: []
│ └── limit: NONE
└── Exchange(Broadcast)
└── AddRowNumber
└── EvalScalar
├── scalars: [target_table_cluster.l_orderkey (#0), target_table_cluster.l_partkey (#1), target_table_cluster.l_suppkey (#2), target_table_cluster.l_linenumber (#3), target_table_cluster.l_quantity (#4), target_table_cluster.l_extendedprice (#5), target_table_cluster.l_discount (#6), target_table_cluster.l_tax (#7), target_table_cluster.l_returnflag (#8), target_table_cluster.l_linestatus (#9), target_table_cluster.l_shipdate (#10), target_table_cluster.l_commitdate (#11), target_table_cluster.l_receiptdate (#12), target_table_cluster.l_shipinstruct (#13), target_table_cluster.l_shipmode (#14), target_table_cluster.l_comment (#15)]
└── LogicalGet
├── table: default.test.target_table_cluster
├── filters: []
├── order by: []
└── limit: NONE
21 rows explain in 1.137 sec. Processed 0 rows, 0 B (0 rows/s, 0 B/s)
deploy@(remove_static_filter)/test> merge into small_target as t1 using (select * from target_table_cluster) as t2 on t1.l_partkey = t2.l_partkey and t1.l_orderkey = t2.l_orderkey and t1.l_suppkey = t2.l_suppkey and t1.l_linenumber = t2.l_linenumber and t1.l_shipdate = t2.l_shipdate when matched then update *;
MERGE INTO small_target AS t1 USING (
SELECT
*
FROM
target_table_cluster
) AS t2 ON t1.l_partkey = t2.l_partkey
AND t1.l_orderkey = t2.l_orderkey
AND t1.l_suppkey = t2.l_suppkey
AND t1.l_linenumber = t2.l_linenumber
AND t1.l_shipdate = t2.l_shipdate
WHEN matched THEN
UPDATE
*
error: error happens after fetched 0 rows: APIError: ResponseError with 1104: memory usage 51.5 GB(51542990492) exceeds limit 51.5 GB(51539607552) |
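The two runs above differ in which input is broadcast under the HashJoin: on `new_join_stragety_merge_into` it is `small_target` and the statement completes, while on `remove_static_filter` it is `target_table_cluster` and the statement hits the memory limit. Assuming the broadcast input is the one the hash table is built from, the toy sketch below shows why the choice matters: the table holds the build input, while the probe input is only streamed. `hash_join` and the sample rows are hypothetical and unrelated to Databend's actual implementation.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, key):
    """Toy inner hash join: build a table on one input, stream the other."""
    table = defaultdict(list)
    for row in build_rows:          # memory grows with the build input
        table[key(row)].append(row)
    matches = []
    for row in probe_rows:          # probe rows are looked up, not retained
        for hit in table.get(key(row), ()):
            matches.append((hit, row))
    return len(table), matches


# Stand-ins for a small target table and a much larger source (sizes made up).
small = [(i, "new") for i in range(5)]
large = [(i, "old") for i in range(1000)]
key = lambda r: r[0]

entries_small_build, m1 = hash_join(small, large, key)
entries_large_build, m2 = hash_join(large, small, key)
print(entries_small_build, entries_large_build, len(m1), len(m2))
```

Both orders return the same number of matches, but the hash table holds 5 entries in one case and 1000 in the other. With a broadcast build, every node would hold its own copy of that table.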
…import_upper_optimizer_for_merge_into
Re-opened as #14093. |
I hereby agree to the terms of the CLA available at: https://databend.rs/dev/policies/cla/
Summary
and the target table should be the build side.
In fact, we support a hash shuffle strategy for distributed merge into, but it is not enabled; we support
2.1
first for left outer join (of course, we can also get insert-only (left anti) and matched-only (inner join) when the target table is the build side, so it doesn't matter).
Some important work next:
Feature: Merge Into Optimizations #12595