[kvexec] merge join #8561

max-hoffman · 2024-11-14T22:44:06Z

This isn't a big perf win on Linux, but it offsets the sql.Row interface PR, which would otherwise regress merge join by about 30%.

goos: darwin
goarch: arm64
pkg: github.com/dolthub/dolt/go/performance/microsysbench
                │  before.txt  │           after.txt           │
                │    sec/op    │    sec/op     vs base         │
OltpJoinScan-12   680.6µ ± 26%   612.1µ ± 17%  ~ (p=0.240 n=6)

                │  before.txt  │              after.txt              │
                │     B/op     │     B/op      vs base               │
OltpJoinScan-12   163.8Ki ± 0%   123.8Ki ± 0%  -24.42% (p=0.002 n=6)

                │ before.txt  │             after.txt              │
                │  allocs/op  │  allocs/op   vs base               │
OltpJoinScan-12   5.906k ± 0%   4.233k ± 0%  -28.33% (p=0.002 n=6)
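
For readers checking the deltas: the `vs base` percentages are relative changes from the before value to the after value, and the `~` on the sec/op row is benchstat's marker for a difference that is not statistically significant. Below is a minimal sketch that recomputes the two significant deltas from the rounded medians shown above. The helper name `percentChange` is hypothetical, and benchstat itself works from the raw samples rather than these rounded figures.

```go
package main

import (
	"fmt"
	"math"
)

// percentChange is a hypothetical helper: it returns the relative change
// from before to after, in percent, rounded to two decimal places.
func percentChange(before, after float64) float64 {
	return math.Round((after-before)/before*10000) / 100
}

func main() {
	// Rounded medians copied from the tables above.
	fmt.Printf("B/op:      %.2f%%\n", percentChange(163.8, 123.8)) // -24.42%
	fmt.Printf("allocs/op: %.2f%%\n", percentChange(5906, 4233))   // -28.33%
}
```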

TODO:

- left join
- nulls and other edge cases
- execute full comparer

…te.sh

max-hoffman · 2024-11-14T23:04:43Z

#benchmark

github-actions · 2024-11-14T23:05:08Z

@max-hoffman workflow run: https://github.com/dolthub/dolt/actions/runs/11847027155

coffeegoddd · 2024-11-14T23:14:13Z

@max-hoffman DOLT

comparing_percentages
100.000000 to 100.000000

version	result	total
`302ab0b`	ok	5937457

version	total_tests
`302ab0b`	5937457

correctness_percentage
100.0

coffeegoddd · 2024-11-14T23:22:40Z

@coffeegoddd DOLT

comparing_percentages
100.000000 to 100.000000

version	result	total
`4fa6227`	ok	5937457

version	total_tests
`4fa6227`	5937457

correctness_percentage
100.0

coffeegoddd · 2024-11-14T23:40:35Z

@max-hoffman DOLT

test_name	from_latency_p95	to_latency_p95	percent_change
tpcc-scale-factor-1	61.08	59.99	-1.78

test_name	from_server_name	from_server_version	from_tps	to_server_name	to_server_version	to_tps	percent_change
tpcc-scale-factor-1	dolt	`0e34d26`	40.64	dolt	`4fa6227`	40.7	0.15

coffeegoddd · 2024-11-15T00:31:10Z

@max-hoffman DOLT

read_tests	from_latency	to_latency	percent_change
covering_index_scan	0.62	0.62	0.0
groupby_scan	16.41	16.41	0.0
index_join	2.26	2.26	0.0
index_join_scan	1.79	1.64	-8.38
index_scan	53.85	55.82	3.66
oltp_point_select	0.26	0.27	3.85
oltp_read_only	5.28	5.37	1.7
select_random_points	0.64	0.65	1.56
select_random_ranges	0.63	0.64	1.59
table_scan	54.83	55.82	1.81
types_table_scan	139.85	144.97	3.66

write_tests	from_latency	to_latency	percent_change
oltp_delete_insert	5.77	5.88	1.91
oltp_insert	2.91	2.91	0.0
oltp_read_write	11.24	11.45	1.87
oltp_update_index	2.91	2.97	2.06
oltp_update_non_index	2.86	2.91	1.75
oltp_write_only	5.88	5.99	1.87
types_delete_insert	6.21	6.21	0.0

…te.sh

coffeegoddd · 2024-11-19T04:06:35Z

@max-hoffman DOLT

comparing_percentages
100.000000 to 100.000000

version	result	total
`f7225f7`	ok	5937457

version	total_tests
`f7225f7`	5937457

correctness_percentage
100.0

coffeegoddd · 2024-11-19T04:14:49Z

@coffeegoddd DOLT

comparing_percentages
100.000000 to 100.000000

version	result	total
`0a7e70a`	ok	5937457

version	total_tests
`0a7e70a`	5937457

correctness_percentage
100.0

…kv-merge-join

max-hoffman · 2024-11-19T21:43:16Z

#benchmark

github-actions · 2024-11-19T21:43:37Z

@max-hoffman workflow run: https://github.com/dolthub/dolt/actions/runs/11922319267

coffeegoddd · 2024-11-19T21:44:18Z

@max-hoffman DOLT

comparing_percentages
100.000000 to 100.000000

version	result	total
`e8f4ead`	ok	5937457

version	total_tests
`e8f4ead`	5937457

correctness_percentage
100.0

coffeegoddd · 2024-11-19T22:19:43Z

@max-hoffman DOLT

test_name	from_latency_p95	to_latency_p95	percent_change
tpcc-scale-factor-1	57.87	58.92	1.81

test_name	from_server_name	from_server_version	from_tps	to_server_name	to_server_version	to_tps	percent_change
tpcc-scale-factor-1	dolt	`f4e529a`	41.65	dolt	`e8f4ead`	41.42	-0.55

coffeegoddd · 2024-11-19T23:10:13Z

@max-hoffman DOLT

read_tests	from_latency	to_latency	percent_change
covering_index_scan	0.62	0.69	11.29
groupby_scan	16.71	16.41	-1.8
index_join	2.26	2.26	0.0
index_join_scan	1.82	1.44	-20.88
index_scan	54.83	54.83	0.0
oltp_point_select	0.27	0.27	0.0
oltp_read_only	5.37	5.37	0.0
select_random_points	0.65	0.65	0.0
select_random_ranges	0.64	0.64	0.0
table_scan	55.82	55.82	0.0
types_table_scan	144.97	142.39	-1.78

write_tests	from_latency	to_latency	percent_change
oltp_delete_insert	5.88	5.88	0.0
oltp_insert	2.91	2.91	0.0
oltp_read_write	11.45	11.45	0.0
oltp_update_index	2.97	2.97	0.0
oltp_update_non_index	2.91	2.91	0.0
oltp_write_only	5.99	5.99	0.0
types_delete_insert	6.21	6.21	0.0

coffeegoddd · 2024-11-20T20:07:45Z

@max-hoffman DOLT

comparing_percentages
100.000000 to 100.000000

version	result	total
`c1e9358`	ok	5937457

version	total_tests
`c1e9358`	5937457

correctness_percentage
100.0

coffeegoddd · 2024-11-20T22:23:36Z

@max-hoffman DOLT

comparing_percentages
100.000000 to 100.000000

version	result	total
`fdabb0a`	ok	5937457

version	total_tests
`fdabb0a`	5937457

correctness_percentage
100.0

coffeegoddd · 2024-11-20T23:29:51Z

@max-hoffman DOLT

comparing_percentages
100.000000 to 100.000000

version	result	total
`b76555f`	ok	5937457

version	total_tests
`b76555f`	5937457

correctness_percentage
100.0

jycor

I think using opposite logic is more readable:
https://github.com/dolthub/dolt/compare/max/kv-merge-join...james/refactor?expand=1

While gotos aren't the best, I think it's still pretty understandable, so LGTM.

coffeegoddd · 2024-11-21T23:10:21Z

@max-hoffman DOLT

comparing_percentages
100.000000 to 100.000000

version	result	total
`f67d302`	ok	5937457

version	total_tests
`f67d302`	5937457

correctness_percentage
100.0

zachmu

Not as bad as you built it up to be, generally not too hard to understand.

I think readability would be improved by making the loops explicit and keeping goto statements for true jumps rather than "go to the beginning of this loop".
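
To illustrate the explicit-loop shape, here is a minimal sketch of an inner merge join over two inputs already sorted by key. It is not the code in this PR: `Row`, `Pair`, and `mergeJoin` are hypothetical names, keys are plain ints rather than serialized tuples, and left join and null handling (both on the TODO list) are left out. Each "advance until caught up" step is a `for` loop, so no goto is needed to return to the top.

```go
package main

import "fmt"

// Row is a hypothetical stand-in for a key/value tuple.
type Row struct {
	Key int
	Val string
}

// Pair is one joined output row.
type Pair struct {
	Left, Right Row
}

// mergeJoin inner-joins two inputs that are already sorted by Key.
func mergeJoin(left, right []Row) []Pair {
	var out []Pair
	i, j := 0, 0
	for i < len(left) && j < len(right) {
		// Advance the left side while it is behind the right side.
		for i < len(left) && left[i].Key < right[j].Key {
			i++
		}
		if i == len(left) {
			break
		}
		// Advance the right side while it is behind the left side.
		for j < len(right) && right[j].Key < left[i].Key {
			j++
		}
		if j == len(right) {
			break
		}
		if left[i].Key != right[j].Key {
			continue // left fell behind again; rerun the advance loops
		}
		// Keys match: find the end of the right-side run of equal keys.
		key := left[i].Key
		end := j
		for end < len(right) && right[end].Key == key {
			end++
		}
		// Emit the cross product of the matching left and right runs.
		for ; i < len(left) && left[i].Key == key; i++ {
			for k := j; k < end; k++ {
				out = append(out, Pair{Left: left[i], Right: right[k]})
			}
		}
		j = end
	}
	return out
}

func main() {
	left := []Row{{1, "a"}, {2, "b"}, {2, "c"}, {4, "d"}}
	right := []Row{{2, "x"}, {2, "y"}, {3, "z"}, {4, "w"}}
	for _, p := range mergeJoin(left, right) {
		fmt.Println(p.Left.Key, p.Left.Val, p.Right.Val)
	}
}
```

With the sample inputs in `main`, key 2 yields four pairs (two left rows times two right rows) and key 4 yields one.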

zachmu · 2024-11-22T00:36:48Z