[Gen4] Aggregation and Grouping on Operators #12994
Merged
64 commits
595d8d8
initial aggregator operator move to new horizon planner
harshit-gangal a9918af
added ordering for grouping key in horizon planning
harshit-gangal 329d5b7
gen4 planner: start of push aggregations through joins
systay 11ad8fb
planner feat: split aggregation across joins
systay 4d0c986
ordering push down on aggregations
harshit-gangal 7dbaeb7
aggregation grouping based on the user provided ordering
harshit-gangal 56933ea
gen4 planner refactor: use visitor pattern
systay 8907650
gen4 test: make TestOne enable operator debug printing
systay 1c78485
gen4 planner: fix groupby in shortDescription
systay 9bd77cb
fall back when we are doing distinct aggregations or aggregations on …
systay cee9df6
group by visitor returning column index
harshit-gangal 4f81544
make sure that the projection under the aggregation makes sense
systay 4e9adcb
feat: keep better track of offsets in aggregation planning
GuptaManan100 3db4cb9
feat: add code to figure out grouping can be pushed down in the prese…
GuptaManan100 4f66c23
aggregator to add grouping column if not part of select expression an…
harshit-gangal 85265db
wip planner refactoring: support more aggregations
systay 94a1dbc
gen4 refactoring: stop compact from going down into the route
systay 1a06abc
update more test cases
systay 99c178d
handle aggr functions not supported in vtgate
systay ea22160
once under the route, no need to be clever about pushing columns
systay 95aa81a
make sure to order by the correct expression
systay eda0deb
Various operator fixes
systay d93d95f
make sure that the old horizon planning still works as before
systay c747827
add end2end tests for the new grouping and ordering capabilities
systay f03b34d
refactor aggregation handling
systay 2a8f28d
update test expectations
systay 6cad2fd
add more aggregation tests
systay 5f8fe3f
handle min/max in the new aggregation planning
systay 8f98c76
refactor. clean up. make pretty
systay ffe7666
join engine fix: pass bind vars with type value when left side have e…
harshit-gangal f6f3de5
projection pushing to use reserved vars to avoid conflicting bind var…
harshit-gangal dab64b7
gen4 planner refactoring
systay 446c4b1
gen4 planner refactoring: clean up logic, add comments
systay d97915d
projection not pushed on join when created for aggregation above, fix…
harshit-gangal 60b8ea3
added new aggregation cases
harshit-gangal 3a11228
add column to pushed aggregation if it is present in the original top…
harshit-gangal 27c7c26
adding dummy grouping to right side aggregator when aggregation is pr…
harshit-gangal ea0c9c8
remove extra group by; use IF instead of COALESCE
systay 101734c
projection to contain output columns as aliasExpr, count star multipl…
harshit-gangal 5cbeb26
make sure to add projection columns for all aggregations
systay 3d95ce7
add group by on the RHS of split aggregations so we don't get invalid…
systay 6aa627f
test expectations
systay 44eb9de
make it possible to push sorting under projection
systay aefc508
refactoring
systay ca95c24
handle opcode.AggregationRandom in the new operator model
systay 2cb3b4a
add logging for each operator transformation
systay 6af4190
test expectation
systay d16ec81
handle count on columns
systay 7e1f905
refactor code and re-use count(*) more aggressively
systay cbf18fa
add random e2e testing for aggregations
systay 9c4cadf
added test with known inconsistencies between mysql and vitess
systay caa25d7
refactor to use types
systay 266ed85
remove failing test and remove unnecessary printing
systay 282ab7f
added EnableGeneralLog method to enable general logs on all the mysql…
harshit-gangal ce8b7ef
print vitess plan when vitess and mysql result does not match
harshit-gangal 8bb3370
add ORDER BY code to the fuzzer
systay c97c627
add group by order by test
harshit-gangal 06475db
do not pass 0 to rand.Intn func as it causes panic
harshit-gangal 82b0175
some code refactor and comments
harshit-gangal 97eff7d
added comments
systay 566a796
added more comments
systay 3c1b92c
refactor and comment after review
systay fd62d66
add column names to projections
systay 41235db
Merge remote-tracking branch 'upstream/main' into aggr-op
harshit-gangal
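One of the commits above notes that passing 0 to `rand.Intn` panics. A minimal sketch of a guard for that case follows; `intnOrZero` is a hypothetical helper name used only for illustration and is not part of this PR.

```go
package main

import (
	"fmt"
	"math/rand"
)

// intnOrZero is a hypothetical helper (not from the PR). rand.Intn panics
// when n <= 0, so this returns 0 in that case instead of calling it.
func intnOrZero(n int) int {
	if n <= 0 {
		return 0
	}
	return rand.Intn(n)
}

func main() {
	var grouping []string // empty, so len(grouping) == 0
	fmt.Println(intnOrZero(len(grouping))) // prints 0
}
```

The test file in this PR takes the equivalent route of checking `len(grouping) > 0` before calling `rand.Intn`.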
212 changes: 212 additions & 0 deletions
go/test/endtoend/vtgate/queries/aggregation/fuzz_test.go
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,212 @@ | ||
| /* | ||
| Copyright 2023 The Vitess Authors. | ||
|
|
||
| Licensed under the Apache License, Version 2.0 (the "License"); | ||
| you may not use this file except in compliance with the License. | ||
| You may obtain a copy of the License at | ||
|
|
||
| http://www.apache.org/licenses/LICENSE-2.0 | ||
|
|
||
| Unless required by applicable law or agreed to in writing, software | ||
| distributed under the License is distributed on an "AS IS" BASIS, | ||
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| See the License for the specific language governing permissions and | ||
| limitations under the License. | ||
| */ | ||
|
|
||
| package aggregation | ||
|
|
||
| import ( | ||
| "fmt" | ||
| "math/rand" | ||
| "strings" | ||
| "testing" | ||
| "time" | ||
|
|
||
| "golang.org/x/exp/maps" | ||
|
|
||
| "vitess.io/vitess/go/vt/log" | ||
| ) | ||
|
|
||
| type ( | ||
| column struct { | ||
| name string | ||
| typ string | ||
| } | ||
| tableT struct { | ||
| name string | ||
| columns []column | ||
| } | ||
| ) | ||
|
|
||
| func TestFuzzAggregations(t *testing.T) { | ||
| // This test randomizes values and queries, and checks that mysql returns the same values that Vitess does | ||
| mcmp, closer := start(t) | ||
| defer closer() | ||
|
|
||
| // insert at least one row: an empty values list would produce invalid SQL | ||
| noOfRows := rand.Intn(20) + 1 | ||
| var values []string | ||
| for i := 0; i < noOfRows; i++ { | ||
| values = append(values, fmt.Sprintf("(%d, 'name%d', 'value%d', %d)", i, i, i, i)) | ||
| } | ||
| t1Insert := fmt.Sprintf("insert into t1 (t1_id, name, value, shardKey) values %s;", strings.Join(values, ",")) | ||
| values = nil | ||
| noOfRows = rand.Intn(20) + 1 // at least one row, so the insert statement stays valid | ||
| for i := 0; i < noOfRows; i++ { | ||
| values = append(values, fmt.Sprintf("(%d, %d)", i, i)) | ||
| } | ||
| t2Insert := fmt.Sprintf("insert into t2 (id, shardKey) values %s;", strings.Join(values, ",")) | ||
|
|
||
| mcmp.Exec(t1Insert) | ||
| mcmp.Exec(t2Insert) | ||
|
|
||
| t.Cleanup(func() { | ||
| if t.Failed() { | ||
| fmt.Println(t1Insert) | ||
| fmt.Println(t2Insert) | ||
| } | ||
| }) | ||
|
|
||
| schema := map[string]tableT{ | ||
| "t1": {name: "t1", columns: []column{ | ||
| {name: "t1_id", typ: "bigint"}, | ||
| {name: "name", typ: "varchar"}, | ||
| {name: "value", typ: "varchar"}, | ||
| {name: "shardKey", typ: "bigint"}, | ||
| }}, | ||
| "t2": {name: "t2", columns: []column{ | ||
| {name: "id", typ: "bigint"}, | ||
| {name: "shardKey", typ: "bigint"}, | ||
| }}, | ||
| } | ||
|
|
||
| endBy := time.Now().Add(1 * time.Second) | ||
| schemaTables := maps.Values(schema) | ||
|
|
||
| var queryCount int | ||
| // keep generating queries until the time budget runs out or a query fails | ||
| for time.Now().Before(endBy) && !t.Failed() { | ||
| tables := createTables(schemaTables) | ||
| query := randomQuery(tables, 3, 3) | ||
| mcmp.Exec(query) | ||
| if t.Failed() { | ||
| fmt.Println(query) | ||
| } | ||
| queryCount++ | ||
| } | ||
| log.Infof("Queries successfully executed: %d", queryCount) | ||
| } | ||
|
|
||
| func randomQuery(tables []tableT, maxAggrs, maxGroupBy int) string { | ||
| randomCol := func(tblIdx int) (string, string) { | ||
| tbl := tables[tblIdx] | ||
| col := randomEl(tbl.columns) | ||
| return fmt.Sprintf("tbl%d.%s", tblIdx, col.name), col.typ | ||
| } | ||
| predicates := createPredicates(tables, randomCol) | ||
| aggregates := createAggregations(tables, maxAggrs, randomCol) | ||
| grouping := createGroupBy(tables, maxGroupBy, randomCol) | ||
| sel := "select /*vt+ PLANNER=Gen4 */ " + strings.Join(aggregates, ", ") + " from " | ||
|
|
||
| var tbls []string | ||
| for i, s := range tables { | ||
| tbls = append(tbls, fmt.Sprintf("%s as tbl%d", s.name, i)) | ||
| } | ||
| sel += strings.Join(tbls, ", ") | ||
|
|
||
| if len(predicates) > 0 { | ||
| sel += " where " | ||
| sel += strings.Join(predicates, " and ") | ||
| } | ||
| if len(grouping) > 0 { | ||
| sel += " group by " | ||
| sel += strings.Join(grouping, ", ") | ||
| } | ||
| // we do it this way so we don't have to do only `only_full_group_by` queries | ||
| var noOfOrderBy int | ||
| if len(grouping) > 0 { | ||
| // rand.Intn panics when its argument is 0, so only call it when grouping is non-empty | ||
| noOfOrderBy = rand.Intn(len(grouping)) | ||
| } | ||
| if noOfOrderBy > 0 { | ||
| noOfOrderBy = 0 // TODO turning on ORDER BY here causes lots of failures to happen | ||
| } | ||
| if noOfOrderBy > 0 { | ||
| var orderBy []string | ||
| for noOfOrderBy > 0 { | ||
| noOfOrderBy-- | ||
| if rand.Intn(2) == 0 || len(grouping) == 0 { | ||
| orderBy = append(orderBy, randomEl(aggregates)) | ||
| } else { | ||
| orderBy = append(orderBy, randomEl(grouping)) | ||
| } | ||
| } | ||
| sel += " order by " | ||
| sel += strings.Join(orderBy, ", ") | ||
| } | ||
| return sel | ||
| } | ||
|
|
||
| func createGroupBy(tables []tableT, maxGB int, randomCol func(tblIdx int) (string, string)) (grouping []string) { | ||
| noOfGBs := rand.Intn(maxGB) | ||
| for i := 0; i < noOfGBs; i++ { | ||
| tblIdx := rand.Intn(len(tables)) | ||
| col, _ := randomCol(tblIdx) | ||
| grouping = append(grouping, col) | ||
| } | ||
| return | ||
| } | ||
|
|
||
| func createAggregations(tables []tableT, maxAggrs int, randomCol func(tblIdx int) (string, string)) (aggregates []string) { | ||
| aggregations := []func(string) string{ | ||
| func(_ string) string { return "count(*)" }, | ||
| func(e string) string { return fmt.Sprintf("count(%s)", e) }, | ||
| //func(e string) string { return fmt.Sprintf("sum(%s)", e) }, | ||
| //func(e string) string { return fmt.Sprintf("avg(%s)", e) }, | ||
| //func(e string) string { return fmt.Sprintf("min(%s)", e) }, | ||
| //func(e string) string { return fmt.Sprintf("max(%s)", e) }, | ||
| } | ||
|
|
||
| noOfAggrs := rand.Intn(maxAggrs) + 1 | ||
| for i := 0; i < noOfAggrs; i++ { | ||
| tblIdx := rand.Intn(len(tables)) | ||
| e, _ := randomCol(tblIdx) | ||
| aggregates = append(aggregates, randomEl(aggregations)(e)) | ||
| } | ||
| return aggregates | ||
| } | ||
|
|
||
| func createTables(schemaTables []tableT) []tableT { | ||
| noOfTables := rand.Intn(2) + 1 | ||
| var tables []tableT | ||
|
|
||
| for i := 0; i < noOfTables; i++ { | ||
| tables = append(tables, randomEl(schemaTables)) | ||
| } | ||
| return tables | ||
| } | ||
|
|
||
| func createPredicates(tables []tableT, randomCol func(tblIdx int) (string, string)) (predicates []string) { | ||
| for idx1 := range tables { | ||
| for idx2 := range tables { | ||
| if idx1 == idx2 { | ||
| continue | ||
| } | ||
| noOfPredicates := rand.Intn(2) | ||
|
|
||
| for noOfPredicates > 0 { | ||
| col1, t1 := randomCol(idx1) | ||
| col2, t2 := randomCol(idx2) | ||
| if t1 != t2 { | ||
| continue | ||
| } | ||
| predicates = append(predicates, fmt.Sprintf("%s = %s", col1, col2)) | ||
| noOfPredicates-- | ||
| } | ||
| } | ||
| } | ||
| return predicates | ||
| } | ||
|
|
||
| func randomEl[K any](in []K) K { | ||
| return in[rand.Intn(len(in))] | ||
| } |