Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Sec. indexes on relations #2670

Merged

Conversation

islamaliev
Copy link
Contributor

@islamaliev islamaliev commented Jun 3, 2024

Relevant issue(s)

Resolves #2601 #2578 #2577

Description

Enables fetching related objects via secondary indexes.

It also fixes a bug with queries that contain multiple aggregates on the same collection.

Note: I will add some more tests

@@ -413,25 +413,23 @@ func resolveAggregates(
childMapping = childMapping.CloneWithoutRender()
mapping.SetChildAt(index, childMapping)

if !childIsMapped {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was a nasty bug that took many hours to spot and fix.
It because visible for TestOneToManyToOneWithSumOfDeepFilterSubTypeOfBothDescAndAsc test where 2 aggregates are used on the same related object:

query {
    Author {
        name
        s1: _sum(book: {field: rating, filter: {publisher: {yearOpened: {_eq: 2013}}}})
        s2: _sum(book: {field: rating, filter: {publisher: {yearOpened: {_ge: 2020}}}})
    }
}

After the first aggregate is processed with correct []Requestable fields, the second would reuse the results of the frst which doesn't include resolving of fields. So Fields field would have a nil value. Which happened to be interpreted by the fetcher as all-fields-requested.
So when dealing with secondary indexes I explicitly added to the fields array a relation id field (like book_id). That made the fetcher to fetch only the related id field and would prevent the filter from letting yearOpened: {_ge: 2020} condition to pass, because it's not fetched.

That's why filter dependencies should be resolved for every aggregate.

Let me know if you think there is a better solution.

@islamaliev islamaliev self-assigned this Jun 3, 2024
@islamaliev islamaliev added area/query Related to the query component perf Performance issue or suggestion labels Jun 3, 2024
@islamaliev islamaliev added this to the DefraDB v0.12 milestone Jun 3, 2024
@islamaliev islamaliev marked this pull request as ready for review June 3, 2024 17:12
@islamaliev islamaliev force-pushed the feat/sec-index-on-relations branch from bbf4941 to d986959 Compare June 3, 2024 17:12
Copy link
Contributor

@AndrewSisley AndrewSisley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: Looks really really good Islam, and I can really imagine how much of a pain that must have been to find and fix. Fantastic job.

I gave the production code a look over, and wanted to submit before the standup in case I get distracted etc. No major suggestions, will likely be a quick approval once I finish my review.

Thanks a bunch for this, and sorry about the pain you must have gone through in the mapper :)

internal/planner/select.go Show resolved Hide resolved
internal/planner/type_join.go Outdated Show resolved Hide resolved
internal/planner/type_join.go Outdated Show resolved Hide resolved
internal/planner/mapper/select.go Show resolved Hide resolved
internal/planner/type_join.go Outdated Show resolved Hide resolved
@@ -164,6 +164,7 @@ func TestIndex_QueryWithIndexOnOneToManyRelationAndFilter_Data(t *testing.T) {
},
testUtils.CreateDoc{
CollectionID: 0,
// TODO: Ask Andy of the docID is intentionally wrong (doesn't exist)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

answer: It shouldn't be wrong, it should be the doc id of ESA - sorry if it is wrong, please correct it if it is 😅

Copy link
Contributor

@AndrewSisley AndrewSisley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - thanks Islam, it looks really good. Approving now but please resolve/argue against all the minor comments from me before merge :)

@@ -17,13 +17,15 @@ import (
)

func TestQueryWithIndexOnOneToManyRelation_IfFilterOnIndexedRelation_ShouldFilter2(t *testing.T) {
// 3 users have a MacBook Pro: Islam, Shahzad, Keenan
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: Thanks for these, is useful info :)

Request: makeExplainQuery(req2),
// we make 3 index fetches to get the 3 address with city == "Montreal"
// and 3 more index fetches to get the corresponding users
// then 3 field fetches to get the name of each user
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: These are great - thanks Islam!

Comment on lines 453 to 464
// firstNode is the node that is executed first (usually an indexed collection).
firstNode planNode // User
// secondNode is the node that is executed second.
secondNode planNode // Device
// subRelField is a field name of a secondary doc that refers to the primary docID (like author_id).
subRelField string // owner_id
// topRelField is a field name of the primary doc that refers to the secondary docID (like author_id).
topRelField string // devices_id
// isInvertable indicates if the join can be inverted.
isInvertable bool
// isInverted indicates if the join direction is inverted.
isInverted bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: Thanks for the documentation

Copy link
Member

@shahzadlone shahzadlone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good :)

if err != nil {
return nil, err
}
// Add the explaination of the rest of the explain graph under the "root" graph.
explainGraphBuilder[joinRootLabel] = indexJoinRootExplainGraph
}

if node.subType != nil {
indexJoinSubTypeExplainGraph, err := buildDebugExplainGraph(node.subType)
if node.childSide.plan != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: I like these!

// Add the attribute(s).
simpleExplainMap[joinRootLabel] = joinType.rootName
simpleExplainMap[joinSubTypeNameLabel] = joinType.subTypeName
simpleExplainMap[joinRootLabel] = immutable.Some(j.childSide.relFieldDef.Name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Instead of a string, this will now be an optional type. Does the caller that receives the result of simpleExplain() handle this by checking HasValue idk why I missed seeing that check. or does it rely on the default string value being "" if it is missing? another question, can it be missing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rootName was also immutable.Option

@islamaliev islamaliev force-pushed the feat/sec-index-on-relations branch from d986959 to e6bcb5d Compare June 4, 2024 13:46
Copy link

codecov bot commented Jun 4, 2024

Codecov Report

Attention: Patch coverage is 84.98294% with 44 lines in your changes missing coverage. Please review.

Project coverage is 78.18%. Comparing base (da3d057) to head (d145032).

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #2670      +/-   ##
===========================================
+ Coverage    78.17%   78.18%   +0.01%     
===========================================
  Files          303      303              
  Lines        22996    23052      +56     
===========================================
+ Hits         17976    18021      +45     
- Misses        3661     3672      +11     
  Partials      1359     1359              
Flag Coverage Δ
all-tests 78.18% <84.98%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
client/normal_nil.go 94.12% <100.00%> (-5.88%) ⬇️
client/schema_field_description.go 65.57% <100.00%> (ø)
internal/db/collection_index.go 88.27% <100.00%> (-0.11%) ⬇️
internal/db/index.go 72.28% <100.00%> (-0.99%) ⬇️
internal/planner/explain.go 60.56% <100.00%> (ø)
internal/planner/mapper/select.go 40.00% <ø> (ø)
internal/planner/planner.go 82.55% <100.00%> (ø)
internal/planner/select.go 85.11% <100.00%> (+0.52%) ⬆️
internal/planner/mapper/mapper.go 89.52% <85.71%> (ø)
internal/planner/type_join.go 81.23% <83.13%> (+0.44%) ⬆️

... and 11 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update da3d057...d145032. Read the comment docs.

@islamaliev islamaliev force-pushed the feat/sec-index-on-relations branch from e6bcb5d to fe8a7d4 Compare June 4, 2024 16:12
@islamaliev islamaliev merged commit 1eb1fb5 into sourcenetwork:develop Jun 4, 2024
33 checks passed
@fredcarle fredcarle added the feature New feature or request label Jun 24, 2024
@shahzadlone
Copy link
Member

Bug bash: got a crash course on this PR and tested some edge case tests (like TestQueryWithIndexOnOneToMany_IfFilterOnIndexedRelation_ShouldFilterWithExplain) with more fields indexed. LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/query Related to the query component feature New feature or request perf Performance issue or suggestion
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Sec. index: enable indexing of related objects
4 participants