Support pagination in V2 engine, phase 1 #1497
* Fixing integration tests broken during POC Signed-off-by: MaxKsyunz <[email protected]> * Comment to clarify an exception. Signed-off-by: MaxKsyunz <[email protected]> * Add support for paginated scroll request, first page. Implement PaginatedPlanCache.convertToPlan for second page to work. Signed-off-by: MaxKsyunz <[email protected]> * Progress on paginated scroll request, subsequent page. Signed-off-by: MaxKsyunz <[email protected]> * Move `ExpressionSerializer` from `opensearch` to `core`. Signed-off-by: Yury-Fridlyand <[email protected]> * Rename `Cursor` `asString` to `toString`. Signed-off-by: Yury-Fridlyand <[email protected]> * Disable scroll cleaning. Signed-off-by: Yury-Fridlyand <[email protected]> * Add full cursor serialization and deserialization. Signed-off-by: Yury-Fridlyand <[email protected]> * Misc fixes. Signed-off-by: Yury-Fridlyand <[email protected]> * Further work on pagination. * Added push down page size from `LogicalPaginate` to `LogicalRelation`. * Improved cursor encoding and decoding. * Added cursor compression. * Fixed issuing `SearchScrollRequest`. * Fixed returning last empty page. * Minor code grooming/commenting. Signed-off-by: Yury-Fridlyand <[email protected]> * Pagination fix for empty indices. Signed-off-by: Yury-Fridlyand <[email protected]> * Fix error reporting on wrong cursor. Signed-off-by: Yury-Fridlyand <[email protected]> * Minor comments and error reporting improvement. Signed-off-by: Yury-Fridlyand <[email protected]> * Add an end-to-end integration test. Signed-off-by: Yury-Fridlyand <[email protected]> * Add `explain` request handlers. Signed-off-by: Yury-Fridlyand <[email protected]> * Add IT for explain. Signed-off-by: Yury-Fridlyand <[email protected]> * Address issues flagged by checkstyle build step (#229) Signed-off-by: MaxKsyunz <[email protected]> * Pagination, phase 1: Add unit tests for `:core` module with coverage. (#230) * Add unit tests for `:core` module with coverage. 
Uncovered: `toCursor`, because it is will be changed soon. Signed-off-by: Yury-Fridlyand <[email protected]> * Pagination, phase 1: Add unit tests for SQL module with coverage. (#239) * Add unit tests for SQL module with coverage. Signed-off-by: Yury-Fridlyand <[email protected]> * Update sql/src/main/java/org/opensearch/sql/sql/domain/SQLQueryRequest.java Signed-off-by: Yury-Fridlyand <[email protected]> Co-authored-by: GabeFernandez310 <[email protected]> --------- Signed-off-by: Yury-Fridlyand <[email protected]> Co-authored-by: GabeFernandez310 <[email protected]> * Pagination, phase 1: Add unit tests for `:opensearch` module with coverage. (#233) * Add UT for `:opensearch` module with full coverage, except `toCursor`. Signed-off-by: Yury-Fridlyand <[email protected]> * Fix checkstyle. Signed-off-by: Yury-Fridlyand <[email protected]> --------- Signed-off-by: Yury-Fridlyand <[email protected]> * Fix the merges. Signed-off-by: Yury-Fridlyand <[email protected]> * Fix explain. Signed-off-by: Yury-Fridlyand <[email protected]> * Fix scroll cleaning. Signed-off-by: Yury-Fridlyand <[email protected]> * Store `TotalHits` and use it to report `total` in response. Signed-off-by: Yury-Fridlyand <[email protected]> * Add missing UT for `:protocol` module. Signed-off-by: Yury-Fridlyand <[email protected]> * Fix PPL UTs damaged in f4ea4ad. Signed-off-by: Yury-Fridlyand <[email protected]> * Minor checkstyle fixes. Signed-off-by: Yury-Fridlyand <[email protected]> * Fallback to v1 engine for pagination (#245) * Pagination fallback integration tests. Signed-off-by: MaxKsyunz <[email protected]> * Add UT with coverage for `toCursor` serialization. Signed-off-by: Yury-Fridlyand <[email protected]> * Fix broken tests in `legacy`. Signed-off-by: Yury-Fridlyand <[email protected]> * Fix getting `total` from non-paged requests and from queries without `FROM` clause. Signed-off-by: Yury-Fridlyand <[email protected]> * Fix scroll cleaning. 
Signed-off-by: Yury-Fridlyand <[email protected]> * Fix cursor request processing. Signed-off-by: Yury-Fridlyand <[email protected]> * Update ITs. Signed-off-by: Yury-Fridlyand <[email protected]> * Fix (again) TotalHits feature. Signed-off-by: Yury-Fridlyand <[email protected]> * Fix typo in prometheus config. Signed-off-by: Yury-Fridlyand <[email protected]> * Recover commented logging. Signed-off-by: Yury-Fridlyand <[email protected]> * Move `test_pagination_blackbox` to a separate class and add logging. Signed-off-by: Yury-Fridlyand <[email protected]> * Address some PR feedbacks: rename some classes and revert unnecessary whitespace changed. Signed-off-by: Yury-Fridlyand <[email protected]> * Minor commenting. Signed-off-by: Yury-Fridlyand <[email protected]> * Address PR comments. * Add javadocs * Renames * Cleaning up some comments * Remove unused code * Speed up IT Signed-off-by: Yury-Fridlyand <[email protected]> * Minor missing changes. Signed-off-by: Yury-Fridlyand <[email protected]> * Integration tests for fetch_size, max_result_window, and query.size_limit (#248) Signed-off-by: MaxKsyunz <[email protected]> * Remove `PaginatedQueryService`, extend `QueryService` to hold two planners and use them. Signed-off-by: Yury-Fridlyand <[email protected]> * Move push down functions from request builders to a new interface. Signed-off-by: Yury-Fridlyand <[email protected]> * Some file moves. Signed-off-by: Yury-Fridlyand <[email protected]> * Minor clean-up according to PR review. Signed-off-by: Yury-Fridlyand <[email protected]> --------- Signed-off-by: MaxKsyunz <[email protected]> Signed-off-by: Yury-Fridlyand <[email protected]> Co-authored-by: MaxKsyunz <[email protected]> Co-authored-by: GabeFernandez310 <[email protected]> Co-authored-by: Max Ksyunz <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
Yury-Fridlyand
left a comment
I copied unresolved questions/issues from Bit-Quill#226
core/src/main/java/org/opensearch/sql/planner/optimizer/rule/PushPageSize.java
Outdated
core/src/main/java/org/opensearch/sql/storage/StorageEngine.java
Outdated
core/src/main/java/org/opensearch/sql/executor/execution/PaginatedPlan.java
Outdated
Signed-off-by: Yury-Fridlyand <[email protected]>
core/src/main/java/org/opensearch/sql/planner/SerializablePlan.java
Outdated
- QueryResponse response = new QueryResponse(physicalPlan.schema(), result);
+ QueryResponse response = new QueryResponse(physicalPlan.schema(), result,
+     plan.getTotalHits(), planSerializer.convertToCursor(plan));
I'm also confused about why we need to maintain totalHits in each physical operator.
total hits == number of rows matching the search criteria
Do you mean total hits is the number of matched rows in each page, or overall? If the latter, why does it end up as 0?
 * Override pushDown* methods from TableScanBuilder as more features
 * support pagination.
 */
public class OpenSearchPagedIndexScanBuilder extends TableScanBuilder {
This class is just a holder and seems to do nothing?
Nice finding. Yes, it is. It was simplified so it became just a holder.
* Remove PaginateOperator class since it is no longer used. --------- Signed-off-by: MaxKsyunz <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
Let's tag this for 2.8 and discuss the requested changes. We are a few hours away from code freeze and this is a huge PR with critical changes; let's move it to the next release train.
The base branch was changed.
Signed-off-by: Yury-Fridlyand <[email protected]>
MaxKsyunz
left a comment
@Yury-Fridlyand thank you so much for the diagrams!
Could you add method returns and activations to objects where appropriate? They make it easier to visualize the call stack.
Signed-off-by: Yury-Fridlyand <[email protected]>
# Conflicts: # opensearch/src/main/java/org/opensearch/sql/opensearch/executor/protector/ResourceMonitorPlan.java
Signed-off-by: MaxKsyunz <[email protected]>
* Support pagination in V2 engine, phase 1 (#226) * Fixing integration tests broken during POC Signed-off-by: MaxKsyunz <[email protected]> * Comment to clarify an exception. Signed-off-by: MaxKsyunz <[email protected]> * Add support for paginated scroll request, first page. Implement PaginatedPlanCache.convertToPlan for second page to work. Signed-off-by: MaxKsyunz <[email protected]> * Progress on paginated scroll request, subsequent page. Signed-off-by: MaxKsyunz <[email protected]> * Move `ExpressionSerializer` from `opensearch` to `core`. Signed-off-by: Yury-Fridlyand <[email protected]> * Rename `Cursor` `asString` to `toString`. Signed-off-by: Yury-Fridlyand <[email protected]> * Disable scroll cleaning. Signed-off-by: Yury-Fridlyand <[email protected]> * Add full cursor serialization and deserialization. Signed-off-by: Yury-Fridlyand <[email protected]> * Misc fixes. Signed-off-by: Yury-Fridlyand <[email protected]> * Further work on pagination. * Added push down page size from `LogicalPaginate` to `LogicalRelation`. * Improved cursor encoding and decoding. * Added cursor compression. * Fixed issuing `SearchScrollRequest`. * Fixed returning last empty page. * Minor code grooming/commenting. Signed-off-by: Yury-Fridlyand <[email protected]> * Pagination fix for empty indices. Signed-off-by: Yury-Fridlyand <[email protected]> * Fix error reporting on wrong cursor. Signed-off-by: Yury-Fridlyand <[email protected]> * Minor comments and error reporting improvement. Signed-off-by: Yury-Fridlyand <[email protected]> * Add an end-to-end integration test. Signed-off-by: Yury-Fridlyand <[email protected]> * Add `explain` request handlers. Signed-off-by: Yury-Fridlyand <[email protected]> * Add IT for explain. Signed-off-by: Yury-Fridlyand <[email protected]> * Address issues flagged by checkstyle build step (#229) Signed-off-by: MaxKsyunz <[email protected]> * Pagination, phase 1: Add unit tests for `:core` module with coverage. 
(#230) * Add unit tests for `:core` module with coverage. Uncovered: `toCursor`, because it is will be changed soon. Signed-off-by: Yury-Fridlyand <[email protected]> * Pagination, phase 1: Add unit tests for SQL module with coverage. (#239) * Add unit tests for SQL module with coverage. Signed-off-by: Yury-Fridlyand <[email protected]> * Update sql/src/main/java/org/opensearch/sql/sql/domain/SQLQueryRequest.java Signed-off-by: Yury-Fridlyand <[email protected]> Co-authored-by: GabeFernandez310 <[email protected]> --------- Signed-off-by: Yury-Fridlyand <[email protected]> Co-authored-by: GabeFernandez310 <[email protected]> * Pagination, phase 1: Add unit tests for `:opensearch` module with coverage. (#233) * Add UT for `:opensearch` module with full coverage, except `toCursor`. Signed-off-by: Yury-Fridlyand <[email protected]> * Fix checkstyle. Signed-off-by: Yury-Fridlyand <[email protected]> --------- Signed-off-by: Yury-Fridlyand <[email protected]> * Fix the merges. Signed-off-by: Yury-Fridlyand <[email protected]> * Fix explain. Signed-off-by: Yury-Fridlyand <[email protected]> * Fix scroll cleaning. Signed-off-by: Yury-Fridlyand <[email protected]> * Store `TotalHits` and use it to report `total` in response. Signed-off-by: Yury-Fridlyand <[email protected]> * Add missing UT for `:protocol` module. Signed-off-by: Yury-Fridlyand <[email protected]> * Fix PPL UTs damaged in f4ea4ad. Signed-off-by: Yury-Fridlyand <[email protected]> * Minor checkstyle fixes. Signed-off-by: Yury-Fridlyand <[email protected]> * Fallback to v1 engine for pagination (#245) * Pagination fallback integration tests. Signed-off-by: MaxKsyunz <[email protected]> * Add UT with coverage for `toCursor` serialization. Signed-off-by: Yury-Fridlyand <[email protected]> * Fix broken tests in `legacy`. Signed-off-by: Yury-Fridlyand <[email protected]> * Fix getting `total` from non-paged requests and from queries without `FROM` clause. 
Signed-off-by: Yury-Fridlyand <[email protected]> * Fix scroll cleaning. Signed-off-by: Yury-Fridlyand <[email protected]> * Fix cursor request processing. Signed-off-by: Yury-Fridlyand <[email protected]> * Update ITs. Signed-off-by: Yury-Fridlyand <[email protected]> * Fix (again) TotalHits feature. Signed-off-by: Yury-Fridlyand <[email protected]> * Fix typo in prometheus config. Signed-off-by: Yury-Fridlyand <[email protected]> * Recover commented logging. Signed-off-by: Yury-Fridlyand <[email protected]> * Move `test_pagination_blackbox` to a separate class and add logging. Signed-off-by: Yury-Fridlyand <[email protected]> * Address some PR feedbacks: rename some classes and revert unnecessary whitespace changed. Signed-off-by: Yury-Fridlyand <[email protected]> * Minor commenting. Signed-off-by: Yury-Fridlyand <[email protected]> * Address PR comments. * Add javadocs * Renames * Cleaning up some comments * Remove unused code * Speed up IT Signed-off-by: Yury-Fridlyand <[email protected]> * Minor missing changes. Signed-off-by: Yury-Fridlyand <[email protected]> * Integration tests for fetch_size, max_result_window, and query.size_limit (#248) Signed-off-by: MaxKsyunz <[email protected]> * Remove `PaginatedQueryService`, extend `QueryService` to hold two planners and use them. Signed-off-by: Yury-Fridlyand <[email protected]> * Move push down functions from request builders to a new interface. Signed-off-by: Yury-Fridlyand <[email protected]> * Some file moves. Signed-off-by: Yury-Fridlyand <[email protected]> * Minor clean-up according to PR review. Signed-off-by: Yury-Fridlyand <[email protected]> --------- Signed-off-by: MaxKsyunz <[email protected]> Signed-off-by: Yury-Fridlyand <[email protected]> Co-authored-by: MaxKsyunz <[email protected]> Co-authored-by: GabeFernandez310 <[email protected]> Co-authored-by: Max Ksyunz <[email protected]> * Make scroll timeout configurable. 
Signed-off-by: Yury-Fridlyand <[email protected]> * Fix IT to set cursor keep alive parameter. Signed-off-by: Yury-Fridlyand <[email protected]> * Remove `QueryId.None`. Signed-off-by: Yury-Fridlyand <[email protected]> * Rename according to PR feedback. Signed-off-by: Yury-Fridlyand <[email protected]> * Remove default implementations of `PushDownRequestBuilder`. Signed-off-by: Yury-Fridlyand <[email protected]> * Merge paginated plan optimizer into the regular optimizer. (opensearch-project#1516) Merge paginated plan optimizer into the regular optimizer. --------- Signed-off-by: MaxKsyunz <[email protected]> Co-authored-by: Yury-Fridlyand <[email protected]> * Complete rework on serialization and deserialization. (opensearch-project#1498) Signed-off-by: Yury-Fridlyand <[email protected]> * Resolve merge conflicts and fix tests. Signed-off-by: Yury-Fridlyand <[email protected]> * Minor cleanup. Signed-off-by: Yury-Fridlyand <[email protected]> * Minor cleanup - missing changes for the previous commit. Signed-off-by: Yury-Fridlyand <[email protected]> * Remove paginate operator (opensearch-project#1528) * Remove PaginateOperator class since it is no longer used. --------- Signed-off-by: MaxKsyunz <[email protected]> * Remove `PaginatedPlan` - move logic to `QueryPlan`. Signed-off-by: Yury-Fridlyand <[email protected]> * Remove default implementations from `SerializablePlan`. Signed-off-by: Yury-Fridlyand <[email protected]> * Add a doc. Signed-off-by: Yury-Fridlyand <[email protected]> * Update design graphs. Signed-off-by: Yury-Fridlyand <[email protected]> * More fixes for merge from upstream/main. Signed-off-by: MaxKsyunz <[email protected]> --------- Signed-off-by: MaxKsyunz <[email protected]> Signed-off-by: Yury-Fridlyand <[email protected]> Co-authored-by: MaxKsyunz <[email protected]> Co-authored-by: GabeFernandez310 <[email protected]> Co-authored-by: Max Ksyunz <[email protected]>
Supersedes #1483 and Bit-Quill#226
Description
https://github.com/Bit-Quill/opensearch-project-sql/blob/61767f2b200b2a7f8c8b2df32de209b3c30caa61/docs/dev/Pagination-v2.md
PHASE 1
Pagination supports queries like
For example:
Cursor request:
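As a minimal sketch of the two request shapes involved, assuming the plugin's `POST /_plugins/_sql` endpoint: the first request carries the query together with `fetch_size`, and each follow-up request carries only the `cursor` returned with the previous page. The class `PagingRequests`, its method names, and the query text are hypothetical and only illustrate the shape of the bodies.

```java
final class PagingRequests {
    private PagingRequests() {}

    /** Body of the first request: the SQL text plus the desired page size. */
    static String firstPage(String sql, int fetchSize) {
        return "{\"query\":\"" + escape(sql) + "\",\"fetch_size\":" + fetchSize + "}";
    }

    /** Body of every later request: only the cursor from the previous response. */
    static String nextPage(String cursor) {
        return "{\"cursor\":\"" + escape(cursor) + "\"}";
    }

    // Minimal JSON string escaping (backslash and double quote only);
    // a real client would use a JSON library instead.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        // Illustrative values; the index name and cursor text are placeholders.
        System.out.println(firstPage("SELECT * FROM my_index", 5));
        System.out.println(nextPage("cursor-from-previous-response"));
    }
}
```

The cursor is treated as an opaque string here: the client never inspects it and only echoes it back.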
All credits to @MaxKsyunz.
You can use the attached script for testing as well. Command line:
`./cursor_test.sh <table> <page size>`. Requires `jq`. Rename it before use: cursor_test.sh.txt
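A client-side paging loop of the kind such a test exercises can be sketched as follows. This is a minimal sketch, not the attached script: `CursorPager`, `Page`, and `fakeBackend` are hypothetical names, and the in-memory backend stands in for real HTTP calls. It relies only on the behavior listed under Features, that an empty page signals the end of pagination.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

final class CursorPager {
    /** Hypothetical page shape: the rows plus the cursor for the next request. */
    static final class Page {
        final List<String> rows;
        final String cursor;

        Page(List<String> rows, String cursor) {
            this.rows = rows;
            this.cursor = cursor;
        }
    }

    /**
     * Collects all rows. The fetch function receives null for the first
     * request and the previous page's cursor afterwards.
     */
    static List<String> fetchAll(Function<String, Page> fetch) {
        List<String> all = new ArrayList<>();
        String cursor = null;
        while (true) {
            Page page = fetch.apply(cursor);
            if (page.rows.isEmpty()) {
                break; // empty page: pagination is finished
            }
            all.addAll(page.rows);
            cursor = page.cursor;
        }
        return all;
    }

    /** In-memory stand-in for the server; the "cursor" is just the next offset. */
    static Function<String, Page> fakeBackend(List<String> data, int pageSize) {
        return cursor -> {
            int from = cursor == null ? 0 : Integer.parseInt(cursor);
            int to = Math.min(from + pageSize, data.size());
            if (from >= to) {
                return new Page(Collections.<String>emptyList(), null);
            }
            return new Page(new ArrayList<>(data.subList(from, to)), String.valueOf(to));
        };
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(fetchAll(fakeBackend(data, 2))); // prints [a, b, c, d, e]
    }
}
```

With five rows and a page size of two, the loop issues four requests: three that return rows and a final one that returns the empty page.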
Scroll API usage doc:
https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/java-search-scrolling.html
Please see the team review and discussion in Bit-Quill#226.
Pagination demo: https://user-images.githubusercontent.com/88679692/224208630-8d38d833-abf8-4035-8d15-d5fb4382deca.mp4
Features
`ResourceMonitorPlan`
`Planner` and `Implementor`
Last page is empty - this shows pagination is finished.
`totalHits` - total number of matched rows/docs (see details)
Design Diagrams
New code workflows are highlighted.
Design document with these graphs is also available in the docs.
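Both diagrams below turn the physical plan into a cursor string (`convertToCursor`) and back (`convertToPlan`), and the commit log mentions cursor compression and encoding. The following is a minimal sketch of that round trip under stated assumptions: Java serialization, GZIP, and URL-safe Base64 are choices made for this illustration, and `CursorCodec` and `ScanState` are hypothetical names, not the plugin's actual `PlanSerializer` format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class CursorCodec {
    /** Hypothetical stand-in for the state a paged scan needs in order to resume. */
    static final class ScanState implements Serializable {
        private static final long serialVersionUID = 1L;
        final String scrollId;
        final long totalHits;

        ScanState(String scrollId, long totalHits) {
            this.scrollId = scrollId;
            this.totalHits = totalHits;
        }
    }

    /** Serialize, compress, then encode as a URL-safe string. */
    static String toCursor(Serializable state) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(new GZIPOutputStream(bytes))) {
            out.writeObject(state);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed here, so the GZIP trailer has been written.
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes.toByteArray());
    }

    /** Reverse of toCursor: decode, decompress, then deserialize. */
    static Object fromCursor(String cursor) {
        byte[] raw = Base64.getUrlDecoder().decode(cursor);
        try (ObjectInputStream in =
                 new ObjectInputStream(new GZIPInputStream(new ByteArrayInputStream(raw)))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Unknown class in cursor", e);
        }
    }

    public static void main(String[] args) {
        String cursor = toCursor(new ScanState("scroll-123", 42L));
        ScanState restored = (ScanState) fromCursor(cursor);
        System.out.println(restored.scrollId + " " + restored.totalHits); // prints scroll-123 42
    }
}
```

Deserializing client-supplied bytes with plain Java serialization is unsafe without class filtering, so this sketch only illustrates the encode and decode steps.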
First page
sequenceDiagram
    participant SQLService
    participant QueryPlanFactory
    participant ResponseListener
    participant ResponseFormatter
    participant CanPaginateVisitor
    participant QueryService
    participant Planner
    participant CreatePagingTableScanBuilder
    participant OpenSearchExecutionEngine
    participant PlanSerializer
    participant Physical Plan Tree
    participant OpenSearchPagedIndexScan
    participant OpenSearchScrollRequest
    SQLService->>QueryPlanFactory:execute
    critical
        QueryPlanFactory->>CanPaginateVisitor:canConvertToCursor
        CanPaginateVisitor-->>QueryPlanFactory:true
    end
    QueryPlanFactory->>QueryService:execute
    QueryService->>Planner:optimize
    critical
        Planner->>CreatePagingTableScanBuilder:apply
        CreatePagingTableScanBuilder-->>QueryService:paged index scan
    end
    QueryService->>OpenSearchExecutionEngine:execute
    OpenSearchExecutionEngine-->OpenSearchExecutionEngine:iterate result set
    critical Serialization
        OpenSearchExecutionEngine->>PlanSerializer:convertToCursor
        PlanSerializer->>Physical Plan Tree:serialize
        Physical Plan Tree->>OpenSearchPagedIndexScan:serialize
        OpenSearchPagedIndexScan->>OpenSearchScrollRequest:toCursor
        OpenSearchScrollRequest-->>OpenSearchPagedIndexScan:scroll Id
        OpenSearchPagedIndexScan-->>PlanSerializer:Serialized Plan Tree
        PlanSerializer-->>OpenSearchExecutionEngine:cursor
    end
    critical
        OpenSearchExecutionEngine->>Physical Plan Tree:getTotalHits
        Physical Plan Tree-->>OpenSearchExecutionEngine:total hits
    end
    OpenSearchExecutionEngine-->>ResponseListener:QueryResponse
    ResponseListener->>ResponseFormatter:format with cursor
Second page
sequenceDiagram
    participant SQLService
    participant QueryPlanFactory
    participant ResponseListener
    participant ResponseFormatter
    participant QueryService
    participant OpenSearchExecutionEngine
    participant PlanSerializer
    participant Physical Plan Tree
    participant OpenSearchPagedIndexScan
    participant ContinuePageRequest
    SQLService->>QueryPlanFactory:execute
    QueryPlanFactory->>QueryService:execute
    critical Deserialization
        QueryService->>PlanSerializer:convertToPlan
        PlanSerializer->>Physical Plan Tree:deserialize
        Physical Plan Tree->>OpenSearchPagedIndexScan:deserialize
        OpenSearchPagedIndexScan-->>PlanSerializer:resolve engine
        PlanSerializer->>OpenSearchPagedIndexScan:OpenSearchStorageEngine
        OpenSearchPagedIndexScan->>ContinuePageRequest:create new
        OpenSearchPagedIndexScan-->>PlanSerializer:deserialized plan tree
        PlanSerializer-->>QueryService:Physical plan tree
    end
    QueryService->>OpenSearchExecutionEngine:execute
    OpenSearchExecutionEngine-->OpenSearchExecutionEngine:iterate result set
    critical Serialization
        OpenSearchExecutionEngine->>PlanSerializer:convertToCursor
        PlanSerializer->>Physical Plan Tree:serialize
        Physical Plan Tree->>OpenSearchPagedIndexScan:serialize
        OpenSearchPagedIndexScan->>ContinuePageRequest:toCursor
        ContinuePageRequest-->>OpenSearchPagedIndexScan:scroll Id
        OpenSearchPagedIndexScan-->>PlanSerializer:Serialized Plan Tree
        PlanSerializer-->>OpenSearchExecutionEngine:cursor
    end
    critical
        OpenSearchExecutionEngine->>Physical Plan Tree:getTotalHits
        Physical Plan Tree-->>OpenSearchExecutionEngine:total hits
    end
    OpenSearchExecutionEngine-->>ResponseListener:QueryResponse
    ResponseListener->>ResponseFormatter:format with cursor
Legacy Engine Fallback
sequenceDiagram
    participant RestSQLQueryAction
    participant Legacy Engine
    participant SQLService
    participant QueryPlanFactory
    participant CanPaginateVisitor
    RestSQLQueryAction->>SQLService:prepareRequest
    SQLService->>QueryPlanFactory:execute
    critical V2 support check
        QueryPlanFactory->>CanPaginateVisitor:createContinuePaginatedPlan
        CanPaginateVisitor-->>QueryPlanFactory:false
        QueryPlanFactory-->>RestSQLQueryAction:UnsupportedCursorRequestException
    end
    RestSQLQueryAction->>Legacy Engine:accept
Further improvements are coming
`LIMIT`, `WHERE`, `HAVING` and `ORDER BY` and queries without `FROM` into pagination. in #253 - Phase 2
Issues Resolved
#656
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.