-
Notifications
You must be signed in to change notification settings - Fork 180
Fixes for Multisearch and Append command
#4512
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Multisearch and Append command
f641eda to
b9332ed
Compare
| if (!typesForName.contains(fieldType)) { | ||
| // New field or same name with different type - add to schema | ||
| schema.add(new SchemaField(fieldName, fieldType)); | ||
| typesForName.add(fieldType); | ||
| } | ||
| // If we've seen this exact (name, type) combination, skip it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should rather raise error from here. (Then we don't need to check again)
Later we would implement a logic to generalize types for the same field name here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion! I think it makes sense, moved the location to raise error
|
|
||
| if (!typesForName.contains(fieldType)) { | ||
| // New field or same name with different type - add to schema | ||
| RelDataType existingType = seenFields.get(fieldName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We recently find another issue of type conflicts here. RelDataType evaluates the hash equality by its digested string as well. For example, "INTEGER" is not equal to "INTEGER NOT NULL". A quick fix would be aligning the same SqlType to be nullable. Ideally it won't affect the data type resolution while execution. cc @xinyual
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can consider to allow same SqlTypeName but with different nullability to be merged here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion! I have updated the implementation to allow same SqlTypeName but with different nullability to be merged
| Limitations | ||
| =========== | ||
|
|
||
| * **Schema Compatibility**: When fields with the same name exist between the main search and sub-search but have incompatible types, the query will fail with an error. To avoid type conflicts, ensure that fields with the same name have the same data type, or use different field names (e.g., by renaming with ``eval`` or using ``fields`` to select non-conflicting columns). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understand the intention here. Strong schema engine like SQL restricts the type to be the same. Some weak schema engine resolves types at runtime and doesn't care the data type. I think it's not easy to make it compatible.
Not sure what's better user experience and customer expectation here. Does user accept this behavior or expect to union anyway? cc @LantaoJin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Thanks for the change.
Signed-off-by: Kai Huang <[email protected]> # Conflicts: # docs/category.json
Signed-off-by: Kai Huang <[email protected]>
Signed-off-by: Kai Huang <[email protected]>
Signed-off-by: Kai Huang <[email protected]>
Signed-off-by: Kai Huang <[email protected]>
98e40bc to
ae5dbb1
Compare
| * @param rowType The row type to search for timestamp fields | ||
| * @return The name of the timestamp field, or null if not found | ||
| * @param rowType The row type to search for @timestamp field | ||
| * @return "@timestamp" if the field exists, or null if not found |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (non-blocking): Should we think about indices with different timestamp field names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently it is set as a limitation: we want to only support @timestamp
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMultisearchCommandIT.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Kai Huang <[email protected]>
* Fix for Multisearch and Append command Signed-off-by: Kai Huang <[email protected]> # Conflicts: # docs/category.json * fix tests Signed-off-by: Kai Huang <[email protected]> * fix test Signed-off-by: Kai Huang <[email protected]> * remove error location Signed-off-by: Kai Huang <[email protected]> * Allow same SqlTypeName but with different nullability to be merged Signed-off-by: Kai Huang <[email protected]> * Update error message Signed-off-by: Kai Huang <[email protected]> --------- Signed-off-by: Kai Huang <[email protected]> (cherry picked from commit 0dd5949) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Fix for Multisearch and Append command # Conflicts: # docs/category.json * fix tests * fix test * remove error location * Allow same SqlTypeName but with different nullability to be merged * Update error message --------- (cherry picked from commit 0dd5949) Signed-off-by: Kai Huang <[email protected]> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
| return schema; | ||
| } | ||
|
|
||
| private static boolean areTypesCompatible(RelDataType type1, RelDataType type2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is another concern of using this method to allow type merge. If index A's RelDataType 'INTEGER NOT NULL' is put to unified schema, index B's same name RelDataType 'INTEGER' will be merged silently. Index B's column values could contain NULL values.
The generated code could ignore the null check because the merged unified schema has 'iNTEGER NOT NULL'. It will probably throw NPE when merging index B's NULL values. We could write some query to double check if we can reproduce this scenario.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried some queries in my local test. Haven't seen such NPE errors yet. Not sure if there is edge case. But for now we can leave it there until needed fix.
* Fix for Multisearch and Append command Signed-off-by: Kai Huang <[email protected]> # Conflicts: # docs/category.json * fix tests Signed-off-by: Kai Huang <[email protected]> * fix test Signed-off-by: Kai Huang <[email protected]> * remove error location Signed-off-by: Kai Huang <[email protected]> * Allow same SqlTypeName but with different nullability to be merged Signed-off-by: Kai Huang <[email protected]> * Update error message Signed-off-by: Kai Huang <[email protected]> --------- Signed-off-by: Kai Huang <[email protected]>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Simeon Widdis <[email protected]> Co-authored-by: Manasvini B S <[email protected]> Co-authored-by: opensearch-ci-bot <[email protected]> Co-authored-by: Louis Chu <[email protected]> Co-authored-by: Chen Dai <[email protected]> Co-authored-by: Mebsina <[email protected]> Co-authored-by: Yuanchun Shen <[email protected]> Co-authored-by: opensearch-trigger-bot[bot] <98922864+opensearch-trigger-bot[bot]@users.noreply.github.com> Co-authored-by: Kai Huang <[email protected]> Co-authored-by: Peng Huo <[email protected]> Co-authored-by: Alexey Temnikov <[email protected]> Co-authored-by: Riley Jerger <[email protected]> Co-authored-by: Tomoyuki MORITA <[email protected]> Co-authored-by: Lantao Jin <[email protected]> Co-authored-by: Songkan Tang <[email protected]> Co-authored-by: qianheng <[email protected]> Co-authored-by: Simeon Widdis <[email protected]> Co-authored-by: Xinyuan Lu <[email protected]> Co-authored-by: Jialiang Liang <[email protected]> Co-authored-by: Peter Zhu <[email protected]> Co-authored-by: Vinay Krishna Pudyodu <[email protected]> Co-authored-by: expani <[email protected]> Co-authored-by: expani1729 <[email protected]> Co-authored-by: Vamsi Manohar <[email protected]> Co-authored-by: ritvibhatt <[email protected]> Co-authored-by: Xinyu Hao <[email protected]> Co-authored-by: Marc Handalian <[email protected]> Co-authored-by: Marc Handalian <[email protected]> Fix join type ambiguous issue when specify the join type with sql-like join criteria (opensearch-project#4474) Fix issue 4441 (opensearch-project#4449) Fix missing keywordsCanBeId (opensearch-project#4491) Fix the bug of explicit makeNullLiteral for UDT fields (opensearch-project#4475) Fix mapping after aggregation push down (opensearch-project#4500) Fix percentile bug (opensearch-project#4539) Fix JsonExtractAllFunctionIT failure (opensearch-project#4556) Fix sort push down into agg after project already pushed (opensearch-project#4546) Fix push down failure for min/max on derived field (opensearch-project#4572) Fix compile issue in main (opensearch-project#4608) Fix filter parsing failure on date fields with non-default format (opensearch-project#4616) Fix bin nested fields issue (opensearch-project#4606) Fix: Support Alias Fields in MIN, MAX, FIRST, LAST, and TAKE Aggregations (opensearch-project#4621) fix rename issue (opensearch-project#4670) Fixes for `Multisearch` and `Append` command (opensearch-project#4512) Fix asc/desc keyword behavior for sort command (opensearch-project#4651) Fix] Fix unexpected shift of extraction for `rex` with nested capture groups in named groups (opensearch-project#4641) Fix CVE-2025-48924 (opensearch-project#4665) Fix sub-fields accessing of generated structs (opensearch-project#4683) Fix] Incorrect Field Index Mapping in AVG to SUM/COUNT Conversion (opensearch-project#15)
Description
Related PRs
#4332 #4123