Add validation for expand command on scalar types (#5065) #5089
Signed-off-by: Peng Huo <[email protected]>
…/index keywords
When a PPL query contains duplicate 'source' or 'index' keywords (e.g., 'source source=index_name'), the parser was accepting it as valid syntax, treating the first keyword as a search expression. This caused confusing errors later when OpenSearch tried to expand fields.
This fix adds validation in AstBuilder.visitSearchFrom() to detect when reserved keywords 'source' or 'index' appear as search expressions before the fromClause. It now throws a clear SyntaxCheckException with a helpful error message suggesting the correct syntax.
Signed-off-by: Peng Huo <[email protected]>
Signed-off-by: Peng Huo <[email protected]>
…e source/index keywords" This reverts commit 268f77b.
Signed-off-by: Peng Huo <[email protected]>
…#5065)
- Validate field type in buildExpandRelNode before uncollect operation
- Throw UnsupportedOperationException with clear message for scalar types
- OpenSearch multi-value fields stored as scalars cannot be expanded when codegen triggered
- Add integration test to verify error message
Signed-off-by: Peng Huo <[email protected]>
Caution: Review failed. The pull request is closed.
Summary by CodeRabbit
Walkthrough
Introduces comprehensive documentation for a multi-agent PR review system (PPL doctor and RCA-fix agents), adds a Python-based PR data collector tool with GitHub CLI integration, establishes standardized PR review workflows and checklists, and implements a runtime type validation in CalciteRelNodeVisitor to handle array-type fields correctly in expand operations.
Sequence Diagram(s)
sequenceDiagram
actor User
participant PPLDoctor as PPL Doctor<br/>(Orchestrator)
participant GitHub as GitHub API
participant RCAAgent as RCA-Fix Agent
participant Slack as Slack Notification
User->>PPLDoctor: Submit PPL issue intake
PPLDoctor->>GitHub: Fetch issue/PR details
GitHub-->>PPLDoctor: PR metadata, diff, comments
PPLDoctor->>PPLDoctor: Intake validation & gate check
alt Issue Reproducible
PPLDoctor->>PPLDoctor: Run reproduction workflow
PPLDoctor->>GitHub: Create/update test case
PPLDoctor->>RCAAgent: Delegate RCA + fix request
RCAAgent->>RCAAgent: Root cause analysis
RCAAgent->>RCAAgent: Implement & verify fix
RCAAgent-->>PPLDoctor: Return fix with artifacts
PPLDoctor->>GitHub: Create/update PR with fix
PPLDoctor->>Slack: Notify PR ready for review
else Issue Not Reproducible
PPLDoctor->>GitHub: Request clarification comment
PPLDoctor->>Slack: Notify awaiting feedback
end
PPLDoctor-->>User: Final PR/review envelope
sequenceDiagram
actor User
participant CLI as PR Collector CLI
participant GH as GitHub CLI
participant Formatter as Markdown Formatter
participant Disk as File System
User->>CLI: pr_collector.py --repo owner/repo --start-date X --end-date Y
CLI->>GH: gh pr list --repo --created-at range
GH-->>CLI: PR list (numbers, metadata)
loop For each PR
CLI->>GH: gh pr view PR_NUM --json details
GH-->>CLI: PR fields, URL, author, dates
CLI->>GH: gh pr review-status PR_NUM
GH-->>CLI: Reviews & review comments
CLI->>GH: gh pr comments PR_NUM
GH-->>CLI: Issue comments (filtered)
end
CLI->>CLI: Aggregate PR data list
CLI->>Formatter: format_pr_markdown(pr_data)
Formatter->>Formatter: Render headers, description, reviews
Formatter->>Formatter: Filter out bot/CodeRabbit comments
Formatter-->>CLI: Formatted markdown
CLI->>Disk: save_pr_data(.kiro/resources/{repo_name})
Disk-->>CLI: File saved + summary
CLI-->>User: Output location & PR count
Estimated code review effort: 3 (Moderate), ~25 minutes
Description
Add validation to reject the `expand` command on scalar types with a clear error message. This addresses issue #5065, where `expand` fails with a confusing "UNNEST argument must be a collection" error when used on OpenSearch multi-value fields.
Root Cause:
OpenSearch doesn't have an ARRAY type. Multi-value fields like `[1, 2, 3]` are stored as repeated scalar values (e.g., type `long`). When Calcite's `uncollect` operation is triggered during codegen, it expects `ArraySqlType` but receives `BasicSqlType` (BIGINT), causing a CalciteException.
Solution:
Validate the field type at planning time and fail fast with a descriptive error message explaining that expand only works on explicitly defined array types, not OpenSearch's implicit multi-value fields.
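The fail-fast idea can be sketched without any Calcite dependency. The block below is only an illustration: `ExpandTypeCheckSketch`, `TypeKind`, `Field`, `validateExpandable`, and the error text are hypothetical stand-ins, not the code in `buildExpandRelNode`. It shows the shape of the check, which rejects a non-array field at planning time so the query never reaches the `uncollect` step.

```java
public class ExpandTypeCheckSketch {
    /** Hypothetical, simplified type kinds; not Calcite's SqlTypeName. */
    public enum TypeKind { ARRAY, BIGINT, VARCHAR }

    /** Hypothetical field descriptor standing in for a planner row-type field. */
    public static final class Field {
        final String name;
        final TypeKind kind;

        public Field(String name, TypeKind kind) {
            this.name = name;
            this.kind = kind;
        }
    }

    /**
     * Fails fast if the field is not an array. The message wording is
     * illustrative and is not the message used in the PR.
     */
    public static void validateExpandable(Field field) {
        if (field.kind != TypeKind.ARRAY) {
            throw new UnsupportedOperationException(
                String.format(
                    "Expand command only works on array types, but field '%s' has type %s.",
                    field.name, field.kind));
        }
    }

    public static void main(String[] args) {
        // An explicitly defined array field passes the check silently.
        validateExpandable(new Field("tags", TypeKind.ARRAY));

        // A multi-value field mapped as a scalar (e.g. long -> BIGINT) is rejected.
        try {
            validateExpandable(new Field("ids", TypeKind.BIGINT));
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Checking before the plan is built means the user sees an error that names the field and its type, instead of a failure surfacing later from generated code.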
Impact:
… `expand` command
Related Issues
Resolves #5065
Testing
… `Issue5065IT` that verifies the error message
… `CalciteExpandCommandIT`)
`./gradlew :integ-test:integTest --tests "org.opensearch.sql.calcite.Issue5065IT"`
Check List