Skip to content

Conversation

@wendigo
Copy link
Contributor

@wendigo wendigo commented Jun 6, 2025

Description

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

## Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Jun 6, 2025
@wendigo wendigo requested a review from martint June 6, 2025 17:37
@wendigo wendigo force-pushed the serafin/reserved-identifiers-once branch from fcd72b0 to d786ef3 Compare June 6, 2025 18:39
@wendigo wendigo requested review from losipiuk and raunaqmorarka June 6, 2025 19:12
@raunaqmorarka raunaqmorarka requested a review from Copilot June 6, 2025 19:26

This comment was marked as outdated.

@wendigo wendigo force-pushed the serafin/reserved-identifiers-once branch from c150ee3 to 592446f Compare June 6, 2025 19:40
@wendigo wendigo changed the title Parsing improvements Random improvements to random things in random places Jun 6, 2025
@wendigo wendigo force-pushed the serafin/reserved-identifiers-once branch from 592446f to 8c3f427 Compare June 6, 2025 20:09
@wendigo wendigo requested a review from Copilot June 6, 2025 20:50
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces several cross-cutting refactorings and enhancements, including a unified numeric literal parser, caching for type signature translations, and cache-backed rule indexing, along with corresponding test and assertion updates.

  • Consolidated numeric parsing logic into a new NumericParser and updated NumericParameter and LongLiteral to use it.
  • Added caches in TypeSignatureTranslator and RuleIndex for performance; refactored parsing and lookup loops accordingly.
  • Updated test code to use more expressive AssertJ assertions and added a new parser test for underscored numeric parameters.

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
core/trino-spi/src/main/java/io/trino/spi/block/BlockUtil.java Removed unused checkValidPositions validation method
core/trino-parser/src/test/java/io/trino/sql/parser/TestTypeParser.java Added a new assertion for foo(1_000, 2_000) ARRAY parsing
core/trino-parser/src/main/java/io/trino/sql/tree/NumericParser.java Introduced NumericParser with underscore and base-prefix support
core/trino-parser/src/main/java/io/trino/sql/tree/NumericParameter.java Stored and exposed parsed numeric value on parameters
core/trino-parser/src/main/java/io/trino/sql/tree/LongLiteral.java Delegated numeric parsing to NumericParser
core/trino-parser/src/main/java/io/trino/sql/ReservedIdentifiers.java Cached reserved identifiers set; switched to Locale.ENGLISH
core/trino-main/src/test/java/io/trino/sql/planner/iterative/TestRuleIndex.java Switched from collect(toSet()) to AssertJ contains... assertions
core/trino-main/src/main/java/io/trino/sql/planner/iterative/ExpressionRewriteRuleSet.java Replaced stream-based rewrite with loop and builder for clarity
core/trino-main/src/main/java/io/trino/sql/planner/iterative/RuleIndex.java Reworked getCandidates to use an EvictableCache
core/trino-main/src/main/java/io/trino/sql/planner/iterative/IterativeOptimizer.java Simplified rule-iteration loop
core/trino-main/src/main/java/io/trino/sql/analyzer/TypeSignatureTranslator.java Added a cache for data-type parsing; improved identifier checks and caching
Comments suppressed due to low confidence (2)

core/trino-parser/src/main/java/io/trino/sql/ReservedIdentifiers.java:127

  • [nitpick] The method name isNotIdentifier is confusing since it returns true for reserved keywords. Consider renaming it to something like isReservedKeyword or isNonIdentifierKeyword for clarity.
private static boolean isNotIdentifier(String name)

core/trino-spi/src/main/java/io/trino/spi/block/BlockUtil.java:60

  • [nitpick] The removal of checkValidPositions may break callers if this method was in use. If it’s no longer needed, consider cleaning up references or marking it as deprecated before removal.
static void checkValidPositions(boolean[] positions, int positionCount)

Copy link

Copilot AI Jun 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrapping parseTypeSignature failures in a generic RuntimeException loses the original semanticException and TYPE_MISMATCH error code. Consider throwing a semanticException with the appropriate error code to preserve user-facing diagnostics.

Suggested change
throw new RuntimeException(e);
throw new TrinoException(TYPE_MISMATCH, e.getMessage(), e);

Copilot uses AI. Check for mistakes.
@wendigo wendigo force-pushed the serafin/reserved-identifiers-once branch from d332e0d to 728604f Compare June 8, 2025 16:51
@losipiuk
Copy link
Member

losipiuk commented Jun 9, 2025

Fuzzer driven development?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

original looks nicer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah but resizes the list instead of preallocating it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returns a new builder, expecting the specified number of elements to be added.
If expectedSize is exactly the number of elements added to the builder before ImmutableList.Builder.build is called, the builder is likely to perform better than an unsized builder() would have.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and the toImmutableList() is:

  private static final Collector<Object, ?, ImmutableList<Object>> TO_IMMUTABLE_LIST =
      Collector.of(
          ImmutableList::builder,
          ImmutableList.Builder::add,
          ImmutableList.Builder::combine,
          ImmutableList.Builder::build);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicer != more performant :P

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meh - I doubt it matters at all, so I would keep nicer. I will not fight though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we observed a measurable performance improvement? If not, then readability is more important.

Copy link
Member

@losipiuk losipiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x

@wendigo wendigo force-pushed the serafin/reserved-identifiers-once branch from 728604f to 784cb11 Compare June 9, 2025 09:06
@wendigo wendigo requested a review from losipiuk June 9, 2025 09:07
@wendigo wendigo merged commit 48b5e78 into master Jun 9, 2025
106 of 109 checks passed
@wendigo wendigo deleted the serafin/reserved-identifiers-once branch June 9, 2025 09:59
@github-actions github-actions bot added this to the 477 milestone Jun 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

4 participants