Skip to content

Tighten allowed characters for non-delimited identifiers#6380

Merged
martint merged 1 commit intotrinodb:masterfrom
martint:identifiers
Dec 18, 2020
Merged

Tighten allowed characters for non-delimited identifiers#6380
martint merged 1 commit intotrinodb:masterfrom
martint:identifiers

Conversation

@martint
Copy link
Copy Markdown
Member

@martint martint commented Dec 18, 2020

Identifiers with characters such as @ or : are not being treated as
delimited identifiers, which causes issues when the planner creates
synthetic identifiers as part of the plan IR expressions. When
the IR expressions are serialized, they are not being properly
quoted, which causes failures when parsing the plan on the work side.

For example, for a query such as:

SELECT try("a@b") FROM t

The planner creates an expression of the form:

"$internal$try"("$INTERNAL$BIND"("a@b", (a@b_0) -> "a@b_0"))

The argument to the lambda expression is not properly quoted.

Fixes #6375

Identifiers with characters such as @ or : are not being treated as
delimited identifiers, which causes issues when the planner creates
synthetic identifiers as part of the plan IR expressions. When
the IR expressions are serialized, they are not being properly
quoted, which causes failures when parsing the plan on the work side.

For example, for a query such as:

    SELECT try("a@b") FROM t

The planner creates an expression of the form:

    "$internal$try"("$INTERNAL$BIND"("a@b", (a@b_0) -> "a@b_0"))

The argument to the lambda expression is not properly quoted.
@cla-bot cla-bot bot added the cla-signed label Dec 18, 2020
@martint martint merged commit d5e02a9 into trinodb:master Dec 18, 2020
@martint martint added this to the 349 milestone Dec 18, 2020
@findepi
Copy link
Copy Markdown
Member

findepi commented Dec 18, 2020

Fixes #6375

Would be great to have a test for that, as we may be mistreating these identifiers in some other place as well

@martint
Copy link
Copy Markdown
Member Author

martint commented Dec 18, 2020

Yes, unfortunately it can only be reproduced with actual tables that contain column names with those characters. We don't have any tables like that, and the engine testing framework has no provision for creating tables in tests.

@kokosing
Copy link
Copy Markdown
Member

the engine testing framework has no provision for creating tables in tests.

You can use MockConnector (extend it if needed) to make possible to test certain parts of the engine. Each connector has its own issues, relying on connector to test engine would cause that engine would need to use plenty of them to get decent coverage for issues like that. Actually, it is a case today where certain features only have tests coverage in connector tests (see tests in Hive).

@martint
Copy link
Copy Markdown
Member Author

martint commented Dec 19, 2020

As far as I can tell, MockConnector only supports metadata operations. We'd need to implement the SplitManager, Page sources, etc. to be able to use it for a test like this.

@kokosing
Copy link
Copy Markdown
Member

As far as I can tell, MockConnector only supports metadata operations.

That is the current state, but nothing prevents us to extend to support SplitManager, Page sources, etc. The biggest challenge is that it has to be able to serializable whereas it is difficult to serialize functional entities.

@findepi
Copy link
Copy Markdown
Member

findepi commented Dec 19, 2020

I was able to reproduce the problem with a test. #6385

@martint martint mentioned this pull request Dec 28, 2020
9 tasks
v-jizhang added a commit to v-jizhang/presto that referenced this pull request Feb 22, 2022
Cherry-pick of trinodb/trino#6380:

Identifiers with characters such as @ or : are not being treated as
delimited identifiers, which causes issues when the planner creates
synthetic identifiers as part of the plan IR expressions. When the
IR expressions are serialized, they are not being properly quoted,
which causes failures when parsing the plan on the work side.

For example, for a query such as:

    SELECT try("a@b") FROM t

The planner creates an expression of the form:

    "$internal$try"("$INTERNAL$BIND"("a@b", (a@b_0) -> "a@b_0"))

The argument to the lambda expression is not properly quoted.

Co-authored-by: Martin Traverso <mtraverso@gmail.com>
zhenxiao pushed a commit to prestodb/presto that referenced this pull request Feb 24, 2022
Cherry-pick of trinodb/trino#6380:

Identifiers with characters such as @ or : are not being treated as
delimited identifiers, which causes issues when the planner creates
synthetic identifiers as part of the plan IR expressions. When the
IR expressions are serialized, they are not being properly quoted,
which causes failures when parsing the plan on the work side.

For example, for a query such as:

    SELECT try("a@b") FROM t

The planner creates an expression of the form:

    "$internal$try"("$INTERNAL$BIND"("a@b", (a@b_0) -> "a@b_0"))

The argument to the lambda expression is not properly quoted.

Co-authored-by: Martin Traverso <mtraverso@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

Query fail for function try() with column names containing special character

4 participants