Normalize ColumnMetadata to support case-sensitive column names by agrawalreetika · Pull Request #24983 · prestodb/presto

agrawalreetika · 2025-04-25T14:53:04Z

Description

Follow up of #24551

Improves identifier handling (column name) to align with SQL standards for better compatibility with case-sensitive and case-normalizing databases, while minimizing SPI-breaking changes.

Motivation and Context

RFC details - prestodb/rfcs#36

Currently, column names are lowercased at the SPI level (ColumnMetadata.java#L45). Removing this generic lowercase conversion will require updates to normalize column names via the metadata API in each connector.
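
To illustrate the shift described above, here is a minimal sketch (not the code in this PR). The class name, the `caseSensitiveNameMatching` flag, and the two-argument `normalizeIdentifier` are simplified stand-ins assumed for illustration; in the review diffs the real call takes a session, as in `normalizeIdentifier(session, name)`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical, simplified stand-in; not the real Presto SPI or connector code.
final class ColumnNameNormalizationSketch
{
    // Assumed behaviour: keep the name as-is when case-sensitive matching is enabled,
    // otherwise lowercase it (what the SPI previously did unconditionally).
    static String normalizeIdentifier(boolean caseSensitiveNameMatching, String identifier)
    {
        return caseSensitiveNameMatching ? identifier : identifier.toLowerCase(Locale.ENGLISH);
    }

    // The connector, rather than the SPI constructor, applies the normalization to each column name.
    static List<String> normalizeColumnNames(boolean caseSensitiveNameMatching, List<String> names)
    {
        return names.stream()
                .map(name -> normalizeIdentifier(caseSensitiveNameMatching, name))
                .collect(Collectors.toList());
    }

    public static void main(String[] args)
    {
        List<String> remoteNames = Arrays.asList("Id", "CustomerName");
        System.out.println(normalizeColumnNames(false, remoteNames)); // prints [id, customername]
        System.out.println(normalizeColumnNames(true, remoteNames));  // prints [Id, CustomerName]
    }
}
```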

Impact

Improves identifier handling (column name) to align with SQL standards for better compatibility with case-sensitive and case-normalizing databases, while minimizing SPI-breaking changes.

Test Plan

Existing unit tests pass.
Added support for MySQL, with new unit tests covering the case where mixed-case support is enabled.

Contributor checklist

Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
Documented new properties (with its default value), SQL syntax, functions, or other functionality.
If release notes are required, they follow the release notes guidelines.
Adequate tests were added if applicable.
CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* Add case-sensitive support for column names. It can be enabled for JDBC-based connectors by setting `case-sensitive-name-matching=true` at the catalog level.

prestodb-ci · 2025-05-14T03:11:24Z

@ethanyzhang imported this issue as lakehouse/presto #24983

prestodb-ci · 2025-06-09T16:54:14Z

@ethanyzhang imported this issue as lakehouse/tracker #24983

steveburnett · 2025-07-01T14:32:17Z

Do we need documentation? Maybe in https://prestodb.io/docs/current/connector/mysql.html#general-configuration-properties. Can you think of better, or other locations for documenting this?

How does this case-sensitive-name-matching config property interact with the existing case-insensitive-name-matching MySQL config property?

agrawalreetika · 2025-07-01T16:20:19Z

> Do we need documentation? Maybe in https://prestodb.io/docs/current/connector/mysql.html#general-configuration-properties. Can you think of better, or other locations for documenting this?
>
> How does this case-sensitive-name-matching config property interact with the existing case-insensitive-name-matching MySQL config property?

@steveburnett It's going to be a catalog-level property; in the last PR it was added for MySQL. I have added the documentation for the other JDBC connectors as well, since this PR adds it as a config to them too. Please check.

steveburnett

LGTM! (docs)

Pull branch, local doc build, everything looks good. Thanks!

agrawalreetika · 2025-07-02T05:08:28Z

@hantangwangd @ZacBlanco @ScrapCodes @aaneja Could you please help review this PR whenever you get a chance? This is a follow-up to #24551, covering columns.

ZacBlanco · 2025-07-02T16:58:53Z

presto-accumulo/src/main/java/com/facebook/presto/accumulo/AccumuloMetadata.java

-            return new ConnectorTableMetadata(tableName, table.getColumnsMetadata());
+            List<ColumnMetadata> columns = table.getColumnsMetadata().stream()
+                    .map(column -> normalizedColumnMetadata(session, column))
+                    .collect(toList());


Suggested change

-                    .collect(toList());
+                    .collect(toImmutableList());

ZacBlanco · 2025-07-02T17:03:34Z

presto-example-http/src/main/java/com/facebook/presto/example/ExampleMetadata.java

-        return new ConnectorTableMetadata(tableName, table.getColumnsMetadata());
+        List<ColumnMetadata> columns = table.getColumnsMetadata().stream()
+                .map(column -> normalizedColumnMetadata(session, column))
+                .collect(toList());


Suggested change

-                .collect(toList());
+                .collect(toImmutableList());

ZacBlanco · 2025-07-02T17:05:06Z

presto-google-sheets/src/main/java/com/facebook/presto/google/sheets/SheetsMetadata.java

+        return ColumnMetadata.builder()
+                .setName(normalizeIdentifier(session, columnMetadata.getName()))
+                .setType(columnMetadata.getType())
+                .setHidden(columnMetadata.isHidden())
+                .setNullable(columnMetadata.isNullable())
+                .setComment(columnMetadata.getComment().orElse(null))
+                .setProperties(columnMetadata.getProperties())
+                .setExtraInfo(columnMetadata.getExtraInfo().orElse(null))
+                .build();


This seems like a recurring method. I wonder if we can add a utility method on ColumnMetadata which effectively does this and accepts a lambda or an already-normalized name. It would reduce code duplication.
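
A rough sketch of what such a helper could look like, using a trimmed-down stand-in class rather than the real `ColumnMetadata`. The class name, its fields, and the `withName` method are hypothetical and only illustrate the "accept a lambda" variant of the suggestion:

```java
import java.util.Locale;
import java.util.function.UnaryOperator;

// Hypothetical, trimmed-down stand-in for ColumnMetadata; not the real SPI class.
final class SketchColumnMetadata
{
    private final String name;
    private final String type;
    private final boolean nullable;
    private final boolean hidden;
    private final String comment; // may be null

    SketchColumnMetadata(String name, String type, boolean nullable, boolean hidden, String comment)
    {
        this.name = name;
        this.type = type;
        this.nullable = nullable;
        this.hidden = hidden;
        this.comment = comment;
    }

    // Copies every field unchanged except the name, which is passed through the normalizer.
    SketchColumnMetadata withName(UnaryOperator<String> normalizer)
    {
        return new SketchColumnMetadata(normalizer.apply(name), type, nullable, hidden, comment);
    }

    String getName() { return name; }
    String getType() { return type; }
    boolean isNullable() { return nullable; }
    boolean isHidden() { return hidden; }
    String getComment() { return comment; }

    public static void main(String[] args)
    {
        SketchColumnMetadata original = new SketchColumnMetadata("OrderId", "bigint", false, false, "primary key");
        SketchColumnMetadata lowered = original.withName(n -> n.toLowerCase(Locale.ENGLISH));
        System.out.println(lowered.getName() + " " + lowered.getType()); // prints "orderid bigint"
    }
}
```

With a helper of this shape, each connector's repeated builder chain could shrink to a single call such as `column.withName(name -> normalizeIdentifier(session, name))`, assuming the real class exposed an equivalent method.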

ZacBlanco · 2025-07-02T17:05:14Z

presto-google-sheets/src/main/java/com/facebook/presto/google/sheets/SheetsMetadata.java

-        for (ColumnMetadata column : table.get().getColumnsMetadata()) {
+        List<ColumnMetadata> columns = table.get().getColumnsMetadata().stream()
+                .map(column -> normalizedColumnMetadata(session, column))
+                .collect(toList());


Suggested change

-                .collect(toList());
+                .collect(toImmutableList());

ZacBlanco · 2025-07-02T17:06:07Z

presto-jmx/src/main/java/com/facebook/presto/connector/jmx/JmxMetadata.java

+                    String normalizedName = normalizeIdentifier(session, column.getColumnName());
+                    return column.getColumnMetadata(normalizedName);


Debugging? Any reason we can't just inline?

ZacBlanco · 2025-07-02T17:07:36Z

presto-main-base/src/main/java/com/facebook/presto/metadata/MetadataManager.java

+                .setProperties(columnMetadata.getProperties())
+                .setExtraInfo(columnMetadata.getExtraInfo().orElse(null))
+                .build();
+    }


Another instance where a utility method would help.

ZacBlanco · 2025-07-02T17:09:01Z

presto-memory/src/main/java/com/facebook/presto/plugin/memory/MemoryMetadata.java

        return tables.values().stream()
                .filter(table -> prefix.matches(table.toSchemaTableName()))
-                .collect(toMap(MemoryTableHandle::toSchemaTableName, handle -> handle.toTableMetadata().getColumns()));
+                .collect(toMap(MemoryTableHandle::toSchemaTableName, handle -> toTableMetadata(handle, session).getColumns()));


Suggested change

-                .collect(toMap(MemoryTableHandle::toSchemaTableName, handle -> toTableMetadata(handle, session).getColumns()));
+                .collect(toImmutableMap(MemoryTableHandle::toSchemaTableName, handle -> toTableMetadata(handle, session).getColumns()));

ZacBlanco · 2025-07-02T17:09:09Z

presto-memory/src/main/java/com/facebook/presto/plugin/memory/MemoryMetadata.java

+                    String normalizedName = normalizeIdentifier(session, column.getName());
+                    return column.toColumnMetadata(normalizedName);
+                })
+                .collect(toList());


Suggested change

-                .collect(toList());
+                .collect(toImmutableList());

ZacBlanco · 2025-07-02T17:09:54Z

...-mysql/src/test/java/com/facebook/presto/plugin/mysql/TestMySqlIntegrationMixedCaseTest.java


        assertQueryFails("CREATE TABLE test (a integer, A integer)",
-                "line 1:31: Column name 'A' specified more than once");
+                "Duplicate column name 'A'");


Hmm, are we losing the line/column information from the error? I feel that is useful for large queries.

If I remember right, since this PR also enables case sensitivity for column names in MySQL, this error now comes from the MySQL exception, which is why I think the expected message was modified:

SQL Error [1060] [42S21]: Duplicate column name 'A'
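
One way to picture the change in error source is the sketch below. It assumes, for illustration only, that an engine-side duplicate check compares normalized names; `firstDuplicate` is a hypothetical helper, not Presto's analyzer code. With case-sensitive matching, `a` and `A` no longer collide on the engine side, so the statement reaches MySQL, which compares column names case-insensitively and rejects it with its own message (without Presto's line/column prefix).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

// Hypothetical illustration; not the actual analyzer logic in Presto.
final class DuplicateColumnCheckSketch
{
    // Returns the first column name that collides with an earlier one, if any.
    static Optional<String> firstDuplicate(List<String> columnNames, boolean caseSensitive)
    {
        Set<String> seen = new HashSet<>();
        for (String name : columnNames) {
            // Case-insensitive mode compares lowercased names; case-sensitive mode compares names verbatim.
            String key = caseSensitive ? name : name.toLowerCase(Locale.ENGLISH);
            if (!seen.add(key)) {
                return Optional.of(name);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args)
    {
        List<String> columns = Arrays.asList("a", "A");
        System.out.println(firstDuplicate(columns, false)); // prints Optional[A]
        System.out.println(firstDuplicate(columns, true));  // prints Optional.empty
    }
}
```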

ZacBlanco · 2025-07-02T17:11:01Z

...product-tests/src/main/java/com/facebook/presto/tests/mysql/TestMySQLMixedCaseSupportOn.java

-        query("CREATE TABLE " + CATALOG + ".\"" + SCHEMA_NAME + "\".\"" + TABLE_NAME_JOIN_LOWER + "\" AS " +
-                "SELECT d.* FROM " + CATALOG + ".\"" + SCHEMA_NAME + "\".\"" + TABLE_NAME + "\" d " +
-                "INNER JOIN " + CATALOG + ".\"" + SCHEMA_NAME + "\".\"" + TABLE_NAME + "\" m " +
+        query("CREATE TABLE " + CATALOG + "." + SCHEMA_NAME + "." + TABLE_NAME_0 + " (name VARCHAR(50), id INT)");


Why are we changing all these tests? Seems like it is just the quote escaping. Does it really need to change?

I found that the extra escaping wasn't really needed, so I just cleaned it up. I can revert it if you think we should handle it separately.

+1. Can we modify only the parts that are necessary for this change? It's a little tricky to figure out which parts of the tests apply to the current change. Or, if the change to quote escaping for schema names and table names in the test cases is indeed necessary, could we extract it into a separate commit?

ethanyzhang · 2025-07-08T20:08:59Z

@agrawalreetika I addressed the conflicts internally; you need to update your PR here.

ZacBlanco

LGTM, but just a few nits and one comment

presto-bigquery/src/main/java/com/facebook/presto/plugin/bigquery/BigQueryMetadata.java

presto-cassandra/src/main/java/com/facebook/presto/cassandra/CassandraMetadata.java

presto-spi/src/main/java/com/facebook/presto/spi/ColumnMetadata.java

ZacBlanco

Thanks for the changes @agrawalreetika ! LGTM

hantangwangd

Thanks for the work. LGTM!

prestodb-ci added the from:IBM PR from IBM label Apr 25, 2025

agrawalreetika force-pushed the mixed-case-v2-columns-2 branch 8 times, most recently from 6d9e7da to d94fc21 Compare May 7, 2025 16:22

agrawalreetika force-pushed the mixed-case-v2-columns-2 branch 10 times, most recently from 9877280 to 8bc7a44 Compare May 14, 2025 03:03

agrawalreetika force-pushed the mixed-case-v2-columns-2 branch from 8bc7a44 to c1613fa Compare May 22, 2025 17:17

agrawalreetika force-pushed the mixed-case-v2-columns-2 branch from c1613fa to f2eba02 Compare May 30, 2025 07:16

agrawalreetika force-pushed the mixed-case-v2-columns-2 branch from f2eba02 to 723f5c2 Compare June 9, 2025 17:00

agrawalreetika changed the title ~~[DO NOT REVIEW] Mixed case v2 columns 2~~ Normalize ColumnMetadata to support case-sensitive column names Jul 1, 2025

agrawalreetika marked this pull request as ready for review July 1, 2025 13:33

agrawalreetika requested review from a team, ZacBlanco, hantangwangd and vinothchandar as code owners July 1, 2025 13:33

agrawalreetika requested a review from ScrapCodes July 1, 2025 13:47

agrawalreetika force-pushed the mixed-case-v2-columns-2 branch from 723f5c2 to e206568 Compare July 1, 2025 13:58

agrawalreetika force-pushed the mixed-case-v2-columns-2 branch from e206568 to 92cf350 Compare July 1, 2025 16:19

agrawalreetika requested review from elharo and steveburnett as code owners July 1, 2025 16:19

steveburnett previously approved these changes Jul 1, 2025

agrawalreetika requested a review from aaneja July 2, 2025 05:08

ZacBlanco requested changes Jul 2, 2025

agrawalreetika dismissed steveburnett’s stale review via 822f4a0 July 3, 2025 07:53

agrawalreetika force-pushed the mixed-case-v2-columns-2 branch 2 times, most recently from f53cc50 to 172ac47 Compare July 3, 2025 10:16

agrawalreetika force-pushed the mixed-case-v2-columns-2 branch from 172ac47 to 8cc3ad5 Compare July 8, 2025 20:14

agrawalreetika requested a review from ZacBlanco July 10, 2025 17:09

ZacBlanco reviewed Jul 15, 2025

agrawalreetika force-pushed the mixed-case-v2-columns-2 branch 2 times, most recently from 0ce4e53 to a3cecfe Compare July 15, 2025 08:51

agrawalreetika added 2 commits July 15, 2025 16:58

Normalize ColumnMetadata to support mixed-case column names

8a78bfc

Enable case-sensitive matching support for all the Jdbc connectors

6a05e77

agrawalreetika force-pushed the mixed-case-v2-columns-2 branch from a3cecfe to 6a05e77 Compare July 15, 2025 11:28

agrawalreetika requested a review from ZacBlanco July 15, 2025 13:09

ZacBlanco approved these changes Jul 15, 2025

hantangwangd approved these changes Jul 16, 2025

agrawalreetika merged commit e9bba9d into prestodb:master Jul 17, 2025
109 checks passed

This was referenced Jul 24, 2025

Add release notes for 0.294 unix280/presto#39

Merged

Add release notes for 0.294 unix280/presto#40

Merged

prestodb-ci mentioned this pull request Jul 28, 2025

Add release notes for 0.294 #25633

Merged
