Contributor

@bibith4 bibith4 commented Aug 5, 2025

Description

Enable case sensitivity support in Cassandra connector

Motivation and Context

Previously, the Cassandra connector always treated table and column names as lowercase. With this change, the connector can preserve the case of identifiers when case-sensitive name matching is enabled in the catalog configuration, so tables and columns whose names differ only in case can coexist.
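
As a rough illustration of the behaviour described above (not the connector's actual code), the sketch below folds identifiers to lowercase unless case-sensitive matching is switched on. The class name `IdentifierNormalizer` and its constructor flag are hypothetical; the real change routes names through a `normalizeIdentifier(session, name)` call whose implementation is not shown in this conversation.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not a class from presto-cassandra.
final class IdentifierNormalizer
{
    private final boolean caseSensitiveNameMatching;

    IdentifierNormalizer(boolean caseSensitiveNameMatching)
    {
        this.caseSensitiveNameMatching = caseSensitiveNameMatching;
    }

    // Keep the identifier as written when case-sensitive matching is enabled;
    // otherwise fold it to lowercase, which matches the previous behaviour.
    String normalize(String identifier)
    {
        return caseSensitiveNameMatching ? identifier : identifier.toLowerCase(Locale.ENGLISH);
    }
}
```

With the flag off, `Orders` and `orders` collapse to the same name; with it on, they stay distinct.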

Impact

Supports creating tables and columns with the same name in different cases

[Screenshot: 2025-08-12, 6:07 PM]

Test Plan

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

Cassandra Connector Changes
* Add support for case-sensitive identifiers in Cassandra. Enable it by setting ``case-sensitive-name-matching=true`` in the catalog configuration.

@prestodb-ci prestodb-ci added the from:IBM PR from IBM label Aug 5, 2025
@agrawalreetika agrawalreetika changed the title from "[Do not review]Mixed case support for cassandra" to "[Do not review] Case-sensitive support for cassandra" Aug 6, 2025
@bibith4 bibith4 force-pushed the mixedcase_cassandra branch 5 times, most recently from 44c3dcc to 73ec9cd Compare August 11, 2025 15:46
@bibith4 bibith4 marked this pull request as ready for review August 12, 2025 07:10
@bibith4 bibith4 requested a review from a team as a code owner August 12, 2025 07:10
@prestodb-ci prestodb-ci requested review from a team, BryanCutler and ScrapCodes and removed request for a team August 12, 2025 07:10
@bibith4 bibith4 requested a review from agrawalreetika August 12, 2025 07:10
@bibith4 bibith4 changed the title from "[Do not review] Case-sensitive support for cassandra" to "Case-sensitive support for cassandra" Aug 12, 2025
Member

@agrawalreetika agrawalreetika left a comment

Thanks for making the changes and adding the support.
I think currently, schemas are still getting converted to lowercase? https://github.com/bibith4/presto/blob/73ec9cd4a49982120eb3af5e781a943baf755205/presto-cassandra/src/main/java/com/facebook/presto/cassandra/CassandraMetadata.java#L101

Please verify and make the changes along with the tests accordingly.

@bibith4 bibith4 force-pushed the mixedcase_cassandra branch 3 times, most recently from 5dd62cf to c81c097 Compare August 13, 2025 10:08
Contributor Author

bibith4 commented Aug 13, 2025

> Thanks for making the changes and adding the support. I think currently, schemas are still getting converted to lowercase? https://github.com/bibith4/presto/blob/73ec9cd4a49982120eb3af5e781a943baf755205/presto-cassandra/src/main/java/com/facebook/presto/cassandra/CassandraMetadata.java#L101
>
> Please verify and make the changes along with the tests accordingly.

@agrawalreetika Added the requested changes. Please check

@bibith4 bibith4 requested a review from agrawalreetika August 13, 2025 10:09
session.execute("CREATE KEYSPACE test_keyspace WITH REPLICATION = {'class':'SimpleStrategy', 'replication_factor': 1}");
assertContainsEventually(() -> getQueryRunner().execute("SHOW SCHEMAS FROM cassandra"), resultBuilder(getSession(), createUnboundedVarcharType())
        .row("test_keyspace")
        .build(), new Duration(1, MINUTES));
Member

Let's add schema creation for the same keyspace in uppercase as well, and then check for both the lowercase and uppercase names in testShowSchemas.

Member

Please add the documentation details as well for the new config - https://prestodb.io/docs/current/connector/cassandra.html#configuration-properties

Contributor

> Please add the documentation details as well for the new config - https://prestodb.io/docs/current/connector/cassandra.html#configuration-properties

Yes, please add documentation. You can use the MySQL doc as a model where the case-sensitive-name-matching property is documented in the last row of the table.

Contributor Author

@agrawalreetika @steveburnett Updated the documentation with the new config property. Please check.

Contributor Author

> Let's add schema creation for the same keyspace in uppercase as well, and then check for both the lowercase and uppercase names in testShowSchemas.

@agrawalreetika Added the required changes. Please check
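
For context on the uppercase keyspace test: in CQL, unquoted identifiers are folded to lowercase, so a test that wants a keyspace named `TEST_KEYSPACE` alongside `test_keyspace` has to double-quote the name. The sketch below builds such a statement. `CqlIdentifiers` is a hypothetical helper for illustration, not part of the test code in this PR.

```java
// Hypothetical helper for illustration; not taken from the PR's test classes.
final class CqlIdentifiers
{
    private CqlIdentifiers() {}

    // Wrap a name in double quotes so CQL preserves its case.
    // An embedded double quote is escaped by doubling it.
    static String quote(String name)
    {
        return "\"" + name.replace("\"", "\"\"") + "\"";
    }

    // Build a CREATE KEYSPACE statement with the same replication
    // settings used in the test snippet shown in this thread.
    static String createKeyspace(String name)
    {
        return "CREATE KEYSPACE " + quote(name) +
                " WITH REPLICATION = {'class':'SimpleStrategy', 'replication_factor': 1}";
    }
}
```

Passing `"TEST_KEYSPACE"` and `"test_keyspace"` produces two statements that create two distinct keyspaces, which is the situation testShowSchemas is asked to cover.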

Contributor

@steveburnett steveburnett left a comment

Thanks for the docs! Minor nits.

@bibith4 bibith4 force-pushed the mixedcase_cassandra branch from 6b547ef to 58971f4 Compare August 18, 2025 15:10
@bibith4 bibith4 requested a review from steveburnett August 18, 2025 15:12
steveburnett previously approved these changes Aug 18, 2025
Contributor

@steveburnett steveburnett left a comment

LGTM! (docs)

Pulled the updated branch and ran a new local doc build; looks good. Thanks!

Member

@imjalpreet imjalpreet left a comment

@bibith4, some changes appear to be in the wrong commit. Please check and update the commits to include only relevant changes. It would really help in the review process. Thank you!

Contributor Author

bibith4 commented Sep 11, 2025

> @bibith4, some changes appear to be in the wrong commit. Please check and update the commits to include only relevant changes. It would really help in the review process. Thank you!

@imjalpreet As discussed, I created a separate PR to enable and fix all Cassandra connector tests in CI (#26022 (comment)).

@bibith4 bibith4 force-pushed the mixedcase_cassandra branch 3 times, most recently from aa7e9d7 to a1fbb34 Compare September 18, 2025 06:43
Contributor Author

bibith4 commented Sep 18, 2025

@imjalpreet Updated the PR to include only the mixed-case support changes for Cassandra. Can you please take a look?

@bibith4 bibith4 requested a review from imjalpreet September 18, 2025 07:29
session.execute("DROP KEYSPACE keyspace_1");
}
}
}
Member

@bibith4 Can you add a test method for ALTER operations as well?

Contributor Author

@agrawalreetika Cassandra does not support ALTER operations

Member

@imjalpreet imjalpreet left a comment

Thanks, @bibith4.

  try {
      for (String tableName : cassandraSession.getCaseSensitiveTableNames(schemaName)) {
-         tableNames.add(new SchemaTableName(schemaName, tableName.toLowerCase(ENGLISH)));
+         String finalTableName = normalizeIdentifier(session, tableName);
Member

nit:

Suggested change
- String finalTableName = normalizeIdentifier(session, tableName);
+ String normalizedTableName = normalizeIdentifier(session, tableName);

Contributor Author

@imjalpreet Changed as per suggestion. Please check

  for (CassandraColumnHandle columnHandle : table.getColumns()) {
-     columnHandles.put(CassandraCqlUtils.cqlNameToSqlName(columnHandle.getName()).toLowerCase(ENGLISH), columnHandle);
+     String columnName = CassandraCqlUtils.cqlNameToSqlName(columnHandle.getName());
+     String finalColumnName = normalizeIdentifier(session, columnName);
Member

Suggested change
- String finalColumnName = normalizeIdentifier(session, columnName);
+ String normalizedColumnName = normalizeIdentifier(session, columnName);

Contributor Author

@imjalpreet Changed as per suggestion. Please check

  ImmutableMap.Builder<String, ColumnHandle> columnHandles = ImmutableMap.builder();
  for (CassandraColumnHandle columnHandle : table.getColumns()) {
-     columnHandles.put(CassandraCqlUtils.cqlNameToSqlName(columnHandle.getName()).toLowerCase(ENGLISH), columnHandle);
+     String columnName = CassandraCqlUtils.cqlNameToSqlName(columnHandle.getName());
Member

nit: static import for cqlNameToSqlName

Contributor Author

@imjalpreet Changed as per suggestion. Please check

Comment on lines 305 to 317
  if (result != null) {
      throw new PrestoException(
              NOT_SUPPORTED,
-             format("More than one keyspace has been found for the case insensitive schema name: %s -> (%s, %s)",
-                     caseInsensitiveSchemaName, result.getName(), keyspace.getName()));
+             format("More than one keyspace has been found for the schema name: %s -> (%s, %s)",
+                     caseSensitiveSchemaName, result.getName(), keyspace.getName()));
  }
Member

Do we need this check when caseSensitiveNameMatchingEnabled is true? I think it can only happen if we match case-insensitively.

Contributor Author

@imjalpreet Changed to check only if caseSensitiveNameMatchingEnabled is false
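
The point of this thread can be sketched as follows, under stated assumptions: keyspaces are modelled as plain strings, and `IllegalStateException` stands in for `PrestoException(NOT_SUPPORTED, ...)` so the example stays self-contained. `KeyspaceLookup` is a hypothetical name, not the connector's class.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; the real lookup lives in the connector's session code.
final class KeyspaceLookup
{
    private KeyspaceLookup() {}

    // Returns the stored keyspace name that matches schemaName, or null if none does.
    static String find(List<String> keyspaces, String schemaName, boolean caseSensitive)
    {
        String result = null;
        for (String keyspace : keyspaces) {
            boolean matches = caseSensitive
                    ? keyspace.equals(schemaName)
                    : keyspace.equalsIgnoreCase(schemaName);
            if (!matches) {
                continue;
            }
            // Two matches can only arise when case is ignored, so the
            // ambiguity check is skipped in case-sensitive mode.
            if (!caseSensitive && result != null) {
                throw new IllegalStateException(String.format(
                        "More than one keyspace has been found for the case insensitive schema name: %s -> (%s, %s)",
                        schemaName.toLowerCase(Locale.ENGLISH), result, keyspace));
            }
            result = keyspace;
        }
        return result;
    }
}
```

With keyspaces `sales` and `Sales` both present, a case-sensitive lookup of `Sales` resolves cleanly, while a case-insensitive lookup of either spelling is rejected as ambiguous.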

  keyspace.getTables().stream(),
  keyspace.getMaterializedViews().stream())
- .filter(table -> table.getName().equalsIgnoreCase(caseInsensitiveTableName))
+ .filter(table -> caseSensitiveNameMatchingEnabled ? table.getName().equals(caseSensitiveTableName) : table.getName().equalsIgnoreCase(caseSensitiveTableName))
Member

nit: maybe we can make the matchesSchemaName method generic and use it for both table and schema checks.

Contributor Author

@imjalpreet Changed to make the method generic

Comment on lines 362 to 373
  if (columnMap.containsKey(columnNameKey)) {
      throw new PrestoException(
              NOT_SUPPORTED,
              format("More than one column has been found for the case insensitive column name: %s -> (%s, %s)",
-                     lowercaseName, lowercaseNameToColumnMap.get(lowercaseName).getName(), column.getName()));
+                     columnNameKey, columnMap.get(columnNameKey).getName(), column.getName()));
Member

Same question here. Do we need this when caseSensitiveNameMatchingEnabled is true?

Contributor Author

@imjalpreet Changed to check only if caseSensitiveNameMatchingEnabled is false
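
A minimal sketch of the column check after this change, assuming column names are plain strings and using `IllegalStateException` in place of `PrestoException` to keep it self-contained. `ColumnNameCheck` is a hypothetical name; only the error message wording is taken from the diff above.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of the duplicate-column check discussed in this thread.
final class ColumnNameCheck
{
    private ColumnNameCheck() {}

    static void checkColumnNames(List<String> columnNames, boolean caseSensitive)
    {
        if (caseSensitive) {
            // Names that differ only by case are distinct columns here,
            // so there is no case-folding collision to reject.
            return;
        }
        Map<String, String> lowercaseNameToColumn = new HashMap<>();
        for (String name : columnNames) {
            String lowercaseName = name.toLowerCase(Locale.ENGLISH);
            // putIfAbsent returns the earlier column if this key was already seen.
            String previous = lowercaseNameToColumn.putIfAbsent(lowercaseName, name);
            if (previous != null) {
                throw new IllegalStateException(String.format(
                        "More than one column has been found for the case insensitive column name: %s -> (%s, %s)",
                        lowercaseName, previous, name));
            }
        }
    }
}
```

Because the check only runs in case-insensitive mode, the map can keep its lowercase keys, which is consistent with the later nit about reverting the map rename.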

{
private CassandraServer server;
private CassandraSession session;
private static final String KEYSPACE = "test_connetor";
Member

Suggested change
- private static final String KEYSPACE = "test_connetor";
+ private static final String KEYSPACE = "test_connector";

Contributor Author

@imjalpreet Corrected. Please check

Comment on lines 101 to 102
assertQueryFails("CREATE TABLE test_connector.TEST_CREATEAS_FAIL_Join AS SELECT c.custkey, o.orderkey FROM " +
"tpch.customer c INNER JOIN tpch.ORDERS1 o ON c.custkey = o.custkey WHERE c.mktsegment = 'BUILDING'", "Table cassandra.tpch.ORDERS1 does not exist"); //failure scenario since tpch.ORDERS1 doesn't exist
Member

What is the significance of this test? Shouldn't we test with just ORDERS rather than ORDERS1, since the lowercase table orders exists?

Contributor Author

@imjalpreet Corrected to use ORDERS. Please check

"tpch.customer Cus INNER JOIN tpch.orders Ord ON Cus.custkey = Ord.custkey WHERE Cus.mktsegment = 'BUILDING'");
assertTrue(getQueryRunner().tableExists(session, "Test_CreateAs_Mixed_Join"));
}
finally {
Member

nit: you missed dropping TEST_CREATEAS_Join

Contributor Author

@imjalpreet added drop statement for TEST_CREATEAS_Join

assertQuery("SELECT COUNT(*) FROM test_connetor.test_select", "VALUES 2");
}
finally {
getQueryRunner().execute(session, "DROP TABLE IF EXISTS TEST_SELECT");
Member

We created the table with the lowercase name test_select, so the DROP should use that name rather than TEST_SELECT.

Contributor Author

@imjalpreet Corrected. Please check

Member

@imjalpreet imjalpreet left a comment

Thank you, @bibith4.

Changes LGTM now, last few nits.

- format("More than one keyspace has been found for the case insensitive schema name: %s -> (%s, %s)",
-         caseInsensitiveSchemaName, result.getName(), keyspace.getName()));
+ format("More than one keyspace has been found for the schema name: %s -> (%s, %s)",
+         caseSensitiveSchemaName, result.getName(), keyspace.getName()));
Member

nit: I think we should print the lowercased schema name in the error message.

Contributor Author

@imjalpreet Corrected. Please check

  format("More than one table has been found for the case insensitive table name: %s -> (%s)",
-         caseInsensitiveTableName, tableNames));
+         caseSensitiveTableName, tableNames));
  }
Member

@imjalpreet imjalpreet Sep 19, 2025

I missed this last time: as with the schema and column names, this check is only needed when caseSensitiveNameMatchingEnabled is false.

Member

Sorry, ignore this. It's already handled.

Member

But please print the lowercased table name in the error message.

Contributor Author

@imjalpreet Corrected. Please check

  private static void checkColumnNames(List<ColumnMetadata> columns)
  {
-     Map<String, ColumnMetadata> lowercaseNameToColumnMap = new HashMap<>();
+     Map<String, ColumnMetadata> columnMap = new HashMap<>();
Member

nit: maybe this line change can now be reverted

Contributor Author

@imjalpreet Corrected. Please check

@bibith4 bibith4 force-pushed the mixedcase_cassandra branch 2 times, most recently from 5cbeef3 to 2aa6330 Compare September 22, 2025 06:44
@bibith4 bibith4 requested a review from imjalpreet September 22, 2025 06:50
@bibith4 bibith4 changed the title from "Case-sensitive support for cassandra" to "feat(cassandra): Enable case-sensitive support" Oct 3, 2025
@bibith4 bibith4 force-pushed the mixedcase_cassandra branch from 2aa6330 to d3cb81e Compare October 3, 2025 05:47
@bibith4 bibith4 force-pushed the mixedcase_cassandra branch from d3cb81e to 22b61df Compare October 6, 2025 05:09
Member

@imjalpreet imjalpreet left a comment

Thank you, @bibith4.

LGTM.

@imjalpreet imjalpreet requested a review from tdcmeehan October 8, 2025 10:06
Comment on lines +47 to +51
connectorProperties = new HashMap<>(ImmutableMap.copyOf(connectorProperties));
connectorProperties.putIfAbsent("cassandra.contact-points", server.getHost());
connectorProperties.putIfAbsent("cassandra.native-protocol-port", Integer.toString(server.getPort()));
connectorProperties.putIfAbsent("cassandra.allow-drop-table", "true");

Contributor

Are these related to the case sensitivity changes?

Member

These were already being used. The change is to add them to the newly introduced connectorProperties map.
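
To illustrate the answer above, here is a minimal sketch of how `putIfAbsent` lets a test pass extra catalog properties (such as the new case-sensitivity flag) while the existing defaults still apply. The property keys are taken from the snippet under review; the class name, method name, and parameters are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; the real code sits inside the Cassandra query runner.
final class ConnectorPropertyDefaults
{
    private ConnectorPropertyDefaults() {}

    static Map<String, String> withDefaults(Map<String, String> overrides, String host, int port)
    {
        // Copy first so the caller's map is never mutated.
        Map<String, String> properties = new HashMap<>(overrides);
        // putIfAbsent only fills in a value when the caller did not supply one.
        properties.putIfAbsent("cassandra.contact-points", host);
        properties.putIfAbsent("cassandra.native-protocol-port", Integer.toString(port));
        properties.putIfAbsent("cassandra.allow-drop-table", "true");
        return properties;
    }
}
```

A caller can then pass `case-sensitive-name-matching=true` in `overrides` for the case-sensitive tests and an empty map for the existing ones.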

@imjalpreet imjalpreet merged commit 75a4ff9 into prestodb:master Oct 8, 2025
113 of 117 checks passed
imsayari404 pushed a commit to imsayari404/presto that referenced this pull request Oct 13, 2025

Labels

from:IBM PR from IBM


8 participants