Contributor

@bibith4 bibith4 commented Sep 11, 2025

Description

Enable and fix all Cassandra connector tests in CI

Motivation and Context

Cassandra test cases were not enabled in CI. After PR #25351, which enhanced SHOW COLUMNS, was merged, some Cassandra test cases started failing. This change re-enables the Cassandra tests and fixes the test failures.

Impact

Enabled all Cassandra tests in CI.

Test Plan

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

== NO RELEASE NOTE ==

@prestodb-ci prestodb-ci added the from:IBM PR from IBM label Sep 11, 2025
Contributor

sourcery-ai bot commented Sep 11, 2025

Reviewer's Guide

This PR re-enables Cassandra connector tests in CI by updating test classes to ensure proper cleanup, adapting expected results to recent SHOW COLUMNS enhancements, overriding unsupported tests as no-ops, and adjusting build configurations and CI workflows to include the full Cassandra test suite.

Class diagram for updated Cassandra test classes

classDiagram
  class TestCassandraDistributed {
    +setUp()
    +tearDown()
    +testCases()
    -overriddenUnsupportedTests()
  }
  class TestCassandraIntegrationSmokeTest {
    +setUp()
    +tearDown()
    +testShowColumns()
    -adaptedExpectedResults()
    -overriddenUnsupportedTests()
  }
  class TestCassandraTokenSplitManager {
    +setUp()
    +tearDown()
    +testTokenSplits()
    -overriddenUnsupportedTests()
  }
  TestCassandraDistributed <|-- TestCassandraIntegrationSmokeTest
  TestCassandraDistributed <|-- TestCassandraTokenSplitManager

File-Level Changes

Change: Wrap DDL sequences in tests with try-finally blocks to guarantee cleanup
Details:
  • Added try-finally around CREATE/DROP sequences in integration smoke tests
  • Wrapped keyspace/table/drop operations in finally blocks across multiple test methods
  • Ensured session.execute(DROP ...) is always invoked regardless of test outcome
Files: TestCassandraIntegrationSmokeTest.java, TestCassandraDistributed.java, TestCassandraTokenSplitManager.java

Change: Updated expected SHOW COLUMNS results to include the new statistics columns
Details:
  • Expanded resultBuilder schema with three additional BIGINT columns
  • Adjusted .row() calls to supply min/max/null fraction values
  • Replaced literal types with Long.valueOf for clarity
Files: TestCassandraDistributed.java

Change: Disabled or overridden unsupported connector tests as no-ops
Details:
  • Added @Override no-op implementations for tests relying on array or CHAR types
  • Commented that connector only supports CTAS for transaction tests
  • Disabled payload join and subfield-related tests
Files: TestCassandraDistributed.java

Change: Refactored Maven profile and CI workflow to run full Cassandra tests
Details:
  • Renamed Maven profile from test-cassandra-integration-smoke-test to ci-full-tests
  • Removed explicit includes filter to run entire test suite
  • Updated .github/workflows/tests.yml to use -P ci-full-tests for Cassandra
Files: pom.xml, .github/workflows/tests.yml

@bibith4 bibith4 marked this pull request as ready for review September 11, 2025 11:40
@bibith4 bibith4 requested review from a team, czentgr and unidevel as code owners September 11, 2025 11:40
@prestodb-ci prestodb-ci requested review from a team and wanglinsong and removed request for a team September 11, 2025 11:40
@bibith4 bibith4 requested a review from imjalpreet September 11, 2025 11:40
Contributor

@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes - here's some feedback:

  • The repetitive try–finally cleanup logic in many tests could be extracted into a shared helper or an @After method to reduce boilerplate.
  • Overriding dozens of unsupported test methods as no-ops clutters TestCassandraDistributed — consider filtering them at the base class or using a suite-level exclude instead.
  • The new Maven profile id “ci-full-tests” is generic; renaming it to something more descriptive like “cassandra-integration-tests” would improve clarity.
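
A minimal sketch of the shared-helper idea from the first bullet. The `SqlRunner` interface and the `withTemporaryTable` method are hypothetical names chosen for illustration; they are not part of the Presto or Cassandra test APIs, and the real tests may need a different shape.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: SqlRunner and withTemporaryTable are illustrative names,
// not part of the Presto or Cassandra test APIs.
class CleanupHelperSketch
{
    /** Minimal stand-in for whatever executes statements in a test (assumption). */
    interface SqlRunner
    {
        void execute(String sql);
    }

    /** Creates a table, runs the test body, and always issues DROP TABLE IF EXISTS afterwards. */
    static void withTemporaryTable(SqlRunner runner, String tableName, String createSql, Consumer<SqlRunner> body)
    {
        try {
            runner.execute(createSql);
            body.accept(runner);
        }
        finally {
            // IF EXISTS keeps cleanup from failing when the CREATE itself did not succeed.
            runner.execute("DROP TABLE IF EXISTS " + tableName);
        }
    }

    public static void main(String[] args)
    {
        // The "runner" here only records statements, so the sketch runs without a database.
        List<String> log = new ArrayList<>();
        withTemporaryTable(log::add, "ks.t1", "CREATE TABLE ks.t1 (id int PRIMARY KEY)",
                runner -> runner.execute("SELECT id FROM ks.t1"));
        log.forEach(System.out::println);
    }
}
```

With a helper of this kind, each test method would pass only its table name, DDL, and assertions, and the try-finally block would live in one place.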

Member

@imjalpreet imjalpreet left a comment

Thank you, @bibith4. I have a few suggestions and clarifications.

@Override
@Parameters("storageFormat")
@Test
public void testAddDistinctForSemiJoinBuild(@Optional("PARQUET") String storageFormat)
Member

What is the reason for this override?

Contributor Author

@imjalpreet In AbstractTestQueries, we currently cast o.orderdate to DATE only when the format is "DWRF":

String orderdate = format.equals("DWRF") ? "cast(o.orderdate as DATE)" : "o.orderdate";

As a result, when the test data stores orderdate as VARCHAR and the format is not "DWRF", o.orderdate remains a string and the date comparison fails.
To fix this, I changed the code to cast o.orderdate to DATE regardless of the format.

Member

I see. Since the Cassandra connector doesn’t rely on storage formats, they aren’t relevant here.

Because we only want orderdate to be connector-specific, instead of duplicating the entire test, we could introduce a method in AbstractTestDistributedQueries with a default implementation that defines whether orderdate needs casting. Then, we’d only need to override that method in TestCassandraDistributed.
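
The suggestion above can be sketched roughly as follows. This is an illustration under assumptions, not the code that was merged: `BaseQueries`, `CassandraQueries`, `needsDateCast`, and `orderDatePredicate` are hypothetical names, and the date literal in the predicate is invented for the example.

```java
// Hypothetical sketch of the suggested hook; class and method names are
// illustrative and do not mirror the real Presto test classes.
class DateCastHookSketch
{
    static class BaseQueries
    {
        /** Default behavior: only the DWRF storage format needs the cast. */
        protected boolean needsDateCast(String storageFormat)
        {
            return storageFormat.equals("DWRF");
        }

        /** Shared test logic builds its SQL through the hook instead of being copied into subclasses. */
        String orderDatePredicate(String storageFormat)
        {
            String orderdate = needsDateCast(storageFormat) ? "cast(o.orderdate as DATE)" : "o.orderdate";
            // The date literal is an invented example value.
            return orderdate + " < DATE '1995-01-01'";
        }
    }

    static class CassandraQueries
            extends BaseQueries
    {
        @Override
        protected boolean needsDateCast(String storageFormat)
        {
            // Per the discussion, orderdate is VARCHAR in this test data, so always cast.
            return true;
        }
    }

    public static void main(String[] args)
    {
        System.out.println(new BaseQueries().orderDatePredicate("PARQUET"));
        System.out.println(new CassandraQueries().orderDatePredicate("PARQUET"));
    }
}
```

The connector-specific subclass overrides one small method, and the inherited test body stays in the base class.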

@Override
@Parameters("storageFormat")
@Test
public void testQueryWithEmptyInput(@Optional("PARQUET") String storageFormat)
Member

What is the reason for this override?

Contributor Author

@imjalpreet In AbstractTestQueries, we currently cast o.orderdate to DATE only when the format is "DWRF":

String orderdate = format.equals("DWRF") ? "cast(o.orderdate as DATE)" : "o.orderdate";

As a result, when the test data stores orderdate as VARCHAR and the format is not "DWRF", o.orderdate remains a string and the date comparison fails.
To fix this, I changed the code to cast o.orderdate to DATE regardless of the format.

emptyJoinQueries(enableOptimization);
}

private void emptyJoinQueries(Session session)
Member

What is the reason for this override?

Contributor Author

@imjalpreet The emptyJoinQueries() method was not overridden. It is used by testQueryWithEmptyInput().

Member

Once we refactor as I suggested, we won't need to copy this method either.

    assertSelect("table_all_types_copy", true);
}
finally {
    execute("DROP TABLE table_all_types_copy");
Member

Suggested change
-    execute("DROP TABLE table_all_types_copy");
+    execute("DROP TABLE IF EXISTS table_all_types_copy");
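
One reason IF EXISTS matters inside a finally block: in Java, an exception thrown from finally replaces the exception that was propagating out of try, so a failing DROP can hide the real cause of a test failure. The sketch below demonstrates this with a hypothetical in-memory stand-in for the database; `DropIfExistsSketch`, `dropTable`, and `runFailingTest` are illustrative names, not Presto or Cassandra code.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical in-memory stand-in for a database; not Cassandra or Presto code.
class DropIfExistsSketch
{
    private final Set<String> tables = new HashSet<>();

    /** Drops a table; without ifExists, dropping a missing table is an error. */
    void dropTable(String name, boolean ifExists)
    {
        if (!tables.remove(name) && !ifExists) {
            throw new IllegalStateException("Table does not exist: " + name);
        }
    }

    /** Simulates a test whose setup fails before the table is ever created. */
    String runFailingTest(boolean ifExists)
    {
        try {
            try {
                throw new IllegalArgumentException("CREATE TABLE failed");
            }
            finally {
                dropTable("table_all_types_copy", ifExists);
            }
        }
        catch (RuntimeException e) {
            // Report which failure the test runner would actually see.
            return e.getMessage();
        }
    }

    public static void main(String[] args)
    {
        DropIfExistsSketch sketch = new DropIfExistsSketch();
        System.out.println(sketch.runFailingTest(false)); // Table does not exist: table_all_types_copy
        System.out.println(sketch.runFailingTest(true));  // CREATE TABLE failed
    }
}
```

Without IF EXISTS, the reported failure is the cleanup error; with it, the original setup error surfaces.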

Contributor Author

Corrected. Please check.

    assertEquals(execute("SELECT column_1 FROM cassandra.keyspace_1.table_1").getRowCount(), 1);
}
finally {
    assertUpdate("DROP TABLE cassandra.keyspace_1.table_1");
Member

Suggested change
-    assertUpdate("DROP TABLE cassandra.keyspace_1.table_1");
+    assertUpdate("DROP TABLE IF EXISTS cassandra.keyspace_1.table_1");

Contributor Author

Corrected. Please check.

    assertEquals(execute("SELECT column_2 FROM cassandra.keyspace_2.table_2").getRowCount(), 1);
}
finally {
    assertUpdate("DROP TABLE cassandra.keyspace_2.table_2");
Member

Suggested change
-    assertUpdate("DROP TABLE cassandra.keyspace_2.table_2");
+    assertUpdate("DROP TABLE IF EXISTS cassandra.keyspace_2.table_2");

Contributor Author

Corrected. Please check.

    assertEquals(splits.size(), 1);
}
finally {
    session.execute(format("DROP TABLE %s.%s", KEYSPACE, tableName));
Member

Suggested change
-    session.execute(format("DROP TABLE %s.%s", KEYSPACE, tableName));
+    session.execute(format("DROP TABLE IF EXISTS %s.%s", KEYSPACE, tableName));

Contributor Author

Corrected. Please check.

    assertEquals(splits.size(), PARTITION_COUNT / SPLIT_SIZE);
}
finally {
    session.execute(format("DROP TABLE %s.%s", KEYSPACE, tableName));
Member

Suggested change
-    session.execute(format("DROP TABLE %s.%s", KEYSPACE, tableName));
+    session.execute(format("DROP TABLE IF EXISTS %s.%s", KEYSPACE, tableName));

Contributor Author

Corrected. Please check.

@bibith4 bibith4 force-pushed the enable_and_fix_cassandra_test branch from 7a33dfe to e7148f3 Compare September 11, 2025 13:42
@bibith4 bibith4 requested a review from imjalpreet September 11, 2025 14:08
Member

@imjalpreet imjalpreet left a comment

Thanks, @bibith4.

I have a suggestion that should help reduce duplication.

@Override
@Parameters("storageFormat")
@Test
public void testAddDistinctForSemiJoinBuild(@Optional("PARQUET") String storageFormat)
Member

I see. Since the Cassandra connector doesn’t rely on storage formats, they aren’t relevant here.

Because we only want orderdate to be connector-specific, instead of duplicating the entire test, we could introduce a method in AbstractTestDistributedQueries with a default implementation that defines whether orderdate needs casting. Then, we’d only need to override that method in TestCassandraDistributed.

emptyJoinQueries(enableOptimization);
}

private void emptyJoinQueries(Session session)
Member

Once we refactor as I suggested, we won't need to copy this method either.

Comment on lines 283 to 284
<configuration>
<includes>
<include>**/TestCassandraIntegrationSmokeTest.java</include>
</includes>
</configuration>
Member

nit: please remove this empty configuration property

Contributor Author

@imjalpreet Corrected. Can you please check?

@bibith4 bibith4 force-pushed the enable_and_fix_cassandra_test branch 2 times, most recently from 206a3d4 to e40a2ef Compare September 15, 2025 05:54
@bibith4 bibith4 requested a review from imjalpreet September 15, 2025 06:12
Member

@imjalpreet imjalpreet left a comment

Thank you, @bibith4.

LGTM % minor refactor.

Comment on lines 8091 to 8099

/**
 * By default, casts only when storageFormat is DWRF.
 */
protected String getOrderDateExpression(String storageFormat)
{
    // DWRF does not support date type.
    return storageFormat.equals("DWRF") ? "cast(o.orderdate as DATE)" : "o.orderdate";
}

/**
 * By default, casts only when storageFormat is DWRF.
 */
protected String getShipDateExpression(String storageFormat)
{
    // DWRF does not support date type.
    return storageFormat.equals("DWRF") ? "cast(l.shipdate as DATE)" : "l.shipdate";
}
Member

Suggested change
- /**
-  * By default, casts only when storageFormat is DWRF.
-  */
- protected String getOrderDateExpression(String storageFormat)
- {
-     // DWRF does not support date type.
-     return storageFormat.equals("DWRF") ? "cast(o.orderdate as DATE)" : "o.orderdate";
- }
- /**
-  * By default, casts only when storageFormat is DWRF.
-  */
- protected String getShipDateExpression(String storageFormat)
- {
-     // DWRF does not support date type.
-     return storageFormat.equals("DWRF") ? "cast(l.shipdate as DATE)" : "l.shipdate";
- }
+ /**
+  * Returns a date expression, casting to DATE if storageFormat is DWRF.
+  */
+ protected String getDateExpression(String storageFormat, String columnExpression)
+ {
+     // DWRF does not support date type.
+     return storageFormat.equals("DWRF") ? "cast(" + columnExpression + " as DATE)" : columnExpression;
+ }

nit: good to refactor
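
For illustration, here is how the single helper from the suggestion covers both former call sites. The wrapper class `DateExpressionSketch` is hypothetical, and the method is made static only so the sketch runs standalone; in the suggestion it is a protected instance method.

```java
// Hypothetical wrapper class so the suggested helper can run standalone.
class DateExpressionSketch
{
    /** Same logic as the suggested helper; static here only for the sake of the sketch. */
    static String getDateExpression(String storageFormat, String columnExpression)
    {
        // DWRF does not support date type.
        return storageFormat.equals("DWRF") ? "cast(" + columnExpression + " as DATE)" : columnExpression;
    }

    public static void main(String[] args)
    {
        System.out.println(getDateExpression("DWRF", "o.orderdate"));    // cast(o.orderdate as DATE)
        System.out.println(getDateExpression("DWRF", "l.shipdate"));     // cast(l.shipdate as DATE)
        System.out.println(getDateExpression("PARQUET", "o.orderdate")); // o.orderdate
    }
}
```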

Contributor Author

@imjalpreet Refactored the code as per the suggestion. Can you please check?

@bibith4 bibith4 force-pushed the enable_and_fix_cassandra_test branch from e40a2ef to 1fb914c Compare September 16, 2025 05:05
@bibith4 bibith4 force-pushed the enable_and_fix_cassandra_test branch from 1fb914c to 9a22377 Compare September 16, 2025 05:42
Member

@imjalpreet imjalpreet left a comment

Thank you, @bibith4.

@imjalpreet
Member

@tdcmeehan, can you please help review this?

Also, we need to modify the required check as the profile name is not test-cassandra-integration-smoke-test any longer.

@tdcmeehan tdcmeehan merged commit 0cbc311 into prestodb:master Sep 17, 2025
102 of 103 checks passed