
fix(plugin-cassandra): Drop stale tables if table creation process fails #27100

Merged
pdabre12 merged 3 commits into prestodb:master from pdabre12:fix-cassandra-CTAS
Feb 12, 2026

@pdabre12 pdabre12 commented Feb 6, 2026

Description

Drop stale tables left behind by the Cassandra connector when table creation fails.

Motivation and Context

When a CREATE TABLE or CTAS operation fails during execution, the Cassandra connector may leave behind a partially created table.

This results in:

  1. Stale tables in Cassandra.
  2. Metadata inconsistencies between Presto and Cassandra.

Impact

No user impact

Test Plan

Unit test, CI

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.
  • If adding new dependencies, verified they have an OpenSSF Scorecard score of 5.0 or higher (or obtained explicit TSC approval for lower scores).

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

Cassandra Connector Changes
* Drop stale tables if table creation process fails.

Summary by Sourcery

Add per-transaction Cassandra metadata with rollback support to clean up tables created during failed or aborted table-creation operations.

Bug Fixes:

  • Ensure tables created during Cassandra table-creation are dropped on transaction rollback when allowed, preventing stale tables after failures.

Enhancements:

  • Replace the singleton Cassandra transaction handle with UUID-based handles and track connector metadata per transaction, adding commit and rollback handling in the Cassandra connector.
  • Adjust Cassandra connector wiring and tests to construct CassandraMetadata per transaction instead of as a singleton.

Summary by Sourcery

Ensure Cassandra connector tracks per-transaction metadata and cleans up partially created tables on rollback to avoid stale tables after failed table creation.

Bug Fixes:

  • Prevent stale Cassandra tables by dropping tables created during failed or aborted CREATE TABLE operations when transactions are rolled back.

Enhancements:

  • Introduce UUID-based Cassandra transaction handles and manage CassandraMetadata instances per transaction with commit and rollback support in the connector.
  • Wire Cassandra connector dependencies directly into CassandraMetadata construction instead of using a singleton metadata binding.

Tests:

  • Update Cassandra connector tests to use transactional metadata access and add a rollback test verifying that aborted table creation does not leave the table behind.

@prestodb-ci prestodb-ci added the from:IBM PR from IBM label Feb 6, 2026
sourcery-ai bot commented Feb 6, 2026

Reviewer's Guide

Refactors the Cassandra connector to use per-transaction CassandraMetadata instances tied to UUID-based transaction handles, adds commit/rollback semantics that execute a stored rollback action to drop newly created tables when table creation fails, wires the new lifecycle through the connector modules, and updates tests to use transactional metadata access and verify rollback behavior.

Sequence diagram for Cassandra table creation rollback on transaction abort

sequenceDiagram
    actor PrestoEngine
    participant CassandraConnector
    participant CassandraMetadata
    participant CassandraSession
    participant Cassandra

    PrestoEngine->>CassandraConnector: beginTransaction(isolationLevel, readOnly)
    CassandraConnector->>CassandraConnector: checkConnectorSupports(READ_UNCOMMITTED, isolationLevel)
    CassandraConnector->>CassandraTransactionHandle: new CassandraTransactionHandle()
    CassandraConnector->>CassandraMetadata: new CassandraMetadata(connectorId, cassandraSession, partitionManager, extraColumnMetadataCodec, config)
    CassandraConnector->>CassandraConnector: transactions.put(transaction, metadata)
    CassandraConnector-->>PrestoEngine: transactionHandle

    PrestoEngine->>CassandraConnector: getMetadata(transactionHandle)
    CassandraConnector->>CassandraConnector: metadata = transactions.get(transactionHandle)
    CassandraConnector-->>PrestoEngine: CassandraMetadata

    PrestoEngine->>CassandraMetadata: createTable(session, tableMetadata)
    CassandraMetadata->>CassandraSession: execute(CREATE TABLE ...)
    CassandraSession->>Cassandra: CREATE TABLE
    Cassandra-->>CassandraSession: success
    CassandraSession-->>CassandraMetadata: ok
    CassandraMetadata->>CassandraMetadata: setRollback(schemaName, tableName)
    CassandraMetadata-->>PrestoEngine: CassandraOutputTableHandle

    PrestoEngine->>PrestoEngine: failure before finishCreateTable

    PrestoEngine->>CassandraConnector: rollback(transactionHandle)
    CassandraConnector->>CassandraConnector: metadata = transactions.remove(transactionHandle)
    CassandraConnector->>CassandraMetadata: rollback()
    CassandraMetadata->>CassandraMetadata: rollbackAction.getAndSet(null)
    CassandraMetadata->>CassandraSession: execute(DROP TABLE schema.table)
    CassandraSession->>Cassandra: DROP TABLE
    Cassandra-->>CassandraSession: success
    CassandraSession-->>CassandraMetadata: ok
    CassandraMetadata-->>CassandraConnector: rollback completed
    CassandraConnector-->>PrestoEngine: rollback completed
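The connector-side flow in the sequence diagram can be sketched in plain Java as follows. This is a minimal illustration under stated assumptions, not the connector's code: `SketchTransactionRegistry` and `SketchTxMetadata` are hypothetical stand-ins for CassandraConnector and CassandraMetadata, a bare `UUID` stands in for the transaction handle, and the isolation-level check is omitted.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for per-transaction metadata (not the real CassandraMetadata).
final class SketchTxMetadata
{
    private boolean rolledBack;

    void rollback()
    {
        // The real metadata would run its pending DROP TABLE action here.
        rolledBack = true;
    }

    boolean isRolledBack()
    {
        return rolledBack;
    }
}

// Hypothetical registry that keeps one metadata instance per open transaction.
final class SketchTransactionRegistry
{
    private final ConcurrentMap<UUID, SketchTxMetadata> transactions = new ConcurrentHashMap<>();

    UUID beginTransaction()
    {
        UUID handle = UUID.randomUUID();
        transactions.put(handle, new SketchTxMetadata());
        return handle;
    }

    SketchTxMetadata getMetadata(UUID handle)
    {
        SketchTxMetadata metadata = transactions.get(handle);
        if (metadata == null) {
            throw new IllegalArgumentException("no such transaction: " + handle);
        }
        return metadata;
    }

    void commit(UUID handle)
    {
        // Commit only forgets the transaction; nothing is cleaned up.
        if (transactions.remove(handle) == null) {
            throw new IllegalArgumentException("no such transaction: " + handle);
        }
    }

    void rollback(UUID handle)
    {
        // Remove first so the entry cannot leak if the cleanup itself throws.
        SketchTxMetadata metadata = transactions.remove(handle);
        if (metadata == null) {
            throw new IllegalArgumentException("no such transaction: " + handle);
        }
        metadata.rollback();
    }

    int openTransactions()
    {
        return transactions.size();
    }
}
```

Removing the map entry before delegating to the metadata mirrors the `transactions.remove(transactionHandle)` step in the diagram.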

Class diagram for updated Cassandra transaction and metadata lifecycle

classDiagram

class CassandraTransactionHandle {
    - UUID uuid
    + CassandraTransactionHandle()
    + CassandraTransactionHandle(UUID uuid)
    + UUID getUuid()
    + boolean equals(Object obj)
    + int hashCode()
    + String toString()
}

class CassandraConnector {
    - CassandraConnectorId connectorId
    - LifeCycleManager lifeCycleManager
    - CassandraPartitionManager partitionManager
    - CassandraClientConfig config
    - CassandraSession cassandraSession
    - CassandraSplitManager splitManager
    - ConnectorRecordSetProvider recordSetProvider
    - ConnectorPageSinkProvider pageSinkProvider
    - List~PropertyMetadata~ sessionProperties
    - JsonCodec~List~ExtraColumnMetadata~~ extraColumnMetadataCodec
    - ConcurrentMap~ConnectorTransactionHandle, CassandraMetadata~ transactions
    + CassandraConnector(CassandraConnectorId connectorId, LifeCycleManager lifeCycleManager, CassandraSplitManager splitManager, CassandraRecordSetProvider recordSetProvider, CassandraPageSinkProvider pageSinkProvider, CassandraSessionProperties sessionProperties, CassandraSession cassandraSession, CassandraPartitionManager partitionManager, JsonCodec~List~ExtraColumnMetadata~~ extraColumnMetadataCodec, CassandraClientConfig config)
    + ConnectorTransactionHandle beginTransaction(IsolationLevel isolationLevel, boolean readOnly)
    + ConnectorCommitHandle commit(ConnectorTransactionHandle transaction)
    + void rollback(ConnectorTransactionHandle transaction)
    + ConnectorMetadata getMetadata(ConnectorTransactionHandle transaction)
    + boolean isSingleStatementWritesOnly()
}

class CassandraMetadata {
    - CassandraConnectorId connectorId
    - CassandraSession cassandraSession
    - CassandraPartitionManager partitionManager
    - CassandraClientConfig config
    - JsonCodec~List~ExtraColumnMetadata~~ extraColumnMetadataCodec
    - AtomicReference~Runnable~ rollbackAction
    - boolean allowDropTable
    + CassandraMetadata(CassandraConnectorId connectorId, CassandraSession cassandraSession, CassandraPartitionManager partitionManager, JsonCodec~List~ExtraColumnMetadata~~ extraColumnMetadataCodec, CassandraClientConfig config)
    + CassandraOutputTableHandle createTable(ConnectorSession session, ConnectorTableMetadata tableMetadata)
    + Optional~ConnectorOutputMetadata~ finishCreateTable(ConnectorSession session, ConnectorOutputTableHandle tableHandle, Collection~Slice~ fragments, Collection~ComputedStatistics~ computedStatistics)
    + void rollback()
    + String normalizeIdentifier(ConnectorSession session, String identifier)
    - void setRollback(String schemaName, String tableName)
    - void clearRollback()
}

class CassandraSession {
    + void execute(String cql)
}

class CassandraPartitionManager
class CassandraClientConfig
class ExtraColumnMetadata
class LifeCycleManager
class CassandraSplitManager
class CassandraRecordSetProvider
class CassandraPageSinkProvider
class CassandraSessionProperties {
    + List~PropertyMetadata~ getSessionProperties()
}

CassandraConnector --> CassandraTransactionHandle : creates
CassandraConnector "1" --> "*" CassandraMetadata : per transaction
CassandraConnector --> CassandraSession
CassandraConnector --> CassandraPartitionManager
CassandraConnector --> CassandraClientConfig
CassandraConnector --> CassandraSplitManager
CassandraConnector --> CassandraRecordSetProvider
CassandraConnector --> CassandraPageSinkProvider
CassandraConnector --> CassandraSessionProperties
CassandraConnector --> ExtraColumnMetadata : uses

CassandraMetadata --> CassandraSession
CassandraMetadata --> CassandraPartitionManager
CassandraMetadata --> CassandraClientConfig
CassandraMetadata --> ExtraColumnMetadata : encoded by

CassandraTransactionHandle ..|> ConnectorTransactionHandle
CassandraConnector ..|> Connector
CassandraMetadata ..|> ConnectorMetadata
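To show why the handle needs value-based equals and hashCode when it is used as a map key, here is a minimal sketch of a UUID-backed handle. `SketchTransactionHandle` is a hypothetical name, and the JSON serialization annotations mentioned in the review guide are deliberately left out, so this is not the class from the PR.

```java
import java.util.Objects;
import java.util.UUID;

// Hypothetical UUID-backed handle; the real class is also JSON-serializable.
final class SketchTransactionHandle
{
    private final UUID uuid;

    SketchTransactionHandle()
    {
        this(UUID.randomUUID());
    }

    SketchTransactionHandle(UUID uuid)
    {
        this.uuid = Objects.requireNonNull(uuid, "uuid is null");
    }

    UUID getUuid()
    {
        return uuid;
    }

    @Override
    public boolean equals(Object obj)
    {
        if (this == obj) {
            return true;
        }
        if (obj == null || getClass() != obj.getClass()) {
            return false;
        }
        return uuid.equals(((SketchTransactionHandle) obj).uuid);
    }

    @Override
    public int hashCode()
    {
        return uuid.hashCode();
    }

    @Override
    public String toString()
    {
        return "SketchTransactionHandle{uuid=" + uuid + "}";
    }
}
```

With value-based equality, a handle rebuilt from the same UUID (for example after a serialization round trip) still finds its entry in the transactions map.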

File-Level Changes

Change: Introduce UUID-based CassandraTransactionHandle and transaction-scoped metadata lifecycle in CassandraConnector, including commit and rollback hooks.
Details:
  • Replace the singleton CassandraTransactionHandle enum with a class that wraps a UUID and is JSON-serializable with proper equals/hashCode/toString.
  • Extend CassandraConnector to manage a ConcurrentMap from ConnectorTransactionHandle to CassandraMetadata, creating a new CassandraMetadata per beginTransaction and enforcing READ_UNCOMMITTED.
  • Implement commit and rollback methods on CassandraConnector that validate and remove transactions, and delegate to CassandraMetadata.rollback() on rollback.
Files:
  • presto-cassandra/src/main/java/com/facebook/presto/cassandra/CassandraTransactionHandle.java
  • presto-cassandra/src/main/java/com/facebook/presto/cassandra/CassandraConnector.java

Change: Add rollback support in CassandraMetadata to drop newly created tables on aborted or failed table-creation operations.
Details:
  • Add an AtomicReference<Runnable> rollbackAction field to CassandraMetadata to track a single pending rollback operation.
  • In createTable, set a rollback action that issues a DROP TABLE for the newly created table before returning the output handle.
  • In finishCreateTable, clear the rollback action so successful completion does not trigger cleanup.
  • Implement a public rollback() method that enforces allowDropTable and executes any pending rollback action, and helper methods to set/clear the rollback action with state checks.
Files:
  • presto-cassandra/src/main/java/com/facebook/presto/cassandra/CassandraMetadata.java

Change: Update Guice wiring so CassandraMetadata is no longer a singleton and can be created per transaction by CassandraConnector.
Details:
  • Remove the singleton binding for CassandraMetadata from CassandraClientModule, leaving CassandraConnector and other components as singletons.
  • Inject CassandraSession, CassandraPartitionManager, CassandraClientConfig, and JsonCodec<List<ExtraColumnMetadata>> into CassandraConnector so it can construct CassandraMetadata instances itself.
Files:
  • presto-cassandra/src/main/java/com/facebook/presto/cassandra/CassandraClientModule.java
  • presto-cassandra/src/main/java/com/facebook/presto/cassandra/CassandraConnector.java

Change: Adjust Cassandra connector tests to use transactional metadata access and add coverage for rollback dropping stale tables.
Details:
  • Store the constructed Connector in TestCassandraConnector instead of a shared ConnectorMetadata, and configure the connector with cassandra.allow-drop-table=true.
  • Update existing tests to begin a transaction, fetch ConnectorMetadata from the connector, and pass the transaction handle into split and record-set APIs instead of using the singleton transaction handle.
  • Add a new testRollbackTables that starts a transaction, begins createTable, simulates a failure, calls connector.rollback(), and asserts the table does not appear in listTables.
  • Minor helper changes to getTableHandle to accept ConnectorMetadata and new SchemaTableName fields for rollback-related tables.
Files:
  • presto-cassandra/src/test/java/com/facebook/presto/cassandra/TestCassandraConnector.java
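The rollback mechanics described for CassandraMetadata can be sketched as below. This is an illustration under assumptions, not the merged code: `SketchRollbackMetadata` is a hypothetical class, an in-memory `Set<String>` stands in for the Cassandra session, the allowDropTable check is omitted, and the use of `compareAndSet` for the state check is a guess at how "state checks" might be implemented.

```java
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical metadata that tracks a single pending rollback action.
final class SketchRollbackMetadata
{
    // In-memory stand-in for the tables that exist in Cassandra.
    private final Set<String> tables;
    private final AtomicReference<Runnable> rollbackAction = new AtomicReference<>();

    SketchRollbackMetadata(Set<String> tables)
    {
        this.tables = tables;
    }

    void createTable(String name)
    {
        // Stand-in for executing CREATE TABLE.
        tables.add(name);
        // Register the cleanup only after the table really exists.
        setRollback(() -> tables.remove(name));
    }

    void finishCreateTable()
    {
        // Success: the table must survive a later rollback call.
        clearRollback();
    }

    void rollback()
    {
        // getAndSet(null) guarantees the action runs at most once.
        Runnable action = rollbackAction.getAndSet(null);
        if (action != null) {
            action.run();
        }
    }

    private void setRollback(Runnable action)
    {
        if (!rollbackAction.compareAndSet(null, action)) {
            throw new IllegalStateException("rollback action is already set");
        }
    }

    private void clearRollback()
    {
        if (rollbackAction.getAndSet(null) == null) {
            throw new IllegalStateException("rollback action is not set");
        }
    }
}
```

Because only one action slot exists, a second createTable in the same transaction fails the state check in this sketch.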


@pdabre12 pdabre12 changed the title fix(plugin-cassandra) : Drop stale tables if table creation process fails fix(plugin-cassandra): Drop stale tables if table creation process fails Feb 6, 2026
@pdabre12 pdabre12 marked this pull request as ready for review February 11, 2026 19:19
@pdabre12 pdabre12 requested a review from a team as a code owner February 11, 2026 19:19
@prestodb-ci prestodb-ci requested review from a team, imsayari404 and infvg and removed request for a team February 11, 2026 19:19

@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 3 issues, and left some high level feedback:

  • In CassandraMetadata.rollback(), throwing PERMISSION_DENIED whenever allowDropTable is false will make any rollback fail for that catalog, even when no rollback action was registered; consider only enforcing this when a rollback action is actually present, to avoid breaking read-only transactions or transactions that never created a table.
  • The single AtomicReference<Runnable> rollbackAction combined with checkState in setRollback means a transaction can only safely create one table; if multiple table-creation operations per transaction are expected, this should be changed to track rollback actions per table or explicitly guarded/validated at a higher level.
  • The transactions map in CassandraConnector relies on the engine to always call commit/rollback; if that contract might not hold under failures, consider adding a defensive cleanup path (e.g., on connector shutdown) or logging to help detect and debug leaked transaction entries.
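One way to act on the second point is to key pending cleanup actions by table name, so a transaction can create several tables. The sketch below is hypothetical and is not taken from the PR: `SketchMultiTableMetadata` and its in-memory `Set<String>` of tables are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical metadata that tracks one rollback action per created table.
final class SketchMultiTableMetadata
{
    // In-memory stand-in for the tables that exist in Cassandra.
    private final Set<String> tables;
    private final ConcurrentMap<String, Runnable> pendingRollbacks = new ConcurrentHashMap<>();

    SketchMultiTableMetadata(Set<String> tables)
    {
        this.tables = tables;
    }

    void createTable(String name)
    {
        // Reject a second creation of the same table within this transaction.
        if (pendingRollbacks.putIfAbsent(name, () -> tables.remove(name)) != null) {
            throw new IllegalStateException("table creation already pending: " + name);
        }
        // Stand-in for executing CREATE TABLE.
        tables.add(name);
    }

    void finishCreateTable(String name)
    {
        // Success for this table only; other pending tables stay registered.
        if (pendingRollbacks.remove(name) == null) {
            throw new IllegalStateException("no pending table creation: " + name);
        }
    }

    void rollback()
    {
        // Iterate over a snapshot of the keys so removal during iteration is safe.
        for (String name : new ArrayList<>(pendingRollbacks.keySet())) {
            Runnable undo = pendingRollbacks.remove(name);
            if (undo != null) {
                undo.run();
            }
        }
    }
}
```

With this shape, rollback drops only the tables whose creation never finished, and a read-only transaction rolls back as a no-op.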

## Individual Comments

### Comment 1
<location> `presto-cassandra/src/main/java/com/facebook/presto/cassandra/CassandraMetadata.java:374-380` </location>
<code_context>
         return caseSensitiveNameMatchingEnabled ? identifier : identifier.toLowerCase(ROOT);
     }
+
+    public void rollback()
+    {
+        if (!allowDropTable) {
+            throw new PrestoException(
+                    PERMISSION_DENIED, "Table creation was aborted and requires rollback, but cleanup failed because DROP TABLE is disabled in this Cassandra catalog.");
+        }
+        Optional.ofNullable(rollbackAction.getAndSet(null)).ifPresent(Runnable::run);
+    }
+
</code_context>

<issue_to_address>
**issue (bug_risk):** Rollback should only fail with PERMISSION_DENIED when there is an actual rollback action to run

Currently, `rollback()` throws `PrestoException(PERMISSION_DENIED, ...)` whenever `allowDropTable` is false, even if `rollbackAction` is `null`. That means transactions that never created a table will still fail rollback when `allowDropTable` is disabled.

You can avoid this by only enforcing the permission when there’s an actual action to run, e.g.:

```java
public void rollback()
{
    Runnable action = rollbackAction.getAndSet(null);
    if (action == null) {
        return; // nothing to roll back
    }

    if (!allowDropTable) {
        throw new PrestoException(
                PERMISSION_DENIED,
                "Table creation was aborted and requires rollback, but cleanup failed because DROP TABLE is disabled in this Cassandra catalog.");
    }

    action.run();
}
```
</issue_to_address>

### Comment 2
<location> `presto-cassandra/src/test/java/com/facebook/presto/cassandra/TestCassandraConnector.java:143` </location>
<code_context>
+        concurrentCreateTable = new SchemaTableName(database, "concurrent_create_table");
     }

     @Test
</code_context>

<issue_to_address>
**suggestion (testing):** Strengthen `testRollbackTables` by asserting the table exists before rollback to prove cleanup actually happens

Currently `testRollbackTables` only asserts that `rollbackTable` is absent after rollback, which would still pass if the table creation silently failed. To verify that rollback actually cleans up an existing table, add an assertion that the table exists immediately after `beginCreateTable` and before simulating the failure, then keep the final `assertFalse` after `connector.rollback(transactionHandle)` so the test confirms a transition from present to absent.
</issue_to_address>
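The suggested test shape, asserting presence before rollback and absence after, can be sketched against an in-memory fake. Everything here is hypothetical: `SketchCatalog` replaces the real connector and Cassandra server, and `check` is a local helper rather than a TestNG assertion.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical in-memory fake standing in for the connector plus Cassandra.
final class SketchCatalog
{
    private final Set<String> tables = new HashSet<>();
    private String pendingTable;

    void beginCreateTable(String name)
    {
        tables.add(name);
        pendingTable = name;
    }

    void rollback()
    {
        if (pendingTable != null) {
            tables.remove(pendingTable);
            pendingTable = null;
        }
    }

    boolean tableExists(String name)
    {
        return tables.contains(name);
    }
}

final class SketchRollbackTest
{
    static void testRollbackTables()
    {
        SketchCatalog catalog = new SketchCatalog();
        catalog.beginCreateTable("rollback_table");

        // Precondition: without this check the test would also pass
        // if table creation had silently failed.
        check(catalog.tableExists("rollback_table"), "table should exist before rollback");

        // Simulated failure: finishCreateTable is never called.
        catalog.rollback();

        check(!catalog.tableExists("rollback_table"), "table should be gone after rollback");
    }

    private static void check(boolean condition, String message)
    {
        if (!condition) {
            throw new AssertionError(message);
        }
    }
}
```

The two checks together prove a transition from present to absent, which is what distinguishes real cleanup from a creation that never happened.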

### Comment 3
<location> `presto-cassandra/src/test/java/com/facebook/presto/cassandra/TestCassandraConnector.java:105` </location>
<code_context>
     protected SchemaTableName tableUnpartitioned;
     protected SchemaTableName invalidTable;
+    protected SchemaTableName rollbackTable;
+    protected SchemaTableName concurrentCreateTable;
     private CassandraServer server;
-    private ConnectorMetadata metadata;
</code_context>

<issue_to_address>
**suggestion (testing):** Either exercise `concurrentCreateTable` in a test or remove it; consider testing multiple `beginCreateTable` calls in the same transaction

`concurrentCreateTable` is added but never used in tests. Given the new `AtomicReference<Runnable> rollbackAction` and the `checkState` in `setRollback`, this would be a good candidate to verify behavior when `beginCreateTable` is invoked multiple times in the same transaction (e.g., second call throws `IllegalStateException` because a rollback is already set). If you don’t intend to test that scenario, consider removing `concurrentCreateTable` to keep the fixture focused.

Suggested implementation:

```java
    protected SchemaTableName invalidTable;
    protected SchemaTableName rollbackTable;
    private CassandraServer server;

```

Search the remainder of `TestCassandraConnector.java` for any usages or initializations of `concurrentCreateTable` (for example, in `@BeforeClass setup()` or any test methods) and remove those lines as well, since the field is no longer defined.
</issue_to_address>


tdcmeehan previously approved these changes Feb 11, 2026
@pdabre12 pdabre12 merged commit a023a5b into prestodb:master Feb 12, 2026
81 of 82 checks passed
@pdabre12 pdabre12 deleted the fix-cassandra-CTAS branch February 12, 2026 01:05
