Support write large checkpoint in Delta Lake #25078

Merged
raunaqmorarka merged 1 commit into trinodb:master from chenjian2664:delta_fix_large_checkpoint
Feb 19, 2025

Conversation

@chenjian2664
Contributor

@chenjian2664 chenjian2664 commented Feb 19, 2025

Description

Closes #25011

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## Delta Lake
* Fix query failures when writing to delta tables. ({issue}`25011`)

@cla-bot cla-bot bot added the cla-signed label Feb 19, 2025
@github-actions github-actions bot added the delta-lake Delta Lake connector label Feb 19, 2025
@chenjian2664
Contributor Author

Wondering how to add a test.

@raunaqmorarka
Member

> Wondering how to add a test.

Try adding a unit test that has enough entries to exceed the PageBuilder size threshold.

@chenjian2664
Contributor Author

> > Wondering how to add a test.
>
> Try adding a unit test that has enough entries to exceed the PageBuilder size threshold.

How do I know whether the number of entries is sufficient? Are 1k inserts enough?
For instance:

        int size = 1024;
        try (TestTable table = newTrinoTable("test_large_checkpoint", "(data int) WITH (checkpoint_interval = %d)".formatted(size))) {
            for (int i = 0; i < size; i++) {
                System.out.println("current: " + i);
                assertUpdate("INSERT INTO " + table.getName() + " VALUES 1", 1);
            }

            assertQuery("SELECT COUNT(*) FROM " + table.getName(), "VALUES " + size);
        }

Does this make sense?

@raunaqmorarka
Member

> > > Wondering how to add a test.
> >
> > Try adding a unit test that has enough entries to exceed the PageBuilder size threshold.
>
> How do I know whether the number of entries is sufficient? Are 1k inserts enough? For instance:
>
>         int size = 1024;
>         try (TestTable table = newTrinoTable("test_large_checkpoint", "(data int) WITH (checkpoint_interval = %d)".formatted(size))) {
>             for (int i = 0; i < size; i++) {
>                 System.out.println("current: " + i);
>                 assertUpdate("INSERT INTO " + table.getName() + " VALUES 1", 1);
>             }
>
>             assertQuery("SELECT COUNT(*) FROM " + table.getName(), "VALUES " + size);
>         }
>
> Does this make sense?

Just use a debugger to see when the page builder gets full, then use a size significantly above that.
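
The sizing advice above can be illustrated with a small self-contained model. This is only a sketch under stated assumptions: `ThresholdBuffer`, its byte limit, and the fixed per-entry size are hypothetical stand-ins, not Trino's `PageBuilder` or its real threshold. It shows the idea of measuring how many entries it takes to fill the buffer once, then picking a test size several times larger.

```java
import java.util.ArrayList;
import java.util.List;

public class ThresholdSizingSketch
{
    // Hypothetical stand-in for a size-limited page buffer; NOT Trino's PageBuilder.
    static final class ThresholdBuffer
    {
        private final long maxBytes;
        private final List<Integer> pending = new ArrayList<>();
        private long pendingBytes;
        private int flushCount;

        ThresholdBuffer(long maxBytes)
        {
            this.maxBytes = maxBytes;
        }

        // Buffer one int entry and flush as soon as the byte limit is reached.
        void append(int value)
        {
            pending.add(value);
            pendingBytes += Integer.BYTES;
            if (pendingBytes >= maxBytes) {
                flush();
            }
        }

        // Drop the buffered entries (a real writer would write them out here).
        void flush()
        {
            if (pending.isEmpty()) {
                return;
            }
            pending.clear();
            pendingBytes = 0;
            flushCount++;
        }

        int flushCount()
        {
            return flushCount;
        }
    }

    // Count how many entries are appended before the buffer fills for the first time.
    static int entriesUntilFull(long maxBytes)
    {
        ThresholdBuffer buffer = new ThresholdBuffer(maxBytes);
        int entries = 0;
        while (buffer.flushCount() == 0) {
            buffer.append(entries);
            entries++;
        }
        return entries;
    }

    // Choose a test size that is a multiple of the measured fill point.
    static int testSizeFor(long maxBytes, int safetyFactor)
    {
        return entriesUntilFull(maxBytes) * safetyFactor;
    }

    public static void main(String[] args)
    {
        long assumedLimit = 1024; // assumed limit, for illustration only
        System.out.println("entries until full: " + entriesUntilFull(assumedLimit));
        System.out.println("suggested test size: " + testSizeFor(assumedLimit, 4));
    }
}
```

In the real connector, the fill point would come from observing the actual page builder in a debugger, as suggested above, rather than from a model like this.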

@chenjian2664 chenjian2664 force-pushed the delta_fix_large_checkpoint branch from 85383eb to 92f9878 Compare February 19, 2025 11:34
@raunaqmorarka raunaqmorarka merged commit c8e5dba into trinodb:master Feb 19, 2025
@github-actions github-actions bot added this to the 471 milestone Feb 19, 2025
@chenjian2664 chenjian2664 deleted the delta_fix_large_checkpoint branch February 19, 2025 23:13
            throws IOException
    {
        flush();
        writer.close();
Contributor

Do we maybe need this instead, so that the writer is closed even if flush() throws?

        try {
            flush();
        }
        finally {
            writer.close();
        }
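
The effect of the suggested `try`/`finally` can be checked with a small standalone sketch. `RecordingWriter` and `BufferingWriter` are hypothetical classes written for this illustration and are not the classes touched by this PR. The sketch simulates a failing `flush()` and checks that the underlying writer is still closed.

```java
import java.io.Closeable;
import java.io.IOException;

public class FlushThenCloseSketch
{
    // Hypothetical underlying writer that only records whether close() was called.
    static final class RecordingWriter
            implements Closeable
    {
        boolean closed;

        @Override
        public void close()
        {
            closed = true;
        }
    }

    // Hypothetical buffering wrapper whose flush() can be made to fail.
    static final class BufferingWriter
            implements Closeable
    {
        private final RecordingWriter writer;
        private final boolean failOnFlush;

        BufferingWriter(RecordingWriter writer, boolean failOnFlush)
        {
            this.writer = writer;
            this.failOnFlush = failOnFlush;
        }

        void flush()
                throws IOException
        {
            if (failOnFlush) {
                throw new IOException("simulated flush failure");
            }
        }

        @Override
        public void close()
                throws IOException
        {
            try {
                flush();
            }
            finally {
                // Runs whether or not flush() threw, so the resource is released.
                writer.close();
            }
        }
    }

    // Returns true if the underlying writer was closed despite the flush failure.
    static boolean closedAfterFailedFlush()
    {
        RecordingWriter underlying = new RecordingWriter();
        try {
            new BufferingWriter(underlying, true).close();
        }
        catch (IOException expected) {
            // The flush failure still propagates to the caller.
        }
        return underlying.closed;
    }

    // Returns true if the flush failure reaches the caller of close().
    static boolean flushFailurePropagates()
    {
        try {
            new BufferingWriter(new RecordingWriter(), true).close();
            return false;
        }
        catch (IOException expected) {
            return true;
        }
    }

    public static void main(String[] args)
    {
        System.out.println("closed after failed flush: " + closedAfterFailedFlush());
        System.out.println("flush failure propagates: " + flushFailurePropagates());
    }
}
```

One caveat with this pattern: if the `close()` call inside `finally` also throws, its exception replaces the one from `flush()`.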

Labels

cla-signed delta-lake Delta Lake connector

Development

Successfully merging this pull request may close these issues.

DeltaLakeMetadata Failed to write checkpoint for table

3 participants