@jerry-024 jerry-024 commented Nov 3, 2025

Purpose

Format table: support writing files to _temporary first.
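
The general write-to-temporary-then-rename pattern named in the purpose can be sketched as below. This is a minimal illustration built on `java.nio.file`, not Paimon's `FileIO` or `RenamingTwoPhaseOutputStream`; the class name `TempThenRenameSketch`, its method names, and the random temp-file suffix are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

/** Hypothetical illustration of the pattern; not the code from this pull request. */
final class TempThenRenameSketch {

    /** Phase 1: write the bytes under <partitionDir>/_temporary and return the temp file path. */
    static Path writeToTemp(Path partitionDir, String fileName, byte[] data) throws IOException {
        Path tempDir = partitionDir.resolve("_temporary");
        Files.createDirectories(tempDir);
        // A random suffix (an assumption of this sketch) keeps concurrent writers apart.
        Path tempFile = tempDir.resolve(fileName + "." + UUID.randomUUID());
        Files.write(tempFile, data);
        return tempFile;
    }

    /** Phase 2: publish the file by moving it to its final location. */
    static Path commit(Path tempFile, Path target) throws IOException {
        // Fails with FileAlreadyExistsException if the target already exists.
        return Files.move(tempFile, target);
    }
}
```

Until `commit` runs, the final path does not exist, so a reader listing the partition directory sees no partially written data file at the target name.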

Tests

API and Format

Documentation

@jerry-024 jerry-024 marked this pull request as draft November 3, 2025 03:15
@jerry-024 jerry-024 marked this pull request as ready for review November 3, 2025 08:40
.filter(c -> c instanceof RenamingTwoPhaseOutputStream.TempFileCommitter)
.findAny()
.isPresent()) {
if (partitionPaths.size() > 1) {

You should introduce a clean method to TwoPhaseOutputStream.
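
One way to read this suggestion: if the committer interface itself exposes a cleanup hook, the caller no longer needs an `instanceof` check. The sketch below is hypothetical; the interface shape, the `DirectCommitter` class, and `commitAll` are assumptions for illustration and are not taken from Paimon's `TwoPhaseOutputStream`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a cleanup hook on a committer interface. */
final class CleanHookSketch {

    interface Committer {
        void commit();

        /** Default no-op, so committers without temporary state need not override it. */
        default void clean() {}
    }

    /** Stand-in for a committer that renames a temp file and later removes its temp directory. */
    static final class TempFileCommitter implements Committer {
        private final List<String> log;

        TempFileCommitter(List<String> log) {
            this.log = log;
        }

        @Override
        public void commit() {
            log.add("rename");
        }

        @Override
        public void clean() {
            log.add("clean");
        }
    }

    /** Stand-in for a committer that has nothing to clean up. */
    static final class DirectCommitter implements Committer {
        private final List<String> log;

        DirectCommitter(List<String> log) {
            this.log = log;
        }

        @Override
        public void commit() {
            log.add("direct");
        }
    }

    /** Commits everything first, then cleans uniformly, with no type checks. */
    static void commitAll(List<Committer> committers) {
        for (Committer c : committers) {
            c.commit();
        }
        for (Committer c : committers) {
            c.clean();
        }
    }
}
```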

+ commitMessage.getClass().getName());
}
}
Set<Path> partitionPaths = new HashSet<>();

Why extract this field?

public void clean(FileIO fileIO) throws IOException {
    Path path = tempPath.getParent();
    if (fileIO.exists(path)) {
        fileIO.deleteQuietly(path);

Maybe you should use deleteDirectoryQuietly.
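
Since `tempPath.getParent()` is a directory, the cleanup presumably needs a recursive delete. The sketch below shows that idea with plain `java.nio.file` as a stand-in; it is not Paimon's `FileIO.deleteDirectoryQuietly`, and the class `TempDirCleanSketch` and its boolean return convention are assumptions of this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical stand-in for a "delete directory quietly" cleanup. */
final class TempDirCleanSketch {

    /** Recursively deletes dir; returns false instead of throwing when deletion fails. */
    static boolean deleteDirectoryQuietly(Path dir) {
        if (!Files.exists(dir)) {
            return true;
        }
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order visits children before their parent directories.
            List<Path> paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
            for (Path p : paths) {
                Files.delete(p);
            }
            return true;
        } catch (IOException | UncheckedIOException e) {
            return false;
        }
    }

    /** Mirrors the shape of clean() above: remove the parent directory of the temp file. */
    static boolean clean(Path tempFile) {
        Path tempDir = tempFile.getParent();
        return tempDir == null || deleteDirectoryQuietly(tempDir);
    }
}
```

A non-recursive delete on a directory that still contains leftover temp files would fail here, which is why the walk removes the files before the directory itself.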

@jerry-024 jerry-024 force-pushed the format_table_rename branch from 76a2f00 to a082b56 Compare November 3, 2025 09:51
@JingsongLi JingsongLi left a comment

+1

@JingsongLi JingsongLi merged commit 3fd6c29 into apache:master Nov 3, 2025
7 of 22 checks passed
gmdfalk added a commit to gmdfalk/paimon that referenced this pull request Nov 5, 2025
* master: (162 commits)
  [Python] Rename to BATCH_COMMIT_IDENTIFIER in snapshot.py
  [Python] Suppport multi prepare commit in the same TableWrite  (apache#6526)
  [spark] Fix drop temporary view (apache#6529)
  [core] skip validate main branch before orphan files cleaning (apache#6524)
  [core][spark] Introduce upper transform (apache#6521)
  [Python] Keep the variable names of Identifier consistent with Java (apache#6520)
  [core] Remove hash lookup to simplify interface (apache#6519)
  [core][format] Format Table plan partitions should ignore hidden & illegal dirs (apache#6522)
  [hotfix] Print partition spec and type when error in InternalRowPartitionComputer
  [hotfix] Add more informat to check partition spec in InternalRowPartitionComputer
  [hotfix] Use deleteDirectoryQuietly in TempFileCommitter.clean
  [core] format table: support write file in _temporary at first (apache#6510)
  [core] Support non null column with write type (apache#6513)
  [core][fix] Blob with rolling file failed (apache#6518)
  [core][rest] Support schema validation and infer for external paimon table (apache#6501)
  [hotfix] Correct visitors for TransformPredicate
  [hotfix] Rename to copy from withNewInputs in TransformPredicate
  [core][spark] Support push down transform predicate (apache#6506)
  [spark] Implement SupportsReportStatistics for PaimonFormatTableBaseScan (apache#6515)
  [docs] add docs for auto-clustering of historical partitions (apache#6516)
  ...
@jerry-024 jerry-024 deleted the format_table_rename branch November 6, 2025 02:45