Spark: Positional deletes creates unpartitioned path
I was doing some work on the Python side (#6775), but ran into an
issue when creating integration tests for positional deletes. The
path of the delete file ended up with a double slash:

s3://warehouse/default/test_positional_mor_deletes/data//00000-32-70be11f7-3c4b-40e0-b35a-334e97ef6554-00001-deletes.parquet

It looks like the partition struct is non-null even though the spec
is unpartitioned. The writer therefore builds a partitioned path, and
because the struct is empty we end up with a double slash `//`,
which MinIO doesn't like.
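
To make the failure mode concrete, here is a minimal sketch of how an empty partition struct can produce `//`. It is not Iceberg's actual location-provider code: `partitionToPath` and this `newDataLocation` are hypothetical stand-ins, and the sketch assumes the partitioned location is built by joining the data directory, the partition path, and the filename with `/`.

```java
import java.util.List;

class DoubleSlashSketch {
  // Hypothetical helper: renders partition fields as "k1=v1/k2=v2".
  // An empty struct (no fields) renders as the empty string.
  static String partitionToPath(List<String> fieldPairs) {
    return String.join("/", fieldPairs);
  }

  // Hypothetical stand-in for building a partitioned data location.
  static String newDataLocation(String dataDir, List<String> fieldPairs, String filename) {
    return dataDir + "/" + partitionToPath(fieldPairs) + "/" + filename;
  }

  public static void main(String[] args) {
    String dataDir = "s3://warehouse/default/t/data";
    // Empty struct: the middle segment is empty, so the separators collide.
    System.out.println(newDataLocation(dataDir, List.of(), "00000-deletes.parquet"));
    // Non-empty struct: the path is well formed.
    System.out.println(newDataLocation(dataDir, List.of("day=2023-05-22"), "00000-deletes.parquet"));
  }
}
```

With an empty field list the first call yields `s3://warehouse/default/t/data//00000-deletes.parquet`, which has the same shape as the path above.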

OutputFileFactory.java
```java
  public EncryptedOutputFile newOutputFile(PartitionSpec spec, StructLike partition) { // partition is a StructCopy
    String newDataLocation = locations.newDataLocation(spec, partition, generateFilename());
    OutputFile rawOutputFile = io.newOutputFile(newDataLocation);
    return encryptionManager.encrypt(rawOutputFile);
  }
```

ClusteredWriter.java
```java
      // copy the partition key as the key object may be reused
      this.currentPartition = StructCopy.copy(partition);  // partition is a StructProjection
      this.currentWriter = newWriter(currentSpec, currentPartition);
```

I still have to dig into why there is a StructProjection.

Resolves #7678
Fokko committed May 22, 2023
1 parent 65004c1 commit 880cb49
Showing 1 changed file with 1 addition and 1 deletion.
@@ -109,7 +109,7 @@ protected void openCurrentWriter() {
   }

   private EncryptedOutputFile newFile() {
-    if (partition == null) {
+    if (partition == null || (spec != null && spec.isUnpartitioned())) {
       return fileFactory.newOutputFile();
     } else {
       return fileFactory.newOutputFile(spec, partition);
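
The guard added in this change can be exercised on its own. The sketch below uses a hypothetical `Spec` class in place of Iceberg's `PartitionSpec` and a plain `Object` in place of the partition struct. It only mirrors the boolean condition from the diff and assumes nothing else about the writer.

```java
class NewFileGuardSketch {
  // Hypothetical minimal stand-in for a partition spec.
  static final class Spec {
    private final int fieldCount;

    Spec(int fieldCount) {
      this.fieldCount = fieldCount;
    }

    boolean isUnpartitioned() {
      return fieldCount == 0;
    }
  }

  // Same condition as the patched newFile(): take the unpartitioned path when
  // there is no partition, or when the spec itself has no partition fields.
  static boolean useUnpartitionedPath(Spec spec, Object partition) {
    return partition == null || (spec != null && spec.isUnpartitioned());
  }

  public static void main(String[] args) {
    Object emptyStruct = new Object(); // non-null, like the copied empty struct
    System.out.println(useUnpartitionedPath(new Spec(0), emptyStruct)); // true
    System.out.println(useUnpartitionedPath(new Spec(1), emptyStruct)); // false
    System.out.println(useUnpartitionedPath(null, null));               // true
  }
}
```

The first case is the one from the report: before the change only `partition == null` was checked, so a non-null but empty struct fell through to the partitioned branch.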
