Convert Iceberg to TrinoFileSystem interface #13530
Consider marking these classes as thread-safe or not. I believe only the provider is thread-safe and the rest are considered single-threaded.
Actually, I think that everything should be thread safe, including TrinoInput since it only has positioned read methods. The only thing not safe would be the FileIterator returned from FileSystem.listFiles() but that seems expected.
Did you see something not thread safe?
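To illustrate the argument about positioned reads, here is a minimal sketch built on `java.nio.channels.FileChannel`. The `PositionedInput` class is a hypothetical stand-in, not Trino's `TrinoInput`; it only shows why an input whose reads all carry an explicit offset has no shared cursor for concurrent callers to race on.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical stand-in for an input that offers positioned reads only.
final class PositionedInput implements AutoCloseable {
    private final FileChannel channel;

    PositionedInput(Path path) throws IOException {
        this.channel = FileChannel.open(path, StandardOpenOption.READ);
    }

    // Each call names its own offset, so no shared cursor is read or moved.
    byte[] readFully(long position, int length) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(length);
        while (buffer.hasRemaining()) {
            // FileChannel.read(ByteBuffer, long) leaves the channel's own position untouched.
            int read = channel.read(buffer, position + buffer.position());
            if (read < 0) {
                throw new EOFException("End of file reached before " + length + " bytes were read");
            }
        }
        return buffer.array();
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

An iterator, by contrast, advances internal state on every `next()` call, which is consistent with the expectation above that the `FileIterator` is the one piece not safe to share.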
beb9454 to 561b66f
0beb2ca to 6960816
    hdfsClient.delete(schemaDir);
    }

    @Test(groups = ICEBERG) // make sure empty directories are noticed as well
Why was this test removed?
    public IcebergAvroPageSource(
            FileIO fileIo,
            String path,
            InputFile file,

    public IcebergAvroFileWriter(
            FileIO fileIo,
            Path path,
            OutputFile file,

    public IcebergFileWriter createDataFileWriter(
            Path outputPath,
            TrinoFileSystem fileSystem,
            String outputPath,
Why String for outputPath?
I see that FileIO expects a String for its newOutputFile(String) method, but I am wondering why working with a rather generic type is better than Path.
    Path path = new Path(location);
    FileSystem fileSystem = hdfsEnvironment.getFileSystem(hdfsContext, path);
    if (fileSystem.exists(path) && fileSystem.listFiles(path, true).hasNext()) {
    if (fileSystem.listFiles(location).hasNext()) {
Should we add a boolean recursive parameter to TrinoFileSystem#listFiles(String) for more transparency? Alternatively, rename the method to listFilesRecursively.
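The two API shapes suggested here can be compared in a small sketch over the local file system using `java.nio.file`. `ListingSketch` and both method names are hypothetical and are not part of `TrinoFileSystem`. Returning an empty list for a missing directory is an assumption of this sketch, modeled on the diff above, which dropped the explicit `exists` check and the `true` recursion argument.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper contrasting an explicit flag with a descriptive method name.
final class ListingSketch {
    private ListingSketch() {}

    // Option 1: the caller states the traversal depth with a flag.
    static List<Path> listFiles(Path directory, boolean recursive) throws IOException {
        if (!Files.isDirectory(directory)) {
            // Assumption: a missing location lists as empty instead of failing.
            return List.of();
        }
        try (Stream<Path> entries = recursive ? Files.walk(directory) : Files.list(directory)) {
            return entries.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
    }

    // Option 2: the method name itself documents the recursion.
    static List<Path> listFilesRecursively(Path directory) throws IOException {
        return listFiles(directory, true);
    }
}
```

Either shape makes the traversal depth visible at the call site, which is the transparency concern raised in the comment.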
    InputFile file;
    OptionalLong fileModifiedTime = OptionalLong.empty();
    try {
        file = fileSystem.toFileIo().newInputFile(inputFile.location(), inputFile.length());
toFileIo() returns a Closeable.
Therefore let's use try-with-resources.
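A minimal sketch of the try-with-resources suggestion follows. `TrackingFileIo` is a hypothetical `Closeable` that stands in for whatever `toFileIo()` returns; it is not the Iceberg `FileIO` API, and the returned string merely represents an input handle.

```java
import java.io.Closeable;

// Hypothetical Closeable that records whether it has been closed.
final class TrackingFileIo implements Closeable {
    private boolean closed;

    String newInputFile(String location) {
        if (closed) {
            throw new IllegalStateException("FileIo already closed");
        }
        return "input:" + location;
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true;
    }
}

final class TryWithResourcesSketch {
    private TryWithResourcesSketch() {}

    static String openInput(TrackingFileIo source, String location) {
        // The resource is closed when the block exits, normally or by exception.
        try (TrackingFileIo fileIo = source) {
            return fileIo.newInputFile(location);
        }
    }
}
```

Whether an `InputFile` obtained this way stays usable after its `FileIO` is closed is not settled in this thread, so the scope of the try block would need to cover every use of the file.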
    FileEntry next()
            throws IOException;

    static FileIterator empty()
nit: since the returned instance has no state, we can make it static
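One way to read this nit is to hold the stateless empty iterator in a single shared constant and return it from `empty()`. The sketch below uses a hypothetical `SketchFileIterator` interface with `String` entries in place of `FileEntry`; it is not the PR's actual `FileIterator`.

```java
import java.io.IOException;
import java.util.NoSuchElementException;

// Hypothetical iterator interface mirroring the shape shown in the diff.
interface SketchFileIterator {
    // Interface fields are implicitly public static final, so this is one shared instance.
    SketchFileIterator EMPTY = new SketchFileIterator() {
        @Override
        public boolean hasNext() {
            return false;
        }

        @Override
        public String next() {
            throw new NoSuchElementException();
        }
    };

    boolean hasNext() throws IOException;

    String next() throws IOException;

    // Returns the shared instance instead of allocating a new object per call.
    static SketchFileIterator empty() {
        return EMPTY;
    }
}
```

Because the instance holds no state, sharing it across callers and threads is harmless.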
Overall this PR brings a more elegant approach to dealing with the file system. The only thing I'm not sure is an improvement is using ...
Description
refactoring
Related issues, pull requests, and links
First commit is extracted to #13361
Documentation
(x) No documentation is needed.
Release notes
(x) No release notes entries required.