-
Notifications
You must be signed in to change notification settings - Fork 2.9k
Core: Delegate bulk and prefix operations in ResolvingFileIO #8169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
3ac5713
2aace57
2d129e9
df3e850
2009bd2
19230d5
9d73070
f7b4530
f6ea325
d4c591f
8e6d9a1
839da9d
065dd0b
076aa43
ce32d5c
109d385
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,25 @@ | ||
| /* | ||
| * Licensed to the Apache Software Foundation (ASF) under one | ||
| * or more contributor license agreements. See the NOTICE file | ||
| * distributed with this work for additional information | ||
| * regarding copyright ownership. The ASF licenses this file | ||
| * to you under the Apache License, Version 2.0 (the | ||
| * "License"); you may not use this file except in compliance | ||
| * with the License. You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, | ||
| * software distributed under the License is distributed on an | ||
| * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| * KIND, either express or implied. See the License for the | ||
| * specific language governing permissions and limitations | ||
| * under the License. | ||
| */ | ||
| package org.apache.iceberg.io; | ||
|
|
||
| /** | ||
| * This interface is intended as an extension for FileIO implementations that support being a | ||
| * delegate target. | ||
| */ | ||
| public interface DelegateFileIO extends FileIO, SupportsPrefixOperations, SupportsBulkOperations {} |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -19,8 +19,16 @@ | |
| package org.apache.iceberg.io; | ||
|
|
||
| import static org.assertj.core.api.Assertions.assertThat; | ||
| import static org.assertj.core.api.Assertions.assertThatCode; | ||
| import static org.assertj.core.api.Assertions.assertThatThrownBy; | ||
| import static org.mockito.ArgumentMatchers.any; | ||
| import static org.mockito.Mockito.doReturn; | ||
| import static org.mockito.Mockito.mock; | ||
| import static org.mockito.Mockito.spy; | ||
| import static org.mockito.Mockito.withSettings; | ||
|
|
||
| import java.io.IOException; | ||
| import org.apache.hadoop.conf.Configuration; | ||
| import org.apache.iceberg.TestHelpers; | ||
| import org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap; | ||
| import org.junit.jupiter.api.Test; | ||
|
|
@@ -47,4 +55,23 @@ public void testResolvingFileIOJavaSerialization() throws IOException, ClassNotF | |
| FileIO roundTripSerializedFileIO = TestHelpers.roundTripSerialize(testResolvingFileIO); | ||
| assertThat(roundTripSerializedFileIO.properties()).isEqualTo(testResolvingFileIO.properties()); | ||
| } | ||
|
|
||
| @Test | ||
| public void testEnsureInterfaceImplementation() { | ||
| ResolvingFileIO testResolvingFileIO = spy(new ResolvingFileIO()); | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this is ok, but what I was hoping we would have coverage for was that all delegates properly worked. This may be very difficult, I'm just looking for a way to confirm that the class will work at runtime. If we can force all the checks at compile time I'd be ok with that as well.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We wouldn't be able to test the actual delegates without having the dependencies here (i.e. iceberg-aws and iceberg-gcp). I'll see if a service pattern could work where the delegate types register themselves with ResolvingFileIO, then we can have the compile time check.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I looked into using Java SPI, but I don't think it is a good idea as it could cause problems with shadow jars. Also it isn't much different than this solution. We can't register with ResolvingFileIO from the delegate class as the delegate class might not be loaded at all yet. One option, somewhat similar to #7976 would be to remove
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I added tests for each of the delegate types to ensure they load. Though that doesn't prevent a new class from being added to the list that doesn't implement the interface. |
||
| testResolvingFileIO.setConf(new Configuration()); | ||
| testResolvingFileIO.initialize(ImmutableMap.of()); | ||
|
|
||
| String fileIONoMixins = mock(FileIO.class).getClass().getName(); | ||
| doReturn(fileIONoMixins).when(testResolvingFileIO).implFromLocation(any()); | ||
| assertThatThrownBy(() -> testResolvingFileIO.newInputFile("/file")) | ||
| .isInstanceOf(IllegalStateException.class); | ||
|
|
||
| String fileIOWithMixins = | ||
| mock(FileIO.class, withSettings().extraInterfaces(DelegateFileIO.class)) | ||
| .getClass() | ||
| .getName(); | ||
| doReturn(fileIOWithMixins).when(testResolvingFileIO).implFromLocation(any()); | ||
| assertThatCode(() -> testResolvingFileIO.newInputFile("/file")).doesNotThrowAnyException(); | ||
| } | ||
| } | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did we resolve this? I think we probably want to pass the iterable here rather than the iterator to make sure we close everything properly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No not yet, I'm still thinking through the best solution. Passing the original iterable could cause an additional request to be made, i.e. one additional request for the peek, so I was hoping to avoid that.