-
Notifications
You must be signed in to change notification settings - Fork 2.9k
Aliyun: Mock aliyun OSS(Object Storage Service) #3067
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
openinx
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @xingbowu to make a smaller PR for the Aliyun OSS integration work. I skimmed the whole PR, this PR is trying to introduce the aliyun oss mock app server, so that we could build oss test cases on top of it (We don't need to mock all the called OSS API in a relative complex test cases , such as multi-upload test cases). It's a good thing for us to introduce a simple simulator to align the local mock app and aliyun online oss server.
The most important thing for me is: How do we guarantee the local mock app is aligning correctly to the aliyun online oss server ? In the parent PR #2230 , we introduced a class test rule named OSSTestRule and the rule has two different impl:
- OSSMockRule: The rule will start a local mini aliyun oss server, which could serving the remote OSS client http requests.
- OSSIntegrationTestRule: The rule will prepare testing buckets in the remote oss server, so that the test cases could write real data to.
For the local oss application, we provide tests cases in TestLocalOSS , which was designed to run different TestRule according to the intentional environment variables (will run local mock app by default). If the tests could be passes on both local env and remote oss env, then we could definitely ensure the local oss app is implemented correctly.
aliyun/src/test/java/org/apache/iceberg/aliyun/oss/mock/AliyunOSSMockApp.java
Outdated
Show resolved
Hide resolved
aliyun/src/test/java/org/apache/iceberg/aliyun/oss/mock/AliyunOSSMockApp.java
Outdated
Show resolved
Hide resolved
6192cd6 to
d832cde
Compare
Thanks a lot for pointing out missing part. I have done rework and added local test here to guarantee quality. comparing with 2230, implemented more test case, such as range get. Additionally, followed up your review comments to simulate basic aliyun oss behavior including create/put/delete and excluded multi-parts , feel free to let me know if you have further comments. |
aliyun/src/test/java/org/apache/iceberg/aliyun/oss/util/AliyunOSSURI.java
Outdated
Show resolved
Hide resolved
aliyun/src/test/java/org/apache/iceberg/aliyun/oss/AliyunOSSTestRule.java
Show resolved
Hide resolved
aliyun/src/test/java/org/apache/iceberg/aliyun/oss/AliyunOSSTestRule.java
Show resolved
Hide resolved
aliyun/src/test/java/org/apache/iceberg/aliyun/oss/AliyunOSSTestRule.java
Outdated
Show resolved
Hide resolved
aliyun/src/test/java/org/apache/iceberg/aliyun/oss/mock/AliyunOSSMockRule.java
Outdated
Show resolved
Hide resolved
aliyun/src/test/java/org/apache/iceberg/aliyun/oss/mock/AliyunOSSMockRule.java
Outdated
Show resolved
Hide resolved
| public void tearDownBucket(String bucket) { | ||
| try { | ||
| Files.walk(rootDir().toPath()) | ||
| .filter(p -> p.toFile().isFile()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why if we have a bucket named test-bucket, and have a object named path/to/a.dat, Then the full local path will be <root-dir>/test-bucket/path/to/a.dat, will we remove the <root-dir>/test-bucket/path/to directory also ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In aliyun OSS, the <root-dir>/test-bucket/path/to won't be a specific directory in the real production, but in our local OSS storage app, it will be a real directory (though it's not a real object that people could see by using aliyun OSS SDK).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding to the scenario you list above, the directories are deleted in the deleteBucket. However, we indeed have problem in another scenario. if we create an object with name "aa/bb/cc.txt", and then remove it. after that we create a new object with name "aa/bb". it has problem because of deleting cc.txt in current logic only. I will fix it in separate PR.
aliyun/src/test/java/org/apache/iceberg/aliyun/oss/mock/AliyunOSSMockRule.java
Outdated
Show resolved
Hide resolved
aliyun/src/test/java/org/apache/iceberg/aliyun/oss/mock/AliyunOSSMockApp.java
Outdated
Show resolved
Hide resolved
| public class TestLocalAliyunOSS { | ||
|
|
||
| @ClassRule | ||
| public static final AliyunOSSTestRule OSS_TEST_RULE = AliyunOSSMockRule.builder().silent().build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to make this TestRule to be configurable so that we could verify this unit tests on both local mock oss services and remote aliyun OSS environment, to guarantee that the local oss app has the same semantics as remote aliyun OSS environment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xingbowu How is feeling about this comment ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I will implement it in the next PR, which include both local and remote part
aliyun/src/test/java/org/apache/iceberg/aliyun/oss/mock/TestLocalAliyunOSS.java
Show resolved
Hide resolved
aliyun/src/test/java/org/apache/iceberg/aliyun/oss/mock/TestLocalAliyunOSS.java
Show resolved
Hide resolved
|
|
||
| @Test | ||
| public void testGetObjectWithRange() throws IOException { | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra empty line ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I think this line should be removed because it seems not match the whole iceberg project code style , even if we don't put it into the check style rule set.
aliyun/src/test/java/org/apache/iceberg/aliyun/oss/mock/TestLocalAliyunOSS.java
Outdated
Show resolved
Hide resolved
aliyun/src/test/java/org/apache/iceberg/aliyun/oss/mock/AliyunOSSMockLocalStore.java
Outdated
Show resolved
Hide resolved
|
|
||
| Bucket getBucket(String bucketName) { | ||
| List<Bucket> buckets = findBucketsByFilter(file -> | ||
| Files.isDirectory(file) && file.getFileName().endsWith(bucketName)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
file.getFileName().endsWith(bucketName)
It's incorrect to use the file.getFileName().endsWith(bucketName) to check its bucket name, because for a bucket test-bucket, it's possible that we will have an object with name /path/to/test-bucket/a.txt. In that case, we will create a directory <root-dir>/path/to/test-bucket though it's no an object being visiable to OSS SDK, but it will fail to check the existence of this given bucket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
newDirectoryStream only traverses the directory&file under root path. the object with name /test-bucket/a.txt, is under /test-bucket(bucket name)/path/to/test-bucket(object prefix name)/a.txt. so it is no problem here.
openinx
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xingbowu Thanks for the updating, I just token another round and left some comments !
27f1168 to
fcef7ac
Compare
|
The broken unit test is : Looks like it's unrelated to this PR, let's reopen to trigger the travis CI once again. |
|
@openinx Thanks for your effort and comments, I have reworked them with latest code, feel free to take further review |
| compileOnly 'javax.xml.bind:jaxb-api' | ||
| compileOnly 'javax.activation:activation' | ||
| compileOnly 'org.glassfish.jaxb:jaxb-runtime' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to introduce those three jars when adding a testing framework under the aliyun/src/test module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
per aliyun oss sdk documentation, https://help.aliyun.com/document_detail/32009.html
jaxb related dependencies need to be added under java 9 and plus version environment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, then I will recommend to package all the dependencies jar into a single bundled iceberg-aliyun jar. Because in flink SQL client, we will need to add each jar in the shell as the following:
./bin/sql-client.sh embedded \
-j <iceberg-aliyu>.jar \
-j <aliyun-oss>.jar \
-j <javax.xml.bind:jaxb-api>.jar \
-j <org.glassfish.jaxb:jaxb-runtime>.jar
shellIt will be quite tedious for people to add jars one by one to make the iceberg job works ( aws-sdk don't need to package all of them into a bundled jar because aws sdk has provided it ( see document).
| javax.xml.bind:jaxb-api = 2.3.1 | ||
| javax.activation:activation = 1.1.1 | ||
| org.glassfish.jaxb:jaxb-runtime = 2.3.3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we need to three extra jars ? I passed the test cases after removing them ..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above comment
aliyun/src/test/java/org/apache/iceberg/aliyun/oss/AliyunOSSTestRule.java
Outdated
Show resolved
Hide resolved
|
I left several comments, almost looks good to me. @rdblue @jackye1995 would you like to have a double check ? |
26aa8a8 to
c492ec0
Compare
|
|
||
| Banner.Mode bannerMode = Banner.Mode.CONSOLE; | ||
|
|
||
| if (Boolean.parseBoolean(String.valueOf(properties.remove("silent")))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Here we can replace the silent by the PROP_SILENT.
|
@xingbowu , Could you please file new issues for the above comment ([1], [2], [3]) and put them in this project dashboard , so that we could easily track the whole progress ? I plan to get this PR merged once we've tracked those issues and fixed the other minor comments, thanks for the work ! [1] #3067 (comment) |
Issue #3180 opened |
openinx
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM !
Aliyun OSS is popular in public cloud service, especially the users from china. Many use cases of integrating iceberg make aliyun oss as backend storage. So it benefits community to integrate iceberg with aliyun oss. I would like to contribute several PRs and complete this work. Here is 1st step : Mock aliyun OSS in UT
Aliyun OSS SDK doesn't support mock local environment, and no any plan to develop this feature recently.
To make unit test of iceberg integration with oss efficiently, this PR mocks a local lightweight aliyun oss behavior for UT similar with s3mock