Skip to content

Conversation

@adoroszlai
Copy link
Contributor

What changes were proposed in this pull request?

Fix a bug where attempting to create empty file in FSO bucket via S3 Gateway results in creation of a directory.

$ ozone sh bucket create --layout FILE_SYSTEM_OPTIMIZED /s3v/fso-bucket

$ hdfs dfs -touch s3a://fso-bucket/empty.s3a

$ ozone sh key info /s3v/fso-bucket/empty.s3a
{
  "name" : "empty.s3a/",
  "dataSize" : 0,
  ...
  "ozoneKeyLocations" : [ ],
  "file" : false
}

The same works OK in OBJECT_STORE and LEGACY buckets.

It was broken by HDDS-7594, which fixed a bug where attempt to create directory through S3G created files instead. In other words, prior behavior was also wrong, just the other way around.

https://issues.apache.org/jira/browse/HDDS-10570

How was this patch tested?

Added test case in unit test (TestObjectPut).

Changed acceptance tests to run tests on all three bucket layout types.

Also tested manually:

$ ozone sh bucket create --layout FILE_SYSTEM_OPTIMIZED /s3v/fso-bucket

$ export AWS_ACCESS_KEY_ID=any AWS_SECRET_KEY=any
$ export OZONE_CLASSPATH=aws-java-sdk-bundle-1.12.367.jar:hadoop-aws-3.3.6.jar

$ ozone fs -Dfs.s3a.path.style.access=true -Dfs.s3a.endpoint=http://s3g:9878 \
    -touch s3a://fso-bucket/empty.file

$ ozone fs -Dfs.s3a.path.style.access=true -Dfs.s3a.endpoint=http://s3g:9878 \
    -mkdir s3a://fso-bucket/dir

$ ozone fs -ls ofs://om/s3v/fso-bucket/
Found 2 items
drwxrwxrwx   - hadoop hadoop          0 2024-03-27 21:15 ofs://om/s3v/fso-bucket/dir
-rw-rw-rw-   3 hadoop hadoop          0 2024-03-27 21:15 ofs://om/s3v/fso-bucket/empty.file

$ ozone sh key info /s3v/fso-bucket/dir/
{
  "name" : "dir/",
  "dataSize" : 0,
  ...
  "ozoneKeyLocations" : [ ],
  "file" : false
}

$ ozone sh key info /s3v/fso-bucket/empty.file
{
  "name" : "empty.file",
  "dataSize" : 0,
  ...
  "ozoneKeyLocations" : [ ],
  "file" : true
}

CI:
https://github.com/adoroszlai/ozone/actions/runs/8456929496

@adoroszlai adoroszlai added the s3 S3 Gateway label Mar 27, 2024
@adoroszlai adoroszlai self-assigned this Mar 27, 2024
@adoroszlai
Copy link
Contributor Author

@mladjan-gadzic please take a look if you have time

@adoroszlai
Copy link
Contributor Author

adoroszlai commented Mar 27, 2024

@ivandika3 @SaketaChalamchala can you please review?

@adoroszlai adoroszlai requested a review from errose28 March 27, 2024 21:32
@mladjan-gadzic
Copy link
Contributor

@mladjan-gadzic please take a look if you have time

@adoroszlai, thanks for working on this. Looks like an oversight in HDDS-7594 is fixed here. I'd just like to confirm one thing. Is it possible to receive put request for dir creation without trailing slash?

@errose28
Copy link
Contributor

Is it possible to receive put request for dir creation without trailing slash?

I recall the spec being a bit unclear here when we did HDDS-7594. I can't think of another system that has a client and server that each support directories and allows them to communicate through a protocol that does not support directory creation. If the spec is that an object must be empty and end in a slash to be considered a directory when passing through the client and server S3 layers then this change makes sense. I guess the way to verify this is correct is to check the s3a code and verify that it always translates mkdir to an object with a trailing slash in the name, regardless of whether the slash was present in the name provided. If it does not do this, then the config key may be the only way to distinguish the two cases.

@adoroszlai
Copy link
Contributor Author

adoroszlai commented Mar 28, 2024

Thanks @errose28, @mladjan-gadzic for the quick response.

I plan to add S3A's implementation of Hadoop contract tests to CI in HDDS-8450, but it's not yet ready. HDDS-10572 is required for speeding up the tests (maybe even fixing some).

Until then, some partial results with the patch:

Tests run: 20, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.789 s - in org.apache.hadoop.fs.contract.s3a.ITestS3AContractGetFileStatus

On current master:

Tests run: 20, Failures: 6, Errors: 1, Skipped: 0, Time elapsed: 22.714 s <<< FAILURE! - in org.apache.hadoop.fs.contract.s3a.ITestS3AContractGetFileStatus
testListLocatedStatusFiltering(org.apache.hadoop.fs.contract.s3a.ITestS3AContractGetFileStatus)  Time elapsed: 1.321 s  <<< ERROR!
org.apache.hadoop.fs.FileAlreadyExistsException: s3a://fso-bucket/test/file-1.txt is a directory

testListLocatedStatusFile(org.apache.hadoop.fs.contract.s3a.ITestS3AContractGetFileStatus)  Time elapsed: 0.753 s  <<< FAILURE!
java.lang.AssertionError: size of file list returned expected:<1> but was:<0>

testListStatusFiltering(org.apache.hadoop.fs.contract.s3a.ITestS3AContractGetFileStatus)  Time elapsed: 1.483 s  <<< FAILURE!
java.lang.AssertionError: length of listStatus(s3a://fso-bucket/test/file-1.txt, org.apache.hadoop.fs.contract.AbstractContractGetFileStatusTest$AllPathsFilter@cfd1779 ) [] expected:<1> but was:<0>

testListFilesFile(org.apache.hadoop.fs.contract.s3a.ITestS3AContractGetFileStatus)  Time elapsed: 1.496 s  <<< FAILURE!
java.lang.AssertionError: size of file list returned expected:<1> but was:<0>

testListStatusIteratorFile(org.apache.hadoop.fs.contract.s3a.ITestS3AContractGetFileStatus)  Time elapsed: 0.618 s  <<< FAILURE!
java.lang.AssertionError: 
[size of file list returned using hasNext() and next() calls should be 1] 
Expected size:<1> but was:<0> in:
<[]>

testListStatusFile(org.apache.hadoop.fs.contract.s3a.ITestS3AContractGetFileStatus)  Time elapsed: 1.239 s  <<< FAILURE!
java.lang.AssertionError: expected:<1> but was:<0>

testListFilesFileRecursive(org.apache.hadoop.fs.contract.s3a.ITestS3AContractGetFileStatus)  Time elapsed: 0.842 s  <<< FAILURE!
java.lang.AssertionError: size of file list returned expected:<1> but was:<0>

Also tried some other tests that pass both with and without the patch:

Tests run: 16, Failures: 0, Errors: 0, Skipped: 4, Time elapsed: 12.325 s - in org.apache.hadoop.fs.contract.s3a.ITestS3AContractCreate
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 17.392 s - in org.apache.hadoop.fs.contract.s3a.ITestS3AContractDelete
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 244.242 s - in org.apache.hadoop.fs.contract.s3a.ITestS3AContractMkdir
Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.666 s - in org.apache.hadoop.fs.contract.s3a.ITestS3AContractRootDir

All results above are for FSO bucket. These tests (including ITestS3AContractGetFileStatus) also pass with/without the patch on OBJECT_STORE and LEGACY buckets.

Copy link
Contributor

@errose28 errose28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this @adoroszlai. Having the s3a contract tests run against Ozone to check these cases will be a great improvement in the future as well.

@adoroszlai adoroszlai merged commit 85c9c97 into apache:master Mar 29, 2024
@adoroszlai adoroszlai deleted the HDDS-10570 branch March 29, 2024 08:31
@adoroszlai
Copy link
Contributor Author

Thanks @errose28, @mladjan-gadzic for the review.

I'd just like to confirm one thing. Is it possible to receive put request for dir creation without trailing slash?

All requests in mkdir contract tests seem to have trailing slash.

jojochuang pushed a commit to jojochuang/ozone that referenced this pull request May 29, 2024
xichen01 pushed a commit to xichen01/ozone that referenced this pull request Aug 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

s3 S3 Gateway

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants