Skip to content

Conversation

@xichen01
Copy link
Contributor

@xichen01 xichen01 commented Oct 31, 2023

What changes were proposed in this pull request?

To fix the problem of abnormal dirty data overwriting normal data when S3G writes data. Make Ozone S3 puts atomic writes, all successes total or failures
The main implementation is to check whether the data of the key is complete or not when committing, if the data written by the key is different from the expected size, the key is considered incomplete, and the commit operation is not executed.

  @Override
  public synchronized void close() throws IOException {
//...
    try {
//...
      if (requiresAtomicWrite) {
        long expectedSize = blockOutputStreamEntryPool.getDataSize();
        Preconditions.checkArgument(expectedSize == offset,
            String.format("Expected: %d and actual %d write sizes do not match",
                expectedSize, offset));
      }
      blockOutputStreamEntryPool.commitKey(offset);
    } finally {
      blockOutputStreamEntryPool.cleanup();
    }
  }
  • The requiresAtomicWrite is only enable in S3G, it will be set in EndpointBase#initialization , just like the s3Auth
  • The blockOutputStreamEntryPool.getDataSize() is the dataSize that was passed in when the S3G created the key (including the MPU key).
  • The dataSize is the
    • Content-Length: normal put key
    • x-amz-decoded-content-length: Stream mode uploaded key.
    • x-amz-copy-source-range: The length specified in the MPU copy Range (Ozone does not currently support Copy Ranges for non-MPU keys.).
    • Key length of source key: copy request without Range

PS:
refer to AWS put doc https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html

Amazon S3 never adds partial objects; if you receive a success response, Amazon S3 added the entire object to the bucket.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-9526

How was this patch tested?

unit test, manually test.
manually test.

[root@VM-8-3-centos ~]$ aws configure set default.s3.multipart_threshold 1GB
[root@VM-8-3-centos ~]$ aws s3 --endpoint-url http://localhost:9878 cp ~/500M.img s3://bucket1/500M.img
^Ccancelled: ctrl-c received 
[root@VM-8-3-centos /tmp]$ aws s3 ls s3://bucket1/ --endpoint-url http://localhost:9878
// A key that has not been completely written should not be visible.

@xichen01 xichen01 marked this pull request as draft October 31, 2023 15:57
# Conflicts:
#	hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/BlockDataStreamOutputEntryPool.java
#	hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/BlockOutputStreamEntryPool.java
#	hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java
#	hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java

copyHeader = headers.getHeaderString(COPY_SOURCE_HEADER);
if (copyHeader != null) {
String range =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we add this COPY_SOURCE_HEADER_RANGE check in the put() L253 too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ozone does not currently support copy ranges for non-MPU keys so Ozone will copy the entire key regardless of the value of COPY_SOURCE_HEADER_RANGE.

Maybe we can add this when we need to support copy ranges.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the CopyObject document, it looks like range copy is not supported.

https://docs.aws.amazon.com/AmazonS3/latest/API/API_CopyObject.html

* A mismatch will prevent the commit from succeeding.
* This is essential for operations like S3 put to ensure atomicity.
*/
private boolean requiresAtomicWrite;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest change the requiresAtomicWrite to atomicKeyCreation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}
if (requiresAtomicWrite) {
long expectedSize = blockOutputStreamEntryPool.getDataSize();
Preconditions.checkArgument(expectedSize == offset,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest to use Preconditions.checkState here, so it will throw out IllegalStateException instead of
IllegalArgumentException. IllegalStateException is more appropriate than IllegalArgumentException here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And please catch this exception in ObjectEndpoint#put finally when OutputStream#close is called.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

objectEndpoint.setHeaders(headers);
objectEndpoint.setClient(client);
objectEndpoint.setOzoneConfiguration(new OzoneConfiguration());
objectEndpoint.setHeaders(headers);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is duplicate with L150.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

objectEndpoint.setHeaders(headers);
String keyName = UUID.randomUUID().toString();

String chunkedContent = "0a;chunk-signature=signature\r\n"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xichen01 , could you explain a little bit about this chunkedContent? Don't understand why this string length is 15.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The length of "1234567890" + "abcde" is 15, 05;chunk-signature=signature is some metadata.

I referenced it here:

Assert.assertEquals(200, response.getStatus());
Assert.assertEquals("1234567890abcde", keyContent);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay.

@ChenSammi
Copy link
Contributor

@xichen01 , thanks for provide the fix so quickly. Is it possible to do more manual tests, such as multi upload, and range multi upload case?

@xichen01
Copy link
Contributor Author

xichen01 commented Nov 1, 2023

@xichen01 , thanks for provide the fix so quickly. Is it possible to do more manual tests, such as multi upload, and range multi upload case?

OK, I will do more test for this Patch

@kerneltime
Copy link
Contributor

There are few test failures in your fork, can you please take a look https://github.com/xichen01/ozone/actions/runs/6717923197/job/18257546921

@kerneltime kerneltime requested a review from duongkame November 1, 2023 21:13
@kerneltime
Copy link
Contributor

@SaketaChalamchala @tanvipenumudy can you please take a look?

try {
output.close();
} catch (IllegalStateException ex) {
LOG.error(String.format(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can occur due to client behavior and we should not be logging at level error a client with a bad connection could fill up the S3G logs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The metric at line 320 should be moved below into the finally block.

@kerneltime
Copy link
Contributor

Can you look into the test failure in your branch https://github.com/xichen01/ozone/actions/runs/6773687585/job/18410561865

@xichen01
Copy link
Contributor Author

xichen01 commented Nov 7, 2023

Can you look into the test failure in your branch https://github.com/xichen01/ozone/actions/runs/6773687585/job/18410561865

The failed test are:

Prepare Ozone Manager                                                 | FAIL |
Test timeout 5 minutes exceeded.
--
Prepare :: Prepares OMs                                               | FAIL |
1 test, 0 passed, 1 failed

This may have been an unstable test, I didn't change the OM code, when I reran the test, the test succeeded.
https://github.com/xichen01/ozone/actions/runs/6773687585

@xichen01 xichen01 marked this pull request as ready for review November 7, 2023 03:44
@guohao-rosicky
Copy link
Contributor

Thanks @xichen01 working on this,
I think it's a good change to just add requiresAtomicWrite to s3g and add an assertion on close, if there's an assertion error, it also goes into finally for cleanup and doesn't result in resource leakage.

Elegant code and detailed testing, LGTM.

@adoroszlai
Copy link
Contributor

@ChenSammi @kerneltime @SaketaChalamchala please take a look at the latest patch

Copy link
Contributor

@ChenSammi ChenSammi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last patch looks good to me, +1

@kerneltime , @SaketaChalamchala , could you like to take another look?

@ChenSammi ChenSammi merged commit e14fb8e into apache:master Nov 9, 2023
@ChenSammi
Copy link
Contributor

Thanks @xichen01 for this contribution, @kerneltime @adoroszlai @SaketaChalamchala for the code review.

ibrusentsev pushed a commit to ibrusentsev/ozone that referenced this pull request Nov 14, 2023
jojochuang pushed a commit to jojochuang/ozone that referenced this pull request Feb 1, 2024
…se data loss in case of an exception. (apache#5524)

(cherry picked from commit e14fb8e)
Change-Id: Iaa64909d76ec3bd129797ff54f46f19430e8dba8
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants