Skip to content

Conversation

@szetszwo
Copy link
Contributor

@szetszwo szetszwo commented Aug 11, 2022

What changes were proposed in this pull request?

From @jojochuang

Running Impala TPC-DS which stresses Ozone DN read path.
BufferUtils#assignByteBuffers stands out as one of the offender.
We can experiment with MappedByteBuffer and see if it makes performance better.

What is the link to the Apache JIRA

HDDS-7117

How was this patch tested?

New unit test.

@szetszwo szetszwo requested a review from jojochuang August 11, 2022 18:06
Copy link
Contributor

@jojochuang jojochuang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @szetszwo for the PR!
The PR makes sense to me. I'll need to verify the performance difference in the cluster.

Copy link
Contributor

@jojochuang jojochuang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: I hope this patch can better leverage OS cache, get rid of the temporary heap buffer. Will verify and benchmark in a real cluster.

@szetszwo
Copy link
Contributor Author

@jojochuang , thanks a lot for testing this!

@jojochuang
Copy link
Contributor

Please hold the merge as I am still doing a few more perf benchmarks. Thanks.

@szetszwo
Copy link
Contributor Author

@jojochuang , sure. Will wait for your test results. Thanks a lot!

@szetszwo szetszwo force-pushed the HDDS-7117 branch 2 times, most recently from dc70681 to 1525353 Compare August 30, 2022 08:30
@swamirishi
Copy link
Contributor

@jojochuang Can this be merged? Are the test results out?

@adoroszlai adoroszlai requested a review from jojochuang October 15, 2023 16:46
@adoroszlai adoroszlai marked this pull request as draft November 20, 2023 06:16
@ChenSammi
Copy link
Contributor

ChenSammi commented Dec 4, 2023

@szetszwo , thanks for working on this. Do you have time to do a patch rebase?

I did a micro performance test with a 4.5GB file, comparing the on heap buffer, off heap buffer and memory mapped buffer, here is the result.

image

By default, DN will use 1MB(bytes.per.checksum) as the buffer size. So using memory mapped buffer can increase at least micro level performance a lot.

@szetszwo
Copy link
Contributor Author

szetszwo commented Dec 4, 2023

@ChenSammi , thanks a lot for running the benchmark! When the file size is large, the performance is expected to be better for MappedByteBuffer . Your benchmark confirms the expectation.

Let me update this change.

@szetszwo szetszwo marked this pull request as ready for review December 4, 2023 20:47
@szetszwo szetszwo requested a review from ChenSammi December 4, 2023 20:47
@ChenSammi
Copy link
Contributor

ChenSammi commented Dec 5, 2023

@szetszwo , there is another your PR #3730 which aims to improve the read performance too. I didn't got chance to closely look at #3730 yet. Not sure is there overlap between this one and #3730?

@szetszwo
Copy link
Contributor Author

szetszwo commented Dec 5, 2023

@ChenSammi , this PR is to change reading the local files using java MappedByteBuffer while #3730 uses Netty ChunkedNioFile.

Copy link
Contributor

@ChenSammi ChenSammi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last patch LGTM +1.

@szetszwo szetszwo merged commit d9eb6b1 into apache:master Dec 6, 2023
@szetszwo
Copy link
Contributor Author

szetszwo commented Dec 6, 2023

@ChenSammi , thanks a lot for reviewing this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants