HDDS-6615. EC: Improve write performance by pipelining encode and flush #3994
This looks like a good change. I need to review in more detail next week (out of the office on Friday). Out of interest, have you benchmarked Ratis writes vs EC writes? How do they compare? I also wonder if it would be possible to do the writes in |
ozone/hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/ECBlockOutputStream.java Lines 84 to 89 in 6abee8a
I have tested this. Writing many keys with EC is slightly faster (~10%) than with Ratis, but not as fast as theory suggests (EC writes less data). When writing many large keys (1000 x 5 GB with 100 threads), EC works fine but Ratis cannot complete the write.
sodonnel
left a comment
LGTM - thanks for working on this, as it seems a good change.
Thanks @sodonnel for the review. |
* master: (110 commits)
  HDDS-7472. EC: Fix NSSummaryEndpoint#getDiskUsage for EC keys (apache#3987)
  HDDS-5704. Ozone URI syntax description in help content needs to mention about ozone service id (apache#3862)
  HDDS-7555. Upgrade Ratis to 2.4.2-8b8bdda-SNAPSHOT. (apache#4028)
  HDDS-7541. FSO recursive delete directory with hierarchy takes much time for cleanup (apache#4008)
  HDDS-7581. Fix update-jar-report for snapshot (apache#4034)
  HDDS-7253. Fix exception when '/' in key name (apache#4038)
  HDDS-7579. Use Netty 4.1.77 for consistency (apache#4031)
  HDDS-7562. Suppress warning about long filenames in tar (apache#4017)
  HDDS-7563. Add a handler for under replicated Ratis containers in RM (apache#4025)
  HDDS-7497. Fix mkdir does not update bucket's usedNamespace (apache#3969)
  HDDS-7567. Invalid entries in LICENSE (apache#4020)
  HDDS-7575. Correct showing of RATIS-THREE icon in Recon UI (apache#4026)
  HDDS-7540. Let reusable workflow inherit secrets (apache#4012)
  HDDS-7568. Bump copyright year in NOTICE (apache#4018)
  HDDS-7394. OM RPC FairCallQueue decay decision metrics list caller username in the metric (apache#3878)
  HDDS-7510. Recon: Return number of open containers in `/clusterState` endpoint (apache#3989)
  HDDS-7561. Improve setquota, clrquota CLI usage (apache#4016)
  HDDS-6615. EC: Improve write performance by pipelining encode and flush (apache#3994)
  HDDS-7554. Recon UI should show DORMANT in pipeline status filter (apache#4010)
  HDDS-7540. Separate scheduled CI from push/PR workflows (apache#4004)
  ...
What changes were proposed in this pull request?
This change introduces a new flush thread in `ECKeyOutputStream`, pipelining the encode stage and the flush stage.
The producer thread does the buffering and encoding, and the consumer thread does the flushing.
They are connected by an `ArrayBlockingQueue` whose size is set by "ec.stripe.queue.size", which defaults to 2.
The necessary refactoring has been done to separate the two stages.
`EOFDummyStripe` is used to denote the end of the key, and `CheckpointDummyStripe` is used for syncing in tests.
Synchronization was added in tests.
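For readers unfamiliar with the pattern, the arrangement described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual `ECKeyOutputStream` code: the class name `StripePipelineSketch`, the `write` method, the string "stripes", and the `EOF` sentinel (standing in for `EOFDummyStripe`) are all hypothetical, and the queue size is passed in directly rather than read from "ec.stripe.queue.size".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class StripePipelineSketch {
  // Hypothetical end-of-key sentinel, compared by identity (plays the role of EOFDummyStripe).
  private static final String EOF = new String("EOF");

  // "Encodes" stripes on the calling thread and "flushes" them on a second thread.
  static List<String> write(List<String> stripes, int queueSize) throws Exception {
    BlockingQueue<String> queue = new ArrayBlockingQueue<>(queueSize);
    List<String> flushed = new ArrayList<>();
    ExecutorService flushExecutor = Executors.newSingleThreadExecutor();
    try {
      // Consumer: takes encoded stripes and flushes them until the sentinel arrives.
      Callable<Integer> flushTask = () -> {
        int count = 0;
        while (true) {
          String stripe = queue.take();
          if (stripe == EOF) {
            return count;
          }
          flushed.add("flushed:" + stripe);
          count++;
        }
      };
      Future<Integer> flushFuture = flushExecutor.submit(flushTask);

      // Producer: encodes each stripe; put() blocks while the queue is full,
      // so at most queueSize encoded stripes wait for the flush thread.
      for (String data : stripes) {
        queue.put("encoded:" + data);
      }
      queue.put(EOF);

      // Wait for the flush thread to drain the queue before returning.
      flushFuture.get();
      return flushed;
    } finally {
      flushExecutor.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println(write(List.of("s1", "s2", "s3"), 2));
  }
}
```

With a bounded queue, the producer can encode the next stripe while the previous one is being flushed, but it cannot run arbitrarily far ahead of the consumer.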
NOTE: Originally, the exception handling needed another look: if an exception was thrown in the flush thread, the main thread would not get it until calling `Future.get()` in `close()`. This has been fixed in d710aff.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-6615
How was this patch tested?
It is covered by existing tests. Synchronization was added before checking state.
A manual test writing 1 key with freon shows a ~20% performance improvement.
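The exception-handling note in the description relies on standard `java.util.concurrent.Future` behaviour: an exception thrown inside a submitted task is stored in the `Future` and only reaches the caller, wrapped in an `ExecutionException`, when `get()` is called. The sketch below illustrates that behaviour only; it does not show the fix in d710aff, and the class and method names (`FlushFailureSketch`, `closeAfterFailedFlush`) are hypothetical.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FlushFailureSketch {
  // Returns the message of the failure that surfaces when the "close" step calls get().
  static String closeAfterFailedFlush() throws InterruptedException {
    ExecutorService flushExecutor = Executors.newSingleThreadExecutor();
    try {
      // Simulated flush task that fails on the flush thread.
      Callable<Void> flushTask = () -> {
        throw new IOException("simulated flush failure");
      };
      Future<Void> flushFuture = flushExecutor.submit(flushTask);

      // The submitting thread is not interrupted or notified here;
      // the failure is only stored inside the Future.

      try {
        flushFuture.get(); // the stored exception is rethrown here, wrapped
        return "no error";
      } catch (ExecutionException e) {
        return e.getCause().getMessage();
      }
    } finally {
      flushExecutor.shutdown();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println(closeAfterFailedFlush());
  }
}
```

One way to surface such failures earlier, under these assumptions, is for the writing thread to check `flushFuture.isDone()` between writes and call `get()` as soon as it returns true.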