HDDS-10372. SCM and Datanode communication for reconciliation #6506
Merged: kerneltime merged 44 commits into apache:HDDS-10239-container-reconciliation from errose28:HDDS-10372-reconcile-cli on May 29, 2024
Commits (all authored by errose28):
8339f96 WIP SCM changes for reconcile cli
573ec30 Changes for SCM WIP pt2
b259dae Use SCMException only for error handling
4a89c8e Add SCM event handler for reconcile events
f9d1bfd Add datanode reconcile stub.
ddf3ce8 Updates after reviewing diff
47f6c06 Fix checkstyle
9154e3c Basic reconcile scm <-> DN works
d036229 Improve error handling
9aecbcf Add DN side unit tests
2775807 Test with two containers
600174f Add container report handler tests
6ec2acb Add ICR tests, improve FCR tests
4841c8f Remove duplicate line from report handler test
8e7e8f7 Add (currently failing) test for scm event handler
74bb00a Refactor contaienr eligibility, SCM event handler tests pass
2effd78 Checkstyle
d055094 Add robot test that may not pass yet
3586cc3 Rat
ac6ff0e Test repeat run with fixed acc test workflow
d84d3ef Some acceptance test fixes
a4a0a1c Update comment
6f2c7e4 findbugs
9c94f1c Separate container handler test for metrics
e42324e Almost finished separated metrics and report tests
3cb0416 TestReconcileContainerCommandHandler complete and passing
18f4f2a Checkstyle
2163008 Fix TestSCMExceptionResultCodes
13a335e Might have fixed acceptance test
b03c4fc Use long as checksum representation
bc434ec Apparently snakeyaml is coupled to Java variable names
290fbb7 checkstyle
afd6043 Rename two existing container file checksum methods for clarity
d8e86bd Test that container data checksum is not written to .container file
990735f Fix simple acceptance test issue
f4ace2d Print data checksum as hex string in container info output
36f92ee Add log of placeholder checksum generated
f26a402 checkstyle
c8bdffc putLong increments position, need to rewind.
f862836 Undo acceptance workflow change from master
8a2427b Undo accidental gh actions change
31e60a9 Fix comment typo
97a76d7 Merge branch 'HDDS-10239-container-reconciliation' into HDDS-10372-re…
28b8862 Address review comments, fix error after merge commit
110 changes: 110 additions & 0 deletions
.../ozone/container/common/statemachine/commandhandler/ReconcileContainerCommandHandler.java
```java
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 * <p>
 * http://www.apache.org/licenses/LICENSE-2.0
 * <p>
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.hadoop.ozone.container.common.statemachine.commandhandler;

import com.google.common.util.concurrent.ThreadFactoryBuilder;
import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.SCMCommandProto;
import org.apache.hadoop.ozone.container.common.statemachine.SCMConnectionManager;
import org.apache.hadoop.ozone.container.common.statemachine.StateContext;
import org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer;
import org.apache.hadoop.ozone.protocol.commands.ReconcileContainerCommand;
import org.apache.hadoop.ozone.protocol.commands.SCMCommand;
import org.apache.hadoop.util.Time;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Handles commands from SCM to reconcile a container replica on this datanode with the replicas on its peers.
 */
public class ReconcileContainerCommandHandler implements CommandHandler {
  private static final Logger LOG =
      LoggerFactory.getLogger(ReconcileContainerCommandHandler.class);

  private final AtomicLong invocationCount;
  private final AtomicInteger queuedCount;
  private final ExecutorService executor;
  private long totalTime;

  public ReconcileContainerCommandHandler(String threadNamePrefix) {
    invocationCount = new AtomicLong(0);
    queuedCount = new AtomicInteger(0);
    // TODO Allow configurable thread pool size with a default value when the implementation is ready.
    executor = Executors.newSingleThreadExecutor(new ThreadFactoryBuilder()
        .setNameFormat(threadNamePrefix + "ReconcileContainerThread-%d")
        .build());
    totalTime = 0;
  }

  @Override
  public void handle(SCMCommand command, OzoneContainer container, StateContext context,
      SCMConnectionManager connectionManager) {
    queuedCount.incrementAndGet();
    CompletableFuture.runAsync(() -> {
      invocationCount.incrementAndGet();
      long startTime = Time.monotonicNow();
      ReconcileContainerCommand reconcileCommand = (ReconcileContainerCommand) command;
      LOG.info("Processing reconcile container command for container {} with peers {}",
          reconcileCommand.getContainerID(), reconcileCommand.getPeerDatanodes());
      try {
        container.getController().reconcileContainer(reconcileCommand.getContainerID(),
            reconcileCommand.getPeerDatanodes());
      } catch (IOException ex) {
        LOG.error("Failed to reconcile container {}.", reconcileCommand.getContainerID(), ex);
      } finally {
        long endTime = Time.monotonicNow();
        totalTime += endTime - startTime;
      }
    }, executor).whenComplete((v, e) -> queuedCount.decrementAndGet());
  }

  @Override
  public SCMCommandProto.Type getCommandType() {
    return SCMCommandProto.Type.reconcileContainerCommand;
  }

  @Override
  public int getInvocationCount() {
    return (int)invocationCount.get();
  }

  @Override
  public long getAverageRunTime() {
    if (invocationCount.get() > 0) {
      return totalTime / invocationCount.get();
    }
    return 0;
  }

  @Override
  public long getTotalRunTime() {
    return totalTime;
  }

  @Override
  public int getQueuedCount() {
    return queuedCount.get();
  }
}
```
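The bookkeeping in this handler (increment a queued counter on submit, run the work on a dedicated executor, record elapsed time in `finally`, decrement the queued counter on completion) can be tried outside Ozone. The following is a minimal sketch of that pattern only, not the PR's code: the class `AsyncCommandMetricsSketch` and its `handle(Runnable)` signature are hypothetical, `System.nanoTime()` stands in for Hadoop's `Time.monotonicNow()`, and the total time is held in an `AtomicLong` (an assumption that differs from the plain `long` field in the diff) so that reads from other threads see the executor thread's updates.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the handler's metrics bookkeeping; not Ozone code. */
class AsyncCommandMetricsSketch {
  private final AtomicLong invocationCount = new AtomicLong();
  private final AtomicInteger queuedCount = new AtomicInteger();
  // Assumption: AtomicLong instead of a plain long, for cross-thread visibility.
  private final AtomicLong totalTimeMs = new AtomicLong();
  private final ExecutorService executor = Executors.newSingleThreadExecutor();

  /** Queues the work and returns a future that completes after the queued count is decremented. */
  CompletableFuture<Void> handle(Runnable work) {
    queuedCount.incrementAndGet();
    return CompletableFuture.runAsync(() -> {
      invocationCount.incrementAndGet();
      long start = System.nanoTime();
      try {
        work.run();
      } finally {
        // Runs whether or not the work threw, so failed runs are still timed.
        totalTimeMs.addAndGet(TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start));
      }
    }, executor).whenComplete((v, e) -> queuedCount.decrementAndGet());
  }

  long getInvocationCount() {
    return invocationCount.get();
  }

  int getQueuedCount() {
    return queuedCount.get();
  }

  long getTotalRunTime() {
    return totalTimeMs.get();
  }

  long getAverageRunTime() {
    long n = invocationCount.get();
    return n > 0 ? totalTimeMs.get() / n : 0;
  }

  void shutdown() {
    executor.shutdown();
  }

  public static void main(String[] args) {
    AsyncCommandMetricsSketch handler = new AsyncCommandMetricsSketch();
    CompletableFuture<Void> first = handler.handle(() -> { });
    CompletableFuture<Void> second = handler.handle(() -> { });
    CompletableFuture.allOf(first, second).join();
    System.out.println("invocations=" + handler.getInvocationCount()
        + " queued=" + handler.getQueuedCount());
    handler.shutdown();
  }
}
```

Returning the future from `handle` is a choice made here so a caller or test can wait for completion; the handler in the diff returns `void` and exposes progress only through its counters.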
Q: Hex for printing to make it more human friendly?
Yeah. That seemed standard but we could use a different format if there's a better option.
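For reference, a `long` checksum can be rendered as hex in Java with standard library calls alone. This is a small sketch under assumptions: the class `ChecksumHexSketch` and its method names are hypothetical, and the exact format the PR uses in its container info output is not shown in this excerpt.

```java
/** Hypothetical helper showing two ways to print a long checksum as hex. */
final class ChecksumHexSketch {
  private ChecksumHexSketch() { }

  /** Shortest form: no leading zeros; negative values print as unsigned two's complement. */
  static String toHex(long checksum) {
    return Long.toHexString(checksum);
  }

  /** Fixed width: always 16 hex digits, which keeps columns aligned in CLI output. */
  static String toPaddedHex(long checksum) {
    return String.format("%016x", checksum);
  }

  /** Inverse of both forms above; accepts the full unsigned 64-bit range. */
  static long fromHex(String hex) {
    return Long.parseUnsignedLong(hex, 16);
  }

  public static void main(String[] args) {
    long checksum = 0xDEADBEEFL;
    System.out.println(toHex(checksum));        // deadbeef
    System.out.println(toPaddedHex(checksum));  // 00000000deadbeef
    System.out.println(toHex(-1L));             // ffffffffffffffff
  }
}
```

The padded form trades brevity for alignment; either string parses back to the original `long` with `fromHex`.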