-
Notifications
You must be signed in to change notification settings - Fork 588
HDDS-9479. Pipeline close doesn't wait for containers to be closed. #5604
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
...p-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/PipelineManagerImpl.java
Show resolved
Hide resolved
| "ozone.scm.pipeline.scrub.interval"; | ||
| public static final String OZONE_SCM_PIPELINE_SCRUB_INTERVAL_DEFAULT = | ||
| "5m"; | ||
| "60s"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can have unintended consequence. We do not know how long a pipeline stays in ALLOCATED state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a separate property for handling pipelines in ALLOCATED state.
Since the scrubber thread is handling both ALLOCATED and CLOSED pipelines, it should have a lower interval.
| this.creationTimestamp = Instant.now(); | ||
| this.suggestedLeaderId = suggestedLeaderId; | ||
| this.replicaIndexes = new HashMap<>(); | ||
| this.stateEnterTime = Instant.now(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is used to compare against the configured clock in SCMContainerManager
private void initializeSystemManagers(OzoneConfiguration conf,
SCMConfigurator configurator) throws IOException {
// Use SystemClock when data is persisted
// and used again after system restarts.
systemClock = Clock.system(ZoneOffset.UTC);
Would it be possible to get the creation timestamp from the same clock here? Currently it should be fine but it would be more resistant to bugs due to refactoring in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, let me create a separate Jira for handling this.
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/DeadNodeHandler.java
Show resolved
Hide resolved
6dfec9e to
5c75aa0
Compare
sumitagrawl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nandakumar131 Thanks for working over this, have minor comment
| return creationTimestamp; | ||
| } | ||
|
|
||
| public Instant getStateEnterTime() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, need set closedState time when close is triggered for pipeline close, else this time is same as creation time as initialized in constructor.
This is to scrub from close time with delay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sumitagrawl the PipelineState inside Pipeline class is immutable.
If we want to change the state of a pipeline, we create a new Pipeline object with new state which will also update the stateEnterTime with current time.
Since PipelineState is immutable, stateEnterTime is also designed to be immutable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nandakumar131
We can add comment in Pipeline build() to notify the stateEnterTime will be latest when enter the state and no need update the time while building this object. Just to avoid later on changes if context is missed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it is a little strange that creationTime and stateEnterTime are both basically hardcoded to "now". Feels like we should only need one of them, as creation time isn't really "correct". If the pipeline changes state from open to closed, the creation time is going to change too, due to its immutable nature. I can see how this could be confusing to someone in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sodonnel, I completely understand the confusion here. We should refactor the code for better understandability.
Currently we don't have any functional issues as we update the creation time in the builder while we create the Pipeline object from existing pipeline reference.
If you look at Pipeline$Builder(Pipeline pipeline), it assigns the creation time from the given pipeline reference.
With the current code, it's easy to introduce bugs in future. We should re-write the Pipeline class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The creationTimestamp in the Pipeline constructor is a fallback value. We update the creation time in the Pipeline$Builder#build. If the creation time is not set in the builder, we use the Instant#now value set in constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I missed that the builder sets creation time. I guess this is OK as the constructor is private, so the only way to create the object is via the builder.
I noticed that the setter is public:
public void setCreationTimestamp(Instant creationTimestamp) {
this.creationTimestamp = creationTimestamp;
}
We could make that private as the inner Builder class can call the private methods. This is not really in scope for this PR though.
|
|
||
| void openPipeline(PipelineID pipelineId) throws IOException; | ||
|
|
||
| @Deprecated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why leave this deprecated on around? This is not an external facing API, so nothing should be using it except tests. It would probably be better to just remove it now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, we can remove this method completely. I just didn't want to have large changes in a single PR if possible, it will become hard for the reviewers.
We will have to change another 15+ classes to remove this method, I will create a follow-up Jira to make this change.
| closePipeline(p, false); | ||
| LOG.info("Closed the stale pipeline: {}", p.getId()); | ||
| final PipelineID id = p.getId(); | ||
| LOG.info("Closing the stale pipeline: {}", id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I am not sure we need this log message. We have a message below for "closed" and then exception handler has a message for "failed to close".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
5c75aa0 to
d50d518
Compare
sumitagrawl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
sodonnel
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
Thanks @sumitagrawl @sodonnel for the review. |
|
I think this PR might be cause a new test failure: Also failed locally for me locally on the first try: |
|
Thanks for the ping @sodonnel. The test only waits for 5 seconds before it timeout and fails. It's good to increase the max wait time in test to at least 10 seconds. Created HDDS-9801 to fix the same. |
…be closed. (apache#5604) (cherry picked from commit 2cbb0e6) Change-Id: Ie5d95e5cb910b9a98702ae11fcee88ade4fe9cc9
What changes were proposed in this pull request?
Whenever we close a pipeline, we have an option to give some grace time for the container to get closed, the grace time is configured using
ozone.scm.pipeline.destroy.timeout.We wait for the timeout to happen before we go ahead and delete the pipeline. This will give enough time for the datanodes to close the container gracefully.
It will prevent the containers from moving to the
QUASI_CLOSEDstate.This functionality is broken and we don't wait for the timeout to happen before we delete the Pipeline. This creates a lot of
QUASI_CLOSEDcontainers in the cluster when a node goes stale or when a datanode is getting decommissioned.This has to be fixed and we should wait for the configured amount of time before we delete the Pipeline, this will give datanodes enough time to
CLOSEthe containers on that pipeline.What is the link to the Apache JIRA
HDDS-9479
How was this patch tested?
Unit test modified to test the behaviour.