HDDS-4399. Safe mode rule for piplelines should only consider open pipelines. #1526
806de8b to acaf777
bharatviswa504
left a comment
The unrelated file pnpm-lock.yaml needs to be removed.
I have one minor comment, overall patch LGTM.
Can you look into failed CI tests?
...erver-scm/src/main/java/org/apache/hadoop/hdds/scm/safemode/HealthyPipelineSafeModeRule.java
acaf777 to acd7489
@bharatviswa504 @nandakumar131 can you please review?
bharatviswa504
left a comment
+1 LGTM
Thank you @bshashikant for the contribution.
* HDDS-3698-upgrade: (46 commits)
  HDDS-4468. Fix Goofys listBucket large than 1000 objects will stuck forever (apache#1595)
  HDDS-4417. Simplify Ozone client code with configuration object -- addendum (apache#1581)
  HDDS-4476. Improve the ZH translation of the HA.md in doc. (apache#1597)
  HDDS-4432. Update Ratis version to latest snapshot. (apache#1586)
  HDDS-4488. Open RocksDB read only when loading containers at Datanode startup (apache#1605)
  HDDS-4478. Large deletedKeyset slows down OM via listStatus. (apache#1598)
  HDDS-4452. findbugs.sh couldn't be executed after a full build (apache#1576)
  HDDS-4427. Avoid ContainerCache in ContainerReader at Datanode startup (apache#1549)
  HDDS-4448. Duplicate refreshPipeline in listStatus (apache#1569)
  HDDS-4450. Cannot run ozone if HADOOP_HOME points to Hadoop install (apache#1572)
  HDDS-4346.Ozone specific Trash Policy (apache#1535)
  HDDS-4426. SCM should create transactions using all blocks received from OM (apache#1561)
  HDDS-4399. Safe mode rule for piplelines should only consider open pipelines. (apache#1526)
  HDDS-4367. Configuration for deletion service intervals should be different for OM, SCM and datanodes (apache#1573)
  HDDS-4462. Add --frozen-lockfile to pnpm install to prevent ozone-recon-web/pnpm-lock.yaml from being updated automatically (apache#1589)
  HDDS-4082. Create ZH translation of HA.md in doc. (apache#1591)
  HDDS-4464. Upgrade httpclient version due to CVE-2020-13956. (apache#1590)
  HDDS-4467. Acceptance test fails due to new Hadoop 3 image (apache#1594)
  HDDS-4466. Update url in .asf.yaml to use TLP project (apache#1592)
  HDDS-4458. Fix Max Transaction ID value in OM. (apache#1585)
  ...
What changes were proposed in this pull request?
Currently, the safe mode exit criteria consider all pipelines that exist in the DB. It may happen that SCM has the pipelines created, but none of the participating datanodes ever created these pipelines. In such cases, SCM fails to come out of safe mode because these pipelines are never reported back to SCM.
The idea here is to consider only the pipelines that are marked open during SCM startup.
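To make the difference concrete, here is a minimal, self-contained sketch of the counting logic. It is not the code from this patch: `OpenPipelineSafeModeSketch`, `PipelineInfo`, `State`, `requiredReports`, `ruleSatisfied`, and the `fraction` parameter are all hypothetical names chosen for illustration, and the real rule in `HealthyPipelineSafeModeRule` may be structured differently.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class OpenPipelineSafeModeSketch {

  // Hypothetical stand-in for a pipeline state; not the real Ozone enum.
  enum State { ALLOCATED, OPEN, CLOSED }

  // Hypothetical minimal view of a pipeline as loaded from the SCM DB.
  static final class PipelineInfo {
    final String id;
    final State state;

    PipelineInfo(String id, State state) {
      this.id = id;
      this.state = state;
    }
  }

  // Number of pipeline reports needed before the rule passes.
  // Only pipelines that are OPEN at startup count toward the total.
  static int requiredReports(List<PipelineInfo> fromDb, double fraction) {
    long open = fromDb.stream()
        .filter(p -> p.state == State.OPEN)
        .count();
    return (int) Math.ceil(open * fraction);
  }

  // True once enough of the OPEN pipelines have been reported by datanodes.
  static boolean ruleSatisfied(List<PipelineInfo> fromDb,
      Set<String> reportedIds, double fraction) {
    Set<String> openIds = fromDb.stream()
        .filter(p -> p.state == State.OPEN)
        .map(p -> p.id)
        .collect(Collectors.toSet());
    long reportedOpen = reportedIds.stream()
        .filter(openIds::contains)
        .count();
    return reportedOpen >= requiredReports(fromDb, fraction);
  }

  public static void main(String[] args) {
    // p2 and p3 exist in the DB but were never created on any datanode,
    // so they will never be reported back to SCM.
    List<PipelineInfo> fromDb = Arrays.asList(
        new PipelineInfo("p1", State.OPEN),
        new PipelineInfo("p2", State.ALLOCATED),
        new PipelineInfo("p3", State.ALLOCATED));
    Set<String> reported = Collections.singleton("p1");

    System.out.println("required = " + requiredReports(fromDb, 1.0));
    System.out.println("satisfied = " + ruleSatisfied(fromDb, reported, 1.0));
  }
}
```

In this example, counting every pipeline in the DB would require three reports and the rule could never pass, while counting only open pipelines requires a single report, which `p1` provides.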
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-4399
How was this patch tested?
Added unit tests