@adoroszlai
Contributor

What changes were proposed in this pull request?

Currently, the upgrade acceptance test verifies upgrade from Ozone 0.5.0 to "current", i.e. the commit being built. It has an OZONE_UPGRADE_TO variable, but its value has little bearing on which version is run after the "upgrade".

This PR introduces the following changes:

  1. Test upgrade from multiple earlier versions, e.g. from 0.5.0 or 1.0.0.
  2. Extract parts of the upgrade test script to two libraries (the existing main lib and an upgrade-specific one).
  3. Split upgrade test script:
    • upgrade_to_current: test upgrade from a release to the current binaries,
    • upgrade_to_release: test upgrade from one release to a later one.
  4. Improve the scripts which define version-specific behavior: instead of just declaring variables, define functions to activate/deactivate behavior of a specific version.
  5. Introduce "logical version" to be able to upgrade across multiple versions (e.g. from 0.5.0 to 1.1.0).
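
As a rough illustration of items 4 and 5, the sketch below shows one way a "logical version" and per-version activate/deactivate functions could fit together. Every name in it (`logical_version`, `release_for`, `activate`, `deactivate`, `upgrade_path`) is hypothetical and not taken from the Ozone scripts, and stepping through each intermediate release is only an assumption about how an upgrade across multiple versions might be driven.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; none of these names come from the actual Ozone scripts.

# Map a release name to an integer so that releases can be compared and iterated.
logical_version() {
  case "$1" in
    0.5.0) echo 1 ;;
    1.0.0) echo 2 ;;
    1.1.0) echo 3 ;;
    *) echo "unknown version: $1" >&2; return 1 ;;
  esac
}

# Inverse mapping: logical version back to release name.
release_for() {
  case "$1" in
    1) echo 0.5.0 ;;
    2) echo 1.0.0 ;;
    3) echo 1.1.0 ;;
    *) return 1 ;;
  esac
}

ACTIVE_VERSION=""
TRACE=""

# Stand-ins for version-specific behavior: functions that switch a version's
# behavior on or off, instead of files that only declare variables.
activate()   { ACTIVE_VERSION="$1"; TRACE="$TRACE +$1"; }
deactivate() { ACTIVE_VERSION="";   TRACE="$TRACE -$1"; }

# Walk from one release to a later one, one logical version at a time
# (assumption: intermediate releases are visited in order).
upgrade_path() {
  local from to v
  from=$(logical_version "$1") || return 1
  to=$(logical_version "$2") || return 1
  activate "$1"
  for (( v = from + 1; v <= to; v++ )); do
    deactivate "$ACTIVE_VERSION"
    activate "$(release_for "$v")"
  done
}

upgrade_path 0.5.0 1.1.0
echo "active version: $ACTIVE_VERSION"
echo "steps:$TRACE"
```

Using integers for ordering keeps the comparison independent of how release names are formatted, which is presumably what makes a jump such as 0.5.0 to 1.1.0 expressible.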

https://issues.apache.org/jira/browse/HDDS-4741

How was this patch tested?

Regular CI now runs upgrade_to_current.sh from both 0.5.0 and 1.0.0:
https://github.com/adoroszlai/hadoop-ozone/runs/1912441301

Also tested upgrade_to_release.sh manually.

@adoroszlai adoroszlai self-assigned this Feb 16, 2021
@avijayanhwx
Contributor

cc @errose28 for review.

Contributor

@errose28 errose28 left a comment

Thanks for working on this @adoroszlai. Overall looks good, a few minor comments in line. Also it looks like there is an extra robot log and report getting placed in the upgrade directory (outside the results folder). Looks like log and report for 0.5.0 first gets put there, and then log and report for 1.0.0 overwrites it. Is this supposed to happen?

SCRIPT_DIR=$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null && pwd )
ALL_RESULT_DIR="$SCRIPT_DIR/result"
mkdir -p "$ALL_RESULT_DIR"
rm "$ALL_RESULT_DIR/*" || true
Contributor

Glob expansion won't happen in double quotes.
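
A small, self-contained demonstration of the point, using a scratch directory in place of `$ALL_RESULT_DIR` (the file names are made up for the example). With the `*` inside the quotes, `rm` receives a literal path ending in `*`, fails to find it, and `|| true` hides the failure; moving the `*` outside the quotes lets the shell expand it while the variable stays safely quoted.

```shell
#!/usr/bin/env bash
# Demonstrates that a glob inside double quotes is not expanded.

demo_dir="$(mktemp -d)"   # scratch directory standing in for $ALL_RESULT_DIR
touch "$demo_dir/log.html" "$demo_dir/report.html"

# Quoted glob: rm is given the literal name ".../*", which does not exist.
# The error is silenced and "|| true" masks the non-zero exit status.
rm "$demo_dir/*" 2>/dev/null || true
quoted_left=$(find "$demo_dir" -type f | wc -l | tr -d ' ')

# Quoted variable, unquoted glob: the shell expands * to the file names.
rm -f "$demo_dir"/*
unquoted_left=$(find "$demo_dir" -type f | wc -l | tr -d ' ')

echo "files left after quoted rm:   $quoted_left"
echo "files left after unquoted rm: $unquoted_left"

rmdir "$demo_dir"
```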

Contributor Author

Nice catch. BTW, this script is copied from hadoop-ozone/dist/src/main/compose/ozone-mr/test.sh.

if [[ "${OZONE_VOLUME_OWNER}" != "${current_user}" ]]; then
chown -R "${OZONE_VOLUME_OWNER}" "${OZONE_VOLUME}" \
|| sudo chown -R "${OZONE_VOLUME_OWNER}" "${OZONE_VOLUME}"
set -x
Contributor

Printing all the commands as they run seems like a chunk of extra output that is not very helpful. Is there an advantage to this?

Contributor Author

No, I think it is just leftover debug.

@adoroszlai
Contributor Author

> Also it looks like there is an extra robot log and report getting placed in the upgrade directory (outside the results folder). Looks like log and report for 0.5.0 first gets put there, and then log and report for 1.0.0 overwrites it. Is this supposed to happen?

Thanks for pointing this out. It turns out this is not specific to the upgrade test or this change. It happens for each "subtest" on master, too. Example (from here):

Log:     /mnt/ozone/hadoop-ozone/dist/target/ozone-1.1.0-SNAPSHOT/compose/compatibility/result/log.html
Report:  /mnt/ozone/hadoop-ozone/dist/target/ozone-1.1.0-SNAPSHOT/compose/compatibility/result/report.html
Output:  /mnt/ozone/hadoop-ozone/dist/target/ozone-1.1.0-SNAPSHOT/compose/result/compatibility.xml
Log:     /mnt/ozone/hadoop-ozone/dist/target/ozone-1.1.0-SNAPSHOT/compose/log.html
Report:  /mnt/ozone/hadoop-ozone/dist/target/ozone-1.1.0-SNAPSHOT/compose/report.html

(The second log and report are the ones created in the same way as you described.)

Since the fix is pretty simple, I included it here, instead of creating a separate issue.

@errose28
Contributor

errose28 commented Feb 24, 2021

Thanks for working on this @adoroszlai. Tried it out locally and it ran smoothly. Changes LGTM +1.

@adoroszlai
Contributor Author

@avijayanhwx @elek @fapifta can you please review? @errose28 is OK with it and would like to work on top of it in the upgrade branch.

Member

@elek elek left a comment

+1 LGTM (except one typo).

Thanks for the patch @adoroszlai

Co-authored-by: Elek, Márton <[email protected]>
Contributor

@avijayanhwx avijayanhwx left a comment

Thanks for working on this @adoroszlai. I have gone through the changes and they look good to me.

@avijayanhwx
Contributor

Thanks for the patch @adoroszlai, and the reviews @errose28 & @elek. I am merging this.

@avijayanhwx avijayanhwx merged commit f8f1b5f into apache:master Mar 2, 2021
@adoroszlai adoroszlai deleted the HDDS-4741 branch March 2, 2021 20:14
errose28 added a commit to errose28/ozone that referenced this pull request Mar 2, 2021
…ing-upgrade

* upstream/master: (29 commits)
  HDDS-4741. Modularize upgrade test (apache#1928)
  HDDS-4864. Add acceptance tests to certify Ozone with boto3 python client. (apache#1976)
  HDDS-4791. StateContext.getReports may return list with size larger t… (apache#1892)
  HDDS-4867. Ozone admin datanode list should report dead and stale nodes (apache#1966)
  HDDS-4858. Useless Maven cache cleanup (apache#1956)
  HDDS-4769. Simplify insert operation of ContainerAttribute (apache#1865)
  HDDS-4847. Fix typo in name of IdentityService (apache#1941)
  HDDS-4869. Bump jackson version number (apache#1963)
  HDDS-4871. Fix intellij runConfigurations for datanode (apache#1968)
  HDDS-4870. Bump jetty version (apache#1964)
  HDDS-4722. Creating RDBStore fails due to RDBMetrics instance race (apache#1820)
  HDDS-4138. Improve crc efficiency by using Java.util.zip.CRC when available (apache#1950)
  HDDS-4816. Add UsageInfoSubcommand to get Datanode usage information. (apache#1919)
  HDDS-4754. Make scm heartbeat rpc retry interval configurable (apache#1942)
  HDDS-4832. Show Datanode OperationalState in Recon (apache#1937)
  HDDS-4653. Support TDE for MPU Keys on Encrypted Buckets (apache#1766)
  HDDS-4853. libexec/entrypoint.sh might copy from wrong path (apache#1951)
  HDDS-4857. Format ReplicationType.java which indentation are confusion (apache#1952)
  HDDS-4850. Intermittent failure in ozonesecure due to unable to allocate block (apache#1948)
  HDDS-4808. Add Genesis benchmark for various CRC implementations (apache#1910)
  ...

Conflicts:
	hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/client/ScmClient.java
	hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/protocol/StorageContainerLocationProtocol.java
	hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/OzoneConsts.java
	hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/scm/protocolPB/StorageContainerLocationProtocolClientSideTranslatorPB.java
	hadoop-hdds/interface-admin/src/main/proto/ScmAdminProtocol.proto
	hadoop-hdds/interface-client/src/main/proto/hdds.proto
	hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/protocol/StorageContainerLocationProtocolServerSideTranslatorPB.java
	hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/SCMClientProtocolServer.java
	hadoop-hdds/tools/src/main/java/org/apache/hadoop/hdds/scm/cli/ContainerOperationClient.java
	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java
errose28 added a commit to errose28/ozone that referenced this pull request Mar 11, 2021
…ing-upgrade-merge-candidate

* upstream/master: (29 commits)
  HDDS-4741. Modularize upgrade test (apache#1928)
  HDDS-4864. Add acceptance tests to certify Ozone with boto3 python client. (apache#1976)
  HDDS-4791. StateContext.getReports may return list with size larger t… (apache#1892)
  HDDS-4867. Ozone admin datanode list should report dead and stale nodes (apache#1966)
  HDDS-4858. Useless Maven cache cleanup (apache#1956)
  HDDS-4769. Simplify insert operation of ContainerAttribute (apache#1865)
  HDDS-4847. Fix typo in name of IdentityService (apache#1941)
  HDDS-4869. Bump jackson version number (apache#1963)
  HDDS-4871. Fix intellij runConfigurations for datanode (apache#1968)
  HDDS-4870. Bump jetty version (apache#1964)
  HDDS-4722. Creating RDBStore fails due to RDBMetrics instance race (apache#1820)
  HDDS-4138. Improve crc efficiency by using Java.util.zip.CRC when available (apache#1950)
  HDDS-4816. Add UsageInfoSubcommand to get Datanode usage information. (apache#1919)
  HDDS-4754. Make scm heartbeat rpc retry interval configurable (apache#1942)
  HDDS-4832. Show Datanode OperationalState in Recon (apache#1937)
  HDDS-4653. Support TDE for MPU Keys on Encrypted Buckets (apache#1766)
  HDDS-4853. libexec/entrypoint.sh might copy from wrong path (apache#1951)
  HDDS-4857. Format ReplicationType.java which indentation are confusion (apache#1952)
  HDDS-4850. Intermittent failure in ozonesecure due to unable to allocate block (apache#1948)
  HDDS-4808. Add Genesis benchmark for various CRC implementations (apache#1910)
  ...

Conflicts:
	hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/client/ScmClient.java
	hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/protocol/StorageContainerLocationProtocol.java
	hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/OzoneConsts.java
	hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/scm/protocolPB/StorageContainerLocationProtocolClientSideTranslatorPB.java
	hadoop-hdds/interface-admin/src/main/proto/ScmAdminProtocol.proto
	hadoop-hdds/interface-client/src/main/proto/hdds.proto
	hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/protocol/StorageContainerLocationProtocolServerSideTranslatorPB.java
	hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/SCMClientProtocolServer.java
	hadoop-hdds/tools/src/main/java/org/apache/hadoop/hdds/scm/cli/ContainerOperationClient.java
	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java
	hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/scm/ReconNodeManager.java
errose28 added a commit to errose28/ozone that referenced this pull request Mar 16, 2021
* HDDS-3698-nonrolling-upgrade: (29 commits)
  HDDS-4741. Modularize upgrade test (apache#1928)
  HDDS-4864. Add acceptance tests to certify Ozone with boto3 python client. (apache#1976)
  HDDS-4791. StateContext.getReports may return list with size larger t… (apache#1892)
  HDDS-4867. Ozone admin datanode list should report dead and stale nodes (apache#1966)
  HDDS-4858. Useless Maven cache cleanup (apache#1956)
  HDDS-4769. Simplify insert operation of ContainerAttribute (apache#1865)
  HDDS-4847. Fix typo in name of IdentityService (apache#1941)
  HDDS-4869. Bump jackson version number (apache#1963)
  HDDS-4871. Fix intellij runConfigurations for datanode (apache#1968)
  HDDS-4870. Bump jetty version (apache#1964)
  HDDS-4722. Creating RDBStore fails due to RDBMetrics instance race (apache#1820)
  HDDS-4138. Improve crc efficiency by using Java.util.zip.CRC when available (apache#1950)
  HDDS-4816. Add UsageInfoSubcommand to get Datanode usage information. (apache#1919)
  HDDS-4754. Make scm heartbeat rpc retry interval configurable (apache#1942)
  HDDS-4832. Show Datanode OperationalState in Recon (apache#1937)
  HDDS-4653. Support TDE for MPU Keys on Encrypted Buckets (apache#1766)
  HDDS-4853. libexec/entrypoint.sh might copy from wrong path (apache#1951)
  HDDS-4857. Format ReplicationType.java which indentation are confusion (apache#1952)
  HDDS-4850. Intermittent failure in ozonesecure due to unable to allocate block (apache#1948)
  HDDS-4808. Add Genesis benchmark for various CRC implementations (apache#1910)
  ...