Conversation

@adoroszlai
Contributor

What changes were proposed in this pull request?

If HADOOP_HOME does not point to an Ozone install, attempt to find it relative to the script being run. This logic is copied from start-ozone.sh, where it was added for HDDS-1912 to fix a similar problem.

https://issues.apache.org/jira/browse/HDDS-4450
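
For illustration, a minimal sketch of this kind of fallback (the variable names and the libexec/ozone-config.sh location are assumptions based on the description above, not the exact code in the patch):

# Hypothetical sketch: only override HADOOP_HOME if it does not already
# point to an Ozone install (detected via ozone-config.sh).
this="${BASH_SOURCE-$0}"
bin=$(cd -P -- "$(dirname -- "$this")" >/dev/null && pwd -P)
if [[ ! -f "${HADOOP_HOME}/libexec/ozone-config.sh" ]]; then
  # fall back to the install tree containing this script
  HADOOP_HOME=$(cd -P -- "${bin}/.." >/dev/null && pwd -P)
  export HADOOP_HOME
fi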

How was this patch tested?

Tested using the ozonescripts compose environment after temporarily adding HADOOP_HOME=/usr/local/hadoop to docker-config. Verified that all three scripts (ozone and stop-ozone.sh, changed here, and start-ozone.sh, changed previously) work fine.

cd hadoop-ozone/dist/target/ozone-1.1.0-SNAPSHOT/compose/ozonescripts
docker-compose up -d
# <wait a bit>
./start.sh
...
./ps.sh
# <output with Ozone processes>
docker-compose exec scm bash
bash-4.2$ ozone freon ockg -n1 -t1 -F ONE
...
Successful executions: 1
bash-4.2$ echo $HADOOP_HOME
/usr/local/hadoop
bash-4.2$ exit
./stop.sh
...
./ps.sh
# <no more Ozone processes>

@adoroszlai adoroszlai self-assigned this Nov 11, 2020
@adoroszlai adoroszlai requested a review from elek November 11, 2020 12:39
@adoroszlai adoroszlai added the bug Something isn't working label Nov 13, 2020
@elek
Member

elek commented Nov 16, 2020

I guess it's an incompatible change, but it should be acceptable IMHO.

Contributor

@avijayanhwx avijayanhwx left a comment

@elek Can you tell me how this is an incompatible change?

@elek
Member

elek commented Nov 18, 2020

Can you tell me how this is an incompatible change?

Based on my understanding, HADOOP_HOME could earlier point to any Ozone install; now it's forced to point to a specific dir.

The chance is very small that this behavior is used by any cluster manager, but I would like to be sure it's safe. (Earlier I accidentally committed incompatible changes.)

But as you are fine with it, let's merge it.

Thanks for the patch @adoroszlai and for the review @avijayanhwx

@elek elek merged commit 541ae9f into apache:master Nov 18, 2020
@adoroszlai
Contributor Author

adoroszlai commented Nov 19, 2020

Thanks @elek for merging it and @avijayanhwx for the review.

Based on my understanding, HADOOP_HOME could earlier point to any Ozone install; now it's forced to point to a specific dir.

The script is happy as long as it finds the ozone-config.sh file; it does not care about specific directories, versions or instances. HADOOP_HOME is only forced if it does not already point to an Ozone install, but rather to Hadoop, something completely different, or maybe a non-existent dir. In that case, ozone (and stop-ozone.sh) did not work prior to this change.

Also, as long as Ozone scripts are executed rather than sourced, updating the variables does not affect the caller.
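
As a quick illustration of that point (a generic shell example, not taken from the patch; child.sh is a hypothetical script whose only action is export HADOOP_HOME=/opt/ozone):

echo "$HADOOP_HOME"   # e.g. /usr/local/hadoop
./child.sh            # executed: the export happens in a child process
echo "$HADOOP_HOME"   # still /usr/local/hadoop
. ./child.sh          # sourced: the export happens in the current shell
echo "$HADOOP_HOME"   # now /opt/ozone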

Normally HADOOP_HOME and other environment variables can be defined for each component separately. However, in an Impala mini cluster (used for running tests), all components share the same environment. HADOOP_HOME points to Hadoop, so we need a way to override this "locally" for Ozone. This change allows that without extensive and potentially dangerous refactoring (renaming all HADOOP variables and updating Ozone's copy of hadoop functions etc.).
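
For example, in such a shared environment (the paths below are purely illustrative):

export HADOOP_HOME=/usr/local/hadoop   # shared by all components, points to Hadoop
/opt/ozone/bin/ozone version           # Ozone script resolves its own home and works anyway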

@adoroszlai adoroszlai deleted the HDDS-4450 branch November 19, 2020 09:44
@elek
Member

elek commented Nov 23, 2020

❤️ Thanks for the detailed explanation.

errose28 added a commit to errose28/ozone that referenced this pull request Nov 24, 2020
* HDDS-3698-upgrade: (46 commits)
  HDDS-4468. Fix Goofys listBucket large than 1000 objects will stuck forever (apache#1595)
  HDDS-4417. Simplify Ozone client code with configuration object -- addendum (apache#1581)
  HDDS-4476. Improve the ZH translation of the HA.md in doc. (apache#1597)
  HDDS-4432. Update Ratis version to latest snapshot. (apache#1586)
  HDDS-4488. Open RocksDB read only when loading containers at Datanode startup (apache#1605)
  HDDS-4478. Large deletedKeyset slows down OM via listStatus. (apache#1598)
  HDDS-4452. findbugs.sh couldn't be executed after a full build (apache#1576)
  HDDS-4427. Avoid ContainerCache in ContainerReader at Datanode startup (apache#1549)
  HDDS-4448. Duplicate refreshPipeline in listStatus (apache#1569)
  HDDS-4450. Cannot run ozone if HADOOP_HOME points to Hadoop install (apache#1572)
  HDDS-4346.Ozone specific Trash Policy (apache#1535)
  HDDS-4426. SCM should create transactions using all blocks received from OM (apache#1561)
  HDDS-4399. Safe mode rule for piplelines should only consider open pipelines. (apache#1526)
  HDDS-4367. Configuration for deletion service intervals should be different for OM, SCM and datanodes (apache#1573)
  HDDS-4462. Add --frozen-lockfile to pnpm install to prevent ozone-recon-web/pnpm-lock.yaml from being updated automatically (apache#1589)
  HDDS-4082. Create ZH translation of HA.md in doc. (apache#1591)
  HDDS-4464. Upgrade httpclient version due to CVE-2020-13956. (apache#1590)
  HDDS-4467. Acceptance test fails due to new Hadoop 3 image (apache#1594)
  HDDS-4466. Update url in .asf.yaml to use TLP project (apache#1592)
  HDDS-4458. Fix Max Transaction ID value in OM. (apache#1585)
  ...
errose28 added a commit to errose28/ozone that referenced this pull request Nov 25, 2020