@adoroszlai
Contributor

What changes were proposed in this pull request?

The log of one container is missing from the acceptance test artifacts of each run.

For example, docker-ozone-datanode-1.log and docker-ozonesecure-ha-datanode1-1.log are absent from this listing:

$ find 2024/07/12/32233/acceptance-s3a -name 'docker*' | sort
2024/07/12/32233/acceptance-s3a/ozone/s3a/docker-ozone-datanode-2.log
2024/07/12/32233/acceptance-s3a/ozone/s3a/docker-ozone-datanode-3.log
2024/07/12/32233/acceptance-s3a/ozone/s3a/docker-ozone-httpfs-1.log
2024/07/12/32233/acceptance-s3a/ozone/s3a/docker-ozone-om-1.log
2024/07/12/32233/acceptance-s3a/ozone/s3a/docker-ozone-recon-1.log
2024/07/12/32233/acceptance-s3a/ozone/s3a/docker-ozone-s3g-1.log
2024/07/12/32233/acceptance-s3a/ozone/s3a/docker-ozone-scm-1.log
2024/07/12/32233/acceptance-s3a/ozonesecure-ha/s3a/docker-ozonesecure-ha-datanode2-1.log
2024/07/12/32233/acceptance-s3a/ozonesecure-ha/s3a/docker-ozonesecure-ha-datanode3-1.log
2024/07/12/32233/acceptance-s3a/ozonesecure-ha/s3a/docker-ozonesecure-ha-httpfs-1.log
2024/07/12/32233/acceptance-s3a/ozonesecure-ha/s3a/docker-ozonesecure-ha-kdc-1.log
2024/07/12/32233/acceptance-s3a/ozonesecure-ha/s3a/docker-ozonesecure-ha-kms-1.log
2024/07/12/32233/acceptance-s3a/ozonesecure-ha/s3a/docker-ozonesecure-ha-om1-1.log
2024/07/12/32233/acceptance-s3a/ozonesecure-ha/s3a/docker-ozonesecure-ha-om2-1.log
2024/07/12/32233/acceptance-s3a/ozonesecure-ha/s3a/docker-ozonesecure-ha-om3-1.log
2024/07/12/32233/acceptance-s3a/ozonesecure-ha/s3a/docker-ozonesecure-ha-recon-1.log
2024/07/12/32233/acceptance-s3a/ozonesecure-ha/s3a/docker-ozonesecure-ha-s3g-1.log
2024/07/12/32233/acceptance-s3a/ozonesecure-ha/s3a/docker-ozonesecure-ha-scm1.org-1.log
2024/07/12/32233/acceptance-s3a/ozonesecure-ha/s3a/docker-ozonesecure-ha-scm2.org-1.log
2024/07/12/32233/acceptance-s3a/ozonesecure-ha/s3a/docker-ozonesecure-ha-scm3.org-1.log

This happens because the output of Docker Compose v1 and v2 differs:

  • v1 prints two lines of header,
  • v2 prints only one.
$ docker-compose ps                                          
      Name                    Command               State                                            Ports                                          
----------------------------------------------------------------------------------------------------------------------------------------------------
ozone-datanode-1   /usr/local/bin/dumb-init - ...   Up      0.0.0.0:33548->19864/tcp,:::33548->19864/tcp, 0.0.0.0:33549->9882/tcp,:::33549->9882/tcp
ozone-datanode-2   /usr/local/bin/dumb-init - ...   Up      0.0.0.0:33550->19864/tcp,:::33550->19864/tcp, 0.0.0.0:33551->9882/tcp,:::33551->9882/tcp
ozone-datanode-3   /usr/local/bin/dumb-init - ...   Up      0.0.0.0:33546->19864/tcp,:::33546->19864/tcp, 0.0.0.0:33547->9882/tcp,:::33547->9882/tcp
ozone-httpfs-1     /usr/local/bin/dumb-init - ...   Up      0.0.0.0:14000->14000/tcp,:::14000->14000/tcp                                            
ozone-om-1         /usr/local/bin/dumb-init - ...   Up      0.0.0.0:9862->9862/tcp,:::9862->9862/tcp, 0.0.0.0:9874->9874/tcp,:::9874->9874/tcp      
ozone-recon-1      /usr/local/bin/dumb-init - ...   Up      0.0.0.0:9888->9888/tcp,:::9888->9888/tcp                                                
ozone-s3g-1        /usr/local/bin/dumb-init - ...   Up      0.0.0.0:9878->9878/tcp,:::9878->9878/tcp                                                
ozone-scm-1        /usr/local/bin/dumb-init - ...   Up      0.0.0.0:9860->9860/tcp,:::9860->9860/tcp, 0.0.0.0:9876->9876/tcp,:::9876->9876/tcp      
$ docker compose ps
NAME                IMAGE                                  COMMAND                  SERVICE             CREATED             STATUS              PORTS
ozone-datanode-1    apache/ozone-runner:20240316-jdk17-1   "/usr/local/bin/dumb…"   datanode            25 seconds ago      Up 23 seconds       0.0.0.0:33549->9882/tcp, :::33549->9882/tcp, 0.0.0.0:33548->19864/tcp, :::33548->19864/tcp
ozone-datanode-2    apache/ozone-runner:20240316-jdk17-1   "/usr/local/bin/dumb…"   datanode            25 seconds ago      Up 22 seconds       0.0.0.0:33551->9882/tcp, :::33551->9882/tcp, 0.0.0.0:33550->19864/tcp, :::33550->19864/tcp
ozone-datanode-3    apache/ozone-runner:20240316-jdk17-1   "/usr/local/bin/dumb…"   datanode            25 seconds ago      Up 23 seconds       0.0.0.0:33547->9882/tcp, :::33547->9882/tcp, 0.0.0.0:33546->19864/tcp, :::33546->19864/tcp
ozone-httpfs-1      apache/ozone-runner:20240316-jdk17-1   "/usr/local/bin/dumb…"   httpfs              25 seconds ago      Up 23 seconds       0.0.0.0:14000->14000/tcp, :::14000->14000/tcp
ozone-om-1          apache/ozone-runner:20240316-jdk17-1   "/usr/local/bin/dumb…"   om                  25 seconds ago      Up 23 seconds       0.0.0.0:9862->9862/tcp, :::9862->9862/tcp, 0.0.0.0:9874->9874/tcp, :::9874->9874/tcp
ozone-recon-1       apache/ozone-runner:20240316-jdk17-1   "/usr/local/bin/dumb…"   recon               25 seconds ago      Up 23 seconds       0.0.0.0:9888->9888/tcp, :::9888->9888/tcp
ozone-s3g-1         apache/ozone-runner:20240316-jdk17-1   "/usr/local/bin/dumb…"   s3g                 25 seconds ago      Up 23 seconds       0.0.0.0:9878->9878/tcp, :::9878->9878/tcp
ozone-scm-1         apache/ozone-runner:20240316-jdk17-1   "/usr/local/bin/dumb…"   scm                 25 seconds ago      Up 23 seconds       0.0.0.0:9860->9860/tcp, :::9860->9860/tcp, 0.0.0.0:9876->9876/tcp, :::9876->9876/tcp

The code that collects logs starts reading at the third line (tail -n +3). With v2 this skips the single header line and also the first container:

save_container_logs() {
  local output_name=$(get_output_name)
  local c
  for c in $(docker-compose ps -a "$@" | cut -f1 -d' ' | tail -n +3); do
    docker logs "${c}" >> "$RESULT_DIR/docker-${output_name}${c}.log" 2>&1
  done
}
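
One way to make the name extraction independent of the header height is to filter header rows by content instead of by position. The sketch below is only an illustration and not necessarily the change merged in this PR; `container_names` is a hypothetical helper name, and it assumes that no container is literally called "name" and that the name is always the first column, as in both outputs shown above.

```shell
# Hypothetical helper: reads `docker-compose ps` output on stdin and prints
# one container name per line, for both the v1 and the v2 table layout.
container_names() {
  awk '
    NF == 0               { next }  # skip blank lines
    $1 ~ /^-+$/           { next }  # skip the v1 separator row of dashes
    toupper($1) == "NAME" { next }  # skip the header row ("Name" in v1, "NAME" in v2)
                          { print $1 }
  '
}
```

Under these assumptions, `docker-compose ps -a "$@" | container_names` could stand in for the `cut -f1 -d' ' | tail -n +3` part of the loop in save_container_logs.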

https://issues.apache.org/jira/browse/HDDS-11186

How was this patch tested?

Verified that the datanode1 log is now saved, in addition to all other container logs:

renamed 'ozone-balancer/result/docker-ozone-balancer-datanode1-1.log' -> '/home/runner/work/ozone/ozone/hadoop-ozone/dist/target/ozone-1.5.0-SNAPSHOT/compose/result/ozone-balancer/docker-ozone-balancer-datanode1-1.log'

https://github.com/adoroszlai/ozone/actions/runs/9954339826/job/27500597426#step:5:124

Previously, datanode2 was the first item and datanode1 was missing:

renamed 'ozone-balancer/result/docker-ozone-balancer-datanode2-1.log' -> '/home/runner/work/ozone/ozone/hadoop-ozone/dist/target/ozone-1.5.0-SNAPSHOT/compose/result/ozone-balancer/docker-ozone-balancer-datanode2-1.log'

https://github.com/apache/ozone/actions/runs/9936569749/job/27445931351#step:5:123

@adoroszlai adoroszlai self-assigned this Jul 16, 2024
Contributor

@sadanand48 sadanand48 left a comment

Thanks @adoroszlai for the patch, LGTM

@adoroszlai adoroszlai merged commit e01a57d into apache:master Jul 17, 2024
@adoroszlai adoroszlai deleted the HDDS-11186 branch July 17, 2024 11:22
@adoroszlai
Contributor Author

Thanks @sadanand48 for the review.

xichen01 pushed a commit to xichen01/ozone that referenced this pull request Jul 17, 2024
xichen01 pushed a commit to xichen01/ozone that referenced this pull request Jul 18, 2024
errose28 added a commit to errose28/ozone that referenced this pull request Jul 30, 2024
…-delete

* HDDS-10239-container-reconciliation: (184 commits)
  HDDS-10373. Implement framework for capturing Merkle Tree Metrics. (apache#6864)
  HDDS-11188. Initial setup for new UI layout and enable users to switch to new UI (apache#6953)
  HDDS-11120. Rich rebalancing status info (apache#6911)
  HDDS-11187. Fix Event Handling in Recon OMDBUpdatesHandler to Prevent ClassCastException. (apache#6950)
  HDDS-11213. Bump commons-daemon to 1.4.0 (apache#6971)
  HDDS-11212. Bump commons-net to 3.11.1 (apache#6973)
  HDDS-11211. Bump assertj-core to 3.26.3 (apache#6972)
  HDDS-11210. Bump log4j2 to 2.23.1 (apache#6970)
  HDDS-11150. Recon Overview page crashes due to failed API Calls (apache#6944)
  HDDS-11183. Keys from DeletedTable and DeletedDirTable of AOS should be deleted on batch operation while creating a snapshot (apache#6946)
  HDDS-11198. Fix Typescript configs for Recon (apache#6961)
  HDDS-11180. Simplify HttpServer2#inferMimeType return statement (apache#6963)
  HDDS-11194. OM missing audit log for upgrade (apache#6958)
  HDDS-10389. Implement a search feature for users to locate open keys within the Open Keys Insights section. (apache#6231)
  HDDS-10561. Dashboard for delete key metrics (apache#6948)
  HDDS-11192. Increase SPNEGO URL test coverage (apache#6956)
  HDDS-11179. DBConfigFromFile#readFromFile result of toIOException not thrown (apache#6957)
  HDDS-11186. First container log missing from bundle (apache#6952)
  HDDS-10844. Clarify snapshot create error message. (apache#6955)
  HDDS-11166. Switch to Rocky Linux-based ozone-runner (apache#6942)
  ...