HDDS-3855. Add upgrade smoketest #1142
Conversation
avijayanhwx left a comment
Thanks @adoroszlai, this looks great! We can use this as a reference for API change tests, finalization etc.
elek left a comment
Thanks for working on this @adoroszlai.
Overall it looks good to me, and it's a really impressive approach. I have a few comments -- none of them are blockers, but I'd like to discuss some technical details...
- Can you please help me understand why you removed `-f "${compose_file}"`?
- The fixed IP / dedicated network in the docker-compose file seems to be unnecessary in this cluster (IMHO).
- It seems to be a big restriction that we can't start multiple datanodes on the same file system without configuring the datanode path. This is the reason why you need the dn1..dn3 directories. I am wondering if we can provide a generic solution for this. Maybe we could support `${env...}` notation when setting the datanode directory?
- You create external volume directories, but `/data` is already a volume inside the docker containers. If you use a simple `docker-compose stop` instead of `down`, it can be reused (see the sketch after this list). Did you consider this approach? Why do you prefer external volumes? (I found two arguments: easier to debug + easier to execute commands when the cluster is down. But I'm interested if you had any other motivations...)
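A minimal sketch contrasting the two lifecycles discussed in the last point, assuming a bind-mounted `OZONE_VOLUME` layout (the variable name and the dn1..dn3 subdirectories come from the discussion above; the exact compose wiring is an assumption, not the actual PR contents):

```bash
# Sketch only -- the OZONE_VOLUME variable and directory layout are assumed.

# Approach A: keep the containers (and their anonymous /data volumes)
# by stopping instead of removing them:
docker-compose stop     # containers survive, /data volumes are kept
docker-compose start    # same containers resume with the old data

# Approach B (this PR): bind-mount per-datanode host directories, so the
# data survives a full `docker-compose down` and is easy to inspect:
export OZONE_VOLUME=/tmp/ozone-upgrade
mkdir -p "$OZONE_VOLUME"/{dn1,dn2,dn3}
docker-compose up -d    # assumes the compose file mounts ${OZONE_VOLUME}/dnN to /data
docker-compose down     # containers removed; host directories remain
ls "$OZONE_VOLUME"/dn1  # inspect or modify data while the cluster is down
```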
Thanks for taking a look. I held off on merging exactly to have this kind of discussion. ;)
That's the reason for both the volumes and the network settings. I had started out without the network/IP settings, but the containers did not always get the same address after restart.
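For reference, a minimal sketch of the fixed-IP technique being described, written as a compose override; the subnet, service name, and address are made-up illustrative values, not the ones from this PR:

```bash
# Hypothetical compose override pinning a container's address on a
# dedicated network, so it keeps the same IP across down/up cycles.
cat > docker-compose.override.yaml <<'EOF'
version: "3"
services:
  dn1:
    networks:
      net:
        ipv4_address: 10.9.0.11
networks:
  net:
    ipam:
      config:
        - subnet: 10.9.0.0/16
EOF
```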
Would be nice; I think we can explore it later.
You mean that the upgraded cluster can't be started with different IP addresses? That seems to be a serious bug which should be fixed. But we can test it with the same approach: a hard-coded network stack and two different docker-compose files with different IP addresses.
I am fine with including it; it's not a big change. It's just good to have the explanation here.
Other random thoughts: I plan to enable acceptance tests for k8s cluster definitions, too.
elek left a comment
Thanks for the patch (and the discussion) @adoroszlai.
I am merging it now.
Thanks @avijayanhwx for the review, and @elek for reviewing and merging this.
What changes were proposed in this pull request?
Introduce a new sample docker-compose environment with a test script geared towards running upgrades. Currently it only performs a smoketest: write some keys with the old version, then read them with the new one.
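A rough sketch of the flow this describes; the image tags, service name, and key names below are illustrative assumptions, not the actual test script from this PR:

```bash
# Illustrative upgrade smoketest flow (all names and versions assumed).
export OZONE_IMAGE=apache/ozone:0.5.0            # old version
docker-compose up -d
docker-compose exec -T scm ozone sh volume create /vol1
docker-compose exec -T scm ozone sh bucket create /vol1/bucket1
docker-compose exec -T scm ozone sh key put /vol1/bucket1/key1 /etc/passwd
docker-compose down                              # data kept on bind mounts

export OZONE_IMAGE=apache/ozone:0.6.0            # new version
docker-compose up -d
docker-compose exec -T scm ozone sh key get /vol1/bucket1/key1 /tmp/key1
```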
Add a script for performing workaround steps for HDDS-3499 during upgrade. This is executed using the `ozone-runner` docker image, which now comes with `ldb`.

https://issues.apache.org/jira/browse/HDDS-3855
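The pattern is roughly the following (illustrative only: the DB path and the `ldb` arguments are assumptions, not the actual HDDS-3499 workaround steps):

```bash
# Run RocksDB's ldb tool from the ozone-runner image against a datanode's
# on-disk DB while the cluster is down. Path and command are assumed.
docker run --rm -v "$OZONE_VOLUME"/dn1:/data apache/ozone-runner \
  ldb --db=/data/metadata/container.db scan
```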
How was this patch tested?
Executed the `upgrade` acceptance test locally and on GitHub.

https://github.com/adoroszlai/hadoop-ozone/runs/815608054
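For local runs, the layout presumably follows the other compose-based Ozone acceptance tests, each of which ships a `test.sh` entry point (the exact path below is an assumption):

```bash
# Assuming the standard Ozone compose test layout.
cd hadoop-ozone/dist/target/ozone-*/compose/upgrade
./test.sh
```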