Skip to content

Conversation

@elek
Copy link
Member

@elek elek commented Mar 29, 2021

What changes were proposed in this pull request?

This patch introduces the usage of ReplicationConfig on mater instead of using replicationType and replicationFactor. It is required by EC (see #2068 and #1973)

Using java based ReplicationFactor requires further changes in SCM (and OM) and we decidede to make the generic part of the refactor on master (it doesn't introduce any new logic only the java representation of the data is changing)

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-5011

How was this patch tested?

It contains unit tests. It introduces only new Java classes, therefore it's safe to include.


private final ReplicationFactor replicationFactor;

public StandaloneReplicationConfig(ReplicationFactor replicationFactor) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is Standalone by definition factor 1? Should we drop the replicationFactor parameter and just hardcode Factor.ONE ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good point. I agree that STANDALONE/THREE support should be removed if it's not available.

However, I am not sure about the backward compatibility: let's say I already have a STANDALONE/THREE key persisted in OM rocksdb. If we don't allow THREE for STANDALONE it means that we will change the persisted THREE to ONE during de-serialization.

I am not sure if it's good or not, but can be confusing.

What do you think about this problem?

I don't have a strong opinion. If we remove STANDALONE/THREE support, it seems to be a bigger change (we need to implement proper validation, fix unit tests where STANDALONE/THREE is used, etc. But I am closer to your suggestion, to do this cleanup together with the other changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking about it, and I think we may have STANDALONE/THREE support. I think we upload the key only to one server but the key may be replicated to multiple nodes by ReplicationManager when the container is closed...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not realise we already had an option for a STANDALONE/THREE. I thought it was always STANDALONE/ONE or RATIS/THREE. If we already have STANDALONE/THREE then I believe you are correct and we need to keep the option to keep the factor.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think even have RATIS/ONE is a valid option. But I agree, it's quite confusing.

This is something what we can start to fix together with EC as we need to modify the client to define the replication config based on the replication type. For example in case of replicationType=EC the 3:2 can be a valid replication config, while replicationType=STANDALONE we may accept only ONE or ONE/THREE.

I think this kind of validation is definitely something what we should do...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Changing options for existing things should be discussed if we really want to do that. Let's just scope this JIRA to ad ReplicationConfig.

Copy link
Contributor

@sodonnel sodonnel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, +1.

Copy link
Contributor

@umamaheswararao umamaheswararao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


private final ReplicationFactor replicationFactor;

public StandaloneReplicationConfig(ReplicationFactor replicationFactor) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Changing options for existing things should be discussed if we really want to do that. Let's just scope this JIRA to ad ReplicationConfig.

@elek
Copy link
Member Author

elek commented Apr 7, 2021

Merging it. Thanks the review @sodonnel and @umamaheswararao

@elek elek merged commit ee4f759 into apache:master Apr 7, 2021
errose28 added a commit to errose28/ozone that referenced this pull request Apr 7, 2021
* HDDS-3698-nonrolling-upgrade:
  HDDS-5056. Avoid false positiver error messages during pipeline creations (apache#2105)
  HDDS-5027. [SCM HA Security] Handle leader changes during bootstrap. (apache#2113)
  HDDS-5032. Fix findbugs (apache#2120)
  HDDS-5062. Add a config to bypass clusterId validation for bootstrapping SCM. (apache#2114)
  HDDS-5011. Introduce Java based ReplicationConfig implementation (apache#2089)
  HDDS-4925. Introduce ContainerBalancer in SCM with start/stop capabilities. (apache#2097)
errose28 added a commit to errose28/ozone that referenced this pull request Apr 7, 2021
* HDDS-3698-nonrolling-upgrade:
  HDDS-5056. Avoid false positiver error messages during pipeline creations (apache#2105)
  HDDS-5027. [SCM HA Security] Handle leader changes during bootstrap. (apache#2113)
  HDDS-5032. Fix findbugs (apache#2120)
  HDDS-5062. Add a config to bypass clusterId validation for bootstrapping SCM. (apache#2114)
  HDDS-5011. Introduce Java based ReplicationConfig implementation (apache#2089)
  HDDS-4925. Introduce ContainerBalancer in SCM with start/stop capabilities. (apache#2097)
errose28 added a commit to errose28/ozone that referenced this pull request Apr 9, 2021
* HDDS-3698-nonrolling-upgrade: (150 commits)
  HDDS-5056. Avoid false positiver error messages during pipeline creations (apache#2105)
  HDDS-5027. [SCM HA Security] Handle leader changes during bootstrap. (apache#2113)
  HDDS-5032. Fix findbugs (apache#2120)
  HDDS-5062. Add a config to bypass clusterId validation for bootstrapping SCM. (apache#2114)
  HDDS-5011. Introduce Java based ReplicationConfig implementation (apache#2089)
  HDDS-4925. Introduce ContainerBalancer in SCM with start/stop capabilities. (apache#2097)
  fix project name in NOTICE.txt (apache#2112)
  HDDS-5066. Use fixed vesion from pnpm to build recon (apache#2115)
  HDDS-5014. Add non-rolling upgrade design docs.
  HDDS-5035. Use default config values to solve generated config file conflict (apache#2087)
  HDDS-5032. DN stopped to load containers on volume after a container load exception. (apache#2109)
  HDDS-4504. Datanode deletion config should be based on number of blocks (apache#1885)
  Fix ozone-ha acceptance test.
  HDDS-5058. Make getScmInfo retry for a duration.
  HDDS-4506. Support query parameter based v4 auth in S3g (apache#1628)
  HDDS-4553. ChunkInputStream should release buffer as soon as last byte in the buffer is read (apache#2062)
  HDDS-5022. SCM get roles command should provide Ratis Leader/Follower… (apache#2098)
  HDDS-5033. SCM may not be able to know full port list of Datanode after Datanode is started. (apache#2090)
  HDDS-3752. Fix o3fs list bucket contents issue when without tailing "/" (apache#2088)
  HDDS-4901. Remove OmOzoneAclMap from OmVolumeArgs to avoid OzoneAcl conversions (apache#1992)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants