Skip to content

Conversation

@xichen01
Copy link
Contributor

@xichen01 xichen01 commented Aug 22, 2024

What changes were proposed in this pull request?

Unify protobuf definition of storageType.

Context

  • We have multiple storageType enum definition in different protobuf files (ScmServerDatanodeHeartbeatProtocol.proto#StorageTypeProto and OmClientProtocol.proto#StorageTypeProto).
  • For the future, we can more easily use the same storageType protobuf definition in multiple places. We need to remove the existing storageType protobuf and define a new storageType definition in hdds.proto.
  • Most of the protobuf definitions that need to be imported in our project are located in hdds.proto, so we should to define storageType enum in hdds.proto

Modifications

  • Upgrade proto2.hadooprpc.protobuf.version form 2.5.0 to 2.6.1, the proto 2.6.1 start supports the enum deprecated syntax.
  • Mark ScmServerDatanodeHeartbeatProtocol.proto#StorageTypeProto and OmClientProtocol.proto#StorageTypeProto as deprecated = true.
  • Add a new storageType enum definition hdds.proto#StorageType, and replace ScmServerDatanodeHeartbeatProtocol.proto#StorageTypeProto, and restrict import of ScmServerDatanodeHeartbeatProtocol (by maven plugin restrict-imports-enforcer-rule)

The OmBucketInfo#storageType will be removed in the future, because Buckets should not have a StorageType field, and should use StoragePolicy instead of StorageType

Compatibility

  • In StorageLocationReport, StorageType is only used in UsageMetrics. Version changes may cause this indicator to be inaccurate, but it will not affect the main business process.
  • ScmServerDatanodeHeartbeatProtocol#StorageTypeProto.storageType (id 2) default value is DISK, so deprecate it will not cause NPE issues

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-11351

How was this patch tested?

existing test.

@xichen01
Copy link
Contributor Author

@szetszwo, @devabhishekpal Could you help to review this?

@xichen01
Copy link
Contributor Author

There are some compatibility issues If we upgrade the protobuf version from 2.5.0 to 2.6.1, this may introduce some risks, so let we keep the protobuf version unchanged.
We can use maven plugin restrict-imports-enforcer-rule to disable the "deprecated" enum.

@adoroszlai adoroszlai requested review from errose28 and szetszwo and removed request for devmadhuu October 17, 2024 15:56
Copy link
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xichen01 , thanks for working on this! Please see my suggestions inlined.

ZERO = 0; // Invalid Factor
}

enum StorageType {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid confusing with org.apache.hadoop.hdds.protocol.StorageType, let's use StorageTypeProto?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both hdds.proto and ScmServerDatanodeHeartbeatProtocol.proto are package hadoop.hdds; so we cannot name it to StorageTypeProto

Copy link
Contributor

@szetszwo szetszwo Mar 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since they are in the same package, we could safely move the StorageTypeProto from ScmServerDatanodeHeartbeatProtocol.proto to hdds.proto. The only difference is the java_outer_classname. Fortunately, ScmServerDatanodeHeartbeatProtocol.proto is a non-user facing internal protocol and the change is wire compatible. So there is no compatibility issues. See https://issues.apache.org/jira/secure/attachment/13075616/7109_review.patch

BTW, let's rename ScmServerDatanodeHeartbeatProtocol.proto to StorageContainerDatanodeProtocol.proto, i.e. make it consistent with the java_outer_classname.

optional uint64 scmUsed = 4 [default = 0];
optional uint64 remaining = 5 [default = 0];
optional StorageTypeProto storageType = 6 [default = DISK];
optional StorageTypeProto storageType = 6 [default = DISK, deprecated = true];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to be safe to just replace StorageTypeProto with Hdds.StorageTypeProto.

We should test it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your suggestion, I think this is compatibility, but the maven plugin proto-backwards-compatibility will report error say the "field CONFLICT".

And the plugin proto-backwards-compatibility not support to ignore a specific field.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will also cause proto-backwards-compatibility to report an error. It seems that it is difficult for us to change the existing proto message

[INFO] --- proto-backwards-compatibility:1.0.7:backwards-compatibility-check (default) @ hdds-interface-server ---
[INFO] protolock cmd line: /Users/xichen/community/ozone/hadoop-hdds/interface-server/target/protolock-bin/protolock status --lockdir=/Users/xichen/community/ozone/hadoop-hdds/interface-server/target/classes --protoroot=/Users/xichen/community/ozone/hadoop-hdds/interface-server/target/classes
[INFO] CONFLICT: "StorageTypeProto" field: "ARCHIVE" has been removed, but is not reserved [ScmServerDatanodeHeartbeatProtocol.proto]
[INFO] CONFLICT: "StorageTypeProto" field: "DISK" has been removed, but is not reserved [ScmServerDatanodeHeartbeatProtocol.proto]
[INFO] CONFLICT: "StorageTypeProto" field: "PROVIDED" has been removed, but is not reserved [ScmServerDatanodeHeartbeatProtocol.proto]
[INFO] CONFLICT: "StorageTypeProto" field: "RAM_DISK" has been removed, but is not reserved [ScmServerDatanodeHeartbeatProtocol.proto]
[INFO] CONFLICT: "StorageTypeProto" field: "SSD" has been removed, but is not reserved [ScmServerDatanodeHeartbeatProtocol.proto]
[INFO] CONFLICT: "StorageTypeProto" integer: "1" has been removed, but is not reserved [ScmServerDatanodeHeartbeatProtocol.proto]
[INFO] CONFLICT: "StorageTypeProto" integer: "2" has been removed, but is not reserved [ScmServerDatanodeHeartbeatProtocol.proto]
[INFO] CONFLICT: "StorageTypeProto" integer: "3" has been removed, but is not reserved [ScmServerDatanodeHeartbeatProtocol.proto]
[INFO] CONFLICT: "StorageTypeProto" integer: "4" has been removed, but is not reserved [ScmServerDatanodeHeartbeatProtocol.proto]
[INFO] CONFLICT: "StorageTypeProto" integer: "5" has been removed, but is not reserved [ScmServerDatanodeHeartbeatProtocol.proto]

@adoroszlai adoroszlai marked this pull request as draft February 15, 2025 16:29
adoroszlai and others added 2 commits February 18, 2025 17:54
# Conflicts:
#	hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/SCMNodeManager.java
#	pom.xml
@xichen01
Copy link
Contributor Author

@szetszwo PTAL.

@szetszwo
Copy link
Contributor

@xichen01 , could you take my previous review comments #7109 (review) ?

Copy link
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xichen01 , thanks for taking a look. I have some suggestions. Please see the comments inlined.

ZERO = 0; // Invalid Factor
}

enum StorageType {
Copy link
Contributor

@szetszwo szetszwo Mar 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since they are in the same package, we could safely move the StorageTypeProto from ScmServerDatanodeHeartbeatProtocol.proto to hdds.proto. The only difference is the java_outer_classname. Fortunately, ScmServerDatanodeHeartbeatProtocol.proto is a non-user facing internal protocol and the change is wire compatible. So there is no compatibility issues. See https://issues.apache.org/jira/secure/attachment/13075616/7109_review.patch

BTW, let's rename ScmServerDatanodeHeartbeatProtocol.proto to StorageContainerDatanodeProtocol.proto, i.e. make it consistent with the java_outer_classname.

optional uint64 scmUsed = 4 [default = 0];
optional uint64 remaining = 5 [default = 0];
optional StorageTypeProto storageType = 6 [default = DISK];
optional StorageTypeProto storageType = 6 [default = DISK, deprecated = true];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@szetszwo
Copy link
Contributor

@xichen01 , Currently, we are voting 2.0.0. How about getting this in 2.0.0? Then, we can legitimately update the protolock file.

Cc @jojochuang , @adoroszlai

@xichen01
Copy link
Contributor Author

@xichen01 , Currently, we are voting 2.0.0. How about getting this in 2.0.0? Then, we can legitimately update the protolock file.

Cc @jojochuang , @adoroszlai

It looks fine to me, and if we do it, do we need to modify both proto field and protolock in one PR?

@szetszwo
Copy link
Contributor

... both proto field and protolock in one PR?

I believe the answer is yes, although it is not mentioned in the release guideline

Could you update this PR? We should get this merged soon.

Comment on lines +311 to +315
DISK_TYPE = 1;
SSD_TYPE = 2;
ARCHIVE_TYPE = 3;
RAM_DISK_TYPE = 4;
PROVIDED_TYPE = 5;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't recall who, but back in HDFS land after storage policy feature was implemented, someone (Arpit or Anu?) regrettably said it was a mistake to have these types continuous, because it wasn't possible to add any storage types in between. It might be a good idea to space them out.

@szetszwo do you recall?

Copy link
Contributor

@szetszwo szetszwo Apr 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also don't recall.

... it was impossible to insert any new storage types in between. ...

Not really -- we may use 1.1 if there is a need.

Also, why it has to insert in between? We may put the new policy at the end. I don't think Storage policy has a total ordering.

repeated OzoneAclInfo acls = 3;
required bool isVersionEnabled = 4 [default = false];
required StorageTypeProto storageType = 5 [default = DISK];
optional StorageTypeProto storageType = 5 [default = DISK, deprecated = true];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this change from required to optional backward compatible?
Imagine a new OM server and an old client. The new server omits this field while client expects this field. Wouldn't the client fail to parse?

/**
* Ozone specific storage types.
*/
@Deprecated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it really deprecated? If so, what replaces it?

@szetszwo
Copy link
Contributor

szetszwo commented Apr 1, 2025

@jojochuang , please review #8208 instead. We have a slightly different approach now.

@xichen01 , let's close this in order to avoid confusion?

@adoroszlai
Copy link
Contributor

Superseded by 8208.

@adoroszlai adoroszlai closed this Apr 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants