Closed
Changes from all commits
56 commits
95739ea
HDDS-3185 Construct a standalone ratis server for SCM. (#720)
timmylicheng Mar 27, 2020
31c7386
HDDS-3187 Construct SCM StateMachine. (#819)
timmylicheng Apr 17, 2020
8f2107a
Resolve conflicts with merge from master.
timmylicheng Apr 30, 2020
aa2884c
HDDS-3556 Refactor conf in SCMRatisServer to Java-based conf. (#907)
timmylicheng May 20, 2020
1f3ef36
HDDS-3186. Introduce generic SCMRatisRequest and SCMRatisResponse. (#…
nandakumar131 May 26, 2020
30e1751
HDDS-3192. Handle AllocateContainer operation for HA. (#975)
nandakumar131 May 28, 2020
c836720
HDDS-3196 New PipelineManager interface to persist to RatisServer. (#…
timmylicheng Jun 1, 2020
988b23a
HDDS-3693 Switch to new StateManager interface. (#1007)
timmylicheng Jun 3, 2020
5355939
HDDS-3711. Handle inner classes in SCMRatisRequest and SCMRatisRespon…
nandakumar131 Jun 4, 2020
8e86480
HDDS-3679 Add tests for PipelineManager V2. (#1019)
timmylicheng Jun 15, 2020
8d74c0c
HDDS-3652 Add test for SCMRatisResponse. (#1113)
timmylicheng Jun 25, 2020
3e7c427
Merge branch 'master' into HDDS-2823
nandakumar131 Jun 26, 2020
82c30a4
Merge branch 'master' into HDDS-2823
nandakumar131 Jun 27, 2020
7287e1d
HDDS-3651 Add tests for SCMRatisRequest. (#1112)
timmylicheng Jun 28, 2020
144f9a8
HDDS-3911. Compile error in acceptance test on HDDS-2823 (#1157)
adoroszlai Jul 3, 2020
565dabc
HDDS-3662 Decouple finalizeAndDestroyPipeline. (#1049)
timmylicheng Jul 10, 2020
8a8c9eb
HDDS-3191: switch from SCMPipelineManager to PipelineManagerV2Impl (#…
Jul 14, 2020
40127b3
Merge branch 'master' into HDDS-2823
timmylicheng Jul 15, 2020
6de98c6
Merge branch 'master' into HDDS-2823
nandakumar131 Oct 24, 2020
58394eb
HDDS-3837. Add isLeader check in SCMHAManager.
timmylicheng Oct 24, 2020
3ed29d8
HDDS-4059. SCMStateMachine::applyTransaction() should not invoke Tran…
Oct 24, 2020
d482abf
HDDS-4125. Pipeline is not removed when a datanode goes stale.
Oct 24, 2020
a70964e
HDDS-4130. remove the 1st edition of RatisServer of SCM HA which is c…
Oct 24, 2020
9e0dd84
HDDS-3895. Implement container related operations in ContainerManager…
nandakumar131 Oct 24, 2020
5f3981c
HDDS-4115. CLI command to show current SCM leader and follower status.
amaliujia Oct 24, 2020
9f7ab46
HDDS-3188. Add failover proxy for SCM block location.
timmylicheng Oct 24, 2020
5111126
HDDS-4192. enable SCM Raft Group based on config ozone.scm.names.
Oct 24, 2020
43b87fe
HDDS-4365. SCMBlockLocationFailoverProxyProvider should use ScmBlockL…
Oct 24, 2020
782057a
Resolving master merge conflict.
nandakumar131 Oct 25, 2020
44a6503
HDDS-4393. Addressing test failures after master merge. (#1587)
nandakumar131 Nov 16, 2020
517358b
HDDS-4488. Open RocksDB read only when loading containers at Datanode…
sodonnel Nov 20, 2020
49cd3ec
HDDS-4432. Update Ratis version to latest snapshot. (#1586)
hanishakoneru Nov 20, 2020
fd879be
HDDS-4476. Improve the ZH translation of the HA.md in doc. (#1597)
yuyang733 Nov 22, 2020
f71fc12
HDDS-4417. Simplify Ozone client code with configuration object -- ad…
adoroszlai Nov 23, 2020
6cc4a43
HDDS-4468. Fix Goofys listBucket large than 1000 objects will stuck f…
maobaolong Nov 23, 2020
51ffc82
HDDS-4497. Recon File Size Count task throws SQL Exception. (#1612)
avijayanhwx Nov 24, 2020
1b2f2ef
HDDS-3689. Add various profiles to MiniOzoneChaosCluster to run diffe…
mukul1987 Nov 24, 2020
a9ff68a
HDDS-4492. CLI flag --quota should default to 'spaceQuota' to preserv…
ayushtkn Nov 24, 2020
4b69f08
HDDS-4501. Reload OM State fail should terminate OM for any exception…
bharatviswa504 Nov 24, 2020
1a304ba
HDDS-4392. [DOC] Add Recon architecture to docs (#1602)
vivekratnavel Nov 25, 2020
54cca0b
HDDS-4308. Fix issue with quota update (#1489)
captainzmc Nov 25, 2020
a4cd12c
HDDS-4471. GrpcOutputStream length can overflow (#1617)
wycccccc Nov 25, 2020
fdb373f
HDDS-4487. SCM can avoid using RETRIABLE_DATANODE_COMMAND for datanod…
aryangupta1998 Nov 25, 2020
d83ec1a
HDDS-4481. With HA OM can send deletion blocks to SCM multiple times.…
bharatviswa504 Nov 25, 2020
1235430
HDDS-4512. Remove unused netty3 transitive dependency (#1627)
elek Nov 25, 2020
43fdd71
HDDS-4370. Datanode deletion service can avoid storing deleted blocks…
aryangupta1998 Nov 26, 2020
5704341
HDDS-3363. Intermittent failure in testContainerImportExport (#1618)
adoroszlai Nov 26, 2020
143f076
HDDS-4510. SCM can avoid creating RetriableDatanodeEventWatcher for d…
aryangupta1998 Nov 26, 2020
130ba4d
HDDS-4511: Avoiding StaleNodeHandler to take effect in TestDeleteWith…
Nov 27, 2020
f30bc4e
Merge branch 'master' into HDDS-2823
nandakumar131 Nov 30, 2020
285d793
HDDS-4191 Add failover proxy for SCM container location. (#1514)
timmylicheng Dec 1, 2020
48b9809
HDDS-4538: Workaround on HDDS-2823, hard code scmUuid and clusterID. …
Dec 2, 2020
0801203
HDDS-4542. Need throw exception to trigger FailoverProxyProvider of S…
Dec 4, 2020
0aa9ba3
HDDS-3988: DN can distinguish SCMCommand from stale leader SCM (#1314)
Dec 7, 2020
34c393c
HDDS-4551: Remove checkLeader in PipelineManager. (#1658)
Dec 8, 2020
8a84b03
HDDS-4575: Refactor SCMHAManager and SCMRatisServer with RaftServer.D…
Dec 10, 2020
5 changes: 0 additions & 5 deletions hadoop-hdds/client/pom.xml
Original file line number Diff line number Diff line change
@@ -51,11 +51,6 @@ https://maven.apache.org/xsd/maven-4.0.0.xsd">
<scope>test</scope>
</dependency>

<dependency>
<groupId>io.netty</groupId>
<artifactId>netty-all</artifactId>
</dependency>

<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdds-hadoop-dependency-test</artifactId>
Original file line number Diff line number Diff line change
@@ -21,6 +21,7 @@
import org.apache.hadoop.hdds.conf.ConfigGroup;
import org.apache.hadoop.hdds.conf.ConfigTag;
import org.apache.hadoop.hdds.conf.ConfigType;
import org.apache.hadoop.hdds.conf.PostConstruct;
import org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos.ChecksumType;
import org.apache.hadoop.ozone.OzoneConfigKeys;

@@ -111,9 +112,7 @@ public class OzoneClientConfig {
tags = ConfigTag.CLIENT)
private boolean checksumVerify = true;

public OzoneClientConfig() {
}

@PostConstruct
private void validate() {
Preconditions.checkState(streamBufferSize > 0);
Preconditions.checkState(streamBufferFlushSize > 0);
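For context on the hunk above: the explicit no-arg constructor is dropped and validate() is now annotated with @PostConstruct, so the hdds config framework runs it after injecting the field values. A minimal sketch of how such a config object is typically obtained; the property values themselves come from ozone-site.xml and are not part of this change:

    // Sketch, assuming the standard hdds config-framework entry point
    // OzoneConfiguration#getObject: the framework populates the @Config fields
    // and then invokes methods annotated with @PostConstruct, so validate()
    // throws early if a buffer size is mis-configured.
    OzoneConfiguration conf = new OzoneConfiguration();
    OzoneClientConfig clientConfig = conf.getObject(OzoneClientConfig.class);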
Original file line number Diff line number Diff line change
@@ -217,12 +217,12 @@ private CompletableFuture<RaftClientReply> sendRequestAsync(
if (LOG.isDebugEnabled()) {
LOG.debug("sendCommandAsync ReadOnly {}", message);
}
return getClient().sendReadOnlyAsync(message);
return getClient().async().sendReadOnly(message);
} else {
if (LOG.isDebugEnabled()) {
LOG.debug("sendCommandAsync {}", message);
}
return getClient().sendAsync(message);
return getClient().async().send(message);
}

}
@@ -258,17 +258,17 @@ public XceiverClientReply watchForCommit(long index)
}
RaftClientReply reply;
try {
CompletableFuture<RaftClientReply> replyFuture = getClient()
.sendWatchAsync(index, RaftProtos.ReplicationLevel.ALL_COMMITTED);
CompletableFuture<RaftClientReply> replyFuture = getClient().async()
.watch(index, RaftProtos.ReplicationLevel.ALL_COMMITTED);
replyFuture.get();
} catch (Exception e) {
Throwable t = HddsClientUtils.checkForException(e);
LOG.warn("3 way commit failed on pipeline {}", pipeline, e);
if (t instanceof GroupMismatchException) {
throw e;
}
reply = getClient()
.sendWatchAsync(index, RaftProtos.ReplicationLevel.MAJORITY_COMMITTED)
reply = getClient().async()
.watch(index, RaftProtos.ReplicationLevel.MAJORITY_COMMITTED)
.get();
List<RaftProtos.CommitInfoProto> commitInfoProtoList =
reply.getCommitInfos().stream()
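The two hunks above migrate XceiverClientRatis from the per-method RaftClient calls (sendAsync, sendReadOnlyAsync, sendWatchAsync) to the grouped RaftClient.async() API exposed by newer Ratis versions. A minimal sketch of the new call shapes; the message and index variables are placeholders:

    // Assumes an already-built org.apache.ratis.client.RaftClient instance.
    RaftClient client = getClient();
    CompletableFuture<RaftClientReply> write = client.async().send(message);
    CompletableFuture<RaftClientReply> read = client.async().sendReadOnly(message);
    CompletableFuture<RaftClientReply> watch =
        client.async().watch(index, RaftProtos.ReplicationLevel.MAJORITY_COMMITTED);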
Original file line number Diff line number Diff line change
@@ -104,12 +104,18 @@ public static RaftPeerId toRaftPeerId(DatanodeDetails id) {
}

public static RaftPeer toRaftPeer(DatanodeDetails id) {
return new RaftPeer(toRaftPeerId(id), toRaftPeerAddressString(id));
return RaftPeer.newBuilder()
.setId(toRaftPeerId(id))
.setAddress(toRaftPeerAddressString(id))
.build();
}

public static RaftPeer toRaftPeer(DatanodeDetails id, int priority) {
return new RaftPeer(
toRaftPeerId(id), toRaftPeerAddressString(id), priority);
return RaftPeer.newBuilder()
.setId(toRaftPeerId(id))
.setAddress(toRaftPeerAddressString(id))
.setPriority(priority)
.build();
}

private static List<RaftPeer> toRaftPeers(Pipeline pipeline) {
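For reference, a short sketch of the RaftPeer builder pattern this hunk switches to, which newer Ratis versions use in place of the RaftPeer constructors; the id, address and priority values below are placeholders:

    RaftPeer peer = RaftPeer.newBuilder()
        .setId(RaftPeerId.valueOf("dn-1"))          // placeholder peer id
        .setAddress("datanode1.example.com:9858")   // placeholder Ratis address
        .setPriority(1)                             // optional; used by leader election
        .build();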
Original file line number Diff line number Diff line change
@@ -273,6 +273,16 @@ public final class ScmConfigKeys {
// able to send back a new list to the datanodes.
public static final String OZONE_SCM_NAMES = "ozone.scm.names";

public static final String OZONE_SCM_INTERNAL_SERVICE_ID =
"ozone.scm.internal.service.id";

public static final String OZONE_SCM_SERVICE_IDS_KEY =
"ozone.scm.service.ids";
public static final String OZONE_SCM_NODES_KEY =
"ozone.scm.nodes";
public static final String OZONE_SCM_NODE_ID_KEY =
"ozone.scm.node.id";

public static final int OZONE_SCM_DEFAULT_PORT =
OZONE_SCM_DATANODE_PORT_DEFAULT;
// The path where datanode ID is to be written to.
@@ -364,6 +374,83 @@ public final class ScmConfigKeys {
public static final String HDDS_TRACING_ENABLED = "hdds.tracing.enabled";
public static final boolean HDDS_TRACING_ENABLED_DEFAULT = false;

// SCM Ratis related
public static final String OZONE_SCM_HA_ENABLE_KEY
= "ozone.scm.ratis.enable";
public static final boolean OZONE_SCM_HA_ENABLE_DEFAULT
= false;
public static final String OZONE_SCM_RATIS_PORT_KEY
= "ozone.scm.ratis.port";
public static final int OZONE_SCM_RATIS_PORT_DEFAULT
= 9864;
public static final String OZONE_SCM_RATIS_RPC_TYPE_KEY
= "ozone.scm.ratis.rpc.type";
public static final String OZONE_SCM_RATIS_RPC_TYPE_DEFAULT
= "GRPC";

// SCM Ratis Log configurations
public static final String OZONE_SCM_RATIS_STORAGE_DIR
= "ozone.scm.ratis.storage.dir";
public static final String OZONE_SCM_RATIS_SEGMENT_SIZE_KEY
= "ozone.scm.ratis.segment.size";
public static final String OZONE_SCM_RATIS_SEGMENT_SIZE_DEFAULT
= "16KB";
public static final String OZONE_SCM_RATIS_SEGMENT_PREALLOCATED_SIZE_KEY
= "ozone.scm.ratis.segment.preallocated.size";
public static final String OZONE_SCM_RATIS_SEGMENT_PREALLOCATED_SIZE_DEFAULT
= "16KB";

// SCM Ratis Log Appender configurations
public static final String
OZONE_SCM_RATIS_LOG_APPENDER_QUEUE_NUM_ELEMENTS =
"ozone.scm.ratis.log.appender.queue.num-elements";
public static final int
OZONE_SCM_RATIS_LOG_APPENDER_QUEUE_NUM_ELEMENTS_DEFAULT = 1024;
public static final String OZONE_SCM_RATIS_LOG_APPENDER_QUEUE_BYTE_LIMIT =
"ozone.scm.ratis.log.appender.queue.byte-limit";
public static final String
OZONE_SCM_RATIS_LOG_APPENDER_QUEUE_BYTE_LIMIT_DEFAULT = "32MB";
public static final String OZONE_SCM_RATIS_LOG_PURGE_GAP =
"ozone.scm.ratis.log.purge.gap";
public static final int OZONE_SCM_RATIS_LOG_PURGE_GAP_DEFAULT = 1000000;

// SCM Ratis server configurations
public static final String OZONE_SCM_RATIS_SERVER_REQUEST_TIMEOUT_KEY
= "ozone.scm.ratis.server.request.timeout";
public static final TimeDuration
OZONE_SCM_RATIS_SERVER_REQUEST_TIMEOUT_DEFAULT
= TimeDuration.valueOf(3000, TimeUnit.MILLISECONDS);
public static final String
OZONE_SCM_RATIS_SERVER_RETRY_CACHE_TIMEOUT_KEY
= "ozone.scm.ratis.server.retry.cache.timeout";
public static final TimeDuration
OZONE_SCM_RATIS_SERVER_RETRY_CACHE_TIMEOUT_DEFAULT
= TimeDuration.valueOf(600000, TimeUnit.MILLISECONDS);
public static final String OZONE_SCM_RATIS_MINIMUM_TIMEOUT_KEY
= "ozone.scm.ratis.minimum.timeout";
public static final TimeDuration OZONE_SCM_RATIS_MINIMUM_TIMEOUT_DEFAULT
= TimeDuration.valueOf(1, TimeUnit.SECONDS);

// SCM Ratis Leader Election configurations
public static final String
OZONE_SCM_LEADER_ELECTION_MINIMUM_TIMEOUT_DURATION_KEY =
"ozone.scm.ratis.leader.election.minimum.timeout.duration";
public static final TimeDuration
OZONE_SCM_LEADER_ELECTION_MINIMUM_TIMEOUT_DURATION_DEFAULT =
TimeDuration.valueOf(1, TimeUnit.SECONDS);
public static final String OZONE_SCM_RATIS_SERVER_FAILURE_TIMEOUT_DURATION_KEY
= "ozone.scm.ratis.server.failure.timeout.duration";
public static final TimeDuration
OZONE_SCM_RATIS_SERVER_FAILURE_TIMEOUT_DURATION_DEFAULT
= TimeDuration.valueOf(120, TimeUnit.SECONDS);

// SCM Leader server role check interval
public static final String OZONE_SCM_RATIS_SERVER_ROLE_CHECK_INTERVAL_KEY
= "ozone.scm.ratis.server.role.check.interval";
public static final TimeDuration
OZONE_SCM_RATIS_SERVER_ROLE_CHECK_INTERVAL_DEFAULT
= TimeDuration.valueOf(15, TimeUnit.SECONDS);

/**
* Never constructed.
*/
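As a rough illustration of the new SCM Ratis keys above (not part of this change), setting them programmatically might look like the following; the storage directory is a placeholder and the other values are just the defaults listed above:

    OzoneConfiguration conf = new OzoneConfiguration();
    conf.setBoolean(ScmConfigKeys.OZONE_SCM_HA_ENABLE_KEY, true);    // default: false
    conf.setInt(ScmConfigKeys.OZONE_SCM_RATIS_PORT_KEY, 9864);
    conf.set(ScmConfigKeys.OZONE_SCM_RATIS_RPC_TYPE_KEY, "GRPC");
    conf.set(ScmConfigKeys.OZONE_SCM_RATIS_STORAGE_DIR,
        "/var/lib/ozone/scm/ratis");                                 // placeholder path
    conf.set(ScmConfigKeys.OZONE_SCM_RATIS_SEGMENT_SIZE_KEY, "16KB");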
Original file line number Diff line number Diff line change
@@ -18,20 +18,29 @@

package org.apache.hadoop.hdds.scm;

import java.util.ArrayList;
import java.util.List;

/**
* ScmInfo wraps the result returned from SCM#getScmInfo which
* contains clusterId and the SCM Id.
*/
public final class ScmInfo {
private String clusterId;
private String scmId;
private List<String> peerRoles;

/**
* Builder for ScmInfo.
*/
public static class Builder {
private String clusterId;
private String scmId;
private List<String> peerRoles;

public Builder() {
peerRoles = new ArrayList<>();
}

/**
* sets the cluster id.
@@ -53,14 +62,25 @@ public Builder setScmId(String id) {
return this;
}

/**
* Sets the Ratis peer addresses for SCM HA.
* @param roles Ratis peer addresses in the format [ip|hostname]:port
* @return Builder for ScmInfo
*/
public Builder setRatisPeerRoles(List<String> roles) {
peerRoles.addAll(roles);
return this;
}

public ScmInfo build() {
return new ScmInfo(clusterId, scmId);
return new ScmInfo(clusterId, scmId, peerRoles);
}
}

private ScmInfo(String clusterId, String scmId) {
private ScmInfo(String clusterId, String scmId, List<String> peerRoles) {
this.clusterId = clusterId;
this.scmId = scmId;
this.peerRoles = peerRoles;
}

/**
@@ -78,4 +98,12 @@ public String getClusterId() {
public String getScmId() {
return scmId;
}

/**
* Gets the list of peer roles (currently their addresses) in SCM HA.
* @return list of peer addresses
*/
public List<String> getRatisPeerRoles() {
return peerRoles;
}
}
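A small usage sketch of the extended builder and the new accessor; the ids and addresses are placeholders (requires java.util.Arrays and java.util.List):

    ScmInfo scmInfo = new ScmInfo.Builder()
        .setClusterId("CID-placeholder")
        .setScmId("SCM-placeholder")
        .setRatisPeerRoles(Arrays.asList(
            "scm1.example.com:9864", "scm2.example.com:9864"))
        .build();
    List<String> peers = scmInfo.getRatisPeerRoles();  // the two addresses above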
Original file line number Diff line number Diff line change
@@ -246,5 +246,8 @@ Map<String, Pair<Boolean, String>> getSafeModeRuleStatuses()
*/
boolean getReplicationManagerStatus() throws IOException;


/**
* Returns the list of Ratis peer roles. Currently this only includes the peer addresses.
*/
List<String> getScmRatisRoles() throws IOException;
}
Original file line number Diff line number Diff line change
@@ -23,6 +23,7 @@
import org.apache.commons.lang3.builder.CompareToBuilder;
import org.apache.commons.lang3.builder.EqualsBuilder;
import org.apache.commons.lang3.builder.HashCodeBuilder;
import org.apache.hadoop.hdds.protocol.proto.HddsProtos;

/**
* Container ID is an integer that is a value between 1..MAX_CONTAINER ID.
@@ -34,13 +35,14 @@ public final class ContainerID implements Comparable<ContainerID> {

private final long id;

// TODO: make this private.
/**
* Constructs ContainerID.
*
* @param id int
*/
public ContainerID(long id) {
private ContainerID(long id) {
Preconditions.checkState(id > 0,
"Container ID should be a positive. %s.", id);
this.id = id;
}

@@ -49,9 +51,7 @@ public ContainerID(long id) {
* @param containerID long
* @return ContainerID.
*/
public static ContainerID valueof(final long containerID) {
Preconditions.checkState(containerID > 0,
"Container ID should be a positive long. "+ containerID);
public static ContainerID valueOf(final long containerID) {
return new ContainerID(containerID);
}

@@ -60,14 +60,30 @@ public static ContainerID valueof(final long containerID) {
*
* @return int
*/
@Deprecated
/*
* Don't expose the int value.
*/
public long getId() {
return id;
}

/**
* Use proto message.
*/
@Deprecated
public byte[] getBytes() {
return Longs.toByteArray(id);
}

public HddsProtos.ContainerID getProtobuf() {
return HddsProtos.ContainerID.newBuilder().setId(id).build();
}

public static ContainerID getFromProtobuf(HddsProtos.ContainerID proto) {
return ContainerID.valueOf(proto.getId());
}

@Override
public boolean equals(final Object o) {
if (this == o) {
@@ -81,22 +97,22 @@ public boolean equals(final Object o) {
final ContainerID that = (ContainerID) o;

return new EqualsBuilder()
.append(getId(), that.getId())
.append(id, that.id)
.isEquals();
}

@Override
public int hashCode() {
return new HashCodeBuilder(61, 71)
.append(getId())
.append(id)
.toHashCode();
}

@Override
public int compareTo(final ContainerID that) {
Preconditions.checkNotNull(that);
return new CompareToBuilder()
.append(this.getId(), that.getId())
.append(this.id, that.id)
.build();
}

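For illustration, round-tripping an id through the new factory and protobuf helpers introduced above; the numeric value is arbitrary:

    ContainerID id = ContainerID.valueOf(42L);            // replaces the public constructor and valueof()
    HddsProtos.ContainerID proto = id.getProtobuf();      // preferred over the deprecated getBytes()
    ContainerID restored = ContainerID.getFromProtobuf(proto);
    // id.equals(restored) holds and compareTo(restored) returns 0 for the same value.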
Original file line number Diff line number Diff line change
@@ -121,6 +121,11 @@ public static ContainerInfo fromProtobuf(HddsProtos.ContainerInfoProto info) {
.build();
}

/**
* This method is deprecated; use {@code containerID()}, which returns a
* {@link ContainerID} object.
*/
@Deprecated
public long getContainerID() {
return containerID;
}
@@ -179,7 +184,7 @@ public void updateSequenceId(long sequenceID) {
}

public ContainerID containerID() {
return new ContainerID(getContainerID());
return ContainerID.valueOf(containerID);
}

/**
Original file line number Diff line number Diff line change
@@ -91,7 +91,7 @@ public static ExcludeList getFromProtoBuf(
HddsProtos.ExcludeListProto excludeListProto) {
ExcludeList excludeList = new ExcludeList();
excludeListProto.getContainerIdsList().forEach(id -> {
excludeList.addConatinerId(ContainerID.valueof(id));
excludeList.addConatinerId(ContainerID.valueOf(id));
});
DatanodeDetails.Builder builder = DatanodeDetails.newBuilder();
excludeListProto.getDatanodesList().forEach(dn -> {