Skip to content
Closed
Original file line number Diff line number Diff line change
Expand Up @@ -1160,6 +1160,11 @@ public void transition(RMNodeImpl rmNode, RMNodeEvent event) {
// Update NM metrics during graceful decommissioning.
rmNode.updateMetricsForGracefulDecommission(initState, finalState);
rmNode.decommissioningTimeout = timeout;
// Notify NodesListManager to notify all RMApp so that each Application Master
// could take any required actions.
rmNode.context.getDispatcher().getEventHandler().handle(
new NodesListManagerEvent(
NodesListManagerEventType.NODE_USABLE, rmNode));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm.. don't think we should be sending NODE_USABLE event here.. since technically, it is not usable.
Maybe consider creating a new NODE_DECOMMISSIONING event ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't very sure about using NODE_USABLE, but while I was making the changes to follow your suggestion, I have found that in the current code TestRMNodeTransitions.testResourceUpdateOnDecommissioningNode is asserting that NodesListManagerEventType.NODE_USABLE is expected for a node that transitions to DECOMMISSIONING. Also, NodesListManagerEventType is transformed into the corresponding RMAppNodeUpdateType in NodesListManager.handle to build a RMAppNodeUpdateEvent that is processed in RMAppImpl.processNodeUpdate which just uses the RMAppNodeUpdateType for logging.

So it looks like it is ok to use NodesListManagerEventType.NODE_USABLE for nodes in decommissioning state. Do you still think it's worth adding some additional value for NodesListManagerEventType and RMAppNodeUpdateType?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel we should make intentions explicit - having a separate event type might make the code cleaner and easier to follow - rather than overloading. It could be that the assumption in the testcase is wrong (will have to double check though), in which case - it is perfectly alright to modify the testcasse with the new event.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I have added a new commit for adding a new value NODE_DECOMMISSIONING for NodesListManagerEventType

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - looking at the patch.. can you also attach a consolidated patch on the JIRA ? So as to kick Jenkins.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have just attached the patch. Thanks a lot for taking a look!

if (rmNode.originalTotalCapability == null){
rmNode.originalTotalCapability =
Resources.clone(rmNode.totalCapability);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,9 @@
import static org.mockito.Mockito.when;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import java.util.Random;

Expand Down Expand Up @@ -98,13 +100,16 @@ public void handle(SchedulerEvent event) {
}

private NodesListManagerEvent nodesListManagerEvent = null;


private List<NodeState> nodesListManagerEventsNodeStateSequence = new LinkedList<>();

private class TestNodeListManagerEventDispatcher implements
EventHandler<NodesListManagerEvent> {

@Override
public void handle(NodesListManagerEvent event) {
nodesListManagerEvent = event;
nodesListManagerEventsNodeStateSequence.add(event.getNode().getState());
}

}
Expand Down Expand Up @@ -150,7 +155,7 @@ public Void answer(InvocationOnMock invocation) throws Throwable {
NodeId nodeId = BuilderUtils.newNodeId("localhost", 0);
node = new RMNodeImpl(nodeId, rmContext, null, 0, 0, null, null, null);
nodesListManagerEvent = null;

nodesListManagerEventsNodeStateSequence.clear();
}

@After
Expand Down Expand Up @@ -721,6 +726,8 @@ private RMNodeImpl getDecommissioningNode() {
node.handle(new RMNodeEvent(node.getNodeID(),
RMNodeEventType.GRACEFUL_DECOMMISSION));
Assert.assertEquals(NodeState.DECOMMISSIONING, node.getState());
Assert.assertEquals(Arrays.asList(NodeState.NEW, NodeState.RUNNING),
nodesListManagerEventsNodeStateSequence);
Assert
.assertEquals("Active Nodes", initialActive - 1, cm.getNumActiveNMs());
Assert.assertEquals("Decommissioning Nodes", initialDecommissioning + 1,
Expand Down