HDFS-16333. fix balancer bug when transfer an EC block #3679

liubingxing · 2021-11-18T10:24:40Z

We set the EC policy to (6+3), and some nodes were being decommissioned while we ran the balancer.

With the balancer running, we found many error logs like the following.

Node A wants to transfer an EC block to node B, but the block is not on node A. The fsck command shows the block status as follows.

In the Dispatcher getBlockList function:

Assume that the locations of an EC block in storageGroupMap look like this:
indices:[0, 1, 2, 3, 4, 5, 6, 7, 8]
node:[a, b, c, d, e, f, g, h, i]

After the decommission operation, the internal block at indices[1] was moved to another node:
indices:[0, 1, 2, 3, 4, 5, 6, 7, 8]
node:[a, j, c, d, e, f, g, h, i]
The location of indices[1] changed from node b to node j.

When the balancer gets the block locations, it checks each one against storageGroupMap.
If a node is not found in storageGroupMap, it is not added to the block locations.
In this case, node j is not added to the block locations, but the indices are not updated.
As a result, the block locations may look like this:
indices:[0, 1, 2, 3, 4, 5, 6, 7, 8]
block.location:[a, c, d, e, f, g, h, i]
The nodes no longer line up with their indices: every node after a is paired with the wrong internal block index.

Solution:
We should update the indices so that they match the nodes:
indices:[0, 2, 3, 4, 5, 6, 7, 8]
block.location:[a, c, d, e, f, g, h, i]
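
The mismatch and the fix described above can be illustrated with a small stand-alone sketch. It is not the Dispatcher code from this PR: the class EcIndexAdjustSketch, the Aligned holder, and the align method are hypothetical names, and a plain set of node names stands in for storageGroupMap. The only idea it shows is that an index must be dropped whenever its node is dropped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; names and types are assumptions, not Hadoop APIs.
class EcIndexAdjustSketch {

  /** Nodes that survived the filter, paired with their internal block indices. */
  static final class Aligned {
    final List<String> locations;
    final byte[] indices;

    Aligned(List<String> locations, byte[] indices) {
      this.locations = locations;
      this.indices = indices;
    }
  }

  /**
   * Keeps only the nodes present in knownNodes (a stand-in for storageGroupMap)
   * and drops the index of every node that is skipped, so both stay aligned.
   */
  static Aligned align(String[] reportedNodes, byte[] reportedIndices,
      Set<String> knownNodes) {
    List<String> locations = new ArrayList<>();
    byte[] kept = new byte[reportedIndices.length];
    int n = 0;
    for (int i = 0; i < reportedNodes.length; i++) {
      if (knownNodes.contains(reportedNodes[i])) {
        locations.add(reportedNodes[i]);
        kept[n++] = reportedIndices[i];
      }
    }
    return new Aligned(locations, Arrays.copyOf(kept, n));
  }

  public static void main(String[] args) {
    // Index 1 moved from node b to node j, which the balancer does not know about.
    String[] nodes = {"a", "j", "c", "d", "e", "f", "g", "h", "i"};
    byte[] indices = {0, 1, 2, 3, 4, 5, 6, 7, 8};
    Set<String> known = new HashSet<>(
        Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h", "i"));

    Aligned result = align(nodes, indices, known);
    System.out.println("indices:" + Arrays.toString(result.indices));
    System.out.println("block.location:" + result.locations);
  }
}
```

With the example data from the description, the sketch keeps eight nodes and eight indices, and index 1 is the one that disappears.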

hadoop-yetus · 2021-11-18T10:27:32Z

💔 -1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 0s		Docker mode activated.
-1 ❌	patch	0m 20s		#3679 does not apply to trunk. Rebase required? Wrong Branch? See https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute for help.

Subsystem	Report/Notes
GITHUB PR	#3679
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3679/1/console
versions	git=2.17.1
Powered by	Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

hemanthboyina · 2021-11-18T16:27:35Z

@liubingxing Can you extend a UT for your scenario?

hadoop-yetus · 2021-11-18T16:34:49Z

💔 -1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 45s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 1s		codespell was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
-1 ❌	test4tests	0m 0s		The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
			_ trunk Compile Tests _
+1 💚	mvninstall	36m 14s		trunk passed
+1 💚	compile	1m 37s		trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	compile	1m 25s		trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	checkstyle	1m 2s		trunk passed
+1 💚	mvnsite	1m 25s		trunk passed
+1 💚	javadoc	1m 1s		trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	javadoc	1m 33s		trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	spotbugs	3m 12s		trunk passed
+1 💚	shadedclient	21m 58s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+1 💚	mvninstall	1m 16s		the patch passed
+1 💚	compile	1m 18s		the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	javac	1m 18s		the patch passed
+1 💚	compile	1m 13s		the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	javac	1m 13s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
-0 ⚠️	checkstyle	0m 54s	/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt	hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 40 unchanged - 1 fixed = 43 total (was 41)
+1 💚	mvnsite	1m 19s		the patch passed
+1 💚	javadoc	0m 51s		the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	javadoc	1m 25s		the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	spotbugs	3m 16s		the patch passed
+1 💚	shadedclient	23m 7s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
+1 💚	unit	234m 29s		hadoop-hdfs in the patch passed.
+1 💚	asflicense	0m 41s		The patch does not generate ASF License warnings.
		337m 59s

Subsystem	Report/Notes
Docker	ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3679/2/artifact/out/Dockerfile
GITHUB PR	#3679
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname	Linux 4ddcfd7667b5 4.15.0-156-generic #163-Ubuntu SMP Thu Aug 19 23:31:58 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / 46dbec8d4354b9fadf094351b3edd501e9f67c40
Default Java	Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3679/2/testReport/
Max. process+thread count	3060 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3679/2/console
versions	git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by	Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

hadoop-yetus · 2021-11-19T07:11:02Z

💔 -1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 37s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 1s		codespell was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
-1 ❌	test4tests	0m 0s		The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
			_ trunk Compile Tests _
+1 💚	mvninstall	32m 14s		trunk passed
+1 💚	compile	1m 27s		trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	compile	1m 17s		trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	checkstyle	1m 0s		trunk passed
+1 💚	mvnsite	1m 25s		trunk passed
+1 💚	javadoc	1m 1s		trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	javadoc	1m 33s		trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	spotbugs	3m 12s		trunk passed
+1 💚	shadedclient	22m 14s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+1 💚	mvninstall	1m 15s		the patch passed
+1 💚	compile	1m 16s		the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	javac	1m 16s		the patch passed
+1 💚	compile	1m 14s		the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	javac	1m 14s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
+1 💚	checkstyle	0m 51s		hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 40 unchanged - 1 fixed = 40 total (was 41)
+1 💚	mvnsite	1m 16s		the patch passed
+1 💚	javadoc	0m 50s		the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	javadoc	1m 25s		the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	spotbugs	3m 11s		the patch passed
+1 💚	shadedclient	22m 3s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
-1 ❌	unit	230m 4s	/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt	hadoop-hdfs in the patch passed.
+1 💚	asflicense	0m 41s		The patch does not generate ASF License warnings.
		327m 50s

Reason	Tests
Failed junit tests	hadoop.hdfs.TestRollingUpgrade
	hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier

Subsystem	Report/Notes
Docker	ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3679/3/artifact/out/Dockerfile
GITHUB PR	#3679
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname	Linux b9d809bed7c9 4.15.0-156-generic #163-Ubuntu SMP Thu Aug 19 23:31:58 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / a6d001288a18bf65dedc4f3e7227e2f52125d394
Default Java	Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3679/3/testReport/
Max. process+thread count	3105 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3679/3/console
versions	git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by	Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

hadoop-yetus · 2021-11-23T19:25:24Z

💔 -1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 58s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 0s		codespell was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 1 new or modified test files.
			_ trunk Compile Tests _
+1 💚	mvninstall	34m 13s		trunk passed
+1 💚	compile	1m 28s		trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	compile	1m 27s		trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	checkstyle	1m 2s		trunk passed
+1 💚	mvnsite	1m 28s		trunk passed
+1 💚	javadoc	1m 6s		trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	javadoc	1m 33s		trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	spotbugs	3m 19s		trunk passed
+1 💚	shadedclient	23m 11s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+1 💚	mvninstall	1m 15s		the patch passed
+1 💚	compile	1m 17s		the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	javac	1m 17s		the patch passed
+1 💚	compile	1m 13s		the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	javac	1m 13s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
+1 💚	checkstyle	0m 53s		hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 150 unchanged - 1 fixed = 150 total (was 151)
+1 💚	mvnsite	1m 21s		the patch passed
+1 💚	javadoc	0m 53s		the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	javadoc	1m 27s		the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	spotbugs	3m 31s		the patch passed
+1 💚	shadedclient	22m 11s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
-1 ❌	unit	245m 42s	/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt	hadoop-hdfs in the patch passed.
+1 💚	asflicense	0m 47s		The patch does not generate ASF License warnings.
		347m 58s

Reason	Tests
Failed junit tests	hadoop.hdfs.TestRollingUpgrade

Subsystem	Report/Notes
Docker	ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3679/4/artifact/out/Dockerfile
GITHUB PR	#3679
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname	Linux 0e6e56cf6333 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / e999e8708ade3fec186bd54f1705ba91e6add2eb
Default Java	Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3679/4/testReport/
Max. process+thread count	2923 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3679/4/console
versions	git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by	Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

liubingxing · 2021-11-24T01:16:31Z

Quoting @hemanthboyina: "@liubingxing Can you extend a UT for your scenario?"

@hemanthboyina Sorry for the late reply. I added a UT that simulates the balancer with an EC file; the excluded node simulates the decommissioning node.

liubingxing · 2021-11-30T11:49:30Z

@hemanthboyina Please take a look at this and give some advice. Thanks a lot

tasanuma

@liubingxing Thanks for reporting the issues and submitting the PR. I reviewed it and left some comments. Please confirm them.

tasanuma · 2021-12-02T08:53:05Z

...s-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java

How about adding another void runBalancer method

    private void runBalancer(Configuration conf, long totalUsedSpace, long totalCapacity, BalancerParameters p, int excludedNodes) throws Exception {
+     runBalancer(conf, totalUsedSpace, totalCapacity, p, excludedNodes, false);
+   }
+
+   private void runBalancer(Configuration conf, long totalUsedSpace,
+       long totalCapacity, BalancerParameters p, int excludedNodes, boolean checkFailedNum)
+       throws Exception {
      waitForHeartBeat(totalUsedSpace, totalCapacity, client, cluster);

and just calling it from the unit test?

Suggested change

-      final int run = runBalancer(namenodes, pBuilder.build(), conf, true);
-      if (conf.getInt(
-          DFSConfigKeys.DFS_DATANODE_BALANCE_MAX_NUM_CONCURRENT_MOVES_KEY,
-          DFSConfigKeys.DFS_DATANODE_BALANCE_MAX_NUM_CONCURRENT_MOVES_DEFAULT)
-          == 0) {
-        assertEquals(ExitStatus.NO_MOVE_PROGRESS.getExitCode(), run);
-      } else {
-        assertEquals(ExitStatus.SUCCESS.getExitCode(), run);
-      }
-      waitForHeartBeat(totalUsedSpace, totalCapacity, client, cluster);
+      runBalancer(namenodes, pBuilder.build(), conf, true);

I updated the UT so that it calls runBalancer from the unit test.
I also added a boolean parameter, checkExcludeNodesUtilization, to waitForBalancer. It determines whether to check the nodeUtilization of excluded datanodes.

    DatanodeInfo[] datanodeReport =
        client.getDatanodeReport(DatanodeReportType.ALL);
    assertEquals(datanodeReport.length, cluster.getDataNodes().size());
    balanced = true;
    int actualExcludedNodeCount = 0;
    for (DatanodeInfo datanode : datanodeReport) {
      double nodeUtilization =
          ((double)datanode.getDfsUsed()) / datanode.getCapacity();
      if (Dispatcher.Util.isExcluded(p.getExcludedNodes(), datanode)) {
        if (checkExcludeNodesUtilization) {
          assertTrue(nodeUtilization == 0);
        }

tasanuma · 2021-12-02T09:33:23Z

...dfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java

Could you please provide more detailed comments on when the locations could be updated and why we need to adjust indices?

I fixed the code and added more comments, like this:

    if (!adjustList.isEmpty()) {
      // block.locations mismatch with block.indices
      // adjust indices to get correct internalBlock for Datanode in #getInternalBlock
      ((DBlockStriped) block).adjustIndices(adjustList);
      Preconditions.checkArgument(((DBlockStriped) block).indices.length
          == block.locations.size());
    }

tasanuma · 2021-12-02T09:36:44Z

...dfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java

Please provide comments on what this method does.

I added comments like this:

    /**
     * Adjust EC block indices, it will remove the element of adjustList from indices.
     * @param adjustList the list will be removed from indices
     */
    public void adjustIndices(List<Integer> adjustList) {

tasanuma · 2021-12-02T09:42:53Z

...s-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java

I did the unit test multiple times, and sometimes the length of the locatedBlocks is larger than groupSize, and the verification failed. Could you check it?

This is because waitForBalancer does not wait for the namenode to delete the extra replicas.
I fixed the code so that it checks the total block count before calling StripedFileTestUtil.verifyLocatedStripedBlocks, like this:

    // check total blocks, max wait time 60s
    long startTime = Time.monotonicNow();
    int count = 0;
    while (count < 20) {
      count++;
      DatanodeInfo[] datanodeReport1 =
          client.getDatanodeReport(DatanodeReportType.ALL);
      long totalBlocksAfterBalancer = 0;
      for (DatanodeInfo dn : datanodeReport1) {
        totalBlocksAfterBalancer += dn.getNumBlocks();
      }
      if (totalBlocks == totalBlocksAfterBalancer) {
        System.out.println("wait " + (Time.monotonicNow() - startTime)
            + "ms to check blocks, count " + count);
        break;
      }
      cluster.triggerHeartbeats();
      Thread.sleep(3000L);
    }
    // verify locations of striped blocks
    locatedBlocks = client.getBlockLocations(fileName, 0, fileLen);
    StripedFileTestUtil.verifyLocatedStripedBlocks(locatedBlocks, groupSize);

It makes sense. How about using GenericTestUtils#waitFor for checking it?
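
As a rough illustration of this suggestion, the hand-written retry loop can be expressed as a condition passed to a polling helper. The sketch below is dependency-free: WaitForSketch.waitFor is a local stand-in whose parameter order (condition, poll interval, timeout) is an assumption modeled on a waitFor-style test utility, not the Hadoop class itself, and the block counter only simulates a namenode removing extra replicas.

```java
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical stand-in for a waitFor-style test helper; not Hadoop code.
class WaitForSketch {

  /**
   * Polls check every checkEveryMillis until it returns true,
   * or throws TimeoutException once waitForMillis has elapsed.
   */
  static void waitFor(Supplier<Boolean> check, long checkEveryMillis,
      long waitForMillis) throws TimeoutException, InterruptedException {
    final long start = System.nanoTime();
    while (true) {
      if (check.get()) {
        return;
      }
      long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
      if (elapsedMillis >= waitForMillis) {
        throw new TimeoutException(
            "condition not met within " + waitForMillis + " ms");
      }
      Thread.sleep(checkEveryMillis);
    }
  }

  public static void main(String[] args) throws Exception {
    final long totalBlocks = 9;
    // Simulated cluster-wide block count; starts with three extra replicas.
    final long[] reportedBlocks = {12};

    waitFor(() -> {
      // Simulate the namenode deleting one extra replica per poll.
      if (reportedBlocks[0] > totalBlocks) {
        reportedBlocks[0]--;
      }
      return reportedBlocks[0] == totalBlocks;
    }, 10, 1000);

    System.out.println("block count settled at " + reportedBlocks[0]);
  }
}
```

Compared with a counted loop plus Thread.sleep, this shape fails the test with a timeout when the condition is never met, instead of silently falling through to the verification step.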

liubingxing · 2021-12-06T08:58:28Z

@tasanuma Thank you for your review and comments.

I ran the UT doTestBalancerWithStripedFile on the current trunk branch, and errors sometimes occur.

Therefore, assertEquals(0, nnc.getBlocksFailed().get()) is not a reliable way to check the result in this new UT.
I will redesign the UT to test this scenario as soon as possible.
If you have any suggestions, please let me know. Thank you.

tasanuma · 2021-12-06T13:40:11Z

One possible solution is to add Preconditions.checkArgument that checks that the length of the indices is equal to the size of the block location, and to check that ExitStatus is SUCCESS in the unit test.

	if (blkLocs instanceof StripedBlockWithLocations) {
	  // adjust indices if locations has been updated
	  ((DBlockStriped) block).adjustIndices(adjustList);
	  Preconditions.checkArgument(((DBlockStriped) block).indices.length
		  == block.locations.size());
	}

liubingxing · 2021-12-07T12:21:48Z

@tasanuma Thank you for your advice. I fixed the code according to your suggestion. Please take a look.

hadoop-yetus · 2021-12-07T15:57:38Z

🎊 +1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 43s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 0s		codespell was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 1 new or modified test files.
			_ trunk Compile Tests _
+1 💚	mvninstall	32m 24s		trunk passed
+1 💚	compile	1m 28s		trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	compile	1m 19s		trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	checkstyle	1m 0s		trunk passed
+1 💚	mvnsite	1m 26s		trunk passed
+1 💚	javadoc	1m 1s		trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	javadoc	1m 36s		trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	spotbugs	3m 21s		trunk passed
+1 💚	shadedclient	22m 54s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+1 💚	mvninstall	1m 16s		the patch passed
+1 💚	compile	1m 44s		the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	javac	1m 44s		the patch passed
+1 💚	compile	1m 14s		the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	javac	1m 14s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
-0 ⚠️	checkstyle	0m 55s	/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt	hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 150 unchanged - 1 fixed = 153 total (was 151)
+1 💚	mvnsite	1m 20s		the patch passed
+1 💚	javadoc	0m 51s		the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	javadoc	1m 25s		the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	spotbugs	3m 11s		the patch passed
+1 💚	shadedclient	22m 11s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
+1 💚	unit	225m 39s		hadoop-hdfs in the patch passed.
+1 💚	asflicense	0m 45s		The patch does not generate ASF License warnings.
		325m 36s

Subsystem	Report/Notes
Docker	ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3679/5/artifact/out/Dockerfile
GITHUB PR	#3679
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname	Linux 72bab9b40a00 4.15.0-156-generic #163-Ubuntu SMP Thu Aug 19 23:31:58 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / 1d6f9e68c82fedf9831b03cda6010ad3fc979300
Default Java	Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3679/5/testReport/
Max. process+thread count	3232 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3679/5/console
versions	git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by	Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

tasanuma

@liubingxing Thanks for updating PR. I left some comments. Would you please confirm them?

tasanuma · 2021-12-08T05:43:05Z

...s-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java

Could you move testBalancerWithExcludeListWithStripedFile() and doTestBalancerWithExcludeListWithStripedFile() after doTestBalancerWithStripedFile?

tasanuma · 2021-12-08T05:47:47Z

...s-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java

It makes sense. How about using GenericTestUtils#waitFor for checking it?

tasanuma · 2021-12-08T05:57:51Z

...dfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java

As Preconditions.checkArgument() can throw IllegalArgumentException, getBlockList() should declare it, and dispatchBlocks() should catch the exception.

-     private long getBlockList() throws IOException {
+     private long getBlockList() throws IOException, IllegalArgumentException {

      try {
        final long received = getBlockList();
        if (received == 0) {
          return;
        }
        blocksToReceive -= received;
        continue;
-     } catch (IOException e) {
+     } catch (IOException|IllegalArgumentException e) {
        LOG.warn("Exception while getting reportedBlock list", e);
        return;
      }
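
The pattern requested here can be sketched in isolation. The class below is hypothetical and only mirrors the shape of the review suggestion: checkArgument is a local stand-in for a Preconditions-style check, and getBlockList and dispatchOnce are simplified placeholders rather than the Dispatcher methods. Because IllegalArgumentException is unchecked, declaring it in the throws clause does not change compilation; it documents the failure mode, while the multi-catch is what keeps the caller from dying on it.

```java
import java.io.IOException;

// Hypothetical illustration of the declare-and-catch pattern; not Hadoop code.
class MultiCatchSketch {

  /** Local stand-in for a Preconditions-style argument check. */
  static void checkArgument(boolean expression, String message) {
    if (!expression) {
      throw new IllegalArgumentException(message);
    }
  }

  /**
   * Simplified placeholder: validates that indices and locations agree in
   * length, then reports how many locations were received.
   */
  static long getBlockList(int indicesLength, int locationCount)
      throws IOException, IllegalArgumentException {
    checkArgument(indicesLength == locationCount,
        "indices length " + indicesLength
            + " != locations size " + locationCount);
    return locationCount;
  }

  /** One dispatch attempt that logs and gives up instead of propagating. */
  static String dispatchOnce(int indicesLength, int locationCount) {
    try {
      final long received = getBlockList(indicesLength, locationCount);
      return "received " + received;
    } catch (IOException | IllegalArgumentException e) {
      return "skipped: " + e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(dispatchOnce(8, 8)); // aligned indices and locations
    System.out.println(dispatchOnce(9, 8)); // stale indices, check fails
  }
}
```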

hadoop-yetus · 2021-12-08T06:52:00Z

💔 -1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 37s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 0s		codespell was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 1 new or modified test files.
			_ trunk Compile Tests _
+1 💚	mvninstall	32m 8s		trunk passed
+1 💚	compile	1m 26s		trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	compile	1m 20s		trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	checkstyle	1m 3s		trunk passed
+1 💚	mvnsite	1m 27s		trunk passed
+1 💚	javadoc	1m 1s		trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	javadoc	1m 32s		trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	spotbugs	3m 9s		trunk passed
+1 💚	shadedclient	22m 24s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+1 💚	mvninstall	1m 16s		the patch passed
+1 💚	compile	1m 19s		the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	javac	1m 19s		the patch passed
+1 💚	compile	1m 12s		the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	javac	1m 12s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
+1 💚	checkstyle	0m 51s		hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 150 unchanged - 1 fixed = 150 total (was 151)
+1 💚	mvnsite	1m 18s		the patch passed
+1 💚	javadoc	0m 50s		the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	javadoc	1m 25s		the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	spotbugs	3m 16s		the patch passed
+1 💚	shadedclient	22m 3s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
-1 ❌	unit	237m 28s	/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt	hadoop-hdfs in the patch passed.
+1 💚	asflicense	0m 48s		The patch does not generate ASF License warnings.
		335m 34s

Reason	Tests
Failed junit tests	hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes

Subsystem	Report/Notes
Docker	ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3679/6/artifact/out/Dockerfile
GITHUB PR	#3679
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname	Linux e02472f6a9ee 4.15.0-156-generic #163-Ubuntu SMP Thu Aug 19 23:31:58 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / ed3a77c6c22487e4ff1d941fe0047f54bf2eb55c
Default Java	Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3679/6/testReport/
Max. process+thread count	3173 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3679/6/console
versions	git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by	Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

tasanuma · 2021-12-08T09:00:31Z

Thanks for updating it. +1, pending Jenkins.

liubingxing · 2021-12-08T09:27:40Z

@tasanuma Thanks for your review.

hadoop-yetus · 2021-12-08T12:44:28Z

💔 -1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 46s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 1s		codespell was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 1 new or modified test files.
			_ trunk Compile Tests _
+1 💚	mvninstall	39m 40s		trunk passed
+1 💚	compile	1m 32s		trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	compile	1m 22s		trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	checkstyle	1m 4s		trunk passed
+1 💚	mvnsite	1m 27s		trunk passed
+1 💚	javadoc	1m 0s		trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	javadoc	1m 32s		trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	spotbugs	3m 12s		trunk passed
+1 💚	shadedclient	22m 14s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+1 💚	mvninstall	1m 17s		the patch passed
+1 💚	compile	1m 17s		the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	javac	1m 17s		the patch passed
+1 💚	compile	1m 13s		the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	javac	1m 13s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
+1 💚	checkstyle	0m 52s		hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 150 unchanged - 1 fixed = 150 total (was 151)
+1 💚	mvnsite	1m 22s		the patch passed
+1 💚	javadoc	0m 51s		the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚	javadoc	1m 24s		the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚	spotbugs	3m 15s		the patch passed
+1 💚	shadedclient	23m 1s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
-1 ❌	unit	229m 11s	/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt	hadoop-hdfs in the patch passed.
+1 💚	asflicense	0m 46s		The patch does not generate ASF License warnings.
		336m 10s

Reason	Tests
Failed junit tests	hadoop.hdfs.TestRollingUpgrade

Subsystem	Report/Notes
Docker	ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3679/7/artifact/out/Dockerfile
GITHUB PR	#3679
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname	Linux e4e612629e2e 4.15.0-156-generic #163-Ubuntu SMP Thu Aug 19 23:31:58 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / `15a3fbd`
Default Java	Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3679/7/testReport/
Max. process+thread count	3194 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3679/7/console
versions	git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by	Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

liubingxing · 2021-12-08T13:00:26Z

The failed unit test is not related to this PR.

tasanuma · 2021-12-09T04:24:25Z

Merged into trunk. Thanks for your contribution, @liubingxing!

liubingxing · 2021-12-09T05:00:17Z

@tasanuma Thanks for your review and merge.

tasanuma · 2021-12-09T07:49:21Z

After cherry-picking into branch-3.3, the build of branch-3.3 fails with the following error. So I reverted it for now.

[ERROR] /.../hadoop/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java:[2238,11] cannot find symbol
  symbol:   variable Assert
  location: class org.apache.hadoop.hdfs.server.balancer.TestBalancer

@liubingxing It seems TestBalancer doesn't import org.junit.Assert in branch-3.3, and it is the cause of the build failure. Could you create another PR for branch-3.3?

liubingxing force-pushed the HDFS-16333 branch from 3daa567 to 46dbec8 Compare November 18, 2021 10:55

liubingxing force-pushed the HDFS-16333 branch from 46dbec8 to a6d0012 Compare November 19, 2021 01:41

liubingxing force-pushed the HDFS-16333 branch from a6d0012 to e999e87 Compare November 23, 2021 13:35

tasanuma reviewed Dec 2, 2021

liubingxing force-pushed the HDFS-16333 branch from e999e87 to 1d6f9e6 Compare December 7, 2021 10:30

liubingxing force-pushed the HDFS-16333 branch from 1d6f9e6 to ed3a77c Compare December 8, 2021 01:15

tasanuma reviewed Dec 8, 2021

HDFS-16333. fix balancer bug when transfer an EC block

15a3fbd

liubingxing force-pushed the HDFS-16333 branch from ed3a77c to 15a3fbd Compare December 8, 2021 07:06

tasanuma approved these changes Dec 9, 2021

tasanuma merged commit 35556ea into apache:trunk Dec 9, 2021

tasanuma pushed a commit that referenced this pull request Dec 9, 2021

HDFS-16333. fix balancer bug when transfer an EC block (#3679)

55c0e67

(cherry picked from commit 35556ea)

tasanuma pushed a commit that referenced this pull request Dec 9, 2021

HDFS-16333. fix balancer bug when transfer an EC block (#3679)

2072a6a

(cherry picked from commit 35556ea)

tasanuma added a commit that referenced this pull request Dec 9, 2021

Revert "HDFS-16333. fix balancer bug when transfer an EC block (#3679)"

a67f4dc

This reverts commit 55c0e67.

tasanuma added a commit that referenced this pull request Dec 9, 2021

Revert "HDFS-16333. fix balancer bug when transfer an EC block (#3679)"

2315849

This reverts commit 2072a6a.

sunchao pushed a commit that referenced this pull request Jan 4, 2022

HDFS-16333. fix balancer bug when transfer an EC block (#3679)

5214140

(cherry picked from commit 35556ea)

sunchao pushed a commit that referenced this pull request Jan 4, 2022

Revert "HDFS-16333. fix balancer bug when transfer an EC block (#3679)"

1e3f94f

This reverts commit 55c0e67.

HarshitGupta11 pushed a commit to HarshitGupta11/hadoop that referenced this pull request Nov 28, 2022

HDFS-16333. fix balancer bug when transfer an EC block (apache#3679)

428b4ae

-      final int run = runBalancer(namenodes, pBuilder.build(), conf, true);
-      if (conf.getInt(
-          DFSConfigKeys.DFS_DATANODE_BALANCE_MAX_NUM_CONCURRENT_MOVES_KEY,
-          DFSConfigKeys.DFS_DATANODE_BALANCE_MAX_NUM_CONCURRENT_MOVES_DEFAULT)
-          == 0) {
-        assertEquals(ExitStatus.NO_MOVE_PROGRESS.getExitCode(), run);
-      } else {
-        assertEquals(ExitStatus.SUCCESS.getExitCode(), run);
-      }
-      waitForHeartBeat(totalUsedSpace, totalCapacity, client, cluster);
+      runBalancer(namenodes, pBuilder.build(), conf, true);
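
For readers skimming the hunk above, the removed lines encode a simple expectation: when the configured maximum number of concurrent moves is 0 the test expected NO_MOVE_PROGRESS, and otherwise SUCCESS. The following is only a minimal sketch of that branch. The class `BalancerExitSketch`, its two-value `ExitStatus` enum, and the `expectedExit` helper are hypothetical stand-ins written for illustration, not Hadoop's own classes, and the sketch does not run a balancer.

```java
// Hypothetical, self-contained sketch; not Hadoop code.
final class BalancerExitSketch {

  /** Stand-in for the two exit statuses named in the removed lines. */
  enum ExitStatus { SUCCESS, NO_MOVE_PROGRESS }

  /**
   * Mirrors the removed if/else: with zero concurrent moves allowed,
   * the old test expected no progress; otherwise it expected success.
   */
  static ExitStatus expectedExit(int maxConcurrentMoves) {
    return maxConcurrentMoves == 0
        ? ExitStatus.NO_MOVE_PROGRESS
        : ExitStatus.SUCCESS;
  }

  public static void main(String[] args) {
    // Print the expectation for a few illustrative settings.
    for (int moves : new int[] {0, 1, 100}) {
      System.out.println(moves + " -> " + expectedExit(moves));
    }
  }
}
```

After the change, the call site no longer checks the return value of runBalancer at all, so neither outcome is asserted there.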
