
Revert "HDFS-16776 Erasure Coding: The length of targets should be checked when DN gets a reconstruction task" #6964

Closed
wants to merge 1 commit into from

Conversation

tomscut
Contributor

@tomscut tomscut commented Jul 26, 2024

Reverts #4901

As a result of this change, maintenance can get stuck in two ways:

  1. When extra replicas must be scheduled to satisfy the storage policy.
  2. When an EC block group has more than 2 DNs in the Entering Maintenance state and dfs.namenode.maintenance.ec.replication.min >= 2.

Here's a more complex example. We recently performed maintenance on a batch of nodes, including host4 and host8.
Configuration:

dfs.namenode.maintenance.ec.replication.min=1
storagePolicy=HDD

hdfs fsck -fs -blockId blk_-9223372035217210640

[blk_-9223372035217210640:DatanodeInfoWithStorage[host1:50010,DS-b9b2ea24-e69b-4a95-8a36-8b73b32003d3,DISK], 
blk_-9223372035217210639:DatanodeInfoWithStorage[host2:50010,DS-dfc9b308-a493-4d9b-b1c1-a134552f089f,SSD], 
blk_-9223372035217210638:DatanodeInfoWithStorage[host3:50010,DS-67669a8d-57d9-4825-8e1e-0e834d1fd47a,DISK], 
blk_-9223372035217210637:DatanodeInfoWithStorage[host4:50010,DS-6826ff2a-a6e5-4676-ad40-284099652670,DISK], Entering Maintenance
blk_-9223372035217210636:DatanodeInfoWithStorage[host5:50010,DS-2e042fb1-dbc2-4ccf-ba43-da51a9ef2079,DISK], 
blk_-9223372035217210635:DatanodeInfoWithStorage[host6:50010,DS-005f2bce-eb46-432f-85b0-61919554692f,DISK], 
blk_-9223372035217210633:DatanodeInfoWithStorage[host7:50010,DS-cc11ce37-e121-4602-8688-ec7d45a0f276,DISK], 
blk_-9223372035217210632:DatanodeInfoWithStorage[host8:50010,DS-076891a0-4166-4584-9cea-13c853cbd667,DISK]] Entering Maintenance

Datanode log:

2024-07-25 12:46:42,680 INFO [Command processor] org.apache.hadoop.hdfs.server.datanode.DataNode: processErasureCodingTasks  BlockECReconstructionInfo(
  Recovering BP-1956563710-x.x.x.x-1622796911268:blk_-9223372035217210640_105868369 
  From: [host1:50010, host2:50010, host3:50010, host4:50010, host5:50010, host6:50010, host7:50010, host8:50010] 
  To: [[host9:50010, host10:50010])
 Block Indices: [0, 1, 2, 3, 4, 5, 7, 8]
2024-07-25 12:46:42,680 WARN [Command processor] org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to reconstruct striped block blk_-9223372035217210640_105868369
java.lang.IllegalArgumentException: Reconstruction work gets too much targets.
	at com.google.common.base.Preconditions.checkArgument(Preconditions.java:141)
	at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedWriter.<init>(StripedWriter.java:86)
	at org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.<init>(StripedBlockReconstructor.java:47)
	at org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker.processErasureCodingTasks(ErasureCodingWorker.java:134)
	at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:797)
	at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:680)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processCommand(BPServiceActor.java:1327)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.lambda$enqueue$2(BPServiceActor.java:1365)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.processQueue(BPServiceActor.java:1301)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$CommandProcessingThread.run(BPServiceActor.java:1288)
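For context, the rejection above comes from a Guava Preconditions.checkArgument call in StripedWriter. Below is a simplified, hypothetical sketch of the reverted check — the names TargetCheckSketch, checkTargets, totalBlocks, liveIndices, and numTargets are illustrative, not the actual Hadoop code — under the assumption that the DN rejects tasks with more targets than missing internal block indices:

```java
// Hypothetical sketch, NOT the real StripedWriter source. Assumption: the
// DN-side check rejects a reconstruction task whose target count exceeds
// the number of internal block indices actually missing from the group.
public class TargetCheckSketch {
    // totalBlocks: data + parity units of the EC policy (9 for RS-6-3).
    // liveIndices: internal block indices reported as sources.
    // numTargets:  reconstruction targets handed to the DataNode.
    static void checkTargets(int totalBlocks, int[] liveIndices, int numTargets) {
        int missing = totalBlocks - liveIndices.length;
        // Mirrors Guava's Preconditions.checkArgument(...) behavior.
        if (numTargets > missing) {
            throw new IllegalArgumentException(
                "Reconstruction work gets too much targets.");
        }
    }

    public static void main(String[] args) {
        int[] live = {0, 1, 2, 3, 4, 5, 7, 8}; // index 6 is missing
        try {
            checkTargets(9, live, 2); // 2 targets, but only 1 index missing
            System.out.println("accepted");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In the log above, block indices [0, 1, 2, 3, 4, 5, 7, 8] leave only index 6 missing (assuming an RS-6-3 policy with 9 internal blocks), yet 2 targets were scheduled, so the task is rejected.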

In this block group, there is a block written on the SSD (blk_-9223372035217210639).

When performing maintenance, two replicas need to be added: one to migrate the SSD block to HDD (to satisfy the storage policy), and the other to ensure at least 7 internal blocks remain available during maintenance.

The maintenance process then gets stuck.
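Assuming an RS-6-3 policy (9 internal blocks), the mismatch can be sketched numerically — MaintenanceMismatchSketch and its variable names are hypothetical, used only to restate the counts from the log above:

```java
// Illustrative arithmetic only; not Hadoop code.
public class MaintenanceMismatchSketch {
    public static void main(String[] args) {
        int totalInternalBlocks = 9;        // RS-6-3: 6 data + 3 parity
        int presentIndices = 8;             // index 6 absent in the fsck output
        int missing = totalInternalBlocks - presentIndices; // = 1

        // Targets the NameNode scheduled in the DataNode log above:
        int policyMigrationTargets = 1;     // move the SSD replica to HDD
        int maintenanceTargets = 1;         // keep >= 7 replicas while 2 DNs drain
        int scheduledTargets = policyMigrationTargets + maintenanceTargets; // = 2

        // The reverted DN-side check rejects the task because 2 > 1, so the
        // work is never performed and maintenance cannot complete.
        System.out.println(scheduledTargets > missing); // prints true
    }
}
```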

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 19s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 💚 mvninstall 32m 42s trunk passed
+1 💚 compile 0m 44s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 compile 0m 40s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 checkstyle 0m 37s trunk passed
+1 💚 mvnsite 0m 42s trunk passed
+1 💚 javadoc 0m 42s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 1m 7s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 spotbugs 1m 42s trunk passed
+1 💚 shadedclient 21m 6s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 37s the patch passed
+1 💚 compile 0m 38s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javac 0m 38s the patch passed
+1 💚 compile 0m 38s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 javac 0m 38s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 29s the patch passed
+1 💚 mvnsite 0m 38s the patch passed
+1 💚 javadoc 0m 30s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 1m 2s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 spotbugs 1m 43s the patch passed
+1 💚 shadedclient 21m 1s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 196m 12s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 30s The patch does not generate ASF License warnings.
283m 48s
Subsystem Report/Notes
Docker ClientAPI=1.46 ServerAPI=1.46 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6964/1/artifact/out/Dockerfile
GITHUB PR #6964
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 198f6b1e1d11 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 24ddec9
Default Java Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6964/1/testReport/
Max. process+thread count 4786 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6964/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@LoseYSelf

I have an idea: we can limit BlockPlacementPolicyDefault to choose EC targets one by one instead of choosing all target storage types at once, so it will not break the DN's checkArgument.

@Hexiaoqiao
Contributor

cc @zhangshuyan0 and @haiyang1987

@tomscut
Contributor Author

tomscut commented Jul 29, 2024

I have an idea: we can limit BlockPlacementPolicyDefault to choose EC targets one by one instead of choosing all target storage types at once, so it will not break the DN's checkArgument.

Thanks for your comment. Optimizing BlockPlacementPolicyDefault solves part of the issue, but DataNodes can still be affected when performing maintenance.

@tomscut
Contributor Author

tomscut commented Aug 1, 2024

We will try to solve this problem from the NN side, so I am closing this PR. Thank you~

@tomscut tomscut closed this Aug 1, 2024
@zhengchenyu
Contributor

zhengchenyu commented Aug 7, 2024

@Hexiaoqiao @tomscut @zhangshuyan0 @haiyang1987

I think the problem is that the EC code depends too heavily on the original replication process, which was designed around contiguous block copies. Many bugs come from over-reliance on this process and unnecessary parameter passing. I submitted HDFS-17542 to try to rearrange this code. After that PR, I think there is no need to check the length of targets. Could you please review HDFS-17542?

As for maintenance, I think the problem is not limited to maintenance. There are some imperfections in calculating the EC replica state, mainly because the uniqueness of the internal blocks is not taken into account when calculating NumberReplicas. Although some issues attempt to solve these problems (such as the inaccurate DECOMMISSIONING calculation addressed by HDFS-14920), the fix is not thorough. HDFS-17542 introduces NumberReplicasStriped, which I think is a better approach.
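To illustrate the uniqueness point: in a striped group, two replicas on different DNs that carry the same internal block index should count once, not twice. A toy sketch (uniqueLiveIndices is a hypothetical helper for illustration, not the HDFS-17542 code):

```java
import java.util.HashSet;
import java.util.Set;

public class UniqueIndexCountSketch {
    // replicaIndices[i] is the internal block index stored by replica i.
    static int uniqueLiveIndices(int[] replicaIndices) {
        Set<Integer> seen = new HashSet<>();
        for (int idx : replicaIndices) {
            seen.add(idx);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // Two replicas both hold internal block 3: a naive per-replica count
        // reports 5 live replicas, but only 4 distinct internal blocks exist.
        int[] replicas = {0, 1, 2, 3, 3};
        System.out.println(replicas.length);             // prints 5
        System.out.println(uniqueLiveIndices(replicas)); // prints 4
    }
}
```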
