@zhengchenyu
Contributor

@tez-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|------|-----------|---------|---------|
| +0 🆗 | reexec | 32m 42s | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | No case conflicting files found. |
| +1 💚 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 ❌ | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | _ master Compile Tests _ |
| +1 💚 | mvninstall | 15m 0s | master passed |
| +1 💚 | compile | 0m 59s | master passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 💚 | compile | 0m 55s | master passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 💚 | checkstyle | 1m 32s | master passed |
| +1 💚 | javadoc | 1m 3s | master passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 💚 | javadoc | 0m 53s | master passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +0 🆗 | spotbugs | 1m 53s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 💚 | findbugs | 1m 51s | master passed |
| | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 0m 27s | the patch passed |
| +1 💚 | compile | 0m 30s | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 💚 | javac | 0m 30s | the patch passed |
| +1 💚 | compile | 0m 27s | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 💚 | javac | 0m 27s | the patch passed |
| +1 💚 | checkstyle | 0m 25s | the patch passed |
| +1 💚 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 💚 | javadoc | 0m 24s | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 💚 | javadoc | 0m 23s | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 💚 | findbugs | 1m 10s | the patch passed |
| | | | _ Other Tests _ |
| +1 💚 | unit | 5m 23s | tez-dag in the patch passed. |
| +1 💚 | asflicense | 0m 16s | The patch does not generate ASF License warnings. |
| | | 65m 20s | |

| Subsystem | Report/Notes |
|-----------|--------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-235/1/artifact/out/Dockerfile |
| GITHUB PR | #235 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile |
| uname | Linux b1c11f8cf5dd 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 621a831 |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Test Results | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-235/1/testReport/ |
| Max. process+thread count | 228 (vs. ulimit of 5500) |
| modules | C: tez-dag U: tez-dag |
| Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-235/1/console |
| versions | git=2.25.1 maven=3.6.3 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |

This message was automatically generated.

@abstractdog
Contributor

thanks for this patch @zhengchenyu!

can you include a unit test to TestTaskScheduler which confirms that a TaskScheduler returns Resource(0,0) even if the RM client returned null?

I'm not familiar with yarn federation, but defaulting to Resource(0,0) makes sense in edge cases
can you please clarify if this is specific to yarn federation or can happen without yarn federation too? (it has never been reported yet)
why does it return null? does it reflect the state of a specific RM or the whole cluster of RMs?

@zhengchenyu
Contributor Author

> thanks for this patch @zhengchenyu!
>
> can you include a unit test to TestTaskScheduler which confirms that a TaskScheduler returns Resource(0,0) even if the RM client returned null?
>
> I'm not familiar with yarn federation, but defaulting to Resource(0,0) makes sense in edge cases
> can you please clarify if this is specific to yarn federation or can happen without yarn federation too? (it has never been reported yet)
> why does it return null? does it reflect the state of a specific RM or the whole cluster of RMs?

It happens only with YARN federation; it never happens without it.
In fact, YARN-8933 has already fixed it. Once YARN-8933 is applied, it no longer happens with YARN federation either.
I'm not sure whether it is worth continuing with this patch: it is not a problem on the latest Hadoop version, but it is still a problem on some popular versions (for example, hadoop-3.2.1).
If you think it is necessary, I will add a unit test. If you think it is not necessary, I will close this PR.

As for why it returns null with YARN federation:

That is a separate issue on the YARN side.
The YARN router uses asynchronous threads to connect to the RMs. When all downstream ResourceManagers time out, the router may return null. After YARN-8933, it returns Resource(0,0) instead.
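
For illustration only, here is a minimal, self-contained sketch of the null fallback discussed above. It is not the code from this patch: `SimpleResource` is a hypothetical stand-in for YARN's `Resource`, and `availableOrZero` and the `Supplier` playing the RM client are assumed names that exist only so the example runs without Hadoop on the classpath.

```java
import java.util.function.Supplier;

public class HeadroomFallbackSketch {

  /** Hypothetical stand-in for YARN's Resource: memory in MB plus virtual cores. */
  static final class SimpleResource {
    final long memoryMb;
    final int vcores;

    SimpleResource(long memoryMb, int vcores) {
      this.memoryMb = memoryMb;
      this.vcores = vcores;
    }

    @Override
    public String toString() {
      return "Resource(" + memoryMb + "," + vcores + ")";
    }
  }

  /**
   * Returns whatever the (assumed) RM client reports, or Resource(0,0) when it
   * reports null, e.g. when a federation router got no answer from any RM.
   */
  static SimpleResource availableOrZero(Supplier<SimpleResource> rmClient) {
    SimpleResource reported = rmClient.get();
    return reported != null ? reported : new SimpleResource(0, 0);
  }

  public static void main(String[] args) {
    // Client that returns null, as described for hadoop-3.2.1 with federation.
    System.out.println(availableOrZero(() -> null));
    // Client that returns a real value; it is passed through unchanged.
    System.out.println(availableOrZero(() -> new SimpleResource(4096, 2)));
  }
}
```

A unit test for the real scheduler would follow the same shape: stub the RM client to return null and check that the scheduler reports Resource(0,0).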

@abstractdog
Contributor

thanks @zhengchenyu, after reading YARN-8933 this definitely makes sense
I don't insist on adding a unit test as we're "fixing" a yarn issue here, which is not present anymore after YARN-8933

@abstractdog abstractdog self-requested a review August 19, 2022 11:25
@abstractdog abstractdog merged commit db915eb into apache:master Aug 29, 2022
asfgit pushed a commit that referenced this pull request Aug 29, 2022
udaynpusa pushed a commit to mapr/tez that referenced this pull request Jan 30, 2024
…#235) (zhengchenyu reviewed by Laszlo Bodor)

(cherry picked from commit db915eb)