HDFS-17067 Use BlockingThreadPoolExecutorService for nnProbingThreadPool in ObserverReadProxy #5803

Merged 7 commits on Jul 20, 2023

@xinglin xinglin commented Jul 3, 2023

Description of PR

In HDFS-17030, we introduced an ExecutorService to submit getHAServiceState() requests. We constructed it directly from a basic ThreadPoolExecutor without setting allowCoreThreadTimeOut to true. As a result, the core thread stays alive, waiting for new tasks, even after the main thread exits. One fix is to set allowCoreThreadTimeOut to true. In this PR, however, we instead use an existing ExecutorService implementation in Hadoop, BlockingThreadPoolExecutorService. It takes care of setting allowCoreThreadTimeOut and also allows setting a prefix for thread names. The previous construction was:

  private final ExecutorService nnProbingThreadPool =
      new ThreadPoolExecutor(1, 4, 1L, TimeUnit.MINUTES,
          new ArrayBlockingQueue<Runnable>(1024));
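
For comparison, the alternative fix mentioned in the description (keeping ThreadPoolExecutor and enabling core-thread timeout) might look like the following JDK-only sketch. The class name, the `newProbingPool` helper, and the `nn-probe` prefix are hypothetical and are not taken from the patch; only the pool parameters mirror the snippet above.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ProbingPoolSketch {
  // Hypothetical helper (not from the patch): same sizing as the original
  // pool, plus a thread-name prefix and core-thread timeout.
  static ThreadPoolExecutor newProbingPool(String prefix) {
    AtomicInteger counter = new AtomicInteger(1);
    ThreadFactory factory =
        r -> new Thread(r, prefix + "-" + counter.getAndIncrement());
    ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 4, 1L, TimeUnit.MINUTES,
        new ArrayBlockingQueue<Runnable>(1024), factory);
    // Let idle core threads exit after the keep-alive time, so they do not
    // keep the JVM alive indefinitely.
    pool.allowCoreThreadTimeOut(true);
    return pool;
  }

  public static void main(String[] args) throws Exception {
    ThreadPoolExecutor pool = newProbingPool("nn-probe");
    Callable<String> task = () -> Thread.currentThread().getName();
    String workerName = pool.submit(task).get();
    System.out.println(workerName
        + " allowsCoreThreadTimeOut=" + pool.allowsCoreThreadTimeOut());
    pool.shutdown(); // exit promptly instead of waiting out the keep-alive
  }
}
```

This only illustrates the JDK mechanism; the PR itself relies on BlockingThreadPoolExecutorService to apply the equivalent settings.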

A second, minor issue is that we did not shut down the ExecutorService in close(). It is minor because close() is only called when the garbage collector starts to reclaim an ObserverReadProxyProvider object, not at the moment the last reference to that object goes away. The time between an ObserverReadProxyProvider becoming unreferenced and the garbage collector actually reclaiming it is outside our control and undefined (unless the program is shut down with an explicit System.exit(1)).
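
As a rough illustration of shutting the pool down in close(), here is a minimal JDK-only sketch. The class `ClosableProberSketch` and its `pool()` accessor are hypothetical stand-ins, not the ObserverReadProxyProvider code, and a fixed-size pool is used only to keep the example short.

```java
import java.io.Closeable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for a class that owns a probing thread pool.
public class ClosableProberSketch implements Closeable {
  private final ExecutorService probingPool = Executors.newFixedThreadPool(2);

  // Exposed only so the example can inspect the pool state.
  ExecutorService pool() {
    return probingPool;
  }

  @Override
  public void close() {
    // shutdown() stops accepting new tasks and lets already submitted tasks
    // finish; once the pool is idle, its worker threads exit.
    probingPool.shutdown();
  }

  public static void main(String[] args) {
    ClosableProberSketch prober = new ClosableProberSketch();
    try (ClosableProberSketch p = prober) {
      p.pool().submit(() -> System.out.println("probe ran"));
    }
    // close() has run by now, so the pool no longer accepts tasks.
    System.out.println("pool shut down: " + prober.pool().isShutdown());
  }
}
```

With an explicit shutdown, thread cleanup no longer depends on when, or whether, the owner is closed by some later mechanism.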

I also tested with a standalone Java program.

  1. When pool.allowCoreThreadTimeOut(true); is commented out, the JVM process does not exit (no "Process finished with exit code 0" line is printed). The thread dump shows myThread-1 still waiting for new tasks.
Mon Jul 03 15:42:50 PDT 2023: Main thread started
Mon Jul 03 15:42:50 PDT 2023: task is running
Mon Jul 03 15:42:51 PDT 2023: Main thread exited

[Screenshot of the thread dump, taken 2023-07-03]

  2. When pool.allowCoreThreadTimeOut(true); is enabled (not commented out), the JVM process exits after 10 seconds, once the idle core thread reaches its keep-alive timeout.
Mon Jul 03 15:43:43 PDT 2023: Main thread started
Mon Jul 03 15:43:43 PDT 2023: task is running
Mon Jul 03 15:43:44 PDT 2023: Main thread exited

Process finished with exit code 0

The standalone test program:

import java.io.Closeable;
import java.io.IOException;
import java.util.Date;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;


public class ExecutorServiceCoreThreadIdleTimeoutTest implements Closeable {
  ExecutorServiceCoreThreadIdleTimeoutTest() {
    pool =
    new ThreadPoolExecutor(1, 4, 10, TimeUnit.SECONDS, new ArrayBlockingQueue<Runnable>(1024),
        namedThreadFactory);
     
    //pool.allowCoreThreadTimeOut(true);
  }

  ThreadFactory namedThreadFactory = new ThreadFactory() {
    private final AtomicInteger threadNumber = new AtomicInteger(1);

    @Override
    public Thread newThread(Runnable r) {
      String name = "myThread-" + threadNumber.getAndIncrement();
      return new Thread(r, name);
    }
  };

  private final ThreadPoolExecutor pool;

  public void submitTask() {

    pool.submit(() -> {
      System.out.printf("%tc: task is running\n", new Date());
    });
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.printf("%tc: Main thread started\n", new Date());

    ExecutorServiceCoreThreadIdleTimeoutTest test = new ExecutorServiceCoreThreadIdleTimeoutTest();
    test.submitTask();
    Thread.sleep(1000);
    System.out.printf("%tc: Main thread exited\n", new Date());
  }

  @Override
  public void close() throws IOException {
    pool.shutdown();
    System.out.printf("%tc: shutdown thread pool\n", new Date());
  }
}
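
The same behavior can also be checked programmatically rather than by watching whether the JVM exits. The sketch below is an assumption-laden variant of the program above: the class and method names are hypothetical, and the keep-alive is shortened to 100 ms so the check finishes quickly.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CoreThreadTimeoutCheck {
  // Returns the number of live pool threads after the pool has been idle
  // for well over its keep-alive time.
  static int idlePoolSize(boolean allowCoreTimeout) throws InterruptedException {
    ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 4, 100, TimeUnit.MILLISECONDS,
        new ArrayBlockingQueue<Runnable>(16));
    pool.allowCoreThreadTimeOut(allowCoreTimeout);
    pool.execute(() -> { }); // starts the single core thread
    Thread.sleep(1000);      // idle for 10x the keep-alive time
    int size = pool.getPoolSize();
    pool.shutdown();         // always release the thread so the JVM can exit
    return size;
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("core timeout disabled: " + idlePoolSize(false) + " thread(s)");
    System.out.println("core timeout enabled:  " + idlePoolSize(true) + " thread(s)");
  }
}
```

With core-thread timeout disabled, the core thread is expected to remain (pool size 1); with it enabled, the pool should drain to 0 threads.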

How was this patch tested?

~/p/h/t/h/hadoop-hdfs (HDFS-17067)> mvn test -Dtest="TestObserverReadProxyProvider"
[INFO] Running org.apache.hadoop.hdfs.server.namenode.ha.TestObserverReadProxyProvider
[INFO] Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.965 s - in org.apache.hadoop.hdfs.server.namenode.ha.TestObserverReadProxyProvider
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 16, Failures: 0, Errors: 0, Skipped: 0

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|:----------|--------:|:--------|:--------|
| +0 🆗 | reexec | 0m 39s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 1s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 ❌ | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | | _ trunk Compile Tests _ |
| +1 💚 | mvninstall | 47m 11s | | trunk passed |
| +1 💚 | compile | 1m 3s | | trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 |
| +1 💚 | compile | 0m 59s | | trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09 |
| +1 💚 | checkstyle | 0m 35s | | trunk passed |
| +1 💚 | mvnsite | 1m 2s | | trunk passed |
| +1 💚 | javadoc | 0m 54s | | trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 |
| +1 💚 | javadoc | 0m 46s | | trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09 |
| +1 💚 | spotbugs | 2m 40s | | trunk passed |
| +1 💚 | shadedclient | 35m 13s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 0m 53s | | the patch passed |
| +1 💚 | compile | 0m 53s | | the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 |
| +1 💚 | javac | 0m 53s | | the patch passed |
| +1 💚 | compile | 0m 46s | | the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09 |
| +1 💚 | javac | 0m 46s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 0m 24s | | the patch passed |
| +1 💚 | mvnsite | 0m 49s | | the patch passed |
| +1 💚 | javadoc | 0m 36s | | the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 |
| +1 💚 | javadoc | 0m 37s | | the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09 |
| +1 💚 | spotbugs | 2m 33s | | the patch passed |
| +1 💚 | shadedclient | 35m 19s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 2m 26s | | hadoop-hdfs-client in the patch passed. |
| +1 💚 | asflicense | 0m 41s | | The patch does not generate ASF License warnings. |
| | | 138m 59s | | |

| Subsystem | Report/Notes |
|:----------|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5803/1/artifact/out/Dockerfile |
| GITHUB PR | #5803 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 2a0a727e34de 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 6b71a65 |
| Default Java | Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5803/1/testReport/ |
| Max. process+thread count | 596 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-client U: hadoop-hdfs-project/hadoop-hdfs-client |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5803/1/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|:----------|--------:|:--------|:--------|
| +0 🆗 | reexec | 0m 37s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 1s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 ❌ | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | | _ trunk Compile Tests _ |
| +1 💚 | mvninstall | 44m 9s | | trunk passed |
| +1 💚 | compile | 1m 2s | | trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 |
| +1 💚 | compile | 0m 59s | | trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09 |
| +1 💚 | checkstyle | 0m 36s | | trunk passed |
| +1 💚 | mvnsite | 1m 3s | | trunk passed |
| +1 💚 | javadoc | 0m 53s | | trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 |
| +1 💚 | javadoc | 0m 46s | | trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09 |
| +1 💚 | spotbugs | 2m 39s | | trunk passed |
| +1 💚 | shadedclient | 34m 58s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 0m 52s | | the patch passed |
| +1 💚 | compile | 0m 53s | | the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 |
| +1 💚 | javac | 0m 53s | | the patch passed |
| +1 💚 | compile | 0m 47s | | the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09 |
| +1 💚 | javac | 0m 47s | | the patch passed |
| +1 💚 | blanks | 0m 1s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 0m 23s | | the patch passed |
| +1 💚 | mvnsite | 0m 51s | | the patch passed |
| +1 💚 | javadoc | 0m 36s | | the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 |
| +1 💚 | javadoc | 0m 35s | | the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09 |
| +1 💚 | spotbugs | 2m 37s | | the patch passed |
| +1 💚 | shadedclient | 35m 22s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 2m 26s | | hadoop-hdfs-client in the patch passed. |
| +1 💚 | asflicense | 0m 41s | | The patch does not generate ASF License warnings. |
| | | 135m 13s | | |

| Subsystem | Report/Notes |
|:----------|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5803/2/artifact/out/Dockerfile |
| GITHUB PR | #5803 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 5221563ce6eb 4.15.0-212-generic #223-Ubuntu SMP Tue May 23 13:09:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / d5cee02 |
| Default Java | Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5803/2/testReport/ |
| Max. process+thread count | 652 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-client U: hadoop-hdfs-project/hadoop-hdfs-client |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5803/2/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|:----------|--------:|:--------|:--------|
| +0 🆗 | reexec | 0m 38s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 1s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 ❌ | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | | _ trunk Compile Tests _ |
| +1 💚 | mvninstall | 45m 20s | | trunk passed |
| +1 💚 | compile | 1m 1s | | trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 |
| +1 💚 | compile | 0m 57s | | trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09 |
| +1 💚 | checkstyle | 0m 37s | | trunk passed |
| +1 💚 | mvnsite | 1m 1s | | trunk passed |
| +1 💚 | javadoc | 0m 52s | | trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 |
| +1 💚 | javadoc | 0m 46s | | trunk passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09 |
| +1 💚 | spotbugs | 2m 39s | | trunk passed |
| +1 💚 | shadedclient | 35m 9s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 0m 53s | | the patch passed |
| +1 💚 | compile | 0m 53s | | the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 |
| +1 💚 | javac | 0m 53s | | the patch passed |
| +1 💚 | compile | 0m 47s | | the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09 |
| +1 💚 | javac | 0m 47s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 0m 24s | | the patch passed |
| +1 💚 | mvnsite | 0m 51s | | the patch passed |
| +1 💚 | javadoc | 0m 35s | | the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 |
| +1 💚 | javadoc | 0m 35s | | the patch passed with JDK Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09 |
| +1 💚 | spotbugs | 2m 38s | | the patch passed |
| +1 💚 | shadedclient | 34m 41s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 2m 28s | | hadoop-hdfs-client in the patch passed. |
| +1 💚 | asflicense | 0m 41s | | The patch does not generate ASF License warnings. |
| | | 135m 51s | | |

| Subsystem | Report/Notes |
|:----------|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5803/3/artifact/out/Dockerfile |
| GITHUB PR | #5803 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux cc8c458cde8c 4.15.0-212-generic #223-Ubuntu SMP Tue May 23 13:09:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / d5cee02 |
| Default Java | Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-gaus1-0ubuntu120.04-b09 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5803/3/testReport/ |
| Max. process+thread count | 738 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-client U: hadoop-hdfs-project/hadoop-hdfs-client |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5803/3/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

@xinglin xinglin marked this pull request as ready for review July 4, 2023 02:57

xinglin commented Jul 4, 2023

Hi @goiri,

In this PR, we essentially replaced ThreadPoolExecutor with BlockingThreadPoolExecutorService, which comes with some default settings. I am not sure what unit test we should add here. What do you think? Can we merge this change without adding new unit tests?


xinglin commented Jul 18, 2023

Hi @goiri,

Could you review this PR as well? thanks,


@mccormickt12 mccormickt12 left a comment

lgtm.
Per @xinglin: this is deployed at LinkedIn and has been running for approximately a day without the thread issue. Previously, the thread issue appeared within 4 hours.


xinglin commented Jul 20, 2023

Thanks @mccormickt12 for reviewing and approving the PR!

@goiri, could you take a look? thanks,

@goiri goiri merged commit 80fefd0 into apache:trunk Jul 20, 2023

xinglin commented Jul 20, 2023

thanks @goiri for committing this PR to trunk.

xinglin added a commit to xinglin/hadoop that referenced this pull request Jul 23, 2023
jiajunmao pushed a commit to jiajunmao/hadoop-MLEC that referenced this pull request Feb 6, 2024