-
Notifications
You must be signed in to change notification settings - Fork 2.5k
Description
Describe the problem you faced
We 've noticed that when you enable hoodie JMX metrics Shutdown hook is never called.
We took a thread dump to check with threads remain in running state and keep the JVM up .
"RMI TCP Accept-9889": running
at [java.net.PlainSocketImpl.socketAccept(Native Method)](https://spotify.github.io/threaddump-analyzer/#java.net.PlainSocketImpl.socketAccept(Native%20Method))
at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
at java.net.ServerSocket.implAccept(ServerSocket.java:560)
at java.net.ServerSocket.accept(ServerSocket.java:528)
at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:405)
at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:377)
at java.lang.Thread.run(Thread.java:750)
To Reproduce
Steps to reproduce the behavior:
- Enable JMX hoodie metrics
hoodie.metrics.on=true
hoodie.metrics.reporter.type=JMX
- Run a Deltastreamer job on Kubernetes
Expected behavior
We expect when Spark Context is stopped also the JVM should stop.
Environment Description
-
Hudi version : 0.9.0
-
Spark version : 3.1.2
-
Hive version : 2.3.7
-
Hadoop version : N/A
-
Storage (HDFS/S3/GCS..) : S3
-
Running on Docker? (yes/no) : yes
-
Java Version :
openjdk version "1.8.0_322"
OpenJDK Runtime Environment (build 1.8.0_322-b06)
OpenJDK 64-Bit Server VM (build 25.322-b06, mixed mode)
&
openjdk version "11.0.14.1" 2022-02-08
OpenJDK Runtime Environment 18.9 (build 11.0.14.1+1)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.14.1+1, mixed mode, sharing)
Additional context
We tested this with both Java 11 and Java 8 and got the same results .
You can find below the full thread dump output once the Spark Context is succefully stoppped .
We got it with kill -3 <pid> .
The only relevant issue I could find is this
Full Thread Dump Output
Full thread dump OpenJDK 64-Bit Server VM (25.322-b06 mixed mode):
"DestroyJavaVM" #73 prio=5 os_prio=0 tid=0x00007f8a7c011000 nid=0x1a waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"RMI TCP Connection(idle)" #67 daemon prio=5 os_prio=0 tid=0x00007f8974003800 nid=0x5d waiting on condition [0x00007f897b8f9000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000c10bc948> (a java.util.concurrent.SynchronousQueue$TransferStack)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
"RMI RenewClean-[100.97.64.2:9889]" #66 daemon prio=5 os_prio=0 tid=0x00007f896c006800 nid=0x5c in Object.wait() [0x00007f897b9fa000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
- locked <0x00000000c1064a20> (a java.lang.ref.ReferenceQueue$Lock)
at sun.rmi.transport.DGCClient$EndpointEntry$RenewCleanThread.run(DGCClient.java:563)
at java.lang.Thread.run(Thread.java:750)
"RMI Scheduler(0)" #65 daemon prio=5 os_prio=0 tid=0x00007f8a7ddf6000 nid=0x5b waiting on condition [0x00007f897bafb000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000c1064c28> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
"RMI TCP Connection(idle)" #64 daemon prio=5 os_prio=0 tid=0x00007f8974001800 nid=0x5a waiting on condition [0x00007f897bbfc000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000c10bc948> (a java.util.concurrent.SynchronousQueue$TransferStack)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
"GC Daemon" #63 daemon prio=2 os_prio=0 tid=0x00007f8a7dde9800 nid=0x59 in Object.wait() [0x00007f897befd000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at sun.misc.GC$Daemon.run(GC.java:117)
- locked <0x00000000c1067260> (a sun.misc.GC$LatencyLock)
"RMI Reaper" #62 prio=5 os_prio=0 tid=0x00007f8a7dde7800 nid=0x58 in Object.wait() [0x00007f897bffe000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
- locked <0x00000000c10673e8> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
at sun.rmi.transport.ObjectTable$Reaper.run(ObjectTable.java:351)
at java.lang.Thread.run(Thread.java:750)
"RMI TCP Accept-9889" #61 daemon prio=5 os_prio=0 tid=0x00007f8a7c275800 nid=0x57 runnable [0x00007f89d81e5000]
java.lang.Thread.State: RUNNABLE
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
at java.net.ServerSocket.implAccept(ServerSocket.java:560)
at java.net.ServerSocket.accept(ServerSocket.java:528)
at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:405)
at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:377)
at java.lang.Thread.run(Thread.java:750)
"java-sdk-http-connection-reaper" #58 daemon prio=5 os_prio=0 tid=0x00007f8a7da85800 nid=0x54 waiting on condition [0x00007f89d8ae8000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at com.amazonaws.http.IdleConnectionReaper.run(IdleConnectionReaper.java:181)
"Timer for 's3a-file-system' metrics system" #57 daemon prio=5 os_prio=0 tid=0x00007f8a7e0fa000 nid=0x53 in Object.wait() [0x00007f89d8fe9000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.util.TimerThread.mainLoop(Timer.java:552)
- locked <0x00000000d6720a88> (a java.util.TaskQueue)
at java.util.TimerThread.run(Timer.java:505)
"MutableQuantiles-0" #56 daemon prio=5 os_prio=0 tid=0x00007f8a7e0c2000 nid=0x52 waiting on condition [0x00007f89d90ea000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000d66e87c8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
"org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner" #55 daemon prio=5 os_prio=0 tid=0x00007f8a7e0bd000 nid=0x51 in Object.wait() [0x00007f89d91eb000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
- locked <0x00000000d66e8a08> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
at org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:3762)
at java.lang.Thread.run(Thread.java:750)
"Okio Watchdog" #41 daemon prio=5 os_prio=0 tid=0x00007f8a7d1ea800 nid=0x41 in Object.wait() [0x00007f89dbbfe000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at okio.AsyncTimeout.awaitTimeout(AsyncTimeout.java:347)
at okio.AsyncTimeout$Watchdog.run(AsyncTimeout.java:312)
- locked <0x00000000d62d18c8> (a java.lang.Class for okio.AsyncTimeout)
"OkHttp ConnectionPool" #40 daemon prio=5 os_prio=0 tid=0x00007f8a7da40000 nid=0x40 in Object.wait() [0x00007f8a4c69b000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:460)
at okhttp3.ConnectionPool$1.run(ConnectionPool.java:67)
- locked <0x00000000d5ccb0e8> (a okhttp3.ConnectionPool)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
"process reaper" #10 daemon prio=10 os_prio=0 tid=0x00007f8a7ec2d800 nid=0x24 waiting on condition [0x00007f8a6c082000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000d5888a60> (a java.util.concurrent.SynchronousQueue$TransferStack)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
"Service Thread" #7 daemon prio=9 os_prio=0 tid=0x00007f8a7c0c1800 nid=0x21 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007f8a7c0bc800 nid=0x20 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007f8a7c0ba800 nid=0x1f waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007f8a7c0a6800 nid=0x1e waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f8a7c083000 nid=0x1d in Object.wait() [0x00007f8a810e3000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
- locked <0x00000000d55feb78> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:216)
"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007f8a7c07e800 nid=0x1c in Object.wait() [0x00007f8a811e4000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
- locked <0x00000000d55fed30> (a java.lang.ref.Reference$Lock)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
"VM Thread" os_prio=0 tid=0x00007f8a7c075000 nid=0x1b runnable
"VM Periodic Task Thread" os_prio=0 tid=0x00007f8a7c0c4800 nid=0x22 waiting on condition
Metadata
Metadata
Assignees
Labels
Type
Projects
Status