@turboFei
Member

@turboFei turboFei commented Apr 23, 2025

Why are the changes needed?

Followup for #7034 to fix the SparkOnKubernetesTestsSuite.

Sorry, I forgot that the appInfo name and the pod name were tightly coupled before: the appInfo name was used as the pod name, including when deleting the pod.

This PR adds podName to applicationInfo so that the app name and the pod name are tracked separately.
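
To illustrate the idea, here is a minimal Scala sketch of what decoupling the two names could look like. It is not the actual Kyuubi code: `ApplicationInfo` is reduced to a few fields, and `FakePodClient` and `PodCleanup` are hypothetical stand-ins introduced only for this example.

```scala
// Hypothetical, simplified model for illustration; not the real Kyuubi classes.
final case class ApplicationInfo(
    id: String,
    name: String, // Spark application name, e.g. "spark-pi"
    state: String,
    podName: Option[String] = None) // driver pod name, tracked separately from the app name

// Stand-in for a Kubernetes client; it only records which pods were deleted.
final class FakePodClient {
  private val deleted = scala.collection.mutable.ListBuffer.empty[String]
  def deletePod(name: String): Unit = deleted += name
  def deletedPods: List[String] = deleted.toList
}

object PodCleanup {
  // Delete the driver pod by its recorded pod name.
  // The app name is never used as a fallback, because it may differ from the pod name.
  def killDriverPod(client: FakePodClient, info: ApplicationInfo): Boolean =
    info.podName match {
      case Some(pod) =>
        client.deletePod(pod)
        true
      case None => false
    }
}
```

With this shape, an application named `spark-pi` can run in a pod with an unrelated name, and cleanup still targets the right pod.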

How was this patch tested?

GA should pass.

Was this patch authored or co-authored using generative AI tooling?

No.

@turboFei turboFei changed the title fix test Fix flaky kubernetes integration test Apr 23, 2025
@codecov-commenter

codecov-commenter commented Apr 23, 2025

Codecov Report

Attention: Patch coverage is 0% with 11 lines in your changes missing coverage. Please review.

Project coverage is 0.00%. Comparing base (f0c31e2) to head (0ff7018).
Report is 5 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...kyuubi/engine/KubernetesApplicationOperation.scala | 0.00% | 7 Missing ⚠️ |
| ...rg/apache/kyuubi/engine/ApplicationOperation.scala | 0.00% | 4 Missing ⚠️ |
Additional details and impacted files
@@          Coverage Diff           @@
##           master   #7039   +/-   ##
======================================
  Coverage    0.00%   0.00%           
======================================
  Files         695     695           
  Lines       42816   42827   +11     
  Branches     5830    5832    +2     
======================================
- Misses      42816   42827   +11     


@turboFei
Member Author

(base) ➜  target grep eventType= unit-tests.log
07:43:13.820 -1704927850-pool-118-thread-2 INFO KubernetesApplicationAuditLogger: eventType=ADD	label=13496573-9b18-4a25-b919-2b8f0c3fc1f5	context=null	namespace=null	pod=kyuubi-spark-driver-1745394188472	podState=Pending	containers=[]	appId=spark-57718fe22cdf4ea1a8c59c37b84c4365	appName=spark-pi	appState=PENDING	appError=''
07:43:13.825 -1704927850-pool-117-thread-2 INFO KubernetesApplicationAuditLogger: eventType=ADD	label=13496573-9b18-4a25-b919-2b8f0c3fc1f5	context=minikube	namespace=null	pod=kyuubi-spark-driver-1745394188472	podState=Pending	containers=[]	appId=spark-57718fe22cdf4ea1a8c59c37b84c4365	appName=spark-pi	appState=PENDING	appError=''
07:43:13.829 -1704927850-pool-118-thread-1 INFO KubernetesApplicationAuditLogger: eventType=UPDATE	label=13496573-9b18-4a25-b919-2b8f0c3fc1f5	context=null	namespace=null	pod=kyuubi-spark-driver-1745394188472	podState=Pending	containers=[]	appId=spark-57718fe22cdf4ea1a8c59c37b84c4365	appName=spark-pi	appState=PENDING	appError=''
07:43:13.840 -1704927850-pool-118-thread-1 INFO KubernetesApplicationAuditLogger: eventType=UPDATE	label=13496573-9b18-4a25-b919-2b8f0c3fc1f5	context=null	namespace=null	pod=kyuubi-spark-driver-1745394188472	podState=Pending	containers=[spark-kubernetes-driver->ContainerState(running=null, terminated=null, waiting=ContainerStateWaiting(message=null, reason=ContainerCreating, additionalProperties={}), additionalProperties={})]	appId=spark-57718fe22cdf4ea1a8c59c37b84c4365	appName=spark-pi	appState=PENDING	appError=''
07:43:13.840 -1704927850-pool-117-thread-4 INFO KubernetesApplicationAuditLogger: eventType=UPDATE	label=13496573-9b18-4a25-b919-2b8f0c3fc1f5	context=minikube	namespace=null	pod=kyuubi-spark-driver-1745394188472	podState=Pending	containers=[]	appId=spark-57718fe22cdf4ea1a8c59c37b84c4365	appName=spark-pi	appState=PENDING	appError=''
07:43:13.841 -1704927850-pool-117-thread-3 INFO KubernetesApplicationAuditLogger: eventType=UPDATE	label=13496573-9b18-4a25-b919-2b8f0c3fc1f5	context=minikube	namespace=null	pod=kyuubi-spark-driver-1745394188472	podState=Pending	containers=[spark-kubernetes-driver->ContainerState(running=null, terminated=null, waiting=ContainerStateWaiting(message=null, reason=ContainerCreating, additionalProperties={}), additionalProperties={})]	appId=spark-57718fe22cdf4ea1a8c59c37b84c4365	appName=spark-pi	appState=PENDING	appError=''
07:43:15.257 -1704927850-pool-118-thread-2 INFO KubernetesApplicationAuditLogger: eventType=UPDATE	label=13496573-9b18-4a25-b919-2b8f0c3fc1f5	context=null	namespace=null	pod=kyuubi-spark-driver-1745394188472	podState=Running	containers=[spark-kubernetes-driver->ContainerState(running=ContainerStateRunning(startedAt=2025-04-23T07:43:14Z, additionalProperties={}), terminated=null, waiting=null, additionalProperties={})]	appId=spark-57718fe22cdf4ea1a8c59c37b84c4365	appName=spark-pi	appState=RUNNING	appError=''
07:43:15.266 -1704927850-pool-117-thread-4 INFO KubernetesApplicationAuditLogger: eventType=UPDATE	label=13496573-9b18-4a25-b919-2b8f0c3fc1f5	context=minikube	namespace=null	pod=kyuubi-spark-driver-1745394188472	podState=Running	containers=[spark-kubernetes-driver->ContainerState(running=ContainerStateRunning(startedAt=2025-04-23T07:43:14Z, additionalProperties={}), terminated=null, waiting=null, additionalProperties={})]	appId=spark-57718fe22cdf4ea1a8c59c37b84c4365	appName=spark-pi	appState=RUNNING	appError=''
07:43:23.334 -1704927850-pool-117-thread-4 INFO KubernetesApplicationAuditLogger: eventType=UPDATE	label=13496573-9b18-4a25-b919-2b8f0c3fc1f5	context=minikube	namespace=null	pod=kyuubi-spark-driver-1745394188472	podState=Running	containers=[spark-kubernetes-driver->ContainerState(running=null, terminated=ContainerStateTerminated(containerID=docker://f19b1e67b63e5d0e73763367a8f0f8ce291b8b05a7776bf6e17ee30d6344e213, exitCode=0, finishedAt=2025-04-23T07:43:22Z, message=null, reason=Completed, signal=null, startedAt=2025-04-23T07:43:14Z, additionalProperties={}), waiting=null, additionalProperties={})]	appId=spark-57718fe22cdf4ea1a8c59c37b84c4365	appName=spark-pi	appState=RUNNING	appError=''
07:43:23.337 -1704927850-pool-118-thread-2 INFO KubernetesApplicationAuditLogger: eventType=UPDATE	label=13496573-9b18-4a25-b919-2b8f0c3fc1f5	context=null	namespace=null	pod=kyuubi-spark-driver-1745394188472	podState=Running	containers=[spark-kubernetes-driver->ContainerState(running=null, terminated=ContainerStateTerminated(containerID=docker://f19b1e67b63e5d0e73763367a8f0f8ce291b8b05a7776bf6e17ee30d6344e213, exitCode=0, finishedAt=2025-04-23T07:43:22Z, message=null, reason=Completed, signal=null, startedAt=2025-04-23T07:43:14Z, additionalProperties={}), waiting=null, additionalProperties={})]	appId=spark-57718fe22cdf4ea1a8c59c37b84c4365	appName=spark-pi	appState=RUNNING	appError=''
07:43:24.378 -1704927850-pool-118-thread-1 INFO KubernetesApplicationAuditLogger: eventType=UPDATE	label=13496573-9b18-4a25-b919-2b8f0c3fc1f5	context=null	namespace=null	pod=kyuubi-spark-driver-1745394188472	podState=Succeeded   	containers=[spark-kubernetes-driver->ContainerState(running=null, terminated=ContainerStateTerminated(containerID=docker://f19b1e67b63e5d0e73763367a8f0f8ce291b8b05a7776bf6e17ee30d6344e213, exitCode=0, finishedAt=2025-04-23T07:43:22Z, message=null, reason=Completed, signal=null, startedAt=2025-04-23T07:43:14Z, additionalProperties={}), waiting=null, additionalProperties={})]	appId=spark-57718fe22cdf4ea1a8c59c37b84c4365	appName=spark-pi	appState=FINISHED	appError=''
07:43:24.379 -1704927850-pool-117-thread-3 INFO KubernetesApplicationAuditLogger: eventType=UPDATE	label=13496573-9b18-4a25-b919-2b8f0c3fc1f5	context=minikube	namespace=null	pod=kyuubi-spark-driver-1745394188472	podState=Succeeded	containers=[spark-kubernetes-driver->ContainerState(running=null, terminated=ContainerStateTerminated(containerID=docker://f19b1e67b63e5d0e73763367a8f0f8ce291b8b05a7776bf6e17ee30d6344e213, exitCode=0, finishedAt=2025-04-23T07:43:22Z, message=null, reason=Completed, signal=null, startedAt=2025-04-23T07:43:14Z, additionalProperties={}), waiting=null, additionalProperties={})]	appId=spark-57718fe22cdf4ea1a8c59c37b84c4365	appName=spark-pi	appState=FINISHED	appError=''
07:46:14.726 -1704927850-pool-117-thread-6 INFO KubernetesApplicationAuditLogger: eventType=ADD	label=4ff66d56-ce78-477f-b51c-34902112bc06	context=minikube	namespace=null	pod=kyuubi-user-spark-sql-runner-default-4ff66d56-ce78-477f-b51c-34902112bc06-driver	podState=Pending	containers=[]	appId=spark-bf31068709f241539e413038c05ed49a	appName=kyuubi-user-spark-sql-runner-default-4ff66d56-ce78-477f-b51c-34	appState=PENDING	appError=''
07:46:14.738 -1704927850-pool-117-thread-5 INFO KubernetesApplicationAuditLogger: eventType=UPDATE	label=4ff66d56-ce78-477f-b51c-34902112bc06	context=minikube	namespace=null	pod=kyuubi-user-spark-sql-runner-default-4ff66d56-ce78-477f-b51c-34902112bc06-driver	podState=Pending	containers=[]	appId=spark-bf31068709f241539e413038c05ed49a	appName=kyuubi-user-spark-sql-runner-default-4ff66d56-ce78-477f-b51c-34	appState=PENDING	appError=''
07:46:14.752 -1704927850-pool-117-thread-6 INFO KubernetesApplicationAuditLogger: eventType=UPDATE	label=4ff66d56-ce78-477f-b51c-34902112bc06	context=minikube	namespace=null	pod=kyuubi-user-spark-sql-runner-default-4ff66d56-ce78-477f-b51c-34902112bc06-driver	podState=Pending	containers=[spark-kubernetes-driver->ContainerState(running=null, terminated=null, waiting=ContainerStateWaiting(message=null, reason=ContainerCreating, additionalProperties={}), additionalProperties={})]	appId=spark-bf31068709f241539e413038c05ed49a	appName=kyuubi-user-spark-sql-runner-default-4ff66d56-ce78-477f-b51c-34	appState=PENDING	appError=''
07:46:16.088 -1704927850-pool-117-thread-6 INFO KubernetesApplicationAuditLogger: eventType=UPDATE	label=4ff66d56-ce78-477f-b51c-34902112bc06	context=minikube	namespace=null	pod=kyuubi-user-spark-sql-runner-default-4ff66d56-ce78-477f-b51c-34902112bc06-driver	podState=Running	containers=[spark-kubernetes-driver->ContainerState(running=ContainerStateRunning(startedAt=2025-04-23T07:46:15Z, additionalProperties={}), terminated=null, waiting=null, additionalProperties={})]	appId=spark-bf31068709f241539e413038c05ed49a	appName=kyuubi-user-spark-sql-runner-default-4ff66d56-ce78-477f-b51c-34	appState=RUNNING	appError=''
(base) ➜  target
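
For anyone who wants to compare the `pod` and `appName` fields in these audit lines programmatically, here is a rough Scala sketch. It assumes the records are tab-separated `key=value` pairs, as they appear above; `AuditLogLine` and `parseFields` are hypothetical helper names, not part of Kyuubi.

```scala
object AuditLogLine {
  // Split a tab-separated audit record into a key -> value map.
  // The first field carries the log prefix (timestamp, thread, logger name),
  // so only the token right before the first '=' is kept as its key.
  def parseFields(line: String): Map[String, String] =
    line.split("\t").iterator
      .map(_.trim)
      .filter(_.contains("="))
      .map { field =>
        val idx = field.indexOf('=')
        val rawKey = field.substring(0, idx)
        val key = rawKey.substring(rawKey.lastIndexOf(' ') + 1)
        key -> field.substring(idx + 1)
      }
      .toMap

  // Shortened sample built from the fields of one of the records above.
  val sample: String =
    "07:46:14.738 -1704927850-pool-117-thread-5 INFO KubernetesApplicationAuditLogger: eventType=UPDATE" +
      "\tpod=kyuubi-user-spark-sql-runner-default-4ff66d56-ce78-477f-b51c-34902112bc06-driver" +
      "\tappName=kyuubi-user-spark-sql-runner-default-4ff66d56-ce78-477f-b51c-34" +
      "\tappState=PENDING"
}
```

Parsing `AuditLogLine.sample` gives different values for `pod` and `appName`. In that record the app name appears to be a shortened prefix of the pod name, so the two cannot be used interchangeably.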

@turboFei
Member Author

turboFei commented Apr 24, 2025

07:43:24.443 KyuubiSessionManager-exec-pool: Thread-583 INFO OperationAuditLogger: operation=093505d1-74ab-4f2b-8994-7fd5f4e1f9da	opType=BatchJobSubmission	state=FINISHED	user=runner	session=13496573-9b18-4a25-b919-2b8f0c3fc1f5
07:46:08.577 ScalaTest-main-running-KyuubiOperationKubernetesClusterClusterModeSuite INFO KyuubiOperationKubernetesClusterClusterModeSuite: 

===== FINISHED o.a.k.kubernetes.test.spark.KyuubiOperationKubernetesClusterClusterModeSuite: 'Spark Cluster Mode On Kubernetes Kyuubi KubernetesApplicationOperation Suite' =====

(base) ➜  target grep 'operation=093505d1-74ab-4f2b-8994-7fd5f4e1f9da' unit-tests.log
07:43:08.485 ScalaTest-main-running-KyuubiOperationKubernetesClusterClusterModeSuite INFO OperationAuditLogger: operation=093505d1-74ab-4f2b-8994-7fd5f4e1f9da	opType=BatchJobSubmission	state=INITIALIZED	user=runner	session=13496573-9b18-4a25-b919-2b8f0c3fc1f5
07:43:08.522 ScalaTest-main-running-KyuubiOperationKubernetesClusterClusterModeSuite INFO OperationAuditLogger: operation=093505d1-74ab-4f2b-8994-7fd5f4e1f9da	opType=BatchJobSubmission	state=PENDING	user=runner	session=13496573-9b18-4a25-b919-2b8f0c3fc1f5
07:43:14.619 KyuubiSessionManager-exec-pool: Thread-583 INFO OperationAuditLogger: operation=093505d1-74ab-4f2b-8994-7fd5f4e1f9da	opType=BatchJobSubmission	state=RUNNING	user=runner	session=13496573-9b18-4a25-b919-2b8f0c3fc1f5
07:43:24.443 KyuubiSessionManager-exec-pool: Thread-583 INFO OperationAuditLogger: operation=093505d1-74ab-4f2b-8994-7fd5f4e1f9da	opType=BatchJobSubmission	state=FINISHED	user=runner	session=13496573-9b18-4a25-b919-2b8f0c3fc1f5
(base) ➜  target

@turboFei turboFei changed the title Fix flaky kubernetes integration test [KYUUBI #7034][FOLLOWUP] Fix SparkOnKubernetesTestsSuite caused by app info name change Apr 24, 2025
@turboFei turboFei requested a review from pan3793 April 24, 2025 20:40
@turboFei turboFei added this to the v1.10.2 milestone Apr 24, 2025
@turboFei turboFei self-assigned this Apr 24, 2025
@turboFei turboFei marked this pull request as draft April 24, 2025 22:48
@turboFei turboFei marked this pull request as ready for review April 24, 2025 23:02
@turboFei turboFei force-pushed the fix_test branch 2 times, most recently from 5f383b0 to c4100b0 Compare April 24, 2025 23:06
@turboFei turboFei changed the title [KYUUBI #7034][FOLLOWUP] Fix SparkOnKubernetesTestsSuite caused by app info name change [KYUUBI #7034][FOLLOWUP] Decouple the kubernetes pod name and app name Apr 24, 2025
@turboFei turboFei force-pushed the fix_test branch 2 times, most recently from 4ff56cc to b7914d4 Compare April 24, 2025 23:33
@pan3793
Member

pan3793 commented Apr 25, 2025

LGTM, only a nit about style

@turboFei
Member Author

The Spark K8s IT has passed; merging to fix GA.

@turboFei turboFei closed this in 75891d1 Apr 25, 2025
turboFei added a commit that referenced this pull request Apr 25, 2025
### Why are the changes needed?

Followup for #7034  to fix the SparkOnKubernetesTestsSuite.

Sorry, I forgot that the appInfo name and the pod name were tightly coupled before: the appInfo name was used as the pod name, including when deleting the pod.

This PR adds `podName` to applicationInfo so that the app name and the pod name are tracked separately.

### How was this patch tested?

GA should pass.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #7039 from turboFei/fix_test.

Closes #7034

0ff7018 [Wang, Fei] revert
18e48c0 [Wang, Fei] comments
19f34bc [Wang, Fei] do not get pod name from appName
c1d3084 [Wang, Fei] reduce interval for test stability
50fad6b [Wang, Fei] fix ut

Authored-by: Wang, Fei <fwang12@ebay.com>
Signed-off-by: Wang, Fei <fwang12@ebay.com>
(cherry picked from commit 75891d1)
Signed-off-by: Wang, Fei <fwang12@ebay.com>
@turboFei turboFei deleted the fix_test branch April 25, 2025 05:40
@turboFei
Member Author

thanks, merged to main and 1.10.2


3 participants