[SPARK-43648][CONNECT][TESTS] Move interrupt all related test to a new test file to pass Maven test
#41487
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-43648][CONNECT][TESTS] Move interrupt all related test to a new test file to pass Maven test
#41487
Changes from 5 commits
4963bd5
b9e7e4a
4822924
ca3e6b1
3b885b2
4648813
57e23f1
c71eac4
5ec3f93
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
The new test suite, `SparkSessionCleanRoomE2ESuite` (+99 lines):

```scala
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.spark.sql

import scala.concurrent.{ExecutionContext, ExecutionContextExecutor, Future}
import scala.concurrent.duration._
import scala.util.{Failure, Success}

import org.scalatest.concurrent.Eventually._

import org.apache.spark.sql.connect.client.util.RemoteSparkSession
import org.apache.spark.util.ThreadUtils

/**
 * Warning(SPARK-43648): Please don't import classes that only exist in
 * `spark-connect-client-jvm.jar` into this class, as it will trigger udf
 * deserialization errors during Maven testing.
 */
class SparkSessionCleanRoomE2ESuite extends RemoteSparkSession {

  test("interrupt all - background queries, foreground interrupt") {
    val session = spark
    import session.implicits._
    implicit val ec: ExecutionContextExecutor = ExecutionContext.global
    val q1 = Future {
      spark.range(10).map(n => { Thread.sleep(30000); n }).collect()
```

An inline review thread is attached at this point in the diff:
Contributor: So just because `ClientE2ETestSuite` imported …? Also, this import of `SparkResult` was only added yesterday in 62338ed#diff-7fa161b193c8792c8c0d8dd4bcae3e683ab8553edafa2ae5c13df42b26f612b0R41, while this test was already failing before, complaining about the lack of the `SparkResult` class during deserialization. So it must have pulled it in somehow even before it was imported in the test suite class?
Contributor (author): I am more inclined towards implicit imports caused by the use of the code at spark/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Dataset.scala, lines 3181 to 3186 (at 64855fa). I have tried deleting cases related to …
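To make the capture mechanics concrete, here is a hedged, self-contained sketch (plain JVM serialization only, not Spark's actual pipeline; `ClientOnlyThing`, `SuiteLike`, and `CaptureDemo` are invented stand-ins, not Spark classes): a Scala lambda that touches any member of its enclosing instance captures that instance, so its serialized form carries every field type of the instance, including classes that exist only in the client jar.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

// Invented stand-in for a class that only exists in spark-connect-client-jvm.jar.
class ClientOnlyThing extends Serializable

class SuiteLike extends Serializable {
  val helper = new ClientOnlyThing // field of a client-only type

  // Touches `helper`, so the closure captures `SuiteLike.this` and,
  // transitively, the ClientOnlyThing field.
  val capturing: Long => Long = n => { helper.hashCode(); n }

  // Touches nothing on the enclosing instance, so nothing extra is captured.
  val clean: Long => Long = n => n + 1
}

object CaptureDemo {
  private def serialize(f: Long => Long): Array[Byte] = {
    val bos = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(bos)
    oos.writeObject(f)
    oos.close()
    bos.toByteArray
  }

  def main(args: Array[String]): Unit = {
    val s = new SuiteLike
    // The class name of the captured field type is embedded in the stream;
    // deserializing on a JVM without ClientOnlyThing on the classpath fails
    // with ClassNotFoundException -- the shape of the Maven failure here.
    println(new String(serialize(s.capturing), "ISO-8859-1").contains("ClientOnlyThing")) // true
    println(new String(serialize(s.clean), "ISO-8859-1").contains("ClientOnlyThing"))     // false
  }
}
```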
Contributor (author): @vicennial, do we have a way to clean up unused imports during the udf serialization process? I haven't investigated this yet.
Contributor: I'm not an expert on this, but it looks to me that Spark has a …
Contributor (author): No, I already tried that before opening this PR; at least the existing …
Contributor (author): In non-connect mode, we should add the necessary jars when submitting a Spark app. I think similar operations should also be performed in connect mode, so in https://github.com/apache/spark/pull/41483/files, using …
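For context, a hedged sketch of what "adding the necessary jars" could look like from the Connect Scala client side: `addArtifact` is the client API for uploading a jar to the server, but the endpoint, jar path, and `ShipClientJar` object below are placeholders, and the linked PR instead wires the jars in when the test server is launched.

```scala
import org.apache.spark.sql.SparkSession

// Hedged sketch: upload a client-side jar so the server can resolve its
// classes during UDF deserialization. Endpoint and path are placeholders.
object ShipClientJar {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .remote("sc://localhost:15002") // Spark Connect endpoint (placeholder)
      .getOrCreate()

    // Hypothetical jar path; not how this PR or apache/spark#41483 does it.
    spark.addArtifact("/path/to/client-test-classes.jar")

    spark.range(5).collect() // UDF-free sanity check
    spark.stop()
  }
}
```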
Contributor: @rednaxelafx, could I ask for your help?
Contributor (author): In fact, in traditional mode all dependencies are on the classpath, so I think there is no need to clean up imports. Perhaps my understanding is incorrect?
Contributor: I checked that in this case there is …, but I don't really understand it.
Contributor (author): Did it crash, throw `NoSuchFieldException`, or throw some other exception? In my impression, …
The diff continues:

```scala
    }
    val q2 = Future {
      spark.range(10).map(n => { Thread.sleep(30000); n }).collect()
    }
    var q1Interrupted = false
    var q2Interrupted = false
    var error: Option[String] = None
    q1.onComplete {
      case Success(_) =>
        error = Some("q1 shouldn't have finished!")
      case Failure(t) if t.getMessage.contains("cancelled") =>
        q1Interrupted = true
      case Failure(t) =>
        error = Some("unexpected failure in q1: " + t.toString)
    }
    q2.onComplete {
      case Success(_) =>
        error = Some("q2 shouldn't have finished!")
      case Failure(t) if t.getMessage.contains("cancelled") =>
        q2Interrupted = true
      case Failure(t) =>
        error = Some("unexpected failure in q2: " + t.toString)
    }
    // 20 seconds is less than the 30 seconds the queries would otherwise run,
    // because they should be interrupted sooner.
    eventually(timeout(20.seconds), interval(1.seconds)) {
      // Keep interrupting every second, until both queries get interrupted.
      spark.interruptAll()
      assert(error.isEmpty, s"Error not empty: $error")
      assert(q1Interrupted)
      assert(q2Interrupted)
    }
  }

  test("interrupt all - foreground queries, background interrupt") {
    val session = spark
    import session.implicits._
    implicit val ec: ExecutionContextExecutor = ExecutionContext.global

    @volatile var finished = false
    val interruptor = Future {
      eventually(timeout(20.seconds), interval(1.seconds)) {
        spark.interruptAll()
        assert(finished)
      }
      finished
    }
    val e1 = intercept[io.grpc.StatusRuntimeException] {
      spark.range(10).map(n => { Thread.sleep(30.seconds.toMillis); n }).collect()
    }
    assert(e1.getMessage.contains("cancelled"), s"Unexpected exception: $e1")
    val e2 = intercept[io.grpc.StatusRuntimeException] {
      spark.range(10).map(n => { Thread.sleep(30.seconds.toMillis); n }).collect()
    }
    assert(e2.getMessage.contains("cancelled"), s"Unexpected exception: $e2")
    finished = true
    assert(ThreadUtils.awaitResult(interruptor, 10.seconds))
  }
}
```
Contributor: If it happens through implicit imports, then such a rule maybe won't help…

Contributor (author): Yes, I didn't add this originally because I found it difficult to describe.
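Since a purely syntactic import rule cannot see implicit pulls, one hedged alternative (a sketch only; `ClosureGuard` and everything in it are invented here, not part of this PR or of Spark) is a runtime guard: serialize the closure the same way the client would, then scan the resulting byte stream for client-only package names. This catches references regardless of how they were introduced.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import java.nio.charset.StandardCharsets

// Hypothetical guard: reject closures whose serialized form references
// client-only packages, whether imported explicitly or via implicits.
object ClosureGuard {
  private val forbidden = Seq(
    "org.apache.spark.sql.connect.client", // dotted form in stream UTF strings
    "org/apache/spark/sql/connect/client"  // slash form used by SerializedLambda
  )

  def offendingReferences(closure: AnyRef): Seq[String] = {
    val bos = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(bos)
    oos.writeObject(closure)
    oos.close()
    val text = new String(bos.toByteArray, StandardCharsets.ISO_8859_1)
    forbidden.filter(text.contains)
  }
}

// Usage sketch: fail fast in the test before the server ever sees the UDF.
// assert(ClosureGuard.offendingReferences(myUdf).isEmpty)
```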