Skip to content
Closed
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -17,12 +17,16 @@

package org.apache.spark.sql.hive

import java.io.File
import java.io.{File, FileInputStream, IOException}
import java.nio.file.Files

import scala.io.Codec
import scala.io.Source
import scala.sys.process._

import org.apache.spark.TestUtils
import org.apache.hadoop.conf.Configuration

import org.apache.spark.{SecurityManager, SparkConf, TestUtils}
import org.apache.spark.sql.{QueryTest, Row, SparkSession}
import org.apache.spark.sql.catalyst.TableIdentifier
import org.apache.spark.sql.catalyst.catalog.CatalogTableType
Expand Down Expand Up @@ -56,10 +60,11 @@ class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils {
// Try mirrors a few times until one succeeds
for (i <- 0 until 3) {
val preferredMirror =
Seq("wget", "https://www.apache.org/dyn/closer.lua?preferred=true", "-q", "-O", "-").!!.trim
val url = s"$preferredMirror/spark/spark-$version/spark-$version-bin-hadoop2.7.tgz"
getStringFromUrl("https://www.apache.org/dyn/closer.lua?preferred=true")
val filename = s"spark-$version-bin-hadoop2.7.tgz"
val url = s"$preferredMirror/spark/spark-$version/" + filename
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use filename in interpolation also?

logInfo(s"Downloading Spark $version from $url")
if (Seq("wget", url, "-q", "-P", path).! == 0) {
if (getFileFromUrl(url, path, filename)) {
return
}
logWarning(s"Failed to download Spark $version from $url")
Expand All @@ -85,6 +90,43 @@ class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils {
new File(tmpDataDir, name).getCanonicalPath
}

private def getFileFromUrl(urlString: String, targetDir: String, filename: String): Boolean = {
val conf = new SparkConf
val securityManager = new SecurityManager(conf)
val hadoopConf = new Configuration

val outDir = new File(targetDir)
if (!outDir.exists()) {
outDir.mkdirs()
}

try {
val result = Utils.doFetchFile(urlString, outDir, filename, conf, securityManager, hadoopConf)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Utils.doFetchFile is a little overkill for this case, maybe we can just implement a simple downloading function with java.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meh, if we can reuse code instead of rewriting it, seems OK to me

result.exists()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unnecessary (just return true? doFetchFile should always return a file that exists), but ok.

} catch {
case ex: Exception => logWarning("Could not get file from url " + urlString + ": "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case of multi-line case statements, put them in the next line, properly indented.

Also: logWarning(s"I shall prefer interpolation", ex)

+ ex.getMessage)
false
}
}

private def getStringFromUrl(urlString: String, encoding: String = "UTF-8"): String = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use Charset instead of String... but since you're really not using that parameter, I'd just hardcode the charset in the only place where it's used.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems encoding: String = "UTF-8" is not used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. I should have caught that. Will fix.

val outDir = Files.createTempDirectory("string-")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use Utils.createTempDir; it handles deleting the directory for you later.

Another option is to use File.createTempFile + File.deleteOnExit. Same result.

val filename = "string-out.txt"

if (!getFileFromUrl(urlString, outDir.toString, filename)) {
throw new IOException("Could not get string from url " + urlString)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about letting the exception from doFetchFile propagate, and only handle it as part of the retries in tryDownloadSpark?

}

val contentFile = new File(outDir.toFile, filename)
try {
Source.fromFile(contentFile)(Codec(encoding)).mkString
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ok, but there's an easier / cleaner API for this. Look around for calls like this:

Files.toString(file, StandardCharsets.UTF_8);

} finally {
contentFile.delete()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can get rid of this finally block if you switch to Utils.createTempDir.

outDir.toFile.delete()
}
}

override def beforeAll(): Unit = {
super.beforeAll()

Expand Down