-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Provided a repl-embedded artifact which shades and isolates everything to avoid conflicts
#24494
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
repl-embedded artifact which shades everything to avoid conflictsrepl-embedded artifact which shades everything to avoid conflicts
# Conflicts: # repl/src/dotty/tools/repl/ReplMain.scala # repl/test/dotty/tools/repl/ReplMainTest.scala
repl-embedded artifact which shades everything to avoid conflictsrepl-embedded artifact which shades and isolates everything to avoid conflicts
repl-embedded artifact which shades and isolates everything to avoid conflictsrepl-embedded artifact which shades and isolates everything to avoid conflicts
|
This should be ready for review I think, CC @Gedochao since you seem to be the one who always assigns people |
|
would it be possible to use https://github.com/eed3si9n/jarjar-abrams for the shading? I see at as the standard tool for this, battle-tested by sbt and other projects @sjrd I see that on #23849, you argued against using a binary dependency, on the grounds that circular dependencies are pernicious. do you remain opposed to binary-based shading? at the time, we hadn't yet moved the REPL to its own JAR; does that change affect your stance? |
| (Compile / run).toTask(" -usejavacp").value | ||
| }, | ||
| bspEnabled := false, | ||
| (Compile / sourceGenerators) += Def.task { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you consider using one of the shading plugins for sbt instead of doing it by hand?
|
With the separate artifact, the binary dependency is less of an issue. At least there's no circular dependency between artifacts (counting the compiler you're using to build an artifact as a dependency of that artifact). Anyway now it seems to be the status quo that we're depending on the binary, so turning that into binary shading is fine. Regardless, if shading is our solution, then it should be done in a single artifact. There is no need to publish 2 artifacts, one with unshaded things and one that is actually usable because of the shading. Shading is not the best solution to our problem, though. A better fix to the underlying issue would be to isolate the interpreted code from the REPL logic via an appropriate class loader setup. There are essentially two possibilities:
IMO the latter is a superior solution. With the former, the REPL code would rely on an external setup handler to create the correct class loaders, since it must itself be in a child class loader. With the second option, the REPL code can be in charge of setting up the class loaders, which means it needs no external supervisor. All of that to say: at the very least, eventually we'll want one of those solutions. We will investigate whether we can make it happen for 3.8.0. In that case, the present PR won't be necessary. If that doesn't work out, then we should shade as a fallback (still better than reverting the introduction of pprint altogether). But then it really should be in a single artifact, not two artifacts among which one is unusable in practice. |
|
there seemed to be consensus on core that we don't want two artifacts; the shading can happen in the main and only REPL JAR shading is not necessarily where we'll eventually land; ultimately we'd like to find a classloader-based solution instead, or at least try. but in the meantime we consider shading acceptable for 3.8.0, if it's the best we can do under the time constraint EDIT: oh, jinx, Seb and I were writing at the same time. well, I'll let it stand |
|
one extra factor in this layering strategy - the current status quo is that the REPL rewrites all classes anyway (except base JDK?) to install the |
|
Thanks for the review @sjrd It is worth repeating that there are two levels of shading/isolation being done here:
We could always go with just
So the big reason to have both a shaded and unshaded REPL jar is that shading "closes the world": someone using Notably, many other projects have a similar artifact layout publishing both unshaded-piecemeal and shaded-assembly artfiacts, e.g.
As mentioned in the PR description, this does not work in the all cases since #24194 because the REPL instruments the scala-library classes to properly implement instrumentation, resulting in different instances of classes such as It's possible to not instrument the scala-library by passing in The default REPL instrumentation does mean that the REPL will not be able to pass back and forth with the enclosing codebase due to the class mismatch, which is why
The issue is how can the REPL code be in charge of setting up classloaders to isolate itself when the REPL code is already on the user classpath? By the time your REPL code can run, stuff like |
In the original PR implementing this the decision was to build from source to avoid a binary dependency, and I reproduced the same implementation here. We could certainly try binary shading, though I'm less familiar with those plugins and how they interact with dynamic compilation and dynamic classloading that we do in the REPL |
|
Here's a small patch using binary shading rather than source shading. Seems to work for some things, but the binary shading seems to confuse the compiler when trying to reference the shaded classes in the REPL lihaoyi scala3$ git diff embedded-repl | cat
diff --git a/project/Build.scala b/project/Build.scala
index 3e21224e42..a5713fd5ce 100644
--- a/project/Build.scala
+++ b/project/Build.scala
@@ -1113,6 +1113,9 @@ object Build {
"com.github.sbt" % "junit-interface" % "0.13.3" % Test,
"io.get-coursier" % "interface" % "1.0.28", // used by the REPL for dependency resolution
"org.virtuslab" % "using_directives" % "1.1.4", // used by the REPL for parsing magic comments
+ "com.lihaoyi" %% "pprint" % "0.9.0", // shaded by scala3-repl-embedded into dotty.shaded.pprint
+ "com.lihaoyi" %% "fansi" % "0.5.0", // shaded by scala3-repl-embedded into dotty.shaded.fansi
+ "com.lihaoyi" %% "sourcecode" % "0.4.2", // shaded by scala3-repl-embedded into dotty.shaded.sourcecode
),
// Configure to use the non-bootstrapped compiler
managedScalaInstance := false,
@@ -1177,92 +1180,6 @@ object Build {
(Compile / run).toTask(" -usejavacp").value
},
bspEnabled := false,
- (Compile / sourceGenerators) += Def.task {
- val s = streams.value
- val cacheDir = s.cacheDirectory
- val dest = (Compile / sourceManaged).value / "downloaded"
- val lm = dependencyResolution.value
-
- val dependencies = Seq(
- ("com.lihaoyi", "pprint_3", "0.9.5"),
- ("com.lihaoyi", "fansi_3", "0.5.1"),
- ("com.lihaoyi", "sourcecode_3", "0.4.4"),
- )
-
- // Create a marker file that tracks the dependencies for cache invalidation
- val markerFile = cacheDir / "shaded-sources-marker"
- val markerContent = dependencies.map { case (org, name, version) => s"$org:$name:$version:sources" }.mkString("\n")
- if (!markerFile.exists || IO.read(markerFile) != markerContent) {
- IO.write(markerFile, markerContent)
- }
-
- FileFunction.cached(cacheDir / "fetchShadedSources",
- FilesInfo.lastModified, FilesInfo.exists) { _ =>
- s.log.info(s"Downloading and processing shaded sources to $dest...")
-
- if (dest.exists) IO.delete(dest)
- IO.createDirectory(dest)
-
- for((org, name, version) <- dependencies) {
- import sbt.librarymanagement._
-
- val moduleId = ModuleID(org, name, version).sources()
- val retrieveDir = cacheDir / "retrieved" / s"$org-$name-$version-sources"
-
- s.log.info(s"Retrieving $org:$name:$version:sources...")
- val retrieved = lm.retrieve(moduleId, scalaModuleInfo = None, retrieveDir, s.log)
- val jarFiles = retrieved.fold(
- w => throw w.resolveException,
- files => files.filter(_.getName.contains("-sources.jar"))
- )
-
- jarFiles.foreach { jarFile =>
- s.log.info(s"Extracting ${jarFile.getName}...")
- IO.unzip(jarFile, dest)
- }
- }
-
- val scalaFiles = (dest ** "*.scala").get
-
- val patches = Map( // Define patches as a map from search text to replacement text
- "import scala" -> "import _root_.scala",
- " scala.collection." -> " _root_.scala.collection.",
- "def apply(c: Char): Trie[T]" -> "def apply(c: Char): Trie[T] | Null",
- "var head: Iterator[T] = null" -> "var head: Iterator[T] | Null = null",
- "if (head != null && head.hasNext) true" -> "if (head != null && head.nn.hasNext) true",
- "head.next()" -> "head.nn.next()",
- "abstract class Walker" -> "@scala.annotation.nowarn abstract class Walker",
- "object TPrintLowPri" -> "@scala.annotation.nowarn object TPrintLowPri",
- "x.toString match{" -> "scala.runtime.ScalaRunTime.stringOf(x) match{"
- )
-
- val patchUsageCounter = scala.collection.mutable.Map(patches.keys.map(_ -> 0).toSeq: _*)
-
- scalaFiles.foreach { file =>
- val text = IO.read(file)
- if (!file.getName.equals("CollectionName.scala")) {
- var processedText = "package dotty.shaded\n" + text
-
- // Apply patches and count usage
- for((search, replacement) <- patches if processedText.contains(search)){
- processedText = processedText.replace(search, replacement)
- patchUsageCounter(search) += 1
- }
-
- IO.write(file, processedText)
- }
- }
-
- // Assert that all patches were applied at least once
- val unappliedPatches = patchUsageCounter.filter(_._2 == 0).keys
- if (unappliedPatches.nonEmpty) {
- throw new RuntimeException(s"Patches were not applied: ${unappliedPatches.mkString(", ")}")
- }
-
- scalaFiles.toSet
- } (Set(markerFile)).toSeq
-
- }
)
lazy val `scala3-repl-embedded` = project.in(file("repl-embedded"))
@@ -1303,6 +1220,12 @@ object Build {
name.contains("jline")
}
},
+ // Shade pprint, fansi, sourcecode into dotty.shaded.* to avoid conflicts
+ assembly / assemblyShadeRules := Seq(
+ ShadeRule.rename("pprint.**" -> "dotty.shaded.pprint.@1").inAll,
+ ShadeRule.rename("fansi.**" -> "dotty.shaded.fansi.@1").inAll,
+ ShadeRule.rename("sourcecode.**" -> "dotty.shaded.sourcecode.@1").inAll,
+ ),
assembly := {
val originalJar = assembly.value
@@ -1321,10 +1244,8 @@ object Build {
val shouldKeepInPlace =
relativePath.startsWith("dotty/embedded/")||
- // These are manually shaded when vendored/patched so leave them alone
- relativePath.startsWith("dotty/shaded/") ||
- // This needs to be inside scala/collection so cannot be moved
- relativePath.startsWith("scala/collection/internal/pprint/")
+ // These are shaded via assemblyShadeRules so leave them alone
+ relativePath.startsWith("dotty/shaded/")
if (!shouldKeepInPlace) {
val newPath = shadedDir / relativePath
diff --git a/repl-embedded/src/dotty/embedded/EmbeddedReplMain.scala b/repl-embedded/src/dotty/embedded/EmbeddedReplMain.scala
index 92b19640ea..ea964b239a 100644
--- a/repl-embedded/src/dotty/embedded/EmbeddedReplMain.scala
+++ b/repl-embedded/src/dotty/embedded/EmbeddedReplMain.scala
@@ -37,7 +37,7 @@ class UnshadingClassLoader(parent: ClassLoader) extends ClassLoader(parent) {
case None =>
// These classes are loaded shared between all classloaders, because
// they misbehave if loaded multiple times in separate classloaders
- if (name.startsWith("java.") || name.startsWith("org.jline.")) parent.loadClass(name)
+ if (name.startsWith("sun.") ||name.startsWith("java.") || name.startsWith("org.jline.")) parent.loadClass(name)
// Other classes loaded by the `UnshadingClassLoader` *must* be found in the
// `dotty.isolated` package. If they're not there, throw an error rather than
// trying to look for them at their normal package path, to ensure we're not
diff --git a/repl/src/dotty/tools/repl/Rendering.scala b/repl/src/dotty/tools/repl/Rendering.scala
index 7b71d16ed5..f054d0a7c3 100644
--- a/repl/src/dotty/tools/repl/Rendering.scala
+++ b/repl/src/dotty/tools/repl/Rendering.scala
@@ -9,7 +9,6 @@ import printing.ReplPrinter
import printing.SyntaxHighlighting
import reporting.Diagnostic
import StackTraceOps.*
-import dotty.shaded.*
import scala.compiletime.uninitialized
import scala.util.control.NonFatal
diff --git a/repl/src/dotty/tools/repl/ReplDriver.scala b/repl/src/dotty/tools/repl/ReplDriver.scala
index 4c54b2959c..995326f5b8 100644
--- a/repl/src/dotty/tools/repl/ReplDriver.scala
+++ b/repl/src/dotty/tools/repl/ReplDriver.scala
@@ -706,4 +706,4 @@ class ReplDriver(settings: Array[String],
end ReplDriver
object ReplDriver:
- def pprintImport = "import dotty.shaded.pprint.pprintln\n"
\ No newline at end of file
+ def pprintImport = ""
\ No newline at end of file
diff --git a/repl/src/dotty/tools/repl/StackTraceOps.scala b/repl/src/dotty/tools/repl/StackTraceOps.scala
index c81d8c435d..618bdd9796 100644
--- a/repl/src/dotty/tools/repl/StackTraceOps.scala
+++ b/repl/src/dotty/tools/repl/StackTraceOps.scala
@@ -16,7 +16,6 @@ import scala.language.unsafeNulls
import collection.mutable, mutable.ListBuffer
import dotty.tools.dotc.util.chaining.*
import java.lang.System.lineSeparator
-import dotty.shaded.*
object StackTraceOps:
lihaoyi scala3$ JAVA_HOME=/Users/lihaoyi/Library/Caches/Coursier/arc/https/cdn.azul.com/zulu/bin/zulu17.62.17-ca-jdk17.0.17-macosx_aarch64.tar.gz/zulu17.62.17-ca-jdk17.0.17-macosx_aarch64/ java -cp $(cs fetch -p org.scala-lang:scala3-repl-embedded_3:3.8.1-RC1-bin-SNAPSHOT) dotty.embedded.EmbeddedReplMain
-- [E008] Not Found Error: -----------------------------------------------------
1 |import dotty.shaded.pprint.pprintln
| ^^^^^^^^
| value pprintln is not a member of dotty.shaded.pprint
1 error found
Welcome to Scala 3.8.1-RC1-bin-SNAPSHOT-git-c632d86 (17.0.17, Java OpenJDK 64-Bit Server VM).
Type in expressions for evaluation. Or try :help.
scala> import pprint.pprintln
-- Error: ----------------------------------------------------------------------
1 |import pprint.pprintln
| ^
|Recursion limit exceeded.
|Maybe there is an illegal cyclic reference?
|If that's not the case, you could also try to increase the stacksize using the -Xss JVM option.
|For the unprocessed stack trace, compile with -Xno-enrich-error-messages.
|A recurring operation is (inner to outer):
|
| find-member pprint.PPrinter
| find-member pprint.PPrinter
| find-member pprint.PPrinter
| find-member pprint.PPrinter
| find-member pprint.PPrinter
| find-member pprint.PPrinter
| find-member pprint.PPrinter
| find-member pprint.PPrinter
| find-member pprint.PPrinter
| find-member pprint.PPrinter
| ...
|
| find-member pprint.PPrinter
| find-member pprint.PPrinter
| find-member pprint.PPrinter
| find-member pprint.PPrinter
| find-member pprint.PPrinter
| find-member pprint.PPrinter
| find-member pprint.PPrinter
| find-member pprint.PPrinter
| find-member pprint.PPrinter
| find-member pprint.pprintln
1 error found |
|
If I understand correctly the use case you're describing for repl-embedded, IMO that's a client application concern. If a client application wants to use the repl as a library, but not pollute its classpath with the dependencies of the repl, then it should be that application's responsibility to shade the repl artifact for its own purpose. To do that, it would typically use a binary shading plugin. This is not repl-specific. It's similar to any other regular library that we want to use without taking on its transitive dependencies. Moreover, there is also an alternative for clients here that they themselves load the library (the repl) in a class loader. For example, IIRC we switched from the former to the latter strategy to load the closure compiler library in the Scala.js linker. If that is indeed the use case, we shouldn't provide an artifact for this. Client applications should build their own. |
|
Yes, that is indeed the use case. I guess it depends on who you consider the end users of the REPL to be:
It's fine if you decide that you only want to support (1) and that (2) is out of scope, but I do think (2) would be a great additional use case to have for the REPL with a very small maintenance cost: ~100 lines of code overall. Sure it could be placed in some third-party library, but the cost of discoverability of such a library would mean ~nobody would use it. In contrast the cost of maintaining a ~100 lines of |
Fixes #24382
This PR re-implements the shading and classpath isolation first implemented in #23849, but lost in the churn of #24243. It implements both the original source-based shading of PPrint/Fansi/Sourcecode, with additional classloader-based isolation of REPL implementation classes to further avoid conflicts with libraries on the user classpath
replcontains the current implementation of the Scala3 REPLrepl-embeddedis the assembly jar ofrepl, containing all the transitive classfiles, but moved under thedotty.isolatednamespace.repl-embeddedcontains three main un-isolated things:scala.tools.repl.EmbeddedReplMain, which spawns an isolated classloader that knows how to load stuff from thedotty.isolatednamespace, and uses that to instantiate aReplDriverdotty.shaded.*via source code mangling but are not classloader-isolated from user codescala-library.jar, which is guaranteed to be the same anywayThis approach allows anyone using the
repl-embeddedartifact to be confident there will not be classpath conflicts with their own classpath, since all the Scala compiler jars and transitive dependencies are isolated within a classloader and will not collide with user code. Furthermore, this allows us to begin using arbitrary third-party dependencies in thescala3-repl's implementation without them bleeding into user code inside or outside the REPL, with a few exceptions like PPrint/Fansi/Sourcecode below. Compared to the earlier implementation in #23849, this provides much stronger isolation from the enclosing classpath and avoids leaking implementation dependencies likecoursierapiinto the enclosing scopeThe PPrint/Fansi/Sourcecode libraries cannot be fully isolated because they need to be present in the same classloader as user code in order for PPrint logic like
case s: Seq[_] =>to work correctly: if they were in a separate classloader, theSeqin PPrint would differ from theSeqin user code, and these type tests would all fail. And due to the REPL interrupt instrumentation implemented in #24194, the REPL-code-classloader will ~always be different from the external user code classloader, so the PPrint/Fansi/Sourcecode classes have to be in that classloader to make things work and the only way to make it workTested manually to verify that the REPL classes do not appear on the classpath
Note that both within the REPL and outside the REPL in the enclosing codebase,
dotty.tools.repl.Maincannot be resolved. That is an improvement over the status quo wheredotty.tools.repl.Mainand it's dependencies likecoursierapicould conflict with things on the user classpath