Skip to content

createSmithyJarManifestUrl: incorrect handling of complex paths #2103

@kubukoz

Description

@kubukoz

Hi! I first bumped into this over a year ago, but so far had a workaround that seemed decent, and the usecase seemed pretty niche. Now me and my team maintain about 4 copies of the same workaround, so it's reaching a point where it'd be best to resolve upstream - right here.

What I'm trying to do

  • grab a bunch of jars, find Smithy model files in them, and add them to a model assembler
  • ...allowing some of these jars to be missing a manifest altogether (simply ignoring them).

Originally, here's how we did it in smithy4s:

val modelsInJars = deps.flatMap { file => // java.io.File
   val manifestUrl = 
     ModelDiscovery.createSmithyJarManifestUrl(file.getAbsolutePath()) 
   try { ModelDiscovery.findModels(manifestUrl).asScala } 
   catch { 
     case _: ModelManifestException => Seq.empty //fallback to nothing
   } 
 }

So far so good.

The problem

Trouble begins when certain artifacts are involved, namely:

  • artifacts fetched with Coursier
  • ...which have urlencoded characters in their path. Which is typical for snapshots named after their corresponding commit and .

For example, org.polyvariant:test-library-core_2.13:0.0.1+123-SNAPSHOT from Sonatype Snapshots would be placed in:

$COURSIER_CACHE_LOCATION/v1/https/s01.oss.sonatype.org/content/repositories/snapshots/org/polyvariant/test-library-core_2.13/0.0.1%2B123-SNAPSHOT/test-library-core_2.13-0.0.1%2B123-SNAPSHOT.jar

Here's what happens when you use that file with ModelDiscovery:

  • createSmithyJarManifestUrl yields this URL: jar:file:$COURSIER_CACHE_LOCATION/v1/https/s01.oss.sonatype.org/content/repositories/snapshots/org/polyvariant/test-library-core_2.13/0.0.1%2B123-SNAPSHOT/test-library-core_2.13-0.0.1%2B123-SNAPSHOT.jar!/META-INF/smithy/manifest - so far so good, it seems
  • findModels fails with ModelManifestException caused by FileNotFoundException. the path that cause reports is $COURSIER_CACHE_LOCATION/v1/https/s01.oss.sonatype.org/content/repositories/snapshots/org/polyvariant/test-library-core_2.13/0.0.1+123-SNAPSHOT/test-library-core_2.13-0.0.1+123-SNAPSHOT.jar - the urlescaping is gone now.

I think this is an issue with ModelDiscovery.findModels.

The expected behavior can be achieved with the following workaround: treat the jar as a FileSystem, and use the Path API to get an URL:

Using.resource(
  FileSystems.newFileSystem(file.toPath())
) { fs =>
  val path = fs.getPath("META-INF", "smithy", "manifest")
  ModelDiscovery.findModels(path.toUri().toURL())
}

The path.toUri() part stringifies as: jar:file://$COURSIER_FILE_LOCATION/v1/https/s01.oss.sonatype.org/content/repositories/snapshots/org/polyvariant/test-library-core_2.13/0.0.1%252B123-SNAPSHOT/test-library-core_2.13-0.0.1%252B123-SNAPSHOT.jar!/META-INF/smithy/manifest - which, to me, sounds correct.

Request

I think there's more than one issue here:

  1. createSmithyJarManifestUrl going from URL (a relatively structured type) to a String (no structure at all), offering little to no type safety when it's used down the line
  2. findModels doing something strange with its parameter, discarding the urlencoding of the given path.

so the ideal solution would also be twofold:

  • deprecate the Stringly-typed methods and add some safer ones, operating on URL or URI
  • fix the handling of urlencoded paths in findModels

Context, if you want to learn more about the issue we had and the workaround: disneystreaming/smithy4s#850

Notes / questions

  • Are there better ways to load a bunch of jars, not all of which necessarily contain Smithy manifests? If not, could one be baked into ModelAssembler?

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions