
GH-802: Reimplement URL handling by decorating input streams with commons-compress#803

Merged
ascopes merged 7 commits into main from feature/GH-802-new-protocol-support on Sep 27, 2025

@ascopes ascopes commented Sep 14, 2025

Owner

This reimplements all the low-level URL handling logic internally to remove the SPI workaround that is currently in place, and use a recursive dispatcher that has custom URLStreamHandlerFactories associated with it. These implementations can delegate to archivers and compressors in Apache commons-compress to provide support for protocols such as 'tar:gz:https://...'.

This effectively allows users to utilise arbitrary tarballs as the source of their plugins now rather than just JARs.
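The recursive dispatch described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: it uses only JDK classes (`GZIPInputStream`, `ZipInputStream`) rather than commons-compress, the `NestedUrlSketch` class and the `mem:` scheme are hypothetical stand-ins for the real handlers and transports, and splitting on the last `!/` is an assumption about how the entry path is separated.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Illustrative only: peels one scheme per call and decorates the inner stream. */
final class NestedUrlSketch {

  // Hypothetical in-memory transport standing in for the file and http(s) handlers.
  private final Map<String, byte[]> resources;

  NestedUrlSketch(Map<String, byte[]> resources) {
    this.resources = resources;
  }

  /** Opens URLs such as "zip:gz:mem:plugins.zip.gz!/bin/plugin". */
  InputStream open(String url) throws IOException {
    int colon = url.indexOf(':');
    if (colon < 0) {
      throw new IOException("No scheme in URL: " + url);
    }
    String scheme = url.substring(0, colon);
    String rest = url.substring(colon + 1);

    switch (scheme) {
      case "mem": {
        // Base case: nothing left to unwrap, so fetch the raw bytes.
        byte[] data = resources.get(rest);
        if (data == null) {
          throw new FileNotFoundException(url);
        }
        return new ByteArrayInputStream(data);
      }
      case "gz":
        // Compressor: decorate whatever the inner URL resolves to.
        return new GZIPInputStream(open(rest));
      case "zip": {
        // Archiver: resolve the inner URL, then seek to the entry after "!/".
        int bang = rest.lastIndexOf("!/");
        if (bang < 0) {
          throw new IOException("Missing '!/' entry separator in URL: " + url);
        }
        return seek(new ZipInputStream(open(rest.substring(0, bang))), rest.substring(bang + 2));
      }
      default:
        throw new IOException("Unsupported scheme: " + scheme);
    }
  }

  // Advances the archive stream to the named entry; reads then yield that entry's bytes.
  private static InputStream seek(ZipInputStream archive, String name) throws IOException {
    for (ZipEntry entry; (entry = archive.getNextEntry()) != null; ) {
      if (entry.getName().equals(name)) {
        return archive;
      }
    }
    archive.close();
    throw new FileNotFoundException(name);
  }

  // Demo helper: gzips a byte array in memory.
  static byte[] gzip(byte[] data) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
      gz.write(data);
    }
    return out.toByteArray();
  }

  // Demo helper: builds a single-entry ZIP archive in memory.
  static byte[] zipOf(String name, byte[] data) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (ZipOutputStream zip = new ZipOutputStream(out)) {
      zip.putNextEntry(new ZipEntry(name));
      zip.write(data);
      zip.closeEntry();
    }
    return out.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    byte[] archive = gzip(zipOf("bin/plugin", "hello".getBytes(StandardCharsets.UTF_8)));
    NestedUrlSketch sketch = new NestedUrlSketch(Map.of("plugins.zip.gz", archive));
    try (InputStream in = sketch.open("zip:gz:mem:plugins.zip.gz!/bin/plugin")) {
      System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8)); // prints "hello"
    }
  }
}
```

Each scheme only knows how to decorate the stream produced by the rest of the URL, so supporting a new format should not require changes to the existing cases.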

Current implementation provides:

  • Archivers for 'jar', 'war', 'ear', 'kar', 'zip', 'tar'
  • Compressors for 'gzip', 'bz2'

We can add further integrations such as 'xz', 'cpio', 'ar', 'ar2', 'lzma', '7z', 'deflate', 'z', 'zstd', etc., as needed by appending to the list of providers in the new UrlFactory.
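As a rough picture of what "appending to the list of providers" could look like, here is a minimal sketch built on the JDK's `URLStreamHandlerFactory` interface. The `CompositeFactorySketch` class, its `register` method, and the `providerFor` helper are hypothetical names for illustration; the actual UrlFactory in this PR may be structured differently.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.net.URLStreamHandlerFactory;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a factory that asks each registered provider in turn. */
final class CompositeFactorySketch implements URLStreamHandlerFactory {

  private final List<URLStreamHandlerFactory> providers = new ArrayList<>();

  // Appends a provider; earlier providers take precedence for a given protocol.
  CompositeFactorySketch register(URLStreamHandlerFactory provider) {
    providers.add(provider);
    return this;
  }

  @Override
  public URLStreamHandler createURLStreamHandler(String protocol) {
    for (URLStreamHandlerFactory provider : providers) {
      URLStreamHandler handler = provider.createURLStreamHandler(protocol);
      if (handler != null) {
        return handler;
      }
    }
    // Null tells the caller that this factory does not know the protocol.
    return null;
  }

  /** Hypothetical provider claiming a single scheme; a real one would decorate streams. */
  static URLStreamHandlerFactory providerFor(String scheme) {
    return protocol -> !scheme.equals(protocol) ? null : new URLStreamHandler() {
      @Override
      protected URLConnection openConnection(URL url) throws IOException {
        throw new IOException("Placeholder handler for '" + scheme + "'");
      }
    };
  }
}
```

Under these assumptions, adding a new format is a single `register(...)` call, and unknown protocols fall through to `null` so the caller can report them.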

TODO:

  • This has currently only been tested using tar.gz to prove that nested retrieval works as expected. Other implementations will still need unit tests to be added.
  • The tests for UriResourceFetcher are now invalid and need a full rewrite to handle delegating to this new mechanism under the hood.
  • Unit tests for new classes.
    • URLConnection wrapper classes
    • Integration with file protocol for UrlFactory
    • Integration with http(s) protocols for UrlFactory
    • Test cases for each scheme in each archive protocol we support
    • Nested archive test cases
  • Documentation needs updating with new examples in the plugin Mojo javadocs.
  • Documentation in the user guide needs updating.
  • Minor version bump or major version bump.
  • Set @since tags in new classes.
  • Check if we need to propagate timeouts, useCaches, etc to nested connections.
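If the last item turns out to be needed, one possible shape for it is sketched below. This is an assumption about how propagation might be done, not something this PR implements; `ConnectionSettingsSketch` and `propagate` are hypothetical names, while the getters and setters are standard `java.net.URLConnection` methods.

```java
import java.net.URLConnection;

/** Illustrative only: copies caller-visible settings onto a nested connection. */
final class ConnectionSettingsSketch {

  private ConnectionSettingsSketch() {
  }

  // Must be called before inner.connect(): setUseCaches throws
  // IllegalStateException once a connection has been established.
  static void propagate(URLConnection outer, URLConnection inner) {
    inner.setConnectTimeout(outer.getConnectTimeout());
    inner.setReadTimeout(outer.getReadTimeout());
    inner.setUseCaches(outer.getUseCaches());
  }
}
```

A wrapper connection would call this just before connecting its nested connection, so a timeout set on `tar:gz:https://...` would also apply to the underlying `https` request.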

If possible, help is needed to verify this against existing behaviour in real world use cases to help ensure no new edge cases exist. If this is something that is of interest, please comment on this PR. If any semantic changes are spotted, this may need to be a breaking change and be part of a newer major version instead. Hopefully that will not be the case.


Closes #802.

Covers #722, #795 (for the core use case, further elaboration on other points may be needed elsewhere but those should be a new issue once confirmed).

@ascopes ascopes self-assigned this Sep 14, 2025
@ascopes ascopes added the documentation, dependencies, new feature, chore, and help needed labels Sep 14, 2025
codecov Bot commented Sep 14, 2025


Codecov Report

❌ Patch coverage is 85.95041% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.11%. Comparing base (1cc5f4a) to head (74eb43c).
⚠️ Report is 8 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...urls/AbstractRecursiveUrlStreamHandlerFactory.java | 76.79% | 9 Missing and 4 partials ⚠️ |
| ...venplugin/urls/ArchiveUrlStreamHandlerFactory.java | 83.34% | 2 Missing and 2 partials ⚠️ |
Additional details and impacted files


@@            Coverage Diff             @@
##             main     #803      +/-   ##
==========================================
- Coverage   91.61%   91.11%   -0.49%     
==========================================
  Files          60       63       +3     
  Lines        1811     1901      +90     
  Branches      121      129       +8     
==========================================
+ Hits         1659     1732      +73     
- Misses        114      125      +11     
- Partials       38       44       +6     
| Files with missing lines | Coverage Δ |
|---|---|
| ...protobufmavenplugin/mojo/AbstractGenerateMojo.java | 100.00% <ø> (ø) |
| ...enplugin/mojo/ProtobufMavenPluginConfigurator.java | 100.00% <ø> (ø) |
| ...tobufmavenplugin/plugins/BinaryPluginResolver.java | 63.64% <ø> (ø) |
| ...pes/protobufmavenplugin/protoc/ProtocResolver.java | 65.00% <ø> (ø) |
| ...otobufmavenplugin/sources/ProtoSourceResolver.java | 80.96% <100.00%> (ø) |
| ...plugin/urls/DecoratingUrlStreamHandlerFactory.java | 100.00% <100.00%> (ø) |
| ...s/protobufmavenplugin/urls/UriPlexusConverter.java | 100.00% <ø> (ø) |
| ...s/protobufmavenplugin/urls/UriResourceFetcher.java | 100.00% <100.00%> (ø) |
| ...b/ascopes/protobufmavenplugin/urls/UrlFactory.java | 100.00% <100.00%> (ø) |
| ...venplugin/urls/ArchiveUrlStreamHandlerFactory.java | 83.34% <83.34%> (ø) |

... and 1 more

@ascopes ascopes changed the title GH-802: Reimplement URL handling by decorating input streams with com… GH-802: Reimplement URL handling by decorating input streams with commons-compress Sep 14, 2025
@ascopes ascopes force-pushed the feature/GH-802-new-protocol-support branch 11 times, most recently from 07d0da6 to d66647d Compare September 21, 2025 10:41
@ascopes ascopes force-pushed the feature/GH-802-new-protocol-support branch from d66647d to 4e4e0bd Compare September 22, 2025 06:50
@ascopes ascopes marked this pull request as ready for review September 22, 2025 06:54
@ascopes ascopes force-pushed the feature/GH-802-new-protocol-support branch from d254416 to 95ca432 Compare September 22, 2025 06:57
@ascopes ascopes mentioned this pull request Sep 23, 2025
1 task
@ascopes ascopes force-pushed the feature/GH-802-new-protocol-support branch 4 times, most recently from a8c69a9 to afc279e Compare September 27, 2025 09:58
…mons-compress

This reimplements all the low-level URL handling logic internally to remove
the SPI workaround that is currently in place, and use a recursive dispatcher
that has custom URLStreamHandlerFactories associated with it. These
implementations can delegate to archivers and compressors in Apache
commons-compress to provide support for protocols such as 'tar:gz:https://...'.

This effectively allows users to utilise arbitrary tarballs as the source of
their plugins now rather than just JARs.

Current implementation provides:
- Archivers for 'jar', 'war', 'ear', 'zip', 'tar'
- Compressors for 'gzip', 'bz2'

We can add further integrations such as 'xz', 'cpio', 'ar', 'ar2', 'lzma',
'7z', 'deflate', 'z', 'zstd', etc as needed by appending to the list
of providers in the new UrlFactory.

This has currently only been tested using tar.gz to prove that nested
retrieval works as expected. Other implementations will still need
unit tests to be added.

In addition, the tests for UriResourceFetcher are now invalid and need
a full rewrite to handle delegating to this new mechanism under the
hood.

GH-802: Document new protocols for URLs
GH-802: Add implementation notes to package-info.java

GH-802: Cache decorated input streams
GH-802: Fix UrlFactory to avoid init being able to add duplicate handlers

GH-802: Fix mockito causing a huge memory leak and crashing the JVM

GH-802: Fix bug resolving nested tarballs

Some tar entries seem to start with './' and others do not,
which confuses our crude lookup mechanism. This is now
fixed.
GH-802: Include Karaf archive format in list of ZIP protocols

GH-802: revert change to SLF4J API scope

GH-802: Readd commons-compress to pom after rebase removed it

GH-802: http url test cases
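The nested tarball fix in the commits above concerns entry names that sometimes carry a leading './'. One way to make such a lookup tolerant is to normalise both sides before comparing, as in this minimal sketch. It is an assumption about the shape of the fix rather than the code that was committed, and `EntryNameSketch`, `normalize`, and `matches` are hypothetical names.

```java
/** Illustrative only: compares archive entry names while ignoring leading "./" and "/". */
final class EntryNameSketch {

  private EntryNameSketch() {
  }

  // Strips any run of leading "./" or "/" segments, so "./bin/protoc",
  // "/bin/protoc" and "bin/protoc" all normalise to the same string.
  static String normalize(String name) {
    String result = name;
    while (true) {
      if (result.startsWith("./")) {
        result = result.substring(2);
      } else if (result.startsWith("/")) {
        result = result.substring(1);
      } else {
        return result;
      }
    }
  }

  // True when the entry found in the archive refers to the path the user asked for.
  static boolean matches(String entryName, String requestedPath) {
    return normalize(entryName).equals(normalize(requestedPath));
  }
}
```

With this, `matches("./bin/protoc", "bin/protoc")` and `matches("bin/protoc", "/bin/protoc")` both hold, while entries with different paths still do not match.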
@ascopes ascopes force-pushed the feature/GH-802-new-protocol-support branch from afc279e to 74eb43c Compare September 27, 2025 10:00
ascopes commented Sep 27, 2025

Owner Author

I need to have a further think about how the URL bits and pieces are tested. Regardless, I am happy enough with how that part of it is tested via ITs currently, so I'm going to merge this for now and see how it goes.

@ascopes ascopes merged commit dfccc5d into main Sep 27, 2025
15 of 17 checks passed
@ascopes ascopes deleted the feature/GH-802-new-protocol-support branch September 27, 2025 12:47
mergify Bot added a commit to ArcadeData/arcadedb that referenced this pull request Sep 29, 2025
…to 3.10.0 [skip ci]

Bumps [io.github.ascopes:protobuf-maven-plugin](https://github.com/ascopes/protobuf-maven-plugin) from 3.9.1 to 3.10.0.
Release notes

*Sourced from [io.github.ascopes:protobuf-maven-plugin's releases](https://github.com/ascopes/protobuf-maven-plugin/releases).*

> v3.10.0
> -------
>
> New features
> ------------
>
> * URL parsing has been reimplemented to work with Apache Commons Compress. This allows users to extract plugins implicitly from tarballs and other archive types that are listed in the documentation. This includes:
>   + `jar:https://somewebsite.lan/archive.jar!/path/to/exe`
>   + `zip:file://some/local/path/archive.zip!/path/to/exe`
>   + `ear:https://somewebsite.lan/archive.ear!/path/to/exe`
>   + `war:https://somewebsite.lan/archive.war!/path/to/exe`
>   + `kar:https://somewebsite.lan/archive.kar!/path/to/exe`
>   + `tar:https://somewebsite.lan/archive.tar!/path/to/exe`
>   + `tar:gz:https://somewebsite.lan/archive.tgz!/path/to/exe`
>   + `tar:bz2:https://somewebsite.lan/archive.tar.bz2!/path/to/exe`
>   + Further support for LZMA, CPIO archives, 7z archives, XZ, Z, etc is possible, please raise an issue to discuss.
> * Deeply nested URL protocols are now valid. If you need to extract a tarball from a zip and then extract a file from that tarball, this should work as expected.
>
> What's Changed
> --------------
>
> * Build on Java 25 in CI by [`@​ascopes`](https://github.com/ascopes) in [ascopes/protobuf-maven-plugin#805](https://github.com/ascopes/protobuf-maven-plugin/pull/805)
> * Include GH contributing guide and security notes in generated site by [`@​ascopes`](https://github.com/ascopes) in [ascopes/protobuf-maven-plugin#807](https://github.com/ascopes/protobuf-maven-plugin/pull/807)
> * Use v5 codecov action by [`@​ascopes`](https://github.com/ascopes) in [ascopes/protobuf-maven-plugin#806](https://github.com/ascopes/protobuf-maven-plugin/pull/806)
> * Bump com.google.api.grpc:proto-google-common-protos from 2.61.1 to 2.61.2 in /protobuf-maven-plugin/src/it/setup by [`@​dependabot`](https://github.com/dependabot)[bot] in [ascopes/protobuf-maven-plugin#811](https://github.com/ascopes/protobuf-maven-plugin/pull/811)
> * Bump org.apache.maven.plugins:maven-javadoc-plugin from 3.11.3 to 3.12.0 by [`@​dependabot`](https://github.com/dependabot)[bot] in [ascopes/protobuf-maven-plugin#809](https://github.com/ascopes/protobuf-maven-plugin/pull/809)
> * Bump org.apache.maven.plugins:maven-compiler-plugin from 3.14.0 to 3.14.1 by [`@​dependabot`](https://github.com/dependabot)[bot] in [ascopes/protobuf-maven-plugin#808](https://github.com/ascopes/protobuf-maven-plugin/pull/808)
> * Bump org.assertj:assertj-core from 3.27.4 to 3.27.5 by [`@​dependabot`](https://github.com/dependabot)[bot] in [ascopes/protobuf-maven-plugin#810](https://github.com/ascopes/protobuf-maven-plugin/pull/810)
> * [GH-804](https://github.com/ascopes/protobuf-maven-plugin/issues/804): Document usage of ZIP/JAR archives for sourceDirectories by [`@​ascopes`](https://github.com/ascopes) in [ascopes/protobuf-maven-plugin#812](https://github.com/ascopes/protobuf-maven-plugin/pull/812)
> * [GH-802](https://github.com/ascopes/protobuf-maven-plugin/issues/802): Reimplement URL handling by decorating input streams with commons-compress by [`@​ascopes`](https://github.com/ascopes) in [ascopes/protobuf-maven-plugin#803](https://github.com/ascopes/protobuf-maven-plugin/pull/803)
> * Bump org.sonatype.central:central-publishing-maven-plugin from 0.8.0 to 0.9.0 by [`@​dependabot`](https://github.com/dependabot)[bot] in [ascopes/protobuf-maven-plugin#813](https://github.com/ascopes/protobuf-maven-plugin/pull/813)
> * Bump org.assertj:assertj-core from 3.27.5 to 3.27.6 by [`@​dependabot`](https://github.com/dependabot)[bot] in [ascopes/protobuf-maven-plugin#814](https://github.com/ascopes/protobuf-maven-plugin/pull/814)
> * Bump org.immutables:bom from 2.11.3 to 2.11.4 by [`@​dependabot`](https://github.com/dependabot)[bot] in [ascopes/protobuf-maven-plugin#815](https://github.com/ascopes/protobuf-maven-plugin/pull/815)
>
> **Full Changelog**: <ascopes/protobuf-maven-plugin@v3.9.1...v3.10.0>


Commits

* [`a3827ea`](ascopes/protobuf-maven-plugin@a3827ea) [maven-release-plugin] prepare release v3.10.0
* [`c4f9a58`](ascopes/protobuf-maven-plugin@c4f9a58) Fix deploy.yml
* [`524f637`](ascopes/protobuf-maven-plugin@524f637) Update deploy.yml
* [`0d402b3`](ascopes/protobuf-maven-plugin@0d402b3) [maven-release-plugin] rollback the release of v3.10.0
* [`588944a`](ascopes/protobuf-maven-plugin@588944a) [maven-release-plugin] prepare release v3.10.0
* [`909183c`](ascopes/protobuf-maven-plugin@909183c) Fix mistake in JDK25 config for deploy.yml
* [`eb739b0`](ascopes/protobuf-maven-plugin@eb739b0) Merge pull request [#815](https://github.com/ascopes/protobuf-maven-plugin/issues/815) from ascopes/dependabot/maven/main/org.immutables-bom...
* [`027fa3e`](ascopes/protobuf-maven-plugin@027fa3e) Merge pull request [#814](https://github.com/ascopes/protobuf-maven-plugin/issues/814) from ascopes/dependabot/maven/main/org.assertj-assert...
* [`ae5d8de`](ascopes/protobuf-maven-plugin@ae5d8de) Merge pull request [#813](https://github.com/ascopes/protobuf-maven-plugin/issues/813) from ascopes/dependabot/maven/main/org.sonatype.centr...
* [`69ed3c2`](ascopes/protobuf-maven-plugin@69ed3c2) Bump org.immutables:bom from 2.11.3 to 2.11.4
* Additional commits viewable in [compare view](ascopes/protobuf-maven-plugin@v3.9.1...v3.10.0)
  

Labels

  • chore: General tech debt work.
  • dependencies: Issues and changes related to dependencies
  • documentation: Improvements or additions to documentation
  • help needed: Any expertise or time or resources will always be appreciated!
  • new feature: A new user-facing feature.

Development

Successfully merging this pull request may close these issues.

[Feature]: Rewrite URL handling to delegate to commons-compress for common archives/compressors

1 participant