Skip to content

Update ModuleWriter logic to use ArchiveSource enum#2882

Merged
messense merged 1 commit intoPyO3:mainfrom
e-nomem:archive-source-all
Dec 3, 2025
Merged

Update ModuleWriter logic to use ArchiveSource enum#2882
messense merged 1 commit intoPyO3:mainfrom
e-nomem:archive-source-all

Conversation

@e-nomem
Copy link
Copy Markdown
Contributor

@e-nomem e-nomem commented Dec 1, 2025

This PR updates the ModuleWriter trait to use the ArchiveSource enum that was added to support the BindingGenerator trait.

This is part 1 of 2 PRs, where the second one will add a VirtualWriter struct that will wrap around WheelWriter and SDistWriter. The virtual writer will store a HashMap<PathBuf, ArchiveSource> internally so, with this update, the virtual writer can eventually just call self.inner.add_entry(target, source).

The primary purpose of this virtual writer is to ensure that the files are sorted by path before being written to the underlying archive and ensure that builds are reproducible. As a useful side effect, this will move the file exclusion and duplicate tracking to the virtual writer as well, providing a nice separation of concerns with the virtual writer deciding if a file should be present in the archive and the underlying writer actually doing the writing to the archive.

The intended end result for this set of patches can be viewed on my virtual-writer branch.

Comment thread src/archive_source.rs
#[derive(Debug, Clone)]
pub(crate) struct GeneratedSourceData {
pub(crate) data: Vec<u8>,
pub(crate) path: Option<PathBuf>,
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels odd to have a path on a generated source but, without it, the tests for rewriting path dependencies during sdist creation fail because we try to add the Cargo.toml again after rewriting it.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the ModuleWriter trait to use the ArchiveSource enum for a cleaner API design. This is part 1 of 2 PRs that will enable a future VirtualWriter implementation for reproducible builds by sorting files before writing them to archives.

Key changes:

  • Splits ModuleWriter into ModuleWriterInternal (with add_entry) and ModuleWriter (with convenience methods)
  • Updates GeneratedSourceData to include an optional path field for source tracking
  • Refactors all writer implementations (Wheel, SDist, Path) to handle ArchiveSource enum directly instead of Read trait

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/module_writer/mod.rs Refactored trait structure with sealed pattern; added blanket impl for convenience methods
src/module_writer/wheel_writer.rs Updated to implement ModuleWriterInternal with add_entry method handling ArchiveSource
src/module_writer/sdist_writer.rs Updated to implement ModuleWriterInternal with add_entry method handling ArchiveSource
src/module_writer/path_writer.rs Updated to implement ModuleWriterInternal; improved error messages with Debug formatting
src/module_writer/mock_writer.rs New test mock writer implementing the refactored trait
src/archive_source.rs Added path field to GeneratedSourceData and path() helper method
src/binding_generator/mod.rs Simplified to use add_entry directly instead of matching on ArchiveSource
src/binding_generator/{uniffi,pyo3,cffi,bin}_binding.rs Updated GeneratedSourceData construction to include path: None field
src/source_distribution.rs Updated imports to use ModuleWriter trait directly
src/build_context.rs Updated imports to use ModuleWriter instead of ModuleWriterExt
tests/common/mod.rs Removed unused metadata module reference

Comment thread src/module_writer/mod.rs
@e-nomem e-nomem force-pushed the archive-source-all branch from d2727f6 to c2d11e8 Compare December 2, 2025 17:22
@messense messense self-requested a review December 3, 2025 02:33
@messense messense merged commit 439f525 into PyO3:main Dec 3, 2025
45 checks passed
@e-nomem e-nomem deleted the archive-source-all branch December 3, 2025 05:25
messense pushed a commit that referenced this pull request Dec 4, 2025
As I mentioned previously in #2882,
this PR adds a `VirtualWriter` that wraps around the existing module
writer structs and extracts the common file tracking and exclusion
logic. In addition, it tracks all files before writing them at the end
allowing us to write the files with a consistent order.

After this PR, it should be possible to have mostly reproducible builds.
For sdists this requires using the same version of maturin for
reproducibility, and for wheels it requires using the same versions of
maturin, the rust toolchain, and external libraries.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants