Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache tarballs streamed through pacote #74

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 8 additions & 3 deletions lib/fetcher.js
Original file line number Diff line number Diff line change
Expand Up @@ -166,8 +166,8 @@ class FetcherBase {
}

// private, should be overridden.
// Note that they should *not* calculate or check integrity, but *just*
// return the raw tarball data stream.
// Note that they should *not* calculate or check integrity or cache,
// but *just* return the raw tarball data stream.
[_tarballFromResolved] () {
throw this.notImplementedError
}
Expand Down Expand Up @@ -203,7 +203,12 @@ class FetcherBase {
// gets to the point of re-setting the integrity.
const istream = ssri.integrityStream(this.opts)
istream.on('integrity', i => this.integrity = i)
return stream.on('error', er => istream.emit('error', er)).pipe(istream)
const cstream = cacache.put.stream(
this.opts.cache,
`pacote:tarball:${this.from}`,
this.opts
)
return stream.on('error', er => istream.emit('error', er)).pipe(istream).pipe(cstream)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming the pipe-ing is somewhat correct, would a MinipassPipeline be better here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this should almost certainly be return new Pipeline(stream, istream, cstream)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Er, wait, no. I think cstream might not be a Passthrough?

I'll pull this locally and poke at it a bit, sorry for the uninformed comment 😅

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I have a fix for this, will squash into the commit when landing.

Suggested change
return stream.on('error', er => istream.emit('error', er)).pipe(istream).pipe(cstream)
const pipeline = new Pipeline(stream, istream)
stream.pipe(cacache.put.stream(
this.opts.cache,
`pacote:tarball:${this.from}`,
this.opts
))
return pipeline

}

pickIntegrityAlgorithm () {
Expand Down
1 change: 1 addition & 0 deletions lib/remote.js
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@ class RemoteFetcher extends Fetcher {
spec: this.spec,
integrity: this.integrity,
algorithms: [ this.pickIntegrityAlgorithm() ],
cache: null, // leave the caching of the tarball up to _istream
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will also prevent make-fetch-happen from seeing the tarballs that pacote has cached. The options seem to be:

  1. Accept extra requests being made when the tarball already exists in cache (not really acceptable)
  2. Let it write twice into the cache (cacache prevents data integrity problems by using atomic writes, but still not great, because it's double disk work when fetching a remote tarball)
  3. Only pipe into the cstream for fetchers that don't expect their fetches to be cached anyway (a little bit dirty, because it requires that the fetcher knows what mfh is going to do)
  4. Provide a cacheManager object that reads from cache, but only writes packuments to cache, not tarballs (probably the "cleanest" solution, but... oof, that's a lot of duplicated code).

I'm back to thinking (3) makes the most sense here.

}
fetch(this.resolved, fetchOpts).then(res => {
const hash = res.headers.get('x-local-cache-hash')
Expand Down