Cache remote refs when downloading, refactor cachedownloader #457

Merged: 7 commits, Jul 8, 2024


sirosen
Member

@sirosen sirosen commented Jul 1, 2024

Resolves #452.

  • Refactor the cachedownloader into bound and unbound variants
  • Expand caching to cache remote refs
  • Add tests for ref resolution on-disk caching
  • Preserve extension on cached refs
  • Add FAQ docs on caching

sirosen added 5 commits June 28, 2024 00:17

Separate the existing singular downloader into two distinct objects:
the bound and unbound variants. An unbound downloader implements the
core logic, almost to completion. A bound downloader *contains* an
unbound one and adds a known file target (remote and local names to
use).

The two are tied together via a single method:

    CacheDownloader.bind(URI, name) -> BoundCacheDownloader

The result allows for a CacheDownloader to be built and then bound
multiple times.
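
The bound/unbound split described above could be sketched roughly as follows. This is an illustration only, not the code from this PR: apart from `CacheDownloader`, `BoundCacheDownloader`, and `bind`, every attribute and parameter name here (`cache_dir`, `file_url`, `filename`, `downloader`) is an assumption, and the download logic itself is omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BoundCacheDownloader:
    """Bound variant: contains an unbound downloader plus a known file target."""

    file_url: str  # remote name (hypothetical attribute name)
    filename: Optional[str]  # local name (hypothetical attribute name)
    downloader: "CacheDownloader"


@dataclass
class CacheDownloader:
    """Unbound variant: holds shared settings; core logic would live here."""

    cache_dir: str  # hypothetical attribute name
    disable_cache: bool = False

    def bind(self, file_url: str, filename: Optional[str] = None) -> BoundCacheDownloader:
        # Tie this downloader to one remote/local name pair.
        return BoundCacheDownloader(file_url, filename, downloader=self)


# One CacheDownloader is built once and then bound multiple times.
downloader = CacheDownloader(cache_dir="refs")
first = downloader.bind("https://example.com/a.json", "a.json")
second = downloader.bind("https://example.com/b.yaml", "b.yaml")
```

Both bound objects share the same unbound downloader, which is what allows settings such as the cache directory to be configured in one place.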

Several refinements are needed in the CacheDownloader to support this.
Primarily, it needs to support directories which are siblings of the
`downloads` dir within the cache dir. This allows the ref resolver to
pass in `"refs"` as a directory name.

As a related change in the course of this work, HTTP retries are
expanded in scope to also cover connection errors and timeouts.
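
A rough sketch of what retrying on connection errors and timeouts can look like. It is not the PR's code: `get_with_retries`, `fetch`, `attempts`, and `delay` are hypothetical names, and the built-in `ConnectionError` and `TimeoutError` stand in for whatever exceptions the HTTP library actually raises.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumption: built-in exceptions stand in for the HTTP library's own
# connection-error and timeout exception types.
RETRYABLE_ERRORS = (ConnectionError, TimeoutError)


def get_with_retries(fetch: Callable[[], T], attempts: int = 3, delay: float = 0.0) -> T:
    """Call fetch(), retrying when it raises a connection error or timeout."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return fetch()
        except RETRYABLE_ERRORS as err:
            # Remember the failure and try again after an optional pause.
            last_error = err
            time.sleep(delay)
    # Every attempt failed; surface the most recent error.
    raise last_error
```

Errors outside `RETRYABLE_ERRORS` propagate immediately, so only transient network failures are retried in this sketch.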

Additionally, `disable_cache` gets passed down from the CLI through to
the ref resolution layer.

Tests are enhanced to better explore CacheDownloader behaviors, but
not to test the usage in ref resolution.

Rather than using a pure MD5 hash, also capture the extension used.
This allows for `.json5` and `.yaml` files, whose extensions indicate
the filetype to the parser.
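
One way to picture an MD5-plus-extension naming scheme is sketched below. The commit message does not say exactly what is hashed or how the name is assembled, so hashing the full URI and appending the extension from the URL path are assumptions, and `cache_filename` is a hypothetical helper name.

```python
import hashlib
import posixpath
from urllib.parse import urlsplit


def cache_filename(uri: str) -> str:
    """Return '<md5 hex digest of the URI><extension>' (illustrative scheme)."""
    digest = hashlib.md5(uri.encode("utf-8")).hexdigest()
    # Take the extension from the URL path only, ignoring query and fragment.
    _, ext = posixpath.splitext(urlsplit(uri).path)
    return digest + ext
```

Under this scheme a ref such as `https://example.com/schemas/item.yaml` is cached under a name that still ends in `.yaml`, so a loader which picks its parser by extension makes the same choice for the cached copy.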

sirosen added 2 commits July 3, 2024 10:58

With any non-`.json` suffix, e.g., `.yaml`, we need the cache to
preserve extensions in order to be certain that the cache loader will
choose the same parser when loading the cached ref as the one it used
when the ref was remote. The `.yaml` ref + cache population trickery
allows us to test this codepath in an acceptance test.

Add new fixtures which expose nicer interfaces for interacting with
the download caches. These are split into two variants -- one for
`refs/` and one for `downloads/` -- for the relevant functions. The
new fixtures cover getting the relevant cache dir as a path, getting
the expected cache location for a URI as a path (without creating it),
and injecting some specific file contents into said path, including
creation of the parent dir structure.
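
The three operations listed above can be sketched as plain helper functions which a test fixture might wrap. All names here (`refs_cache_dir`, `expected_ref_path`, `inject_cached_ref`) and the on-disk naming are hypothetical; the real fixtures and cache layout may differ.

```python
import hashlib
import tempfile
from pathlib import Path


def refs_cache_dir(cache_root: Path) -> Path:
    # Assumed layout: refs live in a `refs/` dir under the cache root.
    return cache_root / "refs"


def expected_ref_path(cache_root: Path, uri: str, suffix: str = ".json") -> Path:
    # Compute the expected location only; nothing is created on disk.
    name = hashlib.md5(uri.encode("utf-8")).hexdigest() + suffix
    return refs_cache_dir(cache_root) / name


def inject_cached_ref(cache_root: Path, uri: str, contents: str) -> Path:
    # Write contents to the expected location, creating parent dirs as needed.
    path = expected_ref_path(cache_root, uri)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(contents)
    return path


# Usage: pre-populate a throwaway cache before exercising ref resolution.
with tempfile.TemporaryDirectory() as tmp:
    cached = inject_cached_ref(Path(tmp), "https://example.com/schema.json", "{}")
    cached_contents = cached.read_text()
```

Keeping path computation separate from file creation lets a test assert that a cache entry is absent before a download and present afterwards.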

These new fixtures are then applied in as many locations as possible
for a small but valuable reduction in test complexity and in the tight
coupling between specific tests and the implementation.

@sirosen sirosen merged commit e31b55f into main Jul 8, 2024
45 checks passed
@sirosen sirosen deleted the cache-refs branch July 8, 2024 15:31