Specify location to store metadata for registry clients #322
Comments
I'll note that really this is about having a slight convenience: right now package managers would have to fetch both the As a package-manager author I don't really mind fetching two repos.
What if we write the In general, it seems like we could use the registry repository as the 'source of truth' (it's essentially our database), and then use the storage backend to store packages and the replica of registry repository information (the registry index, the package metadata). Package managers don't need to look at the As an aside: package managers also look at the
The reason why the
Note that package managers must always look at the most up-to-date metadata (i.e. the registry repo), because stale data is just incorrect (e.g. think about how it could go "oh sure you can publish version Y of package X, I don't see it in my cache!"), and I can't think of a mechanism to figure out that the cache is outdated other than, well, trying to fetch the source.
There's a section of the spec about mirroring the Registry. So that chunk of code you link is so that package managers (or anyone else really) can figure out the URLs of where to fetch packages from all the different mirrors. So to get back to your initial question:
...maybe instead of trying to reduce the number of locations that package managers need to deal with (which I feel is pretty minimal right now, since all the different pieces do different things), we could focus on making sure that there is one entrypoint (this repo) from which all the information on how to get the rest of the things (and their mirrors!) can be easily figured out?
That makes sense to me.
In this case, it sounds like we don't bother publishing the metadata anywhere other than this repository? Since ultimately it has to be looked up here anyway?
That's fine with me: ensuring this entrypoint gives you what you need to find the other locations, and it's up to package managers to follow that.
I'm curious how you see mirrors fitting in with things like metadata. If the metadata must be looked up from the registry repo, because caches can be out of date, then what is different about a mirror? What if a particular version of the registry fails to be mirrored over to GitLab, for example? Do we assert that you can look up essentially everything except for mirrors from the registry repo, but you must look up metadata from this repository?
Yeah, I'm for this
This is now a distributed system and we can't support all CAP properties, so we have to choose between
I think in the case of this repo we should go with CP: if this repo is unreachable, then it's pretty likely that everything else at GitHub is down, and you wouldn't be able to run the pipeline anyway. I'll note that mirroring storage backends is about ensuring availability (i.e. we do AP there: if you don't find a package on a mirror you can just try the next one, and you're sure it's there somewhere because the metadata says so, and that is always consistent). Mirroring this repo, on the other hand, is more of an insurance policy: if we don't want to deal with GitHub and its CI anymore, we can move somewhere else and the data will already be there.
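To illustrate the "just try the next one" behaviour described for storage mirrors, here is a minimal Python sketch. It is not taken from the registry or any package manager: the function name `fetch_from_mirrors`, the injected `fetch` callback, and the URL layout are all hypothetical, and the sketch assumes the (consistent) metadata has already confirmed that the package version exists.

```python
from typing import Callable, Iterable, Optional


def fetch_from_mirrors(
    mirrors: Iterable[str],
    package_path: str,
    fetch: Callable[[str], Optional[bytes]],
) -> bytes:
    """Try each mirror in order and return the first tarball found.

    `fetch` is a hypothetical callback that returns the file contents,
    or None when the mirror does not have the file (or is unreachable).
    """
    tried = []
    for base_url in mirrors:
        # Hypothetical layout: <mirror base URL>/<package path>
        url = f"{base_url.rstrip('/')}/{package_path}"
        tried.append(url)
        data = fetch(url)
        if data is not None:
            return data
    # The metadata said the package exists, so every mirror missing it
    # is an availability failure rather than a "no such package" answer.
    raise LookupError(f"package not found on any mirror: {tried}")
```

Passing the download step in as a callback keeps the fallback logic independent of any particular HTTP client, which also makes it easy to exercise without network access.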
We have a Dhall specification for the metadata about a particular package as well as particular package versions:
https://github.com/purescript/registry/blob/master/v1/Metadata.dhall
This metadata is information we collect about a package or package version on top of the information provided by users via their package manifests. For example, we collect the package size, compute a hash, and so on. The spec asserts that the metadata for packages is stored in a directory named packages here in the registry repo: https://github.com/purescript/registry#package-metadata
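As a rough illustration of the kind of information collected on top of the manifest, the following Python sketch derives a size and a hash from a package tarball. The field names (`bytes`, `hash`), the choice of SHA-256, and the `sha256-` prefix are assumptions made for the example; the actual fields are defined by the Metadata.dhall specification linked above.

```python
import hashlib


def describe_tarball(data: bytes) -> dict:
    """Compute illustrative per-version metadata for a package tarball."""
    return {
        # Size of the tarball in bytes (hypothetical field name).
        "bytes": len(data),
        # Content hash; the SHA-256 algorithm and prefix are assumptions.
        "hash": "sha256-" + hashlib.sha256(data).hexdigest(),
    }
```

A client that has both this record and a downloaded tarball can recompute the same values and compare them before trusting a file fetched from a mirror.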
However, during our registry call this morning in the #registry channel of the PureScript chat we discussed mirroring information to other repositories from the registry, like how we write the package manifests to the registry-index repository. The metadata about packages seems like it would be equally useful to package managers as the contents of the manifests themselves, and so we discussed whether the metadata should be mirrored to the registry-index as well.

The question is: should we mirror the metadata to the registry index or elsewhere, and if so, where should it be stored in the registry index?