-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Draft] RFC: Patch dependencies using unidiff patchfiles #3177
Conversation
Clarified more in the examples, as well as some language fixes. Made the Quality of life improvements a bit more clear Filled in missing fields Preparing for removal of the question text
I feel the Reference-level explanation is lacking, and would like it if someone helped out here |
I think the patch file should be specified in the |
@TotalKrill can you please add a rendered link like other rfc pull requests? |
@bjorn3 Generally I am a bit unhappy with the [patches] section, I feel it is hard to direct the patch to the specific dependency one would want changed. |
This should definitely go in the FWIW, I have a slightly different motivation for such a feature. At work, our codebase has a bunch of crates in the To ensure we can always rebuild old versions of our codebase, we need to ensure that the pinned hashes of our patches stay alive forever, which means making sure we never The only current alternative to our current workflow is to vendor those patched dependencies, but this has its own set of downsides. Having patchfiles would be ideal for us: it'd trivially ensure old versions are reproducible, as we'd store the patchfiles in the same repo as the rest of the project. |
@roblabla That is very valid points, and we are in the same boat basically. If you go to my branch, I think you could expand the motivation text for this with those points, and also making the note about the workspace case, which I had forgotten, and Ill merge them and this RFC get that update? |
Thanks for taking the time to write this up, personally I think this would be a neat feature to add to Cargo. One of my main questions about a feature like this is how will the patch files get generated? The patch file format is relatively standard so I'm not worried about specifying something like that, but this would be one of Cargo's first features where you can't use Cargo for the whole feature, but rather it has to be supplemented with something else like Another question about how to actually generated patch files is where is the source coming from to actually generate a diff against? There seems to be two primary sources to patch, one being git and one being crates.io. For crates.io crates there's not really an official method to actually check out and view the source of any particular version. For git crates you can also run into the problem where the revision used in the lock file is not easily looked up to get a checkout elsewhere to generate a patch against. I think this basically boils down to that it's actually pretty involved to check out the precisely correct version of the source code to generate a patch against, and I think there's a good deal of risk of generating a patch against a slightly wrong version of the source. I would agree with other folks here as well that this information most likely should live in the Finally one other thing I'd like to discuss here is how patches are actually applied. While I think that there's some leeway of generating patches can require something like In any case personally I think there's still somewhat fundamental design decisions to sift through and consider before more leaf concerns like the text of the RFC are considered. I think that these kinds of the decisions end up being the meat of the RFC and it sort of writes itself once all of this is documented. |
Thank you @alexcrichton for taking the time to review and come with feedback regarding this feature I wholeheartedly agree that these design decisions needs to be acknowledged and solved before the RFC gets finalized, and my idea was that this PR is the discussion forum for that, I believe it makes it easy for people to find and contribute.
Not sure I agree here though, I will liken it to the usage and creation of source code. Cargo can use source code and make it compile, but it cannot create it. But yes, as one of the implementation steps the documentation should be updated to acknowledge this. "it is recommended to use
This is a huge hassle indeed, determining the exact version of a source code when using crates.io, I would argue that this is actually out of scope for this RFC since I believe solutions here are more into the workflow for obtaining the source code, while I would like to put the scope touched subjects of this RFC between "generating patch file <-> getting patchfile applied" A lot of the dangers of applying the patchfiles to the "wrong source" are actually handled in the format itself, so a patch will either apply cleanly ( the source code is very likely the one we wanted to patch ), or with errors since the lines to remove, add or the span of said changes along with the contextual lines are not correct, and this should in my opinion refuse to compile, the same way that errors in the dependencies will not even start the compilation. There could be another option where the
I Agree, will change this.
Looking at the unidiff standard, I believe that this patching functionality is actually quite trivial to implement. I would suggest rolling our own patch apply library for this. Rolling our own would additionally allow us to set the strictness level of applying patches. in short
|
Moved the new entry into the [patch] section, this added the requirement of a version key to specify on top of which software a patch should be applied
I updated the RFC to move it into the Also I edited the reference level explanation to be more explicit and motivate some design decisions I believe would be important |
The RFC mentions unidiff as being a standardized format in some form or shape. You also mentioned “Looking at the unidiff standard” in one of your earlier comments. Is there a canonical description of this format anywhere? And if so, could it be linked from the RFC? |
@nagisa You are correct, I did mention the unidiff standard, but that was massively overselling it on my part. What I meant by those words, is that I was looking at the "detailed specifcation" found at GNU, and generating and looking at some diffs made by git on a project I had lying around. These two resources were what I was refering to, and the format is indeed quite simple, even if the "detailed" specification is not very detailed.. https://www.gnu.org/software/diffutils/manual/html_node/Detailed-Unified.html |
Can you mention that patching |
@bjorn3 I had no idea, this would invalidate much of what I have written in the RFC... |
I think patching the generated |
I'm not sure if I entirely agree that we can wave away concerns about generating patch files with "well if the patch doesn't apply cleanly that's an error". There is no supported way really to actually get the source of a crates.io crate. You can't use git repositories since Cargo or I'm happy to be proven wrong here, but I don't think that this feature would be a great fit for Cargo if all the documentation says is "if you hand Cargo a patch it'll apply it" without actually providing any guidance at all to actually generate said patch. |
I mean, there are supported ways for 3rd party tooling to acquire it? Afaik the crates.io registry protocol is documented and stable, and there are existing cargo subcommands to download them. Wouldn't guiding users towards those work? Related issue: rust-lang/cargo#1861 |
|
||
[patch.crates-io] | ||
foo = { path = "../path/to/foo" } | ||
bar = { version = "2.0", patchfiles = ["patches/bar-update-foo-dependency.patch"] } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bar = { version = "2.0", patchfiles = ["patches/bar-update-foo-dependency.patch"] } | |
bar = { version = "=2.0", patchfiles = ["patches/bar-update-foo-dependency.patch"] } |
I'd suggest requiring that anything with a patchfiles
entry needs to have an exact version specifier, to help ensure that the source being patched is the same as the source the patch was generated against.
On @alexcrichton's "source of truth for patch generation" question, I think we can give some guidance in documentation for how a patch may be generated against a particular crate's source, but at the end of the day it's up to the user to generate the patch that they need - requiring either exact version number specifiers, or exact git commit version specifiers, helps us to make this easier to describe and enact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for reviewing!
Worrying to much about which source a patch would apply to would severly limit the value of allowing patchfiles in my opinion.
Patchfiles as I have used them has always been more of an "apply this change to code that looks like this" compared to the "apply this change to this exact version and edition of a software package"
If one would want to apply the patch to an exact version of a source code or package, I would not want to stop them from specifing the exact git hash or force the patch to apply to only a specific version from crates.io.
But forcing all patchfile users to always specify exactly which software package or version I want to apply this change to just seems rude and would severly limit the usefulness of allowing patchfiles in the first case.
As a real life example of this: https://github.com/rust-windowing/winit/pull/2025/files
I would like to have that addition as a patchfile, instead of as now, patching winit to use the one from my git branch, that I have to actually rebase with every change. That way I could be using the master of the original git branch, with that patch applied on top of whatever changes the owner(s) of the repository are deciding to merge or not.
Long time since this was last discussed, had hoped some more interest would have gathered for this, but patching using patchfiles doesnt require any knowledge about git or anything in the usual case, it requires knowledge about where the file to patch is located, which lines, and some optionally configurable settings that can define that the previous lines should be something, or what the lines following the change should be. Basically a patchfile is doing similar to what the changelog in merge requests does, it applies changes to files that looks a certain way. In no way does that require intricate knowledge of what git revision or the exact code it acts on. If the files the patchfile wants to change does not look like the patchfile specifies ( or the files does not exist ) then a clear error should be thrown during compilation. something along the lines of "patch XXXXX does not apply: {filename does not exist, patchfile XXX does not exist, lines 119-120 in filename does not match }" |
For prior art, here an equivalent functionality from NodeJS packaging context: patch-package. It also uses unidiff files. Seems it is widely used (6k stars), and we can learn from its implementation and how to teach users the usage of this new functionality. |
Here is a tool that has sprung up allowing patching. https://crates.io/crates/patch I would like some kind of feedback on how to get this RFC moving towards a decision. |
I've been thinking about this RFC, (which would really help us at work) and I think something we could do to push this forward is to experiment with this workflow out-of-tree, by (ab)using cargo vendor to build the dependencies. E.G. you could have a cargo plugin crate (say,
It wouldn't be ideal (especially the whole "needs a custom build command" bit), but it'd allow us to work out the details of how the patches are generated and applied. |
the dependency declaration, if a patch does not apply cleanly, it would result in an compilation error | ||
that shows which patch that did not apply cleanly to which library. This would make sure that patchfiles are not to dangerous. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the dependency declaration, if a patch does not apply cleanly, it would result in an compilation error | |
that shows which patch that did not apply cleanly to which library. This would make sure that patchfiles are not to dangerous. | |
the dependency declaration, if a patch has any "fuzz" (mismatched context lines), it would result in an compilation error | |
that shows which patch that did not apply cleanly to which library. This would make sure that patchfiles are not too dangerous. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joshtriplett imho it should also specify other mismatches too otherwise it'd technically be legal to not fail if a patch deletes lines that don't match since they're not context lines
I'd love to see this happen. A few bits of feedback:
|
What is the state on this? |
How would this work if the patchfile patches the dependency Cargo.toml? For example, adding feature flags to dependency that it doesn't expose. In this case, dependency resolution would fail to find a package with added features, which would need the patchfile to be applied before dependency resolution, which would in turn require dependency resolution. |
I'm going to propose to postpone this RFC. The Cargo team agrees that this seems like it would be useful and a good idea. However, there are some details to work out, and this RFC has stalled somewhat. Primarily, there isn't anyone on the team who has the capacity at this time to champion this feature. We would be interested in possibly seeing this land as an experimental feature, where we can explore options for the solution. This would be with the intent that an RFC would follow up before stabilization. If anyone is interested in working on that, feel free to reach out to the Cargo team on #t-cargo on Zulip, or on the issue rust-lang/cargo#4648, or come to Office hours. @rfcbot fcp postpone |
Team member @ehuss has proposed to postpone this. The next step is review by the rest of the tagged team members: No concerns currently listed. Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
🔔 This is now entering its final comment period, as per the review above. 🔔 |
The final comment period, with a disposition to postpone, as per the review above, is now complete. As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed. This is now postponed. |
This being my first RFC for rust, I got some feedback that I should open a PR, even though the PR and the RFC is not completed.
rendered
Feedback on the RFC itself, the writing and the process are all welcome. I find it easier to discuss around a PR ( thanks to githubs discussion system around them ) than trying to gather feedback on my fork.
I realise that patchfiles are an incredible powerful tool, and the usecase I am presenting here might not be the optimal representation of this feature. But it was the solution I wanted when encountering this problem for the N-th time, where N > 3.
This RFC draft is based upon discussions in: rust-lang/cargo#4648 and rust-lang/cargo#3830