-
-
Notifications
You must be signed in to change notification settings - Fork 392
Add script to generate tests cases for gix-diff
#2197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add script to generate tests cases for gix-diff
#2197
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for making this happen! Once the slider problem is solved for us and validated, it's such a major leap for the diff-quality in gitoxide
.
There is a major question I have that isn't mentioned inline: Is it easy to read/understand the output of the pretty assertion that compares both diffs using the internal hunk format? How does that fare in comparison to diffing unified diffs?
This obviates the need to create a git history which considerably simplifies the test as it can just direclty read files instead.
This makes visually interpreting differences between the two versions significantly easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this looks so much simpler now!
The diffs are also much easier to understand, yet I am wondering: would an actual unified diff be better to diff against another unified diff? That would retain diff lines, and I think the common term for it is 'interdiff'.
Thanks for trying it out, and… it does seem to be easier on the eyes. If you think it's better, I guess that will be the way differences are presented. On another note, I did take a look at how This is just to throw it out there - right now we have some tech-debt due to the 0.1/0.2 chasm, and this feels like a good opportunity to fix it. |
Don't initialize repo in generated script.
Quick update: this is almost ready for a round of reviews! There’s one thing I want to add before I declare it ready, and that is setting the diff algorithm for creating the baseline as well as the comparison diffs, but that should hopefully not take much more than a couple of days. |
gix-diff
gix-diff
@Byron I think/hope that this PR is now ready for a first round of reviews! (If you don’t feel it is, feel free to set it back to draft. 😄) My main question is: is this useful in its current state? Is there enough context for people to understand what the new command is supposed to do? There’s still a few |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the ping :)!
I also agree it's ready for review, and I even think it's basically ready for merge.
However, before that, I want to try the tests myself, comments regarding that are below.
Once I tried it, I think I will merge it right away.
Thanks again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love to keep this info, but it feels it would be more accessible in gix-diff/README.md
(instead of a seemingly fake shell script).
Or am I missing something, maybe the placeholder is needed for something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And in that file, could you provide a snipped, behind <details><summary>…
that I can use as a diff slider example?
That way I don't have to repeat all these steps and… run python
… (yes, I am snobby ;)), just to try it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell, make_diff_for_sliders_repo.sh
needs to be present for this line not to fail: https://github.com/cruessler/gitoxide/blob/dd5127f54ce4e055e6a87719b4962482cd82090d/gix-diff/tests/diff/blob/slider.rs#L257. I just deleted the script locally which made the associated test fail with a file not found message.
What do you mean by “diff slider example”? If you want the test to run on one single diff in order to compare the git baseline against gix-diff
, you would need two .blob
files in assets
as well as the corresponding code in the script to generate the .baseline
files (one for each of the 2 diff algorithms currently being tested). While that, in an of itself, would be easy to add, the resulting test would most likely fail as gix-diff
doesn’t yet match the git baseline in most cases as far as I can tell. Which is why I didn’t even try in the first place. 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from that, I was also wondering where the best place for the info that currently lives in the script is. Feel free to move it to a more appropriate place, the README.md
seems like a good choice to me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This clearly shows that I don't fully understand the setup yet. What I read made me think the slider file is all that's needed and the test only depends on that, but that's not the case as it doesn't include the actual blobs. It just refers to them by ID.
So I will be back with this tomorrow probably and spend the time that I probably tried to safe today 😅.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That kind of feedback is already very valuable! 😄 I’m happy to provide more context, hoping to make this a bit more self-contained.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually think you already did everything you could and I will use my ignorance and naiveté towards that matter to make it work for other readers with a similar level of ignorance and naiveté :D. So I think I am predestined to do that 😁.
This PR is about 75–80 % done (hopefully). It still needs comments and there’s also a couple of
TODO
comments that I want to go through. I already wanted to open it at this early stage, kind of as a sneak preview of the things I’m currently working on. :-)What’s the context and what are the goals of this PR?
We assume that most people would expect the output of
gix-diff
to be identical to that ofgit diff
. Therefore, we want an easy way to compare the output ofgix-diff
andgit diff
on a large corpus of diffs to make sure they match.This PR tries to make as much of that process scripted and as easily reproducible as possible.
There is a repository that contains “[t]ools for experimenting [with] diff "slider" heuristics”: diff-slider-tools. We can use
diff-slider-tools
to, for any git repository, generate a list of locations where a diff between two files is not unambiguous.This PR creates a tool that takes this a list of locations generated by
diff-slider-tools
and turns it into test cases that can be run as part ofgix-diff
’s test suite.This enables us to, whenever we want, run large-scale tests comparing
gix-diff
togix diff
, hopefully uncovering any edge case we might have missed in our slider heuristic and making sure we conform to our users’ expectations.Usage
create-diff-cases
to create the script ingix-diff/tests/fixtures/
. The script which will be calledmake_diff_for_sliders_repo.sh
.cargo test sliders -- --nocapture
insidegix-diff/tests
to run the actual tests.Implementation details
The preamble to
make_diff_for_sliders_repo.sh
comes from a similar script ingix-diff
: https://github.com/cruessler/gitoxide/blob/f0ceafa62ff519484e016789c50fa49a79819a80/gix/tests/fixtures/make_diff_repos.sh#L1-L4.make_diff_for_sliders_repo.sh
does not create a git repository. Instead, it runsgit diff --no-index
on pairs of blobs, saving the result to files whose names end in.<algorithm>.baseline
. The corresponding test then iterates through all these baselines in order to compare them to diffs produced bygix-diff
.2025-10-17: the following section refers to a previous state of this PR
The repo created by
make_diff_for_sliders_repo.sh
follows a few conventions:The test runs assertions for each pair of commits. It reads the baseline the comes from
git diff
and compares it to the output ofgix-diff
.Open questions