-
Notifications
You must be signed in to change notification settings - Fork 302
Gitoxide for performance improvements #635
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
It's using the latest version from github for now until it's clear what should be published later - gitoxide might change to provide the best possible experience.
…necessary work Now we only create as many authors as we need to avoid allocating more strings then needed when converting BString to its display representation.
It's actually quite involved to get proper short-ids, and git even applies some heuristics to calculate good starting values for the abbreviation length, right before validating uniqueness.
f372e67 to
4fc3334
Compare
This can take a long time on massive repos.
spenserblack
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for all your hard work so far! I have a couple more questions:
… identity Additionally we sort contributors by name in case they have the same amount of commits to assure stable results, which may help in some cases where similar commits counts would be displayed as they are in the top 3.
We cache the lower-case value only if it is different from what's there to speed up comparisons while using memory/allocations only when needed.
|
I think it's ready for a formal review. Thanks. |
`git shortlog -sne | wc -l` is the reference.
…leading to it being displayed twice.
87b020e to
eb753f9
Compare
|
I took a look at what mainline does when displaying shallow clones and noticed that it displays it as Hopefully this is something you find as valuable as me, please let me know what you think. Edit: Here is how it looks like. |
That way apps using it won't be surprised by default, but instead can upgrade to deal even better with shallow clones by using the data provided in the iterator.
That is a good idea! Shallow clones can misrepresent the commit count, after all. Issue #592 may also be relevant -- it seems to be another way that a user might truncate the history. I'd consider this a new feature besides the goal performance improvement. Would you be willing to keep a log of additional bugfixes/features in this PR's description? That way we can make sure to give you credit for all your improvements in the next release 🙂 ## Additional Changes
- detect shallow clones |
|
@spenserblack Thanks so much for pointing me to this issue. Even though I stumbled across the graft/replace feature when studying git's source I could never make sense of it. That changed now and I will implement support for it in @o2sh Thanks for the issues, I created a new section in the PR body to track those. Besides that, I think all review comments are now addressed 🎉, but I will let you know separately once I think it's ready for a final look. |
|
Many thanks for the hint about replacement objects, thanks to that This means that from my side, this PR is now feature complete and I am looking forward to your feedback to push it over the finishing line. |
|
I've just approved the PR. Prior to that, don't forget to:
@spenserblack, do we keep the Thanks again for your contribution @Byron, |
spenserblack
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your hard work maintaining gitoxide and making this PR!
These are just some comments to think about. The only one that's somewhat high-priority is the BUG expects, but overall LGTM.
Ideally we add more test coverage so users have no chance from ever seeing it.
Previously this wasn't possible as commits would be kept in `Repo` which would cause self-referential borrow check issues unless the git2 repository was kept outside.
You are welcome, it's a pleasure :). Thanks for all your suggestions, you find them implemented in the new commits. I will let it settle for a day or so in case other bits and pieces come up, and go ahead with the merge otherwise. |
|
And it's done 🎉! Thanks everyone for their suggestions and comments! I can't wait to finalize the transition to |
This is a very first preview of what it would mean to use gitoxide for commit-traversal alone, done with the smallest possible changes for everything to remain as familiar as possible. It ain't super pretty yet, but it will get there. My rough plan is to get feedback early and ultimately squash all/most commits once a merge is possible.
Furthermore I think a delayed progress bar could easily be implemented to provide some entertainment while people are waiting for their huge repositories (those will never finish below 1s no matter how hard we try 😅).
Right now it's 2.2x faster on reactos and 2.4x faster on the linux kernel at 4% the heap memory consumption compared to what's on
main.I am looking forward to your feedback.
Making-of Video
gitoxideto make this PR possible.Additional Changes
git replace) #592)Tasks
git-mailmapgitoxidefor commit graph traversalSiggitoxidein all possible places to validate the API is en-par withgit2or more convenientgit2, but we are working on it, this should be ready this year.gitoxideand switch to usingcrates.ioPerformance
e3b29b0 - commit traversal with
gitoxide0652bbe - minimize allocations
633f0ce - parallelize workspace status, tokei, and commit traversal
Performance comparison with
mainOn 633f0ce
reactos
linux v5.16
On 0652bbe
reactos
** linux**
On e3b29b0
reactos
linux v5.16
It also uses less memory (already)
Note that the contributor count is off by 2 - it's likely to be related to the mailmap, and for I don't know if
libgit2is off orgitoxide. I will look into it.