GitPack: Don't use memory mapped streams when accessing pack files #626
The perf tests seem to indicate that performance is decent on Linux and Windows, but there's a significant regression on macOS (relative to the libgit2 perf). Pack files are accessed very frequently, because a single object can be deltified on top of a base object, which can itself be deltified once more. So it looks like we can either accept a performance regression on macOS, or we need to identify a better strategy for caching these streams on macOS.
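To illustrate why delta chains multiply pack file accesses, here is a minimal Python sketch. It is a toy model and not the project's code: `ToyPack`, `Entry`, and the "append bytes" delta are hypothetical stand-ins (real git deltas are copy/insert instruction streams). It only shows that resolving one object at the end of a chain touches the pack once per link.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Entry:
    """Toy pack entry: a full object, or a delta on top of another entry."""
    payload: bytes
    base: Optional[str] = None  # id of the base entry; None means a full object


class ToyPack:
    """Hypothetical in-memory stand-in for a pack file that counts accesses."""

    def __init__(self, entries: Dict[str, Entry]) -> None:
        self.entries = entries
        self.reads = 0  # how many times the "pack file" was accessed

    def read_entry(self, object_id: str) -> Entry:
        self.reads += 1
        return self.entries[object_id]

    def resolve(self, object_id: str) -> bytes:
        # Walk down to the base object, then apply each toy delta on the way back.
        entry = self.read_entry(object_id)
        if entry.base is None:
            return entry.payload
        return self.resolve(entry.base) + entry.payload


pack = ToyPack({
    "a": Entry(b"hello"),               # full object
    "b": Entry(b", world", base="a"),   # delta on top of "a"
    "c": Entry(b"!", base="b"),         # delta on top of "b"
})
content = pack.resolve("c")
print(content, pack.reads)
```

Reading the single object `c` costs three pack accesses in this model, which is why the cost of each individual access matters so much.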
I would accept a perf regression to correct for a functional regression, which is what we're looking at now with #584, right?
On macOS, processes are always 64-bit, so having two strategies (mmap for 64-bit processes, regular I/O otherwise) would likely cover all of the cases without significant performance regressions. The effective maximum size of the pack file is limited anyway, so on 64-bit you are basically guaranteed that there's enough virtual memory.
What do you think of that idea, @qmfrederik? Can we make using memory mapped pack files conditional on process pointer size?
Sure, that should work. The conditional logic can go into Not sure whether the best approach would be to switch on process pointer size (32-bit vs 64-bit) or operating system (macOS vs. Linux & Windows).
I'm thinking process pointer size, as that feels a bit less hacky and less dependent on OS implementation details that we don't understand and that might explain the perf difference between them. In a 64-bit process the memory impact doesn't seem to be an issue, so mapping huge files should be fine.
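The pointer-size switch discussed above could look roughly like the following Python sketch. This is an illustration under stated assumptions, not the project's implementation: `is_64bit_process` and `read_pack_range` are hypothetical names, and the file is reopened on every call for simplicity, whereas the real code would keep and cache its streams.

```python
import mmap
import struct
from typing import Optional


def is_64bit_process() -> bool:
    """True when native pointers in this process are 8 bytes wide."""
    return struct.calcsize("P") == 8


def read_pack_range(path: str, offset: int, length: int,
                    use_mmap: Optional[bool] = None) -> bytes:
    """Read `length` bytes at `offset`, choosing the I/O strategy by pointer size.

    Hypothetical helper: 64-bit processes map the whole file (plenty of
    virtual address space); 32-bit processes fall back to seek + read.
    """
    if use_mmap is None:
        use_mmap = is_64bit_process()
    with open(path, "rb") as f:
        if use_mmap:
            # Length 0 maps the entire file; this raises on an empty file.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
                return view[offset:offset + length]
        f.seek(offset)
        return f.read(length)
```

Both branches return the same bytes, so callers do not need to know which strategy was picked. Passing `use_mmap` explicitly lets a test exercise both code paths from a single process.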
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
We're seeing file locking test failures. Do we have conditions under which file handles are not released? Also, I suppose we should now modify the test pipeline to run tests in a 32-bit process, and again in a 64-bit process, so that both code paths are guaranteed to be tested. Do you want to take that on?
Looks good. You still have this PR as a draft. Are you ready for it to merge?
Oh right, I still want to get the 32/64-bit test run in the pipeline. Did you want to work on that? If not, I'll try to find time by this weekend.
@AArnott This should be good to go, modulo the 32-bit tests. I won't have time for that until early next week, so feel free to have a go at it.
I see you're working on 32-bit testing in #632. Thank you, @qmfrederik.
🎉 Thanks a lot!
This commit preserves the usage of memory mapped streams when reading the pack index, but reverts to using standard streams when reading raw data in pack files on 32-bit operating systems.
Fixes #584