-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance fix for delta generation #2
Comments
Ha, forgot about that. I did have some horrible buffering algorithm before, but I was trying to make readable code so I just did that :) I've committed a change that doesn't do that anymore - but it does issue lots of small reads. I also added support for buffering when opening a file (just the 3rd argument to open), so hopefully that's not too slow. |
Thank you so much :) You are very kindly! |
Just a warning - you might want to write some test cases for this if you actually use it on real data. I tried a few, but who knows what crazy stuff I've done... |
Hello, Michael! I'm compiled (oh, I need for it about 8 hours!) and installed pypy-3 for testing mainline version of pyrdiff. I run it for 4Gb file and it eat almost all my memory :) 695218 root 20 0 8380m 6.6g 12m R 100.0 1.7 3:47.99 pypy-c Because of this: def _generate_delta(signatures, changedfd):
"""Given a file object and signatures from another file,
generate a set of deltas (LiteralChange / CopyChange)"""
buf = changedfd.read() # Just read the whole damn file into memory Could you fix it in Python-3 branch? |
And this code: buf = buf[offset+blocksize:] Looks like memory allocator killer.. But I'm not sure about underlying implementation. |
Those two are related. I don't have any time to do this work at the moment, but feel free to send a pull request if you make the change. |
Hello! Did it :) Please check this: #3 But please review patch very thoroughly because I did not execute tests for enough amount of test cases. |
Hello!
I plans to test delta generation performance but found this in your code:
Could you fix this code to real file reading instead putting whole file to memory?
Thank you so much :)
The text was updated successfully, but these errors were encountered: