-
Notifications
You must be signed in to change notification settings - Fork 5
Use a rope of string array instead of a list of strings #37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
1c2a055 to
0295c6b
Compare
|
On my computer (FreeBSD, OCaml 4.14.2), this works nicely and is a great improvement. As a matter of fact, before this change, there are some code paths in conex that take an unbearable amount of time (3 minutes) -- with this change, it works nicely in 10 seconds. So, I really would like to get this merged. |
This improves my usecase (a single file diffed with 2000 hunks by a factor of 20 GitHub workflow: run on 4.08 as well
ed59358 to
26dd217
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had only a cursory look at Rope, since I assume it has been reviewed and tested more thoroughly elsewhere. The squash of the last 3 commits (aka recognizing a List.rev_append) is good to go.
The changes in src/patch.ml seem to pick up exactly the equivalent operations in Rope.
So this looks good to me.
I also tested the new opatch on the patch series in OCaml/Solo5 and it yields the expected result.
|
while the initial rope has been copied from utcp, it was modified extensively. it may actually be worth to write some unit testing for it... |
|
I used the grepping for errors in old.log: and in new.log: I'm not sure what I'm supposed to do with this output. The test script is very nice, but is equivalence achieved with patch released as 3.0.0? |
With this output in particular, nothing beyond a simple diff. The real output is in |
kit-ty-kate
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of comments at a glance
The |
kit-ty-kate
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah yes i see, sorry i missed that part. The diff shows a couple of differences, it might be worth looking into it to make sure these aren't real issues (some of them are expected to fail, e.g. if they use context diffs instead of unified diffs)
|
Digging a bit deeper, using And the remaining (we actually need only field 6, and can observe the difference: To go through them one-by-one: core 109.07.00I've no clue why there was no warning in new. dbm 1.0This is a context diff graphics 3.12.0So, there was a download failure. libsvm 0.8.3context diffs ocp-indent 0.6.0context diff ocp-indent 0.6.2context diff ocp-indent 0.9.0context diff why3 0.81[WARNING] Some errors extracting to /tmp/test-patch/tmp/why3.0.81: Looking back at old.log, the why3 server was down when that test was run. ConclusionI can't find any divergence, but I'm not sure what to look for. For sure, context diffs are not supported anymore. Also, "bad prefix" is a common cause. I was curious whether you @kit-ty-kate have a saved run from earlier versions (so what should the test.log look like for a successful test)? |
|
I'll review again tomorrow but so far it looks great. I'm starting to understand the little invariants of the new No diff, which is perfect. The time difference isn't the most accurate data so disregard it for now. I'll open a PR in opam to have more reliable performance numbers. |
|
ocaml/opam#6772 (ocaml-benchmark) doesn't see any difference in terms of performance compared to before (it might actually very slightly take more time), but maybe it's because |
|
Thanks for testing. I added a test case (many-hunks) with lots of hunks which should show the performance difference. |
|
Thanks a lot for these tests. I can indeed confirm that the time taken running the tests went from ~20s to ~2s with both OCaml 4.14 and 5.4. One side note/question i have about that commit is: the new files add about ~2MB to the repository, do you want to keep them in or move them to https://codeberg.org/kit-ty-kate/ocaml-patch-tests (aka. |
|
Oh please feel free to move them over to your repo, I can force-push here to avoid having them in this repository. |
004ab8e to
4f36296
Compare
|
and force-pushed to remove the test here, but use the external files (many-hunks.*). |
|
hmm, so the diff is actually not what |
4f36296 to
01d557e
Compare
kit-ty-kate
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm modulo squash/clean-off of the branch.
I find the new Rope abstraction quite complex, and i'm personally not huge fan of adding such a complex datatype with hidden invariants that can be subtly broken later into the codebase (the nl parameter also seems particularly ad-hoc in that regard), but the performance improvement is worth it and it shown to work with the largest set of test that we have so it's ok for now.
Further improvements could be made in the future by changing our hunk type to use the rope type, this might avoid the extra copies and simplify some operations.
Let's merge this and i'm personally hoping to find a way to simplify the definition or something like that, later. Thanks a lot for this huge work!
|
Thanks @kit-ty-kate. I squash-merged and will cut a release. |
CHANGES: * Use a rope data structure instead of lists. Improves performance especially with many hunks (hannesm/patch#37 @hannesm, review by @shym @kit-ty-kate)
This improves my usecase (a single file diffed with 2000 hunks) by a factor of 20
GitHub workflow: run on 4.08 as well