Skip to content

Conversation

@kit-ty-kate
Copy link
Member

@kit-ty-kate kit-ty-kate commented Mar 26, 2024

Noticed while working on #5892

In this particular case, this reduced all of the performance degradation of this version of diff.

I'm opening this PR separately hoping that our benchmark system is heavy enough in reads to show the difference.

Requires #5900 to be merged first

@kit-ty-kate kit-ty-kate marked this pull request as ready for review March 28, 2024 13:24
@kit-ty-kate kit-ty-kate force-pushed the speedup-read branch 3 times, most recently from d6e5a37 to 5774d1c Compare April 2, 2024 16:05
@kit-ty-kate kit-ty-kate changed the title Speedup OpamSystem.read Speedup OpamSystem.read by 8% Apr 2, 2024
@kit-ty-kate kit-ty-kate added this to the 2.3.0~alpha milestone Jul 2, 2024
@kit-ty-kate
Copy link
Member Author

Local test using the test added in #5900 vs. In_channel.input_all:

  • pre-PR: 0.21s
  • string_of_channel = In_channel.input_all: 0.14s
  • this-PR: 0.14s

@rjbou rjbou self-requested a review July 9, 2024 08:58
@kit-ty-kate kit-ty-kate force-pushed the speedup-read branch 2 times, most recently from 69841b7 to 22321df Compare July 24, 2024 20:28
@kit-ty-kate
Copy link
Member Author

Mmh, the benchmark shows an increase of the time taken for some reason now... :/ I'm converting this PR to draft while we're figuring this out

@kit-ty-kate kit-ty-kate marked this pull request as draft July 24, 2024 20:57
@kit-ty-kate kit-ty-kate removed this from the 2.3.0~alpha milestone Jul 24, 2024
@kit-ty-kate kit-ty-kate force-pushed the speedup-read branch 2 times, most recently from 4e80cdf to ff1c7ac Compare July 29, 2024 10:42
@kit-ty-kate kit-ty-kate added this to the 2.4.0~alpha1 milestone Sep 25, 2024
@kit-ty-kate kit-ty-kate removed the request for review from rjbou October 17, 2024 14:30
@kit-ty-kate kit-ty-kate added PR: WIP Not for merge at this stage and removed PR: NEEDS UPDATE labels Jan 21, 2025
@kit-ty-kate kit-ty-kate modified the milestones: 2.4.0~alpha1, 2.4.0~alpha2 Apr 7, 2025
@kit-ty-kate kit-ty-kate removed this from the 2.4.0~alpha2 milestone May 5, 2025
@kit-ty-kate kit-ty-kate added this to the 2.4.0~alpha3 milestone May 5, 2025
@kit-ty-kate
Copy link
Member Author

One advantage of using In_channel.input_all is we avoid the "wasted memory" of Buffer when the file being read is a lot larger than 32768 bytes long and falls on the wrong side of the resizing formula (see ocaml/ocaml#14080).

For example if OpamSystem.read is being called on a 135MB file, the Buffer will be resized (several times, taking extra time) to 268MB, which means 133MB of extra memory will be allocated by the GC, which might allocate even more (see Gc.space_overhead). This is an issue for smaller machines with limited amount of RAM available.

Related to #6521

@kit-ty-kate kit-ty-kate modified the milestones: 2.4.0~rc1, 2.5.0~alpha1 Jul 3, 2025
Comment on lines +52 to +57
(rule
(enabled_if (>= %{ocaml_version} "4.14"))
(action (copy opamCompatInChannel.real.ml opamCompatInChannel.ml)))
(rule
(enabled_if (< %{ocaml_version} "4.14"))
(action (copy opamCompatInChannel.compat.ml opamCompatInChannel.ml)))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A pretty nifty alternative can be found in ocaml-re: ocaml/ocaml-re@a35565d

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AREA: PERFORMANCE PR: WIP Not for merge at this stage

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants