improve performance of diffs #32393

Merged
merged 8 commits into from
Nov 2, 2024

bohde
Contributor

@bohde bohde commented Oct 31, 2024

This has two major changes that significantly reduce the amount of work done for large diffs:

  • Kill a running git process when reaching the maximum number of files in a diff, preventing it from processing the entire diff.
  • When loading a diff with the URL param file-only=true, skip loading stats. This speeds up loading both hidden files of a diff and sections of a diff when clicking the "Show More" button.
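The first change can be illustrated with a minimal sketch. It is not the code from this PR: `parseUntilLimit`, `streamDiff`, and `demo` are hypothetical names, and the git process is replaced by a goroutine writing into an `io.Pipe` so the example runs without git installed. The idea is that the parser cancels a context as soon as it sees one file more than the limit, and the producer stops instead of emitting the rest of the diff. With a real subprocess, one common way to get the same effect in Go is `exec.CommandContext`, which kills the process when its context is cancelled.

```go
package main

import (
	"bufio"
	"context"
	"fmt"
	"io"
	"strings"
)

// parseUntilLimit reads a unified diff from r and collects at most maxFiles
// file headers. When one more header shows up it calls cancel, so whatever
// produces the diff can be stopped, and reports truncated = true.
// Note: bufio.Scanner has a default line-length limit; fine for a sketch.
func parseUntilLimit(r io.Reader, maxFiles int, cancel context.CancelFunc) ([]string, bool, error) {
	var files []string
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		line := sc.Text()
		if !strings.HasPrefix(line, "diff --git ") {
			continue
		}
		if len(files) == maxFiles {
			cancel() // limit reached: stop the producer early
			return files, true, nil
		}
		files = append(files, strings.TrimPrefix(line, "diff --git "))
	}
	return files, false, sc.Err()
}

// streamDiff stands in for a git process (an assumption for this sketch).
// It writes one file section at a time, stops once ctx is cancelled or the
// pipe is closed, and returns how many sections it attempted to write.
func streamDiff(ctx context.Context, w *io.PipeWriter, total int) int {
	defer w.Close()
	for i := 0; i < total; i++ {
		select {
		case <-ctx.Done():
			return i
		default:
		}
		if _, err := fmt.Fprintf(w, "diff --git a/f%d b/f%d\n+line\n", i, i); err != nil {
			return i
		}
	}
	return total
}

// demo wires producer and parser together and returns the number of files
// parsed, the number of sections the producer got through, and whether the
// diff was truncated.
func demo(total, maxFiles int) (int, int, bool) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	pr, pw := io.Pipe()
	done := make(chan int, 1)
	go func() { done <- streamDiff(ctx, pw, total) }()
	files, truncated, _ := parseUntilLimit(pr, maxFiles, cancel)
	pr.Close() // unblock the producer if it is waiting on a write
	return len(files), <-done, truncated
}
```

Calling `demo(1000, 2)` parses two files and reports the diff as truncated, while the producer stops long before its 1000th section.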

A couple of minor things from profiling are also included:

  • Reuse existing repo in PrepareViewPullInfo if head and base are the same.
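As a rough illustration of the repo reuse, here is a sketch with hypothetical types (`repo`, `opener`, `openHeadAndBase`); it does not reproduce the actual `PrepareViewPullInfo` code. When head and base point at the same repository, the already opened handle is returned for both sides instead of opening it a second time.

```go
package main

// repo is a hypothetical stand-in for an opened git repository handle.
type repo struct {
	path string
}

// opener counts how often a repository is opened, to make the saving visible.
type opener struct {
	opens int
}

// open simulates the (comparatively expensive) step of opening a repository.
func (o *opener) open(path string) *repo {
	o.opens++
	return &repo{path: path}
}

// openHeadAndBase returns handles for the base and head repositories and
// reuses the base handle when both sides live in the same repository.
func (o *opener) openHeadAndBase(basePath, headPath string) (base, head *repo) {
	base = o.open(basePath)
	if headPath == basePath {
		return base, base // same repository: no second open
	}
	return base, o.open(headPath)
}
```

A caller that closes handles has to account for the shared case, so that the same handle is not closed twice.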

The performance impact will depend heavily on the individual diff and the hardware it runs on, but when testing locally on a diff changing 100k+ lines over hundreds of files, I'm seeing a roughly 75% reduction in the time to load the result of "Show More".
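The `file-only=true` path behind "Show More" can be sketched as follows. The names `diffOptions`, `SkipStats`, and `optionsFromQuery` are assumptions made for this example and are not taken from the PR; only the `file-only` URL parameter comes from the description above.

```go
package main

import "net/url"

// diffOptions is a hypothetical option set for loading a diff.
type diffOptions struct {
	SkipStats bool // when true, do not compute added/deleted line totals
}

// optionsFromQuery derives diff options from a raw URL query string.
// A request with file-only=true only needs the content of a single file or
// section, so the potentially expensive diff stats are skipped.
func optionsFromQuery(rawQuery string) (diffOptions, error) {
	q, err := url.ParseQuery(rawQuery)
	if err != nil {
		return diffOptions{}, err
	}
	return diffOptions{SkipStats: q.Get("file-only") == "true"}, nil
}
```

The code that loads the diff would then check `SkipStats` before computing totals, so a full page load still shows stats while partial loads avoid the extra work.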

@GiteaBot added the lgtm/need 2 label (This PR needs two approvals by maintainers to be considered for merging.) Oct 31, 2024
@pull-request-size bot added the size/L label (Denotes a PR that changes 100-499 lines, ignoring generated files.) Oct 31, 2024
@github-actions bot added the modifies/go label (Pull requests that update Go code) Oct 31, 2024
@bohde
Contributor Author

bohde commented Oct 31, 2024

@lunny added the performance/speed label (performance issues with slow downs) Oct 31, 2024
services/gitdiff/gitdiff.go (outdated, resolved)
routers/web/repo/pull.go (outdated, resolved)
routers/web/repo/pull.go (outdated, resolved)
@pull-request-size bot added the size/M label (Denotes a PR that changes 30-99 lines, ignoring generated files.) and removed the size/L label (Denotes a PR that changes 100-499 lines, ignoring generated files.) Nov 1, 2024
@wxiaoguang added this to the 1.23.0 milestone Nov 1, 2024
Contributor

@wxiaoguang wxiaoguang left a comment

I can approve this, although the error handling seems strange; I may propose follow-up PRs once I have a clear picture of it.

@GiteaBot added the lgtm/need 1 label (This PR needs approval from one additional maintainer to be merged.) and removed the lgtm/need 2 label (This PR needs two approvals by maintainers to be considered for merging.) Nov 1, 2024
@GiteaBot added the lgtm/done label (This PR has enough approvals to get merged. There are no important open reservations anymore.) and removed the lgtm/need 1 label (This PR needs approval from one additional maintainer to be merged.) Nov 1, 2024
bohde and others added 7 commits November 1, 2024 15:38
This has two major changes that significantly reduce the amount of work done for large diffs:

* Kill a running git process when reaching the maximum number of files in a diff, preventing it from processing the entire diff.
* When loading a diff with the URL param `file-only=true`, skip loading stats. This speeds up loading both hidden files of a diff and sections of a diff when clicking the "Show More" button.

A couple of minor things from profiling:

* Reuse open git commits if possible to avoid querying the repo.
* Reuse existing repo in `PrepareViewPullInfo` if head and base are the same.
@bohde force-pushed the rb/diff-performance-improvements branch from 2f443b2 to 69acf6a November 1, 2024 20:38
@wxiaoguang merged commit 7dcccc3 into go-gitea:main Nov 2, 2024
26 checks passed
@wxiaoguang
Contributor

We need this to correctly handle the errors: Fix git error handling #32401

zjjhot added a commit to zjjhot/gitea that referenced this pull request Nov 6, 2024
* giteaofficial/main: (21 commits)
  Use 8 as default value for git lfs concurrency (go-gitea#32421)
  Fix milestone deadline and date related problems (go-gitea#32339)
  Only query team tables if repository is under org when getting assignees (go-gitea#32414)
  Refactor RepoRefByType (go-gitea#32413)
  Refactor template ctx and render utils (go-gitea#32422)
  Refactor DateUtils and merge TimeSince (go-gitea#32409)
  Refactor markup package (go-gitea#32399)
  Add some handy markdown editor features (go-gitea#32400)
  Make LFS http_client parallel within a batch. (go-gitea#32369)
  Refactor repo legacy (go-gitea#32404)
  Replace DateTime with proper functions (go-gitea#32402)
  Fix git error handling (go-gitea#32401)
  Fix created_unix for mirroring (go-gitea#32342)
  Replace DateTime with DateUtils (go-gitea#32383)
  improve performance of diffs (go-gitea#32393)
  Refactor tests to prevent from unnecessary preparations (go-gitea#32398)
  Add artifacts test fixture (go-gitea#30300)
  Fix `missing signature key` error when pulling Docker images with `SERVE_DIRECT` enabled (go-gitea#32365)
  Fix a number of typescript issues (go-gitea#32308)
  Update go dependencies (go-gitea#32389)
  ...
Labels
  • lgtm/done: This PR has enough approvals to get merged. There are no important open reservations anymore.
  • modifies/go: Pull requests that update Go code
  • performance/speed: performance issues with slow downs
  • size/M: Denotes a PR that changes 30-99 lines, ignoring generated files.
5 participants