Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

jj git fetch with multiple remotes can abandon reachable commits (and rebase reachable descendants) #4920

Open
mateon1 opened this issue Nov 19, 2024 · 8 comments · May be fixed by #4960
Open

Comments

@mateon1
Copy link

mateon1 commented Nov 19, 2024

Description

jj abandons work from pruned PR branches and mangles history if the pr repo is fetched before the origin.

Steps to Reproduce the Problem

Minimal reproduction:

  1. Add two git remotes b, c, with only an initial commit
  2. In b: create a pr branch, commit some work
  3. fetch into jj repo
  4. In c: do some other work on master, merge b/pr
  5. In b: delete pr branch
  6. Current state should have b/master pointing to initial commit, c/master to merge commit
  7. In jj repo: jj git fetch --all-remotes

Expected Behavior

Bookmarks are updated, repo has state consistent with both remotes.
Fetch never, under any circumstances modifies any commits. Abandons should only happen for commits that are unreachable after all fetches.

Actual Behavior

b is pulled first, the commit is marked to be abandoned
c is pulled, referencing the lost commit
jj blindly performs the abandon, modifying history such that 'work' is lost.

[mateon@ceres:/tmp/a]$ jj git fetch --all-remotes
bookmark: pr@b [deleted] untracked
Abandoned 1 commits that are no longer reachable.
bookmark: master@c [updated] untracked
Rebased 1 descendant commits

[mateon@ceres:/tmp/a]$ jj log -r 'all()'
○    xqknuwot [email protected] 2024-11-19 17:24:42 ba93e3ed
├─╮  (empty) Merge remote-tracking branch 'b/pr'
◆ │  qrmnvrwp [email protected] 2024-11-19 17:15:43 8ac29516
├─╯  other
◆  yrpzrykw [email protected] 2024-11-19 17:06:57 master@b 103e2828
│  (empty) Initial commit
│ @  pznkzwlt [email protected] 2024-11-19 17:05:48 47efbbb4
├─╯  (empty) (no description set)
◆  zzzzzzzz root() 00000000

[mateon@ceres:/tmp/a]$ jj op show --patch d255b75cda76
d255b75cda76 mateon@ceres 1 minute ago, lasted 298 milliseconds
fetch from git remote(s) b,c
args: jj git fetch --all-remotes

Changed commits:
○  Change xqknuwottovp
   + xqknuwot ba93e3ed (empty) Merge remote-tracking branch 'b/pr'
○  Change mszwrokvtxot
   - mszwrokv hidden 1780e38a work
   Added regular file work:
       (empty)

Changed remote bookmarks:
master@c:
+ untracked xqknuwot hidden bdd92bb1 master@c | (empty) Merge remote-tracking branch 'b/pr'
- untracked qrmnvrwp 8ac29516 other
pr@b:
+ untracked (absent)
- untracked mszwrokv hidden 1780e38a work

Specifications

  • Platform: Linux ceres 6.6.53 #1-NixOS SMP PREEMPT_DYNAMIC Mon Sep 30 14:25:15 UTC 2024 x86_64 GNU/Linux
  • Version: jj 0.22.0
@PhilipMetzger PhilipMetzger added the 🐛bug Something isn't working label Nov 19, 2024
@essiene
Copy link
Contributor

essiene commented Nov 19, 2024

This is probably happening because git::fetch is called one remote at a time; Within git::fetch, we download refs per repo and finish up by calling import_some_refs to update the local jj repo.

import_some_refs will mark commits as abandoned by looking at the current changed_refs and the mut_repo after download; so b will be downloaded first and the deleted commits marked as abandoned. By the time c is downloaded, it's already too late.

Maybe we could collate changed_refs for all remotes and call import_some_refs only after all remotes have been downloaded?

@martinvonz martinvonz removed the 🐛bug Something isn't working label Nov 19, 2024
@martinvonz
Copy link
Member

As discussed on IRC, changing the behavior would mean that jj git fetch --remote foo --remote bar would behave differently from jj git fetch --remote foo && jj git fetch --remote bar. I think that would be surprising, but I rarely work with multiple remotes, so maybe don't listen to me.

@scott2000
Copy link
Contributor

It's surprising to me that the order of remotes matters when fetching. If I have two remotes A and B, I would expect all of these commands to behave the same:

jj git fetch --all-remotes
jj git fetch --remote A --remote B
jj git fetch --remote B --remote A

@martinvonz
Copy link
Member

It's surprising to me that the order of remotes matters when fetching.

Do you mean s/that/if/ ? Or is there an existing case where they behave differently? (I'm not saying there isn't. I just haven't thought of one.)

@scott2000
Copy link
Contributor

It's surprising to me that the order of remotes matters when fetching.

Do you mean s/that/if/ ? Or is there an existing case where they behave differently? (I'm not saying there isn't. I just haven't thought of one.)

I thought this issue was an example, where jj git fetch --remote b --remote c causes some commits to be abandoned, but jj git fetch --remote c --remote b wouldn't have. But I may have misunderstood.

@martinvonz
Copy link
Member

Oh, of course. It's at least close to that behavior. If you fetch first from the remote where the branch was deleted, then your local commit would be rebased off it it and then when you fetch from the other remote, the abandoned commits would be added back. If you instead fetched from the other remote first, then you would get some new commits from the remote and the next fetch would rebase your local commits and the new remote commits off of the abandoned commits.

@mateon1
Copy link
Author

mateon1 commented Nov 19, 2024

It's surprising to me that the order of remotes matters when fetching. If I have two remotes A and B, I would expect all of these commands to behave the same:

jj git fetch --all-remotes
jj git fetch --remote A --remote B
jj git fetch --remote B --remote A

I would expect this to be the correct behavior, the order mattering would be massively unintuitive

Oh, of course. It's at least close to that behavior. If you fetch first from the remote where the branch was deleted, then your local commit would be rebased off it it and then when you fetch from the other remote, the abandoned commits would be added back. If you instead fetched from the other remote first, then you would get some new commits from the remote and the next fetch would rebase your local commits and the new remote commits off of the abandoned commits.

Part of the problem is that fetches performed in one command are one jj operation.
Either way:
For the former, fetch --remote b, then fetch --remote c, what you describe is intuitive. Sure. I think it's the wrong behavior when it's a single jj command, though.
For the latter, fetch --remote c, then fetch --remote b - you get the history from remote c, and no further changes besides the deleted bookmark pr@b. The commit is not abandoned since it's reachable from c's master.

@mateon1
Copy link
Author

mateon1 commented Nov 19, 2024

Maybe we could collate changed_refs for all remotes and call import_some_refs only after all remotes have been downloaded?

From what little I know of jj's internals, this seems like the obvious, correct solution, and would give the expected behavior where order doesn't matter, and changes are abandoned if and only if they are not reachable from any bookmark in the fresh tree.

essiene added a commit that referenced this issue Dec 1, 2024
…rectly.

* Make the new `GitFetch` api public.
* Rewrite all tests that used `git::fetch*` to use the new API.
  One interesting artifact of the new API is that it holds onto a reference
  of the `tx.repo_mut()`, so the GitFetch object needs to fall go of scope
  completely before we can make any other mutable calls to `tx`.
* Delete the `git::fetch`
* Update `jj git clone` and `git_fetch` in `cli/src/git_utils.rs` to use
  the new api directly. Removing one redundant layer of indirection.
* This fixes #4923 as it first fetches from all remotes before `import_refs()`
  is called, so there is no race condition if the same commit is treated
  differently in different remotes specified in the same command.

Fixes: #4920
essiene added a commit that referenced this issue Dec 1, 2024
…rectly.

* Make the new `GitFetch` api public.
* Rewrite all tests that used `git::fetch*` to use the new API.
  One interesting artifact of the new API is that it holds onto a reference
  of the `tx.repo_mut()`, so the GitFetch object needs to fall go of scope
  completely before we can make any other mutable calls to `tx`.
* Delete the `git::fetch`
* Update `jj git clone` and `git_fetch` in `cli/src/git_utils.rs` to use
  the new api directly. Removing one redundant layer of indirection.
* This fixes #4920 as it first fetches from all remotes before `import_refs()`
  is called, so there is no race condition if the same commit is treated
  differently in different remotes specified in the same command.

Fixes: #4920
@essiene essiene linked a pull request Dec 1, 2024 that will close this issue
2 tasks
essiene added a commit that referenced this issue Dec 1, 2024
* Make the new `GitFetch` api public.
* Rewrite all tests that used `git::fetch*` to use the new API.
  One interesting artifact of the new API is that it holds onto a reference
  of the `tx.repo_mut()`, so the GitFetch object needs to fall go of scope
  completely before we can make any other mutable calls to `tx`.
* Delete the `git::fetch`
* Update `jj git clone` and `git_fetch` in `cli/src/git_utils.rs` to use
  the new api directly. Removing one redundant layer of indirection.
* This fixes #4920 as it first fetches from all remotes before `import_refs()`
  is called, so there is no race condition if the same commit is treated
  differently in different remotes specified in the same command.

Fixes: #4920
essiene added a commit that referenced this issue Dec 1, 2024
* Make the new `GitFetch` api public.
* Rewrite all tests that used `git::fetch*` to use the new API.
  One interesting artifact of the new API is that it holds onto a reference
  of the `tx.repo_mut()`, so the GitFetch object needs to fall go of scope
  completely before we can make any other mutable calls to `tx`.
* Delete the `git::fetch`
* Update `jj git clone` and `git_fetch` in `cli/src/git_utils.rs` to use
  the new api directly. Removing one redundant layer of indirection.
* This fixes #4920 as it first fetches from all remotes before `import_refs()`
  is called, so there is no race condition if the same commit is treated
  differently in different remotes specified in the same command.

Fixes: #4920
essiene added a commit that referenced this issue Dec 15, 2024
* Make the new `GitFetch` api public.
* Move `git::fetch` to `lib/tests/test_git.rs` as `git_fetch`, to minimize
  churn in the tests. Update test call sites to use `git_fetch`
* Delete the `git::fetch` from `lib/src/git.rs`.
* Update `jj git clone` and `git_fetch` in `cli/src/git_utils.rs` to use
  the new api directly. Removing one redundant layer of indirection.
* This fixes #4920 as it first fetches from all remotes before `import_refs()`
  is called, so there is no race condition if the same commit is treated
  differently in different remotes specified in the same command.
* Add test for the race condition.

Fixes: #4920
essiene added a commit that referenced this issue Dec 15, 2024
* Make the new `GitFetch` api public.
* Move `git::fetch` to `lib/tests/test_git.rs` as `git_fetch`, to minimize
  churn in the tests. Update test call sites to use `git_fetch`
* Delete the `git::fetch` from `lib/src/git.rs`.
* Update `jj git clone` and `git_fetch` in `cli/src/git_utils.rs` to use
  the new api directly. Removing one redundant layer of indirection.
* This fixes #4920 as it first fetches from all remotes before `import_refs()`
  is called, so there is no race condition if the same commit is treated
  differently in different remotes specified in the same command.
* Add test for the race condition.

Fixes: #4920
essiene added a commit that referenced this issue Dec 15, 2024
* Make the new `GitFetch` api public.
* Move `git::fetch` to `lib/tests/test_git.rs` as `git_fetch`, to minimize
  churn in the tests. Update test call sites to use `git_fetch`
* Delete the `git::fetch` from `lib/src/git.rs`.
* Update `jj git clone` and `git_fetch` in `cli/src/git_utils.rs` to use
  the new api directly. Removing one redundant layer of indirection.
* This fixes #4920 as it first fetches from all remotes before `import_refs()`
  is called, so there is no race condition if the same commit is treated
  differently in different remotes specified in the same command.
* Add test for the race condition.

Fixes: #4920
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants