Skip to content

Conversation

@EliahKagan
Copy link
Member

@EliahKagan EliahKagan commented May 9, 2025

This makes several improvements to the MSRV check. Most are to make clearer what is going on, in the msrv.yml workflow and elsewhere, by modifying and expanding comments, and by slightly reorganizing the commands. But this also improves the robustness of the check, by:

  • Fixing a bug where jobs could report the wrong status, which was analogous to #1559 but much less likely to occur. If both the rustup commands were to fail, but only on Windows, then the third command that was in that same step with them could cause the whole step to fail--since pwsh is used by default on Windows and it doesn't have anything analogous to -e--whereupon the usually more recent toolchain preinstalled in the windows-2022 runner would be tested with, instead of the MSRV. I don't think this has actually happened in any runs so far, but I think it's worth preventing entirely.
  • Passing --locked to the cargo commands in the ci-check-msrv recipe, so they fail if they would have to further modify Cargo.lock, rather than modifying it and continuing. This is because the effects of -Zminimal-versions are non-obvious and subject to change (--minimal-versions has not been stabilized), and also to try to account at least partially for possible disagreement in what the minimal dependency versions are between the nightly toolchain that is used to downgrade the locked dependencies (because only nightly has -Zminimal-versions) and the old stable toolchain versioned at the MSRV.
  • Using cargo build instead of cargo check in the ci-check-msrv recipe, as suggested in #1808 (comment). (Please note, however, that #1808 is still probably not completely fixed. My guess is that the build problems noted there occurred, and could still occur, in some features, or some crates, that are not covered by the commands in the recipe.)

This also fixes an accessibility bug in the MSRV badge, where the aria-label attribute of the top-level svg element had the value rustc: 1.70.0+, instead of the correct and intended value rustc: 1.75.0+ that was already present in the other places in the SVG code where the version appears (dc1d271).

EliahKagan added 8 commits May 9, 2025 00:27
Probably it should not be in quite this form in the `justfile`,
though, because only the `#` line immediately preceding the first
line that is formally part of the recipe is shown in `just --list`.

This only changes the `justfile`. It does not modify the `##`
comment in the corresponding `Makefile` rule.
This rewrites the newly expanded multi-line comment explaining what
the `ci-check-msrv` recipe in the `justfile` does, so that it is
still readable from top to bottom, but so that the one-line summary
is the last line.

This is needed because `just --list` only treats a single `#` line,
preceding the formal beginning of the recipe, as its documentation.

(This differs from most other uses of multi-line comments, such as
in a Rust `///` or `//!` comment, where the first line could be a
short summary, and subsequent paragraphs are regarded subordinate.)

This continues to change only the `justfile` and not to modify the
`##` comment in the corresponding `Makefile` rule.
This fixes an edge case in the Windows MSRV job, by using `bash`
for script steps on all platforms.

The MSRV workflow had alredy implicitly used `bash` on Ubuntu,
which implicitly passed `-e` (as also happens when `shell: bash` is
used explicitly). But it had implicitly used `pwsh` on Windows.

Although GHA runners make an effort to fail `pwsh` script steps
when a command has failed, PowerShell (whether `pwsh` or
`powershell`) does not have an analogue of POSIX `-e`. Script steps
with `pwsh` do not always fail fast, nor even always fail at all,
when a command fails that is not the last command in the script.

Recent experiments confirm that running an implicit `pwsh` script
step with two commands, each of which run external programs, where
the first program exits normally but indicating failure (such as
with an exit code of 1) but the second program does not fail, will
run both commands, then report success for the whole step.

This is to say that the kind of bug that 4f2ab5b (GitoxideLabs#1559) fixed in
the Windows `test-fast` job in `ci.yml` also existed in the Windows
`check-msrv` job in `msrv.yml` (which GitoxideLabs#1559 did not fix).
Fortunately, this version of the bug is far less severe, in that
the circumstnaces under which a failure would be concealed appear
to have been unlikely.

However, at least one such concealed-failure mode was plausible. If
the `rustup toolchain install ...` command failed, thereby causing
`rustup default...` to fail, then the cargo update command could
still succeed, masking those two failures. Error messages for the
failures would still be shown in the log, but the step would have
reported success. Then the `gix check` commands run via
`just ci-check-msrv` would check using the wrong toolchain, i.e. a
typically newer toolchain than the MSRV, present in the
`windows-2022` runner image.

This commit fixes that bug by setting a default of `shell: bash`
for all script steps not specifying `shell:`. This implicitly
passes `-e` to the `bash` interpreter. Although `-e` has its own
intricacies, for simple commands such as those shown here, it has
the intuitive fail-fast behavior.

(The alternative of splitting the steps up so each step directly
runs only a single command, as was done in GitoxideLabs#1559, was considered.
But the code seems like it will be less readable if written that
way. It may make sense to organize the commands here into more
steps, but likely with some steps having two or more commands.)

This commit also rewords a conceptually related comment in `ci.yml`
for clarity, and so that differences in wording between comments
on `shell: bash` in various workflows reflect differences in
circumstances.
- Split into more steps, logically related.
- Identify both toolchains in the step name.
- Name the `just` step so it's clear what it does.
- Add a TODO for switching to `--minimal-versions`.
- Check that `cargo` didn't have to further modify `Cargo.toml`.
- Capitalize `RUST_VERSION`, as it is an environment variable.
The verification being done here is exactly what `--locked` is for.

Allowing the check to proceed even if it changes the dependencies
could have the benefit of showing more information. But it is not
obvious that the information it shows would be relevant to the
situation being investigated.

(If that information would usually be relevant, then
`cargo +nightly update -Zminimal-versions` may be too strong, and
`cargo +nightly update -Zdirect-minimal-versions` could be
considered. Even better than either of those *might* be, somehow,
to temporarily downgrade locked dependencies only as much as
necessary to make them all MSRV-compatible.)

In any case, `--locked` is more readable, and considerably simpler.

In addition to making complementary changes to the `msrv.yml`
workflow and to the `justfile` recipe it uses, this also changes
the `Makefile` rule corresponding to the `justfile` recipe, so they
continue to do the same thing if used in local experiments.
This commit splits the commands in the steps of the `check-msrv`
job definition to be one per step, documenting each step.

In 510847b I predicted that it would not make sense to split the
steps to one command per step in `msrv.yml`. The change made there
of using `bash` even on Windows allowed for experimenting with how
the workflow logs look when reorganized in various ways. But in
view of other changes -- and to better clarify that two toolchains
were installed, but the MSRV toolchain is set default, so it is
used by all subsequent operations *except* the `-Zminimal-versions`
dependency downgrade that currently requires `nightly` -- it now
looks like having one command per step is better after all.

When running only one (simple) command per step, the main change in
510847b is no longer needed. That is, this now avoids that problem
in the same way it has been avoided in `test-fast` in `ci.yml`
since 4f2ab5b (GitoxideLabs#1559). So this commit also removes `shell: bash` in
`msrv.yml` (but keeps the comment clarification in `ci.yml`).
The SVG badge showing the project's MSRV, `etc/msrv-badge.svg`, has
four occurrences of the MSRV in it. Three had the current intended
MSRV. But the top-level `svg` element's `aria-label` attribute
still had the value `rustc: 1.70.0+`, having not been updated the
most recent two times the badge SVG was adjusted for a new MSRV.
That would, at least potentially, result in screen readers and
possibly other software saying/outputting 1.70.0 instead of 1.75.0.

This changes the `aria-label` value to `rustc: 1.75.0+`, as
intended.

This also rephrases the comment in `msrv.yml` about updating the
badge when changing the MSRV, to mention the need to change "all
occurrences".

(We may want to add a CI check to make sure the badge is fully
consistent with the current MSRV, but that isn't attempted here.)
This changes `cargo check` commands to `cargo build` in the
`ci-check-msrv` recipe in the `justfile`, which the `check-msrv` CI
jobs in `msrv.yml` use. (It updates a CI step name accordingly.)

The idea is to make sure at least `gix`, with the two combinations
of features tested, actually *builds* under the MSRV toolchain.

As in f10f18d, this also updates the `Makefile` rule corresponding
to that `justfile` recipe.

The idea of actually building was suggested in:
GitoxideLabs#1808 (comment)

However, this does not uncover any new breakages. And there has
been further improvement on GitoxideLabs#1808, including in the commits leading
up to this, as well as earlier, in 569c186 (GitoxideLabs#1909). Nonetheless, it
seems likely that some problems remain with some combinations of
crates and features that are not currently exercised in the MSRV
check. GitoxideLabs#1808 is most likely not yet fully fixed.
Comment on lines -129 to +130
cargo check --package gix
cargo check --package gix --no-default-features --features async-network-client,max-performance
cargo build --locked --package gix
cargo build --locked --package gix --no-default-features --features async-network-client,max-performance
Copy link
Member Author

@EliahKagan EliahKagan May 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why we have a check-msrv-on-ci rule in the Makefile, equivalent to the ci-check-msrv recipe in the justfile that the msrv.yml workflow uses. I've expanded the comment only in the justfile. But I've updated the command in the Makefile, rather than only in the justfile, so that their behavior can remain equivalent as long as we keep duplicating functionality across both.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for bringing this up. It also seems like this target isn't used at all, so I think it's better to remove it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I'll remove it from Makefile in a subsequent PR.

More broadly, I'm not clear on what is supposed to go in the Makefile and what is supposed to go in the justfile. Is the Makefile just left over from before the justfile was introduced, such that everything there is there is for historical reasons only? Should its contents be migtated to the justfile? Or are there reason to prefer that some things be in the Makefile instead of the justfile?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a left-over, and the port to justfile was never completed.
make stress is still called, and it could be that it leverages the natural parallelism of make which is harder to do in a justfile.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. I'll remember to consider the effect on parallelism if and when moving things, especially if they are not already duplicated, from the Makefile to the justfile. The check-msrv-on-ci rule/target probably doesn't benefit from that, though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe casey/just#626 will be completed--for example, maybe casey/just#1562 will be picked back up--and then they could all be moved over to the justfile.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be nice.

My general feeling about just was that it was supposed to remain a sequential task runner though, and thus far the community around it didn't feel strongly enough about it to fork and merge in their desired solution. So it's probably nothing more than a wildcard at this point.

Copy link
Member Author

@EliahKagan EliahKagan Jul 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed that casey/just#626 has very recently been implemented in casey/just#2803 in the form of a new [parallel] attribute. My understanding is that, if a recipe $Y$ declares recipes $X_1$, $X_2$, and $X_3$ as dependencies and $Y$ is marked [parallel], then running $Y$ causes just to run $X_1$, $X_2$, and $X_3$ in parallel rather than sequentially.

This is not equivalent to -j for make. However, at least for make stress, it seems to me that [parallel] does exactly what we want, and that we are currently using make -j as a workaround for not having such a feature. The stress recipe runs various commands with time, but the first command it runs is only to set things up for the subsequent commands, and it does not use time, nor any other shell features.

$(MAKE) -j3 $(linux_repo) $(rust_repo) release-lean

That first command runs three other recipes with $(MAKE) -j3 so they run in parallel, and the recipe declares no dependencies. Therefore, conceptually it has the three recipes identified in the first command as dependencies, but it runs them through a command with $(MAKE) instead, to be able to pass -j3.

Furthermore, as far as I can tell, while stress runs various recipes directly and indirectly, this is the only place where -j parallelism is involved. Even if one runs make -jN stress with one's chosen N, I don't think one gets greater parallelism, nor do I think greater parallelism is intended when running the stress test.

Per casey/just#626 (comment), [parallel] will be available starting in the next release of just. It seems to me that whether this will let us move everything from the Makefile into the justfile depends on how accurate the note at the top of the make help text is:

Note: Make is only for specific functionality used by CI - run just for developer targets

Assuming that is (still) true, it seems to me that [parallel] will let us move everything into the justfile (or, if preferred, a separate justfile, though I don't think that would be preferred). This is because, at least going by what I see when I inspect the output of git grep -Fwn make, the only use of make on CI--looking only at uses that that involve our Makefile--is to run make stress in cron.yml.

I believe there are other indirect implicit uses of make in building native code dependencies, as occurs when some features of some crates are enabled, and for this reason we do install make in containers that may not have it, in some other CI jobs. But these would not have anything to do with the Makefile in this project.

However, if make -jN is also being used manually for local operations, possibly explicitly specifying multiple recipes on the command line, then that is a different story. There may also be other relevant subtleties.

Although I don't know that we should do it regularly, I think it would be useful to be able to run the stress test on Windows. I expect that eliminating the Makefile would ease portability, since various interesting things can happen with Makefiles on Windows. But it may not be necessary.

(As an aside, torvalds/linux is one of the test repositories used there, and just cloning that on Windows is very slow, even when it is a bare clone. However, since the test uses a bare clone, it should at least be able to run, since the usual problems such as case clashes and conflicts with the reserved DOS device name AUX should not apply.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for sharing!

After reading this I arrive at the same conclusion and can confirm that with [parallel] our Makefile wouldn't be needed anymore. I also think it's valuable to not have a Makefile when just would now do in every way.

Thus I very much welcome this transition to happen whenever possible, and whenever you have some time.

@EliahKagan EliahKagan enabled auto-merge May 9, 2025 06:15
@EliahKagan EliahKagan merged commit 9d0c809 into GitoxideLabs:main May 9, 2025
22 checks passed
@EliahKagan EliahKagan deleted the run-ci/msrv-check branch May 9, 2025 06:29
Copy link
Member

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot!

Comment on lines -129 to +130
cargo check --package gix
cargo check --package gix --no-default-features --features async-network-client,max-performance
cargo build --locked --package gix
cargo build --locked --package gix --no-default-features --features async-network-client,max-performance
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for bringing this up. It also seems like this target isn't used at all, so I think it's better to remove it.

EliahKagan added a commit to EliahKagan/gitoxide that referenced this pull request May 24, 2025
Because it was equivalent to the `ci-check-msrv` recipe in the
`justfile`, and only the latter is used.

For details, see:
GitoxideLabs#2003 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants