Skip to content

Ci/crate publish test#6550

Closed
mircea-c wants to merge 7 commits intoanza-xyz:masterfrom
mircea-c:ci/crate-publish-test
Closed

Ci/crate publish test#6550
mircea-c wants to merge 7 commits intoanza-xyz:masterfrom
mircea-c:ci/crate-publish-test

Conversation

@mircea-c
Copy link
Copy Markdown

Problem

Publishing crates to crates.io sometimes fails during releases leaving versions in an inconsistent state.

Summary of Changes

Add an extra task to the crate-check workflow which publishes the crates to a test repo.

Closes https://github.com/anza-xyz/devops/issues/299

@mircea-c mircea-c requested a review from yihau June 12, 2025 20:44
@mircea-c mircea-c self-assigned this Jun 12, 2025
@mircea-c mircea-c force-pushed the ci/crate-publish-test branch from 733d551 to 49f64ad Compare June 12, 2025 21:15
Comment thread .github/workflows/release.yml Outdated

draft-release:
if: github.repository == 'anza-xyz/agave'
needs: [ trigger-buildkite-pipeline ]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to wait for the trigger-buildkite-pipeline step 🤔 they won't need each other's data

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is meant to stop the draft release and version bump from running in case the crate and bin publishing fails.

Copy link
Copy Markdown
Member

@yihau yihau Jun 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when the buildkite pipeline is triggered, the trigger-buildkite-pipeline step will show green immediately. it doesn't reflect the pipeline's result. we will need to update the flow to achieve your goal

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, damn, that makes sense 😞

Copy link
Copy Markdown
Author

@mircea-c mircea-c Jun 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the action code. The new version of the buildkite action adds a wait functionality, so this will wait until the pipeline is finished.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have, but github runners are under powered compared to our CI machines and this takes too long as it is. I'm also trying to keep the changes to the release pipeline in this PR to a minimum.

we can use more powerful GitHub-hosted machines to run it. do you have any link you can share? I’d like to know what time you got.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a lot hacky, but changing the buildkite pipeline is only done manually through the UI right now, and I don't want to touch it. Not sure I understand your comment about 2 builds for the same commit though.

oh, the 2 builds means that in this PR, the building process looks like

  1. triggering the secondary pipeline with a special message
  2. triggering the secondary pipeline again without the special message

so we will get 2 different build numbers in the buildkite page, but they are the same commits.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can use more powerful GitHub-hosted machines to run it. do you have any link you can share? I’d like to know what time you got.

The last successful run from buildkite took 1h:

https://buildkite.com/anza/agave/builds/24657/steps/canvas?sid=019746d4-191b-45f9-ba67-f8af0a9415e3

We can use self hosted github runners I guess, but that's a big change.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can just use github larger runner atm I guess

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we’re only concerned about build time, maybe we can run it only on push and place it in the same block as the coverage test 🤔

Comment on lines +10 to +17
- name: "publish dry-run"
command: "ci/publish-crate.sh --dry-run"
timeout_in_minutes: 180
artifact_paths: "log-*.txt"
env:
CRATE_PUBLISH_TEST: "true"
agents:
queue: "release-build"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this one should happen in the main pipeline or a different pipeline instead of here. some reasons:

  1. I don't think we want the crates publishing issue will block our bin shipping
  2. PRs won't trigger this pipeline, which means we won't be able to catch this error immediately

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want the release to stop if there's an issue with the crate publishing. That's why I moved this back here.

This test takes over an hour to run, so it's not a good choice to have it run on every PR.

I'll do some more work on making a more targeted check for each crate that we can add to regular CI later.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay. if we want to to do so, I will suggest that we can use queue: "solana" instead of queue: "release-build".

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the benefit of moving it to the other queue?

I see there's only one agent in the release-build queue. Are you concerned about the check taking up too much time in that queue?

Copy link
Copy Markdown
Member

@yihau yihau Jun 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the only difference atm is that release-build has some secrets. if you don't need them, you can use other machines for reducing wait time.

Comment thread ci/change-crate-deps.py
Comment thread ci/publish-crate.sh Outdated
Comment thread ci/publish-crate.sh Outdated
Comment thread ci/change-crate-deps.py Outdated
mircea-c and others added 4 commits June 13, 2025 09:54
Co-authored-by: Mykola Dzham <i@levsha.me>
Co-authored-by: Mykola Dzham <i@levsha.me>
Co-authored-by: Mykola Dzham <i@levsha.me>
@mircea-c mircea-c requested review from levsha and yihau June 16, 2025 20:41
@yihau yihau mentioned this pull request Dec 1, 2025
@mircea-c
Copy link
Copy Markdown
Author

mircea-c commented Dec 1, 2025

Superseded by #9334

@mircea-c mircea-c closed this Dec 1, 2025
@mircea-c mircea-c deleted the ci/crate-publish-test branch January 8, 2026 23:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants