[Ex CI] nonpolling dispatch action and summary check #607

Merged
danielsu-amd merged 6 commits into develop from users/danielsu/az-dispatch-nonpolling on Jul 11, 2025

@danielsu-amd danielsu-amd commented Jul 11, 2025

Progress for #479

Reworks the Azure dispatch action to be nonpolling, which resolves the 360-minute timeout issue. It now creates a separate Azure CI Summary check, which is updated from inside Azure pipeline runs via the report-summary-check.yml template. This summary check is intended to be marked as required.

Once an Azure Pipelines run finishes, it reports its overall status to the summary. If any jobs have failed or been cancelled, the summary check is marked as failed. If all jobs have succeeded, the summary passes.
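
The pass/fail rule described above can be sketched in a few lines of Python. This is only an illustration, not the logic in report-summary-check.yml: the function name `summarize` and the job result strings are assumptions, and the sketch leaves any state the description does not cover (for example skipped jobs) undecided by returning `None`.

```python
from typing import Iterable, Optional

# Assumed job result strings; the real pipeline may use different values.
FAILING_RESULTS = {"failed", "canceled", "cancelled"}


def summarize(job_results: Iterable[str]) -> Optional[str]:
    """Map per-job results to an overall conclusion for the summary check.

    Returns "failure" if any job failed or was cancelled, "success" if
    every job succeeded, and None when the outcome is not covered by the
    rule in the PR description (hypothetical handling).
    """
    results = [r.lower() for r in job_results]
    if any(r in FAILING_RESULTS for r in results):
        return "failure"
    if results and all(r == "succeeded" for r in results):
        return "success"
    return None
```

For example, `summarize(["succeeded", "failed"])` yields `"failure"`, while a list containing only `"succeeded"` entries yields `"success"`.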

Adds logic for handling rerunning the initial dispatch action, which will either start new runs or rerun existing runs depending on the state of the PR. Instructions for doing so are included directly in the summary checks for easy access.

Overview of flow:

  1. PR is created, Trigger Azure CI action is run on PR
  2. Trigger Azure CI kicks off Azure runs and creates an Azure CI Summary check
  3. As Azure runs finish, Azure CI Summary is updated with their statuses
  4. After all runs are finished, Azure CI Summary will be marked as successful or failed depending on the runs' statuses
  5. If a rerun is desired, Trigger Azure CI can be rerun, which will overwrite the existing Azure CI Summary with a new one
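
Steps 2 through 4 of the flow above map naturally onto the GitHub check runs REST API (`POST /repos/{owner}/{repo}/check-runs` to create, `PATCH /repos/{owner}/{repo}/check-runs/{check_run_id}` to update). The sketch below only builds the JSON bodies for those two requests and sends nothing. The helper names, the per-run status strings, and the pipeline names in the usage note are hypothetical; the actual dispatcher and template may structure this differently.

```python
from typing import Dict, Optional

CHECK_NAME = "Azure CI Summary"


def create_check_payload(head_sha: str) -> dict:
    """Body for creating the summary check when runs are kicked off (step 2)."""
    return {"name": CHECK_NAME, "head_sha": head_sha, "status": "in_progress"}


def update_check_payload(run_statuses: Dict[str, Optional[str]]) -> dict:
    """Body for updating the summary as runs report back (steps 3 and 4).

    run_statuses maps a run name to its final status, or None while the
    run is still in progress (an assumed representation).
    """
    lines = [
        f"- {name}: {status or 'pending'}"
        for name, status in sorted(run_statuses.items())
    ]
    payload = {"output": {"title": CHECK_NAME, "summary": "\n".join(lines)}}

    # Only conclude the check once every run has reported a status.
    finished = bool(run_statuses) and all(
        status is not None for status in run_statuses.values()
    )
    if finished:
        all_ok = all(status == "succeeded" for status in run_statuses.values())
        payload["status"] = "completed"
        payload["conclusion"] = "success" if all_ok else "failure"
    return payload
```

With `{"pipeline-a": "succeeded", "pipeline-b": None}` the payload carries no conclusion yet; once both entries are set, it includes `"status": "completed"` and a `conclusion`.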

Sample runs:
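Branch PR: #561 - summary (https://github.com/ROCm/rocm-libraries/pull/561/checks?check_run_id=45813198366)
Fork PR: #600 - summary (https://github.com/ROCm/rocm-libraries/pull/600/checks?check_run_id=45815418773)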

Other stuff:

  • Copied the GitHub CLI fallback logic from pr_category_label.py to pr_detect_changed_subtrees.py
    • To fix dispatch action not running on PRs with infinite diffs
  • Removed paths-ignore from dispatch action
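
A fallback of the kind mentioned in the first bullet might look roughly like the following. This is a sketch under assumptions, not the code in pr_category_label.py or pr_detect_changed_subtrees.py: the function names are hypothetical, and the choice of `gh pr view --json files` as the fallback command is a guess at one way to list changed files with the GitHub CLI.

```python
import subprocess
from typing import Callable, List


def changed_files_via_gh(pr_number: int) -> List[str]:
    """List a PR's changed file paths using the GitHub CLI (assumed command)."""
    result = subprocess.run(
        ["gh", "pr", "view", str(pr_number),
         "--json", "files", "--jq", ".files[].path"],
        check=True,
        capture_output=True,
        text=True,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]


def changed_files(
    pr_number: int,
    primary: Callable[[int], List[str]],
    fallback: Callable[[int], List[str]] = changed_files_via_gh,
) -> List[str]:
    """Try the primary lookup first; use the fallback if it raises."""
    try:
        return primary(pr_number)
    except Exception:
        # Hypothetical trigger: the primary lookup cannot handle the diff.
        return fallback(pr_number)
```

Passing the lookups in as callables keeps the fallback decision separate from how each lookup talks to GitHub, which also makes the behaviour easy to exercise without network access.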

@jayhawk-commits jayhawk-commits left a comment

Summary looks great and has additional documentation to guide developers. This is a great first step.

My primary concern is the infinite loop, followed by the potential for code reuse in the stage sequence. The recurring pattern of bash code blocks running curl requests could also be refactored, but that can be a future improvement.

Comment thread .azuredevops/hipblas-common.yml Outdated
Comment thread .azuredevops/templates/report-summary-check.yml
Comment thread .github/workflows/azure-ci-dispatcher.yml Outdated
@danielsu-amd danielsu-amd force-pushed the users/danielsu/az-dispatch-nonpolling branch from 7a4f3ce to 372791d Compare July 11, 2025 20:27
bstefanuk pushed a commit that referenced this pull request Jul 11, 2025
@danielsu-amd

Created a new PR to test the stage template and loop limit: #610
https://github.com/ROCm/rocm-libraries/pull/610/checks?check_run_id=45827685201

@jayhawk-commits jayhawk-commits left a comment

lgtm

@danielsu-amd danielsu-amd merged commit 2e26752 into develop Jul 11, 2025
6 checks passed
@danielsu-amd danielsu-amd deleted the users/danielsu/az-dispatch-nonpolling branch July 11, 2025 20:45
danielsu-amd added a commit that referenced this pull request Jul 11, 2025
danielsu-amd added a commit that referenced this pull request Jul 11, 2025
Pick Azure CI changes from develop into release-staging/rocm-rel-7.0

- #609
- #607
- #611
- #613
ammallya pushed a commit that referenced this pull request Jul 22, 2025
[ROCm/hipSPARSE commit: 39b2356]
ammallya pushed a commit that referenced this pull request Jul 31, 2025