ci: route HF_TOKEN-using jobs through approved-workflow environment (…#1491

Merged
mgawarkiewicz-intel merged 1 commit into releases/v0.21.0 from adobrzyn/v0.21.0-port-1473 on May 25, 2026

Conversation

@adobrzyn
Collaborator

#1473)

Adds environment: approved-workflow to every job that consumes secrets.HF_TOKEN across the three CI workflows. Together with the existing approval gate in pre-merge-trigger.yaml (environment: pre-merge-approval, added in #1471), this completes the two-layer protection model:

PR opened
  -> pre-merge-trigger `gate` job: pauses for required reviewer (approval #1)
  -> on approval, pre-merge.yaml is dispatched
  -> downstream secret-using jobs resolve HF_TOKEN from the
     `approved-workflow` environment (no second per-job approval)
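
The first two steps of this flow could look roughly like the sketch below. It is an illustration only, not the contents of pre-merge-trigger.yaml: the `pull_request_target` trigger, the `ubuntu-latest` runner, the `gh workflow run` dispatch step, and the `pr_number` input are all assumptions. The only names taken from this PR are the `gate` job, the `pre-merge-approval` environment, and `pre-merge.yaml`.

```yaml
# Hypothetical sketch of the gate layer; the real pre-merge-trigger.yaml may differ.
name: pre-merge-trigger (sketch)

on:
  pull_request_target:            # assumption: the real trigger event is not shown in this PR
    types: [opened, synchronize, reopened]

permissions:
  actions: write                  # required to dispatch another workflow with the job token

jobs:
  gate:
    runs-on: ubuntu-latest        # placeholder runner label
    # With required reviewers configured on this environment,
    # the job pauses here until a maintainer approves it.
    environment: pre-merge-approval
    steps:
      - name: Dispatch pre-merge.yaml after approval
        env:
          GH_TOKEN: ${{ github.token }}
          BASE_REF: ${{ github.base_ref }}
          PR_NUMBER: ${{ github.event.pull_request.number }}   # hypothetical input
        run: |
          # Values are passed through env vars rather than interpolated into the script.
          gh workflow run pre-merge.yaml \
            --repo "$GITHUB_REPOSITORY" \
            --ref "$BASE_REF" \
            -f pr_number="$PR_NUMBER"
```

In this sketch the dispatched run executes on the base branch, so its ref would match the `main` / `releases/**` rule on `approved-workflow`. Whether the actual trigger dispatches this way is not stated in the PR.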

With HF_TOKEN previously at repo-secret scope, any matrix entry of any e2e/test job had direct access the moment CI started. The recent malicious fork PR exfiltrated it via an auto-discovered run_* function. After this change, the token is only released from a GitHub Environment that a maintainer-controlled deployment-branch rule restricts to main / releases/**, and only after the upstream gate has approved the dispatch.

We deliberately add the environment only on jobs that actually use the secret (15 jobs). Helper jobs (gatekeeper, discover_*, retrieve_*, pre-commit, post-comment, cleanup_*, build_nixl_dockerfile, check_dockerfile_changes, prepare-release-branch, summarize_and_notify, setup_and_build, store_last_stable_vllm_commit) do not touch HF_TOKEN and are not modified, to avoid pointless extra gate evaluations.

  • pre-merge.yaml: hpu_unit_tests, hpu_pd_tests, hpu_perf_tests, hpu_dp_tests, e2e, calibration_tests
  • hourly-ci.yaml: run_unit_tests, e2e, run_data_parallel_test, run_pd_disaggregate_test
  • create-release-branch.yaml: run_unit_tests, e2e, run_data_parallel_test, run_pd_disaggregate_test, run_hpu_perf_tests

+15 lines, 0 deletions. Each touched job gets exactly one new line: environment: approved-workflow, inserted immediately after runs-on:.
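
As a rough illustration of the shape of that change, a touched job might look like the fragment below. The `e2e` job name and the `environment: approved-workflow` line come from this PR. The runner label, step name, and script path are placeholders, since the real job bodies are not shown here.

```yaml
jobs:
  e2e:
    runs-on: ubuntu-latest            # placeholder; the actual runner label is not shown in this PR
    environment: approved-workflow    # the single added line, directly after runs-on:
    steps:
      - name: Run tests that need the token   # hypothetical step
        env:
          # With the repository-level secret deleted, this expression
          # resolves to the environment secret of approved-workflow.
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: ./run_tests.sh           # hypothetical script path
```

The `secrets.HF_TOKEN` expression itself is unchanged. Only the scope it resolves from moves, which is consistent with the diff adding lines and deleting none.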

  1. Settings → Environments → create environment approved-workflow.
  2. Add HF_TOKEN as an environment secret (the rotated value).
  3. No required reviewers on this environment (the upstream pre-merge-approval gate already enforces approval; adding reviewers here would prompt once per job).
  4. Deployment branches and tags: Selected branches → main, releases/**. Prevents a fork PR from claiming the environment from a non-trusted ref.
  5. Delete HF_TOKEN from repository-level secrets so the environment value is the only source.

Validated end-to-end against bmyrcha/vllm-gaudi first using a benign fork PR. With the two environments configured as above, the gate paused as expected, jobs received the secret after approval without a second prompt, and a deliberately mis-authored downstream PR could not reach the secret.

Close-cross-ref: builds on #1471.

Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
(cherry picked from commit ca2d952)
Copilot AI review requested due to automatic review settings May 25, 2026 08:07
@adobrzyn adobrzyn had a problem deploying to pre-merge-approval May 25, 2026 08:07 — with GitHub Actions Error
Contributor

Copilot AI left a comment

Pull request overview

Routes all jobs that reference secrets.HF_TOKEN through the approved-workflow GitHub Environment, so the secret is only released after the upstream approval gate and only for trusted refs, reducing the blast radius of CI runs on untrusted PR content.

Changes:

  • Added environment: approved-workflow to HF_TOKEN-consuming jobs in pre-merge.yaml.
  • Added environment: approved-workflow to HF_TOKEN-consuming jobs in hourly-ci.yaml.
  • Added environment: approved-workflow to HF_TOKEN-consuming jobs in create-release-branch.yaml.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| .github/workflows/pre-merge.yaml | Adds environment: approved-workflow to all pre-merge jobs that pass secrets.HF_TOKEN into containers. |
| .github/workflows/hourly-ci.yaml | Adds environment: approved-workflow to hourly jobs that pass secrets.HF_TOKEN into containers. |
| .github/workflows/create-release-branch.yaml | Adds environment: approved-workflow to release-branch test jobs that pass secrets.HF_TOKEN into containers. |

@mgawarkiewicz-intel mgawarkiewicz-intel merged commit 95b5db5 into releases/v0.21.0 May 25, 2026
2 of 3 checks passed