Add pre-merge-approval for execute_pre_merge #1471

Merged
adobrzyn merged 1 commit into vllm-project:main from bmyrcha:bmyrcha/pre-merge-approval
May 21, 2026

Conversation

bmyrcha (Collaborator) commented May 21, 2026

No description provided.

Signed-off-by: Bartosz Myrcha <bartosz.myrcha@intel.com>
Copilot AI review requested due to automatic review settings May 21, 2026 10:06
@adobrzyn adobrzyn merged commit dc459b8 into vllm-project:main May 21, 2026
1 of 2 checks passed
@github-actions

🚧 CI Blocked

The main CI workflow was not started for the following reason:

Your branch is behind the base branch. Please merge or rebase to get the latest changes.

@bmyrcha bmyrcha review requested due to automatic review settings May 21, 2026 10:28
adobrzyn added a commit to adobrzyn/vllm-gaudi-bmyrcha that referenced this pull request May 21, 2026
Adds 'step-security/harden-runner@v2.19.3' (SHA-pinned) as the first
step of every CI job that consumes secrets.HF_TOKEN, configured with
'egress-policy: block' and a curated allow-list of endpoints that the
current build + test pipeline actually needs.

Allow-list (derived from reading .github/Dockerfile.ci, the workflow
files, and tests/full_tests/ci_e2e_discoverable_tests.sh):

  GitHub Actions infrastructure:
    api.github.com, github.com, codeload.github.com,
    objects.githubusercontent.com, raw.githubusercontent.com,
    release-assets.githubusercontent.com,
    *.actions.githubusercontent.com,
    results-receiver.actions.githubusercontent.com,
    ghcr.io, pkg-containers.githubusercontent.com,
    *.blob.core.windows.net  (cache / artifacts)

  Docker base image (build phase):
    vault.habana.ai  (Habana Gaudi base)

  Python packages (build + test phase):
    pypi.org, files.pythonhosted.org,
    download.pytorch.org  (torchaudio CPU wheel)

  Model weights (test phase):
    huggingface.co, cdn-lfs.huggingface.co, cdn-lfs.hf.co,
    cdn-lfs-us-1.hf.co, cas-bridge.xethub.hf.co, xet-lfs-us-1.hf.co

Because every test container is launched with '--network=host', the
host-level eBPF filter installed by harden-runner sees the container's
traffic and enforces the policy on it, so no per-container
instrumentation is needed.
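
A minimal sketch of what such a first step might look like in one job,
assuming the action's documented `egress-policy` and `allowed-endpoints`
inputs. The runner label, the `:443` ports, the shortened endpoint list,
and the `<pinned-commit-sha>` placeholder are illustrative assumptions,
not values taken from this commit:

```yaml
jobs:
  e2e:
    runs-on: "<runner-label>"  # placeholder; the real label is not shown here
    steps:
      # First step of the job, as described above.
      - name: Harden runner
        uses: step-security/harden-runner@<pinned-commit-sha>  # v2.19.3
        with:
          egress-policy: block
          # Abbreviated allow-list; ports are assumed to be 443.
          allowed-endpoints: >
            api.github.com:443
            github.com:443
            vault.habana.ai:443
            pypi.org:443
            files.pythonhosted.org:443
            huggingface.co:443
```

Any host missing from `allowed-endpoints` is denied under
`egress-policy: block`, which is why the full list above has to cover
the build and test phases.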

This is defense-in-depth, layered on top of:
  - pre-merge-trigger approval gate (vllm-project#1471)
  - approved-workflow environment for HF_TOKEN (vllm-project#1473)

Together these three changes mean a planted payload in a PR cannot:
  1. run at all without maintainer approval        (vllm-project#1471)
  2. receive HF_TOKEN without environment approval (vllm-project#1473)
  3. exfiltrate to an attacker-controlled host     (this PR)

If anything legitimate gets blocked, the harden-runner check run
will identify the denied host, and we will add it to the allow-list
in a follow-up.

Affected jobs (15 - same set as vllm-project#1473):
  pre-merge.yaml:           hpu_unit_tests, hpu_pd_tests, hpu_perf_tests,
                            hpu_dp_tests, e2e, calibration_tests
  hourly-ci.yaml:           run_unit_tests, e2e, run_data_parallel_test,
                            run_pd_disaggregate_test
  create-release-branch:    run_unit_tests, e2e, run_data_parallel_test,
                            run_pd_disaggregate_test, run_hpu_perf_tests

Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
adobrzyn added a commit that referenced this pull request May 21, 2026
…1473)

## Summary

Adds `environment: approved-workflow` to every job that consumes
`secrets.HF_TOKEN` across the three CI workflows. Together with the
existing approval gate in `pre-merge-trigger.yaml` (`environment:
pre-merge-approval`, added in #1471), this completes the two-layer
protection model:

```
PR opened
  -> pre-merge-trigger `gate` job: pauses for required reviewer (approval #1)
  -> on approval, pre-merge.yaml is dispatched
  -> downstream secret-using jobs resolve HF_TOKEN from the
     `approved-workflow` environment (no second per-job approval)
```
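
The flow above could be wired up roughly as follows. This is a sketch
under assumptions, not the contents of `pre-merge-trigger.yaml`: the
`pull_request_target` trigger, the `dispatch` job, the `pr_number`
input, the runner labels, and the use of `gh workflow run` are all
hypothetical. Only the `gate` job name and the `pre-merge-approval`
environment come from the description.

```yaml
# pre-merge-trigger.yaml (hypothetical sketch)
on:
  pull_request_target:  # assumed trigger

permissions:
  actions: write  # assumed; needed by the dispatch step below

jobs:
  gate:
    runs-on: ubuntu-latest  # assumed runner
    # Required reviewers on this environment pause the job (approval #1).
    environment: pre-merge-approval
    steps:
      - run: echo "Approved by a required reviewer"

  dispatch:
    needs: gate  # starts only after the gate job has been approved
    runs-on: ubuntu-latest  # assumed runner
    steps:
      - name: Dispatch pre-merge.yaml
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          gh workflow run pre-merge.yaml \
            --repo "${{ github.repository }}" \
            --ref main \
            -f pr_number="${{ github.event.pull_request.number }}"
```

In this sketch `pre-merge.yaml` would need a `workflow_dispatch` trigger
that declares a `pr_number` input. Dispatching on `main` keeps the
downstream jobs on a ref that the `approved-workflow` deployment-branch
rule accepts.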

## Why

With `HF_TOKEN` previously at repo-secret scope, any matrix entry of any
e2e/test job had direct access the moment CI started. The recent
malicious fork PR exfiltrated it via an auto-discovered `run_*`
function. After this change, the token is only released from a GitHub
Environment that a maintainer-controlled deployment-branch rule
restricts to `main` / `releases/**`, and only after the upstream gate
has approved the dispatch.

We deliberately add the environment only on jobs that actually use the
secret (15 jobs). Helper jobs (`gatekeeper`, `discover_*`, `retrieve_*`,
`pre-commit`, `post-comment`, `cleanup_*`, `build_nixl_dockerfile`,
`check_dockerfile_changes`, `prepare-release-branch`,
`summarize_and_notify`, `setup_and_build`,
`store_last_stable_vllm_commit`) do not touch HF_TOKEN and are not
modified, to avoid pointless extra gate evaluations.

## Affected jobs (15)

- `pre-merge.yaml`: `hpu_unit_tests`, `hpu_pd_tests`, `hpu_perf_tests`,
`hpu_dp_tests`, `e2e`, `calibration_tests`
- `hourly-ci.yaml`: `run_unit_tests`, `e2e`, `run_data_parallel_test`,
`run_pd_disaggregate_test`
- `create-release-branch.yaml`: `run_unit_tests`, `e2e`,
`run_data_parallel_test`, `run_pd_disaggregate_test`,
`run_hpu_perf_tests`

## Diff

+15 lines, 0 deletions. Each touched job gets exactly one new line:
`environment: approved-workflow`, inserted immediately after `runs-on:`.
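
For illustration, a touched job would then look roughly like this. Only
the job name and the `environment:` line come from the description; the
runner label, the step, and the script name are placeholders:

```yaml
jobs:
  hpu_unit_tests:
    runs-on: "<runner-label>"  # placeholder
    environment: approved-workflow  # the single added line
    steps:
      - name: Run tests  # hypothetical step
        env:
          # Resolved from the approved-workflow environment once the
          # repository-level secret has been deleted.
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: ./run_tests.sh  # hypothetical script
```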

## Required repo configuration (before this PR can be merged safely)

1. Settings → Environments → create environment **`approved-workflow`**.
2. Add **`HF_TOKEN`** as an environment secret (the rotated value).
3. **No required reviewers** on this environment (the upstream
`pre-merge-approval` gate already enforces approval; adding reviewers
here would prompt once per job).
4. **Deployment branches and tags**: Selected branches → `main`,
`releases/**`. Prevents a fork PR from claiming the environment from a
non-trusted ref.
5. **Delete** `HF_TOKEN` from repository-level secrets so the
environment value is the only source.

## Testing

Validated end-to-end against `bmyrcha/vllm-gaudi` first using a benign
fork PR. With the two environments configured as above, the gate paused
as expected, jobs received the secret after approval without a second
prompt, and a deliberately mis-authored downstream PR could not reach
the secret.

Close-cross-ref: builds on #1471.

Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
12010486 pushed a commit to 12010486/vllm-gaudi that referenced this pull request May 21, 2026
Signed-off-by: Bartosz Myrcha <bartosz.myrcha@intel.com>
Signed-off-by: 12010486 <silvia.colabrese@intel.com>
12010486 pushed a commit to 12010486/vllm-gaudi that referenced this pull request May 21, 2026
…llm-project#1473)


Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
Signed-off-by: 12010486 <silvia.colabrese@intel.com>
mgawarkiewicz-intel pushed a commit that referenced this pull request May 25, 2026
(cherry picked from commit dc459b8)

Signed-off-by: Bartosz Myrcha <bartosz.myrcha@intel.com>
Co-authored-by: Bartosz Myrcha <bartosz.myrcha@intel.com>
mgawarkiewicz-intel pushed a commit that referenced this pull request May 25, 2026
#1491)

…#1473)



(cherry picked from commit ca2d952)

Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>