-
Notifications
You must be signed in to change notification settings - Fork 2.6k
chore: add structure-check framework to prevent prebuilt binaries #6031
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Note I intentionally didn't remove the existing committed binaries in this PR as I think that should be a distinct followup so that merging this PR is obviously safe. |
2ee29c4 to
853da7c
Compare
| uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # pin@v4 | ||
|
|
||
| - uses: actions-rust-lang/setup-rust-toolchain@v1 | ||
| - name: Set up Rust toolchain |
Check failure
Code scanning / Semgrep OSS
Insecure GitHub Actions: Third-Party Action Not Pinned to Commit SHA Error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not new to this PR though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this maybe slipped in before this check was active? not sure why it wouldn't have been flagged before.
In any case, pinning is as simple as
npx pin-github-action .github/workflows/ci.yml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces a new CI framework for repository structure validation that runs before builds. It generalizes the existing rustfmt check and adds a security check for prebuilt binaries, making it easy to add new structure checks in the future.
Key Changes
- Adds extensible structure check framework in
scripts/ci/structure/ - Implements security check to prevent prebuilt binaries from being committed
- Moves Rust formatting check into the new framework
- Updates CI to run structure checks as a dedicated job
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
.github/workflows/ci.yml |
Renames rust-format job to structure-check and adds hermit activation to run just structure-check |
Justfile |
Adds structure-check target that discovers and runs all scripts in scripts/ci/structure/, integrates into check-everything |
scripts/ci/structure/check-rust-format.sh |
Extracts Rust formatting check into a standalone script that runs cargo fmt --check |
scripts/ci/structure/check-prebuilt-binaries.sh |
Implements comprehensive security check for prebuilt binaries using file extensions and content analysis |
scripts/ci/structure/README.md |
Documents the structure check framework with guidelines for adding new checks |
|
|
||
| for file in "${!allowlisted_files[@]}"; do | ||
| echo " [ALLOWLISTED] $file" | ||
| if [ -f "$file" ]; then |
Copilot
AI
Dec 9, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hardcoded /usr/bin/file path may fail on systems where file is installed elsewhere (e.g., macOS with Homebrew). Consider using file without the full path, which will use the system PATH.
| echo "" | ||
|
|
||
| for file in "${!violations[@]}"; do | ||
| echo " [VIOLATION] $file" |
Copilot
AI
Dec 9, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hardcoded /usr/bin/file path may fail on systems where file is installed elsewhere (e.g., macOS with Homebrew). Consider using file without the full path, which will use the system PATH.
Justfile
Outdated
| for script in $scripts; do | ||
| echo "" | ||
| echo "========================================" | ||
| echo "Running: $script" | ||
| echo "========================================" | ||
| $script || exit 1 | ||
| done |
Copilot
AI
Dec 9, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The loop variable $scripts should be quoted to prevent word splitting issues if any script path contains spaces. Change for script in $scripts to while read -r script; do ... done <<< "$scripts" or use an array.
| for script in $scripts; do | |
| echo "" | |
| echo "========================================" | |
| echo "Running: $script" | |
| echo "========================================" | |
| $script || exit 1 | |
| done | |
| while read -r script; do | |
| echo "" | |
| echo "========================================" | |
| echo "Running: $script" | |
| echo "========================================" | |
| $script || exit 1 | |
| done <<< "$scripts" |
Justfile
Outdated
| echo "Running repository structure checks..." | ||
|
|
||
| # Find and sort all executable scripts | ||
| scripts=$(find scripts/ci/structure -name "*.sh" -type f -executable 2>/dev/null | sort) |
Copilot
AI
Dec 9, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The -executable flag is a GNU extension that may not work on BSD systems (macOS uses BSD find by default). Consider using -perm +111 for better portability, or use: find scripts/ci/structure -name "*.sh" -type f -perm +111
| scripts=$(find scripts/ci/structure -name "*.sh" -type f -executable 2>/dev/null | sort) | |
| scripts=$(find scripts/ci/structure -name "*.sh" -type f -perm +111 2>/dev/null | sort) |
Justfile
Outdated
| echo "========================================" | ||
| echo "Running: $script" | ||
| echo "========================================" | ||
| $script || exit 1 |
Copilot
AI
Dec 9, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variable $script should be quoted to handle filenames with spaces safely. Change to "$script" to prevent word splitting.
| $script || exit 1 | |
| "$script" || exit 1 |
| else | ||
| violations["$filepath"]=1 | ||
| fi | ||
| fi |
Copilot
AI
Dec 9, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hardcoded /usr/bin/file path may fail on systems where file is installed elsewhere (e.g., macOS with Homebrew). Consider using file without the full path, which will use the system PATH.
| else | ||
| violations["$filepath"]=1 | ||
| fi | ||
| fi |
Copilot
AI
Dec 9, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of xargs with printf '%s\n' "${files_to_check[@]}" and /usr/bin/file allows file paths with spaces, newlines, or leading dashes to be split or interpreted as options, so extensionless binaries with such names can bypass this prebuilt-binary check. An attacker could commit a malicious executable with a crafted filename that is never scanned correctly by file, undermining the intended supply-chain protection. To fix this, use NUL-delimited paths (e.g., git ls-files -z / xargs -0) and pass -- before file paths, or avoid xargs entirely by looping over the array directly.
| echo "" | ||
| echo "To remove these files:" | ||
| for file in "${!violations[@]}"; do | ||
| echo " git rm \"$file\"" |
Copilot
AI
Dec 9, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The suggested remediation commands git rm "$file" interpolate untrusted file paths directly into a shell command string, so a malicious filename containing shell metacharacters (e.g., $(...) or backticks) could trigger arbitrary command execution on a developer machine if copied and executed as-is. An attacker can open a PR with such a filename, cause this check to fail, and rely on a maintainer copying the provided git rm lines from CI logs, leading to remote code execution. To mitigate this, shell-escape paths before printing (e.g., printf 'git rm -- %q\n' "$file") or print the file names separately and instruct developers to type git rm -- manually.
| echo " git rm \"$file\"" | |
| printf ' git rm -- %q\n' "$file" |
|
OK most of the Copilot review comments are basically "shell script is a very suboptimal programming language" and I agree. One thing we could do instead is use OpenSSF - probably a good idea in general, but this will run up against ossf/scorecard#1270 unless we remove the binaries first. |
|
Another alternative (one I like a lot) is to adopt the xtask pattern here. I use it in bootc for example. This code would be a lot nicer in Rust (and using https://docs.rs/xshell/latest/xshell/ ). |
Add a new CI framework for repository structure validation that runs before any builds, with initial checks for prebuilt binaries and Rust formatting. Generalize the `rustfmt` job and add scripts/ci/structure/ directory for extensible static checks. Wire it up in `Justfile`. The framework is designed to be easily extended - just add an executable .sh script to scripts/ci/structure/ and it will automatically run in CI. Assisted-By: Claude Code (Sonnet 4.5) Signed-off-by: Colin Walters <walters@verbum.org>
853da7c to
e36b1b9
Compare
|
I noticed there's discussion about filtering the git repository and moving it in https://discord.com/channels/1287729918100246654/1408153538537721966/1448751127175893024 I hope we can drop the pre-generated binaries from the git repository at the same time. Also I have another medium-spicy request: if we're reworking the git repository, can we create a "goose-core" distinct from "goose" that is just the CLI and HTTP APIs? Closer to Gemini CLI and Claude Code etc. This should also drop the default dependency on libx11. And things like the blog posts and videos would also go in goose and not goose-core. And...I think we should have a higher review/code quality bar for goose-core. |
A while ago I was working on goose and found they'd committed binary files into their git repo, and worked on a PR to block that: block/goose#6031 However at the time I'd investigated OpenSSF (which has a check for this among other things) but didn't quite get it to work. After some iteration with my agent, this model seems to work well. That said I think what I'd like to do is propose something like this to upstream OpenSSF - their default action doesn't check *each* commit for example, but we want to deny binaries in all commits. This also implements a model where the check only fails if the score *regresses*. Assisted-by: OpenCode (Opus 4.5) Signed-off-by: Colin Walters <walters@verbum.org>
jamadeo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @cgwalters this looks great. We do run the OpenSSF scorecard, but it doesn't block on any regressions: https://scorecard.dev/viewer/?uri=github.com%2Fblock%2Fgoose (lots of room for improvement there, certainly)
I think your suggestions make a lot of sense, and while we don't have any immediate plans to split the repo, there are good candidates for separate projects (providers, e.g.) and good reasons to split built in MCPs (x11, as you said).
| uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # pin@v4 | ||
|
|
||
| - uses: actions-rust-lang/setup-rust-toolchain@v1 | ||
| - name: Set up Rust toolchain |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this maybe slipped in before this check was active? not sure why it wouldn't have been flagged before.
In any case, pinning is as simple as
npx pin-github-action .github/workflows/ci.yml
|
Since I wrote this PR I ended up doing https://github.com/bootc-dev/actions/blob/main/openssf-scorecard/action.yml - could either use directly or copy. A key thing here is it's not just about binaries, but about verifying the OpenSSF scorecard doesn't go backwards (on any commit). That said I may see about trying to upstream some of this into the openssf actions. |
That makes sense to me. We've had several issues that would have been caught by looking at a regression in OpenSSF scorecard checks. Do you think it makes sense to merge what you've done here? or go with the scorecard regression check approach? |
Summary
Add a new CI framework for repository structure validation that runs before any builds, with initial checks for prebuilt binaries and Rust formatting.
Generalize the
rustfmtjob and add scripts/ci/structure/ directory for extensible static checks. Wire it up inJustfile.The framework is designed to be easily extended - just add an executable .sh script to scripts/ci/structure/ and it will automatically run in CI.
Assisted-By: Claude Code (Sonnet 4.5)
Type of Change
AI Assistance
Testing
This is a test, but I also cross checked that adding a new binary triggered the warning, etc.
Related Issues
Relates to #880 (comment)