Skip to content

Conversation

@cgwalters
Copy link
Contributor

Summary

Add a new CI framework for repository structure validation that runs before any builds, with initial checks for prebuilt binaries and Rust formatting.

Generalize the rustfmt job and add scripts/ci/structure/ directory for extensible static checks. Wire it up in Justfile.

The framework is designed to be easily extended - just add an executable .sh script to scripts/ci/structure/ and it will automatically run in CI.

Assisted-By: Claude Code (Sonnet 4.5)

Type of Change

  • Feature
  • Bug fix
  • Refactor / Code quality
  • Performance improvement
  • Documentation
  • Tests
  • Security fix
  • Build / Release
  • Other (specify below)

AI Assistance

  • This PR was created or reviewed with AI assistance

Testing

This is a test, but I also cross checked that adding a new binary triggered the warning, etc.

Related Issues

Relates to #880 (comment)

Copilot AI review requested due to automatic review settings December 9, 2025 23:05
@cgwalters cgwalters mentioned this pull request Dec 9, 2025
@cgwalters
Copy link
Contributor Author

Note I intentionally didn't remove the existing committed binaries in this PR as I think that should be a distinct followup so that merging this PR is obviously safe.

uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # pin@v4

- uses: actions-rust-lang/setup-rust-toolchain@v1
- name: Set up Rust toolchain

Check failure

Code scanning / Semgrep OSS

Insecure GitHub Actions: Third-Party Action Not Pinned to Commit SHA Error

Insecure GitHub Actions: Third-Party Action Not Pinned to Commit SHA
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not new to this PR though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this maybe slipped in before this check was active? not sure why it wouldn't have been flagged before.

In any case, pinning is as simple as

npx pin-github-action .github/workflows/ci.yml

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a new CI framework for repository structure validation that runs before builds. It generalizes the existing rustfmt check and adds a security check for prebuilt binaries, making it easy to add new structure checks in the future.

Key Changes

  • Adds extensible structure check framework in scripts/ci/structure/
  • Implements security check to prevent prebuilt binaries from being committed
  • Moves Rust formatting check into the new framework
  • Updates CI to run structure checks as a dedicated job

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
.github/workflows/ci.yml Renames rust-format job to structure-check and adds hermit activation to run just structure-check
Justfile Adds structure-check target that discovers and runs all scripts in scripts/ci/structure/, integrates into check-everything
scripts/ci/structure/check-rust-format.sh Extracts Rust formatting check into a standalone script that runs cargo fmt --check
scripts/ci/structure/check-prebuilt-binaries.sh Implements comprehensive security check for prebuilt binaries using file extensions and content analysis
scripts/ci/structure/README.md Documents the structure check framework with guidelines for adding new checks


for file in "${!allowlisted_files[@]}"; do
echo " [ALLOWLISTED] $file"
if [ -f "$file" ]; then
Copy link

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoded /usr/bin/file path may fail on systems where file is installed elsewhere (e.g., macOS with Homebrew). Consider using file without the full path, which will use the system PATH.

Copilot uses AI. Check for mistakes.
echo ""

for file in "${!violations[@]}"; do
echo " [VIOLATION] $file"
Copy link

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoded /usr/bin/file path may fail on systems where file is installed elsewhere (e.g., macOS with Homebrew). Consider using file without the full path, which will use the system PATH.

Copilot uses AI. Check for mistakes.
Justfile Outdated
Comment on lines 21 to 27
for script in $scripts; do
echo ""
echo "========================================"
echo "Running: $script"
echo "========================================"
$script || exit 1
done
Copy link

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loop variable $scripts should be quoted to prevent word splitting issues if any script path contains spaces. Change for script in $scripts to while read -r script; do ... done <<< "$scripts" or use an array.

Suggested change
for script in $scripts; do
echo ""
echo "========================================"
echo "Running: $script"
echo "========================================"
$script || exit 1
done
while read -r script; do
echo ""
echo "========================================"
echo "Running: $script"
echo "========================================"
$script || exit 1
done <<< "$scripts"

Copilot uses AI. Check for mistakes.
Justfile Outdated
echo "Running repository structure checks..."

# Find and sort all executable scripts
scripts=$(find scripts/ci/structure -name "*.sh" -type f -executable 2>/dev/null | sort)
Copy link

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The -executable flag is a GNU extension that may not work on BSD systems (macOS uses BSD find by default). Consider using -perm +111 for better portability, or use: find scripts/ci/structure -name "*.sh" -type f -perm +111

Suggested change
scripts=$(find scripts/ci/structure -name "*.sh" -type f -executable 2>/dev/null | sort)
scripts=$(find scripts/ci/structure -name "*.sh" -type f -perm +111 2>/dev/null | sort)

Copilot uses AI. Check for mistakes.
Justfile Outdated
echo "========================================"
echo "Running: $script"
echo "========================================"
$script || exit 1
Copy link

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable $script should be quoted to handle filenames with spaces safely. Change to "$script" to prevent word splitting.

Suggested change
$script || exit 1
"$script" || exit 1

Copilot uses AI. Check for mistakes.
else
violations["$filepath"]=1
fi
fi
Copy link

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoded /usr/bin/file path may fail on systems where file is installed elsewhere (e.g., macOS with Homebrew). Consider using file without the full path, which will use the system PATH.

Copilot uses AI. Check for mistakes.
else
violations["$filepath"]=1
fi
fi
Copy link

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of xargs with printf '%s\n' "${files_to_check[@]}" and /usr/bin/file allows file paths with spaces, newlines, or leading dashes to be split or interpreted as options, so extensionless binaries with such names can bypass this prebuilt-binary check. An attacker could commit a malicious executable with a crafted filename that is never scanned correctly by file, undermining the intended supply-chain protection. To fix this, use NUL-delimited paths (e.g., git ls-files -z / xargs -0) and pass -- before file paths, or avoid xargs entirely by looping over the array directly.

Copilot uses AI. Check for mistakes.
echo ""
echo "To remove these files:"
for file in "${!violations[@]}"; do
echo " git rm \"$file\""
Copy link

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The suggested remediation commands git rm "$file" interpolate untrusted file paths directly into a shell command string, so a malicious filename containing shell metacharacters (e.g., $(...) or backticks) could trigger arbitrary command execution on a developer machine if copied and executed as-is. An attacker can open a PR with such a filename, cause this check to fail, and rely on a maintainer copying the provided git rm lines from CI logs, leading to remote code execution. To mitigate this, shell-escape paths before printing (e.g., printf 'git rm -- %q\n' "$file") or print the file names separately and instruct developers to type git rm -- manually.

Suggested change
echo " git rm \"$file\""
printf ' git rm -- %q\n' "$file"

Copilot uses AI. Check for mistakes.
@cgwalters
Copy link
Contributor Author

OK most of the Copilot review comments are basically "shell script is a very suboptimal programming language" and I agree.

One thing we could do instead is use OpenSSF - probably a good idea in general, but this will run up against ossf/scorecard#1270 unless we remove the binaries first.

@cgwalters
Copy link
Contributor Author

Another alternative (one I like a lot) is to adopt the xtask pattern here. I use it in bootc for example.

This code would be a lot nicer in Rust (and using https://docs.rs/xshell/latest/xshell/ ).

Add a new CI framework for repository structure validation that runs
before any builds, with initial checks for prebuilt binaries and Rust
formatting.

Generalize the `rustfmt` job and add scripts/ci/structure/ directory
for extensible static checks. Wire it up in `Justfile`.

The framework is designed to be easily extended - just add an executable
.sh script to scripts/ci/structure/ and it will automatically run in CI.

Assisted-By: Claude Code (Sonnet 4.5)
Signed-off-by: Colin Walters <walters@verbum.org>
@cgwalters
Copy link
Contributor Author

I noticed there's discussion about filtering the git repository and moving it in https://discord.com/channels/1287729918100246654/1408153538537721966/1448751127175893024

I hope we can drop the pre-generated binaries from the git repository at the same time.

Also I have another medium-spicy request: if we're reworking the git repository, can we create a "goose-core" distinct from "goose" that is just the CLI and HTTP APIs? Closer to Gemini CLI and Claude Code etc.

This should also drop the default dependency on libx11.

And things like the blog posts and videos would also go in goose and not goose-core.

And...I think we should have a higher review/code quality bar for goose-core.

@jamadeo jamadeo self-assigned this Dec 17, 2025
cgwalters added a commit to cgwalters/actions that referenced this pull request Jan 15, 2026
A while ago I was working on goose and found they'd committed
binary files into their git repo, and worked on a PR to block
that:
block/goose#6031

However at the time I'd investigated OpenSSF (which has a check
for this among other things) but didn't quite get it to work.

After some iteration with my agent, this model seems to work
well. That said I think what I'd like to do is propose
something like this to upstream OpenSSF - their default action
doesn't check *each* commit for example, but we want to deny
binaries in all commits.

This also implements a model where the check only fails if the
score *regresses*.

Assisted-by: OpenCode (Opus 4.5)
Signed-off-by: Colin Walters <walters@verbum.org>
Copy link
Collaborator

@jamadeo jamadeo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @cgwalters this looks great. We do run the OpenSSF scorecard, but it doesn't block on any regressions: https://scorecard.dev/viewer/?uri=github.com%2Fblock%2Fgoose (lots of room for improvement there, certainly)

I think your suggestions make a lot of sense, and while we don't have any immediate plans to split the repo, there are good candidates for separate projects (providers, e.g.) and good reasons to split built in MCPs (x11, as you said).

uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # pin@v4

- uses: actions-rust-lang/setup-rust-toolchain@v1
- name: Set up Rust toolchain
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this maybe slipped in before this check was active? not sure why it wouldn't have been flagged before.

In any case, pinning is as simple as

npx pin-github-action .github/workflows/ci.yml

@cgwalters
Copy link
Contributor Author

Since I wrote this PR I ended up doing https://github.com/bootc-dev/actions/blob/main/openssf-scorecard/action.yml - could either use directly or copy.

A key thing here is it's not just about binaries, but about verifying the OpenSSF scorecard doesn't go backwards (on any commit). That said I may see about trying to upstream some of this into the openssf actions.

@jamadeo
Copy link
Collaborator

jamadeo commented Jan 23, 2026

Since I wrote this PR I ended up doing https://github.com/bootc-dev/actions/blob/main/openssf-scorecard/action.yml - could either use directly or copy.

A key thing here is it's not just about binaries, but about verifying the OpenSSF scorecard doesn't go backwards (on any commit). That said I may see about trying to upstream some of this into the openssf actions.

That makes sense to me. We've had several issues that would have been caught by looking at a regression in OpenSSF scorecard checks.

Do you think it makes sense to merge what you've done here? or go with the scorecard regression check approach?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants