Skip to content

Conversation

@edsiper
Copy link
Member

@edsiper edsiper commented Dec 2, 2025

This PR adds a new commit message linter

Overview

The commit prefix checker (commit_prefix_check.py) validates commit messages according to Fluent Bit standards:

  • Single prefix (plugin or subsystem)
  • Prefix must match modified files
  • No combined subjects (detect bad squashes)
  • Subject ≤ 80 characters
  • Signed-off-by line required

Prerequisites

Install the required Python dependency:

pip install gitpython

Running Locally

From the repository root directory, run:

python .github/scripts/commit_prefix_check.py

Behavior

  • Default: Validates the HEAD commit only
  • Pull Request mode: If the GITHUB_EVENT_NAME environment variable is set to pull_request, it validates the last 20 commits
  • Exit codes:
    • 0 if validation passes
    • 1 if validation fails

Example Output

Success:

✅ Commit prefix validation passed.

Failure:

❌ Commit deadbeef12 failed:
Subject prefix 'wrong_prefix:' does not match files changed.
Expected one of: router:
Commit prefix validation failed.

Testing Specific Commits

To test a specific commit, you can temporarily checkout that commit:

# Checkout the commit you want to test
git checkout <commit-hash>

# Run the validator
python .github/scripts/commit_prefix_check.py

# Return to your branch
git checkout <your-branch>

CI Integration

This script is automatically run in CI via the .github/workflows/commit-lint.yaml workflow on:

  • Pull requests (opened, synchronize, reopened, edited)
  • Pushes to the master branch

Commit Message Format

See CONTRIBUTING.md for the full commit message format requirements.


Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • Chores

    • Added a commit-prefix validation script and a CI workflow to run it on PRs and pushes.
  • Tests

    • Added a comprehensive test suite for prefix inference, squash detection, and commit validation behavior.
  • Documentation

    • Added commit lint docs with usage, local testing, CI integration, examples, and exit-code guidance.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Dec 2, 2025

Caution

Review failed

The pull request is closed.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds a commit-prefix linter: a new Python script that validates commit subjects, Signed-off-by lines, and infers expected prefixes from changed paths; a GitHub Actions workflow runs the checker on PRs and master pushes; documentation and unit tests accompany the feature.

Changes

Cohort / File(s) Summary
Commit prefix validation script
\.github/scripts/commit_prefix_check.py
New Python script implementing commit validation: infers expected subject prefixes from changed file paths, detects bad squash patterns in commit bodies, enforces subject length and Signed-off-by requirements, and validates subject prefix matches inferred expectations.
CI workflow
\.github/workflows/commit-lint.yaml
New GitHub Actions workflow "Commit Prefix Lint" triggered on PR events and pushes to master; checks out code, fetches base branch for PRs, sets up Python 3.10, installs gitpython, and runs the commit prefix checker.
Tests
\.github/scripts/tests/test_commit_lint.py
New comprehensive test suite exercising infer_prefix_from_paths, detect_bad_squash, and validate_commit with many success and failure cases covering path inference, body heuristics, signoffs, and subject length rules.
Documentation
COMMIT_LINT.md
New documentation describing the linter, usage (local and CI), exit codes, example outputs, testing instructions, and commit message format reference.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Areas to inspect closely:
    • infer_prefix_from_paths() — correctness of mapping from path patterns to expected prefixes and handling of mixed/unknown paths.
    • detect_bad_squash() — heuristics for detecting subject-like lines and treatment of multiple Signed-off-by lines.
    • validate_commit() — enforcement order (subject prefix, length, sign-off), error messaging, and case-insensitive comparisons.
    • Workflow fetch behavior and git usage in .github/workflows/commit-lint.yaml to ensure PR vs push commits are collected as intended.
    • Test coverage for edge cases and any brittle string-matching assumptions.

Poem

🐇 I hopped through diffs at break of night,

Checking prefixes tidy and right.
Signed-off stamps neat as a print,
No squashed confusion left in a lint.
A rabbit's nod—commits are bright! 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'workflows: add new commit linter tool' accurately summarizes the main change: introducing a new commit linting workflow and tool. It is concise, clear, and directly related to the primary objective of the PR.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3ffea92 and c3b0054.

📒 Files selected for processing (4)
  • .github/scripts/commit_prefix_check.py (1 hunks)
  • .github/scripts/tests/test_commit_lint.py (1 hunks)
  • .github/workflows/commit-lint.yaml (1 hunks)
  • COMMIT_LINT.md (1 hunks)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 161 to 164
if event_name == "pull_request":
commits = list(repo.iter_commits("HEAD", max_count=20))
else:
commits = [head]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Skip linting synthetic merge commit on PR runs

On pull_request events actions/checkout uses the auto-generated merge ref, so main() iterates commits starting from that merge (lines 161‑164) rather than the branch commits. GitHub’s synthetic merge messages have no component prefix or Signed-off-by, so every PR run will fail the prefix/signoff checks even when the actual PR commits are valid. Exclude the merge commit or check out the head SHA to avoid blocking all PRs with false failures.

Useful? React with 👍 / 👎.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (6)
COMMIT_LINT.md (1)

41-52: Add language specifier to fenced code blocks.

The example output code blocks lack a language specifier, which triggers markdownlint warnings (MD040). Consider adding text or console as the language identifier for consistency.

 **Success:**
-```
+```text
 ✅ Commit prefix validation passed.

Failure:
- +text
❌ Commit deadbeef12 failed:
Subject prefix 'wrong_prefix:' does not match files changed.
Expected one of: router:

Commit prefix validation failed.

.github/workflows/commit-lint.yaml (1)

3-8: Consider whether the edited trigger is intentional.

The edited trigger fires when the PR title or description is changed, not just when commits are updated. This may cause redundant workflow runs. If the intent is only to lint commits when they change, opened, synchronize, and reopened are sufficient.

 on:
   pull_request:
-    types: [opened, synchronize, reopened, edited]
+    types: [opened, synchronize, reopened]
   push:
     branches:
       - master
.github/scripts/commit_prefix_check.py (4)

20-20: Global Repo instantiation may cause issues.

Instantiating Repo(".") at module level makes the script fail if imported from a different working directory or during unit testing. Consider moving this into main() and passing the repo as a parameter to functions that need it, or wrap it in a lazy initialization.

-repo = Repo(".")
+def get_repo():
+    return Repo(".")

Then in main():

def main():
+    repo = get_repo()
     head = repo.head.commit
     ...

34-38: Potential IndexError on malformed paths.

If a path is exactly "plugins/" or "plugins" without a subdirectory, parts[1] will raise an IndexError. While unlikely in practice, adding a guard improves robustness.

         if p.startswith("plugins/"):
             parts = p.split("/")
+            if len(parts) < 2 or not parts[1]:
+                continue
             prefix = parts[1]
             prefixes.add(f"{prefix}:")
             continue

66-68: Rename ambiguous variable l to improve readability.

The variable name l is flagged by Ruff (E741) as ambiguous because it can be confused with 1 or I. Use a more descriptive name like line.

     # Normalize and discard empty lines
-    lines = [l.strip() for l in body.splitlines() if l.strip()]
+    lines = [line.strip() for line in body.splitlines() if line.strip()]

-    prefix_lines = [l for l in lines if PREFIX_RE.match(l)]
+    prefix_lines = [line for line in lines if PREFIX_RE.match(line)]

111-114: Remove unnecessary pass statement.

The pass statement is redundant here since the block has a comment. While the comment clarifies intent, the pass can be removed.

         # If due to multiple sign-offs, tests expect validate_commit() to still PASS
         # So we do NOT return False here.
         # validate_commit ignores multi signoff warnings.
-        pass
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 10ebd3a and 3ffea92.

📒 Files selected for processing (3)
  • .github/scripts/commit_prefix_check.py (1 hunks)
  • .github/workflows/commit-lint.yaml (1 hunks)
  • COMMIT_LINT.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
COMMIT_LINT.md

[uncategorized] ~71-~71: The official name of this software platform is spelled with a capital “H”.
Context: ...ript is automatically run in CI via the .github/workflows/commit-lint.yaml workflow on...

(GITHUB)

🪛 markdownlint-cli2 (0.18.1)
COMMIT_LINT.md

41-41: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


46-46: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🪛 Ruff (0.14.7)
.github/scripts/commit_prefix_check.py

1-1: Shebang is present but file is not executable

(EXE001)


66-66: Ambiguous variable name: l

(E741)


68-68: Ambiguous variable name: l

(E741)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COMPILER_STRICT_POINTER_TYPES=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
🔇 Additional comments (4)
COMMIT_LINT.md (1)

1-78: Documentation is comprehensive and well-structured.

The documentation clearly explains prerequisites, local execution, behavior modes, exit codes, and CI integration. The reference to CONTRIBUTING.md for commit format is appropriate.

.github/workflows/commit-lint.yaml (1)

14-30: Workflow structure is appropriate.

The workflow correctly checks out with sufficient depth (50) for commit history analysis, sets up Python, installs the required dependency, and runs the linting script. The configuration is straightforward and functional.

.github/scripts/commit_prefix_check.py (2)

136-141: Strict validation fails commits touching multiple components.

This always fails when files from multiple components are modified, even if the subject prefix matches one of them. This seems intentional per the comment, but may be overly strict for legitimate refactoring commits that touch multiple areas. Verify this matches the project's desired behavior.


157-178: Main function logic is correct.

The function properly handles both CI (pull_request) and local/push modes, aggregates errors across commits, and provides clear output with appropriate exit codes.

@edsiper edsiper merged commit a8f1e4b into master Dec 2, 2025
35 of 36 checks passed
@edsiper edsiper deleted the workflows-commit-lint branch December 2, 2025 21:17
@patrick-stephens
Copy link
Contributor

Only thing I'd say here is we used old versions of actions which then had to be updated after merge - potentially they have issues so need to make sure we use latest.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants