Add automated quarterly metrics collection and reporting #1

Merged
spetrosi merged 4 commits into main from init
May 25, 2026

@spetrosi spetrosi commented May 20, 2026

Collects quarterly metrics from GitHub and Ansible Galaxy:

  • GitHub: PR/issue statistics from linux-system-roles org
  • Galaxy: Download counts for legacy roles and collections
  • Automated workflow runs on last day of each quarter
  • Generates 8 graphs showing historical trends
  • AI skill for on-demand analysis with insights and recommendations

Data stored in CSV format with full git history.

Summary by CodeRabbit

  • New Features

    • Automated quarterly metrics pipeline (scheduled + manual) that collects GitHub and Galaxy data, generates graphs, and publishes branch/PRs with generated artifacts; produces per-quarter Markdown analysis reports.
  • Documentation

    • Expanded project README and data-directory docs describing collection, report format, setup, and troubleshooting; updated Q2 analysis into a concise table-and-bullets report.
  • Chores

    • Added automation targets, CI workflow, dependency requirements, and updated ignore rules.

Collects quarterly metrics from GitHub and Ansible Galaxy:
- GitHub: PR/issue statistics from linux-system-roles org
- Galaxy: Download counts for legacy roles and collections
- Automated workflow runs on last day of each quarter
- Generates 8 graphs showing historical trends
- AI skill for on-demand analysis with insights and recommendations

Data stored in CSV format with full git history.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

coderabbitai Bot commented May 20, 2026

📝 Walkthrough

Adds an end-to-end quarterly metrics pipeline: docs/skill spec, Makefile and deps, GitHub & Galaxy collectors, aggregation/upserts, graph generation, and a GitHub Actions workflow that creates quarter-scoped PRs containing generated data and reports.

Changes

Quarterly Metrics Reporting System

| Layer | File(s) | Summary |
|---|---|---|
| System definition and documentation | `.claude/skills/analyze-quarterly-metrics/SKILL.md`, `README.md`, `data/README.md`, `reports/2026-Q2-analysis.md` | Skill spec and comprehensive docs plus an example Q2 2026 analysis report covering workflow, metrics, outputs, and data-quality notes. |
| Build infra and deps | `Makefile`, `.gitignore`, `requirements.txt` | Makefile auto-detects quarter/date-range and exposes targets (collect-github, collect-galaxy, update-summary, generate-graphs, quarterly-report, clean). `.gitignore` ignores `__pycache__/` and `venv/`. `requirements.txt` lists pandas, matplotlib, seaborn, beautifulsoup4, requests, lxml. |
| GitHub stats collection | `scripts/collect_all_github_stats.sh`, `scripts/collect_github_stats.sh` | Orchestrator iterates org repos and delegates to per-repo collector. Collector uses gh CLI, retry wrapper, collaborator caching to split maintainer vs non-maintainer counts, filters bot titles/patterns, and writes CSV or prints summaries. |
| Galaxy API integration | `scripts/collect_galaxy_stats.py` | Queries Galaxy v1 for legacy role downloads (paginated, exclusions, extras) and Galaxy v3 for collections using retry_request; writes galaxy_legacy.csv and galaxy_collections.csv under `data/{QUARTER}/`. Note: collection fetch raises on HTTP errors. |
| Quarterly aggregation | `scripts/update_quarterly_summary.py` | Aggregates per-quarter CSVs into top-level summary CSVs (github_prs_summary.csv, github_issues_summary.csv, galaxy_legacy_summary.csv, galaxy_collections_summary.csv), computes collection deltas vs stored cumulative totals, and maintains per-role history CSV. |
| Graph generation | `scripts/generate_graphs.py` | Generates PNG graphs for GitHub PR/issue trends, legacy per-role/current-quarter deltas, total legacy downloads and deltas, and per-collection series; saves images to `reports/images/`. |
| GitHub Actions workflow | `.github/workflows/quarterly-metrics.yml` | Scheduled quarterly cron + manual dispatch; resolves QUARTER/DATE_RANGE, runs scripts, stages `data/` and `reports/`, creates `metrics/${QUARTER}` branch, commits, pushes, opens PR via gh, and writes job summary with PR link or data path. |

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Add automated quarterly metrics collection and reporting' directly and clearly describes the main change: implementation of an automated system for collecting and reporting quarterly metrics from GitHub and Ansible Galaxy. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 88.89% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 11

🧹 Nitpick comments (1)
scripts/collect_github_stats.sh (1)

104-159: ⚡ Quick win

Use unique temp files to avoid cross-run collisions.

At Line 104 and Line 140, fixed filenames (prs.txt, issues.txt) can race if this script is ever run concurrently (manual runs, future parallelization), causing mixed/corrupted counts.

Suggested fix
-get_prs > prs.txt
+prs_tmp="$(mktemp)"
+issues_tmp="$(mktemp)"
+trap 'rm -f "$prs_tmp" "$issues_tmp"' EXIT
+
+get_prs > "$prs_tmp"
@@
-done < prs.txt
-rm -f prs.txt
+done < "$prs_tmp"
@@
-get_issues > issues.txt
+get_issues > "$issues_tmp"
@@
-done < issues.txt
-rm -f issues.txt
+done < "$issues_tmp"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/collect_github_stats.sh` around lines 104 - 159, The script uses
fixed temp filenames prs.txt and issues.txt when calling get_prs and get_issues
which can collide across concurrent runs; update the code that calls get_prs and
get_issues to write to unique temp files (use mktemp or a PID/$$-based name),
store those names in variables (e.g., PRS_TMP and ISSUES_TMP), read from those
variables in the while loops (instead of prs.txt/issues.txt), and ensure you
register a trap to rm -f "$PRS_TMP" "$ISSUES_TMP" on exit to always clean up;
touch up references around the get_prs/get_issues calls and the subsequent rm -f
lines so they remove the correct temp files.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/quarterly-metrics.yml:
- Around line 25-27: The workflow grants only contents: write under permissions
which prevents using gh pr create later; update the permissions block to also
include pull-requests: write (e.g., add "pull-requests: write" alongside
"contents: write") so the GH CLI action that calls gh pr create can successfully
create and push the pull request.
- Around line 51-53: Don't embed `${{ github.event.inputs.* }}` directly inside
`run:` shell code; instead bind inputs to step environment variables and
reference those safe shell vars. For this workflow replace direct interpolations
like `if [ -n "${{ github.event.inputs.quarter }}" ]; then QUARTER="${{
github.event.inputs.quarter }}"` with an `env:` mapping (e.g. env: QUARTER: ${{
github.event.inputs.quarter }}) and then in the `run:` use the shell variable
`$QUARTER` (quoted) and safe parameter expansion/defaults (e.g. `if [ -n
"$QUARTER" ]; then ...` or `QUARTER="${QUARTER:-default}"`) and apply the same
change to the other input usages around lines 63–64 so no `${{
github.event.inputs.* }}` is interpolated directly into shell code.
- Line 30: The workflow uses mutable action tags (uses: actions/checkout@v4 and
uses: actions/setup-python@v5); replace those tag pins with the corresponding
immutable commit SHAs for each action reference (i.e., change the uses entries
to uses: actions/checkout@<commit-sha> and uses:
actions/setup-python@<commit-sha>), and optionally add Dependabot/workflow
auto-update to keep SHAs current; update the two occurrences in the workflow
file where actions/checkout and actions/setup-python are referenced.

In `@Makefile`:
- Line 5: The CURRENT_MONTH assignment uses GNU-specific date format "%-m" which
fails on macOS/BSD; change it to use the portable "%m" and strip any leading
zero after capturing the value (i.e., use date +%m piped through a simple text
transform such as sed or awk) so the Makefile's CURRENT_MONTH variable works
cross-platform (update the CURRENT_MONTH := $(shell ...) line accordingly).
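
For illustration, the month-to-quarter mapping that the Makefile derives from CURRENT_MONTH can be sketched in Python. This is a minimal sketch, not the Makefile's actual logic: the helper name `quarter_info`, the `YYYY-Qn` label, and the `start..end` range format are assumptions chosen to resemble the QUARTER and DATE_RANGE values mentioned in this PR.

```python
import calendar
from datetime import date


def quarter_info(today: date) -> tuple[str, str]:
    """Return (quarter label, date range) for the quarter containing `today`.

    Hypothetical helper: the label and range formats are assumptions.
    """
    quarter = (today.month - 1) // 3 + 1          # months 1-3 -> 1, 4-6 -> 2, ...
    first_month = 3 * (quarter - 1) + 1
    last_month = first_month + 2
    last_day = calendar.monthrange(today.year, last_month)[1]
    label = f"{today.year}-Q{quarter}"
    date_range = (
        f"{today.year}-{first_month:02d}-01.."
        f"{today.year}-{last_month:02d}-{last_day:02d}"
    )
    return label, date_range


print(quarter_info(date(2026, 5, 20)))
```

Because the month is handled as an integer, no platform-specific `date` format flag is involved.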

In `@reports/2026-Q2-analysis.md`:
- Line 44: The Galaxy download projections were overstated; recalc each
projected "Full Quarter" value using the stated linear multiplier (~1.82) and
replace the incorrect numbers (e.g., the "**Projected Full Quarter:** ~173,000"
and the projections for "Legacy roles", "Collections total",
"fedora.linux_system_roles", "microsoft.sql") with the accurate products
(86,384×1.82=157,219; 81,283×1.82=147,935; 71,296×1.82=129,759;
9,987×1.82≈18,176), then update any derived QoQ decline percentages and
narrative that reference those projections so the analysis (decline numbers and
conclusions) reflect the corrected values across the document.
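
The corrected projections can be reproduced with a few lines of Python. This sketch only restates the arithmetic quoted in the comment above (partial-quarter counts times the ~1.82 linear multiplier, rounded to whole downloads); the function and variable names are illustrative.

```python
MULTIPLIER = 1.82  # linear multiplier quoted in the review comment

# Partial-quarter download counts quoted in the review comment.
partial_downloads = {
    "Legacy roles": 86_384,
    "Collections total": 81_283,
    "fedora.linux_system_roles": 71_296,
    "microsoft.sql": 9_987,
}


def project_full_quarter(count: int, multiplier: float = MULTIPLIER) -> int:
    """Scale a partial-quarter count linearly to a full-quarter estimate."""
    return round(count * multiplier)


projected = {name: project_full_quarter(n) for name, n in partial_downloads.items()}
for name, value in projected.items():
    print(f"{name}: {value:,}")
```

The two per-collection inputs sum to the collections total (71,296 + 9,987 = 81,283), which is a quick consistency check on the source figures.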

In `@requirements.txt`:
- Around line 1-11: Update the minimum dependency versions in requirements.txt
to current 2025–2026 minimums: change matplotlib to "matplotlib>=3.10.0",
seaborn to "seaborn>=0.13.0", pandas to "pandas>=3.0.0", beautifulsoup4 to
"beautifulsoup4>=4.14.0", requests to "requests>=2.34.0", and lxml to
"lxml>=6.1.0"; after updating these package lines run the project's dependency
validation (tests and/or lockfile refresh) to ensure Python 3.12+ compatibility
and that transitive packages like Pillow are resolved to patched versions.

In `@scripts/collect_galaxy_stats.py`:
- Around line 173-203: collect_collections() currently coerces a failed
get_collection_downloads_from_api() (which returns None on API errors) into 0
and writes false data; instead, detect when download_count is None and skip
appending that collection to collection_data (or append with an explicit None
and an error flag) so failures are not recorded as zeroes—log the failure and
continue to next collection rather than setting download_count to 0.
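
A minimal sketch of the "skip instead of recording zero" behavior described above. It is not the script's actual code: `fetch_downloads` stands in for `get_collection_downloads_from_api()` and is assumed to return `None` on API errors, and the function signature is hypothetical. The dictionary keys follow those used in the script's `collection_data` entries.

```python
import logging
from typing import Callable, Iterable, Optional

logger = logging.getLogger(__name__)


def collect_collections(
    collections: Iterable[tuple[str, str]],
    fetch_downloads: Callable[[str, str], Optional[int]],
) -> list[dict]:
    """Collect download counts, skipping collections whose fetch failed."""
    collection_data = []
    for namespace, name in collections:
        download_count = fetch_downloads(namespace, name)
        if download_count is None:
            # A failed fetch is not the same as zero downloads: log and skip.
            logger.warning("No download count for %s.%s; skipping", namespace, name)
            continue
        collection_data.append({
            "namespace": namespace,
            "name": name,
            "full_name": f"{namespace}.{name}",
            "download_count": download_count,
        })
    return collection_data


def fake_fetch(namespace: str, name: str) -> Optional[int]:
    """Stub for the sketch: pretends the second collection's API call failed."""
    return 71_296 if name == "linux_system_roles" else None


rows = collect_collections(
    [("fedora", "linux_system_roles"), ("microsoft", "sql")], fake_fetch
)
print(rows)
```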

In `@scripts/collect_github_stats.sh`:
- Around line 140-157: The script is counting ISSUES_CLOSED from the list
produced by get_issues which queries created:$DATE_RANGE, thus missing issues
opened earlier but closed in the quarter; change the flow to run a separate
query for closed issues (e.g., call get_issues with "closed:$DATE_RANGE" or a
new helper like get_issues_closed) and iterate over that result to increment
ISSUES_CLOSED["$repo"] and ISSUES_CLOSED_NON_MAINT["$repo"] (still using
user_is_maintainer to filter) instead of using the created-date result; keep the
existing created-date loop for ISSUES_CREATED and ISSUES_CREATED_NON_MAINT and
ensure you store the closed-query output separately (e.g., issues_closed.txt) or
in a separate while-read loop so counts come from the correct population.
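
The difference between the two populations can be shown with a small Python sketch. The issue records and field names below are made up for illustration; the real script gets its data from the gh CLI.

```python
from datetime import date
from typing import Optional


def in_range(day: Optional[date], start: date, end: date) -> bool:
    """True if `day` is set and falls within [start, end]."""
    return day is not None and start <= day <= end


def count_issues(issues: list[dict], start: date, end: date) -> tuple[int, int]:
    """Count issues created in the range and, independently, closed in the range."""
    created = sum(1 for i in issues if in_range(i["created"], start, end))
    closed = sum(1 for i in issues if in_range(i["closed"], start, end))
    return created, closed


# Hypothetical data: the first issue was opened in Q1 but closed in Q2.
issues = [
    {"created": date(2026, 2, 10), "closed": date(2026, 4, 15)},
    {"created": date(2026, 4, 20), "closed": date(2026, 5, 1)},
    {"created": date(2026, 5, 5), "closed": None},
]
q2_start, q2_end = date(2026, 4, 1), date(2026, 6, 30)

created, closed = count_issues(issues, q2_start, q2_end)

# Counting closures only among issues created in the quarter misses the first issue.
closed_among_created = sum(
    1
    for i in issues
    if in_range(i["created"], q2_start, q2_end) and in_range(i["closed"], q2_start, q2_end)
)
print(created, closed, closed_among_created)
```

With this data the independent count finds two closures in the quarter, while filtering on creation date first finds only one.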

In `@scripts/generate_graphs.py`:
- Around line 347-357: The delta is computed on the CSV's current row order,
which can be unsorted; before calculating df['Delta'] you should sort the
DataFrame by the quarter column (e.g., the 'Quarter' or equivalent time column)
in chronological order (convert to a datetime/period type if needed) so diffs
are computed between consecutive quarters correctly; then compute df['Delta'] =
df['Total Downloads'].diff() and produce df_delta = df.iloc[1:].copy() as before
(references: df, df['Delta'], df_delta, 'Total Downloads', and the quarter/time
column).
- Around line 47-49: The bare except around plt.style.use(GRAPH_STYLE) should be
replaced with a specific OSError handler: catch OSError as e, log/print the
warning including GRAPH_STYLE and the exception message, then fall back to
plt.style.use('seaborn-v0_8'); update the try/except block surrounding the call
to plt.style.use in scripts/generate_graphs.py so it only catches OSError (not
all exceptions) and preserves unexpected exceptions.
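
The ordering point in the first item can be sketched without pandas. This is an illustration only: it assumes quarter labels of the form `YYYY-Qn` (as in `data/2026-Q2/`), and the download totals are invented. In the script itself the equivalent step would be sorting the DataFrame by quarter before calling `diff()`.

```python
def quarter_key(label: str) -> tuple[int, int]:
    """Turn a label like '2026-Q2' into a sortable (year, quarter) tuple."""
    year, quarter = label.split("-Q")
    return int(year), int(quarter)


def quarterly_deltas(rows: list[tuple[str, int]]) -> list[tuple[str, int]]:
    """Sort (quarter, cumulative total) rows chronologically, then diff neighbors."""
    ordered = sorted(rows, key=lambda row: quarter_key(row[0]))
    return [
        (curr_label, curr_total - prev_total)
        for (_, prev_total), (curr_label, curr_total) in zip(ordered, ordered[1:])
    ]


# Hypothetical, deliberately unsorted rows, as they might appear in a CSV.
rows = [("2026-Q1", 1500), ("2025-Q3", 1000), ("2025-Q4", 1200)]
print(quarterly_deltas(rows))
```

Diffing the rows in file order would instead compare 2025-Q3 against 2026-Q1 and report a negative delta for a cumulative series.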

In `@scripts/update_quarterly_summary.py`:
- Around line 160-163: previous_df is chosen with previous_df.iloc[-1] without
guaranteeing chronological order, so sort before picking the baseline: ensure
cumulative_df (or previous_df) is sorted by the 'Quarter' column (e.g.,
cumulative_df = cumulative_df.sort_values('Quarter') or previous_df =
previous_df.sort_values('Quarter')) so last_row is the true most recent prior
quarter; then proceed to set previous_cumulative['fedora.linux_system_roles']
from last_row.get(...).


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 39b81d9d-5221-437f-8251-82ce224fa3e5

📥 Commits

Reviewing files that changed from the base of the PR and between c2610f2 and d5a0e3f.

⛔ Files ignored due to path filters (18)
  • data/2026-Q2/galaxy_collections.csv is excluded by !**/*.csv
  • data/2026-Q2/galaxy_legacy.csv is excluded by !**/*.csv
  • data/2026-Q2/issues.csv is excluded by !**/*.csv
  • data/2026-Q2/prs.csv is excluded by !**/*.csv
  • data/galaxy_collections_cumulative.csv is excluded by !**/*.csv
  • data/galaxy_collections_summary.csv is excluded by !**/*.csv
  • data/galaxy_legacy_per_role_history.csv is excluded by !**/*.csv
  • data/galaxy_legacy_summary.csv is excluded by !**/*.csv
  • data/github_issues_summary.csv is excluded by !**/*.csv
  • data/github_prs_summary.csv is excluded by !**/*.csv
  • reports/images/galaxy-collection-fedora-linux_system_roles.png is excluded by !**/*.png
  • reports/images/galaxy-collection-microsoft-sql.png is excluded by !**/*.png
  • reports/images/galaxy-legacy-per-role-2026-Q2.png is excluded by !**/*.png
  • reports/images/galaxy-legacy-per-role-delta-2026-Q2.png is excluded by !**/*.png
  • reports/images/galaxy-legacy-total-delta.png is excluded by !**/*.png
  • reports/images/galaxy-legacy-total.png is excluded by !**/*.png
  • reports/images/github-issues.png is excluded by !**/*.png
  • reports/images/github-prs.png is excluded by !**/*.png
📒 Files selected for processing (13)
  • .claude/skills/analyze-quarterly-metrics/SKILL.md
  • .github/workflows/quarterly-metrics.yml
  • .gitignore
  • Makefile
  • README.md
  • data/README.md
  • reports/2026-Q2-analysis.md
  • requirements.txt
  • scripts/collect_all_github_stats.sh
  • scripts/collect_galaxy_stats.py
  • scripts/collect_github_stats.sh
  • scripts/generate_graphs.py
  • scripts/update_quarterly_summary.py

Comment thread .github/workflows/quarterly-metrics.yml
Comment thread .github/workflows/quarterly-metrics.yml Outdated

steps:
- name: Checkout repository
uses: actions/checkout@v4

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

cat -n .github/workflows/quarterly-metrics.yml | sed -n '25,40p'

Repository: linux-system-roles/linux-system-roles-upstream-metrics

Length of output: 605


🌐 Web query:

GitHub Actions pinning by commit SHA vs tag security best practice

💡 Result:

Pinning GitHub Actions by commit SHA is the definitive security best practice, whereas relying on tags poses a significant supply chain risk [1][2][3]. The primary difference lies in immutability:

1. Commit SHA (Best Practice): Pinning to a full-length commit SHA guarantees that the exact, validated code version is executed every time [4][5][3]. Because SHAs are immutable, they cannot be silently altered by an attacker who gains access to the repository, providing protection against retroactive supply chain attacks [1][5][6].
2. Tags (High Risk): Version tags (e.g., v3) are mutable references [1][7]. Even if you trust an action's creator, a compromised maintainer account or repository can allow an attacker to move tags to point to malicious "imposter" commits [5][6][7]. Any workflow referencing the tag will automatically pull the malicious code upon the next execution without warning [5][6].

Recommendation for Implementation:

- Enforce SHA Pinning: Administrators can configure organization or repository-level policies to enforce the use of full commit SHAs, causing workflows that use tags or branches to fail automatically [2][8][9].
- Use Descriptive Comments: To maintain readability and manageability while using SHAs, append the human-readable version tag as a comment in the workflow file (e.g., uses: actions/checkout@f43a0e5 # v3.6.0) [8][10].
- Automate Updates: Since SHA pinning prevents automatic updates, use automated dependency management tools like Dependabot or Renovate to monitor for new versions and propose pull requests for updates [1][10][7].
- Consider Cooldowns: When updating actions, implementing a "cooldown" period (e.g., waiting 7-14 days after a release) can help detect malicious activity or community reports of issues before adopting new versions into your production environment [9].

For high-security requirements, verify that the SHA belongs to the legitimate repository and avoid pinning to forks [3]. Note that while SHA pinning secures the immediate action, transitive dependencies resolved at runtime by the action remain a potential attack vector [9][7].


🏁 Script executed:

grep -n "uses:" .github/workflows/quarterly-metrics.yml

Repository: linux-system-roles/linux-system-roles-upstream-metrics

Length of output: 176


Pin GitHub Actions by commit SHA instead of version tags.

Lines 30 and 33 use tag-pinned actions (@v4, @v5), which are mutable references that pose a supply chain risk. An attacker who gains access to the upstream repository could move the tag to point to malicious code, and all workflows would automatically pull the compromised version without warning. Pin to immutable commit SHAs instead:

Proposed fix
      - name: Checkout repository
-       uses: actions/checkout@v4
+       uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v4

      - name: Set up Python
-       uses: actions/setup-python@v5
+       uses: actions/setup-python@40c695bfec7a6cc14eaaf3fe5b33537548c3f129 # v5

Use Dependabot or similar tools to automate SHA updates and maintain currency.
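
As a rough illustration of what "pinned to a hash" means, the following Python sketch checks whether a `uses:` value references a full 40-character commit SHA. The helper name `is_sha_pinned` is hypothetical and this is not how zizmor or GitHub's policy enforcement is implemented; it also does not verify that the SHA actually exists in the action's repository.

```python
import re

FULL_SHA = re.compile(r"[0-9a-f]{40}")


def is_sha_pinned(uses: str) -> bool:
    """Return True if a `uses:` value is pinned to a full-length commit SHA."""
    reference = uses.split("#", 1)[0].strip()      # drop a trailing "# v4" comment
    _, separator, version = reference.rpartition("@")
    return bool(separator) and FULL_SHA.fullmatch(version) is not None


for value in (
    "actions/checkout@v4",
    "actions/setup-python@v5",
    "actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v4",
):
    print(value, "->", is_sha_pinned(value))
```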

🧰 Tools
🪛 zizmor (1.25.2)

[warning] 29-30: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false

(artipacked)


[error] 30-30: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/quarterly-metrics.yml at line 30, The workflow uses
mutable action tags (uses: actions/checkout@v4 and uses: actions/setup-python@v5);
replace those tag pins with the corresponding immutable commit SHAs for each
action reference (i.e., change the uses entries to uses:
actions/checkout@<commit-sha> and uses: actions/setup-python@<commit-sha>), and
optionally add Dependabot/workflow auto-update to keep SHAs current; update the
two occurrences in the workflow file where actions/checkout and
actions/setup-python are referenced.

Comment thread .github/workflows/quarterly-metrics.yml Outdated
Comment thread Makefile
Comment thread reports/2026-Q2-analysis.md Outdated
Comment thread scripts/collect_galaxy_stats.py
Comment thread scripts/collect_github_stats.sh Outdated
Comment thread scripts/generate_graphs.py Outdated
Comment thread scripts/generate_graphs.py Outdated
Comment thread scripts/update_quarterly_summary.py Outdated
Apply suggestions, regenerate stats and rerun AI report

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/workflows/quarterly-metrics.yml (1)

130-145: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add idempotency checks for branch and PR creation to handle workflow reruns.

The workflow is triggered both by cron schedule and manual workflow_dispatch, enabling reruns for the same quarter. Currently, git checkout -b "metrics/${QUARTER}" (line 132) and unconditional gh pr create (line 141) will fail if the branch or PR already exists, breaking the workflow on reruns or manual retriggers.

Add existence checks before branch/PR creation:

Suggested fix
-          BRANCH_NAME="metrics/${QUARTER}"
-          git checkout -b "$BRANCH_NAME"
+          BRANCH_NAME="metrics/${QUARTER}"
+          if git ls-remote --exit-code --heads origin "$BRANCH_NAME" >/dev/null 2>&1; then
+            git fetch origin "$BRANCH_NAME"
+            git checkout -B "$BRANCH_NAME" "origin/$BRANCH_NAME"
+          else
+            git checkout -b "$BRANCH_NAME"
+          fi
@@
-          gh pr create \
-            --title "Quarterly Metrics - ${QUARTER}" \
-            --body "Automated metrics collection for ${QUARTER}. **Data:** \`data/${QUARTER}/\` **Graphs:** \`reports/images/\` After merging, generate analysis: \`/analyze-quarterly-metrics ${QUARTER}\`" \
-            --base main \
-            --head "$BRANCH_NAME"
+          PR_URL=$(gh pr list --head "$BRANCH_NAME" --base main --json url -q '.[0].url')
+          if [ -n "$PR_URL" ]; then
+            echo "PR already exists: $PR_URL"
+          else
+            gh pr create \
+              --title "Quarterly Metrics - ${QUARTER}" \
+              --body "Automated metrics collection for ${QUARTER}. **Data:** \`data/${QUARTER}/\` **Graphs:** \`reports/images/\` After merging, generate analysis: \`/analyze-quarterly-metrics ${QUARTER}\`" \
+              --base main \
+              --head "$BRANCH_NAME"
+          fi
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/quarterly-metrics.yml around lines 130 - 145, The workflow
fails on reruns because it unconditionally creates a branch and PR; update the
script around BRANCH_NAME and the gh pr create call to be idempotent: before
running git checkout -b "$BRANCH_NAME" check whether the branch exists locally
or remotely and if so switch to it (git checkout or git fetch && git checkout),
otherwise create it; before running gh pr create check for an existing open PR
with head "$BRANCH_NAME" (e.g., gh pr list or gh pr view) and skip creation if
one exists (or capture and reuse its URL), and ensure pushes use git push
--set-upstream origin "$BRANCH_NAME" so repeated pushes succeed.
🧹 Nitpick comments (1)
scripts/collect_galaxy_stats.py (1)

196-198: ⚡ Quick win

Remove unreachable None guard after hard-fail API fetch.

get_collection_downloads_from_api() now raises on failures, so this branch is dead and can be removed to keep control flow clear.

Suggested diff
-        if download_count is None:
-            raise RuntimeError(f"Missing download_count for {namespace}.{name}")
-
         collection_data.append({
             'namespace': namespace,
             'name': name,
             'full_name': f"{namespace}.{name}",
             'download_count': download_count
         })
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/collect_galaxy_stats.py` around lines 196 - 198, Remove the
now-unreachable None check after calling get_collection_downloads_from_api():
delete the conditional block that tests "if download_count is None: raise
RuntimeError(...)" because get_collection_downloads_from_api() already raises on
failure; keep the code that consumes download_count as-is and ensure no
duplicate error handling remains (look for occurrences of download_count,
get_collection_downloads_from_api, and the RuntimeError raise in the same
scope).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@reports/2026-Q2-analysis.md`:
- Line 140: The decline statement referencing fedora.linux_system_roles uses
collection-total numbers (389K → ~149K) rather than the
fedora-collection-specific figures; update the sentence for
fedora.linux_system_roles to use the correct fedora collection baseline and
projected Q2 value (or remove the per-collection claim if that data isn't
available), and ensure the text and any accompanying table/footnote for
fedora.linux_system_roles reflect the corrected numeric source.
- Line 54: The sentence "**Issue Resolution Rate:** 28.6% (up from 0%, but still
well below Q1's 21.4%)" contradicts itself; update that phrase so the comparison
is correct — e.g., replace "but still well below Q1's 21.4%" with "and higher
than Q1's 21.4%" or another accurate qualifier (like "but still below target")
to reflect that 28.6% > 21.4%; edit the exact text shown to correct the
comparison.

In `@scripts/generate_graphs.py`:
- Around line 47-49: The except OSError handler that catches a missing
GRAPH_STYLE currently retries the same style via plt.style.use('seaborn-v0_8'),
which will fail again; change the fallback in that except block to use a
guaranteed built-in style (e.g., 'default' or 'classic') instead of the
configured GRAPH_STYLE constant. Update the handler that references GRAPH_STYLE
and plt.style.use to call plt.style.use('default') (or another built-in) and
keep the warning message but include the chosen fallback name.
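
A minimal sketch of the fallback described above, written so it can run without matplotlib installed. The helper `apply_style` is hypothetical; with matplotlib one would pass `plt.style.use` as `use_style` and the configured `GRAPH_STYLE` as `preferred`. It relies on the review's statement that an unavailable style raises `OSError`.

```python
from typing import Callable


def apply_style(
    use_style: Callable[[str], None], preferred: str, fallback: str = "default"
) -> str:
    """Apply `preferred`; on OSError warn and apply a built-in fallback instead."""
    try:
        use_style(preferred)
        return preferred
    except OSError as exc:
        print(
            f"Warning: style {preferred!r} unavailable ({exc}); "
            f"falling back to {fallback!r}"
        )
        use_style(fallback)
        return fallback


def fake_use(name: str) -> None:
    """Stub standing in for plt.style.use: only 'default' is available."""
    if name != "default":
        raise OSError(f"{name!r} is not a valid style")


print(apply_style(fake_use, "seaborn-v0_8"))
```

Only `OSError` is caught, so any other exception from applying the style still propagates.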


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 3917dc42-c766-491d-ae66-1ed279bac152

📥 Commits

Reviewing files that changed from the base of the PR and between d5a0e3f and 8ec320b.

⛔ Files ignored due to path filters (15)
  • data/2026-Q2/galaxy_collections.csv is excluded by !**/*.csv
  • data/2026-Q2/galaxy_legacy.csv is excluded by !**/*.csv
  • data/2026-Q2/issues.csv is excluded by !**/*.csv
  • data/galaxy_collections_cumulative.csv is excluded by !**/*.csv
  • data/galaxy_collections_summary.csv is excluded by !**/*.csv
  • data/galaxy_legacy_per_role_history.csv is excluded by !**/*.csv
  • data/galaxy_legacy_summary.csv is excluded by !**/*.csv
  • data/github_issues_summary.csv is excluded by !**/*.csv
  • reports/images/galaxy-collection-fedora-linux_system_roles.png is excluded by !**/*.png
  • reports/images/galaxy-collection-microsoft-sql.png is excluded by !**/*.png
  • reports/images/galaxy-legacy-per-role-2026-Q2.png is excluded by !**/*.png
  • reports/images/galaxy-legacy-per-role-delta-2026-Q2.png is excluded by !**/*.png
  • reports/images/galaxy-legacy-total-delta.png is excluded by !**/*.png
  • reports/images/galaxy-legacy-total.png is excluded by !**/*.png
  • reports/images/github-issues.png is excluded by !**/*.png
📒 Files selected for processing (8)
  • .claude/skills/analyze-quarterly-metrics/SKILL.md
  • .github/workflows/quarterly-metrics.yml
  • README.md
  • reports/2026-Q2-analysis.md
  • scripts/collect_galaxy_stats.py
  • scripts/collect_github_stats.sh
  • scripts/generate_graphs.py
  • scripts/update_quarterly_summary.py
✅ Files skipped from review due to trivial changes (1)
  • .claude/skills/analyze-quarterly-metrics/SKILL.md

Comment thread reports/2026-Q2-analysis.md Outdated
Comment thread reports/2026-Q2-analysis.md Outdated
Comment thread scripts/generate_graphs.py Outdated

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/quarterly-metrics.yml:
- Around line 130-142: The workflow currently stages files with git add data/
reports/ before switching to an existing remote branch using git checkout -B
"$BRANCH_NAME" "origin/$BRANCH_NAME", which resets the index and discards staged
changes; to fix, move or repeat the staging so changes are added after the
checkout (i.e., ensure git add data/ reports/ runs after git checkout -B
"$BRANCH_NAME" "origin/$BRANCH_NAME"), or alternatively stash before checkout
and pop after, then run git commit with BRANCH_NAME as before so the commit
includes the intended files.

In `@reports/2026-Q2-analysis.md`:
- Around line 49-57: The markdown tables (e.g., the table starting with "|
Metric | Q2 2026 (partial) | Q1 2026 | QoQ Change | Projected Q2 Full |" and the
other tables referenced at ranges 63-69, 76-79, 82-86) need an empty line both
immediately before and immediately after each table to satisfy MD058; edit the
document to insert a single blank line above and below each table block so each
table is separated from surrounding paragraphs/headings.
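
A minimal sketch of the MD058 fix described above, assuming that table rows are exactly the lines beginning with a pipe character and that the report has no fenced code blocks containing such lines. The function name `pad_tables` is hypothetical and not part of this PR.

```python
def pad_tables(markdown: str) -> str:
    """Ensure one blank line directly above and below each pipe table."""
    lines = markdown.split("\n")
    out = []
    for i, line in enumerate(lines):
        # Assumption: a table row is any line whose first non-space char is '|'.
        is_row = line.lstrip().startswith("|")
        prev_is_row = i > 0 and lines[i - 1].lstrip().startswith("|")
        if is_row and not prev_is_row and out and out[-1].strip():
            out.append("")  # table starts right after text: add blank line above
        if not is_row and prev_is_row and line.strip():
            out.append("")  # text follows the table directly: add blank line below
        out.append(line)
    return "\n".join(out)
```

Running the function a second time on its own output should leave the text unchanged, so it could be applied to the report repeatedly without stacking blank lines.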

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 8acfd615-3293-4753-b894-a8ce015da301

📥 Commits

Reviewing files that changed from the base of the PR and between 8ec320b and 28e9a22.

📒 Files selected for processing (5)
  • .claude/skills/analyze-quarterly-metrics/SKILL.md
  • .github/workflows/quarterly-metrics.yml
  • reports/2026-Q2-analysis.md
  • scripts/collect_galaxy_stats.py
  • scripts/generate_graphs.py
💤 Files with no reviewable changes (1)
  • scripts/collect_galaxy_stats.py
✅ Files skipped from review due to trivial changes (1)
  • .claude/skills/analyze-quarterly-metrics/SKILL.md

Comment on lines +130 to +142
# Create or switch to branch for this quarter
BRANCH_NAME="metrics/${QUARTER}"
if git ls-remote --exit-code --heads origin "$BRANCH_NAME" >/dev/null 2>&1; then
  echo "Branch $BRANCH_NAME exists, updating it"
  git fetch origin "$BRANCH_NAME"
  git checkout -B "$BRANCH_NAME" "origin/$BRANCH_NAME"
else
  echo "Creating new branch $BRANCH_NAME"
  git checkout -b "$BRANCH_NAME"
fi

# Commit changes
git commit -m "Add metrics for ${QUARTER}" -m "Generated by GitHub Actions workflow" -m "- Data: data/${QUARTER}/" -m "- Graphs: reports/images/"

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Staged changes are lost when switching to existing remote branch.

The workflow stages changes with git add data/ reports/ (line 122), then checks for staged changes (line 125). However, when the branch already exists remotely, git checkout -B "$BRANCH_NAME" "origin/$BRANCH_NAME" (line 135) resets the working tree and index to the remote state, discarding the staged changes. The subsequent commit will then fail or commit nothing.

Consider re-staging after checkout, or use a different approach:

Proposed fix
           if git ls-remote --exit-code --heads origin "$BRANCH_NAME" >/dev/null 2>&1; then
             echo "Branch $BRANCH_NAME exists, updating it"
             git fetch origin "$BRANCH_NAME"
             git checkout -B "$BRANCH_NAME" "origin/$BRANCH_NAME"
+            # Re-stage changes after checkout
+            git add data/ reports/
           else
             echo "Creating new branch $BRANCH_NAME"
             git checkout -b "$BRANCH_NAME"
           fi
+
+          # Verify there are still changes to commit after branch switch
+          if git diff --staged --quiet; then
+            echo "No changes to commit after branch switch"
+            exit 0
+          fi

           # Commit changes


@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
reports/2026-Q2-analysis.md (2)

32-37: 💤 Low value

Highlights section has 4 bullets instead of the specified 2-3.

SKILL.md line 45 specifies "Highlights (2-3 bullets)" but this example has 4 bullets. While not a major issue, it represents another small deviation from the specification.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@reports/2026-Q2-analysis.md` around lines 32 - 37, The "## Highlights" list
has four bullets but SKILL.md requires 2-3; edit the Highlights section (heading
"## Highlights") to reduce to 2–3 bullets by removing or consolidating
items—e.g., keep "PR volume trending up strongly" and "Excellent merge
efficiency" as standalone bullets and merge "Issue resolution improving" with
"Strong external engagement" into a single combined bullet (or drop the least
important one), ensuring wording stays factual and metrics (233 PRs, 94.4% merge
rate, 28.6% resolution, 12.5% external PRs) are preserved or adjusted as needed.
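
One way to check a section against the bullet limits quoted from SKILL.md is to count its top-level bullets. This is a sketch only: it assumes sections start with a `## ` heading and that bullets use `-` or `*` at the start of the line. The helper name `count_top_level_bullets` is hypothetical.

```python
import re


def count_top_level_bullets(markdown: str, heading: str) -> int:
    """Count unindented '-' or '*' bullets under the '## <heading>' section."""
    count = 0
    in_section = False
    for line in markdown.splitlines():
        if line.startswith("## "):
            # A new level-2 heading opens or closes the section of interest.
            in_section = line[3:].strip() == heading
        elif in_section and re.match(r"[-*] ", line):
            # Indented (nested) bullets do not match, so only top-level ones count.
            count += 1
    return count
```

For the Highlights section the result could then be compared against the 2-3 range, for example `2 <= count_top_level_bullets(text, "Highlights") <= 3`.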

67-92: ⚡ Quick win

Example report doesn't follow the SKILL.md brevity guidelines.

The SKILL.md specification (lines 8, 64, 89) calls for a "focused" report of ~400-500 words with "3-4 bullets" for Recommendations. This example report exceeds 650 words and structures Recommendations as 4 time-phased categories (Immediate, Short-term, Ongoing, Follow-up) with multiple nested bullets, rather than the specified "3-4 bullets" format.

While the time-phased structure is organizationally sound, the example doesn't demonstrate the brevity and scannability emphasized in the skill specification. This inconsistency may confuse users about the expected report format.

Suggested approach to align with spec

Consider either:

  1. Simplifying this example report to match the 400-500 word target with flat "3-4 bullets" for Recommendations, or
  2. Updating SKILL.md to acknowledge that reports may exceed 500 words when comprehensive recommendations are needed, and that time-phased recommendation structures are acceptable
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@reports/2026-Q2-analysis.md` around lines 67 - 92, The Recommendations
section violates SKILL.md brevity (see SKILL.md lines 8, 64, 89): condense the
Recommendations into a flat list of 3–4 focused bullets and trim the overall
report to ~400–500 words; specifically, replace the four time-phased categories
with 3–4 concise bullets in the "Recommendations" section, remove nested
sub-bullets, merge related actions into single short sentences, and shorten/pare
down other sections to meet the word target — or alternatively update SKILL.md
to accept time-phased, longer recommendation formats if you intend to keep the
current structure.
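
The word target could be checked mechanically before committing a report. The sketch below is an assumption about how to count: it treats any whitespace-separated token containing a letter or digit as a word, which may differ from how the reviewer arrived at the 650-word figure. The names `report_word_count` and `within_target` are hypothetical.

```python
def report_word_count(markdown: str) -> int:
    """Count tokens that contain at least one letter or digit.

    Pure markup tokens such as '##', '-', or '|' are skipped.
    """
    return sum(
        1 for token in markdown.split() if any(ch.isalnum() for ch in token)
    )


def within_target(markdown: str, low: int = 400, high: int = 500) -> bool:
    """Return True when the word count falls inside the inclusive target range."""
    return low <= report_word_count(markdown) <= high
```

The default bounds mirror the ~400-500 word target mentioned in the comment; they are parameters so a different limit can be passed if SKILL.md is updated instead.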

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 6427ae0e-8fba-48fa-b39b-71be486a5d60

📥 Commits

Reviewing files that changed from the base of the PR and between 28e9a22 and f0cc296.

📒 Files selected for processing (2)
  • .claude/skills/analyze-quarterly-metrics/SKILL.md
  • reports/2026-Q2-analysis.md

@spetrosi spetrosi merged commit 248fe12 into main May 25, 2026
1 check passed