feat: per-OS download breakdown badges (v3) by cmeans · Pull Request #56 · cmeans/pypi-winnow-downloads

cmeans · 2026-04-29T23:58:06Z

Summary

Adds per-OS download breakdown badges (linux / macos / windows), parallel to the per-installer breakdown shipped in v0.2.0. Each package gains three new shields.io endpoint JSON files per window, and the README dogfood block and the _health.json shape are extended to match.

Why

The installer-mix v2 feature surfaces which packaging tool users run when they install a package. The OS distribution breakdown answers a different operator question: what platforms is this used on? For deciding what OS matrix to test against, what platform-specific bugs to prioritize, or whether to ship a wheel for a specific OS, the OS breakdown is more decision-useful than the installer breakdown.

How

One pypinfo invocation per package per window (unchanged), now grouped by `ci installer system` (extended from `ci installer`). The row count after allowlist filtering grows from roughly 6 to roughly 18. BigQuery cost is effectively unchanged (same source table, one marginal extra column).
run_pypinfo() return type changes from dict[str, int] to a TypedDict carrying both by_installer and by_system.
The v0.2.0 hero-stability invariant is preserved: hero = sum(by_installer.values()) regardless of system_name. Per-system aggregation applies an independent allowlist filter (Linux/Darwin/Windows).
PackageOutcome and _health.json gain a counts_by_system field. Existing fields preserved verbatim.
README dogfood block grows a "By OS" paragraph; "What these badges actually count" gains a "By OS breakdown" paragraph; "Use this service for your own package" table grows 3 rows.
Filename slug + badge label use macos (user-friendly); internal allowlist key is Darwin to match pypinfo's raw emission.
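The slug-versus-key split described above can be sketched as a small spec table. This is an illustration only: the names `OS_BADGE_SPECS`, `os_badge_filename`, and `filenames_by_system` are hypothetical and are not claimed to match the constants in collector.py. The filename pattern follows the `os-<slug>-Nd-non-ci.json` form quoted elsewhere in this PR.

```python
# Hypothetical spec table: (pypinfo system_name key, public slug).
# The key is what pypinfo emits; the slug is what appears in filenames and labels.
OS_BADGE_SPECS: tuple[tuple[str, str], ...] = (
    ("Linux", "linux"),
    ("Darwin", "macos"),  # pypinfo reports Darwin; the badge says macos
    ("Windows", "windows"),
)


def os_badge_filename(slug: str, window_days: int) -> str:
    """Build a per-OS badge filename such as os-macos-30d-non-ci.json."""
    return f"os-{slug}-{window_days}d-non-ci.json"


def filenames_by_system(window_days: int) -> dict[str, str]:
    """Map each allowlisted system_name key to the file its count lands in."""
    return {key: os_badge_filename(slug, window_days) for key, slug in OS_BADGE_SPECS}
```

Keeping the raw key and the public slug side by side in one table means the aggregation code never needs to know about the user-facing name, and the badge writer never needs to know what pypinfo emits.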

What's in the diff

src/pypi_winnow_downloads/collector.py — new constants, multi-dim pypinfo argv, restructured return shape, per-system aggregation, OS badge emission loop, PackageOutcome field, _write_health() extension.
tests/test_collector.py — 9 new test cases (argv extension, return shape, per-system aggregation, system allowlist filter, missing/empty system_name edge cases, CI filter, OS badge file emission, 11-files-per-package invariant, _health.json shape, v0.2.0 backwards-compat invariant). 1 pre-existing test renamed (8 → 11 files); 8 pre-existing tests updated to consume new return shape; _fake_runner_for test fixture extended with system_name field.
README.md — "By OS" dogfood paragraph, "By OS breakdown" prose, 3 new table rows.
CHANGELOG.md — ## [Unreleased] → ### Added bullet.
docs/superpowers/specs/2026-04-29-os-distribution-badge-design.md — design spec.
docs/superpowers/plans/2026-04-29-os-distribution-badges.md — implementation plan.

Cost

Negligible added BigQuery cost (same source table, one marginal additional column scanned). One additional badge-file-write per package per OS per run (3 file writes per package per window).

Test plan

Full pytest at 100% coverage on src/ (88 tests pass).
ruff check, ruff format --check, mypy all clean.
CI green (lint, typecheck, test, deploy-smoke).
After merge: collector run on CT 112 emits 11 files per package per window (verify via `update-collector.sh status` or direct ls).
Live README renders correctly with the 3 new badges showing real values.

Backwards-compat invariants verified

downloads-<N>d-non-ci.json filename, schema, and value unchanged for any given pypinfo response (test: test_collect_one_v0_2_0_files_unchanged_alongside_os_files).
The seven v0.2.0 installer-* files unchanged.
_health.json count, counts, window_days keys preserved verbatim (test: test_health_json_preserves_v0_2_0_fields).
Hero count = sum(by_installer.values()) regardless of system_name (test: test_run_pypinfo_filters_out_non_allowlisted_systems).

Release framing

Target release: v0.3.0 — minor bump per SemVer. Additive feature; no breaking changes to v0.2.0 contracts.

Commits (7 total — squash-merge will collapse)

`2058aa6` — feat(collector): add OS allowlist + badge specs (no behavior change)
`30cad27` — feat(collector): pypinfo group-by ci installer system + dual-dim aggregation
`04c5fe0` — feat(collector): emit per-OS badges (linux/macos/windows)
`a8f004f` — test(collector): add system_name to _fake_runner_for non-CI row
`de8ecc3` — feat(collector): _health.json gains counts_by_system per package
`2e60cd5` — docs(README): add per-OS dogfood badges + breakdown paragraph + table rows
`a21088e` — docs(CHANGELOG): record v3 OS distribution feature in Unreleased

🤖 Generated with Claude Code

Adds _SYSTEM_NAMES, _SYSTEM_ALLOWLIST, and _OS_BADGE_SPECS constants parallel to the per-installer constants. No behavior change yet — the constants are forward-declared for the v3 OS distribution feature (filenames, labels, allowlist keys). Subsequent commits wire up the multi-dim pypinfo query, per-system aggregation, badge emission, and _health.json shape. Spec: docs/superpowers/specs/2026-04-29-os-distribution-badge-design.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…egation Changes run_pypinfo() to query BigQuery on a 3-dimensional GROUP BY (`ci installer system`) so a single call yields both per-installer and per-system breakdowns. Return type changes from dict[str, int] to a TypedDict carrying both aggregates. The v0.2.0 hero-stability invariant is preserved: hero count (sum(by_installer.values())) is unchanged because the per-installer aggregation does not consider system_name. The per-system aggregation applies an independent allowlist filter (Linux/Darwin/Windows); rows with missing or non-allowlisted system_name drop out of by_system but still count toward by_installer when the installer is allowlisted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
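A minimal sketch of the dual-dimension aggregation this commit message describes, under stated assumptions. The row keys `installer_name` and `download_count`, the string form of `ci`, the three-entry installer allowlist, and the function names are all placeholders for illustration; only `system_name`, the Linux/Darwin/Windows allowlist, and the `by_installer` / `by_system` result keys come from this PR. The sketch also assumes that a row must pass the installer allowlist before it can count toward `by_system`, which is inferred from the stated "per-OS sum <= hero" gap.

```python
from typing import Any, TypedDict

# Placeholder allowlists for illustration; the real constants live in collector.py.
INSTALLER_ALLOWLIST = ("pip", "uv", "poetry")
SYSTEM_ALLOWLIST = ("Linux", "Darwin", "Windows")


class RunPypinfoResult(TypedDict):
    by_installer: dict[str, int]
    by_system: dict[str, int]


def aggregate_rows(rows: list[dict[str, Any]]) -> RunPypinfoResult:
    """Fold pypinfo-style rows into per-installer and per-system totals."""
    # Pre-seed every allowlisted key with 0 so the returned shape is deterministic.
    by_installer = {name: 0 for name in INSTALLER_ALLOWLIST}
    by_system = {name: 0 for name in SYSTEM_ALLOWLIST}
    for row in rows:
        # Assumption: CI traffic is marked with ci == "True" (string form).
        if str(row.get("ci")) == "True":
            continue
        installer = row.get("installer_name")
        if installer not in by_installer:
            continue  # non-allowlisted installers count toward neither aggregate
        count = int(row.get("download_count", 0))
        # The installer total never looks at system_name, so the hero is stable.
        by_installer[installer] += count
        # The system total has its own, independent allowlist gate.
        system = row.get("system_name")
        if system in by_system:
            by_system[system] += count
    return {"by_installer": by_installer, "by_system": by_system}


def hero_count(result: RunPypinfoResult) -> int:
    """Hero value: the sum of the per-installer totals."""
    return sum(result["by_installer"].values())
```

Under these assumptions, a row with a missing, empty, or non-allowlisted `system_name` still raises the hero count but leaves every `by_system` entry untouched.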

Three new shields.io endpoint JSON files per package per window: os-linux-Nd-non-ci.json, os-macos-Nd-non-ci.json, os-windows-Nd-non-ci.json. Color logic and label format mirror the per-installer badges (blue if count >= 10 else lightgrey; parameterized by window_days). PackageOutcome gains a counts_by_system field; v0.2.0's existing fields are preserved verbatim. Total badge files per package per window increases from 8 to 11. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
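As a rough illustration of the emission step, the sketch below writes the three per-OS files using the shields.io endpoint schema (`schemaVersion`, `label`, `message`, `color`) and the color rule stated above. The function names, the label wording, and the plain-integer message format are assumptions and may differ from what collector.py actually produces.

```python
import json
from pathlib import Path

# Hypothetical (system_name key, slug) pairs; see the Darwin -> macos note in this PR.
OS_SPECS = (("Linux", "linux"), ("Darwin", "macos"), ("Windows", "windows"))


def badge_payload(label: str, count: int) -> dict[str, object]:
    """Build a shields.io endpoint payload for one badge."""
    return {
        "schemaVersion": 1,
        "label": label,
        "message": str(count),  # assumption: the real code may format the number differently
        "color": "blue" if count >= 10 else "lightgrey",
    }


def write_os_badges(
    out_dir: Path, counts_by_system: dict[str, int], window_days: int
) -> list[Path]:
    """Write one badge file per OS and return the paths in spec order."""
    written: list[Path] = []
    for key, slug in OS_SPECS:
        path = out_dir / f"os-{slug}-{window_days}d-non-ci.json"
        label = f"{slug} ({window_days}d non-CI)"  # assumed label format
        payload = badge_payload(label, counts_by_system.get(key, 0))
        path.write_text(json.dumps(payload) + "\n", encoding="utf-8")
        written.append(path)
    return written
```

A count below 10 falls through to `lightgrey`, so a package with little traffic on one platform still gets a valid badge rather than a missing file.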

Code-quality reviewer flagged that _fake_runner_for emits rows without a system_name field, making by_system always all-zero in integration tests using this helper. Adding system_name="Linux" to the non-CI row gives the per-OS aggregates non-zero values in those tests, exercising realistic badge-color logic instead of only the lightgrey path. Verified: 86/86 tests still pass, ruff/format/mypy clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Per-package successful entries in _health.json now include counts_by_system alongside the existing counts (per-installer) field. v0.2.0 fields (count, counts, window_days) preserved verbatim — no change to existing monitoring or scripting that reads them. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
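The additive shape change can be sketched as follows. The helper name `health_entry` and its signature are hypothetical; only the key names `count`, `counts`, `window_days`, and `counts_by_system` come from this PR, and the "add only when not None" behavior follows the review's description of `_write_health()`.

```python
from typing import Any, Optional


def health_entry(
    count: int,
    counts: dict[str, int],
    window_days: int,
    counts_by_system: Optional[dict[str, int]] = None,
) -> dict[str, Any]:
    """Build one successful per-package entry for _health.json."""
    # v0.2.0 keys come first and are emitted exactly as before.
    entry: dict[str, Any] = {
        "count": count,
        "counts": counts,
        "window_days": window_days,
    }
    # The new key is purely additive and is skipped when no data is available.
    if counts_by_system is not None:
        entry["counts_by_system"] = counts_by_system
    return entry
```

A reader that only looks up `count`, `counts`, or `window_days` sees the same values either way, which is the property the backwards-compat test is meant to pin down.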

… rows Each dogfood package's block gains a 'By OS (30d, non-CI):' paragraph parallel to the existing 'By installer' paragraph (3 badges: linux/macos/windows). 'What these badges actually count' gains a 'By OS breakdown' paragraph documenting the per-OS-sum <= hero gap. 'Use this service for your own package' table grows 3 rows. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds the per-OS-badges entry under ## [Unreleased] / ### Added, matching the project's per-PR CHANGELOG rule and the v0.2.0 v2-feature entry's house style. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

codecov-commenter · 2026-04-29T23:58:50Z

Codecov Report

✅ All modified and coverable lines are covered by tests.


cmeans

QA round 1 — PASS (no findings)

Substantive feature PR — v3 OS distribution badges. Architecture cleanly mirrors the v0.2.0 installer-mix feature (PR #49) one axis over. Code reuse is high, naming patterns parallel, and the v0.2.0 backwards-compat invariants are preserved with explicit test coverage.

Code review (collector.py)

RunPypinfoResult TypedDict with by_installer and by_system keys: typed, structured, and it makes the hero/per-system distinction explicit at the type level (collector.py:26-28).
_SYSTEM_NAMES / _SYSTEM_ALLOWLIST / _OS_BADGE_SPECS parallel the existing installer counterparts (collector.py:75-87). The comment at lines 70-77 clearly documents the Darwin → macos slug-vs-allowlist-key mapping.
Pypinfo argv extension: ["ci", "installer"] → ["ci", "installer", "system"] at collector.py:175-179. Single BigQuery call, multi-dimensional GROUP BY — same source table, marginal column scanned, negligible cost.
Hero stability invariant preserved at the code level: by_installer[installer] += count (collector.py:243-244) increments regardless of system_name. by_system[system] += count (line 250-251) is gated on a separate allowlist check, so non-allowlisted/missing/empty system_name rows drop out of by_system but still contribute to by_installer. Hero = sum(by_installer.values()) (line 311) is structurally identical to v0.2.0's hero computation.
Stable shape: both by_installer and by_system are initialized to {name: 0 for name in ...} (collector.py:220-221) so the returned shape is deterministic regardless of which (installer, system) pairs had rows in the window. Tests can assert on dict equality with literals.
_collect_one per-OS loop (line 348-356) follows the same shape as the per-installer loop. 1 + len(_INSTALLER_BADGE_SPECS) + len(_OS_BADGE_SPECS) = 11 (line 369) — log line uses dynamic count derived from the spec tuples, so it stays accurate if spec tuples change.
_health.json (_write_health line 437-447) adds counts_by_system conditionally only when non-None, preserving backwards-compat for any reader that ignores unknown keys.
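The file-count expression cited in the review above (1 hero file, plus one file per installer spec, plus one per OS spec) can be checked with a short sketch. The tuple contents here are placeholders: this PR confirms seven installer files and three OS files but does not list the installer names, so they are left as numbered stand-ins.

```python
# Placeholder spec tuples; only their lengths (7 and 3) are taken from the PR.
INSTALLER_BADGE_SPECS = tuple(f"installer-{i}" for i in range(1, 8))
OS_BADGE_SPECS = ("linux", "macos", "windows")


def files_per_package() -> int:
    """One hero badge, plus one badge per installer spec and per OS spec."""
    return 1 + len(INSTALLER_BADGE_SPECS) + len(OS_BADGE_SPECS)
```

Deriving the number from the tuples rather than hard-coding 11 is what keeps the log line accurate if a spec is later added or removed.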

Test review (tests/test_collector.py)

47 test functions in this file, 88 total in the suite. Cited tests verified by name:

test_run_pypinfo_argv_groups_by_ci_installer_system — argv extension ✓
test_run_pypinfo_returns_by_installer_and_by_system — return shape ✓
test_run_pypinfo_filters_out_non_allowlisted_systems — hero stability invariant (Linux row contributes 100 to both, while FreeBSD/empty/OpenBSD rows contribute 31 to by_installer["pip"] only — total pip=131, by_system={Linux:100,Darwin:0,Windows:0}) ✓
test_run_pypinfo_excludes_ci_true_from_both_dimensions — CI filter applies to both aggregates ✓
test_run_pypinfo_handles_missing_system_name_field — missing-key edge case (older pypinfo or UA-parse failure) — row drops from by_system only ✓
test_collect_one_writes_three_per_os_badge_files — OS file emission ✓
test_collect_one_v0_2_0_files_unchanged_alongside_os_files — v0.2.0 hero/installer files byte-stable ✓
test_collect_writes_eleven_files_per_successful_package — 11-file invariant ✓
test_health_json_preserves_v0_2_0_fields — _health.json count/counts/window_days preserved verbatim ✓

Verifications run

uv sync --frozen --extra dev → succeeded (lockfile reverted to pre-PR-#55 pathspec==1.1.0, expected — branch was cut before #55 merged)
uv run pytest --cov --cov-report=term-missing → 88 passed at 100% coverage including collector.py (168 stmts, 0 missed) ✓
uv run ruff check src/ tests/ → All checks passed
uv run ruff format --check src/ tests/ → 11 files already formatted
uv run mypy src/pypi_winnow_downloads/ → Success: no issues found in 5 source files
CI: 9 pass / 2 expected-skip (changelog, on-unlabel) / 1 pending QA Gate ✓

Doc + integration checks

CHANGELOG.md: well-formed ### Added bullet under ## [Unreleased], placed above the existing ### Changed block (Keep a Changelog v1.1.0 ordering preserved). The sibling bullet for PR #54's uv-lock-refresh.yml workflow is preserved unchanged.
README.md: dogfood block has **By OS** (30d, non-CI): paragraph at line 31 with three badges (linux/macos/windows). New **By OS breakdown.** prose paragraph at line 76 explains the per-OS-sum-≤-hero gap and the Darwin → macos slug. "Use this service for your own package" table grows three rows (linux/macos/windows). Format consistent with the existing per-installer dogfood block.
Spec (docs/superpowers/specs/...os-distribution-badge-design.md): aligned with implementation — pypinfo group-by axis, allowlist keys, filename slugs, public labels, and "Hero impact: none" all match shipped code. No drift like #54 had.
No pyproject.toml change: confirmed via diff scope. Version bump to v0.3.0 stays in a separate release PR per project convention.

Backwards-compat invariants

All four invariants from PR body have explicit test coverage; spot-checked the most load-bearing one (test_collect_one_v0_2_0_files_unchanged_alongside_os_files writes 11 files including the v0.2.0 hero + 7 installer files, then asserts byte-identity for the hero file and seven installer files — that's the strongest possible byte-stability claim and it lands).

Test-plan checkboxes 4-5 (post-merge)

Collector run on CT 112 emits 11 files per package per window — needs maintainer verification once the cron picks up the change.
Live README renders correctly with the 3 new badges showing real values — needs the collector run on CT 112 to emit the new badge JSONs first.

Both are post-merge; correct gating.

Labels: Ready for QA → QA Active → Ready for QA Signoff. Awaiting maintainer QA Approved.

cmeans · 2026-04-30T00:05:01Z

Audit trail: applying Ready for QA Signoff — code review clean (architecture mirrors v0.2.0, hero stability + per-system independence both proven at code level), 88/88 pytest at 100% coverage with all 47 collector tests passing, ruff/format/mypy clean, CI fully green. CHANGELOG/README/spec all coherent. v0.2.0 backwards-compat byte-stability has explicit test coverage. Workflow: Ready for QA → QA Active → Ready for QA Signoff.

cmeans · 2026-04-30T00:10:25Z

Closing — PR was created under cmeans (Chris's personal identity) instead of the cmeans-claude-dev[bot] identity that opens PRs in this repo by convention. Reopening with the correct identity. Branch and commits are intact (commits are bot-authored already); only the PR-create action needs redoing.

cmeans-claude-dev Bot and others added 7 commits April 29, 2026 18:30

cmeans added the Ready for QA label (Dev work complete; QA can begin review) Apr 29, 2026

github-actions Bot added and removed the Awaiting CI label (Dev complete, waiting for CI/Codecov to pass before QA) Apr 29, 2026

cmeans added the QA Active label (QA is actively reviewing; Dev should not push changes) and removed the Ready for QA label Apr 30, 2026

cmeans added the Ready for QA Signoff label (QA passed; ready for maintainer final review and merge) and removed the QA Active label Apr 30, 2026

cmeans closed this Apr 30, 2026

cmeans-claude-dev Bot mentioned this pull request Apr 30, 2026

feat: per-OS download breakdown badges (v3) #57

Merged

5 tasks

cmeans wants to merge 7 commits into main from feat/os-distribution-badges
