Skip to content

backlog: file B-0182 — filter formal verification to Linux only (Aaron 2026-05-03)#1405

Merged
AceHack merged 3 commits intomainfrom
backlog/b0182-linux-only-formal-verification-tests-aaron-2026-05-03
May 3, 2026
Merged

backlog: file B-0182 — filter formal verification to Linux only (Aaron 2026-05-03)#1405
AceHack merged 3 commits intomainfrom
backlog/b0182-linux-only-formal-verification-tests-aaron-2026-05-03

Conversation

@AceHack
Copy link
Copy Markdown
Member

@AceHack AceHack commented May 3, 2026

Summary

Files B-0182 per Aaron 2026-05-03's CI-cost observation: formal verification (TLC, Alloy, Lean) is pure-math computation with no OS-specific behavior; running it on the full matrix (macos-26, ubuntu-24.04-arm, ubuntu-slim) is duplicate work.

Architecturally orthogonal-axes split

  • Formal verification (TLC, Alloy, Lean): Linux-only — pure-math, OS-independent
  • Everything else (F# unit, FsCheck, integration): full matrix — touches runtime/IO/threading, needs OS coverage

Three implementation options documented

  • Option A: runtime OS check in toolchainReady() with CI=true env-gate (recommended)
  • Option B: custom [<LinuxOnly>] xunit attribute
  • Option C: separate Tests.FSharp.Formal project (cleanest, most invasive)

P2 priority

Cost-optimization; doesn't block any current work.

Test plan

🤖 Generated with Claude Code

…Alloy/Lean OS-independent; pure-math)

Per Aaron 2026-05-03: 'we don't have to run formal verifical on all
the oses, we can run that just on linux it does not have differents
between OS like scirpt and code that touch the enviroment. just the
standard linux too don't need it to run on slim'.

Confirmed agreement: the split is orthogonal-axes correct — formal
verification (TLC, Alloy, Lean) is pure-math computation with no
OS-specific behavior; everything else (F# unit + FsCheck +
integration) touches runtime/IO/threading and needs OS coverage.

Backlog row covers:
- Problem (current duplication: gate.yml runs TLC on macos-26 +
  ubuntu-24.04-arm; low-memory.yml runs on ubuntu-slim — wasted
  CI time)
- Verbatim ask
- Scope (filter to standard ubuntu-24.04 only)
- Three implementation options:
  - Option A: runtime OS check in toolchainReady() with CI=true
    env-gate (recommended; minimal change; preserves dev-local
    validation)
  - Option B: custom [LinuxOnly] xunit attribute (more structured)
  - Option C: separate Tests.FSharp.Formal project (cleanest
    separation; most invasive)
- Effort estimate: M (1-2 days)

P2 priority: cost-optimization, not blocking.

Composes with #1383 math-proofs assessment + B-0017 dashboard
(future CI-cost metric).
Copilot AI review requested due to automatic review settings May 3, 2026 13:17
@AceHack AceHack enabled auto-merge (squash) May 3, 2026 13:17
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new P2 backlog row documenting a CI cost-optimization idea: run formal-verification workloads (TLC, Alloy, Lean) on standard Linux only, rather than across the full OS/arch matrix.

Changes:

  • Adds backlog row B-0182 describing the problem, scope, and trade-offs for Linux-only formal verification runs.
  • Documents three implementation options (runtime gating, xUnit attribute, separate test project) and a recommendation.

…im discriminator + MD032

Three #1405 findings:

1. **Fact count**: said '9 [Fact] entries' but file has 10 (will
   drift further). Removed hard count; describes purpose instead.

2. **Option A misses ubuntu-slim**: ubuntu-slim is still Linux, so
   OS-only check doesn't skip it. Updated Option A to be a dual-
   axis check: OS (skip non-Linux) AND runner-class (skip
   ubuntu-slim via GITHUB_WORKFLOW=low-memory env-var
   discriminator). Cleaner than RUNNER_NAME since it survives
   runner-name churn.

3. **MD032 violation at line 100**: 'M (1-2 days):' line was
   immediately followed by list items without a blank line.
   Added blank line per markdownlint MD032/blanks-around-lists.
@AceHack AceHack merged commit f715861 into main May 3, 2026
22 checks passed
@AceHack AceHack deleted the backlog/b0182-linux-only-formal-verification-tests-aaron-2026-05-03 branch May 3, 2026 13:25
AceHack added a commit that referenced this pull request May 3, 2026
…ster + cache-clobber discipline encoded (#1408)

Substantial multi-tick session shard. 18 PRs touched (#1383 + #1387
+ #1392-#1407 inclusive); 14 merged + 4 in-flight as of shard time.

**Math-proofs assessment progress** (#1383 outstanding-work matrix):
- A1+A2 → A-with-CI ✓ (#1394 Lean lake-build workflow)
- A4 registry rows ✓ (#1393)
- B1 → 2 of 4 deferred specs in CI ✓ (#1397 DbspSpec + #1401
  CircuitRegistration B-0180 closed)
- B2 Alloy → A ✓ (#1396 silent-no-op spec-path fix)
- B4 Semgrep → A ✓ (correction)
- Peer-review email template ✓ (#1387)
- Phase 0 substrate-discovery PoC ✓ (#1392)
- Stryker config-fix ✓ (#1395; CI wire deferred)
- 3 broken-spec backlog rows filed ✓ (#1398)

**Cache-clobber silent-bug class discovered + fully encoded:**
B-0180 fix passing locally + failing CI → verify-then-claim
identified gate.yml + low-memory.yml caching whole tools/tla and
tools/alloy directories. Fix cluster: #1403 (gate.yml) + #1404
(low-memory.yml + audit-ci-cache-paths.ts) + #1406 (CI lint gate)
+ #1407 (memory file + bug-locus disambiguation per Aaron's 'real
github bug?' question — answer: usage-bug, not tool-bug).

**Other substrate work:** #1399 BACKLOG.md regen, #1400 .ts/.sh
parity bug, #1402 assessment matrix doc update, #1405 B-0182
backlog row (Linux-only formal verification — orthogonal-axes
split per Aaron 2026-05-03).

**Discipline lessons captured:** chat-is-assertion-channel,
substrate-corrections-cluster, search-first-before-architectural-
expansion, verify-then-claim CI fidelity, documentation-is-
current-state-not-history.

Carved sentence: 'When a lucky catch surfaces a class of bug,
build the structural fix that eliminates the luck — audit + lint
gate + carved-sentence rule + memory file.'
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants