docs: test-classification taxonomy — Amara 18th-ferry §C operationalized#339

Merged
AceHack merged 2 commits into main from docs/test-classification-amara-18th-ferry
Apr 24, 2026

@AceHack AceHack commented Apr 24, 2026

Summary

Research-grade proposal formalizing the 5-category test taxonomy from Amara 18th-ferry Part 1 §C + Part 2 correction #10. Sixth of the ten 18th-ferry corrections — specifically the CI-test-classification correction.

Five categories

  1. Deterministic unit tests (PR gate; no randomness)
  2. Seeded property tests (PR gate; fixed-seed replay)
  3. Statistical smoke tests (nightly/extended; do NOT gate PRs)
  4. Formal / model tests (PR gate or separate track)
  5. Quarantined / known-flaky (not gated; migration path required)

Sharder flake as worked example

BACKLOG #327 sharder flake used as the running worked example. Remedy order: measure variance → seed-lock → widen if data justifies → nightly only if stochastic is essential. Do NOT blind-widen or blind-quarantine.

CI split proposed (advisory)

  • PR-gate workflow (deterministic-only)
  • Nightly-sweep workflow (100+-seed tests; emits seed-results.csv, failing-seeds.txt, distributions.json)
  • Quarantined workflow (weekly verbose logging)
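
A minimal sketch of what the nightly sweep could emit, assuming nothing beyond the three artifact names listed above. It is written in Python for brevity (the repo's tests are F#); the function names, the CSV columns, and the JSON fields are all hypothetical, since the proposal does not fix a schema in this description.

```python
import csv
import json
import random
import statistics
from pathlib import Path


def detection_rate(seed: int, trials: int = 200) -> float:
    """Hypothetical stand-in for a statistical test body.

    A real sweep would run the test under this seed; here we only
    simulate a seed-dependent success rate.
    """
    rng = random.Random(seed)  # seeded, so each seed replays exactly
    hits = sum(rng.random() < 0.93 for _ in range(trials))
    return hits / trials


def nightly_sweep(seeds, threshold: float, out_dir: Path) -> dict:
    """Run every seed and write the three artifacts named in the proposal.

    The file layouts are assumptions, not the proposal's schema.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    rows = [(seed, detection_rate(seed)) for seed in seeds]
    failing = [seed for seed, rate in rows if rate < threshold]

    # One row per seed, so a red run can be replayed seed by seed.
    with open(out_dir / "seed-results.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["seed", "rate", "passed"])
        for seed, rate in rows:
            writer.writerow([seed, f"{rate:.4f}", rate >= threshold])

    # Failing seeds only, one per line, ready to paste into a repro.
    (out_dir / "failing-seeds.txt").write_text(
        "".join(f"{seed}\n" for seed in failing)
    )

    # Summary of the observed spread: the "measure variance" step.
    rates = [rate for _, rate in rows]
    summary = {
        "seeds": len(rows),
        "failures": len(failing),
        "mean": statistics.mean(rates),
        "stdev": statistics.stdev(rates),
        "min": min(rates),
        "max": max(rates),
    }
    (out_dir / "distributions.json").write_text(json.dumps(summary, indent=2))
    return summary
```

Calling `nightly_sweep(range(100), 0.90, Path("artifacts"))` would run 100 seeds against a 0.90 threshold. Because every run is seeded, repeating the sweep produces the same files, which is what makes a failing seed replayable.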

Scope

  • Research-grade only. Promotion to factory discipline requires ADR.
  • No code changes. No workflow changes. No test migrations.
  • Composes with docs/research/test-organization.md (layout) + docs/definitions/KSK.md (Oracle trusts CI-backed stats).

Test plan

  • Markdownlint clean locally.
  • Single new file; no surface impact.
  • Markdownlint passes on CI.

🤖 Generated with Claude Code

Research-grade proposal formalizing the 5-category test
taxonomy from Amara 18th-ferry Part 1 §C ("CI Testing &
Governance Policy") + Part 2 correction #10 (sharder —
measure before widen).

Five categories:
1. Deterministic unit tests (PR gate; no randomness)
2. Seeded property tests (PR gate; fixed-seed replay)
3. Statistical smoke tests (nightly/extended; assert
   statistical properties; do NOT gate PRs)
4. Formal / model tests (PR gate or separate track)
5. Quarantined / known-flaky (not gated; migration path
   required)

Sharder flake (BACKLOG #327) used as the running worked
example — it is a category-3 statistical test masquerading
as category-1 deterministic. Remedy order: measure
observed variance → seed-lock if intent allows → widen
threshold if data justifies → move to nightly only if
stochastic is essential. Do NOT blind-widen or blind-
quarantine.

CI split proposed (advisory, not yet implemented):
- PR-gate workflow (deterministic-only, excludes
  [<Statistical>] and tests/Quarantine/)
- Nightly-sweep workflow (100+-seed statistical tests;
  emits seed-results.csv, failing-seeds.txt,
  distributions.json artifacts)
- Quarantined workflow (weekly, verbose logging, issues
  opened on tests that start passing)

Sixth queued correction from the 18th-ferry operationalization
list; remains research-grade until ADR promotes. Composes with
docs/research/test-organization.md (layout), BACKLOG #327
(sharder), docs/definitions/KSK.md (Oracle trusts statistical
evidence with CIs), Otto-105 graduation cadence.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 24, 2026 08:56
@AceHack AceHack enabled auto-merge (squash) April 24, 2026 08:56
@AceHack AceHack merged commit 05a381a into main Apr 24, 2026
10 checks passed
@AceHack AceHack deleted the docs/test-classification-amara-18th-ferry branch April 24, 2026 09:02
Copilot AI left a comment

Pull request overview

Adds a research-grade documentation proposal defining a 5-category test-classification taxonomy intended to clarify CI gating discipline (PR-gate deterministic vs. scheduled statistical/quarantined), using the sharder flake as the motivating example.

Changes:

  • Introduces a new docs/research/test-classification.md proposal defining five test categories and migration rules.
  • Documents an advisory CI workflow split (PR gate vs nightly sweep vs optional quarantine run) and proposed attribute/directory conventions.
  • Adds cross-references intended to connect this taxonomy to existing governance, backlog, and verification docs.

the migration path.
- **Discovery hint.** Custom `[<Quarantined>]`
attribute, or `[<Fact(Skip="reason")>]`, or the
`tests/Quarantine/` directory (new convention).

Copilot AI Apr 24, 2026

P2: The quarantine-directory path is inconsistent: here it says tests/Quarantine/, but later the doc proposes tests/Tests.FSharp/Quarantine/*.Tests.fs. Please pick one convention and use it consistently throughout to avoid confusion during migration.

Suggested change
`tests/Quarantine/` directory (new convention).
`tests/Tests.FSharp/Quarantine/` directory (new convention).

Comment on lines +18 to +19
bounds (e.g. `CartelToy.Tests.fs` ≥90% detection over
100 seeds).

Copilot AI Apr 24, 2026

P1: CartelToy.Tests.fs is used as an example of an existing statistical test, but there is currently no tests/Tests.FSharp/Simulation/CartelToy.Tests.fs (and no Simulation/ folder) in the repo. Either update the example to point at an in-tree test file, or label it as a planned/future test rather than an existing one.

Suggested change
bounds (e.g. `CartelToy.Tests.fs` ≥90% detection over
100 seeds).
bounds (e.g. a planned/future `CartelToy.Tests.fs`
asserting ≥90% detection over 100 seeds).

Comment on lines +109 to +110
(`tools/alloy/`, `tools/lean4/`, `tools/formal/`)
separate from F# test directories.

Copilot AI Apr 24, 2026

P2: This discovery hint lists tools/formal/, but that directory doesn’t exist in the current repo (formal tooling lives under e.g. tools/tla/, tools/Z3Verify/, tools/alloy/, tools/lean4/). Consider updating the path list to match the actual tree so the hint is actionable.

Suggested change
(`tools/alloy/`, `tools/lean4/`, `tools/formal/`)
separate from F# test directories.
(`tools/tla/`, `tools/Z3Verify/`, `tools/alloy/`,
`tools/lean4/`) separate from F# test directories.

Comment on lines +20 to +23
- Property tests without seed-locking that assert
statistical properties (e.g.
`SharderInfoTheoreticTests.Uniform` — the flake
tracked in BACKLOG #327).

Copilot AI Apr 24, 2026

P1: This characterizes SharderInfoTheoreticTests.Uniform as “without seed-locking”, but the current test implementation uses an explicit Random 42 / fixed seeds. If the test is still non-deterministic in practice, the doc should call out the actual nondeterminism source (e.g., process-randomized hashing like HashCode.Combine) rather than implying the RNG seed is missing.

Suggested change
- Property tests without seed-locking that assert
statistical properties (e.g.
`SharderInfoTheoreticTests.Uniform` — the flake
tracked in BACKLOG #327).
- Property tests with fixed RNG seeds but another
nondeterminism source that still assert statistical
properties (e.g. `SharderInfoTheoreticTests.Uniform`
— the flake tracked in BACKLOG #327; if the variance
is from process-level hashing or similar, document
that source explicitly rather than calling it
"without seed-locking").
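
To illustrate the distinction this comment draws, here is a small sketch in Python (not the repo's F# sharder; all names are hypothetical). Python's built-in `hash()` is salted per process for strings, which is the same class of problem the comment attributes to `HashCode.Combine`: a fixed RNG seed alone does not make a hash-based distribution test repeatable. Using a stable digest for shard assignment removes that second source of variation.

```python
import hashlib
import random
from collections import Counter


def stable_shard(key: str, shard_count: int) -> int:
    """Map a key to a shard with a digest that is identical in every process.

    Built-in hash(key) is salted per process for str, so using it here
    would make the histogram vary between runs even with a fixed seed.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def shard_histogram(seed: int, keys: int, shard_count: int) -> Counter:
    """Generate keys from a fixed seed and count how many land on each shard."""
    rng = random.Random(seed)
    generated = (f"key-{rng.getrandbits(64):016x}" for _ in range(keys))
    return Counter(stable_shard(k, shard_count) for k in generated)
```

With both the key generator and the hash pinned, `shard_histogram(42, 1000, 8)` returns the same counts in every process, so any uniformity assertion over it is deterministic rather than statistical.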

ferry.md`.
- `docs/research/test-organization.md` — layout
discipline (28-files-flat → folder grouping).
- `docs/BACKLOG.md` — PR #327 sharder flake row.

Copilot AI Apr 24, 2026

P1: This cross-reference claims docs/BACKLOG.md contains a “PR #327 sharder flake row”, but docs/BACKLOG.md currently has no matching entry. Please update this link target to something that exists (or add the referenced row) so readers can follow the trail.

Suggested change
- `docs/BACKLOG.md` — PR #327 sharder flake row.
- `docs/BACKLOG.md` — migration / misclassification
tracking rows referenced by this proposal.

Comment on lines +3 to +8
**Status:** research-grade proposal (pre-v1). Origin: Amara
18th courier ferry, Part 1 §C ("CI Testing & Governance
Policy") + Part 2 correction #1 (precision wording) +
correction #10 (sharder — measure before widen). Author:
architect review. Scope: formalizes a 5-category test
taxonomy and the "PR gate = deterministic-only" discipline.

Copilot AI Apr 24, 2026

P1: This doc uses direct contributor names in the body (e.g., “Amara”, “Aaron”). Per the operational standing rule in docs/AGENT-BEST-PRACTICES.md (“No name attribution in code, docs, or skills”), docs should use role references (e.g., “human maintainer”, “architect”) except in the allowed locations. Please rewrite the doc to use role-refs and keep names confined to the permitted areas.

Comment on lines +310 to +313
- Amara 18th ferry — Part 1 §C + Part 2 #1 + #10.
`docs/aurora/2026-04-24-amara-calibration-ci-
hardening-deep-research-plus-5-5-corrections-18th-
ferry.md`.

Copilot AI Apr 24, 2026

P1: The cross-reference to docs/aurora/2026-04-24-amara-calibration-ci-hardening-deep-research-plus-5-5-corrections-18th-ferry.md appears to be broken (no such file currently exists under docs/aurora/). Either update this to the correct in-tree document, or mark it as a future/pending artifact and link to the tracking issue/PR instead.

Suggested change
- Amara 18th ferry — Part 1 §C + Part 2 #1 + #10.
`docs/aurora/2026-04-24-amara-calibration-ci-
hardening-deep-research-plus-5-5-corrections-18th-
ferry.md`.
- Amara 18th ferry — Part 1 §C + Part 2 #1 + #10;
source memorandum / pending artifact, not currently
checked into `docs/aurora/`.

Comment on lines +281 to +286
- **`docs/definitions/KSK.md`.** KSK's advisory flow
(Detection → Oracle → KSK → Action) benefits from
category-3 statistical evidence for "Detection" —
the Oracle and KSK layers trust statistical smoke
output with confidence intervals, not single-seed
point estimates.

Copilot AI Apr 24, 2026

P1: docs/definitions/KSK.md is referenced as if it exists, but there is no docs/definitions/ directory in the repo right now. Consider linking to the existing KSK material that’s actually in-tree, or explicitly labeling this as a planned doc and linking to the backlog item that tracks creating it.

AceHack added a commit that referenced this pull request Apr 24, 2026
…-ferry §B + §F + corrections #2 #7 #9 (#342)

Research-grade design doc for the Stage-2 rung of Amara's
corrected promotion ladder. Specifies: (a) placement under
src/Experimental/CartelLab/ (not src/Core/ — that's Stage 4);
(b) MetricVector type with PLV magnitude AND offset split
(correction #6); (c) INullModelGenerator interface +
Preserves/Avoids table columns; (d) IAttackInjector
forward-looking interface (Stage 3); (e) Wilson-interval
reporting contract with {successes, trials, lowerBound,
upperBound} schema (correction #2 — no more "~95% CI ±5%"
handwave); (f) RobustZScoreMode with Hybrid fallback
(correction #7 — percentile-rank when MAD < epsilon);
(g) explicit artifact-output layout under artifacts/
coordination-risk/ with five files + run-manifest.json
(correction #9).
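
As an illustration of item (e), the sketch below computes a Wilson score interval and returns it with the four schema fields named above. It is a Python sketch, not the design doc's F# contract; the function name, the default 95% level (z ≈ 1.96), and the clamping to [0, 1] are assumptions.

```python
import math
from typing import NamedTuple


class WilsonInterval(NamedTuple):
    # Field names follow the schema quoted in the commit message.
    successes: int
    trials: int
    lowerBound: float
    upperBound: float


def wilson_interval(successes: int, trials: int, z: float = 1.959964) -> WilsonInterval:
    """Wilson score interval for a binomial proportion.

    z defaults to the two-sided 95% normal quantile (an assumption here).
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    p = successes / trials
    z2 = z * z
    denom = 1.0 + z2 / trials
    center = (p + z2 / (2.0 * trials)) / denom
    half = z * math.sqrt(p * (1.0 - p) / trials + z2 / (4.0 * trials * trials)) / denom
    # Clamp to guard against tiny floating-point overshoot at p = 0 or p = 1.
    return WilsonInterval(successes, trials,
                          max(0.0, center - half), min(1.0, center + half))
```

For 90 successes in 100 trials this gives an interval of roughly 0.826 to 0.945, which is asymmetric around 0.90. That asymmetry is the information a "±5%" shorthand discards.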

6-stage promotion path (0 doc / 1 ADR / 2.a skeleton /
2.b full null-models + first attack / 3 attack suite /
4 Core/NetworkIntegrity / 5 Aurora-KSK) matches Amara's
corrected ladder and Otto-105 cadence.

Doc-only change; no code, no tests, no workflow, no
BACKLOG tail touch (avoids positional-conflict pattern
that cost #334 → #341 re-file this session).

This is the 7th of 10 18th-ferry operationalizations:
- #1/#10 test-classification (#339)
- #2 Wilson-interval design specified (this doc)
- #6 PLV phase-offset shipped (#340)
- #7 MAD=0 Hybrid mode specified (this doc)
- #9 artifact layout specified (this doc)
- #4 exclusivity already shipped (#331)
- #5 modularity relational already shipped (#324)
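
The Hybrid mode from correction #7 (percentile-rank when MAD < epsilon) could look roughly like the Python sketch below. The design doc's exact scaling and rank definition are not reproduced in this thread, so the 0.6745 consistency factor, the mid-rank percentile, the epsilon default, and all names are assumptions.

```python
import statistics
from bisect import bisect_left, bisect_right


def robust_z_hybrid(value: float, baseline: list, epsilon: float = 1e-9):
    """Return (mode, score) for value against a baseline sample.

    Uses the MAD-based robust z-score when MAD is usable, and falls back
    to a percentile rank in [0, 1] when MAD is below epsilon.
    """
    median = statistics.median(baseline)
    mad = statistics.median([abs(x - median) for x in baseline])
    if mad >= epsilon:
        # 0.6745 is the conventional factor that makes MAD comparable
        # to a standard deviation under normality (assumed here).
        return "robust-z", 0.6745 * (value - median) / mad
    # MAD is (near) zero: more than half the baseline equals the median,
    # so dividing by it would blow up. Rank the value instead.
    ordered = sorted(baseline)
    below = bisect_left(ordered, value)
    equal = bisect_right(ordered, value) - below
    return "percentile-rank", (below + 0.5 * equal) / len(ordered)
```

Returning the mode alongside the score keeps the two scales from being mixed downstream: a robust z of 0.9 and a percentile rank of 0.9 mean very different things.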

Remaining: Wilson-interval IMPLEMENTATION (waits on #323 +
Stage 2.a), MAD=0 Hybrid IMPLEMENTATION (waits on #333 +
Stage 2.a), conductance-sign doc (waits on #331), Stage-2.a
skeleton itself.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
AceHack added a commit that referenced this pull request Apr 24, 2026
…rections (#344)

Dedicated absorb of Amara's 19th courier ferry per CC-002
close-on-existing discipline. Scheduled Otto-164 → executed
Otto-165, following 7-ferry precedent (PRs #196 / #211 /
#219 / #221 / #235 / #245 / #259 / #330 / #337).

Two-part ferry: Part 1 deep-research DST audit (12
sections: rulebook, 12-row entropy scan, dependency audit,
7-row simulation-surface coverage, retry audit, CI
determinism, seed discipline, Cartel-Lab DST readiness,
KSK/Aurora DST readiness, state-of-the-art comparison,
10-row PR roadmap, what-not-to-claim caveats; Mermaid CI
diagram + Gantt timeline). Part 2 Amara's own 5.5-Thinking
correction pass (7 required corrections, per-area grade
table with B- overall, revised 6-PR roadmap with titles
locked, DST-held + FoundationDB-grade acceptance criteria,
copy-paste Kenji summary).

Key findings:
- DST grade: B- (strong architecture, partial impl)
- Blockers: DiskBackingStore bypasses simulation (D-grade
  filesystem simulation), no ISimulationDriver, Task.Run
  ambient ThreadPool risk, no seed artifacts / no swarm
  harness
- 4 of 12 Part-1 sections already align with shipped
  substrate:
  - §6 test classification → PR #339
  - §7 artifact layout → PR #342 design
  - §8 Cartel-Lab stage discipline → PRs #330/#337/#342
  - §9 KSK advisory-only → PR #336 + Otto-140..145 memory

6-PR revised roadmap queued as graduation candidates:
1. DST scanner + accepted-boundary registry (new tool +
   policy docs + workflow)
2. Seed protocol + CI artifacts
3. Sharder reproduction (NOT widen) — reinforces 18th #10
4. ISimulationDriver + VTS promotion to core
5. Simulated filesystem (DiskBackingStore rewrite)
6. Cartel-Lab DST calibration (aligns with #342 design)

Plus: push-with-retry.sh retry-audit finding; DST-held +
FDB-grade criteria lock.

GOVERNANCE §33 four-field header (Scope / Attribution /
Operational status / Non-fusion disclaimer). Amara verdict
preserved: "strong draft / not canonical yet."

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
AceHack added a commit that referenced this pull request Apr 24, 2026
…mara 19th-ferry correction #6) (#346)

Research-grade criteria doc locking two acceptance bars:

1. DST-held — minimum: 6 items (seeds committed, failing
   tests emit seed+params, bit-for-bit local-vs-CI
   reproducibility, broad sweeps nightly-not-gating,
   zero unreviewed entropy hits in main-path, boundaries
   either simulated or explicitly accepted).
2. FoundationDB-grade DST candidate — aspirational: 8
   surfaces (simulated FS, simulated network,
   deterministic task scheduler, fault injection/buggify,
   swarm runner, replay artifact storage, failure
   minimization/shrinking, end-to-end scenario from one
   seed).

Maps 19th-ferry revised-roadmap PRs to which criteria
items each addresses. Captures Amara's per-area grade
table (overall B-) as "Amara's assessment, not factory-
certified."

Explicit promotion path: doc stays research-grade until
PR 1 of the 19th-ferry revised roadmap lands an ADR
promoting the DST-held bar to factory discipline; at
that point criteria migrate to docs/DST-COMPLIANCE.md
top-level.

No graduation claims DST-held today; graduations reference
this doc as target without self-certification.

Composes with test-classification.md (PR #339; supports
items 1+2+4), calibration-harness-stage2-design.md (PR
#342; artifact schema supports item 2), Amara 19th ferry
(PR #344 absorb; source of criteria).

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
AceHack added a commit that referenced this pull request Apr 24, 2026
Addresses Amara 18th-ferry correction #6: PLV = 1 can mean
anti-phase locking, not same-time synchronization. Downstream
detectors that rely on "PLV = 1 => synchronized" misread
anti-phase coordinators as same-time coordinators.

Two new functions in `TemporalCoordinationDetection`:

- `meanPhaseOffset phasesA phasesB : double option`
  Returns the argument (angle) of the mean complex phase-
  difference vector whose magnitude is the PLV. Returns
  None when series are empty, mismatched-length, or when
  the mean vector has effectively zero magnitude (1e-12
  floor) — in which case direction is mathematically
  undefined.

- `phaseLockingWithOffset phasesA phasesB : struct (double * double) option`
  Returns both magnitude and offset in one sequence pass.
  Zero-magnitude case: magnitude near 0, offset = nan;
  near-zero magnitude is the caller's reliable "offset is
  undefined" signal.

Existing `phaseLockingValue` contract unchanged; new primitives
are additive. Downstream `Graph.coordinationRiskScore*` and any
other detector consuming PLV can now add a separate offset-
based term instead of collapsing both into one scalar (Amara's
explicit recommendation in correction #6).
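
The relationship between the magnitude and the offset can be sketched in a few lines of Python. This is not the F# implementation: the function names are Python-style stand-ins for the ones described above, and only the 1e-12 floor and the a-minus-b convention are taken from this commit message.

```python
import cmath
import math
from typing import Optional, Sequence, Tuple

PHASE_EPSILON = 1e-12  # floor below which the direction is treated as undefined


def _mean_phase_diff_vector(a: Sequence[float], b: Sequence[float]) -> Optional[complex]:
    """Mean of exp(i * (a_k - b_k)); None for empty or mismatched input."""
    if len(a) == 0 or len(a) != len(b):
        return None
    return sum(cmath.exp(1j * (x - y)) for x, y in zip(a, b)) / len(a)


def phase_locking_value(a, b) -> Optional[float]:
    """PLV: the magnitude of the mean vector, with no threshold applied."""
    mean = _mean_phase_diff_vector(a, b)
    return None if mean is None else abs(mean)


def mean_phase_offset(a, b) -> Optional[float]:
    """Angle of the mean vector; None when the magnitude is effectively zero."""
    mean = _mean_phase_diff_vector(a, b)
    if mean is None or abs(mean) < PHASE_EPSILON:
        return None
    return math.atan2(mean.imag, mean.real)


def phase_locking_with_offset(a, b) -> Optional[Tuple[float, float]]:
    """Both values from one accumulation; offset is nan when undefined."""
    mean = _mean_phase_diff_vector(a, b)
    if mean is None:
        return None
    magnitude = abs(mean)
    if magnitude < PHASE_EPSILON:
        return magnitude, math.nan
    return magnitude, math.atan2(mean.imag, mean.real)
```

In-phase and anti-phase series both yield a magnitude of about 1, and only the offset (about 0 versus about pi in absolute value) tells them apart, which is the point of correction #6.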

8 new xUnit tests covering:
- Identical series (offset = 0)
- Constant pi/4 offset (observed = -pi/4, a-minus-b convention)
- Anti-phase series (magnitude 1, offset = pi) — the correction
  #6 regression test, contrasted against in-phase (offset 0)
  with identical magnitude
- Uniformly-distributed differences (zero-magnitude => None)
- Empty / mismatched-length / single-element edge cases
- phaseLockingWithOffset magnitude matches phaseLockingValue
  (consistency property preventing silent detector divergence)
- phaseLockingWithOffset zero-magnitude returns (near-zero, nan)
- phaseLockingWithOffset returns None on empty/mismatched

All 37 TemporalCoordinationDetection tests pass locally.
0 Warnings / 0 Errors build.

6th of the 10 18th-ferry corrections operationalized this
week (after test-classification doc in #339, parser-tech in
#338). Remaining: Wilson CIs in CartelToy tests (needs #323
landed), MAD=0 percentile-rank fallback (needs #333 landed),
conductance-sign doc (needs #331 landed), artifact-output
layout (Stage-2 with calibration harness).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
AceHack added a commit that referenced this pull request Apr 24, 2026
…340)

* core: PLV mean phase offset — 19th graduation (Amara 18th-ferry #6)

* fix(#340): refactor shared accumulation + 5 review-thread fixes (Otto-216)

Active PR-resolve-loop on #340 (PLV mean phase offset).

1. Sentinel-default in test (thread 59WGi9): replaced
   Option.defaultValue -1.0 pattern in the
   phaseLockingWithOffset-magnitude-matches-phaseLockingValue
   consistency test with explicit pattern-match + fail
   on None. Sentinel form would silently pass the
   equality assertion if BOTH primitives returned None,
   masking regressions.

2. Broken ferry cross-reference path (thread 59WGjn):
   doc comment referenced docs/aurora/2026-04-24-amara-
   calibration-ci-hardening-deep-research-plus-5-5-
   corrections-18th-ferry.md which doesn't exist on
   main (only 7th / 17th / 19th ferries landed as
   standalone docs). Rewrote provenance to describe the
   ferry topically + cross-reference the related 19th-
   ferry DST audit that IS in the repo.

3. Misleading "same PLV-magnitude floor" wording
   (thread 59WGj4): doc said meanPhaseOffset's
   zero-magnitude check uses "the same PLV-magnitude
   floor" — phaseLockingValue has NO floor (returns
   values arbitrarily close to 0). Fixed: clarified
   that the phasePairEpsilon floor applies ONLY to
   the offset-undefined decision; phaseLockingValue
   returns magnitude without threshold.

4. Name-attribution in doc comment (thread 59WGkP):
   "Aaron + Amara 11th ferry" replaced with "the 11th
   ferry" per factory role-reference convention. Audit-
   trail surfaces (commit messages, tick-history, memory)
   retain direct attribution; code/doc comments use
   role references.

5. Duplicate sin/cos accumulation across 3 functions
   (thread 59WGkn): extracted private helpers
   phasePairEpsilon + meanPhaseDiffVector. All three
   functions (phaseLockingValue, meanPhaseOffset,
   phaseLockingWithOffset) now route through the
   shared accumulator. Eliminates drift risk — one
   function can no longer silently diverge from the
   others on accumulation or threshold.
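
The masking problem in item 1 is easy to reproduce outside F#. The Python sketch below uses hypothetical helper names and `None` in place of an F# `option`; it shows how a sentinel default lets two missing values compare equal, and how an explicit check does not.

```python
from typing import Optional


def sentinel_check(a: Optional[float], b: Optional[float]) -> bool:
    """The masked form: both missing values collapse to the same sentinel."""
    sentinel = -1.0
    left = sentinel if a is None else a
    right = sentinel if b is None else b
    return abs(left - right) < 1e-12


def explicit_check(a: Optional[float], b: Optional[float]) -> bool:
    """The fixed form: a missing value fails the check outright."""
    if a is None or b is None:
        raise AssertionError("expected a value, got None")
    return abs(a - b) < 1e-12
```

`sentinel_check(None, None)` returns `True`, so a regression that made both primitives return nothing would still pass; `explicit_check(None, None)` raises instead.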

Build: 0 Warning(s) / 0 Error(s). All 37 TemporalCoordinationDetection
tests pass.

All 5 threads to be replied to via GraphQL as the next step.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(#340): 2 review threads (stale ferry path + atan2 range)

Thread 59Yqkl (P1) — stale provenance reference:
  The doc cited `docs/aurora/2026-04-24-amara-temporal-
  coordination-detection-cartel-graph-influence-surface-
  11th-ferry.md`, but the 11th ferry has not yet landed
  under `docs/aurora/` (it's queued in the Otto-105
  operationalize cadence; PR #296 is its pending absorb).
  Replaced with the intent-preserving form: role references
  ("external AI collaborator's 11th courier ferry") plus a
  pointer at the MEMORY.md queue entry, so the provenance
  survives regardless of when the file-path question
  resolves. Also dropped the direct first-name so this
  factory-produced doc-comment tracks the name-attribution
  discipline.

Thread 59YqlC (P2) — atan2 range correction:
  Doc said `(-pi, pi]` but `System.Math.Atan2` is documented
  as `[-pi, pi]` (both endpoints reachable under IEEE-754
  signed-zero semantics: atan2(0, -1) = +pi,
  atan2(-0, -1) = -pi). Updated the doc to match the
  implementation. Behaviour unchanged.
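
The signed-zero behaviour can be checked directly. The snippet below uses Python's `math.atan2`, which follows the C library semantics; it illustrates the IEEE-754 point made above and is not a test of `System.Math.Atan2` itself.

```python
import math

# On the negative real axis, the sign of a zero y selects the endpoint.
upper = math.atan2(0.0, -1.0)   # y = +0 gives +pi
lower = math.atan2(-0.0, -1.0)  # y = -0 gives -pi

print(upper, lower)
```

Both endpoints are reachable, so the closed range `[-pi, pi]` is the accurate description.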

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
AceHack added a commit that referenced this pull request Apr 24, 2026
…-161 docs ambiguity

Design-only proposal per Otto-165 offer. Aaron Otto-161
macOS-everywhere directive + Otto-164 pricing-docs ambiguity
(macos-14 is standard-runner-type per about-github-hosted-
runners; billing page lists it at $0.062/min in the same
table as Linux/Windows without marking public-only).

Instead of resolving the ambiguity (can't — docs genuinely
contradict each other), propose a THIRD PATH that works in
either interpretation:

- PR gate stays ubuntu-22.04 only (unambiguously free on
  public repos).
- New nightly-cross-platform.yml runs matrix [ubuntu-22.04,
  windows-2022, macos-14] on cron '0 9 * * *' (09:00 UTC,
  off-the-hour to avoid scheduler stampede).
- Cost model: worst case ~$28/month/repo if macOS is billed;
  $0 if free. Either way, cadence caps exposure.
- Fork-scoping: `if: github.repository == canonical OR
  workflow_dispatch OR pull_request-to-this-file` prevents
  scheduled trigger firing on contributor forks (would burn
  fork-owner's personal-account minutes).
- No-alerting first cut (observation-only); issue-opening
  on red is a later enhancement.

Phased rollout:
- Phase 0 (now): this design doc, no YAML.
- Phase 1: Aaron signs off on cost tradeoff.
- Phase 2: land workflow on Zeta.
- Phase 3: observe 7 nightly runs for signal.
- Phase 4 (30 days): parallel lucent-ksk landing per
  Otto-140 rewrite authority, OR drop macOS if no signal +
  worst-case billing, OR expand matrix if best-case
  confirmed.

Rollback: delete macos-14 from matrix (one-line diff) or
delete workflow file entirely. No impact on gate.yml.

Composes with FACTORY-HYGIENE row #51 (unblocks enforcement
mode), docs/BACKLOG.md row ~2471 (Otto-161 declined + this
as alternative), docs/research/test-classification.md (PR
#339; category-3 nightly pattern).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
AceHack added a commit that referenced this pull request Apr 24, 2026
…-161 docs ambiguity (#345)

* docs: nightly cross-platform workflow design — third path around Otto-161 docs ambiguity

* fix(#345): 6 review threads — name attribution + cron + YAML + fork-scheduling + BACKLOG ref

- thread Wkcz (line 327): removed broken `memory/feedback_ksk_naming_...`
  reference (factory-personal memories live in `~/.claude/projects/<slug>/memory/`,
  not in-repo); paraphrased the rewrite-authority rule in §10 without
  promising an in-repo path.

- thread WkdI (line 7): purged name-attribution tokens per Otto-220
  code-comments-not-history + doc-comment-history-audit lint
  (PR #363). All "Aaron" / "Otto-NN" / "Amara" / "Max" references
  rewritten to role references ("human maintainer", "prior-contributor",
  "autonomous loop", "initial-starting-point contributor").

- thread WkdX (line 163): cron changed `0 9 * * *` → `7 9 * * *`
  (09:07 UTC) so it matches the "off the hour" comment; note now
  calls out alignment with the sibling scheduled workflow
  `github-settings-drift.yml` (`17 14 * * 1`).

- thread Wkdk (line 146): YAML sketch rewritten to match the actual
  `.github/workflows/gate.yml` installer pattern — three-way-parity
  `./tools/setup/install.sh` invocation plus the same cache-key
  shape (dotnet / mise / nuget). Added explicit note that Windows
  matrix leg depends on `tools/setup/install.sh` growing Windows
  support first per the existing BACKLOG row.

- thread Wkdz (line 248): corrected the fork-scheduling claim. GitHub
  disables scheduled workflows on forks by default — the repo's
  own `github-settings-drift.yml` runs without fork-scoping and
  proves this. The `if: github.repository ==` guard is kept as
  optional hygiene for the rare opt-in-fork case, not as a cost-
  safety requirement.

- thread WkeB (line 316): replaced the wrong `docs/BACKLOG.md`
  line-number reference (~2471 is actually the mise-activate
  / HLL-flakiness neighborhood) with stable grep anchors
  ("Windows matrix in CI" + "Parity swap: CI's `actions/setup-dotnet`").

Markdownlint passes on the edited file.

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2 participants