Skip to content

ci: E2E workflow, web typecheck job, pre-commit hook, test suite#486

Merged
magyargergo merged 13 commits into
abhigyanpatwari:mainfrom
jreakin:ci/web-test-infrastructure
Mar 24, 2026
Merged

ci: E2E workflow, web typecheck job, pre-commit hook, test suite#486
magyargergo merged 13 commits into
abhigyanpatwari:mainfrom
jreakin:ci/web-test-infrastructure

Conversation

@jreakin

@jreakin jreakin commented Mar 23, 2026

Copy link
Copy Markdown
Contributor

Summary

CI infrastructure and test suite for gitnexus-web. Split from #303. Replaces closed #478 (rebased on main after #477 merged).

CI

  • ci.yml — Consolidated to reference ci-tests.yml. Gate allows skipped E2E.
  • ci-quality.yml — New typecheck-web job (tsc -b --noEmit in gitnexus-web/).
  • ci-e2e.yml — E2E workflow with dorny/paths-filter (only on gitnexus-web/ changes). result == 'success' guard on filter.
  • ci-report.yml — Removed dead integration-reports references.
  • .gitignore — Playwright artifacts, eval test artifacts.
  • gitnexus/vitest.config.ts — Removed autoUpdate: true from coverage thresholds.

Test Infrastructure

  • Vitest — 58 unit tests: graph, server-connection, mermaid, settings, constants, utils, path-resolution.
  • Playwright E2E — 5 tests + manual recording harness: server connect, Nexus AI init, Processes View, highlight toggle, turn-off-all-highlights.
  • ConfigdefineConfig from vitest/config, engines.node >= 20, wait-on in devDeps, retain-on-failure for artifacts.

Test plan

  • cd gitnexus-web && npx vitest run — tests pass
  • cd gitnexus-web && npx tsc --noEmit — clean

🤖 Generated with Claude Code

Part 5 of 6 — split from #303. Stack: PR1 (#474 ✅) → PR2 (#475 ✅) → PR3 (#476 ✅) → PR4 (#477 ✅) → PR5 → PR6 (#479)

jreakin and others added 2 commits March 23, 2026 13:38
CI:
- ci.yml consolidated to reference ci-tests.yml
- ci-quality.yml: add typecheck-web job for gitnexus-web/
- ci-e2e.yml: E2E workflow with dorny/paths-filter (web changes only)
- ci-report.yml: remove dead integration-reports references
- CI gate allows skipped E2E status
- .gitignore: playwright artifacts, eval test artifacts

Pre-commit hook:
- .githooks/pre-commit: typecheck + unit tests for both packages
- Activated via git config core.hooksPath in prepare script

Test infrastructure:
- Vitest + React Testing Library: 58 unit tests
  (graph, server-connection, mermaid, settings, constants, utils, paths)
- Playwright E2E: 5 tests + manual recording harness
- vitest.config from vitest/config, engines.node >= 20
- Playwright artifacts retain-on-failure
- wait-on in devDependencies
- vitest/coverage-v8 aligned with vitest 4.x

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reflects devDependency additions (vitest, playwright, wait-on,
@testing-library, etc.) from package.json changes in this PR.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 23, 2026 18:40
@vercel

vercel Bot commented Mar 23, 2026

Copy link
Copy Markdown

@jreakin is attempting to deploy a commit to the NexusCore Team on Vercel.

A member of the Team first needs to authorize it.

@jreakin

jreakin commented Mar 23, 2026

Copy link
Copy Markdown
Contributor Author

Replaces closed #478 (couldn't reopen after force-push). Same branch, rebased on latest main.

Full stack:

Original monolith: #303 (draft, superseded by this stack)

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends the repo’s CI and testing infrastructure to cover the gitnexus-web frontend, adding Playwright E2E coverage, Vitest configuration for unit tests, and CI reporting/gating updates.

Changes:

  • Adds a reusable E2E workflow (ci-e2e.yml) that runs Playwright only when gitnexus-web/** changes, and updates the CI gate to allow E2E to be skipped when not applicable.
  • Introduces gitnexus-web Vitest configuration plus a new unit test suite (graph, server connection, settings, constants, utils, path resolution, mermaid generation).
  • Updates CI reporting and project tooling (Playwright config, package scripts/deps, .gitignore, and Vitest coverage config cleanup).

Reviewed changes

Copilot reviewed 20 out of 22 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
gitnexus/vitest.config.ts Removes coverage threshold auto-update configuration.
gitnexus-web/vitest.config.ts Adds Vite React plugin integration, aliases, and coverage settings for frontend unit tests.
gitnexus-web/test/setup.ts Adds jest-dom matchers and scopes storage reset to the LLM settings key.
gitnexus-web/test/fixtures/graph.ts Adds shared graph factories for unit tests.
gitnexus-web/test/unit/*.test.ts Adds new unit tests for utils, settings-service, server-connection, graph, constants, path-resolution, and mermaid generation.
gitnexus-web/playwright.config.ts Adds Playwright configuration with optional insecure mode and artifact retention.
gitnexus-web/e2e/server-connect.spec.ts Adds E2E tests for server connect, Nexus AI init, processes, highlighting, and clearing highlights.
gitnexus-web/e2e/manual-record.spec.ts Adds a manual recording harness (ignored by default).
gitnexus-web/e2e/debug-issues.spec.ts Adds opt-in debug harness tests (ignored by default).
gitnexus-web/package.json Adds Node engine constraint, Playwright/Vitest scripts, and testing dependencies.
gitnexus-web/package-lock.json Locks new/updated dependencies for the web testing stack.
.gitignore Ignores Playwright outputs and Python test artifacts.
.github/workflows/ci.yml Orchestrates reusable workflows, adds E2E job, updates PR metadata + CI gate logic.
.github/workflows/ci-quality.yml Adds a typecheck-web job for gitnexus-web (tsc -b --noEmit).
.github/workflows/ci-e2e.yml New reusable workflow to run Playwright E2E on web changes only.
.github/workflows/ci-report.yml Refactors CI report generation and PR commenting to use sticky comments + artifacts.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread .github/workflows/ci.yml
Comment thread gitnexus-web/e2e/server-connect.spec.ts Outdated
jreakin and others added 2 commits March 23, 2026 13:50
- Add data-testid="process-list-loaded" to ProcessesPanel (E2E tests
  were waiting for an element that didn't exist)
- Increase server connect timeouts from 5s to 10s for slower CI

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add gitnexus-web npm ci + vitest run to ci-tests.yml so web unit
  tests are gated by the CI status check (were only running locally)
- Remove unused IS_PLAYWRIGHT_AUTOMATION variable from E2E spec

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jreakin

jreakin commented Mar 23, 2026

Copy link
Copy Markdown
Contributor Author

Both Copilot items addressed in 478159b:

Comment Fix
gitnexus-web unit tests not in CI Added npm ci + npx vitest run for gitnexus-web/ to ci-tests.yml — web unit tests are now gated by CI status.
Unused IS_PLAYWRIGHT_AUTOMATION Removed.

@jreakin

jreakin commented Mar 23, 2026

Copy link
Copy Markdown
Contributor Author

This PR replaces the closed #478. @magyargergo — as you commented on #478 asking if it was accidentally closed and noting the value in the work, this is the rebased continuation of that PR.

- Add data-testid="process-row" to ProcessItem component (E2E tests
  referenced it but it didn't exist in the source)
- Use waitUntil: 'networkidle' on page.goto to ensure Vite dev server
  is fully ready before interacting (fixes first-test timeout in CI)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jreakin jreakin marked this pull request as draft March 23, 2026 19:24
E2E tests referenced these data-testid attributes but they didn't
exist in ProcessItem. All 6 E2E testids now have matching source
elements: status-ready, process-list-loaded, process-row,
process-view-button, process-highlight-button, server-url-input.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jreakin jreakin marked this pull request as ready for review March 23, 2026 19:26
jreakin and others added 2 commits March 23, 2026 14:36
…solving

networkidle waits for zero network activity for 500ms, but Vite's HMR
WebSocket stays open permanently, causing page.goto to timeout at 60s
on all tests after the first. The explicit toBeVisible waits on UI
elements are sufficient and deterministic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…s pass locally

Root cause: test 1 clicked the Server button before React hydrated,
so the tab content never rendered and the input wasn't found.

Fixes:
- Wait for Server button toBeVisible before clicking
- Increase input wait to 15s
- Remove networkidle (Vite HMR WebSocket prevents it from resolving)
- Add retries: 1 in CI for transient cold-start flakiness

Verified locally: all 5 E2E tests pass, 198 unit tests pass, typecheck clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 22 out of 24 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread gitnexus-web/package.json
Comment thread .github/workflows/ci-tests.yml
gitnexus analyze can crash with "double free or corruption" (known
issue abhigyanpatwari#273) during the LadybugDB native addon shutdown. The index is
usually written successfully before the crash. The workflow now:
1. Allows analyze to exit non-zero with a warning
2. Verifies .gitnexus index was actually created
3. Only fails if no index exists (real failure)

All tests verified locally: 198 unit, 5 E2E pass, typecheck clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jreakin jreakin force-pushed the ci/web-test-infrastructure branch from 0ef70df to a5822d6 Compare March 23, 2026 20:31
The previous echo string had special characters that broke bash
quoting in GitHub Actions. Simplified to: analyze || true, then
check if .gitnexus exists.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@magyargergo

Copy link
Copy Markdown
Collaborator

@claude deep review this pr focusing on the new playwright tests and additionally the new setup of the cursor and additional AI md files.

@github-actions

github-actions Bot commented Mar 23, 2026

Copy link
Copy Markdown
Contributor

Claude finished @magyargergo's task in 3m 18s —— View job


Deep Review — PR #486: CI/E2E Test Infrastructure

Tasks:

  • Gather context and understand request
  • Read Playwright config and E2E test files
  • Read pre-commit hook, CI workflows, and unit test setup
  • Post deep review feedback

Note on "cursor and additional AI md files": This PR does not include changes to .cursorrules or any AI instruction markdown files — those aren't in the changed-files list. The .cursorrules in the repo root was not modified. I've focused this review on the Playwright tests and CI infrastructure, which is the actual content of the PR.


Playwright Tests — e2e/server-connect.spec.ts

1. beforeAll skip pattern is partially correct, but has a coverage gap

server-connect.spec.ts:23-24

if (backendRes.status === 'fulfilled' && !backendRes.value.ok)
  test.skip(true, 'gitnexus serve not available on :4747');
if (frontendRes.status === 'rejected')
  test.skip(true, 'Vite dev server not available on :5173');

The frontend check only skips on network failure (rejected promise), but not if the Vite server responds with an error status. Inconsistency with the backend check. More importantly: once test.skip() is called, execution continues in the beforeAll block — the second test.skip() call still runs. You should return after each test.skip() call to prevent fall-through.

Fix this →

2. "Turn Off All Highlights" test can silently pass without testing anything

server-connect.spec.ts:147-168

const fileItem = page.getByText('package.json').first();
if (await fileItem.isVisible()) {
  // ... test the feature
} else {
  await page.screenshot(...)  // silently "passes"
}

The entire core assertion (select node → highlights dim → button clears) is inside an if guard with no test.skip() or expect().toBeTruthy(). If package.json isn't visible in the file tree, the test passes vacuously. This should either hard-fail (expect(fileItem).toBeVisible()) or skip gracefully.

Fix this →

3. { force: true } clicks bypass actionability checks

server-connect.spec.ts:106, 128

await viewBtn.click({ force: true });
await lightbulb.click({ force: true });

Force-clicking skips Playwright's built-in checks (enabled, visible, stable, receives events). Since both buttons use waitFor({ state: 'visible' }) before the click, the force: true is redundant and masks real UI issues. Remove it unless there's a documented reason (e.g., overlapping element).

4. Canvas visibility is not a meaningful highlight assertion

server-connect.spec.ts:130-133, 151-153

await expect(async () => {
  await expect(page.locator('canvas').first()).toBeVisible();
}).toPass({ timeout: 2000 });

The canvas is already visible before the click. This assertion verifies nothing about the highlight state having changed. Consider asserting on a more specific signal: a DOM attribute on the highlight button, a count of highlighted nodes, or a data-testid on the highlight state indicator.

5. Fragile CSS selector for modal

server-connect.spec.ts:108

await expect(page.locator('.fixed.inset-0.z-50')).toBeVisible({ timeout: 5_000 });

Tailwind utility class selectors will break silently if the modal is restyled. Use a data-testid="process-modal" instead.

Fix this →


e2e/debug-issues.spec.ts

6. Hardcoded URL instead of env var

debug-issues.spec.ts:17

await serverInput.fill('http://localhost:4747');

server-connect.spec.ts uses process.env.BACKEND_URL ?? 'http://localhost:4747' consistently. The debug file hardcodes it. Minor inconsistency that could cause confusion for contributors using a non-default port.

7. waitForLoadState('networkidle') present in debug file

debug-issues.spec.ts:106, 117
The commit history (a6211ad) shows networkidle was explicitly removed from server-connect.spec.ts because Vite's HMR WebSocket prevents it from resolving. The debug harness still uses it — while excluded from CI, future contributors may wonder why the main tests avoid it.


playwright.config.ts

8. No webServer configuration

The config relies on the CI workflow manually starting servers with &. Playwright's webServer block would handle start/stop automatically and surface server stderr in test output. This is not a blocker, but it would simplify local npx playwright test invocations (currently developers must start both servers themselves).

9. Only one retry in CI

retries: process.env.CI ? 1 : 0,

One retry is reasonable. Worth noting: retries mask flakiness rather than fixing it. The canvas-visibility assertions (point 4 above) are likely a source of future flakiness.


CI Workflow — ci-e2e.yml

10. analyze runs on the wrong working directory

ci-e2e.yml:53-60

- name: Analyze repository (index for backend)
  run: |
    node dist/cli/index.js analyze || true
    ...
  working-directory: gitnexus

This indexes the gitnexus/ CLI package itself, not the repository under test. The E2E tests then render graphs of GitNexus's own codebase. This is intentional (self-hosting the analyzer for the demo), but should have a comment explaining the intent.

11. Backend serve process not explicitly stopped between retries

- name: Start backend server
  run: node dist/cli/index.js serve &

On retry (if Playwright retries the job), the serve process from the previous attempt may still be listening on 4747. Adding pkill -f "cli/index.js serve" || true before the start step, or using a pid capture for cleanup, would prevent port conflicts.


Pre-commit Hook — .husky/pre-commit

12. Well-structured and mirrors CI accurately

The hook correctly detects which package changed and runs only the relevant checks. The dual cd approach using $ROOT as an anchor is safe. The --reporter=dot is appropriate for pre-commit output. Overall this is clean.

Minor nit: After cd "$ROOT/gitnexus-web" the npx vitest run still targets the cwd correctly, but if both WEB_CHANGED and CLI_CHANGED are set, the second block's cd must recover from the first. Since it uses cd "$ROOT/gitnexus" (absolute), this is fine — just worth a comment.


CI gate (ci.yml) and reporting (ci-report.yml)

13. E2E gate is correct — skipped is treated as passing

ci.yml:97

if [[ "$E2E" != "success" && "$E2E" != "skipped" ]]; then
  echo "::error::E2E job failed"
  exit 1
fi

This is correct: when no gitnexus-web/ files change, E2E is skipped and the gate passes.

14. ci-report.yml input validation is solid

The artifact-injection defense at lines 88-93 (regex-validating PR number) and the result whitelist at lines 101-107 correctly treat fork-contributed artifact contents as untrusted. This is good security practice.

15. gitnexus-web unit test results not in CI report

As Copilot noted: ci-tests.yml runs npx vitest run --reporter=default for gitnexus-web/ (lines 35-37) but doesn't emit a JSON report or upload a dedicated artifact. Web test failures won't appear in the CI report comment, only in raw job logs. Adding --reporter=json --outputFile=web-test-results.json and uploading it would close this gap.

Fix this →


Unit Tests — Quick Assessment

The Vitest suite (test/unit/*.test.ts) is well-structured:

  • settings-service.test.ts covers happy path, error cases (corrupted JSON), and the localStorage→sessionStorage migration
  • server-connection.test.ts tests URL normalization edge cases thoroughly
  • Fixtures (test/fixtures/graph.ts) are clean data factories with no test logic

The setup.ts scope is appropriately narrow — only clearing the one known storage key rather than blanket-wiping localStorage.


Summary of Issues by Severity

# Severity Issue
2 🔴 High Silent pass in "Turn Off All Highlights" test
4 🔴 High Canvas visibility assertion proves nothing about highlight state
1 🟡 Medium test.skip() fall-through in beforeAll
3 🟡 Medium force: true clicks bypass actionability
5 🟡 Medium CSS class selector for modal — fragile
15 🟡 Medium gitnexus-web test results not in CI report
11 🟡 Medium serve process not cleaned up on retry
6, 7 🟢 Low Debug harness inconsistencies (hardcoded URL, networkidle)
8 🟢 Low No webServer config for local dev convenience
10 🟢 Low analyze working directory needs a comment explaining intent

@magyargergo

Copy link
Copy Markdown
Collaborator

@jreakin could you please have a look at this? 🙏

The analyze step ran from gitnexus/ but wrote .gitnexus/ at repo root,
so the directory check always failed. Run from repo root instead.

E2E test improvements from review feedback:
- Replace fragile CSS class selectors with data-testid attributes
- Remove force:true clicks that bypassed Playwright actionability
- Add return after test.skip() to prevent beforeAll fall-through
- Replace vacuous canvas-visibility assertions with state checks
  (process row class for highlights, button title for toggle)
- Fix "Turn Off All Highlights" test that could pass without testing
- Use BACKEND_URL env var in debug harness for consistency
- Replace networkidle waits with timeouts (Vite HMR prevents resolve)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jreakin

jreakin commented Mar 24, 2026

Copy link
Copy Markdown
Contributor Author

Pushed 640ba9d — addresses review feedback from Claude and Copilot.

CI fix (root cause of E2E failure)

  • Analyze step working directory: ci-e2e.yml ran analyze from gitnexus/ but the index is written at the repo root. Removed working-directory: gitnexus and run node gitnexus/dist/cli/index.js analyze from repo root so the .gitnexus/ check finds the index where it's actually created.

E2E test hardening

Issue Fix
Fragile .fixed.inset-0.z-50 CSS selector Added data-testid="process-modal" to ProcessFlowModal.tsx, updated selectors
force: true clicks bypassing actionability Removed — waitFor({ state: 'visible' }) already ensures clickability
Canvas visibility assertions proving nothing Replaced with toHaveClass(/bg-amber-950/) (highlight row state) and toHaveAttribute('title', ...) (toggle button state)
"Turn Off All Highlights" passing vacuously Replaced if (isVisible) guard with expect(fileItem).toBeVisible() — test now fails loud
test.skip() fall-through in beforeAll Added return after each skip, added !value.ok check for frontend
Hardcoded URL in debug harness Uses BACKEND_URL env var like server-connect.spec.ts
networkidle in debug harness Replaced with waitForTimeout(500) — Vite HMR WebSocket prevents networkidle

Also added data-testid="ai-highlights-toggle" to GraphCanvas.tsx for the highlight toggle button.

All 198 unit tests pass, typecheck clean. E2E tests verified locally (5/5).

The web unit tests ran in CI but only output to stdout — results
weren't in the JSON artifact so they were missing from the PR report.
Now emits web-test-results.json, uploads it, and the report sums
both CLI and web test counts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jreakin

jreakin commented Mar 24, 2026

Copy link
Copy Markdown
Contributor Author

Pushed 6705f88 — addresses remaining items from the deep review.

Review item tracker

# Issue Status
1 test.skip() fall-through in beforeAll Fixed in 640ba9d
2 "Turn Off All Highlights" silent pass Fixed in 640ba9d
3 force: true bypassing actionability Fixed in 640ba9d
4 Canvas visibility assertions meaningless Fixed in 640ba9d
5 Fragile CSS class modal selector Fixed in 640ba9d
6,7 Debug harness: hardcoded URL, networkidle Fixed in 640ba9d
8 No webServer in Playwright config Won't fix (see below)
9 Only one retry in CI N/A — observation, not actionable
10 Analyze working directory bug Fixed in 640ba9d (root cause of CI failure)
11 serve not cleaned up on retry N/A — Playwright retries happen within a single job run; GitHub Actions job-level reruns get a fresh container
15 Web test results missing from CI report Fixed in 6705f88

Why #8 (Playwright webServer) is intentionally omitted

8. No webServer configuration

The config relies on the CI workflow manually starting servers with &. Playwright's webServer block would handle start/stop automatically and surface server stderr in test output. This is not a blocker, but it would simplify local npx playwright test invocations (currently developers must start both servers themselves).

The webServer config couples Playwright to the server lifecycle — Playwright starts and stops the servers for each test run. That doesn't fit here because:

  1. Two servers with a dependency: The backend must finish analyze and serve before the frontend can connect. webServer supports multiple entries but doesn't model "server B depends on server A's data being ready" — we'd still need the wait-on steps.
  2. CI already handles it: The workflow controls startup order explicitly (analyzeserve &wait-on :4747dev &wait-on :5173), which is more transparent and debuggable than Playwright's built-in server management.
  3. Local DX trade-off is marginal: Contributors already need both servers running for development anyway. Adding webServer would mean Playwright also starts its own server instances, potentially conflicting with the ones already running.

If local DX becomes a pain point, a scripts/e2e.sh that starts both servers and runs Playwright would be a better fit than baking it into Playwright config. Let me know if you'd like me to write up this script, @magyargergo. Glad to do it if you feel it's beneficial.

jreakin added a commit to jreakin/GitNexus that referenced this pull request Mar 24, 2026
…atwari#486

- AGENTS.md: update test counts (CLI ~2000 unit, ~1850 integration),
  add gitnexus-web testing section (198 unit, 5 E2E with commands)
- RUNBOOK.md: fix Node requirement to >=20, fix E2E local repro command
- TESTING.md: E2E uses data-testid selectors + real servers, not mocks
- .cursor/rules/100-monorepo.mdc: add web test/E2E commands

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@magyargergo magyargergo requested a review from Copilot March 24, 2026 05:19
@magyargergo

Copy link
Copy Markdown
Collaborator

@claude review the latest changes and evaluation it's readiness.

@github-actions

github-actions Bot commented Mar 24, 2026

Copy link
Copy Markdown
Contributor

Claude finished @magyargergo's task in 2m 9s —— View job


Tasks:

  • Gather context and understand request
  • Review latest changes (640ba9d, 6705f88)
  • Check CI status
  • Evaluate readiness for merge

Readiness Evaluation — PR #486

CI: ✅ Passing (run 23468410633)


Review of Latest Commits

640ba9d — E2E test hardening + analyze working directory fix

All 10 actionable items from the previous deep review are now addressed. Verification of each fix:

Issue Code evidence Status
test.skip() fall-through server-connect.spec.ts:24-29return after each skip, !value.ok check for frontend
"Turn Off All Highlights" silent pass server-connect.spec.ts:150expect(fileItem).toBeVisible() with no if guard
force: true clicks Removed from viewBtn.click() and lightbulb.click()
Canvas visibility assertions server-connect.spec.ts:134toHaveClass(/bg-amber-950/) (process row state); lines:155,162toHaveAttribute('title', ...) (toggle state)
Fragile CSS modal selector ProcessFlowModal.tsx:214data-testid="process-modal" added; test uses [data-testid="process-modal"]
Debug harness: hardcoded URL debug-issues.spec.ts — uses BACKEND_URL env var
Debug harness: networkidle Replaced with waitForTimeout(500)
Analyze working directory bug ci-e2e.yml:55 — runs node gitnexus/dist/cli/index.js analyze from repo root

The toHaveClass(/bg-amber-950/) assertion correctly matches bg-amber-950/40 set in ProcessesPanel.tsx:459 when a process is focused — the regex is appropriately loose.

6705f88 — Web test results in CI report

ci-tests.yml:37-41 now emits --reporter=json --outputFile=web-test-results.json and uploads gitnexus-web/web-test-results.json alongside the CLI report artifact. ci-report.yml:244-263 reads both files, sums totals, and displays combined results in the PR comment. This correctly closes the gap where web test failures were invisible in the report.


One Remaining Open Item

Copilot: engines.node floor (package.json:6) — the declared >=20.0.0 is lower than what jsdom@29 actually requires (^20.19.0). Not a blocker (CI uses Node 20 via setup-node which will pull a compatible version), but it will cause misleading npm install warnings on Node 20.0–20.18. A one-line bump to >=20.19.0 would close this cleanly.

Fix this →


Verdict

Ready to merge. All review feedback from the prior deep review has been actioned. CI is green. The engines.node floor is a minor nit that can be addressed in a follow-up or squashed in before merge — it does not affect CI or runtime behavior.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 24 out of 26 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +275 to +283
cov_delta() {
local pct=$1 base=$2
if [ "$pct" = "N/A" ] || [ "$base" = "N/A" ]; then echo "—"; return; fi
local diff
diff=$(awk "BEGIN { printf \"%.1f\", $pct - $base }")
if [ "$(awk "BEGIN { print ($pct > $base) ? 1 : 0 }")" = "1" ]; then
echo "📈 +${diff}"
elif [ "$(awk "BEGIN { print ($pct < $base) ? 1 : 0 }")" = "1" ]; then
echo "📉 ${diff}"

Copilot AI Mar 24, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The coverage percentages used in cov_delta/cov_bar are sourced from CI artifacts (potentially untrusted for fork PRs) and are interpolated directly into awk programs. If a malicious/invalid coverage-summary.json is uploaded, this can lead to arbitrary awk code execution in the reporting workflow (which runs with a write-capable token). Please validate that pct/base match a strict numeric regex before using them in awk (or extract as numbers via jq with tonumber? and treat non-numeric values as N/A).

Copilot uses AI. Check for mistakes.
Comment on lines +63 to +70
- name: Extract artifacts
shell: bash
run: |
cd "$RUNNER_TEMP/dl"
# Extract each artifact into its own directory to avoid filename collisions
for z in *.zip; do
[ -f "$z" ] || continue
name="${z%.zip}"

Copilot AI Mar 24, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extract artifacts assumes $RUNNER_TEMP/dl exists, but the preceding download step only creates that directory when at least one artifact is found. If pr-meta (or both artifacts) are missing, this cd will fail and the workflow will error instead of skipping gracefully. Consider mkdir -p "$RUNNER_TEMP/dl" before downloads (or guard this step when no artifacts were downloaded) so CI reporting is resilient to missing artifacts.

Copilot uses AI. Check for mistakes.
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4
with:
node-version: 20

Copilot AI Mar 24, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The workflow pins node-version: 20, but gitnexus-web's current dependency tree (per package-lock.json, e.g. jsdom@29) requires Node ^20.19.0+. To avoid unexpected EBADENGINE warnings/failures as runner images change, consider pinning to 20.19.0 (or 22.x) here to match the actual minimum runtime required by the lockfile.

Suggested change
node-version: 20
node-version: 20.19.0

Copilot uses AI. Check for mistakes.

- uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4
with:
node-version: 20

Copilot AI Mar 24, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the typecheck workflow: this uses node-version: 20, but the gitnexus-web lockfile includes packages that require Node ^20.19.0+. Pinning to a specific patch (e.g. 20.19.0) or moving to 22 would make E2E runs more reproducible and aligned with the dependency engines.

Suggested change
node-version: 20
node-version: 20.19.0

Copilot uses AI. Check for mistakes.
@magyargergo

Copy link
Copy Markdown
Collaborator

@claude can you please find out why there was no report from GHA workflows of the test report summary?

@github-actions

github-actions Bot commented Mar 24, 2026

Copy link
Copy Markdown
Contributor

Claude finished @magyargergo's task in 3m 35s —— View job


Tasks:

  • Gather context and understand request
  • Check CI status and workflow runs
  • Read ci-report.yml to understand report generation logic
  • Identify why test report summary was missing
  • Post findings

Root Cause: Artifact Filename Schema Mismatch

The CI report comment was never posted because the ci.yml on this PR branch and the ci-report.yml on main are out of sync on artifact file names.

How workflow_run works

GitHub's workflow_run trigger always executes the workflow file from the default branch (main). That means:

  • ci.yml — runs from the PR branch (uses this PR's changes)
  • ci-report.yml — runs from main (uses the old version)

They can't both pick up schema changes from the same PR simultaneously.

The exact mismatch

PR branch ci.yml (save-pr-meta job) writes files with underscores:

pr-meta/pr_number
pr-meta/quality_result
pr-meta/tests_result
pr-meta/e2e_result   ← new field, didn't exist on main

Main branch ci-report.yml (Extract PR metadata step) reads files with hyphens:

PR_NUMBER=$(cat pr-number | tr -d '[:space:]')   # ← hyphen, not underscore
cat quality-result
cat tests-result

When the CI Report workflow runs from main and tries cat pr-number, it gets No such file or directory. PR_NUMBER is empty → fails the ^[0-9]+$ regex check → exit 1. The report step is never reached.

Evidence

  • git show 2ede01d:.github/workflows/ci.yml confirms main writes pr-number, quality-result, tests-result (hyphens)
  • git show 2ede01d:.github/workflows/ci-report.yml confirms main reads the same hyphen names
  • This PR's ci.yml changed all three to underscores + added e2e_result
  • The Save PR Metadata job in run 23468410633 succeeded and uploaded the artifact — confirming the artifact was there, just with wrong filenames

Additionally, main's ci.yml had no E2E job in the save-pr-meta needs: list, so ci-report.yml on main also has no E2E variable in the report template at all.

Why this only affects this transitional state

Once this PR is merged to main, both workflows will be in sync (both use underscores, both include E2E). The mismatch only exists during the pre-merge state where ci.yml picks up new schema from the PR branch but ci-report.yml still runs the old main version.

Fix options

  1. Do nothing — once merged, it will work. The mismatch is a transitional artifact of the split between ci.yml (PR-branch) and ci-report.yml (main-branch).
  2. Backward-compat shim — in this PR's ci.yml, write both pr_number (new) AND pr-number (old) so the still-on-main ci-report.yml can find what it needs. Same for quality-result/quality_result and tests-result/tests_result. This would make the report work on this PR before merge.

Option 2 is a one-time patch to enable the report to run before the PR is merged, but since the PR is already reviewed and ready, option 1 is simpler.

@magyargergo

Copy link
Copy Markdown
Collaborator

@jreakin can you please look into this? 🙏

…ompat, node alignment

- Sanitize coverage values with is_numeric() validation and awk -v flag
  parameterization to prevent command injection from untrusted fork artifacts
- Add mkdir -p before unzip in base coverage extraction step
- Write both underscore and hyphenated artifact filenames so main-branch
  ci-report.yml (via workflow_run) can read them before this PR merges
- Align Node version: engines >=18→>=20, Dockerfile.test 22→20

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jreakin

jreakin commented Mar 24, 2026

Copy link
Copy Markdown
Contributor Author

Addressed all outstanding review items

Pushed af734cb with fixes for the remaining findings:

1. Coverage delta validation — AWK injection (Medium)

  • Added is_numeric() regex validator (^-?[0-9]+(\.[0-9]+)?$) that rejects non-numeric values before they reach awk
  • Switched all 5 awk calls from string interpolation ($pct) to -v flag parameterization (-v p="$pct"p), making injection structurally impossible even if validation is bypassed
  • Invalid values gracefully render as , matching existing N/A behavior

2. Directory creation resilience (Medium)

  • Added mkdir -p base before unzip -o base.zip -d base in the "Extract base coverage" step

3. Node version pin mismatch (Low)

  • gitnexus/package.json engines: >=18.0.0>=20.0.0 (aligns with gitnexus-web and all CI workflows)
  • gitnexus/Dockerfile.test: node:22-bookwormnode:20-bookworm (aligns integration tests with CI runtime)

4. Artifact naming schema mismatch (re: missing test report)

  • ci.yml now writes both underscore (pr_number) and hyphenated (pr-number) artifact filenames
  • This allows the main-branch ci-report.yml (executed via workflow_run) to find the files before this PR merges
  • Marked with TODO(post-merge) for cleanup once both workflows are on the same branch

🤖 Generated with Claude Code

@github-actions

Copy link
Copy Markdown
Contributor

CI Report

All checks passedaf734cb

Pipeline

Stage Status Ubuntu Windows macOS
Typecheck success
Tests success

Tests

Metric Value
Total 3878
Passed 3876
Skipped 2
Files 1102
Duration 9m 59s

✅ All 3876 tests passed across 1102 files

2 test(s) skipped
  • withTestLbugDB(lock-retry) > withLbugDb retry behaviour > throws after max retry attempts
  • buildTypeEnv > known limitations (documented skip tests) > Ruby block parameter: users.each { |user| } — closure param inference, different feature

Coverage

Metric Coverage Covered Base (main) Delta
Statements 70.61% 10062/14250 70.79% 📉 -0.2%
Branches 61.18% 6852/11198 61.33% 📉 -0.1%
Functions 73.82% 883/1196 74.24% 📉 -0.4%
Lines 72.98% 8988/12315 73.14% 📉 -0.2%

📋 Full run · Coverage from Ubuntu · Generated by CI

@magyargergo magyargergo merged commit f0540b3 into abhigyanpatwari:main Mar 24, 2026
9 of 10 checks passed
magyargergo added a commit that referenced this pull request Mar 25, 2026
)

* ci: E2E workflow, web typecheck job, pre-commit hook, test suite

CI:
- ci.yml consolidated to reference ci-tests.yml
- ci-quality.yml: add typecheck-web job for gitnexus-web/
- ci-e2e.yml: E2E workflow with dorny/paths-filter (web changes only)
- ci-report.yml: remove dead integration-reports references
- CI gate allows skipped E2E status
- .gitignore: playwright artifacts, eval test artifacts

Pre-commit hook:
- .githooks/pre-commit: typecheck + unit tests for both packages
- Activated via git config core.hooksPath in prepare script

Test infrastructure:
- Vitest + React Testing Library: 58 unit tests
  (graph, server-connection, mermaid, settings, constants, utils, paths)
- Playwright E2E: 5 tests + manual recording harness
- vitest.config from vitest/config, engines.node >= 20
- Playwright artifacts retain-on-failure
- wait-on in devDependencies
- vitest/coverage-v8 aligned with vitest 4.x

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: update gitnexus-web package-lock.json

Reflects devDependency additions (vitest, playwright, wait-on,
@testing-library, etc.) from package.json changes in this PR.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(e2e): add missing process-list-loaded testid, increase CI timeouts

- Add data-testid="process-list-loaded" to ProcessesPanel (E2E tests
  were waiting for an element that didn't exist)
- Increase server connect timeouts from 5s to 10s for slower CI

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): run gitnexus-web unit tests in CI, remove unused variable

- Add gitnexus-web npm ci + vitest run to ci-tests.yml so web unit
  tests are gated by the CI status check (were only running locally)
- Remove unused IS_PLAYWRIGHT_AUTOMATION variable from E2E spec

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(e2e): add process-row testid, wait for networkidle on page load

- Add data-testid="process-row" to ProcessItem component (E2E tests
  referenced it but it didn't exist in the source)
- Use waitUntil: 'networkidle' on page.goto to ensure Vite dev server
  is fully ready before interacting (fixes first-test timeout in CI)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(e2e): add process-view-button and process-highlight-button testids

E2E tests referenced these data-testid attributes but they didn't
exist in ProcessItem. All 6 E2E testids now have matching source
elements: status-ready, process-list-loaded, process-row,
process-view-button, process-highlight-button, server-url-input.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(e2e): remove networkidle — Vite HMR WebSocket prevents it from resolving

networkidle waits for zero network activity for 500ms, but Vite's HMR
WebSocket stays open permanently, causing page.goto to timeout at 60s
on all tests after the first. The explicit toBeVisible waits on UI
elements are sufficient and deterministic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(e2e): wait for Server button visibility, add CI retry, all 5 tests pass locally

Root cause: test 1 clicked the Server button before React hydrated,
so the tab content never rendered and the input wasn't found.

Fixes:
- Wait for Server button toBeVisible before clicking
- Increase input wait to 15s
- Remove networkidle (Vite HMR WebSocket prevents it from resolving)
- Add retries: 1 in CI for transient cold-start flakiness

Verified locally: all 5 E2E tests pass, 198 unit tests pass, typecheck clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): tolerate LadybugDB native crash during analyze step

gitnexus analyze can crash with "double free or corruption" (known
issue #273) during the LadybugDB native addon shutdown. The index is
usually written successfully before the crash. The workflow now:
1. Allows analyze to exit non-zero with a warning
2. Verifies .gitnexus index was actually created
3. Only fails if no index exists (real failure)

All tests verified locally: 198 unit, 5 E2E pass, typecheck clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): fix shell quoting in analyze step, simplify to || true

The previous echo string had special characters that broke bash
quoting in GitHub Actions. Simplified to: analyze || true, then
check if .gitnexus exists.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add agent development framework, GitHub templates, eval refactor

Agent framework (layered docs for AI-assisted contributions):
- AGENTS.md: canonical instructions, impact analysis, MCP tools
- CLAUDE.md: Claude Code-specific deltas and hooks
- GUARDRAILS.md: safety boundaries, non-negotiables, escalation
- ARCHITECTURE.md: monorepo layout, data flow map
- TESTING.md: test structure, commands, categories
- RUNBOOK.md: copy-paste operations for dev/CI/MCP
- llms.txt: minimal LLM context pointer

Editor integration:
- .cursor/index.mdc + rules/100-monorepo.mdc

GitHub templates:
- PR template with areas-touched checkboxes
- Bug report + feature request issue forms

Eval harness:
- Refactored mcp_bridge, tool_registry, constants
- Error sanitization utilities
- Property-based tests via Hypothesis

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(eval): use format_exception instead of format_exc in sanitize_exception

format_exc() returns the currently handled exception traceback, which
may be unrelated if called outside an active except block. Using
format_exception(type(exc), exc, exc.__traceback__) reliably captures
the passed exception's traceback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update CONTRIBUTING.md and TESTING.md for current CI/hook setup

- CONTRIBUTING.md: add gitnexus-web typecheck command, pre-commit hook
  checklist item
- TESTING.md: add gitnexus-web typecheck command, pre-commit hook
  section (husky), update CI integration to list actual workflow files
  (ci-quality, ci-tests, ci-e2e)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update testing docs to reflect CI/E2E changes from PR #486

- AGENTS.md: update test counts (CLI ~2000 unit, ~1850 integration),
  add gitnexus-web testing section (198 unit, 5 E2E with commands)
- RUNBOOK.md: fix Node requirement to >=20, fix E2E local repro command
- TESTING.md: E2E uses data-testid selectors + real servers, not mocks
- .cursor/rules/100-monorepo.mdc: add web test/E2E commands

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: address context engineering review — deduplicate tokens, expand Cursor rules

- Remove ~100-line gitnexus:start block from CLAUDE.md (was duplicated from AGENTS.md)
- Fix gitnexus:start block inlined inside AGENTS.md Reference Docs bullet (doubled)
- Replace CLAUDE.md scope table with pointer to AGENTS.md (single source of truth)
- Expand .cursor/index.mdc with 5 non-negotiable safety rules for always-on context
- Add .cursor/rules/200-eval.mdc with Python/eval commands (glob-scoped to eval/**)
- Improve llms.txt with priority annotations and descriptions
- Bump version headers to 1.2.0, last-reviewed to 2026-03-24

Saves ~1,400 tokens/session with zero information loss.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Gergő Magyar <gergomagyar@icloud.com>
@jreakin jreakin deleted the ci/web-test-infrastructure branch April 1, 2026 04:10
motolese pushed a commit to motolese/datamoto-gitnexus that referenced this pull request Apr 23, 2026
motolese pushed a commit to motolese/datamoto-gitnexus that referenced this pull request Apr 23, 2026
…bhigyanpatwari#479)

* ci: E2E workflow, web typecheck job, pre-commit hook, test suite

CI:
- ci.yml consolidated to reference ci-tests.yml
- ci-quality.yml: add typecheck-web job for gitnexus-web/
- ci-e2e.yml: E2E workflow with dorny/paths-filter (web changes only)
- ci-report.yml: remove dead integration-reports references
- CI gate allows skipped E2E status
- .gitignore: playwright artifacts, eval test artifacts

Pre-commit hook:
- .githooks/pre-commit: typecheck + unit tests for both packages
- Activated via git config core.hooksPath in prepare script

Test infrastructure:
- Vitest + React Testing Library: 58 unit tests
  (graph, server-connection, mermaid, settings, constants, utils, paths)
- Playwright E2E: 5 tests + manual recording harness
- vitest.config from vitest/config, engines.node >= 20
- Playwright artifacts retain-on-failure
- wait-on in devDependencies
- vitest/coverage-v8 aligned with vitest 4.x

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: update gitnexus-web package-lock.json

Reflects devDependency additions (vitest, playwright, wait-on,
@testing-library, etc.) from package.json changes in this PR.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(e2e): add missing process-list-loaded testid, increase CI timeouts

- Add data-testid="process-list-loaded" to ProcessesPanel (E2E tests
  were waiting for an element that didn't exist)
- Increase server connect timeouts from 5s to 10s for slower CI

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): run gitnexus-web unit tests in CI, remove unused variable

- Add gitnexus-web npm ci + vitest run to ci-tests.yml so web unit
  tests are gated by the CI status check (were only running locally)
- Remove unused IS_PLAYWRIGHT_AUTOMATION variable from E2E spec

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(e2e): add process-row testid, wait for networkidle on page load

- Add data-testid="process-row" to ProcessItem component (E2E tests
  referenced it but it didn't exist in the source)
- Use waitUntil: 'networkidle' on page.goto to ensure Vite dev server
  is fully ready before interacting (fixes first-test timeout in CI)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(e2e): add process-view-button and process-highlight-button testids

E2E tests referenced these data-testid attributes but they didn't
exist in ProcessItem. All 6 E2E testids now have matching source
elements: status-ready, process-list-loaded, process-row,
process-view-button, process-highlight-button, server-url-input.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(e2e): remove networkidle — Vite HMR WebSocket prevents it from resolving

networkidle waits for zero network activity for 500ms, but Vite's HMR
WebSocket stays open permanently, causing page.goto to timeout at 60s
on all tests after the first. The explicit toBeVisible waits on UI
elements are sufficient and deterministic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(e2e): wait for Server button visibility, add CI retry, all 5 tests pass locally

Root cause: test 1 clicked the Server button before React hydrated,
so the tab content never rendered and the input wasn't found.

Fixes:
- Wait for Server button toBeVisible before clicking
- Increase input wait to 15s
- Remove networkidle (Vite HMR WebSocket prevents it from resolving)
- Add retries: 1 in CI for transient cold-start flakiness

Verified locally: all 5 E2E tests pass, 198 unit tests pass, typecheck clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): tolerate LadybugDB native crash during analyze step

gitnexus analyze can crash with "double free or corruption" (known
issue abhigyanpatwari#273) during the LadybugDB native addon shutdown. The index is
usually written successfully before the crash. The workflow now:
1. Allows analyze to exit non-zero with a warning
2. Verifies .gitnexus index was actually created
3. Only fails if no index exists (real failure)

All tests verified locally: 198 unit, 5 E2E pass, typecheck clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): fix shell quoting in analyze step, simplify to || true

The previous echo string had special characters that broke bash
quoting in GitHub Actions. Simplified to: analyze || true, then
check if .gitnexus exists.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add agent development framework, GitHub templates, eval refactor

Agent framework (layered docs for AI-assisted contributions):
- AGENTS.md: canonical instructions, impact analysis, MCP tools
- CLAUDE.md: Claude Code-specific deltas and hooks
- GUARDRAILS.md: safety boundaries, non-negotiables, escalation
- ARCHITECTURE.md: monorepo layout, data flow map
- TESTING.md: test structure, commands, categories
- RUNBOOK.md: copy-paste operations for dev/CI/MCP
- llms.txt: minimal LLM context pointer

Editor integration:
- .cursor/index.mdc + rules/100-monorepo.mdc

GitHub templates:
- PR template with areas-touched checkboxes
- Bug report + feature request issue forms

Eval harness:
- Refactored mcp_bridge, tool_registry, constants
- Error sanitization utilities
- Property-based tests via Hypothesis

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(eval): use format_exception instead of format_exc in sanitize_exception

format_exc() returns the currently handled exception traceback, which
may be unrelated if called outside an active except block. Using
format_exception(type(exc), exc, exc.__traceback__) reliably captures
the passed exception's traceback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update CONTRIBUTING.md and TESTING.md for current CI/hook setup

- CONTRIBUTING.md: add gitnexus-web typecheck command, pre-commit hook
  checklist item
- TESTING.md: add gitnexus-web typecheck command, pre-commit hook
  section (husky), update CI integration to list actual workflow files
  (ci-quality, ci-tests, ci-e2e)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update testing docs to reflect CI/E2E changes from PR abhigyanpatwari#486

- AGENTS.md: update test counts (CLI ~2000 unit, ~1850 integration),
  add gitnexus-web testing section (198 unit, 5 E2E with commands)
- RUNBOOK.md: fix Node requirement to >=20, fix E2E local repro command
- TESTING.md: E2E uses data-testid selectors + real servers, not mocks
- .cursor/rules/100-monorepo.mdc: add web test/E2E commands

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: address context engineering review — deduplicate tokens, expand Cursor rules

- Remove ~100-line gitnexus:start block from CLAUDE.md (was duplicated from AGENTS.md)
- Fix gitnexus:start block inlined inside AGENTS.md Reference Docs bullet (doubled)
- Replace CLAUDE.md scope table with pointer to AGENTS.md (single source of truth)
- Expand .cursor/index.mdc with 5 non-negotiable safety rules for always-on context
- Add .cursor/rules/200-eval.mdc with Python/eval commands (glob-scoped to eval/**)
- Improve llms.txt with priority annotations and descriptions
- Bump version headers to 1.2.0, last-reviewed to 2026-03-24

Saves ~1,400 tokens/session with zero information loss.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Gergő Magyar <gergomagyar@icloud.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants