Skip to content

fix(docker): add dedicated health endpoint for container healthcheck (#1147)#1355

Merged
magyargergo merged 2 commits into
abhigyanpatwari:mainfrom
azizur100389:fix/docker-healthcheck-endpoint
May 5, 2026
Merged

fix(docker): add dedicated health endpoint for container healthcheck (#1147)#1355
magyargergo merged 2 commits into
abhigyanpatwari:mainfrom
azizur100389:fix/docker-healthcheck-endpoint

Conversation

@azizur100389

Copy link
Copy Markdown
Contributor

Summary

  • Adds a lightweight GET /api/health endpoint that returns {"status":"ok"} immediately.
  • Switches the docker-compose healthcheck from the SSE /api/heartbeat (which never calls res.end()) to /api/health, fully decoupling the probe from the long-lived stream.

Context

PR #1182 mitigated the issue by adding -I (HEAD) to the curl probe, but the SSE handler still never terminates the response -- the HEAD request relies on curl interpreting the lack of body correctly despite Connection: keep-alive and no Content-Length. A dedicated health endpoint eliminates the ambiguity.

Test plan

  • npx tsc --noEmit -- compiles cleanly
  • npx vitest run test/unit/rate-limit.test.ts -- api.ts source-grep assertions still pass (15/15)
  • Prettier check passes on both changed files
  • docker compose up -d marks gitnexus-server healthy (reviewer verification)

Closes #1147

…bhigyanpatwari#1147)

PR abhigyanpatwari#1182 switched the probe to HEAD (-I) so the SSE body no longer
blocks curl, but the underlying handler still never calls res.end(),
leaving the connection open until Docker's 5s timeout races with the
TCP half-close. Add a lightweight GET /api/health that returns
{"status":"ok"} immediately and point the compose healthcheck at it,
fully decoupling the probe from the SSE stream.
@vercel

vercel Bot commented May 5, 2026

Copy link
Copy Markdown

@azizur100389 is attempting to deploy a commit to the NexusCore Team on Vercel.

A member of the Team first needs to authorize it.

@github-actions

github-actions Bot commented May 5, 2026

Copy link
Copy Markdown
Contributor

CI Report

All checks passed

Pipeline Status

Stage Status Details
✅ Typecheck success tsc --noEmit
✅ Tests success unit tests, 3 platforms
✅ E2E success gitnexus-web changes only

Test Results

Tests Passed Failed Skipped Duration
8023 8022 0 1 376s

✅ All 8022 tests passed

1 test(s) skipped — expand for details
  • buildTypeEnv > known limitations (documented skip tests) > Ruby block parameter: users.each { |user| } — closure param inference, different feature

Code Coverage

Tests

Metric Coverage Covered Base Delta Status
Statements 77.6% 24482/31545 77.02% 📈 +0.6 🟢 ███████████████░░░░░
Branches 66.19% 15521/23447 65.97% 📈 +0.2 🟢 █████████████░░░░░░░
Functions 82.81% 2419/2921 81.86% 📈 +1.0 🟢 ████████████████░░░░
Lines 80.63% 22103/27410 79.89% 📈 +0.7 🟢 ████████████████░░░░

📋 View full run · Generated by CI

Repository owner deleted a comment from github-actions Bot May 5, 2026
@github-actions

github-actions Bot commented May 5, 2026

Copy link
Copy Markdown
Contributor

Claude finished @magyargergo's task in 3m 51s —— View job


PR #1355 Review — fix(docker): add dedicated health endpoint for container healthcheck

Reviewed by: Senior Principal Software Architect / Docker Healthcheck / Express API Contract / SSE / Container Runtime / Elegance reviewers


Review Bar for this PR

  • GET /api/health returns a finite { "status": "ok" } with 200 and terminates the response
  • Docker healthcheck is wired to /api/health, not /api/heartbeat
  • curl -f exits cleanly on 200 and non-zero on failure; curl is present in the image
  • /api/health is registered before the SPA fallback and cannot be intercepted by it
  • /api/health carries no rate limiter that would produce 429 under normal Docker probe cadence
  • /api/heartbeat SSE behaviour is completely unchanged
  • No hidden Unicode or invisible controls in changed lines
  • No dependency, package-lock, or unrelated formatting churn
  • CI (typecheck, 8022 tests, 3 platforms) is green

Current PR State

Property Value
Head SHA a43a060
Status Open, not draft
Commits 1
Changed files 2 (docker-compose.yaml +1/-1, gitnexus/src/server/api.ts +7/-0)
Typecheck tsc --noEmit clean
Tests ✅ 8022 passed / 0 failed / 1 skipped, 3 platforms
E2E ✅ (web-only, not directly relevant)
Docker manual validation PR body marks as reviewer-pending; CI does not automate docker-compose up
Issue linkage Closes #1147
package-lock / deps No dependency changes ✅
Hidden Unicode (changed lines) Pre-existing em-dashes in comments only; no new non-ASCII in executable code or config values
Vercel status Irrelevant to this Docker/server healthcheck change
Code matches PR body

Understanding the Change

/api/heartbeat is a Server-Sent Events (SSE) endpoint: it sets Content-Type: text/event-stream, calls res.flushHeaders(), writes :ok\n\n, then keeps the connection open — emitting :ping\n\n every 15s indefinitely and never calling res.end() during normal operation. A Docker healthcheck curl against it would never receive a response termination under normal operation; the probe depends on curl interpreting the half-open connection under Connection: keep-alive with no Content-Length — fragile at best.

PR #1182 changed the probe to curl -fsSI (HEAD) as a mitigation: curl with -I sends HEAD, reads response headers only, and exits. This works but relies on Express's automatic response to HEAD on a GET handler — semantically awkward for a long-lived stream, and potentially sensitive to future middleware changes.

This PR adds a dedicated GET /api/health endpoint at api.ts:783-785:

app.get('/api/health', (_req, res) => {
  res.json({ status: 'ok' });
});

res.json() calls res.end() internally. The response is guaranteed to terminate. The probe becomes: send GET, get {"status":"ok"}, done.

What it proves: the Express process is alive and HTTP routing is functional.
What it intentionally does not prove: repo availability, embedding DB readiness, Git availability, indexing readiness — correctly out of scope for a liveness probe.


Findings

[minor] No dedicated route test for GET /api/health

  • Category: Tests (DoD §2.7)
  • Files: gitnexus/src/server/api.ts, gitnexus/test/unit/
  • Issue: There is no unit or integration test asserting GET /api/health → 200 { status: 'ok' }. The rate-limit.test.ts source-grep suite checks that specific routes carry createRouteLimiter, but does not assert health endpoint presence or shape. No existing test would fail if the health route were accidentally removed or returned 500.
  • Why it matters: DoD §6 states "The runtime path is not actually exercised by the tests" is a "Not Done" signal. For this specific endpoint the risk is low (3-line synchronous handler, no deps), but future refactors of createServer's route-registration block could silently drop it without a test catching the regression before Docker healthchecks in prod.
  • Recommended fix: Add a small test — Fix this →
  • Blocks merge: No — the endpoint is trivially correct, CI is green, and the structural risk is low for a 3-line synchronous handler. But this is the strongest argument against "fully Done" per DoD.

[cosmetic] Landing page HTML does not list /api/health

  • Category: Observability / Correctness (DoD §2.8)
  • Files: gitnexus/src/server/api.ts:186 (landingPageHtml())
  • Issue: The landing page (shown when no web UI is built) lists /api/heartbeat as an endpoint but does not list /api/health. The new endpoint is invisible to developers navigating to the server root.
  • Why it matters: Minor discoverability gap only. The endpoint is not user-facing in normal operation.
  • Recommended fix: Add <p class="endpoint"><code>/api/health</code> <span style="color:#5a5a70">— Docker/orchestrator healthcheck</span></p> after the heartbeat line.
  • Blocks merge: No.

Server Endpoint Assessment ✅

app.get('/api/health', ...) at api.ts:783 is registered at the start of the API route block, approximately 989 lines before registerWebUI(app, staticDir) is called at line 1781. Route order is correct.

SPA_FALLBACK_REGEX at line 131 is:

/^(?!\/api(?:\/|$))(?!.*\.\w{1,10}$).*/

This regex explicitly excludes all /api/* paths via the negative lookahead (?!\/api(?:\/|$)). Even if /api/health were registered after the SPA fallback (it isn't), the regex would still not match it. Double protection.

app.get('/api/health', (_req, res) => {
  res.json({ status: 'ok' });
});
  • ✅ Returns res.json(...) — Express sets Content-Type: application/json and calls res.end() automatically
  • ✅ Synchronous handler — no async work, no await, no I/O
  • ✅ No filesystem, database, Git, or network access
  • ✅ No rate limiter — Docker probes at 30s intervals (2 rpm) against a default 60 rpm budget; even if a limiter were present it wouldn't fire. But absence is correct and intentional.
  • ✅ Does not shadow /api/heartbeat (different path literal, registered two lines later)
  • ✅ Does not rely on any repo state or initialization
  • ✅ HEAD requests will work via Express's automatic HEAD-from-GET handling, though Docker now uses GET

Docker Healthcheck Assessment ✅

healthcheck:
  test: ['CMD', 'curl', '-f', 'http://localhost:4747/api/health']
  interval: 30s
  timeout: 5s
  retries: 3
  start_period: 15s
  • ✅ Target is http://localhost:4747/api/health — correct in-container address
  • curl -f exits 0 for 2xx, non-zero for 4xx/5xx or connection failure — correct semantics for a finite response
  • ✅ No -I needed — response terminates naturally
  • curl is explicitly installed in Dockerfile.cli line 35: RUN apt-get update && apt-get install -y --no-install-recommends curl git
  • interval: 30s, timeout: 5s, retries: 3, start_period: 15s — all appropriate
  • gitnexus-web depends_on: condition: service_healthy is preserved — web service still waits for server healthcheck before starting
  • ✅ Cannot hang on SSE connection — /api/health terminates immediately
  • ✅ No internet access or workspace mount required
  • ℹ️ curl -f without -s will emit curl progress/error output to stdout on failure. The existing gitnexus-web healthcheck at line 38 uses the same curl -f pattern without -s, so this is consistent with repo style. Not a blocker.

Heartbeat / SSE Regression Assessment ✅

/api/heartbeat at api.ts:787-805 is completely unchanged:

app.get('/api/heartbeat', (_req, res) => {
  res.set({ 'Content-Type': 'text/event-stream', 'Cache-Control': 'no-cache', Connection: 'keep-alive' });
  res.flushHeaders();
  res.write(':ok\n\n');
  const interval = setInterval(() => res.write(':ping\n\n'), 15_000);
  _req.on('close', () => clearInterval(interval));
});
  • ✅ SSE headers, initial ping, keep-alive interval, and close cleanup are all unchanged
  • ✅ EventSource clients will continue to work identically
  • ✅ Docker healthcheck no longer touches this endpoint
  • ✅ No accidental refactor or movement of the heartbeat route

Test Assessment

  • ✅ CI reports 8022 tests passed, 0 failed, 3 platforms
  • rate-limit.test.ts source-grep tests check routes that carry createRouteLimiter/api/health is intentionally un-limited and is not asserted; these tests remain valid and unaffected
  • ⚠️ No dedicated route test for GET /api/health → 200 { "status": "ok" } (see Findings above)
  • ✅ No test relies on /api/heartbeat being the healthcheck target
  • ✅ The PR body documents manual Docker validation is pending reviewer verification — acceptable for a 3-line synchronous endpoint if CI is trusted

Hidden Unicode / Hygiene Assessment ✅

Running character analysis on the changed files confirms:

  • docker-compose.yaml line 22 (the changed healthcheck test command): pure ASCII['CMD', 'curl', '-f', 'http://localhost:4747/api/health']
  • docker-compose.yaml line 15: pre-existing em-dash in a comment. Not introduced by this PR.
  • gitnexus/src/server/api.ts lines 780–785 (the new health endpoint + its comment): pure ASCII
  • Em-dashes and arrows in other api.ts lines are pre-existing in comments and HTML string literals throughout the file

No bidi controls, no hidden characters, no invisible whitespace in executable code, route strings, or YAML structure. Clean.

  • ✅ No unrelated formatting churn
  • ✅ No dependency or package-lock changes
  • ✅ Diff is minimal and scoped exactly to the stated change

Elegance Assessment ✅

This is an exemplary small fix:

  • 7 lines added, 1 line changed — nothing more
  • Handler is 2 lines (comment block + one-liner body) — immediately obvious to any contributor
  • Comment explains the why (#1147, "container managers do not confuse a long-lived SSE stream") without over-explaining
  • No readiness framework, no DB ping, no filesystem check — correctly scoped as a liveness-only probe
  • Named /api/health — the universal convention for HTTP liveness endpoints
  • No duplication of existing health logic (none existed)
  • Does not break any existing SSE, API, or routing behaviour

Final Verdict

production-ready with minor follow-ups

The core change is correct, minimal, and well-scoped. /api/health returns a finite, synchronous {"status":"ok"} with no rate limiter, no SSE dependency, registered well before the SPA fallback (which also explicitly excludes /api/* paths), backed by a Dockerfile that already installs curl. The SSE heartbeat is untouched. CI is green across all three platforms with 8022 tests. The only gap against the full DoD bar is the absence of a dedicated route test — the DoD §6 flags this as a "Not Done" signal, but for a 3-line synchronous handler with no dependencies, the structural risk is low enough that it warrants a follow-up rather than a block. The landing page HTML cosmetic miss (not listing /api/health) is strictly minor. Both can be addressed in a fast follow-up commit without re-opening the healthcheck risk.


- Source-grep test asserts GET /api/health is registered in api.ts
- Landing page HTML now lists /api/health for discoverability
@magyargergo magyargergo merged commit 8fc32e6 into abhigyanpatwari:main May 5, 2026
24 of 25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

docker-compose: gitnexus-server healthcheck uses SSE endpoint and becomes unhealthy

2 participants