feat(audit): add immutable audit logging for device operations#1
feat(audit): add immutable audit logging for device operations#1pahuldeepp merged 3 commits intomasterfrom
Conversation
- audit_events table: append-only, REVOKE UPDATE/DELETE - Gateway logs device.created and device.creation_failed - libs/audit Go package for server-side audit events - Indexes on tenant_id, actor_id, event_type for fast queries - Fire-and-forget pattern — never blocks main request flow
|
Caution Review failedPull request was closed or merged during review 📝 WalkthroughWalkthroughAdds an append-only, database-backed audit logging system: TypeScript and Go libraries, a PostgreSQL migration creating an immutable Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Client as Client
participant Gateway as Gateway / POST /devices
participant Service as createDevice
participant Audit as Audit Module
participant DB as WriteDB (audit_events)
Client->>Gateway: POST /devices (body, headers)
Gateway->>Service: createDevice(...)
Service-->>Gateway: result / error
alt success
Gateway->>Audit: logAuditEvent({ eventType: "device.created", actorId, tenantId, resourceType:"device", resourceId, payload, ip, userAgent })
else failure
Gateway->>Audit: logAuditEvent({ eventType: "device.creation_failed", actorId, tenantId, resourceType:"device", resourceId, payloadWithError, ip, userAgent })
end
Audit->>DB: INSERT INTO audit_events (...)
DB-->>Audit: INSERT result / error
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/gateway/src/lib/audit.ts`:
- Line 1: The audit INSERTs currently use the read-only shared pool (pool)
causing failures in read-replica setups; create a dedicated write pool (e.g.,
writePool) that connects using WRITE_DATABASE_URL or the primary DB config and
replace usages in logAuditEvent so the INSERT INTO audit_events executes against
writePool; ensure the new writePool is exported from the DB module and imported
into the audit module, and close/handle the writePool connection similarly to
the existing pool.
In `@apps/gateway/src/server.ts`:
- Around line 160-169: The audit logging calls in the /devices request handlers
are currently awaited (logAuditEvent), which blocks the request; change both
usages (the device.created and device.updated calls to logAuditEvent inside the
device create/update handlers) to fire-and-forget by invoking logAuditEvent
without awaiting and ensure any errors are observed (eg. attach a background
rejection handler that logs failures) so the request response is not delayed by
audit DB latency. Reference the logAuditEvent calls in the device create/update
handlers and add a non-blocking invocation with an error-capture path (e.g.,
background .catch to processLogger.error) to avoid unhandled rejections.
- Around line 175-176: The tenantId fallback "unknown" is not a valid UUID and
will cause audit insert failures; replace the literal "unknown" with a valid nil
UUID constant (e.g. '00000000-0000-0000-0000-000000000000') and use that in the
tenantId assignment (where tenantId: req.user?.tenantId || ... is set in
server.ts); define or import a NIL_UUID constant and use tenantId:
req.user?.tenantId || NIL_UUID so audit_events.tenant_id always receives a valid
UUID.
In `@apps/telemetry-service/migrations/000002_create_audit_events.up.sql`:
- Line 21: The REVOKE line only removes PRIVILEGES from PUBLIC but does not
prevent the table owner or superusers from modifying audit_events; add a
table-level safeguard by creating a trigger function (e.g.,
prevent_audit_events_modification) that raises an exception and then attach a
BEFORE UPDATE OR DELETE trigger on audit_events to abort any UPDATE/DELETE
attempts, ensuring immutability even for owners/elevated roles; keep the REVOKE
line but implement the trigger and function to enforce true immutability.
In `@libs/audit/audit.go`:
- Around line 46-53: The async Logger.Log currently spawns a goroutine that
passes the caller's cancelable ctx into write, causing fire-and-forget audit
events to be dropped; change Logger.Log so the goroutine uses a non-cancelable
base context (e.g. context.Background()) and create any timeout with
context.WithTimeout(context.Background(), ...) before calling l.write, ensuring
the write call does not inherit the caller's cancellation; update both the
goroutine entry in Log and the WithTimeout usage in write (or where timeout is
created) to use context.Background() as the parent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7957f919-6d95-4713-ad25-9af0fead4a9b
📒 Files selected for processing (5)
apps/gateway/src/lib/audit.tsapps/gateway/src/server.tsapps/telemetry-service/migrations/000002_create_audit_events.down.sqlapps/telemetry-service/migrations/000002_create_audit_events.up.sqllibs/audit/audit.go
| await logAuditEvent({ | ||
| eventType: "device.created", | ||
| actorId: userId, | ||
| tenantId, | ||
| resourceType: "device", | ||
| resourceId: result?.deviceId || serialNumber, | ||
| payload: { serialNumber, requestId }, | ||
| ipAddress: req.ip, | ||
| userAgent: req.headers["user-agent"], | ||
| }); |
There was a problem hiding this comment.
Audit logging is still request-blocking because both calls are awaited.
On Line 160 and Line 173, await logAuditEvent(...) keeps /devices coupled to DB latency/failures, which contradicts the fire-and-forget requirement and can degrade availability.
Suggested change
- await logAuditEvent({
+ void logAuditEvent({
eventType: "device.created",
actorId: userId,
tenantId,
resourceType: "device",
resourceId: result?.deviceId || serialNumber,
payload: { serialNumber, requestId },
ipAddress: req.ip,
userAgent: req.headers["user-agent"],
});
...
- await logAuditEvent({
+ void logAuditEvent({
eventType: "device.creation_failed",
actorId: req.user?.sub || "unknown",
tenantId: req.user?.tenantId || "unknown",
resourceType: "device",
payload: { serialNumber: req.body?.serialNumber, error: String(err) },
ipAddress: req.ip,
userAgent: req.headers["user-agent"],
});Also applies to: 173-181
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/gateway/src/server.ts` around lines 160 - 169, The audit logging calls
in the /devices request handlers are currently awaited (logAuditEvent), which
blocks the request; change both usages (the device.created and device.updated
calls to logAuditEvent inside the device create/update handlers) to
fire-and-forget by invoking logAuditEvent without awaiting and ensure any errors
are observed (eg. attach a background rejection handler that logs failures) so
the request response is not delayed by audit DB latency. Reference the
logAuditEvent calls in the device create/update handlers and add a non-blocking
invocation with an error-capture path (e.g., background .catch to
processLogger.error) to avoid unhandled rejections.
- Use dedicated write pool for audit inserts (not read replica) - Remove await from logAuditEvent calls (true fire-and-forget) - Fix UUID fallback to nil UUID instead of 'unknown' string - Add immutability triggers to block UPDATE/DELETE at DB level - Use context.Background() in Go audit goroutine to prevent cancellation
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
libs/audit/audit.go (1)
46-55: 🧹 Nitpick | 🔵 TrivialUnused
ctxparameter inLogis misleading.The
ctxparameter is ignored since the function usescontext.Background()for fire-and-forget semantics. This can confuse callers who expect their context to control cancellation. Consider removing the parameter or documenting why it's ignored.Additionally, the goroutine captures
eventby closure rather than by value. While safe here becauseeventis a struct (copied on closure creation), explicitly passing it as a goroutine parameter makes intent clearer and matches the pattern suggested in the past review.♻️ Suggested fix
-func (l *Logger) Log(ctx context.Context, event Event) { +func (l *Logger) Log(event Event) { // Fire and forget — use Background() to decouple from request lifecycle // The caller context may be cancelled after the request completes - go func() { - if err := l.write(context.Background(), event); err != nil { + go func(ev Event) { + if err := l.write(context.Background(), ev); err != nil { log.Printf("[audit] failed to write event=%s actor=%s err=%v", - event.EventType, event.ActorID, err) + ev.EventType, ev.ActorID, err) } - }() + }(event) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@libs/audit/audit.go` around lines 46 - 55, The Log method currently ignores its ctx parameter and captures event via closure; change Logger.Log to remove the unused ctx parameter (i.e., func (l *Logger) Log(event Event)) and update all callers accordingly, and start the goroutine with the event passed as an argument (go func(ev Event) { if err := l.write(context.Background(), ev); ... }(event)) so the closure doesn't implicitly capture variables; retain the fire-and-forget behavior using context.Background() when calling l.write.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/gateway/src/lib/audit.ts`:
- Around line 4-11: The write pool (writePool) is not exported so server
shutdown logic can't close it; export the existing writePool constant (created
with new Pool) from the audit module and update the server shutdown handler to
import writePool and call its end() (await writePool.end()) alongside pool.end()
and redis.quit() on SIGTERM to ensure graceful cleanup of DB connections.
---
Duplicate comments:
In `@libs/audit/audit.go`:
- Around line 46-55: The Log method currently ignores its ctx parameter and
captures event via closure; change Logger.Log to remove the unused ctx parameter
(i.e., func (l *Logger) Log(event Event)) and update all callers accordingly,
and start the goroutine with the event passed as an argument (go func(ev Event)
{ if err := l.write(context.Background(), ev); ... }(event)) so the closure
doesn't implicitly capture variables; retain the fire-and-forget behavior using
context.Background() when calling l.write.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: b20e03bb-3689-446f-82a0-eb3cb0749b15
📒 Files selected for processing (6)
apps/gateway/src/lib/audit.tsapps/gateway/src/server.tsapps/telemetry-service/migrations/000002_create_audit_events.up.sqlcmdlibs/audit/audit.goopenapitools.json
| const writePool = new Pool({ | ||
| host: process.env.WRITE_DB_HOST || "postgres", | ||
| port: parseInt(process.env.WRITE_DB_PORT || "5432"), | ||
| database: process.env.WRITE_DB_NAME || "grainguard", | ||
| user: process.env.WRITE_DB_USER || "postgres", | ||
| password: process.env.WRITE_DB_PASSWORD || "postgres", | ||
| max: 5, | ||
| }); |
There was a problem hiding this comment.
Export writePool for graceful shutdown cleanup.
The dedicated write pool is correctly configured for audit inserts. However, server.ts handles SIGTERM and closes pool and redis, but this writePool isn't exported and won't be cleaned up, potentially leaving connections open during shutdown.
🛠️ Proposed fix
-const writePool = new Pool({
+export const writePool = new Pool({
host: process.env.WRITE_DB_HOST || "postgres",
port: parseInt(process.env.WRITE_DB_PORT || "5432"),
database: process.env.WRITE_DB_NAME || "grainguard",
user: process.env.WRITE_DB_USER || "postgres",
password: process.env.WRITE_DB_PASSWORD || "postgres",
max: 5,
});Then in server.ts, add cleanup:
import { writePool } from "./lib/audit";
process.on("SIGTERM", async () => {
await redis.quit();
await pool.end();
await writePool.end();
process.exit(0);
});📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const writePool = new Pool({ | |
| host: process.env.WRITE_DB_HOST || "postgres", | |
| port: parseInt(process.env.WRITE_DB_PORT || "5432"), | |
| database: process.env.WRITE_DB_NAME || "grainguard", | |
| user: process.env.WRITE_DB_USER || "postgres", | |
| password: process.env.WRITE_DB_PASSWORD || "postgres", | |
| max: 5, | |
| }); | |
| export const writePool = new Pool({ | |
| host: process.env.WRITE_DB_HOST || "postgres", | |
| port: parseInt(process.env.WRITE_DB_PORT || "5432"), | |
| database: process.env.WRITE_DB_NAME || "grainguard", | |
| user: process.env.WRITE_DB_USER || "postgres", | |
| password: process.env.WRITE_DB_PASSWORD || "postgres", | |
| max: 5, | |
| }); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/gateway/src/lib/audit.ts` around lines 4 - 11, The write pool
(writePool) is not exported so server shutdown logic can't close it; export the
existing writePool constant (created with new Pool) from the audit module and
update the server shutdown handler to import writePool and call its end() (await
writePool.end()) alongside pool.end() and redis.quit() on SIGTERM to ensure
graceful cleanup of DB connections.
|
fixed the issue |
Summary by CodeRabbit