Skip to content

feat(audit): add immutable audit logging for device operations#1

Merged
pahuldeepp merged 3 commits intomasterfrom
feat/audit-logging
Mar 21, 2026
Merged

feat(audit): add immutable audit logging for device operations#1
pahuldeepp merged 3 commits intomasterfrom
feat/audit-logging

Conversation

@pahuldeepp
Copy link
Copy Markdown
Owner

@pahuldeepp pahuldeepp commented Mar 21, 2026

  • audit_events table: append-only, REVOKE UPDATE/DELETE
  • Gateway logs device.created and device.creation_failed
  • libs/audit Go package for server-side audit events
  • Indexes on tenant_id, actor_id, event_type for fast queries
  • Fire-and-forget pattern — never blocks main request flow

Summary by CodeRabbit

  • New Features
    • System-wide audit logging added to record events (device lifecycle, telemetry queries, auth failures, admin actions) with actor/tenant IDs, resource details, client context, payloads, and timestamps.
    • Device creation endpoints now emit audit events for successes and failures.
  • Chores
    • Added migration and rollback to create an immutable, append-only audit_events table.
  • Config
    • Added OpenAPI generator configuration file.

- audit_events table: append-only, REVOKE UPDATE/DELETE
- Gateway logs device.created and device.creation_failed
- libs/audit Go package for server-side audit events
- Indexes on tenant_id, actor_id, event_type for fast queries
- Fire-and-forget pattern — never blocks main request flow
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 21, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Walkthrough

Adds an append-only, database-backed audit logging system: TypeScript and Go libraries, a PostgreSQL migration creating an immutable audit_events table, gateway integration to emit device creation success/failure events, and an OpenAPI Generator CLI config file.

Changes

Cohort / File(s) Summary
Gateway audit module & integration
apps/gateway/src/lib/audit.ts, apps/gateway/src/server.ts
New TypeScript audit module exposing writePool, AuditEventType, AuditEvent, and logAuditEvent() which INSERTs into audit_events. POST /devices now logs device.created on success and device.creation_failed on error; shutdown calls writePool.end().
Telemetry DB migrations
apps/telemetry-service/migrations/000002_create_audit_events.up.sql, apps/telemetry-service/migrations/000002_create_audit_events.down.sql
New migration creating audit_events (immutable append-only table: UUID id, event_type, actor_id, tenant_id, resource_type/id, payload JSONB, ip_address, user_agent, created_at), adds indexes, revokes UPDATE/DELETE, adds trigger function to prevent updates/deletes; down migration drops table.
Go audit library
libs/audit/audit.go
New Go package audit with EventType constants, Event struct, Logger (wraps pgxpool.Pool), async Log() (fire-and-forget) and synchronous LogSync() (with 5s timeout) methods that INSERT JSON payloads into audit_events.
Tooling config
openapitools.json
Adds OpenAPI Generator CLI config (schema path, 2-space JSON formatting, pins generator-cli to v7.20.0).

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client as Client
  participant Gateway as Gateway / POST /devices
  participant Service as createDevice
  participant Audit as Audit Module
  participant DB as WriteDB (audit_events)

  Client->>Gateway: POST /devices (body, headers)
  Gateway->>Service: createDevice(...)
  Service-->>Gateway: result / error
  alt success
    Gateway->>Audit: logAuditEvent({ eventType: "device.created", actorId, tenantId, resourceType:"device", resourceId, payload, ip, userAgent })
  else failure
    Gateway->>Audit: logAuditEvent({ eventType: "device.creation_failed", actorId, tenantId, resourceType:"device", resourceId, payloadWithError, ip, userAgent })
  end
  Audit->>DB: INSERT INTO audit_events (...)
  DB-->>Audit: INSERT result / error
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: adding immutable audit logging for device operations, which is the primary focus across all modified files.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/audit-logging

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/gateway/src/lib/audit.ts`:
- Line 1: The audit INSERTs currently use the read-only shared pool (pool)
causing failures in read-replica setups; create a dedicated write pool (e.g.,
writePool) that connects using WRITE_DATABASE_URL or the primary DB config and
replace usages in logAuditEvent so the INSERT INTO audit_events executes against
writePool; ensure the new writePool is exported from the DB module and imported
into the audit module, and close/handle the writePool connection similarly to
the existing pool.

In `@apps/gateway/src/server.ts`:
- Around line 160-169: The audit logging calls in the /devices request handlers
are currently awaited (logAuditEvent), which blocks the request; change both
usages (the device.created and device.updated calls to logAuditEvent inside the
device create/update handlers) to fire-and-forget by invoking logAuditEvent
without awaiting and ensure any errors are observed (eg. attach a background
rejection handler that logs failures) so the request response is not delayed by
audit DB latency. Reference the logAuditEvent calls in the device create/update
handlers and add a non-blocking invocation with an error-capture path (e.g.,
background .catch to processLogger.error) to avoid unhandled rejections.
- Around line 175-176: The tenantId fallback "unknown" is not a valid UUID and
will cause audit insert failures; replace the literal "unknown" with a valid nil
UUID constant (e.g. '00000000-0000-0000-0000-000000000000') and use that in the
tenantId assignment (where tenantId: req.user?.tenantId || ... is set in
server.ts); define or import a NIL_UUID constant and use tenantId:
req.user?.tenantId || NIL_UUID so audit_events.tenant_id always receives a valid
UUID.

In `@apps/telemetry-service/migrations/000002_create_audit_events.up.sql`:
- Line 21: The REVOKE line only removes PRIVILEGES from PUBLIC but does not
prevent the table owner or superusers from modifying audit_events; add a
table-level safeguard by creating a trigger function (e.g.,
prevent_audit_events_modification) that raises an exception and then attach a
BEFORE UPDATE OR DELETE trigger on audit_events to abort any UPDATE/DELETE
attempts, ensuring immutability even for owners/elevated roles; keep the REVOKE
line but implement the trigger and function to enforce true immutability.

In `@libs/audit/audit.go`:
- Around line 46-53: The async Logger.Log currently spawns a goroutine that
passes the caller's cancelable ctx into write, causing fire-and-forget audit
events to be dropped; change Logger.Log so the goroutine uses a non-cancelable
base context (e.g. context.Background()) and create any timeout with
context.WithTimeout(context.Background(), ...) before calling l.write, ensuring
the write call does not inherit the caller's cancellation; update both the
goroutine entry in Log and the WithTimeout usage in write (or where timeout is
created) to use context.Background() as the parent.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7957f919-6d95-4713-ad25-9af0fead4a9b

📥 Commits

Reviewing files that changed from the base of the PR and between 2f9228f and c25087e.

📒 Files selected for processing (5)
  • apps/gateway/src/lib/audit.ts
  • apps/gateway/src/server.ts
  • apps/telemetry-service/migrations/000002_create_audit_events.down.sql
  • apps/telemetry-service/migrations/000002_create_audit_events.up.sql
  • libs/audit/audit.go

Comment thread apps/gateway/src/lib/audit.ts Outdated
Comment thread apps/gateway/src/server.ts Outdated
Comment on lines +160 to +169
await logAuditEvent({
eventType: "device.created",
actorId: userId,
tenantId,
resourceType: "device",
resourceId: result?.deviceId || serialNumber,
payload: { serialNumber, requestId },
ipAddress: req.ip,
userAgent: req.headers["user-agent"],
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Audit logging is still request-blocking because both calls are awaited.

On Line 160 and Line 173, await logAuditEvent(...) keeps /devices coupled to DB latency/failures, which contradicts the fire-and-forget requirement and can degrade availability.

Suggested change
-      await logAuditEvent({
+      void logAuditEvent({
         eventType: "device.created",
         actorId: userId,
         tenantId,
         resourceType: "device",
         resourceId: result?.deviceId || serialNumber,
         payload: { serialNumber, requestId },
         ipAddress: req.ip,
         userAgent: req.headers["user-agent"],
       });
...
-      await logAuditEvent({
+      void logAuditEvent({
         eventType: "device.creation_failed",
         actorId: req.user?.sub || "unknown",
         tenantId: req.user?.tenantId || "unknown",
         resourceType: "device",
         payload: { serialNumber: req.body?.serialNumber, error: String(err) },
         ipAddress: req.ip,
         userAgent: req.headers["user-agent"],
       });

Also applies to: 173-181

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/gateway/src/server.ts` around lines 160 - 169, The audit logging calls
in the /devices request handlers are currently awaited (logAuditEvent), which
blocks the request; change both usages (the device.created and device.updated
calls to logAuditEvent inside the device create/update handlers) to
fire-and-forget by invoking logAuditEvent without awaiting and ensure any errors
are observed (eg. attach a background rejection handler that logs failures) so
the request response is not delayed by audit DB latency. Reference the
logAuditEvent calls in the device create/update handlers and add a non-blocking
invocation with an error-capture path (e.g., background .catch to
processLogger.error) to avoid unhandled rejections.

Comment thread apps/gateway/src/server.ts Outdated
Comment thread apps/telemetry-service/migrations/000002_create_audit_events.up.sql
Comment thread libs/audit/audit.go
- Use dedicated write pool for audit inserts (not read replica)
- Remove await from logAuditEvent calls (true fire-and-forget)
- Fix UUID fallback to nil UUID instead of 'unknown' string
- Add immutability triggers to block UPDATE/DELETE at DB level
- Use context.Background() in Go audit goroutine to prevent cancellation
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
libs/audit/audit.go (1)

46-55: 🧹 Nitpick | 🔵 Trivial

Unused ctx parameter in Log is misleading.

The ctx parameter is ignored since the function uses context.Background() for fire-and-forget semantics. This can confuse callers who expect their context to control cancellation. Consider removing the parameter or documenting why it's ignored.

Additionally, the goroutine captures event by closure rather than by value. While safe here because event is a struct (copied on closure creation), explicitly passing it as a goroutine parameter makes intent clearer and matches the pattern suggested in the past review.

♻️ Suggested fix
-func (l *Logger) Log(ctx context.Context, event Event) {
+func (l *Logger) Log(event Event) {
 	// Fire and forget — use Background() to decouple from request lifecycle
 	// The caller context may be cancelled after the request completes
-	go func() {
-		if err := l.write(context.Background(), event); err != nil {
+	go func(ev Event) {
+		if err := l.write(context.Background(), ev); err != nil {
 			log.Printf("[audit] failed to write event=%s actor=%s err=%v",
-				event.EventType, event.ActorID, err)
+				ev.EventType, ev.ActorID, err)
 		}
-	}()
+	}(event)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/audit/audit.go` around lines 46 - 55, The Log method currently ignores
its ctx parameter and captures event via closure; change Logger.Log to remove
the unused ctx parameter (i.e., func (l *Logger) Log(event Event)) and update
all callers accordingly, and start the goroutine with the event passed as an
argument (go func(ev Event) { if err := l.write(context.Background(), ev); ...
}(event)) so the closure doesn't implicitly capture variables; retain the
fire-and-forget behavior using context.Background() when calling l.write.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/gateway/src/lib/audit.ts`:
- Around line 4-11: The write pool (writePool) is not exported so server
shutdown logic can't close it; export the existing writePool constant (created
with new Pool) from the audit module and update the server shutdown handler to
import writePool and call its end() (await writePool.end()) alongside pool.end()
and redis.quit() on SIGTERM to ensure graceful cleanup of DB connections.

---

Duplicate comments:
In `@libs/audit/audit.go`:
- Around line 46-55: The Log method currently ignores its ctx parameter and
captures event via closure; change Logger.Log to remove the unused ctx parameter
(i.e., func (l *Logger) Log(event Event)) and update all callers accordingly,
and start the goroutine with the event passed as an argument (go func(ev Event)
{ if err := l.write(context.Background(), ev); ... }(event)) so the closure
doesn't implicitly capture variables; retain the fire-and-forget behavior using
context.Background() when calling l.write.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: b20e03bb-3689-446f-82a0-eb3cb0749b15

📥 Commits

Reviewing files that changed from the base of the PR and between c25087e and 8419060.

📒 Files selected for processing (6)
  • apps/gateway/src/lib/audit.ts
  • apps/gateway/src/server.ts
  • apps/telemetry-service/migrations/000002_create_audit_events.up.sql
  • cmd
  • libs/audit/audit.go
  • openapitools.json

Comment thread apps/gateway/src/lib/audit.ts Outdated
Comment on lines +4 to +11
const writePool = new Pool({
host: process.env.WRITE_DB_HOST || "postgres",
port: parseInt(process.env.WRITE_DB_PORT || "5432"),
database: process.env.WRITE_DB_NAME || "grainguard",
user: process.env.WRITE_DB_USER || "postgres",
password: process.env.WRITE_DB_PASSWORD || "postgres",
max: 5,
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Export writePool for graceful shutdown cleanup.

The dedicated write pool is correctly configured for audit inserts. However, server.ts handles SIGTERM and closes pool and redis, but this writePool isn't exported and won't be cleaned up, potentially leaving connections open during shutdown.

🛠️ Proposed fix
-const writePool = new Pool({
+export const writePool = new Pool({
   host:     process.env.WRITE_DB_HOST     || "postgres",
   port:     parseInt(process.env.WRITE_DB_PORT || "5432"),
   database: process.env.WRITE_DB_NAME     || "grainguard",
   user:     process.env.WRITE_DB_USER     || "postgres",
   password: process.env.WRITE_DB_PASSWORD || "postgres",
   max: 5,
 });

Then in server.ts, add cleanup:

import { writePool } from "./lib/audit";

process.on("SIGTERM", async () => {
  await redis.quit();
  await pool.end();
  await writePool.end();
  process.exit(0);
});
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const writePool = new Pool({
host: process.env.WRITE_DB_HOST || "postgres",
port: parseInt(process.env.WRITE_DB_PORT || "5432"),
database: process.env.WRITE_DB_NAME || "grainguard",
user: process.env.WRITE_DB_USER || "postgres",
password: process.env.WRITE_DB_PASSWORD || "postgres",
max: 5,
});
export const writePool = new Pool({
host: process.env.WRITE_DB_HOST || "postgres",
port: parseInt(process.env.WRITE_DB_PORT || "5432"),
database: process.env.WRITE_DB_NAME || "grainguard",
user: process.env.WRITE_DB_USER || "postgres",
password: process.env.WRITE_DB_PASSWORD || "postgres",
max: 5,
});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/gateway/src/lib/audit.ts` around lines 4 - 11, The write pool
(writePool) is not exported so server shutdown logic can't close it; export the
existing writePool constant (created with new Pool) from the audit module and
update the server shutdown handler to import writePool and call its end() (await
writePool.end()) alongside pool.end() and redis.quit() on SIGTERM to ensure
graceful cleanup of DB connections.

@pahuldeepp
Copy link
Copy Markdown
Owner Author

fixed the issue

@pahuldeepp pahuldeepp merged commit 530585b into master Mar 21, 2026
2 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant