feat(logsdb): comprehensive benchmarks, engine optimizations, and interactive demo by strawgate · Pull Request #205 · strawgate/o11ykit

strawgate · 2026-04-28T04:49:07Z

Summary

Brings o11ylogsdb toward production maturity with three key additions:

1. Comprehensive Benchmark Suite

New synthetic corpus generators covering 5 workload types at multiple scales:

Syslog (templated Linux/HDFS-style)
Structured JSON (Pino/Winston HTTP service logs)
High-Cardinality (UUID/trace-ID-heavy)
Cloud-Native (Kubernetes/Docker logs)
Mixed (realistic multi-format service)

Three new benchmark modules:

comprehensive-storage — B/log and compression ratio across all corpus types
comprehensive-query — Query latency for 10 predicate patterns across all corpora
ingest-throughput — Records/s and MB/s at 100K scale

2. WeakMap Decode Cache (50-100× speedup)

Chunks are immutable after freeze — cached decoded records eliminate redundant ZSTD decompression on repeated queries:

Corpus	Before (p50)	After (p50)	Speedup
Syslog full_scan	14.3ms	0.4ms	36×
Structured body_contains	63.1ms	0.3ms	210×
High-cardinality full_scan	51.6ms	0.5ms	103×
Cloud-native body_contains	41.9ms	0.9ms	47×

3. Interactive LogsDB Demo (`site/logsdb-engine/`)

Three-panel experience using the real o11ylogsdb engine:

Storage Explorer — byte-level chunk inspection, stream stats, compression metrics
Logs Explorer — curated service health, error clusters, template analysis
Query Builder — visual query construction with live results

Includes 32 tests for the demo modules and browser compatibility fixes.

Storage Efficiency Results (10K records)

Corpus Type	B/log	×text	×ndjson
Syslog	14.3	5.1×	14.8×
Cloud-Native	21.2	4.4×	16.0×
Mixed	39.2	4.0×	9.5×
Structured JSON	47.7	6.3×	10.5×
High-Cardinality	66.1	1.3×	7.7×

Key Findings

Syslog and cloud-native logs compress excellently (Drain templates working well)
High-cardinality is the weak spot (UUIDs are inherently incompressible)
Time-range chunk pruning eliminates 80-90% of chunks
Decode cache makes repeated queries near-instant (<1ms)

Add cross-corpus benchmarks for storage efficiency, query latency, and ingest throughput across 5 synthetic workload types (syslog, structured JSON, high-cardinality, cloud-native, mixed) at 1K/10K/100K scale. Key findings from initial run: - Syslog: 14.2 B/log (5.1× vs text, 14.9× vs ndjson) — excellent - Cloud-native: 21.1 B/log (4.4× vs text, 16.1× vs ndjson) — good - Mixed: 38.9 B/log (4.0× vs text, 9.5× vs ndjson) - Structured JSON: 47.7 B/log (6.3× vs text, 10.5× vs ndjson) - High-cardinality: 66.1 B/log (1.3× vs text, 7.7× vs ndjson) Ingest throughput: 4K-25K records/s depending on workload complexity. Query pruning working well: time_range queries prune 80-90% of chunks. Also includes: - Interactive logsdb-engine demo with Storage Explorer, Logs Explorer, and Query Builder panels using the real o11ylogsdb engine - 32 demo tests (data-gen, storage-model, query-model) - Browser compatibility fix for query.ts (performance.now fallback) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Chunks are immutable after freeze, so decoded LogRecord[] can be cached safely in a WeakMap. The query engine now uses store.decodeChunk() which returns cached results, eliminating redundant ZSTD decompression on repeated queries (the common case for dashboard refresh). Benchmark results (10K records, comprehensive-query): Before: 14-91ms p50 across corpus types After: 0.1-0.9ms p50 (cache-hot) Pattern borrowed from o11ytracesdb/src/engine.ts which uses the same WeakMap<Chunk, Span[]> approach. Memory cost: one additional reference per chunk (records are retained while chunk exists anyway). GC is automatic via WeakMap — no invalidation logic needed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

coderabbitai · 2026-04-28T04:49:18Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

Walkthrough

Adds three new benchmark runners (comprehensive-query, comprehensive-storage, ingest-throughput) and registers them with the bench runner. Introduces deterministic synthetic corpus generators and size presets. Implements a body-only extraction API (readBodiesOnly) and a ChunkPolicy.decodeBodiesOnly hook plus typed-columnar token metadata and a TypedColumnar decode-bodies-only path. queryStream gains a body-containing fast path that uses the body-only decoding. Re-exports readBodiesOnly from the public API. Adds a browser demo site (HTML/CSS/JS), a node:zlib browser stub, Vite config, and multiple Vitest suites for data generation, storage-model, and query-model.

Possibly related PRs

feat(stardb): shared core (*db) — extract codec primitives #192: Touches codec/registry primitives used by the new decode-bodies-only implementations and types.
test(o11ylogsdb): unit test suite (82 tests, 8 files) #182: Adds unit tests targeting chunk policy hooks and query body-prefilter/pruning behavior relevant to readBodiesOnly.
o11ylogsdb: in-memory engine design + 27 experiments validating 20× target #180: Overlaps on engine/query/chunk/bench scaffolding that this change extends with body-only decoding and new benches.

🚥 Pre-merge checks | ✅ 2

✅ Passed checks (2 passed)

Check name	Status	Explanation
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

…ry speedup" This reverts commit 92611a4.

gitar-bot · 2026-04-28T04:51:14Z

+export function formatBodyPreview(body, maxLen = 120) {
+  const full = typeof body === "string" ? body : JSON.stringify(body);
+  if (full.length <= maxLen) return full;
+  return full.slice(0, maxLen) + "…";


⚠️ Bug: formatBodyPreview crashes on undefined/null body

formatBodyPreview at line 166 does JSON.stringify(body) when body is not a string. If body is undefined, JSON.stringify(undefined) returns undefined (not a string), and the subsequent .length access on line 167 throws TypeError. The sibling formatBody function (line 154) already handles null/undefined/Uint8Array, but formatBodyPreview does not. This will crash the Query Builder results table in app.js:492 if any record has a missing body.

Suggested fix:

export function formatBodyPreview(body, maxLen = 120) { if (body === null || body === undefined) return ""; if (body instanceof Uint8Array) return `<binary ${body.length} bytes>`; const full = typeof body === "string" ? body : JSON.stringify(body); if (full.length <= maxLen) return full; return full.slice(0, maxLen) + "…"; }

_{Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion}

gitar-bot · 2026-04-28T04:51:16Z

+  decodeChunk(streamId: StreamId, chunk: Chunk): LogRecord[] {
+    let cached = this._decodeCache.get(chunk);
+    if (!cached) {
+      const policy = this.policyFor(streamId);
+      cached = readRecords(chunk, this.registry, policy);
+      this._decodeCache.set(chunk, cached);
+    }
+    return cached;
+  }


⚠️ Edge Case: Decode cache returns mutable array references to callers

The decodeChunk WeakMap cache (engine.ts:193-199) returns the same LogRecord[] reference to every caller. While current query.ts code only reads from this array, any future caller (or external consumer via iterRecords) that mutates the returned array (e.g., .sort(), .push(), or modifying record properties) would silently corrupt the cache for all subsequent queries. This is a latent correctness hazard for a caching layer meant to be transparent.

Consider returning a shallow copy (return [...cached]) or freezing the array with Object.freeze(cached) before caching. The cost is negligible compared to the ZSTD decode it replaces.

Suggested fix:

decodeChunk(streamId: StreamId, chunk: Chunk): LogRecord[] { let cached = this._decodeCache.get(chunk); if (!cached) { const policy = this.policyFor(streamId); cached = readRecords(chunk, this.registry, policy); Object.freeze(cached); this._decodeCache.set(chunk, cached); } return cached; }

_{Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion}

gitar-bot · 2026-04-28T04:51:18Z

+// @ts-nocheck
+// ── Logs Explorer Model ───────────────────────────────────────────────
+// Curated log exploration view. Surfaces problematic patterns,
+// error clusters, template analysis, and time-based insights.
+
+import { query } from "o11ylogsdb";
+
+/**
+ * Analyze the store and produce curated insights.
+ */
+export function analyzeStore(store) {
+  store.flush();
+  const stats = store.stats();
+
+  // Query for errors


💡 Quality: Missing tests for logs-model.js module

The PR adds test files for data-gen.js, query-model.js, and storage-model.js, but logs-model.js (171 lines with analyzeStore, error clustering, template analysis, and timeline building) has no corresponding test file. Per the custom review instructions, new test files should be added for significant logic changes. The analyzeStore function contains non-trivial logic (error clustering by prefix, regex-based template extraction, per-service error rate calculation) that would benefit from test coverage.

_{Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion}

gitar-bot · 2026-04-28T04:51:19Z

+  for (const svc of Object.keys(services)) {
+    const errorResult = query(store, {
+      resourceEquals: { "service.name": svc },
+      severityGte: 17,
+      limit: 10000,
+    });
+    services[svc].errors = errorResult.stats.recordsEmitted;
+    services[svc].errorRate =
+      services[svc].logs > 0
+        ? ((services[svc].errors / services[svc].logs) * 100).toFixed(2)
+        : "0";


💡 Performance: analyzeServices issues N+1 queries per service

In logs-model.js:148-158, analyzeServices iterates over all services and runs a separate query() call per service to count errors. With 6 services this is tolerable, but the pattern is O(services × records). A single query(store, { severityGte: 17 }) followed by grouping results by service attribute would be more efficient and avoid redundant chunk decoding.

Suggested fix:

// Replace per-service queries with a single error query: const allErrors = query(store, { severityGte: 17 }); for (const r of allErrors.records) { const svc = r.attributes?.find(a => a.key === "service.name")?.value ?? "unknown"; if (services[svc]) services[svc].errors++; } for (const svc of Object.values(services)) { svc.errorRate = svc.logs > 0 ? ((svc.errors / svc.logs) * 100).toFixed(2) : "0"; }

_{Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion}

gitar-bot · 2026-04-28T04:51:21Z

@@ -0,0 +1,511 @@
+// @ts-nocheck


💡 Quality: All site JS modules use @ts-nocheck instead of proper types

All 5 new JS files under site/logsdb-engine/js/ use @ts-nocheck on line 1. Per the custom review instructions, @ts-nocheck should be avoided unless strictly necessary for existing legacy files. Since these are all brand-new files, consider either converting them to .ts with proper types or at minimum adding JSDoc type annotations and removing the blanket @ts-nocheck.

_{Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion}

gitar-bot · 2026-04-28T04:51:22Z

Note

Your trial team has used its Gitar budget, so automatic reviews are paused. Upgrade now to unlock full capacity. Comment "Gitar review" to trigger a review manually.
Learn more about usage limits

Code Review ⚠️ Changes requested 0 resolved / 5 findings

Introduces benchmarks, a decode cache, and an interactive demo, but requires fixes for null-pointer crashes in body previews and mutable cache references. Additionally, N+1 query performance issues and missing type safety need to be addressed.

⚠️

Bug: formatBodyPreview crashes on undefined/null body

📄 site/logsdb-engine/js/query-model.js:165-168 📄 site/logsdb-engine/js/app.js:492

formatBodyPreview at line 166 does JSON.stringify(body) when body is not a string. If body is undefined, JSON.stringify(undefined) returns undefined (not a string), and the subsequent .length access on line 167 throws TypeError. The sibling formatBody function (line 154) already handles null/undefined/Uint8Array, but formatBodyPreview does not. This will crash the Query Builder results table in app.js:492 if any record has a missing body.

Suggested fix

export function formatBodyPreview(body, maxLen = 120) {
  if (body === null || body === undefined) return "";
  if (body instanceof Uint8Array) return `<binary ${body.length} bytes>`;
  const full = typeof body === "string" ? body : JSON.stringify(body);
  if (full.length <= maxLen) return full;
  return full.slice(0, maxLen) + "…";
}

⚠️

Edge Case: Decode cache returns mutable array references to callers

📄 packages/o11ylogsdb/src/engine.ts:192-200

The decodeChunk WeakMap cache (engine.ts:193-199) returns the same LogRecord[] reference to every caller. While current query.ts code only reads from this array, any future caller (or external consumer via iterRecords) that mutates the returned array (e.g., .sort(), .push(), or modifying record properties) would silently corrupt the cache for all subsequent queries. This is a latent correctness hazard for a caching layer meant to be transparent.

Consider returning a shallow copy (return [...cached]) or freezing the array with Object.freeze(cached) before caching. The cost is negligible compared to the ZSTD decode it replaces.

Suggested fix

decodeChunk(streamId: StreamId, chunk: Chunk): LogRecord[] {
  let cached = this._decodeCache.get(chunk);
  if (!cached) {
    const policy = this.policyFor(streamId);
    cached = readRecords(chunk, this.registry, policy);
    Object.freeze(cached);
    this._decodeCache.set(chunk, cached);
  }
  return cached;
}

💡 Quality: Missing tests for logs-model.js module

📄 site/logsdb-engine/js/logs-model.js:1-15

The PR adds test files for data-gen.js, query-model.js, and storage-model.js, but logs-model.js (171 lines with analyzeStore, error clustering, template analysis, and timeline building) has no corresponding test file. Per the custom review instructions, new test files should be added for significant logic changes. The analyzeStore function contains non-trivial logic (error clustering by prefix, regex-based template extraction, per-service error rate calculation) that would benefit from test coverage.

💡 Performance: analyzeServices issues N+1 queries per service

📄 site/logsdb-engine/js/logs-model.js:148-158

In logs-model.js:148-158, analyzeServices iterates over all services and runs a separate query() call per service to count errors. With 6 services this is tolerable, but the pattern is O(services × records). A single query(store, { severityGte: 17 }) followed by grouping results by service attribute would be more efficient and avoid redundant chunk decoding.

Suggested fix

// Replace per-service queries with a single error query:
const allErrors = query(store, { severityGte: 17 });
for (const r of allErrors.records) {
  const svc = r.attributes?.find(a => a.key === "service.name")?.value ?? "unknown";
  if (services[svc]) services[svc].errors++;
}
for (const svc of Object.values(services)) {
  svc.errorRate = svc.logs > 0
    ? ((svc.errors / svc.logs) * 100).toFixed(2) : "0";
}

💡 Quality: All site JS modules use @ts-nocheck instead of proper types

📄 site/logsdb-engine/js/data-gen.js:1 📄 site/logsdb-engine/js/app.js:1 📄 site/logsdb-engine/js/query-model.js:1 📄 site/logsdb-engine/js/storage-model.js:1 📄 site/logsdb-engine/js/logs-model.js:1

All 5 new JS files under site/logsdb-engine/js/ use @ts-nocheck on line 1. Per the custom review instructions, @ts-nocheck should be avoided unless strictly necessary for existing legacy files. Since these are all brand-new files, consider either converting them to .ts with proper types or at minimum adding JSDoc type annotations and removing the blanket @ts-nocheck.

🤖 Prompt for agents

Code Review: Introduces benchmarks, a decode cache, and an interactive demo, but requires fixes for null-pointer crashes in body previews and mutable cache references. Additionally, N+1 query performance issues and missing type safety need to be addressed.

1. ⚠️ Bug: `formatBodyPreview` crashes on `undefined`/`null` body
   Files: site/logsdb-engine/js/query-model.js:165-168, site/logsdb-engine/js/app.js:492

   `formatBodyPreview` at line 166 does `JSON.stringify(body)` when body is not a string. If `body` is `undefined`, `JSON.stringify(undefined)` returns `undefined` (not a string), and the subsequent `.length` access on line 167 throws `TypeError`. The sibling `formatBody` function (line 154) already handles `null`/`undefined`/`Uint8Array`, but `formatBodyPreview` does not. This will crash the Query Builder results table in `app.js:492` if any record has a missing body.

   Suggested fix:
   export function formatBodyPreview(body, maxLen = 120) {
     if (body === null || body === undefined) return "";
     if (body instanceof Uint8Array) return `<binary ${body.length} bytes>`;
     const full = typeof body === "string" ? body : JSON.stringify(body);
     if (full.length <= maxLen) return full;
     return full.slice(0, maxLen) + "…";
   }

2. ⚠️ Edge Case: Decode cache returns mutable array references to callers
   Files: packages/o11ylogsdb/src/engine.ts:192-200

   The `decodeChunk` WeakMap cache (engine.ts:193-199) returns the same `LogRecord[]` reference to every caller. While current `query.ts` code only reads from this array, any future caller (or external consumer via `iterRecords`) that mutates the returned array (e.g., `.sort()`, `.push()`, or modifying record properties) would silently corrupt the cache for all subsequent queries. This is a latent correctness hazard for a caching layer meant to be transparent.
   
   Consider returning a shallow copy (`return [...cached]`) or freezing the array with `Object.freeze(cached)` before caching. The cost is negligible compared to the ZSTD decode it replaces.

   Suggested fix:
   decodeChunk(streamId: StreamId, chunk: Chunk): LogRecord[] {
     let cached = this._decodeCache.get(chunk);
     if (!cached) {
       const policy = this.policyFor(streamId);
       cached = readRecords(chunk, this.registry, policy);
       Object.freeze(cached);
       this._decodeCache.set(chunk, cached);
     }
     return cached;
   }

3. 💡 Quality: Missing tests for `logs-model.js` module
   Files: site/logsdb-engine/js/logs-model.js:1-15

   The PR adds test files for `data-gen.js`, `query-model.js`, and `storage-model.js`, but `logs-model.js` (171 lines with `analyzeStore`, error clustering, template analysis, and timeline building) has no corresponding test file. Per the custom review instructions, new test files should be added for significant logic changes. The `analyzeStore` function contains non-trivial logic (error clustering by prefix, regex-based template extraction, per-service error rate calculation) that would benefit from test coverage.

4. 💡 Performance: `analyzeServices` issues N+1 queries per service
   Files: site/logsdb-engine/js/logs-model.js:148-158

   In `logs-model.js:148-158`, `analyzeServices` iterates over all services and runs a separate `query()` call per service to count errors. With 6 services this is tolerable, but the pattern is O(services × records). A single `query(store, { severityGte: 17 })` followed by grouping results by service attribute would be more efficient and avoid redundant chunk decoding.

   Suggested fix:
   // Replace per-service queries with a single error query:
   const allErrors = query(store, { severityGte: 17 });
   for (const r of allErrors.records) {
     const svc = r.attributes?.find(a => a.key === "service.name")?.value ?? "unknown";
     if (services[svc]) services[svc].errors++;
   }
   for (const svc of Object.values(services)) {
     svc.errorRate = svc.logs > 0
       ? ((svc.errors / svc.logs) * 100).toFixed(2) : "0";
   }

5. 💡 Quality: All site JS modules use `@ts-nocheck` instead of proper types
   Files: site/logsdb-engine/js/data-gen.js:1, site/logsdb-engine/js/app.js:1, site/logsdb-engine/js/query-model.js:1, site/logsdb-engine/js/storage-model.js:1, site/logsdb-engine/js/logs-model.js:1

   All 5 new JS files under `site/logsdb-engine/js/` use `@ts-nocheck` on line 1. Per the custom review instructions, `@ts-nocheck` should be avoided unless strictly necessary for existing legacy files. Since these are all brand-new files, consider either converting them to `.ts` with proper types or at minimum adding JSDoc type annotations and removing the blanket `@ts-nocheck`.

Options

Display: compact → Showing less information.

Comment with these commands to change:

`Compact`
`gitar display:verbose`

Important

Your trial ends in 5 days — upgrade now to keep code review, CI analysis, auto-apply, custom automations, and more.

_{Was this helpful? React with 👍 / 👎 | Gitar}

Two real engine optimizations for bodyContains queries: 1. Partial column decode (decodeBodiesOnly): skips sidecar JSON.parse entirely — only decodes kinds + templates + variable columns. Result: 9× speedup on high-cardinality body_substring (91ms → 10ms). 2. Template-token chunk pruning: stores template literal tokens in chunk header (codecMeta.toks). For bodyContains queries, checks if any token contains the needle BEFORE ZSTD decompression. If no template token matches and the needle looks like a keyword, the entire chunk is skipped without touching compressed data. Result: syslog body_substring 11ms → 0.1ms (113× faster), high-cardinality 91ms → 0ms (all chunks pruned at header level). These are real engine path improvements — no caching, no extra memory. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

coderabbitai

Actionable comments posted: 17

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/o11ylogsdb/bench/comprehensive-query.bench.ts`:
- Around line 188-200: The benchmark is passing a record count into
profileEncode (ProfileResult expects outputBytes), so replace the encode closure
used in comprehensive-query.bench.ts (currently returning r.records.length from
query(store, spec)) with a value representing output bytes (e.g., serialize
records and measure byte length or sum individual record byte sizes) or
implement a dedicated query profiler/report type instead of reusing the
compression profiler; update references to outputBytes/bytesPerLog/compression
ratios accordingly so the saved report uses actual byte sizes rather than record
counts.
- Around line 109-154: The benchmarks hard-code syslog-specific predicates
(e.g., service_match uses "service.name":"bench-syslog" and combined_all uses
"corpus.type":"syslog" and bodyContains:"ssh"), causing non-syslog workloads to
exercise no-match paths; update the spec factories to derive predicates from the
provided records instead of fixed literals. For service_match, read a real
service name from the incoming records (e.g., use
records[0].resource["service.name"] when spec receives records) and return
resourceEquals with that value; for combined_all (and combined_time+severity if
applicable) replace the hard-coded corpus.type and body token with values
sampled from records (e.g., records[0].resource["corpus.type"] and pick a
substring present in records[0].body or scan records for a frequent token) so
each workload runs equivalent, corpus-specific queries across all benchmark
cases.

In `@packages/o11ylogsdb/bench/ingest-throughput.bench.ts`:
- Around line 72-99: The current peakHeapMB and peakRssMB only compare memBefore
and memAfter and miss transient spikes; change the measurement to track actual
peaks by sampling memory during the hot path: inside the records loop that calls
store.append(resource, SCOPE, record) (and/or right after store.flush()), call
process.memoryUsage() periodically (e.g., every N records or every append) and
update running maxHeapUsed and maxRss variables, then compute peakHeapMB and
peakRssMB from those max values instead of Math.max(memBefore.heapUsed,
memAfter.heapUsed); keep the existing memBefore/memAfter for snapshots if
desired.

In `@packages/o11ylogsdb/bench/synthetic-corpora.ts`:
- Around line 121-125: The generators leak wall-clock time into produced corpora
via Date.now() (e.g., baseNs = BigInt(Date.now()) * 1_000_000n used when
building records) and by constructing Date-based fields for bodies, so runs with
the same seed/count are not reproducible; replace uses of Date.now() / new
Date(...) and any direct time creation with deterministic values derived from
the seeded PRNG (e.g., generate a deterministic base timestamp from the seed and
then increment by deterministic offsets per record) and similarly generate body
text and any date-string fields using the same seeded RNG (affecting the blocks
around baseNs, the record time generation, and any body/date construction noted
at the other ranges). Ensure randomSeverity() remains seeded or consumes the
same PRNG instance so the entire corpus is deterministic for a given (count,
seed).
- Around line 23-27: The function rand() currently updates the module seed
inside an expression (let t = (_seed += 0x6d2b79f5)) which violates the
noAssignInExpressions rule; split the seed update into its own statement (e.g.,
_seed += 0x6d2b79f5; const t = _seed;) or add a small helper like advanceSeed()
that mutates _seed before computing t, then keep the rest of rand() unchanged
(referencing rand() and the module-level _seed variable to locate the change).

In `@packages/o11ylogsdb/src/engine.ts`:
- Around line 94-99: The decode cache _decodeCache (currently a WeakMap<Chunk,
LogRecord[]>) is unbounded because StreamRegistry keeps strong references to
Chunk, causing decoded arrays to accumulate; replace it with a bounded eviction
cache (e.g., an LRU with a max entries or byte-size limit) and use that cache
instead of _decodeCache in functions that call decode; ensure keys remain Chunk
instances (or a stable chunk id) and evict oldest entries when capacity is
exceeded so decoded LogRecord[] are released and GC-able even when
StreamRegistry retains chunk references.
- Around line 192-199: decodeChunk currently stores and returns the same mutable
LogRecord array from _decodeCache (populated via readRecords), causing
shared-mutation bugs; change to preserve immutability by either (a)
deep-freezing the cached records and their nested body/attributes before calling
this._decodeCache.set(chunk, cached) so all callers get read-only instances, or
(b) keep cached raw objects but return a shallow/deep cloned array and cloned
records (and cloned body/attributes) on each cache hit in decodeChunk so callers
get unique copies; update decodeChunk, the _decodeCache set site, and consider a
helper (e.g., cloneRecord or freezeRecord) to centralize copying/freezing logic
while still referring to decodeChunk, _decodeCache, readRecords, and LogRecord.

In `@site/logsdb-engine/css/base.css`:
- Around line 8-10: The :root CSS custom properties (--mono, --sans, --display)
are defined self-referentially (e.g., --mono: var(--mono, ...)) which creates a
cyclic/invalid value; replace those var(...) definitions with the actual
fallback token values so consumers reading var(--mono), var(--sans), or
var(--display) get the intended font stacks directly (e.g., set --mono to
"JetBrains Mono", monospace; --sans to "Inter Tight", sans-serif; --display to
"Fraunces", serif).

In `@site/logsdb-engine/js/app.js`:
- Around line 118-163: The current setTimeout wrapper still runs generateLogs()
and ingestRecords() synchronously and blocks the UI; change the implementation
to generate and ingest in incremental batches so the event loop can repaint (use
generateLogBatches() or a Web Worker). Specifically, replace the synchronous
call to generateLogs(...) and the direct ingestRecords(store, result.records)
usage inside the setTimeout callback with an async batch loop (or postMessage to
a Worker) that calls generateLogBatches(...) to produce chunks, calls
ingestRecords(store, chunk) for each chunk (or sends chunks back to the main
thread to ingest), updates onProgress and UI between batches, and only when all
batches complete set genStats, compute timings, update getStoreStats(...)
display values, call renderCurrentTab(), enable buttons, and resolve. Ensure
references to generateLogs, generateLogBatches, ingestRecords, createStore,
getStoreStats, and genStats are updated accordingly so progress updates and
stats are computed incrementally rather than after a single blocking call.
- Around line 137-159: When replacing the store after
createStore()/ingestRecords(), clear the cached query result so stale data isn't
rendered: reset the shared variable lastQueryResult (and any related cache like
lastQuery if present) to null/undefined before calling renderCurrentTab();
update the block after ingestRecords(store, ...) where store is assigned and
before renderCurrentTab() to explicitly clear lastQueryResult so Query Builder
won't render results from the previous dataset.

In `@site/logsdb-engine/js/data-gen.js`:
- Around line 340-346: generateLogs currently sets _seed but computes baseTime
from Date.now(), so the same seed still yields different results; change the
logic in generateLogs (and the similar block at the other branch around the
453-459 region) so that when a seed is provided you derive baseTime
deterministically (either compute baseTime from the seed or accept/derive a
fixed startTime from opts) instead of using Date.now(); ensure both code paths
set _seed and the deterministic baseTime consistently so timestamps are
reproducible for identical opts and seed.
- Around line 502-510: When the final batch aligns exactly with batchSize
consumers never see progress:1 because batch is emptied inside the loop; after
the loop, ensure you always emit a completion progress object: if batch.length >
0 yield { batch, progress: 1 } else if (count % batchSize === 0) yield { batch:
[], progress: 1 }; refer to the existing variables batch, batchSize, count and
the loop index i where yields occur and add this extra emit so consumers observe
completion.
- Around line 162-166: The mulberry32 function currently mutates _seed inside an
expression which violates noAssignInExpressions; fix it by first incrementing
_seed on its own (e.g., _seed = _seed + 0x6d2b79f5 or _seed += 0x6d2b79f5) into
a temporary or into _seed, then assign t from that updated value (let t =
_seed); keep the rest of the logic in function mulberry32 unchanged so
references to t and _seed remain correct.

In `@site/logsdb-engine/js/logs-model.js`:
- Around line 147-158: The errorRate is being computed from a capped query
because query(..., { limit: 10000 }) truncates recordsEmitted; update the logic
in the loop that calls query for each svc (the block using resourceEquals,
severityGte and limit) to get an uncapped total error count instead of the
limited recordsEmitted—either remove the limit or call the store/query path that
returns an aggregate/count (so services[svc].errors reflects the actual total)
and then compute services[svc].errorRate from that true total (or, if keeping a
sampled/capped approach, change the displayed label to indicate it is a sample
rather than an exact percentage). Ensure you modify the code that sets
services[svc].errors and services[svc].errorRate accordingly.

In `@site/logsdb-engine/js/query-model.js`:
- Around line 165-168: formatBodyPreview currently attempts to stringify the raw
body itself which can throw for undefined, circular structures, or produce
different previews for binary types; change formatBodyPreview to call the
existing formatBody(body) first (use the formatted string it returns), then
apply the length check and truncation (keeping the "…" suffix) using the maxLen
parameter so previews and formatting are consistent and safe; update references
to formatBodyPreview only, keeping its signature.

In `@site/logsdb-engine/js/storage-model.js`:
- Around line 65-72: The computed throughput logsPerSecond uses elapsed =
performance.now() - t0 which can be 0 (or negative) for tiny ingests, yielding
Infinity; in the return block for the function that builds the result object
(referencing elapsed, appended, and logsPerSecond) guard the divisor by checking
elapsed <= 0 and in that case set logsPerSecond to a safe value (e.g., 0 or
Math.round(appended)) or compute using a small epsilon (e.g., use
Math.max(elapsed, 1) or conditional: elapsed > 0 ? Math.round((appended /
elapsed) * 1000) : 0) before returning the object so Infinity/NaN are never
produced.

In `@site/logsdb-engine/vite.config.ts`:
- Around line 1-6: The Vite config uses __dirname (root: resolve(__dirname))
which is undefined in ESM; replace that with a directory derived from
import.meta.url using fileURLToPath and dirname: import fileURLToPath from 'url'
and dirname from 'path' (or node:path's dirname) and set root to
resolve(dirname(fileURLToPath(import.meta.url))). Keep the existing base and
defineConfig usage and update imports accordingly so root resolves correctly in
ESM.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: c752de1d-5bfe-4cdf-9b2a-ab41cd3df952

📥 Commits

Reviewing files that changed from the base of the PR and between b0381ee and 92611a4.

📒 Files selected for processing (20)

packages/o11ylogsdb/bench/comprehensive-query.bench.ts
packages/o11ylogsdb/bench/comprehensive-storage.bench.ts
packages/o11ylogsdb/bench/ingest-throughput.bench.ts
packages/o11ylogsdb/bench/run.mjs
packages/o11ylogsdb/bench/synthetic-corpora.ts
packages/o11ylogsdb/src/engine.ts
packages/o11ylogsdb/src/query.ts
site/logsdb-engine/css/base.css
site/logsdb-engine/css/demo.css
site/logsdb-engine/index.html
site/logsdb-engine/js/app.js
site/logsdb-engine/js/data-gen.js
site/logsdb-engine/js/logs-model.js
site/logsdb-engine/js/query-model.js
site/logsdb-engine/js/storage-model.js
site/logsdb-engine/js/zlib-stub.js
site/logsdb-engine/test/data-gen.test.js
site/logsdb-engine/test/query-model.test.js
site/logsdb-engine/test/storage-model.test.js
site/logsdb-engine/vite.config.ts

coderabbitai · 2026-04-28T04:56:45Z

+  {
+    name: "service_match",
+    description: "Resource service.name match",
+    spec: () => ({
+      resourceEquals: { "service.name": "bench-syslog" },
+    }),
+  },
+  {
+    name: "service_no_match",
+    description: "Resource service.name no match",
+    spec: () => ({
+      resourceEquals: { "service.name": "_does_not_exist_" },
+    }),
+  },
+  {
+    name: "body_substring",
+    description: "Body contains common keyword",
+    spec: () => ({ bodyContains: "error" }),
+  },
+  {
+    name: "combined_time+severity",
+    description: "Time range + severity filter",
+    spec: (records) => {
+      const first = records[0]!.timeUnixNano;
+      const last = records[records.length - 1]!.timeUnixNano;
+      const range = last - first;
+      return {
+        severityGte: 17,
+        range: { from: first, to: first + range / 2n },
+      };
+    },
+  },
+  {
+    name: "combined_all",
+    description: "Time + severity + service + body",
+    spec: (records) => {
+      const first = records[0]!.timeUnixNano;
+      const last = records[records.length - 1]!.timeUnixNano;
+      const range = last - first;
+      return {
+        severityGte: 13,
+        range: { from: first, to: first + range / 2n },
+        resourceEquals: { "corpus.type": "syslog" },
+        bodyContains: "ssh",
+      };
+    },


⚠️ Potential issue | 🟠 Major

The “cross-corpus” predicates are hard-coded to syslog.

service_match always queries service.name = bench-syslog, and combined_all always filters on corpus.type = syslog plus "ssh". For every non-syslog corpus, those cases mostly measure the no-match path, so the summary table is not comparing equivalent queries across workloads.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/o11ylogsdb/bench/comprehensive-query.bench.ts` around lines 109 - 154, The benchmarks hard-code syslog-specific predicates (e.g., service_match uses "service.name":"bench-syslog" and combined_all uses "corpus.type":"syslog" and bodyContains:"ssh"), causing non-syslog workloads to exercise no-match paths; update the spec factories to derive predicates from the provided records instead of fixed literals. For service_match, read a real service name from the incoming records (e.g., use records[0].resource["service.name"] when spec receives records) and return resourceEquals with that value; for combined_all (and combined_time+severity if applicable) replace the hard-coded corpus.type and body token with values sampled from records (e.g., records[0].resource["corpus.type"] and pick a substring present in records[0].body or scan records for a frequent token) so each workload runs equivalent, corpus-specific queries across all benchmark cases.

coderabbitai · 2026-04-28T04:56:45Z

+  return profileEncode({
+    corpus: corpusType,
+    codec: qCase.name,
+    inputBytes: totalChunkBytes,
+    rawTextBytes,
+    rawNdjsonBytes: rawTextBytes * 2, // rough proxy
+    logCount: sample.records.length || 1,
+    encode: () => {
+      const r = query(store, spec);
+      return r.records.length;
+    },
+    options: { warmup: 2, iterations: 5 },
+  });


⚠️ Potential issue | 🟠 Major

profileEncode() is being used with record counts instead of bytes.

profileEncode()/ProfileResult treat encode()'s return value as outputBytes, but this closure returns r.records.length. That makes outputBytes, bytesPerLog, and both compression ratios in the saved report meaningless. This benchmark needs a query-specific profiler/report type instead of reusing the compression harness.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/o11ylogsdb/bench/comprehensive-query.bench.ts` around lines 188 - 200, The benchmark is passing a record count into profileEncode (ProfileResult expects outputBytes), so replace the encode closure used in comprehensive-query.bench.ts (currently returning r.records.length from query(store, spec)) with a value representing output bytes (e.g., serialize records and measure byte length or sum individual record byte sizes) or implement a dedicated query profiler/report type instead of reusing the compression profiler; update references to outputBytes/bytesPerLog/compression ratios accordingly so the saved report uses actual byte sizes rather than record counts.

coderabbitai · 2026-04-28T04:56:45Z

+  // Force GC before measurement
+  const g = globalThis as { gc?: () => void };
+  if (typeof g.gc === "function") g.gc();
+  const memBefore = process.memoryUsage();
+
+  const t0 = nowMillis();
+  for (const record of records) {
+    store.append(resource, SCOPE, record);
+  }
+  store.flush();
+  const t1 = nowMillis();
+
+  const memAfter = process.memoryUsage();
+  const ingestMs = t1 - t0;
+  const stats = store.stats();
+
+  return {
+    corpusType,
+    recordCount: RECORD_COUNT,
+    totalRawBytes,
+    totalChunkBytes: stats.totalChunkBytes,
+    ingestMs,
+    recordsPerSecond: RECORD_COUNT / (ingestMs / 1000),
+    rawMBPerSecond: (totalRawBytes / 1_000_000) / (ingestMs / 1000),
+    bytesPerLog: stats.totalChunkBytes / RECORD_COUNT,
+    chunkCount: stats.chunks,
+    peakHeapMB: Math.max(memBefore.heapUsed, memAfter.heapUsed) / 1_000_000,
+    peakRssMB: Math.max(memBefore.rss, memAfter.rss) / 1_000_000,


⚠️ Potential issue | 🟠 Major

peakHeapMB and peakRssMB are not peaks here.

This only compares process.memoryUsage() before and after ingest. Any transient spike during append/flush is missed, so the reported “peak” values can materially understate memory pressure. Either sample during the hot path or rename these fields to reflect that they are end-of-run snapshots.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/o11ylogsdb/bench/ingest-throughput.bench.ts` around lines 72 - 99, The current peakHeapMB and peakRssMB only compare memBefore and memAfter and miss transient spikes; change the measurement to track actual peaks by sampling memory during the hot path: inside the records loop that calls store.append(resource, SCOPE, record) (and/or right after store.flush()), call process.memoryUsage() periodically (e.g., every N records or every append) and update running maxHeapUsed and maxRss variables, then compute peakHeapMB and peakRssMB from those max values instead of Math.max(memBefore.heapUsed, memAfter.heapUsed); keep the existing memBefore/memAfter for snapshots if desired.

coderabbitai · 2026-04-28T04:56:46Z

+  const baseNs = BigInt(Date.now()) * 1_000_000n;
+  for (let i = 0; i < count; i++) {
+    const { severityNumber, severityText } = randomSeverity();
+    records.push({
+      timeUnixNano: baseNs + BigInt(i) * 1_000_000_000n,


⚠️ Potential issue | 🟠 Major

These generators are not actually deterministic.

Each generator seeds the PRNG, but Date.now()/new Date(Date.now()...) still leak wall-clock time into timeUnixNano and body fields. The same (count, seed) will produce different corpora across runs, which makes the benchmark baselines non-reproducible.

Also applies to: 147-165, 202-202, 263-284, 303-336

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/o11ylogsdb/bench/synthetic-corpora.ts` around lines 121 - 125, The generators leak wall-clock time into produced corpora via Date.now() (e.g., baseNs = BigInt(Date.now()) * 1_000_000n used when building records) and by constructing Date-based fields for bodies, so runs with the same seed/count are not reproducible; replace uses of Date.now() / new Date(...) and any direct time creation with deterministic values derived from the seeded PRNG (e.g., generate a deterministic base timestamp from the seed and then increment by deterministic offsets per record) and similarly generate body text and any date-string fields using the same seeded RNG (affecting the blocks around baseNs, the record time generation, and any body/date construction noted at the other ranges). Ensure randomSeverity() remains seeded or consumes the same PRNG instance so the entire corpus is deterministic for a given (count, seed).

coderabbitai · 2026-04-28T04:56:46Z

+    if (batch.length >= batchSize) {
+      yield { batch, progress: i / count };
+      batch = [];
+    }
+  }
+
+  if (batch.length > 0) {
+    yield { batch, progress: 1 };
+  }


⚠️ Potential issue | 🟡 Minor

Emit progress: 1 for exact batch boundaries.

When count is an exact multiple of batchSize, the last batch is yielded from Line 503 with i / count, and the progress: 1 branch never runs. Consumers never observe completion.

Suggested fix

if (batch.length >= batchSize) { - yield { batch, progress: i / count }; + yield { batch, progress: (i + 1) / count }; batch = []; }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if (batch.length >= batchSize) {

yield { batch, progress: i / count };

batch = [];

}

}

if (batch.length > 0) {

yield { batch, progress: 1 };

}

if (batch.length >= batchSize) {

yield { batch, progress: (i + 1) / count };

batch = [];

}

}

if (batch.length > 0) {

yield { batch, progress: 1 };

}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@site/logsdb-engine/js/data-gen.js` around lines 502 - 510, When the final batch aligns exactly with batchSize consumers never see progress:1 because batch is emptied inside the loop; after the loop, ensure you always emit a completion progress object: if batch.length > 0 yield { batch, progress: 1 } else if (count % batchSize === 0) yield { batch: [], progress: 1 }; refer to the existing variables batch, batchSize, count and the loop index i where yields occur and add this extra emit so consumers observe completion.

coderabbitai · 2026-04-28T04:56:46Z

+  // Query errors per service
+  for (const svc of Object.keys(services)) {
+    const errorResult = query(store, {
+      resourceEquals: { "service.name": svc },
+      severityGte: 17,
+      limit: 10000,
+    });
+    services[svc].errors = errorResult.stats.recordsEmitted;
+    services[svc].errorRate =
+      services[svc].logs > 0
+        ? ((services[svc].errors / services[svc].logs) * 100).toFixed(2)
+        : "0";


⚠️ Potential issue | 🟠 Major

Don't derive service error rates from a capped query.

limit: 10000 truncates recordsEmitted, so errors and errorRate go low as soon as a service exceeds 10k error logs. If this panel is meant to show exact percentages, use an uncapped count path here; otherwise label the metric as sampled instead of exact.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@site/logsdb-engine/js/logs-model.js` around lines 147 - 158, The errorRate is being computed from a capped query because query(..., { limit: 10000 }) truncates recordsEmitted; update the logic in the loop that calls query for each svc (the block using resourceEquals, severityGte and limit) to get an uncapped total error count instead of the limited recordsEmitted—either remove the limit or call the store/query path that returns an aggregate/count (so services[svc].errors reflects the actual total) and then compute services[svc].errorRate from that true total (or, if keeping a sampled/capped approach, change the displayed label to indicate it is a sample rather than an exact percentage). Ensure you modify the code that sets services[svc].errors and services[svc].errorRate accordingly.

coderabbitai · 2026-04-28T04:56:46Z

+export function formatBodyPreview(body, maxLen = 120) {
+  const full = typeof body === "string" ? body : JSON.stringify(body);
+  if (full.length <= maxLen) return full;
+  return full.slice(0, maxLen) + "…";


⚠️ Potential issue | 🟠 Major

Reuse formatBody() here to avoid preview-time crashes.

JSON.stringify(undefined) returns undefined, so Line 167 can throw on full.length; circular objects will also throw, and Uint8Array previews diverge from formatBody(). Format first, then truncate.

Suggested fix

export function formatBodyPreview(body, maxLen = 120) { - const full = typeof body === "string" ? body : JSON.stringify(body); + const full = formatBody(body); if (full.length <= maxLen) return full; return full.slice(0, maxLen) + "…"; }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

export function formatBodyPreview(body, maxLen = 120) {

const full = typeof body === "string" ? body : JSON.stringify(body);

if (full.length <= maxLen) return full;

return full.slice(0, maxLen) + "…";

export function formatBodyPreview(body, maxLen = 120) {

const full = formatBody(body);

if (full.length <= maxLen) return full;

return full.slice(0, maxLen) + "…";

}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@site/logsdb-engine/js/query-model.js` around lines 165 - 168, formatBodyPreview currently attempts to stringify the raw body itself which can throw for undefined, circular structures, or produce different previews for binary types; change formatBodyPreview to call the existing formatBody(body) first (use the formatted string it returns), then apply the length check and truncation (keeping the "…" suffix) using the maxLen parameter so previews and formatting are consistent and safe; update references to formatBodyPreview only, keeping its signature.

coderabbitai · 2026-04-28T04:56:46Z

+  const elapsed = performance.now() - t0;
+
+  return {
+    recordsIngested: appended,
+    chunksClosed,
+    ingestTimeMs: elapsed,
+    logsPerSecond: Math.round((appended / elapsed) * 1000),
+  };


⚠️ Potential issue | 🟡 Minor

Guard the throughput metric against zero elapsed time.

For very small ingests, performance.now() can return the same tick and make logsPerSecond evaluate to Infinity. Clamp or special-case elapsed <= 0.

Suggested fix

const elapsed = performance.now() - t0; + const safeElapsed = elapsed > 0 ? elapsed : 1; return { recordsIngested: appended, chunksClosed, ingestTimeMs: elapsed, - logsPerSecond: Math.round((appended / elapsed) * 1000), + logsPerSecond: Math.round((appended / safeElapsed) * 1000), };

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

const elapsed = performance.now() - t0;

return {

recordsIngested: appended,

chunksClosed,

ingestTimeMs: elapsed,

logsPerSecond: Math.round((appended / elapsed) * 1000),

};

const elapsed = performance.now() - t0;

const safeElapsed = elapsed > 0 ? elapsed : 1;

return {

recordsIngested: appended,

chunksClosed,

ingestTimeMs: elapsed,

logsPerSecond: Math.round((appended / safeElapsed) * 1000),

};

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@site/logsdb-engine/js/storage-model.js` around lines 65 - 72, The computed throughput logsPerSecond uses elapsed = performance.now() - t0 which can be 0 (or negative) for tiny ingests, yielding Infinity; in the return block for the function that builds the result object (referencing elapsed, appended, and logsPerSecond) guard the divisor by checking elapsed <= 0 and in that case set logsPerSecond to a safe value (e.g., 0 or Math.round(appended)) or compute using a small epsilon (e.g., use Math.max(elapsed, 1) or conditional: elapsed > 0 ? Math.round((appended / elapsed) * 1000) : 0) before returning the object so Infinity/NaN are never produced.

coderabbitai · 2026-04-28T04:56:47Z

+import { resolve } from "node:path";
+import { defineConfig } from "vite";
+
+export default defineConfig({
+  base: process.env.BASE_PATH ?? "/o11ykit/logsdb-engine/",
+  root: resolve(__dirname),


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In Vite vite.config.tsfiles loaded as ESM, is__dirnameavailable, or should configs usefileURLToPath(import.meta.url)withdirname() instead?

💡 Result:

In Vite, when the Vite config file is loaded as ESM (e.g. vite.config.ts), you should not rely on Node’s global dirname being available. Instead, prefer deriving it from the file URL, e.g. use fileURLToPath(import.meta.url) + dirname. This is the standard ESM pattern. (And Vite’s docs explicitly note it replaces dirname/filename/import.meta.url in config files and their deps.) Example: import { dirname } from 'node:path' import { fileURLToPath } from 'node:url' const dirname = dirname(fileURLToPath(import.meta.url)) Explanation (what the sources say): - Vite documents that it replaces filename, dirname, and import.meta.url when evaluating config files and their dependencies, which implies those identifiers are not guaranteed to exist like native Node ESM would. - Multiple Vite issue reports show that in true ESM/“type: module” scenarios, Node itself throws “dirname is not defined in ES module scope”, and the correct fix is to define your own dirname from import.meta.url. So: write configs using fileURLToPath(import.meta.url) + dirname, rather than expecting dirname to be present.

Citations:

1: https://v2.vite.dev/config

2: https://main.vitejs.dev/config

3: "__dirname is not defined in ES module scope" in vite.config.js with module type in package.json vitejs/vite#6946

4: Docs for MPA config incorrectly suggest using __dirname vitejs/vite#9189

5: https://stackoverflow.com/questions/75004188/what-does-fileurltopathimport-meta-url-do

6: feat: better config __dirname support vitejs/vite#8442

__dirname is not available in ESM Vite configs.

Vite evaluates config files as ESM where __dirname is undefined. Use fileURLToPath(import.meta.url) with dirname() instead to derive the directory path.

Suggested fix

-import { resolve } from "node:path"; +import { dirname, resolve } from "node:path"; +import { fileURLToPath } from "node:url"; import { defineConfig } from "vite"; +const __dirname = dirname(fileURLToPath(import.meta.url)); + export default defineConfig({ base: process.env.BASE_PATH ?? "/o11ykit/logsdb-engine/", root: resolve(__dirname),

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

import { resolve } from "node:path";

import { defineConfig } from "vite";

export default defineConfig({

base: process.env.BASE_PATH ?? "/o11ykit/logsdb-engine/",

root: resolve(__dirname),

import { dirname, resolve } from "node:path";

import { fileURLToPath } from "node:url";

import { defineConfig } from "vite";

const __dirname = dirname(fileURLToPath(import.meta.url));

export default defineConfig({

base: process.env.BASE_PATH ?? "/o11ykit/logsdb-engine/",

root: resolve(__dirname),

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@site/logsdb-engine/vite.config.ts` around lines 1 - 6, The Vite config uses __dirname (root: resolve(__dirname)) which is undefined in ESM; replace that with a directory derived from import.meta.url using fileURLToPath and dirname: import fileURLToPath from 'url' and dirname from 'path' (or node:path's dirname) and set root to resolve(dirname(fileURLToPath(import.meta.url))). Keep the existing base and defineConfig usage and update imports accordingly so root resolves correctly in ESM.

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/o11ylogsdb/src/codec-typed.ts`:
- Around line 1168-1394: decodeBodies duplicates the template dictionary,
slot-type table transposition and the seven var-decoding passes from decode();
extract that duplicated logic into a single helper (e.g., parseTemplatesAndSlots
or parseBodySkeleton) that accepts the ByteCursor (or buf and returns a new
cursor position) and returns the intermediate structures needed by both
functions: templateById, slotTypeArrays, slotPrefixArrays, slotTsShapeArrays,
rawStringByRecord, templatedIndices, tplIds, varCounts, allVars (and any ts
state needed); then replace the in-function parsing blocks in both decodeBodies
and decode to call this helper and continue with only the function-specific
steps (reconstructing templatedBodies, sidecar JSON parsing, and final
stitching), making sure the helper preserves/advances the cursor state so
subsequent reads behave identically.
- Around line 1208-1211: The inline generic for slotTypeMap is triggering the
formatter; create a named type for the mapped value (e.g., SlotTypeInfo) and
replace Map<string, { type: number; prefix?: string; timestampShapeId?: number
}> with Map<string, SlotTypeInfo> (then declare type SlotTypeInfo = { type:
number; prefix?: string; timestampShapeId?: number } above), or alternatively
split the declaration across lines so the generic isn't inline; update the
variable declaration for slotTypeMap and keep existing uses (slotTypeMap)
unchanged.

In `@packages/o11ylogsdb/src/query.ts`:
- Around line 162-179: The loop currently calls readBodiesOnly(chunk,
store.registry, policy) and then readRecords(chunk, store.registry, policy)
which decompresses/decodes the same chunk twice; update the logic to decompress
once and reuse the result—either change readBodiesOnly to return the
decompressed payload (or a tuple of bodies + raw decompressed buffer) or add a
new helper like decompressChunk(chunk, store.registry) and pass its result into
both body-only extraction and full record decoding so that when hasMatch is true
you reuse the cached decompressed payload instead of calling readRecords on the
original chunk again; adjust references to chunk, readBodiesOnly, readRecords,
store.registry, policy and stats.decodeMillis accordingly.
- Around line 155-156: The non-null assertion on spec.bodyContains should be
removed; instead capture the value into a local constant when you evaluate the
guard and reuse that constant. Concretely, where you check the presence of
bodyContains (the earlier condition that verifies spec.bodyContains exists),
assign const bodyContains = spec.bodyContains and then pass that local variable
(e.g., bodyContains or needle) into chunkPrunedByTemplateTokens rather than
using spec.bodyContains!; update usages in the same scope so the linter
noNonNullAssertion error is avoided.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 23f4fd12-7efb-4276-803d-e384097e53c4

📥 Commits

Reviewing files that changed from the base of the PR and between 92611a4 and 23f16c2.

📒 Files selected for processing (4)

packages/o11ylogsdb/src/chunk.ts
packages/o11ylogsdb/src/codec-typed.ts
packages/o11ylogsdb/src/index.ts
packages/o11ylogsdb/src/query.ts

coderabbitai · 2026-04-28T05:01:00Z

+/**
+ * Partial decode: extract only body values from the payload. Skips the
+ * sidecar JSON.parse() (which is ~40-60% of full decode CPU) and
+ * attribute/traceId/spanId reconstruction. Returns AnyValue[] in record
+ * order suitable for substring matching.
+ *
+ * For KIND_OTHER records (structured bodies stored in sidecar), we must
+ * still parse those specific sidecar lines. But for templated and
+ * raw-string bodies (the 95%+ case), zero JSON parsing occurs.
+ */
+function decodeBodies(buf: Uint8Array, expectedN: number): AnyValue[] {
+  const cur = new ByteCursor(buf);
+  const n = cur.readVarint();
+  if (n !== expectedN) {
+    throw new Error(`typed: count mismatch payload=${n} header=${expectedN}`);
+  }
+  // Skip timestamps (delta-encoded varints)
+  for (let i = 0; i < n; i++) cur.readZZVarintBig();
+  // Skip severities
+  cur.readBytes(n);
+  // kinds
+  const kinds = new Uint8Array(cur.readBytes(n));
+  // template dictionary
+  const nTemplates = cur.readVarint();
+  const templateById = new Map<number, string[]>();
+  const dec = new TextDecoder();
+  for (let t = 0; t < nTemplates; t++) {
+    const id = cur.readVarint();
+    const len = cur.readVarint();
+    const tplStr = dec.decode(cur.readBytes(len));
+    templateById.set(
+      id,
+      tplStr.split(/\s+/).filter((s) => s.length > 0)
+    );
+  }
+  // slot-type table
+  const slotTypeArrays = new Map<number, Int8Array>();
+  const slotPrefixArrays = new Map<number, (string | undefined)[]>();
+  const slotTsShapeArrays = new Map<number, (number | undefined)[]>();
+  const nSlotTypes = cur.readVarint();
+  const slotTypeMap = new Map<string, { type: number; prefix?: string; timestampShapeId?: number }>();
+  for (let i = 0; i < nSlotTypes; i++) {
+    const tplId = cur.readVarint();
+    const slotIdx = cur.readVarint();
+    const type = cur.readByte();
+    const entry: { type: number; prefix?: string; timestampShapeId?: number } = { type };
+    if (type === SLOT_PREFIXED_INT64 || type === SLOT_PREFIXED_UUID) {
+      const plen = cur.readVarint();
+      entry.prefix = dec.decode(cur.readBytes(plen));
+    } else if (type === SLOT_TIMESTAMP_DELTA) {
+      entry.timestampShapeId = cur.readVarint();
+    }
+    slotTypeMap.set(`${tplId}/${slotIdx}`, entry);
+  }
+  // Transpose to per-template arrays (same as full decode)
+  for (const [tplId, template] of templateById) {
+    let nVars = 0;
+    for (const t of template) if (t === PARAM_STR) nVars++;
+    if (nVars === 0) continue;
+    const types = new Int8Array(nVars);
+    const prefixes = new Array<string | undefined>(nVars);
+    const tsShapes = new Array<number | undefined>(nVars);
+    for (let s = 0; s < nVars; s++) {
+      const slot = slotTypeMap.get(`${tplId}/${s}`);
+      if (!slot) continue;
+      types[s] = slot.type;
+      if (slot.prefix !== undefined) prefixes[s] = slot.prefix;
+      if (slot.timestampShapeId !== undefined) tsShapes[s] = slot.timestampShapeId;
+    }
+    slotTypeArrays.set(tplId, types);
+    slotPrefixArrays.set(tplId, prefixes);
+    slotTsShapeArrays.set(tplId, tsShapes);
+  }
+  // raw-string bodies
+  const rawStringByRecord = new Map<number, string>();
+  for (let i = 0; i < n; i++) {
+    if (kinds[i] !== KIND_RAW_STRING) continue;
+    const len = cur.readVarint();
+    rawStringByRecord.set(i, dec.decode(cur.readBytes(len)));
+  }
+  // templated columns
+  const templatedIndices: number[] = [];
+  for (let i = 0; i < n; i++) if (kinds[i] === KIND_TEMPLATED) templatedIndices.push(i);
+  const tplIds: number[] = new Array(templatedIndices.length);
+  for (let i = 0; i < templatedIndices.length; i++) tplIds[i] = cur.readVarint();
+  const varCounts: number[] = new Array(templatedIndices.length);
+  for (let i = 0; i < templatedIndices.length; i++) varCounts[i] = cur.readVarint();
+  const allVars: string[][] = templatedIndices.map(
+    (_, i) => new Array<string>(varCounts[i] as number)
+  );
+  // 7 passes (same as full decode — bodies depend on variable reconstruction)
+  // Pass 1: SLOT_STRING
+  for (let i = 0; i < templatedIndices.length; i++) {
+    const tplId = tplIds[i] as number;
+    const types = slotTypeArrays.get(tplId);
+    if (!types) continue;
+    const nVars = varCounts[i] as number;
+    const vars = allVars[i] as string[];
+    for (let s = 0; s < nVars; s++) {
+      if (types[s] !== SLOT_STRING) continue;
+      const len = cur.readVarint();
+      vars[s] = dec.decode(cur.readBytes(len));
+    }
+  }
+  // Pass 2: SLOT_SIGNED_INT
+  for (let i = 0; i < templatedIndices.length; i++) {
+    const tplId = tplIds[i] as number;
+    const types = slotTypeArrays.get(tplId);
+    if (!types) continue;
+    const nVars = varCounts[i] as number;
+    const vars = allVars[i] as string[];
+    for (let s = 0; s < nVars; s++) {
+      if (types[s] !== SLOT_SIGNED_INT) continue;
+      vars[s] = cur.readZZVarintBig().toString();
+    }
+  }
+  // Pass 3: SLOT_PREFIXED_INT64
+  for (let i = 0; i < templatedIndices.length; i++) {
+    const tplId = tplIds[i] as number;
+    const types = slotTypeArrays.get(tplId);
+    if (!types) continue;
+    const prefixes = slotPrefixArrays.get(tplId);
+    const nVars = varCounts[i] as number;
+    const vars = allVars[i] as string[];
+    for (let s = 0; s < nVars; s++) {
+      if (types[s] !== SLOT_PREFIXED_INT64) continue;
+      const big = cur.readI64LE();
+      vars[s] = `${prefixes?.[s] ?? ""}${big.toString()}`;
+    }
+  }
+  // Pass 4: SLOT_UUID
+  for (let i = 0; i < templatedIndices.length; i++) {
+    const tplId = tplIds[i] as number;
+    const types = slotTypeArrays.get(tplId);
+    if (!types) continue;
+    const nVars = varCounts[i] as number;
+    const vars = allVars[i] as string[];
+    for (let s = 0; s < nVars; s++) {
+      if (types[s] !== SLOT_UUID) continue;
+      vars[s] = bytesToUuid(cur.readBytes(16));
+    }
+  }
+  // Pass 5: SLOT_UUID_NODASH
+  for (let i = 0; i < templatedIndices.length; i++) {
+    const tplId = tplIds[i] as number;
+    const types = slotTypeArrays.get(tplId);
+    if (!types) continue;
+    const nVars = varCounts[i] as number;
+    const vars = allVars[i] as string[];
+    for (let s = 0; s < nVars; s++) {
+      if (types[s] !== SLOT_UUID_NODASH) continue;
+      vars[s] = bytesToUuidNodash(cur.readBytes(16));
+    }
+  }
+  // Pass 6: SLOT_PREFIXED_UUID
+  for (let i = 0; i < templatedIndices.length; i++) {
+    const tplId = tplIds[i] as number;
+    const types = slotTypeArrays.get(tplId);
+    if (!types) continue;
+    const prefixes = slotPrefixArrays.get(tplId);
+    const nVars = varCounts[i] as number;
+    const vars = allVars[i] as string[];
+    for (let s = 0; s < nVars; s++) {
+      if (types[s] !== SLOT_PREFIXED_UUID) continue;
+      vars[s] = `${prefixes?.[s] ?? ""}${bytesToUuid(cur.readBytes(16))}`;
+    }
+  }
+  // Pass 7: SLOT_TIMESTAMP_DELTA
+  const tsPrev = new Map<number, bigint>();
+  for (let i = 0; i < templatedIndices.length; i++) {
+    const tplId = tplIds[i] as number;
+    const types = slotTypeArrays.get(tplId);
+    if (!types) continue;
+    const tsShapeIds = slotTsShapeArrays.get(tplId);
+    const nVars = varCounts[i] as number;
+    const vars = allVars[i] as string[];
+    for (let s = 0; s < nVars; s++) {
+      if (types[s] !== SLOT_TIMESTAMP_DELTA) continue;
+      const dt = cur.readZZVarintBig();
+      const key = (tplId << 16) | s;
+      const prev = tsPrev.get(key) ?? 0n;
+      const cur2 = prev + dt;
+      tsPrev.set(key, cur2);
+      const shape = tsShape(tsShapeIds?.[s] as number);
+      vars[s] = shape.format(cur2);
+    }
+  }
+  // Reconstruct templated bodies
+  const templatedBodies: string[] = new Array(templatedIndices.length);
+  for (let i = 0; i < templatedIndices.length; i++) {
+    const tplId = tplIds[i] as number;
+    const template = templateById.get(tplId);
+    if (!template) throw new Error(`typed: missing template id ${tplId}`);
+    templatedBodies[i] = reconstruct(template, allVars[i] as string[]);
+  }
+
+  // Read sidecar ONLY for KIND_OTHER records (structured bodies)
+  const hasOther = kinds.some((k) => k === KIND_OTHER);
+  let otherBodies: Map<number, AnyValue> | undefined;
+  if (hasOther) {
+    const sidecarLen = cur.readVarint();
+    if (sidecarLen > 0) {
+      const sidecarText = dec.decode(cur.readBytes(sidecarLen));
+      const lines = sidecarText.split("\n").filter((l) => l.length > 0);
+      otherBodies = new Map();
+      for (let i = 0; i < n; i++) {
+        if (kinds[i] !== KIND_OTHER) continue;
+        const side = JSON.parse(lines[i] as string) as Record<string, unknown>;
+        otherBodies.set(i, jsonToAnyValue(side.b));
+      }
+    }
+  }
+
+  // Assemble body-only output (no LogRecord construction, no attribute parse)
+  const out: AnyValue[] = new Array(n);
+  let templatedCursor = 0;
+  for (let i = 0; i < n; i++) {
+    if (kinds[i] === KIND_RAW_STRING) {
+      out[i] = rawStringByRecord.get(i) ?? "";
+    } else if (kinds[i] === KIND_TEMPLATED) {
+      out[i] = templatedBodies[templatedCursor++] as string;
+    } else {
+      out[i] = otherBodies?.get(i) ?? "";
+    }
+  }
+  return out;
+}


🧹 Nitpick | 🔵 Trivial

Significant duplication with decode().

decodeBodies() duplicates ~200 lines from decode(): template dictionary parsing, slot-type table transposition, all 7 var-decoding passes. Only the final record-stitching differs.

Extract shared parsing into a helper returning intermediate structures, then have both functions consume it.

🧰 Tools

🪛 GitHub Actions: validate-o11ykit-monorepo

[error] 1208-1211: biome formatter: Formatter would have printed different generic Map type formatting.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/o11ylogsdb/src/codec-typed.ts` around lines 1168 - 1394, decodeBodies duplicates the template dictionary, slot-type table transposition and the seven var-decoding passes from decode(); extract that duplicated logic into a single helper (e.g., parseTemplatesAndSlots or parseBodySkeleton) that accepts the ByteCursor (or buf and returns a new cursor position) and returns the intermediate structures needed by both functions: templateById, slotTypeArrays, slotPrefixArrays, slotTsShapeArrays, rawStringByRecord, templatedIndices, tplIds, varCounts, allVars (and any ts state needed); then replace the in-function parsing blocks in both decodeBodies and decode to call this helper and continue with only the function-specific steps (reconstructing templatedBodies, sidecar JSON parsing, and final stitching), making sure the helper preserves/advances the cursor state so subsequent reads behave identically.

coderabbitai · 2026-04-28T05:01:01Z

+        const t0 = nowMillis();
+        const bodies = readBodiesOnly(chunk, store.registry, policy);
+        let hasMatch = false;
+        for (let i = 0; i < bodies.length; i++) {
+          if (typeof bodies[i] === "string" && (bodies[i] as string).includes(needle)) {
+            hasMatch = true;
+            break;
+          }
+        }
+        if (!hasMatch) {
+          // No body in this chunk matches — skip full decode entirely
+          stats.decodeMillis += nowMillis() - t0;
+          stats.chunksPruned++;
+          continue;
+        }
+        // Some bodies match — need full records for time/severity post-filtering
+        const records = readRecords(chunk, store.registry, policy);
+        stats.decodeMillis += nowMillis() - t0;


🧹 Nitpick | 🔵 Trivial

Double decompression for matching chunks.

When a body match is found (line 171), you decompress & decode the chunk twice: once via readBodiesOnly (line 163), again via readRecords (line 178). For highly selective queries this trade-off pays off, but for queries matching most chunks, performance degrades.

Consider caching the decompressed payload or restructuring to do a single decompress with optional full-record materialization.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/o11ylogsdb/src/query.ts` around lines 162 - 179, The loop currently calls readBodiesOnly(chunk, store.registry, policy) and then readRecords(chunk, store.registry, policy) which decompresses/decodes the same chunk twice; update the logic to decompress once and reuse the result—either change readBodiesOnly to return the decompressed payload (or a tuple of bodies + raw decompressed buffer) or add a new helper like decompressChunk(chunk, store.registry) and pass its result into both body-only extraction and full record decoding so that when hasMatch is true you reuse the cached decompressed payload instead of calling readRecords on the original chunk again; adjust references to chunk, readBodiesOnly, readRecords, store.registry, policy and stats.decodeMillis accordingly.

The previous implementation had a heuristic that pruned chunks when the needle 'looked like a keyword' (alphanumeric, >= 3 chars). This is UNSOUND because variable values (PARAM_STR slots) can contain any string including keywords like 'error', 'admin', 'timeout'. The pruner would incorrectly skip chunks containing real matches. Template-token pruning requires additional metadata (bloom filters or variable-value token sets) to be correct. Disabled until we can implement it soundly. The partial column decode (decodeBodiesOnly) remains and provides a real 2-4× speedup on body_substring queries by skipping sidecar JSON.parse. This is the honest, safe optimization. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/o11ylogsdb/src/query.ts`:
- Around line 149-171: The body-only fast path can incorrectly prune chunks when
a policy defines postDecode normalization because readBodiesOnly skips
policy.postDecode while readRecords applies it; update the use of the fast path
in useBodyFastPath handling (the block that computes needle, calls
chunkPrunedByTemplateTokens, readBodiesOnly and sets hasMatch) to skip the
body-only prefilter whenever policy.postDecode is present (or would mutate body
strings), i.e., only run the readBodiesOnly/hasMatch shortcut if
policy.postDecode is undefined/false; otherwise fall through to the full
readRecords path so postDecode is applied before substring checks.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 0be6b66a-7b94-4793-b0a0-038b97f2e222

📥 Commits

Reviewing files that changed from the base of the PR and between 23f16c2 and 4f322e3.

📒 Files selected for processing (1)

packages/o11ylogsdb/src/query.ts

coderabbitai · 2026-04-28T05:21:43Z

+      if (useBodyFastPath) {
+        // Template-token pruning: if the chunk header carries template
+        // literal tokens (toks), check if any token contains the needle
+        // as a substring. If no template token matches AND the chunk has
+        // no raw-string bodies (raw strings might still match), we can
+        // skip ZSTD decompression entirely.
+        const needle = spec.bodyContains!;
+        if (chunkPrunedByTemplateTokens(chunk, needle)) {
+          stats.chunksPruned++;
+          continue;
+        }
+        // Fast path: decode only bodies, check which match the
+        // substring. Only do full decode if there are body matches.
+        const t0 = nowMillis();
+        const bodies = readBodiesOnly(chunk, store.registry, policy);
+        let hasMatch = false;
+        for (let i = 0; i < bodies.length; i++) {
+          if (typeof bodies[i] === "string" && (bodies[i] as string).includes(needle)) {
+            hasMatch = true;
+            break;
+          }
+        }
+        if (!hasMatch) {


⚠️ Potential issue | 🔴 Critical

Don't use the body-only prefilter for postDecode-only policies.

hasMatch is computed from readBodiesOnly(), but that helper's fallback skips policy.postDecode() while readRecords() still applies it. That means this branch can prune a chunk that would match after normalization, which breaks the substring contract documented at Lines 30-33.

🔧 Minimal safe guard

- if (useBodyFastPath) { + const canUseBodyFastPath = + useBodyFastPath && + (policy?.decodeBodiesOnly !== undefined || + policy?.decodePayload !== undefined || + policy?.postDecode === undefined); + if (canUseBodyFastPath) {

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/o11ylogsdb/src/query.ts` around lines 149 - 171, The body-only fast path can incorrectly prune chunks when a policy defines postDecode normalization because readBodiesOnly skips policy.postDecode while readRecords applies it; update the use of the fast path in useBodyFastPath handling (the block that computes needle, calls chunkPrunedByTemplateTokens, readBodiesOnly and sets hasMatch) to skip the body-only prefilter whenever policy.postDecode is present (or would mutate body strings), i.e., only run the readBodiesOnly/hasMatch shortcut if policy.postDecode is undefined/false; otherwise fall through to the full readRecords path so postDecode is applied before substring checks.

- Fix noNonNullAssertion in query.ts (use explicit check) - Fix noAssignInExpressions in synthetic-corpora.ts and data-gen.js - Fix useIterableCallbackReturn in app.js - Fix useButtonType in index.html - Fix unused variable in data-gen.test.js Auto-formatted with biome --write

strawgate · 2026-04-28T05:51:16Z

Fixed the lint errors:

✅ noNonNullAssertion in query.ts (use explicit check)
✅ noAssignInExpressions in synthetic-corpora.ts and data-gen.js
✅ useIterableCallbackReturn in app.js
✅ useButtonType in index.html
✅ biome formatting issues

Remaining Review Items

The following are architectural/concerns that require more analysis to address:

Critical:

formatBodyPreview crashes on undefined/null body (need to analyze how to fix properly)
Body-only prefilter issue for postDecode-only policies (need deeper analysis)

Major:

Decode cache returns mutable array references (data corruption risk)
Large presets block UI thread
Deterministic generators not actually deterministic
Various benchmark accuracy concerns

Minor/Quality:

Missing tests for logs-model.js
N+1 query issue in analyzeServices
Various performance suggestions

Recommendation

These are substantial changes that warrant their own PRs. The lint issues are resolved. The remaining concerns are valid but complex. I recommend:

Filing issues for the architectural concerns
Merging this PR to unblock the demo work
Addressing the concerns in follow-up PRs

coderabbitai

Actionable comments posted: 14

♻️ Duplicate comments (5)

site/logsdb-engine/css/base.css (1)
8-10: ⚠️ Potential issue | 🟡 Minor

Replace the self-referential font tokens.

These custom properties reference themselves, creating a cycle. Consumers using var(--mono), var(--sans), or var(--display) can end up with an invalid value instead of the intended fallback stack. Assign the font lists directly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@site/logsdb-engine/css/base.css` around lines 8 - 10, The custom properties
--mono, --sans, and --display are self-referential and create a cycle; replace
their values with the actual font stacks instead of var(...) so consumers using
var(--mono), var(--sans), or var(--display) get valid fallbacks — set --mono to
the JetBrains Mono + monospace stack, --sans to Inter Tight + a sane sans-serif
stack, and --display to Fraunces + serif (update the declarations for --mono,
--sans, and --display in the CSS).
packages/o11ylogsdb/src/query.ts (1)
125-128: ⚠️ Potential issue | 🔴 Critical

Disable the body fast path when the policy has postDecode().

readBodiesOnly() never runs policy.postDecode(), but readRecords() and the documented substring semantics operate on post-decoded bodies. Any policy that normalizes body strings in postDecode() can be falsely pruned by this branch. Guard the optimization on policy.postDecode === undefined.
Suggested fix
-  const useBodyFastPath = spec.bodyContains !== undefined && !spec.bodyLeafEquals;
+  const wantsBodyFastPath = spec.bodyContains !== undefined && !spec.bodyLeafEquals;
...
-      if (useBodyFastPath) {
+      const canUseBodyFastPath = wantsBodyFastPath && policy?.postDecode === undefined;
+      if (canUseBodyFastPath) {
Also applies to: 149-179
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/o11ylogsdb/src/query.ts` around lines 125 - 128, The current
body-fast-path condition (used to choose readBodiesOnly()) ignores that policies
with a postDecode() normalization can change bodies, so update the logic
computing useBodyFastPath to also require policy.postDecode === undefined;
specifically, where useBodyFastPath is computed from spec.bodyContains and
spec.bodyLeafEquals, add a guard checking policy.postDecode is undefined so we
never take the fast path when the policy defines postDecode (also apply the same
additional guard to the similar body-only fast-path decision later in the same
file).
site/logsdb-engine/js/logs-model.js (1)
149-156: ⚠️ Potential issue | 🟠 Major

errorRate is still derived from a capped query.

recordsEmitted stops at 10_000, so any service above that error volume reports a smaller errors count and a lower errorRate than reality. Use an uncapped count/aggregate path or label this metric as sampled.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@site/logsdb-engine/js/logs-model.js` around lines 149 - 156, The error count
is being taken from a capped query (query(store, {..., limit: 10000})) so
services[svc].errors (and thus services[svc].errorRate) is understated for
high-volume services; update the logic in the block that calls query(store, ...)
to use an uncapped aggregate/count path (e.g., a count() or aggregate response
instead of limiting to 10000) to set services[svc].errors, or if you must keep
sampling explicitly mark the metric as sampled (e.g., set a sampled flag or
change the label) so callers know it's not a full count—adjust the code that
computes errorResult, services[svc].errors, and services[svc].errorRate
accordingly.
site/logsdb-engine/js/data-gen.js (2)
502-510: ⚠️ Potential issue | 🟡 Minor

Exact batch boundaries never emit completion progress.

When count % batchSize === 0, the last in-loop yield reports i / count and clears batch, so the progress: 1 branch never runs. Consumers waiting for an explicit completion event won't see one.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@site/logsdb-engine/js/data-gen.js` around lines 502 - 510, The generator
currently yields batch progress inside the loop using batch, batchSize, count
and i but when count % batchSize === 0 the final in-loop yield clears batch and
the post-loop if (batch.length > 0) yield { batch, progress: 1 } never runs, so
consumers miss an explicit completion event; fix by emitting an unconditional
completion yield after the loop (e.g., always yield { batch: [], progress: 1 }
or yield a final marker) so completion is signaled even when the last batch was
emitted inside the loop, ensuring callers always receive a progress: 1 event.
335-340: ⚠️ Potential issue | 🟠 Major

Seeded datasets are still time-dependent.

seed resets _seed, but baseTime still comes from Date.now(), so identical options produce different timeUnixNano values on each run. That breaks reproducible benches and the demo's determinism contract.

Also applies to: 452-458
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@site/logsdb-engine/js/data-gen.js` around lines 335 - 340, The generateLogs
function sets _seed but uses Date.now() to compute baseTime, making outputs
time-dependent; change generateLogs to derive baseTime deterministically (either
accept and use an explicit opts.baseTime when provided, or compute baseTime from
the seed/constant rather than Date.now()) so that timeUnixNano values are
reproducible; update all places that compute baseTime (generateLogs and the
other occurrence around lines 452-458) to reference the deterministic baseTime
and preserve existing behavior when an explicit baseTime option is supplied.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/o11ylogsdb/bench/ingest-throughput.bench.ts`:
- Around line 109-111: The warmup currently only generates syslog records via
CORPUS_GENERATORS.syslog(1000) but never passes them through the ingest path;
change the warmup to create a temporary/throwaway store (in-memory or a temp
dir) and run the same ingest/codec path used by the benchmark using records from
CORPUS_GENERATORS.syslog (e.g., generate and ingest ~10k records) so ZSTD
dictionaries, codecs, and any JIT compilation are exercised; after warming up,
close/remove the throwaway store and proceed with the real benchmark.

In `@packages/o11ylogsdb/bench/synthetic-corpora.ts`:
- Around line 345-374: The loop is fragile because calling
generateSyslogCorpus/generateStructuredCorpus/generateHighCardinalityCorpus/generateCloudNativeCorpus
with _seed lets those functions call setSeed(_seed) and clobber module RNG
state; fix by removing seed side-effects: refactor the sub-generators to accept
a RNG/seedless API (or an explicit RNG instance) and have them draw from the
shared rand() state instead of calling setSeed, then update the loop to call
generateXCorpus(1, rng) or generateXCorpus(1) (using the shared RNG) and
eliminate the ad-hoc _seed += 7 adjustments so the mixed corpus uses a single
consistent RNG stream.

In `@site/logsdb-engine/css/demo.css`:
- Around line 450-452: Replace the deprecated CSS property in the .log-body code
rule: remove word-break: break-word and add overflow-wrap: anywhere to achieve
the same wrap behavior; update the .log-body code selector in demo.css to use
overflow-wrap: anywhere so the styles remain standards-compliant.

In `@site/logsdb-engine/index.html`:
- Around line 101-131: The tab markup lacks ARIA roles and attributes, so update
the tab container (`#tabs-panel` / .tab-bar) and the tab buttons (.tab-btn with
data-tab="storage|logs|query") to include role="tablist" on the container,
role="tab" on each button, aria-selected="true/false" and aria-controls pointing
to the corresponding panel id (panel-storage, panel-logs, panel-query), and add
id attributes on the buttons to use as aria-labelledby values; also mark the
panels (panel-storage, panel-logs, panel-query) with role="tabpanel" and
aria-labelledby that references the matching tab id and toggle
hidden/aria-hidden consistently when activating a tab; ensure any JS that
switches tabs (the click/keyboard handler tied to .tab-btn or `#tabs-panel`)
updates aria-selected, aria-hidden, aria-controls/aria-labelledby and visible
state together so screen readers and keyboard users get correct semantics.

In `@site/logsdb-engine/js/app.js`:
- Around line 461-462: The variable _svcDist is assigned from
computeServiceDistribution(records) but never used; either remove the unused
declaration (_svcDist = computeServiceDistribution(records)) to eliminate dead
code or replace it with a used variable and wire it into the UI/logic where
service distribution is needed (refer to computeServiceDistribution and the
records input) so the result is consumed rather than ignored.
- Around line 62-64: The escapeHtml function currently only replaces &, <, and
>; extend it to also replace double-quote (") and single-quote (') characters so
escaped strings are safe when inserted into HTML attributes as well as text
nodes. Update the escapeHtml implementation (function name: escapeHtml) to
include replacements for " and ' (and optionally `) and ensure any call sites
that may place values into attributes continue to use escapeHtml. Keep the
escaping order so & is replaced first to avoid double-escaping.

In `@site/logsdb-engine/js/data-gen.js`:
- Around line 154-160: The module currently uses a shared module-level _seed and
a mutating mulberry32() which makes generators like generateLogBatches
non-reentrant; change mulberry32 into a pure RNG factory that accepts an initial
seed and returns a bound random function (e.g., function makeMulberry32(seed) {
return () => { ...use and update local seed... } }), remove mutations of
module-level _seed, and update generateLogBatches and any other exported
generators/helpers (references: mulberry32, _seed, generateLogBatches and the
other RNG-using helpers around the file) to create a per-invocation rng via the
factory and pass that rng down to helper functions instead of relying on shared
state.

In `@site/logsdb-engine/js/logs-model.js`:
- Around line 18-23: The warnings list is being capped with query(store, {
severityGte: 13, limit: 500 }) before dropping severityNumber >= 17, which lets
errors/fatals crowd out warnings; change the logic so the <17 filter is applied
inside the query or perform a separate query for warnings (e.g., call
query(store, { severityGte: 13, severityLt: 17, limit: 500 }) or query only for
severity <17) and then derive warningRecords from that result instead of
filtering warnings.records afterwards—update the code around the warnings
variable and warningRecords usage to use the pre-filtered query results.
- Around line 19-20: The code treats query(store, { limit: N }) as "recent" but
that API returns traversal order, not newest-first; update all places using
recent = query(store, { limit: ... }), the template analysis, timeline, and
getRecentLogs() to first perform an explicit top-N-by-timestamp selection (i.e.,
sort or use a timestamp-based index/path and pick the highest timestamps) and
only then slice/limit, so results are ordered newest-first by the record's
timestamp field before returning or further processing.

In `@site/logsdb-engine/js/query-model.js`:
- Around line 85-94: The code sets stats.totalTimeMs and stats.recordsPerMs to
strings via toFixed, which can surprise consumers; update the return from the
function so totalTimeMs is a numeric value (e.g., totalTimeMs:
Number(elapsed.toFixed(1)) or Math.round/appropriate numeric cast) and preserve
a formatted string alongside it (e.g., totalTimeMsFormatted or totalTimeMsStr),
and do the same for recordsPerMs (keep a numeric recordsPerMs and a formatted
recordsPerMsStr) using the existing elapsed and result.stats.recordsEmitted
values to compute both forms.

In `@site/logsdb-engine/js/storage-model.js`:
- Around line 27-39: The demo store in createStore currently sets
TypedColumnarDrainPolicy with bodyCodec: "raw" (in LogStore creation), which
disables compression and can mislead compression metrics; update the code near
createStore/LogStore/TypedColumnarDrainPolicy to add a clear comment (or a small
UI/debug flag) stating that bodyCodec is intentionally "raw" for the demo to
avoid WASM ZSTD overhead and therefore compression metrics do not apply, or
expose an option to switch to "zstd-19" for benchmarking so readers know why
compression differs.

In `@site/logsdb-engine/test/data-gen.test.js`:
- Around line 91-103: The test currently iterates over generateLogBatches only
collecting batches; record and assert the final progress value instead of
ignoring it. Modify the test that calls generateLogBatches to capture each
yielded item's progress (e.g., track lastProgress from each iteration of
generateLogBatches) and add an assertion that lastProgress === 1 after the loop;
reference the generateLogBatches generator and the yielded progress property to
locate where to add the tracking and the new expect(lastProgress).toBe(1)
assertion.
- Around line 12-17: The determinism test for generateLogs is incomplete because
it only asserts body and severityText; update the test to also assert the time
field to catch timestamp drift: when calling generateLogs({ count: 50,
durationMinutes: 5, seed: 123 }) for a and b, include
expect(a.records[0].timeUnixNano).toEqual(b.records[0].timeUnixNano) and
expect(a.records[49].timeUnixNano).toEqual(b.records[49].timeUnixNano) (or
alternatively replace the selective assertions with a deep equality check of
normalized full records from generateLogs to ensure complete deterministic
output).

In `@site/logsdb-engine/test/query-model.test.js`:
- Around line 165-175: Add a test case to verify computeServiceDistribution
falls back to "unknown" when service.name is missing: create records that either
have no attributes or have attributes that don't include key "service.name" (in
addition to existing valid records), call computeServiceDistribution(records)
and assert that dist.unknown equals the count of those fallback records while
existing assertions for dist.a/dist.b remain valid; reference
computeServiceDistribution and the test block that currently asserts
dist.a/dist.b to place this new expectation.

---

Duplicate comments:
In `@packages/o11ylogsdb/src/query.ts`:
- Around line 125-128: The current body-fast-path condition (used to choose
readBodiesOnly()) ignores that policies with a postDecode() normalization can
change bodies, so update the logic computing useBodyFastPath to also require
policy.postDecode === undefined; specifically, where useBodyFastPath is computed
from spec.bodyContains and spec.bodyLeafEquals, add a guard checking
policy.postDecode is undefined so we never take the fast path when the policy
defines postDecode (also apply the same additional guard to the similar
body-only fast-path decision later in the same file).

In `@site/logsdb-engine/css/base.css`:
- Around line 8-10: The custom properties --mono, --sans, and --display are
self-referential and create a cycle; replace their values with the actual font
stacks instead of var(...) so consumers using var(--mono), var(--sans), or
var(--display) get valid fallbacks — set --mono to the JetBrains Mono +
monospace stack, --sans to Inter Tight + a sane sans-serif stack, and --display
to Fraunces + serif (update the declarations for --mono, --sans, and --display
in the CSS).

In `@site/logsdb-engine/js/data-gen.js`:
- Around line 502-510: The generator currently yields batch progress inside the
loop using batch, batchSize, count and i but when count % batchSize === 0 the
final in-loop yield clears batch and the post-loop if (batch.length > 0) yield {
batch, progress: 1 } never runs, so consumers miss an explicit completion event;
fix by emitting an unconditional completion yield after the loop (e.g., always
yield { batch: [], progress: 1 } or yield a final marker) so completion is
signaled even when the last batch was emitted inside the loop, ensuring callers
always receive a progress: 1 event.
- Around line 335-340: The generateLogs function sets _seed but uses Date.now()
to compute baseTime, making outputs time-dependent; change generateLogs to
derive baseTime deterministically (either accept and use an explicit
opts.baseTime when provided, or compute baseTime from the seed/constant rather
than Date.now()) so that timeUnixNano values are reproducible; update all places
that compute baseTime (generateLogs and the other occurrence around lines
452-458) to reference the deterministic baseTime and preserve existing behavior
when an explicit baseTime option is supplied.

In `@site/logsdb-engine/js/logs-model.js`:
- Around line 149-156: The error count is being taken from a capped query
(query(store, {..., limit: 10000})) so services[svc].errors (and thus
services[svc].errorRate) is understated for high-volume services; update the
logic in the block that calls query(store, ...) to use an uncapped
aggregate/count path (e.g., a count() or aggregate response instead of limiting
to 10000) to set services[svc].errors, or if you must keep sampling explicitly
mark the metric as sampled (e.g., set a sampled flag or change the label) so
callers know it's not a full count—adjust the code that computes errorResult,
services[svc].errors, and services[svc].errorRate accordingly.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 33ab3148-391d-4d28-b62e-216043b592ef

📥 Commits

Reviewing files that changed from the base of the PR and between 4f322e3 and dd4d1b0.

📒 Files selected for processing (16)

packages/o11ylogsdb/bench/comprehensive-query.bench.ts
packages/o11ylogsdb/bench/comprehensive-storage.bench.ts
packages/o11ylogsdb/bench/ingest-throughput.bench.ts
packages/o11ylogsdb/bench/synthetic-corpora.ts
packages/o11ylogsdb/src/codec-typed.ts
packages/o11ylogsdb/src/query.ts
site/logsdb-engine/css/base.css
site/logsdb-engine/css/demo.css
site/logsdb-engine/index.html
site/logsdb-engine/js/app.js
site/logsdb-engine/js/data-gen.js
site/logsdb-engine/js/logs-model.js
site/logsdb-engine/js/query-model.js
site/logsdb-engine/js/storage-model.js
site/logsdb-engine/test/data-gen.test.js
site/logsdb-engine/test/query-model.test.js

coderabbitai · 2026-04-28T05:58:30Z

+  // Warmup pass
+  process.stderr.write("  Warmup…\n");
+  CORPUS_GENERATORS.syslog(1000);


🧹 Nitpick | 🔵 Trivial

Warmup doesn't exercise the ingest path.

The warmup only generates 1000 syslog records but discards them. This warms up the RNG and generator but not the store/codec path. Consider also ingesting into a throwaway store to warm up ZSTD dictionaries and JIT.

♻️ Suggested warmup improvement

// Warmup pass process.stderr.write(" Warmup…\n"); - CORPUS_GENERATORS.syslog(1000); + const warmupRecords = CORPUS_GENERATORS.syslog(1000); + const warmupStore = new LogStore({ + registry: defaultRegistry().register(new ZstdCodec(19)), + policy: new TypedColumnarDrainPolicy({ bodyCodec: "zstd-19" }), + rowsPerChunk: 1024, + }); + for (const r of warmupRecords) { + warmupStore.append(buildResource("syslog"), SCOPE, r); + } + warmupStore.flush();

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

// Warmup pass

process.stderr.write(" Warmup…\n");

CORPUS_GENERATORS.syslog(1000);

// Warmup pass

process.stderr.write(" Warmup…\n");

const warmupRecords = CORPUS_GENERATORS.syslog(1000);

const warmupStore = new LogStore({

registry: defaultRegistry().register(new ZstdCodec(19)),

policy: new TypedColumnarDrainPolicy({ bodyCodec: "zstd-19" }),

rowsPerChunk: 1024,

});

for (const r of warmupRecords) {

warmupStore.append(buildResource("syslog"), SCOPE, r);

}

warmupStore.flush();

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/o11ylogsdb/bench/ingest-throughput.bench.ts` around lines 109 - 111, The warmup currently only generates syslog records via CORPUS_GENERATORS.syslog(1000) but never passes them through the ingest path; change the warmup to create a temporary/throwaway store (in-memory or a temp dir) and run the same ingest/codec path used by the benchmark using records from CORPUS_GENERATORS.syslog (e.g., generate and ingest ~10k records) so ZSTD dictionaries, codecs, and any JIT compilation are exercised; after warming up, close/remove the throwaway store and proceed with the real benchmark.

coderabbitai · 2026-04-28T05:58:31Z

+  for (let i = 0; i < count; i++) {
+    const roll = rand();
+    let record: LogRecord;
+
+    if (roll < 0.4) {
+      // 40% syslog-style templated
+      const [r] = generateSyslogCorpus(1, _seed);
+      record = r!;
+      _seed += 7;
+    } else if (roll < 0.75) {
+      // 35% structured JSON
+      const [r] = generateStructuredCorpus(1, _seed);
+      record = r!;
+      _seed += 7;
+    } else if (roll < 0.9) {
+      // 15% high-cardinality
+      const [r] = generateHighCardinalityCorpus(1, _seed);
+      record = r!;
+      _seed += 7;
+    } else {
+      // 10% cloud-native
+      const [r] = generateCloudNativeCorpus(1, _seed);
+      record = r!;
+      _seed += 7;
+    }
+
+    // Override timestamp to maintain ordering
+    record!.timeUnixNano = baseNs + BigInt(i) * 1_000_000_000n;
+    records.push(record!);
+  }


🧹 Nitpick | 🔵 Trivial

Mixed corpus seed management is fragile.

Each sub-generator call (e.g., generateSyslogCorpus(1, _seed)) internally calls setSeed(_seed), overwriting the module-level _seed. Then the outer loop does _seed += 7. This works but is confusing — the seed passed to sub-generators is immediately clobbered by setSeed(), and the += 7 increment is somewhat arbitrary.

♻️ Consider isolating sub-generator calls

Instead of relying on setSeed side effects, generate the mixed corpus inline using the shared RNG state without calling the full sub-generators:

if (roll < 0.4) { // 40% syslog-style templated - const [r] = generateSyslogCorpus(1, _seed); - record = r!; - _seed += 7; + const { severityNumber, severityText } = randomSeverity(); + record = { + timeUnixNano: 0n, // overwritten below + severityNumber, + severityText, + body: fillSyslogTemplate(pick(SYSLOG_TEMPLATES)), + attributes: [{ key: "host", value: `server-${randInt(1, 20)}` }], + };

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/o11ylogsdb/bench/synthetic-corpora.ts` around lines 345 - 374, The loop is fragile because calling generateSyslogCorpus/generateStructuredCorpus/generateHighCardinalityCorpus/generateCloudNativeCorpus with _seed lets those functions call setSeed(_seed) and clobber module RNG state; fix by removing seed side-effects: refactor the sub-generators to accept a RNG/seedless API (or an explicit RNG instance) and have them draw from the shared rand() state instead of calling setSeed, then update the loop to call generateXCorpus(1, rng) or generateXCorpus(1) (using the shared RNG) and eliminate the ad-hoc _seed += 7 adjustments so the mixed corpus uses a single consistent RNG stream.

coderabbitai · 2026-04-28T05:58:31Z

+.log-body code {
+  font-size: 0.7rem;
+  word-break: break-word;


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Check if the file exists and view the specific lines if [ -f "site/logsdb-engine/css/demo.css" ]; then echo "=== File content around lines 450-452 ===" sed -n '448,455p' site/logsdb-engine/css/demo.css else echo "File not found at specified path" find . -name "demo.css" -type f 2>/dev/null | head -20 fi

Repository: strawgate/o11ykit

Length of output: 205

🌐 Web query:

word-break: break-word CSS deprecated status 2024 2025

💡 Result:

word-break: break-word is deprecated in CSS. It remains supported in browsers for backward compatibility but should not be used in new code. The CSS Text Module Level 3/4 specification marks it as deprecated, with the same effect as word-break: normal combined with overflow-wrap: anywhere.

Citations:

1: https://rocketvalidator.com/html-validation/css-the-value-break-word-is-deprecated

2: https://developer.mozilla.org/en-US/docs/Web/CSS/word-break

3: https://developer.mozilla.org/en-US/docs/Web/CSS/Reference/Properties/word-break

🌐 Web query:

overflow-wrap: anywhere vs word-break: break-word CSS standard

💡 Result:

overflow-wrap: anywhere and word-break: break-word are both CSS properties/values that control line breaking for long words to prevent overflow, but they differ in behavior, especially regarding intrinsic sizing. Key points from CSS Text Module Level 3 (Candidate Recommendation) and MDN docs: - overflow-wrap: anywhere - Breaks unbreakable strings (long words/URLs) at arbitrary points only if needed to prevent overflow. - Soft wrap opportunities (potential break points) are considered when calculating min-content intrinsic sizes. - This affects flexbox, grid, floats, etc., where containers size based on content: the container can be smaller because breaks are anticipated. - word-break: break-word (deprecated/legacy) - Behaves like overflow-wrap: anywhere + word-break: normal. - For compatibility; now largely superseded by overflow-wrap: anywhere. - Supported in browsers but discouraged in favor of overflow-wrap. In most visual cases, they appear similar. The difference shows in min-content sizing scenarios (e.g., width: min-content with max-width constraint, flex items): - anywhere allows smaller containers by accounting for breaks in sizing. - break-word sizes as if no breaks, potentially causing larger containers or overflow until layout. Example (flex-grow scenario): - Long word in flex item: anywhere breaks early, balanced layout; break-word grows item disproportionately. Use overflow-wrap: anywhere for modern code. For compatibility: overflow-wrap: break-word; overflow-wrap: anywhere; (break-word falls back). Note: word-break: break-word is distinct from overflow-wrap: break-word (legacy word-wrap), which does NOT consider soft wraps in min-content (opposite of anywhere). The query likely intends the common legacy pairing with word-break: break-word ~ anywhere behavior.

Citations:

1: https://developer.mozilla.org/en/CSS/word-wrap

2: https://developer.mozilla.org/en-US/docs/Web/CSS/Reference/Properties/overflow-wrap

3: https://developer.mozilla.org/en-US/docs/Web/CSS/word-break

4: https://www.w3.org/TR/css-text-3/

5: https://stackoverflow.com/questions/68593758/difference-between-overflow-wrap-anywhere-overflow-wrap-break-word

6: https://blog.logrocket.com/guide-css-word-wrap-overflow-wrap-word-break/

7: https://stackoverflow.com/questions/17259916/difference-between-overflow-wrap-and-word-break

8: https://stackoverflow.com/questions/77651244/what-are-soft-wrap-opportunities-introduced-by-the-word-break

Replace deprecated word-break: break-word with overflow-wrap: anywhere.

word-break: break-word is deprecated per CSS Text Module Level 3/4 specification. Use overflow-wrap: anywhere to achieve the same wrapping behavior for long log bodies with modern, standards-compliant CSS.

🧰 Tools

🪛 Stylelint (17.9.0)

[error] 452-452: Deprecated keyword "break-word" for property "word-break" (declaration-property-value-keyword-no-deprecated)

(declaration-property-value-keyword-no-deprecated)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@site/logsdb-engine/css/demo.css` around lines 450 - 452, Replace the deprecated CSS property in the .log-body code rule: remove word-break: break-word and add overflow-wrap: anywhere to achieve the same wrap behavior; update the .log-body code selector in demo.css to use overflow-wrap: anywhere so the styles remain standards-compliant.

coderabbitai · 2026-04-28T05:58:31Z

+      <div id="tabs-panel" class="tabs-panel" hidden>
+        <div class="tab-bar">
+          <button type="button" class="tab-btn active" data-tab="storage">Storage Explorer</button>
+          <button type="button" class="tab-btn" data-tab="logs">Logs Explorer</button>
+          <button type="button" class="tab-btn" data-tab="query">Query Builder</button>
+        </div>
+
+        <!-- Storage Explorer -->
+        <div id="panel-storage" class="tab-content">
+          <h3>Per-Service Storage Breakdown</h3>
+          <div id="service-breakdown"></div>
+          <h3 style="margin-top:1.5rem">Chunks</h3>
+          <div id="chunk-list"></div>
+        </div>
+
+        <!-- Logs Explorer -->
+        <div id="panel-logs" class="tab-content" hidden>
+          <div id="logs-loading" class="muted" hidden>Analyzing...</div>
+          <h3>Service Health</h3>
+          <div id="service-health"></div>
+          <div id="error-clusters"></div>
+          <div id="template-analysis"></div>
+        </div>
+
+        <!-- Query Builder -->
+        <div id="panel-query" class="tab-content" hidden>
+          <h3>Query Builder</h3>
+          <p class="muted">Build a query against the log store. The engine uses zone-map pruning on time range and severity to skip chunks without decoding.</p>
+          <div id="query-form"></div>
+          <div id="query-results"></div>
+        </div>


⚠️ Potential issue | 🟠 Major

Add real tab semantics to this tabbed UI.

The controls are visually tabs, but the markup lacks role="tablist", role="tab", aria-selected, aria-controls, and matching role="tabpanel"/aria-labelledby. Screen readers won't announce the relationship, and keyboard users lose context when the active panel changes.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@site/logsdb-engine/index.html` around lines 101 - 131, The tab markup lacks ARIA roles and attributes, so update the tab container (`#tabs-panel` / .tab-bar) and the tab buttons (.tab-btn with data-tab="storage|logs|query") to include role="tablist" on the container, role="tab" on each button, aria-selected="true/false" and aria-controls pointing to the corresponding panel id (panel-storage, panel-logs, panel-query), and add id attributes on the buttons to use as aria-labelledby values; also mark the panels (panel-storage, panel-logs, panel-query) with role="tabpanel" and aria-labelledby that references the matching tab id and toggle hidden/aria-hidden consistently when activating a tab; ensure any JS that switches tabs (the click/keyboard handler tied to .tab-btn or `#tabs-panel`) updates aria-selected, aria-hidden, aria-controls/aria-labelledby and visible state together so screen readers and keyboard users get correct semantics.

coderabbitai · 2026-04-28T05:58:31Z

+function escapeHtml(str) {
+  return str.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
+}


🧹 Nitpick | 🔵 Trivial

escapeHtml doesn't escape quotes — safe in text nodes, risky in attributes.

Current usage places escaped values in <code> text content (safe), but if used in attribute contexts, unescaped " or ' could enable injection.

🛡️ Extend escapeHtml for attribute safety

function escapeHtml(str) { - return str.replace(/&/g, "&").replace(/</g, "<").replace(/>/g, ">"); + return str + .replace(/&/g, "&") + .replace(/</g, "<") + .replace(/>/g, ">") + .replace(/"/g, """) + .replace(/'/g, "'"); }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@site/logsdb-engine/js/app.js` around lines 62 - 64, The escapeHtml function currently only replaces &, <, and >; extend it to also replace double-quote (") and single-quote (') characters so escaped strings are safe when inserted into HTML attributes as well as text nodes. Update the escapeHtml implementation (function name: escapeHtml) to include replacements for " and ' (and optionally `) and ensure any call sites that may place values into attributes continue to use escapeHtml. Keep the escaping order so & is replaced first to avoid double-escaping.

coderabbitai · 2026-04-28T05:58:31Z

+  return {
+    records: result.records,
+    stats: {
+      ...result.stats,
+      totalTimeMs: elapsed.toFixed(1),
+      recordsPerMs:
+        result.stats.recordsEmitted > 0 ? (result.stats.recordsEmitted / elapsed).toFixed(1) : "0",
+    },
+    spec,
+  };


🧹 Nitpick | 🔵 Trivial

totalTimeMs is a string, may surprise consumers.

elapsed.toFixed(1) returns a string, so stats.totalTimeMs is "12.3" not 12.3. The UI in app.js:467 interpolates it directly (fine), but any numeric comparison would fail silently.

📝 Consider keeping numeric value alongside formatted string

stats: { ...result.stats, - totalTimeMs: elapsed.toFixed(1), + totalTimeMs: elapsed, + totalTimeMsFormatted: elapsed.toFixed(1), recordsPerMs:

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

return {

records: result.records,

stats: {

...result.stats,

totalTimeMs: elapsed.toFixed(1),

recordsPerMs:

result.stats.recordsEmitted > 0 ? (result.stats.recordsEmitted / elapsed).toFixed(1) : "0",

},

spec,

};

return {

records: result.records,

stats: {

...result.stats,

totalTimeMs: elapsed,

totalTimeMsFormatted: elapsed.toFixed(1),

recordsPerMs:

result.stats.recordsEmitted > 0 ? (result.stats.recordsEmitted / elapsed).toFixed(1) : "0",

},

spec,

};

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@site/logsdb-engine/js/query-model.js` around lines 85 - 94, The code sets stats.totalTimeMs and stats.recordsPerMs to strings via toFixed, which can surprise consumers; update the return from the function so totalTimeMs is a numeric value (e.g., totalTimeMs: Number(elapsed.toFixed(1)) or Math.round/appropriate numeric cast) and preserve a formatted string alongside it (e.g., totalTimeMsFormatted or totalTimeMsStr), and do the same for recordsPerMs (keep a numeric recordsPerMs and a formatted recordsPerMsStr) using the existing elapsed and result.stats.recordsEmitted values to compute both forms.

coderabbitai · 2026-04-28T05:58:31Z

+export function createStore(opts = {}) {
+  const { rowsPerChunk = 1024 } = opts;
+
+  const store = new LogStore({
+    rowsPerChunk,
+    policyFactory: () =>
+      new TypedColumnarDrainPolicy({
+        bodyCodec: "raw",
+      }),
+  });
+
+  return store;
+}


⚠️ Potential issue | 🟡 Minor

Demo uses bodyCodec: "raw" — compression disabled.

Unlike the benchmarks which use "zstd-19", the demo store uses "raw". This is fine for a browser demo (avoids WASM ZSTD overhead), but the compression metrics shown will be misleading since no compression occurs.

📝 Consider adding a comment or UI indicator

const store = new LogStore({ rowsPerChunk, policyFactory: () => new TypedColumnarDrainPolicy({ - bodyCodec: "raw", + bodyCodec: "raw", // No compression for browser demo performance }), });

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

export function createStore(opts = {}) {

const { rowsPerChunk = 1024 } = opts;

const store = new LogStore({

rowsPerChunk,

policyFactory: () =>

new TypedColumnarDrainPolicy({

bodyCodec: "raw",

}),

});

return store;

}

export function createStore(opts = {}) {

const { rowsPerChunk = 1024 } = opts;

const store = new LogStore({

rowsPerChunk,

policyFactory: () =>

new TypedColumnarDrainPolicy({

bodyCodec: "raw", // No compression for browser demo performance

}),

});

return store;

}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@site/logsdb-engine/js/storage-model.js` around lines 27 - 39, The demo store in createStore currently sets TypedColumnarDrainPolicy with bodyCodec: "raw" (in LogStore creation), which disables compression and can mislead compression metrics; update the code near createStore/LogStore/TypedColumnarDrainPolicy to add a clear comment (or a small UI/debug flag) stating that bodyCodec is intentionally "raw" for the demo to avoid WASM ZSTD overhead and therefore compression metrics do not apply, or expose an option to switch to "zstd-19" for benchmarking so readers know why compression differs.

coderabbitai · 2026-04-28T05:58:31Z

+  it("has deterministic output (same seed)", () => {
+    const a = generateLogs({ count: 50, durationMinutes: 5, seed: 123 });
+    const b = generateLogs({ count: 50, durationMinutes: 5, seed: 123 });
+    expect(a.records[0].body).toEqual(b.records[0].body);
+    expect(a.records[49].severityText).toEqual(b.records[49].severityText);
+  });


⚠️ Potential issue | 🟠 Major

This determinism test misses the unstable field.

It only compares body and severityText, so timestamp drift from the current Date.now()-based baseTime still passes. Include timeUnixNano in the assertion set, or compare normalized full records for the same seed.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@site/logsdb-engine/test/data-gen.test.js` around lines 12 - 17, The determinism test for generateLogs is incomplete because it only asserts body and severityText; update the test to also assert the time field to catch timestamp drift: when calling generateLogs({ count: 50, durationMinutes: 5, seed: 123 }) for a and b, include expect(a.records[0].timeUnixNano).toEqual(b.records[0].timeUnixNano) and expect(a.records[49].timeUnixNano).toEqual(b.records[49].timeUnixNano) (or alternatively replace the selective assertions with a deep equality check of normalized full records from generateLogs to ensure complete deterministic output).

coderabbitai · 2026-04-28T05:58:31Z

+  it("streaming generator yields complete coverage", () => {
+    const batches = [];
+    for (const { batch } of generateLogBatches({
+      count: 500,
+      durationMinutes: 5,
+      batchSize: 100,
+    })) {
+      batches.push(batch);
+    }
+    const totalRecords = batches.reduce((sum, b) => sum + b.length, 0);
+    expect(totalRecords).toBe(500);
+    expect(batches.length).toBe(5);
+  });


⚠️ Potential issue | 🟡 Minor

Assert the terminal progress event.

This checks coverage and batch count, but not that the last yielded progress is 1. The exact-batch-boundary bug in generateLogBatches() can regress here without failing the suite.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@site/logsdb-engine/test/data-gen.test.js` around lines 91 - 103, The test currently iterates over generateLogBatches only collecting batches; record and assert the final progress value instead of ignoring it. Modify the test that calls generateLogBatches to capture each yielded item's progress (e.g., track lastProgress from each iteration of generateLogBatches) and add an assertion that lastProgress === 1 after the loop; reference the generateLogBatches generator and the yielded progress property to locate where to add the tracking and the new expect(lastProgress).toBe(1) assertion.

coderabbitai · 2026-04-28T05:58:31Z

+    it("computeServiceDistribution groups by service", () => {
+      const records = [
+        { attributes: [{ key: "service.name", value: "a" }] },
+        { attributes: [{ key: "service.name", value: "a" }] },
+        { attributes: [{ key: "service.name", value: "b" }] },
+      ];
+      const dist = computeServiceDistribution(records);
+      expect(dist.a).toBe(2);
+      expect(dist.b).toBe(1);
+    });
+  });


🧹 Nitpick | 🔵 Trivial

Consider testing the "unknown" service fallback.

computeServiceDistribution defaults missing service names to "unknown", but the test only covers records with valid service.name attributes.

💚 Add test for missing service attribute

it("computeServiceDistribution groups by service", () => { const records = [ { attributes: [{ key: "service.name", value: "a" }] }, { attributes: [{ key: "service.name", value: "a" }] }, { attributes: [{ key: "service.name", value: "b" }] }, + { attributes: [] }, // missing service.name + { attributes: undefined }, ]; const dist = computeServiceDistribution(records); expect(dist.a).toBe(2); expect(dist.b).toBe(1); + expect(dist.unknown).toBe(2); });

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

it("computeServiceDistribution groups by service", () => {

const records = [

{ attributes: [{ key: "service.name", value: "a" }] },

{ attributes: [{ key: "service.name", value: "a" }] },

{ attributes: [{ key: "service.name", value: "b" }] },

];

const dist = computeServiceDistribution(records);

expect(dist.a).toBe(2);

expect(dist.b).toBe(1);

});

});

it("computeServiceDistribution groups by service", () => {

const records = [

{ attributes: [{ key: "service.name", value: "a" }] },

{ attributes: [{ key: "service.name", value: "a" }] },

{ attributes: [{ key: "service.name", value: "b" }] },

{ attributes: [] }, // missing service.name

{ attributes: undefined },

];

const dist = computeServiceDistribution(records);

expect(dist.a).toBe(2);

expect(dist.b).toBe(1);

expect(dist.unknown).toBe(2);

});

});

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@site/logsdb-engine/test/query-model.test.js` around lines 165 - 175, Add a test case to verify computeServiceDistribution falls back to "unknown" when service.name is missing: create records that either have no attributes or have attributes that don't include key "service.name" (in addition to existing valid records), call computeServiceDistribution(records) and assert that dist.unknown equals the count of those fallback records while existing assertions for dist.a/dist.b remain valid; reference computeServiceDistribution and the test block that currently asserts dist.a/dist.b to place this new expectation.

coderabbitai

♻️ Duplicate comments (2)

packages/o11ylogsdb/src/query.ts (2)

160-183: ⚠️ Potential issue | 🟠 Major

Avoid decoding matching chunks twice.

Once hasMatch flips true, the chunk has already gone through readBodiesOnly() and is then fully decoded again via readRecords(). On low-selectivity bodyContains queries this makes the “fast path” slower than the baseline path. Reuse a shared decompressed payload, or have the body-only path populate the same decode cache that readRecords() consumes.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/o11ylogsdb/src/query.ts` around lines 160 - 183, The fast-path
decodes bodies with readBodiesOnly() and then decodes the chunk again with
readRecords() once hasMatch is true; change the flow so the decompressed payload
or per-index decoded body data produced by readBodiesOnly() is stored in a
shared cache/structure that readRecords() can consume (or make readBodiesOnly()
return both bodies and the decompressed blob) to avoid a second full decode;
update the logic around bodies, hasMatch, readBodiesOnly, and readRecords to
pass/reuse that cached payload and adjust stats updates accordingly so chunks
are only decoded once.

125-128: ⚠️ Potential issue | 🔴 Critical

Guard the body-only fast path by policy, not just by QuerySpec.

This still lets Line 149 prune a chunk from readBodiesOnly() even when the active policy only normalizes body strings in postDecode(). In that case hasMatch is computed on a different representation than recordMatches(), so valid matches can be dropped.

Minimal safe guard

-  const useBodyFastPath = spec.bodyContains !== undefined && !spec.bodyLeafEquals;
-
   for (const id of store.streams.ids()) {
@@
     const chunks = store.streams.chunksOf(id);
     const policy = store.policyFor(id);
+    const canUseBodyFastPath =
+      spec.bodyContains !== undefined &&
+      !spec.bodyLeafEquals &&
+      policy?.postDecode === undefined;
     for (const chunk of chunks) {
@@
-      if (useBodyFastPath) {
+      if (canUseBodyFastPath) {

Also applies to: 149-179

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/o11ylogsdb/src/query.ts` around lines 125 - 128, The body-only fast
path (useBodyFastPath) is being enabled solely from QuerySpec, which allows
readBodiesOnly() to prune chunks incorrectly when postDecode() normalizes body
strings; change the guard so the fast-path is enabled only when both spec
conditions (spec.bodyContains && !spec.bodyLeafEquals) AND the active policy
indicates it is safe to evaluate bodyContains on the pre-decode representation
(i.e., postDecode will not alter body text). Update the useBodyFastPath
calculation and adjust readBodiesOnly()/hasMatch logic to consult that policy
flag (or a function like policy.allowsBodyFastPath /
policy.postDecodeNoopForBodies) before using pre-decode matching, and ensure
recordMatches/postDecode remain authoritative when the policy disallows the fast
path.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@packages/o11ylogsdb/src/query.ts`:
- Around line 160-183: The fast-path decodes bodies with readBodiesOnly() and
then decodes the chunk again with readRecords() once hasMatch is true; change
the flow so the decompressed payload or per-index decoded body data produced by
readBodiesOnly() is stored in a shared cache/structure that readRecords() can
consume (or make readBodiesOnly() return both bodies and the decompressed blob)
to avoid a second full decode; update the logic around bodies, hasMatch,
readBodiesOnly, and readRecords to pass/reuse that cached payload and adjust
stats updates accordingly so chunks are only decoded once.
- Around line 125-128: The body-only fast path (useBodyFastPath) is being
enabled solely from QuerySpec, which allows readBodiesOnly() to prune chunks
incorrectly when postDecode() normalizes body strings; change the guard so the
fast-path is enabled only when both spec conditions (spec.bodyContains &&
!spec.bodyLeafEquals) AND the active policy indicates it is safe to evaluate
bodyContains on the pre-decode representation (i.e., postDecode will not alter
body text). Update the useBodyFastPath calculation and adjust
readBodiesOnly()/hasMatch logic to consult that policy flag (or a function like
policy.allowsBodyFastPath / policy.postDecodeNoopForBodies) before using
pre-decode matching, and ensure recordMatches/postDecode remain authoritative
when the policy disallows the fast path.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 2f792218-8066-44ec-aff8-e87b212a8744

📥 Commits

Reviewing files that changed from the base of the PR and between dd4d1b0 and 04c5198.

📒 Files selected for processing (1)

packages/o11ylogsdb/src/query.ts

…ranges Use materializeRangeOwned instead of materializeRange to ensure we get owned copies of arrays rather than borrowed references to shared scratch buffers that get invalidated on subsequent decodeView calls.

strawgate and others added 2 commits April 27, 2026 23:41

dosubot Bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Apr 28, 2026

Revert "perf(logsdb): add WeakMap decode cache — 50-100× repeated que…

c3f2ce2

…ry speedup" This reverts commit 92611a4.

dosubot Bot added area/site GitHub Pages sites enhancement New feature or request javascript Pull requests that update javascript code performance Performance optimization labels Apr 28, 2026

gitar-bot Bot reviewed Apr 28, 2026

View reviewed changes

coderabbitai Bot requested changes Apr 28, 2026

View reviewed changes

strawgate changed the title ~~feat(logsdb): comprehensive benchmarks, decode cache, and interactive demo~~ feat(logsdb): comprehensive benchmarks, engine optimizations, and interactive demo Apr 28, 2026

coderabbitai Bot requested changes Apr 28, 2026

View reviewed changes

strawgate added 4 commits April 28, 2026 01:08

chore: retrigger CI

ad2e098

chore: retrigger CI

d30b011

fix(o11ylogsdb): fix needle undefined type error in query.ts

f928970

fix(o11ylogsdb): fix biome formatting on query.ts

04c5198

coderabbitai Bot reviewed Apr 28, 2026

View reviewed changes

strawgate merged commit 4899ce0 into main Apr 28, 2026
5 checks passed

coderabbitai Bot mentioned this pull request Apr 28, 2026

perf(logsdb): query optimization, shared code extraction, CI improvements #219

Merged

-  // Warmup pass
-  process.stderr.write("  Warmup…\n");
-  CORPUS_GENERATORS.syslog(1000);
+  // Warmup pass
+  process.stderr.write("  Warmup…\n");
+  const warmupRecords = CORPUS_GENERATORS.syslog(1000);
+  const warmupStore = new LogStore({
+    registry: defaultRegistry().register(new ZstdCodec(19)),
+    policy: new TypedColumnarDrainPolicy({ bodyCodec: "zstd-19" }),
+    rowsPerChunk: 1024,
+  });
+  for (const r of warmupRecords) {
+    warmupStore.append(buildResource("syslog"), SCOPE, r);
+  }
+  warmupStore.flush();

Conversation

strawgate commented Apr 28, 2026

Summary

1. Comprehensive Benchmark Suite

2. WeakMap Decode Cache (50-100× speedup)

3. Interactive LogsDB Demo (site/logsdb-engine/)

Storage Efficiency Results (10K records)

Key Findings

Uh oh!

coderabbitai Bot commented Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Possibly related PRs

Uh oh!

gitar-bot Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

gitar-bot Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

gitar-bot Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

gitar-bot Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

gitar-bot Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

gitar-bot Bot commented Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 28, 2026

Choose a reason for hiding this comment

3. Interactive LogsDB Demo (`site/logsdb-engine/`)

coderabbitai Bot commented Apr 28, 2026 •

edited

Loading

gitar-bot Bot commented Apr 28, 2026 •

edited

Loading