fix(etl): use raw SQL for completion write to bypass neon-http enum issue #2395

Merged
andrew-bierman merged 10 commits into main from fix/etl-completion-and-retry
May 12, 2026

Conversation

@andrew-bierman andrew-bierman commented May 8, 2026

Summary

  • Replace the Drizzle ORM .set({ status: 'completed' }) call with raw SQL (UPDATE etl_jobs SET status = 'completed'::etl_job_status) to bypass a neon-http driver enum serialization issue. The ORM call threw inside the try block, so the catch block ran and incorrectly marked jobs as failed.
  • Wrap the completion write in an isolated try-catch so a transient DB hiccup doesn't cascade to status='failed'. The error is logged but non-fatal; the stuck-job sweep will reset the job if needed.

Root Cause

Jobs that processed all rows successfully were ending up status='failed' because:

  1. db.update(etlJobs).set({ status: 'completed', completedAt: new Date() }) threw via neon-http driver (enum serialization bug)
  2. The outer catch block caught it and set status='failed'

The identical .set({ status: 'failed' }) in the catch block works fine — the failure is specific to the 'completed' enum value through the Drizzle ORM neon-http code path.

Fix

Switched to raw SQL (same pattern already used in updateEtlJobProgress.ts):

UPDATE etl_jobs SET status = 'completed'::etl_job_status, completed_at = NOW() WHERE id = $jobId
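
The control flow of the fix can be sketched roughly as follows. This is a minimal illustration, not the code in processCatalogEtl.ts: the JobDb interface, runEtlJob, and the logError parameter are hypothetical stand-ins for the Drizzle neon-http client and the worker's real logging, and the real implementation uses Drizzle's sql template rather than a plain query string.

```typescript
// Hypothetical minimal DB surface; the real code goes through Drizzle's neon-http client.
interface JobDb {
  execute(query: string, params: unknown[]): Promise<void>;
}

type Logger = (message: string, error: unknown) => void;

async function runEtlJob(
  db: JobDb,
  jobId: string,
  processRows: () => Promise<void>,
  logError: Logger = console.error,
): Promise<void> {
  try {
    await processRows();

    // Isolated completion write: a failure here is logged and swallowed,
    // so it never reaches the outer catch that marks the job as failed.
    try {
      await db.execute(
        "UPDATE etl_jobs SET status = 'completed'::etl_job_status, completed_at = NOW() WHERE id = $1",
        [jobId],
      );
    } catch (completionErr) {
      logError(`[ETL] Failed to mark job ${jobId} completed`, completionErr);
    }
  } catch (err) {
    // Only genuine processing errors land here.
    logError(`[ETL] Job ${jobId} failed`, err);
    await db.execute(
      "UPDATE etl_jobs SET status = 'failed'::etl_job_status WHERE id = $1",
      [jobId],
    );
  }
}
```

With this shape, a job whose rows were all processed stays in the running state if the completion write fails, and is left for the Reset Stuck sweep instead of being reported as failed.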

Post-Deploy Steps

  1. Run Reset Stuck in admin UI to mark any lingering running jobs as failed
  2. Use Retry button on null-count failed jobs to replay from R2

Post-Deploy Monitoring & Validation

  • Logs: Search Cloudflare Workers logs for [ETL] Failed to mark job — should stop appearing
  • Dashboard: Admin /analytics/catalog/etl — new jobs should show status: completed not failed
  • Validation: After next scraper run or manual retry, confirm completedAt is set and status = 'completed'
  • Failure signal: If jobs still end up failed with non-null totalProcessed, the raw SQL path is also failing — check neon connectivity
  • Window: 24h post-deploy
  • Owner: Andrew
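
The validation and failure-signal checks above could be expressed as simple predicates over job rows. This is only a sketch: the EtlJobRow shape and both helper names are hypothetical, and the status union lists just the values mentioned in this PR (the real etl_job_status enum may contain others).

```typescript
// Hypothetical row shape; field names mirror the ones named in the checklist.
interface EtlJobRow {
  id: string;
  status: "running" | "completed" | "failed";
  totalProcessed: number | null;
  completedAt: Date | null;
}

// Failure signal: the job processed rows but still ended up failed,
// which would suggest the raw SQL completion path is failing too.
function findSuspectJobs(jobs: EtlJobRow[]): EtlJobRow[] {
  return jobs.filter(
    (job) => job.status === "failed" && job.totalProcessed !== null,
  );
}

// Validation: a healthy finished job is completed and has completedAt set.
function isCleanlyCompleted(job: EtlJobRow): boolean {
  return job.status === "completed" && job.completedAt !== null;
}
```

Running findSuspectJobs over the jobs created during the 24h window should return an empty list if the fix is working.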

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>

Summary by CodeRabbit

  • New Features

    • Catalog items can now be created/stored without weight information.
  • Improvements

    • Data import processes no longer cascade failures when marking jobs complete.
    • Pack item loading now defaults missing weight to 0 g so items without weight behave predictably.
    • Imports tolerate missing weight values without failing.
  • Tests

    • Added comprehensive tests for data import and batch processing workflows.

…ssue

The Drizzle ORM .set({ status: 'completed' }) with the neon-http driver
appears to fail silently (triggering the catch block which then sets
status='failed'), even though the identical pattern with 'failed' works.

Switch to a raw sql`UPDATE ... SET status = 'completed'::etl_job_status`
to match the pattern already used in updateEtlJobProgress, bypassing any
Drizzle/neon-http enum serialization difference.

Also isolate the completion write in its own try-catch so a transient
failure here logs the error and leaves the job 'running' (for Reset Stuck)
rather than cascading to 'failed'.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 8, 2026 05:21

coderabbitai Bot commented May 8, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 11bbc971-681e-49cb-b60c-4c6c1f8c90be

📥 Commits

Reviewing files that changed from the base of the PR and between f694d96 and 5676ae5.

⛔ Files ignored due to path filters (1)
  • bun.lock is excluded by !**/*.lock, !bun.lock
📒 Files selected for processing (20)
  • apps/admin/lib/api.ts
  • apps/admin/lib/queryKeys.ts
  • apps/expo/features/auth/hooks/useAuthActions.ts
  • apps/expo/features/auth/hooks/useAuthInit.ts
  • apps/expo/features/catalog/components/ItemReviews.tsx
  • apps/trails/components/AuthGate.tsx
  • apps/trails/lib/auth-client.ts
  • apps/trails/lib/auth.ts
  • apps/trails/lib/useAuth.tsx
  • apps/web/app/auth/page.tsx
  • apps/web/lib/data.ts
  • apps/web/lib/types.ts
  • package.json
  • packages/api/drizzle/0047_cute_bloodscream.sql
  • packages/api/drizzle/meta/0047_cute_bloodscream.json
  • packages/api/drizzle/meta/_journal.json
  • packages/api/src/db/schema.ts
  • packages/api/src/routes/admin/index.ts
  • packages/api/src/schemas/users.ts
  • packages/api/src/services/packService.ts

Walkthrough

This PR makes catalog item weight fields nullable in DB and Drizzle schema, adds pack-service defaults, isolates ETL job completion persistence, updates the Drizzle snapshot/journal, and adds tests verifying ETL handles items without weight and embedding failures.

Changes

Support Nullable Catalog Weights

| Layer | File(s) | Summary |
| --- | --- | --- |
| Database Migrations | packages/api/drizzle/0037_*.sql | catalog_items.weight and catalog_items.weight_unit drop NOT NULL, permitting NULL values. |
| TypeScript Schema Definition | packages/api/src/db/schema.ts | catalogItems removes .notNull() on weight and weightUnit, making them nullable in types. |
| PackService Mapping | packages/api/src/services/packService.ts | PackService.getItems defaults missing weight to 0 and weightUnit to 'g'. |
| Job Completion Persistence | packages/api/src/services/etl/processCatalogEtl.ts | ETL marks jobs completed via raw SQL UPDATE ... completed_at = NOW() inside an isolated try/catch; success logging unchanged. |
| Schema Snapshot & Journal | packages/api/drizzle/meta/0037_snapshot.json, packages/api/drizzle/meta/_journal.json | Drizzle snapshot records full schema state; journal appends migration entry 0037_rich_electro (version 7). |
| ETL Test Coverage | packages/api/test/etl.test.ts | New Vitest tests cover happy/failure ETL flows, invalid-row logging, missing R2 objects, batch boundaries, embedding errors, and items without weight. |
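
The PackService mapping row above amounts to a null-coalescing default. A minimal sketch, assuming simplified row and item shapes (CatalogItemRow, PackItem, and toPackItem are illustrative names, not the actual types in packService.ts):

```typescript
// Hypothetical shapes: catalog rows may now carry NULL weight columns,
// while pack items still require non-null values.
interface CatalogItemRow {
  name: string;
  weight: number | null;
  weightUnit: string | null;
}

interface PackItem {
  name: string;
  weight: number;
  weightUnit: string;
}

function toPackItem(row: CatalogItemRow): PackItem {
  return {
    name: row.name,
    // ?? only replaces null/undefined, so a legitimate weight of 0 is kept as-is.
    weight: row.weight ?? 0,
    weightUnit: row.weightUnit ?? "g",
  };
}
```

Using ?? rather than || matters here: an item with a real unit and a recorded weight of 0 keeps both values untouched.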

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

🐰 Weights may vanish, columns flex—
nullable now, no more vexed,
ETL holds fast through the night,
catching errors, keeping things right!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the primary change: switching from Drizzle ORM to raw SQL for ETL job completion to work around a neon-http enum serialization issue. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@github-actions github-actions Bot added the api label May 8, 2026

github-actions Bot commented May 8, 2026

Coverage Report for API Unit Tests Coverage (./packages/api)

| Status | Category | Percentage | Covered / Total |
| --- | --- | --- | --- |
| 🔵 | Lines | 76.17% | 502 / 659 |
| 🔵 | Statements | 76.17% (🎯 65%) | 502 / 659 |
| 🔵 | Functions | 95% | 38 / 40 |
| 🔵 | Branches | 88.67% | 227 / 256 |

File Coverage: No changed files found.
Generated in workflow #1150 for commit 5676ae5 by the Vitest Coverage Report Action


github-actions Bot commented May 8, 2026

Coverage Report for Expo Unit Tests Coverage (./apps/expo)

| Status | Category | Percentage | Covered / Total |
| --- | --- | --- | --- |
| 🔵 | Lines | 82.61% | 480 / 581 |
| 🔵 | Statements | 82.61% (🎯 75%) | 480 / 581 |
| 🔵 | Functions | 92.59% | 50 / 54 |
| 🔵 | Branches | 90.9% | 170 / 187 |

File Coverage: No changed files found.
Generated in workflow #1150 for commit 5676ae5 by the Vitest Coverage Report Action

Copilot AI left a comment

Pull request overview

Updates the catalog ETL worker so that marking a job as completed no longer fails due to a neon-http + Drizzle enum serialization issue, preventing successfully-processed jobs from being incorrectly marked failed.

Changes:

  • Replace Drizzle .set({ status: 'completed' }) completion update with a raw SQL UPDATE ... 'completed'::etl_job_status.
  • Isolate the completion-status write in its own try/catch so errors there don’t trigger the outer failure handler.

);
} catch (completionErr) {
console.error(
`[ETL] Failed to mark job ${jobId} completed — will be reset by stuck-job sweep:`,
The weight NOT NULL constraint on catalog_items was causing ETL job failures
for any item missing weight data (common for clothing/footwear brands). The
CatalogItemValidator explicitly marks weight as optional, but the DB would
reject the INSERT, causing processValidItemsBatch's fallback to also fail,
which propagated to the outer catch and set status='failed'.

Migration 0037 drops NOT NULL from weight and weight_unit on catalog_items.
Adds full ETL integration test suite confirming: happy path completes,
no-weight items don't fail, invalid-only runs still complete, exact/multi-batch
row counts work, and the embedding fallback doesn't throw to the outer caller.
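
The exact-batch and multi-batch cases mentioned above come down to how rows are split into batches. The following is an illustrative sketch only: toBatches is a hypothetical helper, and the real batch size and batching code in the ETL worker are not shown in this PR.

```typescript
// Split rows into consecutive batches of at most batchSize items.
// A row count that is an exact multiple of batchSize must not yield
// a trailing empty batch; that is the boundary case worth testing.
function toBatches<T>(rows: readonly T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < rows.length; start += batchSize) {
    batches.push(rows.slice(start, start + batchSize));
  }
  return batches;
}
```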

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Replaces the handwritten 0037_nullable_catalog_weight.sql with the
drizzle-kit generated equivalent — same ALTER TABLE statements but
now tracked in the drizzle journal with the proper snapshot.

Social feed table CREATE statements were stripped from the generated
output because 0033_social_feed_tables.sql was applied manually and
not tracked in the journal, causing drizzle-kit to emit duplicate DDL.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/api/drizzle/0037_nullable_catalog_weight.sql`:
- Around line 1-5: Delete this duplicate migration file that applies ALTER TABLE
"catalog_items" ALTER COLUMN "weight" DROP NOT NULL and ALTER COLUMN
"weight_unit" DROP NOT NULL and keep the canonical 0037_rich_electro migration
tracked by the journal; remove the duplicate 0037 nullable catalog migration
from the repository, ensure the migration journal still references only
0037_rich_electro, and run the migration tests (or a dry-run) to confirm no
drift remains.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b6a9960c-66ed-4434-a61b-934b60aad1fc

📥 Commits

Reviewing files that changed from the base of the PR and between 103e5d8 and 60d5be8.

📒 Files selected for processing (7)
  • packages/api/drizzle/0037_nullable_catalog_weight.sql
  • packages/api/drizzle/0037_rich_electro.sql
  • packages/api/drizzle/meta/0037_snapshot.json
  • packages/api/drizzle/meta/_journal.json
  • packages/api/src/db/schema.ts
  • packages/api/src/services/etl/processCatalogEtl.ts
  • packages/api/test/etl.test.ts

Comment on lines +1 to +5
-- catalog_items.weight and weight_unit: drop NOT NULL to allow items without weight data.
-- The validator intentionally skips weight (clothing/footwear often omit it), but the
-- NOT NULL constraint was causing upserts to throw, which cascaded to ETL job failures.
ALTER TABLE "catalog_items" ALTER COLUMN "weight" DROP NOT NULL;--> statement-breakpoint
ALTER TABLE "catalog_items" ALTER COLUMN "weight_unit" DROP NOT NULL;

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove the duplicate 0037 migration file to avoid drift/confusion.

This file duplicates the 0037_rich_electro DDL while the journal tracks 0037_rich_electro. Keeping both increases the risk of future divergence during migration maintenance.

🧹 Suggested cleanup
--- a/packages/api/drizzle/0037_nullable_catalog_weight.sql
+++ /dev/null
@@
--- catalog_items.weight and weight_unit: drop NOT NULL to allow items without weight data.
--- The validator intentionally skips weight (clothing/footwear often omit it), but the
--- NOT NULL constraint was causing upserts to throw, which cascaded to ETL job failures.
-ALTER TABLE "catalog_items" ALTER COLUMN "weight" DROP NOT NULL;--> statement-breakpoint
-ALTER TABLE "catalog_items" ALTER COLUMN "weight_unit" DROP NOT NULL;

…from catalog

Making catalog_items.weight/weight_unit nullable caused a TypeScript error
in packService.ts — pack items require non-null weight. Fall back to 0/'g'
so the AI-generated pack flow still compiles; user can edit after generation.
… 0037 conflict

Main added 0037_trips_trail_osm_id after this branch was cut. Kept main's
journal/snapshot; will regenerate weight-nullable migration at correct number.
…ight + regenerate weight migration

CORS: admin scoped cors was silently dropping Access-Control-Allow-Origin on
preflight (two stacked cors plugins conflicted — root sets credentials:false/*,
admin sets credentials:true/specific-origin, header got dropped). Switch to
origin function so Elysia reflects the exact origin back; bypass auth guard for
OPTIONS preflights. Fixes admin app CORS errors from https://admin.packratai.com.

Migration: regenerate weight/weight_unit nullable migration as 0047 — main
merged 0037–0046 after this branch was cut.
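
The CORS change described above (reflect the exact origin, skip the auth guard for preflights) can be sketched independently of the framework. This is a hedged illustration: corsHeadersFor, shouldBypassAuth, and the ALLOWED_ORIGINS set are hypothetical names, and the actual fix is implemented through the Elysia cors plugin's origin function rather than hand-built headers.

```typescript
// Hypothetical allow-list; only the admin origin is named in the commit message.
const ALLOWED_ORIGINS = new Set<string>(["https://admin.packratai.com"]);

// Browsers reject a wildcard Access-Control-Allow-Origin on credentialed
// requests, so the exact request origin is echoed back when it is allowed.
function corsHeadersFor(requestOrigin: string | null): Record<string, string> {
  if (requestOrigin === null || !ALLOWED_ORIGINS.has(requestOrigin)) {
    return {};
  }
  return {
    "Access-Control-Allow-Origin": requestOrigin,
    "Access-Control-Allow-Credentials": "true",
    // The response varies by Origin, so caches must key on it.
    Vary: "Origin",
  };
}

// Preflight requests carry no credentials, so the auth guard must let them through.
function shouldBypassAuth(method: string): boolean {
  return method.toUpperCase() === "OPTIONS";
}
```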

cloudflare-workers-and-pages Bot commented May 12, 2026

Deploying packrat-guides with Cloudflare Pages

Latest commit: 6f84a52
Status: ✅  Deploy successful!
Preview URL: https://94c64abd.packrat-guides-6gq.pages.dev
Branch Preview URL: https://fix-etl-completion-and-retry.packrat-guides-6gq.pages.dev

View logs


cloudflare-workers-and-pages Bot commented May 12, 2026

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ Deployment successful! (View logs) | packrat-admin | fdbc495 | Commit Preview URL, Branch Preview URL | May 12 2026, 03:02 PM |


Deploying packrat-landing with Cloudflare Pages

Latest commit: 6f84a52
Status: ✅  Deploy successful!
Preview URL: https://d1895024.packrat-landing.pages.dev
Branch Preview URL: https://fix-etl-completion-and-retry.packrat-landing.pages.dev

View logs

andrew-bierman and others added 3 commits May 12, 2026 08:57
- admin api.ts: double-cast Eden Treaty responses to PaginatedResponse<T>
  (treaty infers wide union types that don't overlap with the interface)
- packages/app user queries: remove deleted auth hooks (useLoginMutation,
  useRegisterMutation) that called Better Auth routes not in Elysia; redirect
  useCurrentUser to client.user.profile.get()
- apps/web auth page: implement login/register locally via Better Auth REST
  endpoints instead of importing from @packrat/app
- apps/trails useAuth: replace (apiClient as any).auth.* calls with typed
  trailsAuthClient (createAuthClient from better-auth/react); add auth-client.ts
- apps/trails UserInfoSchema: id is UUID string not number (Better Auth)
- apps/web types: Post.userId and PostAuthor.id are UUID strings not numbers
- apps/web data.ts: update mock post IDs to match string type

Co-Authored-By: Claude <noreply@anthropic.com>
- admin api.ts: add safe-cast annotations on Eden Treaty PaginatedResponse
  double casts; res.json() returns Promise<any> so no explicit cast needed
- web auth page: drop superfluous `as Promise<unknown>` on res.json()

Co-Authored-By: Claude <noreply@anthropic.com>
- UserSchema.id: z.number() → z.string() (Better Auth uses UUID strings; aligns with Drizzle schema's text('id') PK)
- mapToUser/applySessionUser: add missing emailVerified/createdAt/updatedAt fields; replace `as` casts with asString/asBoolean guards from @packrat/guards
- useAuthActions signInWithGoogle: replace dead apiClient/setToken/UserSchema references with authClient.signIn.social
- ItemReviews: guard nullable review.title/text/date per CatalogItemSchema
Fixes 3 TypeScript errors in nativewindui source:
- Icon/types: SymbolViewPropsWithStringName satisfies IconMapper constraint
- AdaptiveSearchHeader, LargeTitleHeader: map 'systemDefault' autoCapitalize to 'none'

Also widens expo-router peer dep to >=6.0.23 (was ~6.0.23) so bun does not
install expo-router@6.0.23 into apps/expo/node_modules, which was causing
TypeScript to resolve to the old API and produce 1184+ false positives.

2.0.3-2 is an exact pin; the root override prevents range resolution.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@github-actions github-actions Bot added the dependencies label May 12, 2026
@andrew-bierman andrew-bierman merged commit eee65b8 into main May 12, 2026
10 of 12 checks passed
@andrew-bierman andrew-bierman deleted the fix/etl-completion-and-retry branch May 12, 2026 16:42
Labels

api, database, dependencies, mobile

2 participants