test: expand backend test coverage for stats and history edge cases#14
Conversation
Adds the first 5 tests covering the backend: - test_predict_happy_path: POST /api/predict with a valid PDF returns a labeled prediction and persists it to history. - test_predict_rejects_invalid_uploads: parametrized over a .txt upload and a blank PDF; both must return 4xx. This surfaced a gap where the .txt case crashed fitz.open() with a 500 Internal Server Error. - test_predict_from_text_invariants: classifier gatecheck across three sample texts — label must be in pipeline.classes_, probabilities must cover all classes, sum to 1.0 (abs=1e-6), and fall in [0, 1]. - test_history_pagination_and_filters: seeds 5 rows across 2 labels, exercises page/limit, label filter, search filter, 404 on missing id, and 422 on invalid page/limit. - test_save_and_retrieve_round_trip: verifies the database layer preserves all fields and JSON-encodes probabilities correctly. Also hardens POST /api/predict to catch PyMuPDF errors and return 422 "File is not a readable PDF" instead of letting a 500 bubble. This is the fix the new test drove. Adds httpx to dev deps for fastapi.testclient.TestClient.
Adds the 4 frontend tests, bringing the suite total to 9: - DropZone.test.tsx: accepts valid PDFs, rejects non-PDFs (MIME + extension), rejects files above the 20MB limit. Client-side mirror of the server-side validation test. - PredictionCard.test.tsx: renders filename, label, and confidence percentage for high/medium/low cases; truncates long filenames to 40 chars. - useUpload.test.tsx: status transitions idle -> uploading -> done on success, status=error on rejection, clear() empties the queue. - useHistory.test.tsx: initial fetch uses page=1 + DEFAULT_PAGE_SIZE, setLabel resets page to 1, setPage refetches, API rejection surfaces error and clears loading. Adds Vitest + jsdom + React Testing Library to devDependencies and a `test` npm script. Config lives at frontend/vitest.config.ts with a setup file that wires up jest-dom matchers and RTL cleanup.
Adds tests/test_stats_endpoint.py:
- test_stats_empty: GET /api/stats with no rows returns total=0,
empty label_counts, recent_count_7d=0.
- test_stats_counts_total_and_labels: seeds 3 rows across 2 labels,
verifies total, per-label counts, and that rows inserted with
CURRENT_TIMESTAMP all fall within recent_count_7d.
Extends tests/test_history_endpoint.py:
- test_history_get_by_id_returns_entry: GET /api/history/{id} on a
seeded row returns 200 with probabilities parsed to a dict.
- test_history_handles_malformed_probabilities_json: writes invalid
JSON directly via raw SQL to bypass save_classification's json.dumps,
then asserts both /api/history and /api/history/{id} return 200
with probabilities={} instead of 500.
- test_history_search_escapes_sql_wildcards: confirms % and _ in the
search term match literally rather than as SQL LIKE wildcards,
exercising the escape logic in get_history.
aiosqlite and httpx were declared in pyproject.toml but missing from the lockfile. Running uv sync reconciled the lock (adding both) and pruned unused s390x wheel references for greenlet.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (5)
✅ Files skipped from review due to trivial changes (3)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughWalkthroughAdded PDF extraction error handling in the backend predict endpoint (returns HTTP 422 on unreadable PDFs). Introduced extensive test infrastructure and many test suites for backend and frontend. Frontend testing tooling and config (Vitest, jsdom, Testing Library) and multiple component/hook tests were added. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~35 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (3)
tests/test_stats_endpoint.py (1)
8-17: Consider extracting_seedinto a shared fixture/helper.This helper is duplicated across endpoint test modules; moving it to
tests/conftest.pywould reduce drift and simplify future changes.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_stats_endpoint.py` around lines 8 - 17, Extract the duplicated helper function _seed into a shared pytest fixture in your test suite so other endpoint tests can reuse it; create a fixture (e.g., seed_rows or seed) that encapsulates the current logic (importing save_classification and running the async loop via asyncio.run) and return a callable that accepts rows, then update tests that currently define _seed to use the new fixture instead; ensure the fixture calls src.api.database.save_classification internally and preserves the behaviour of awaiting save_classification for each row.tests/test_predict_endpoint.py (1)
49-51: Assert explicit 422 for invalid uploads.Using a broad 4xx check may hide regressions (e.g., unexpected 400/401/403). If the API contract is unreadable input → 422, assert it directly.
♻️ Proposed fix
- assert 400 <= response.status_code < 500, ( - f"case={case_id} expected 4xx, got {response.status_code}: {response.text}" - ) + assert response.status_code == 422, ( + f"case={case_id} expected 422, got {response.status_code}: {response.text}" + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_predict_endpoint.py` around lines 49 - 51, The test currently asserts a broad 4xx range using response.status_code which can mask regressions; change the assertion to require an explicit 422 Unprocessable Entity (e.g., assert response.status_code == 422) and update the failure message to include case_id and response.text like the original message so failures remain informative (you can use HTTPStatus.UNPROCESSABLE_ENTITY if available for readability). Ensure the assertion replaces the existing range check in tests/test_predict_endpoint.py where response.status_code and case_id are used.frontend/src/components/results/PredictionCard.test.tsx (1)
36-49: Test naming and fixture intent are inconsistent.Line 36 describes a low-confidence case, but the assertion on Line 48 expects
90%. Consider renaming the test (or adjusting the fixture) so the scenario intent is unambiguous.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/components/results/PredictionCard.test.tsx` around lines 36 - 49, The test name "renders low-confidence winner using the max probability" is inconsistent with the fixture which shows a 90% probability; either rename the test to reflect a high-confidence case (update the it(...) description) or adjust the fixture probabilities to a low-confidence example (e.g., ~55% vs 45%) and change the expected percent assertion accordingly; locate the test block using the PredictionCard component in PredictionCard.test.tsx and update the it(...) string or the result.probabilities and the expect(screen.getByText(...)) assertion so the name matches the fixture.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app.py`:
- Around line 34-39: Replace the broad Exception handler in the
extract_text_from_bytes call with a specific handler for PyMuPDF parsing errors:
catch fitz.FileDataError (or the PyMuPDF common parsing exception if present)
instead of Exception, then raise the HTTPException(status_code=422, detail="File
is not a readable PDF") from the caught fitz.FileDataError to preserve the
original traceback; adjust the import/use of fitz accordingly and leave other
exceptions to propagate.
In `@frontend/src/components/upload/DropZone.test.tsx`:
- Around line 18-20: Replace the unsafe document.querySelector and
fireEvent.change usage with userEvent.upload and use accessible queries from the
render result (e.g., getByLabelText or getByRole) to locate the file input
safely; specifically, in DropZone.test.tsx replace the casted
querySelector('input[type=file]') and fireEvent.change calls with a lookup via
the render return (destructure getByLabelText/getByRole) and call
userEvent.upload(inputElement, file) so the test matches real user file
selection semantics and avoids null-safety casts (apply the same change pattern
to the other occurrences around the makeFile/firing lines).
In `@frontend/vitest.config.ts`:
- Around line 3-10: In the ESM vitest config replace the use of __dirname in the
alias resolution (the path.resolve call that sets alias '@') with a
fileURLToPath(new URL('./src', import.meta.url)) pattern: import fileURLToPath
(and URL) from 'url' and compute the srcDir from import.meta.url, then use that
value in the alias instead of path.resolve(__dirname, './src'); apply the same
change in frontend/vite.config.ts for its alias resolution so both configs use
fileURLToPath/import.meta.url rather than __dirname.
In `@tests/conftest.py`:
- Around line 43-60: The client fixture currently returns a TestClient without
closing it; convert the client function into a yield-style pytest fixture that
constructs TestClient(server.app) after running asyncio.run(init_db()) and
monkeypatching server._pipeline, then yield the TestClient to tests and ensure
the TestClient.close() is called in a finally/teardown block; reference the
client fixture name, server._pipeline monkeypatch, the asyncio.run(init_db())
call, and call TestClient.close() in the cleanup.
---
Nitpick comments:
In `@frontend/src/components/results/PredictionCard.test.tsx`:
- Around line 36-49: The test name "renders low-confidence winner using the max
probability" is inconsistent with the fixture which shows a 90% probability;
either rename the test to reflect a high-confidence case (update the it(...)
description) or adjust the fixture probabilities to a low-confidence example
(e.g., ~55% vs 45%) and change the expected percent assertion accordingly;
locate the test block using the PredictionCard component in
PredictionCard.test.tsx and update the it(...) string or the
result.probabilities and the expect(screen.getByText(...)) assertion so the name
matches the fixture.
In `@tests/test_predict_endpoint.py`:
- Around line 49-51: The test currently asserts a broad 4xx range using
response.status_code which can mask regressions; change the assertion to require
an explicit 422 Unprocessable Entity (e.g., assert response.status_code == 422)
and update the failure message to include case_id and response.text like the
original message so failures remain informative (you can use
HTTPStatus.UNPROCESSABLE_ENTITY if available for readability). Ensure the
assertion replaces the existing range check in tests/test_predict_endpoint.py
where response.status_code and case_id are used.
In `@tests/test_stats_endpoint.py`:
- Around line 8-17: Extract the duplicated helper function _seed into a shared
pytest fixture in your test suite so other endpoint tests can reuse it; create a
fixture (e.g., seed_rows or seed) that encapsulates the current logic (importing
save_classification and running the async loop via asyncio.run) and return a
callable that accepts rows, then update tests that currently define _seed to use
the new fixture instead; ensure the fixture calls
src.api.database.save_classification internally and preserves the behaviour of
awaiting save_classification for each row.
🪄 Autofix (Beta)
✅ Autofix completed
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e7a492a8-e40a-44f2-b3d0-6b57a9ea4048
⛔ Files ignored due to path filters (2)
frontend/package-lock.jsonis excluded by!**/package-lock.jsonuv.lockis excluded by!**/*.lock
📒 Files selected for processing (15)
app.pyfrontend/package.jsonfrontend/src/components/results/PredictionCard.test.tsxfrontend/src/components/upload/DropZone.test.tsxfrontend/src/hooks/useHistory.test.tsxfrontend/src/hooks/useUpload.test.tsxfrontend/src/test/setup.tsfrontend/vitest.config.tspyproject.tomltests/conftest.pytests/test_classifier_contract.pytests/test_database.pytests/test_history_endpoint.pytests/test_predict_endpoint.pytests/test_stats_endpoint.py
| import path from 'path' | ||
|
|
||
| export default defineConfig({ | ||
| plugins: [react()], | ||
| resolve: { | ||
| alias: { | ||
| '@': path.resolve(__dirname, './src'), | ||
| }, |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Check frontend module type:"
jq -r '.type' frontend/package.json
echo
echo "Check alias implementation in vitest config:"
rg -n "(__dirname|import.meta.url|fileURLToPath|alias)" frontend/vitest.config.ts -C2Repository: tomtranjr/docqflow
Length of output: 261
🏁 Script executed:
cat -n frontend/vitest.config.tsRepository: tomtranjr/docqflow
Length of output: 553
🏁 Script executed:
head -20 frontend/package.json | jq '.keywords, .type, .main, .exports' 2>/dev/null || cat frontend/package.json | head -20Repository: tomtranjr/docqflow
Length of output: 540
🏁 Script executed:
fd -type f "(vite|vitest)" frontend/ -name "*.ts" -o -name "*.js" | head -20Repository: tomtranjr/docqflow
Length of output: 232
🏁 Script executed:
rg "__dirname" frontend/ --type ts --type js -lRepository: tomtranjr/docqflow
Length of output: 111
🏁 Script executed:
rg "defineProperty|globalThis\.__dirname" frontend/ --type ts --type jsRepository: tomtranjr/docqflow
Length of output: 185
🏁 Script executed:
cat -n frontend/vite.config.tsRepository: tomtranjr/docqflow
Length of output: 582
🏁 Script executed:
rg "import\.meta\.url|fileURLToPath" frontend/ -A 2 -B 2Repository: tomtranjr/docqflow
Length of output: 44
🏁 Script executed:
# Check if Node.js version used supports __dirname in ESM config files
grep -E '"node"|engines' frontend/package.jsonRepository: tomtranjr/docqflow
Length of output: 44
🏁 Script executed:
# Check if there are any error logs or test failure reports
find frontend -name "*.log" -o -name ".test-results" -o -name "test-results*" 2>/dev/null | head -10Repository: tomtranjr/docqflow
Length of output: 44
🏁 Script executed:
# Check Node version requirements
cat frontend/package.json | jq '.engines, .packageManager' 2>/dev/null || echo "No engine specs found"Repository: tomtranjr/docqflow
Length of output: 71
🏁 Script executed:
# Check if there are any GitHub issues or comments about alias/vitest failures
git log --oneline --all | grep -i "alias\|vitest\|__dirname" | head -10Repository: tomtranjr/docqflow
Length of output: 121
🏁 Script executed:
# Verify: does Node.js inject __dirname in ESM config files?
# Create a simple test
node --version
node -e "console.log(typeof __dirname)"Repository: tomtranjr/docqflow
Length of output: 76
🏁 Script executed:
# Check if Vite/Node.js provides __dirname in config files
# Test if the config file actually loads without errors
cd frontend && npm --version && npm test -- --help 2>&1 | head -20Repository: tomtranjr/docqflow
Length of output: 133
🏁 Script executed:
# Search for any Vite documentation or comments about __dirname in config
rg -i "vite.*esm|__dirname.*vite|vite.*config" frontend/ --type mdRepository: tomtranjr/docqflow
Length of output: 44
🏁 Script executed:
# Check if there's a .node-version or .nvmrc specifying Node version
cat .node-version .nvmrc 2>/dev/null || echo "No version file"Repository: tomtranjr/docqflow
Length of output: 77
🏁 Script executed:
# Final check: Look for any CI/CD configurations that test the alias
find . -name "*.yml" -o -name "*.yaml" | xargs grep -l "vitest\|vite.*build" 2>/dev/null | head -5Repository: tomtranjr/docqflow
Length of output: 44
🏁 Script executed:
# Check if there are any actual error reports or issues in git history related to the alias
git log --all --grep="alias\|__dirname" --oneline | head -10Repository: tomtranjr/docqflow
Length of output: 121
🏁 Script executed:
# Verify the exact behavior: Does Vite/Node inject __dirname in config files?
# Check Vite and Node.js versions in package.json
jq '.devDependencies | {vite, vitest, node}' frontend/package.json 2>/dev/null || jq '.devDependencies | to_entries[] | select(.key | test("vite|node"))' frontend/package.jsonRepository: tomtranjr/docqflow
Length of output: 122
🌐 Web query:
Vite config __dirname ESM module Node.js v18+
💡 Result:
In Vite config files using ESM syntax (e.g., vite.config.js with "type": "module" in package.json or .mjs extension) on Node.js v18+, dirname is not natively defined because ESM modules do not support CommonJS globals like dirname. Vite provides built-in support for dirname, filename, and import.meta.url in config files by injecting or replacing them during preprocessing, as noted in older docs (Vite v2) and improved in later versions. However, issues arise when "type": "module" is set, causing Node to treat the file strictly as ESM before Vite's transforms fully apply. The standard, reliable polyfill for Node.js ESM (works on v18+) is: import { dirname } from 'node:path' import { fileURLToPath } from 'node:url' const dirname = dirname(fileURLToPath(import.meta.url)) Example in vite.config.js: import { defineConfig } from 'vite' import { dirname } from 'node:path' import { fileURLToPath } from 'node:url' const dirname = dirname(fileURLToPath(import.meta.url)) export default defineConfig({ build: { rollupOptions: { input: { main: fileURLToPath(new URL('./index.html', import.meta.url)) } } } }) For Node.js v20.11.0+ / v21.2.0+, you can use the native import.meta.dirname directly. Recent Vite versions (e.g., v5.2+) added explicit support for import.meta.dirname/import.meta.filename in config files via polyfill injection. To avoid issues: - Use vite.config.mjs extension (forces ESM). - Or rename to vite.config.cjs for CommonJS (native dirname). - Prefer new URL('./file.html', import.meta.url) over path.resolve(dirname, ...). This polyfill is widely recommended in Vite issues and Node.js docs.
Citations:
- 1: "__dirname is not defined in ES module scope" in vite.config.js with module type in package.json vitejs/vite#6946
- 2: feat: better config
__dirnamesupport vitejs/vite#8442 - 3: https://stackoverflow.com/questions/46745014/alternative-for-dirname-in-node-js-when-using-es6-modules
- 4: Docs for MPA config incorrectly suggest using __dirname vitejs/vite#9189
- 5: https://v2.vite.dev/config
- 6:
import.meta.dirnameCrashes in Node 18 rmp135/sql-ts#143 - 7: esm: add import.meta.dirname and import.meta.filename nodejs/node#48740
- 8: feat(config):
import.meta.filename/dirnamesupport vitejs/vite#15888 - 9: https://github.com/vitejs/vite/blob/8e05b61d/packages/vite/CHANGELOG.md
- 10: https://github.com/vitejs/vite/blob/main/packages/vite/CHANGELOG.md
🌐 Web query:
"import.meta.url" Vite config alias path.resolve ESM
💡 Result:
In Vite configuration files (vite.config.js/ts/mjs), which run in an ESM context, use import.meta.url with Node.js url module utilities to resolve absolute paths for resolve.alias entries. This replaces the CommonJS dirname equivalent. Standard pattern for aliasing @ to ./src: import { defineConfig } from 'vite' import { fileURLToPath, URL } from 'node:url' export default defineConfig({ resolve: { alias: { '@': fileURLToPath(new URL('./src', import.meta.url)) } } }) This resolves ./src relative to the config file's directory to an absolute filesystem path, ensuring cross-platform compatibility. Use array format for regex or custom resolvers: resolve: { alias: [ { find: '@', replacement: fileURLToPath(new URL('./src', import.meta.url)) } ] } For TypeScript path mapping sync, mirror in tsconfig.json: { "compilerOptions": { "paths": { "@/": ["./src/"] } } } Vite officially supports import.meta.url, import.meta.dirname, import.meta.filename, and (recently, as of late 2025) import.meta.resolve in ESM configs for path resolution. Avoid path.resolve(dirname, ...) as dirname is undefined in ESM without polyfill. Plugins like vite-tsconfig-paths can automate syncing tsconfig paths to Vite aliases.
Citations:
- 1: https://openillumi.com/en/en-vite-vue3-alias-config-fix/
- 2: https://vueschool.io/articles/vuejs-tutorials/import-aliases-in-vite/
- 3: https://vite.dev/config/shared-options
- 4: https://futurestud.io/tutorials/vite-create-resolve-aliases-for-imports-like-the-symbol
- 5: https://vueschool.io/articles/vuejs-tutorials/import-aliases-in-vite
__dirname is undefined in ESM config; use fileURLToPath with import.meta.url instead.
Line 9 uses path.resolve(__dirname, './src') but __dirname is not available in ESM modules (per Node.js and Vite semantics). While Vite may provide some polyfill support, the standard and reliable approach is fileURLToPath(new URL('./src', import.meta.url)).
Note: The same issue exists in frontend/vite.config.ts line 10 and should be fixed there as well.
🔧 Proposed fix
import { defineConfig } from 'vitest/config'
import react from '@vitejs/plugin-react'
-import path from 'path'
+import { fileURLToPath, URL } from 'node:url'
export default defineConfig({
plugins: [react()],
resolve: {
alias: {
- '@': path.resolve(__dirname, './src'),
+ '@': fileURLToPath(new URL('./src', import.meta.url)),
},
},🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@frontend/vitest.config.ts` around lines 3 - 10, In the ESM vitest config
replace the use of __dirname in the alias resolution (the path.resolve call that
sets alias '@') with a fileURLToPath(new URL('./src', import.meta.url)) pattern:
import fileURLToPath (and URL) from 'url' and compute the srcDir from
import.meta.url, then use that value in the alias instead of
path.resolve(__dirname, './src'); apply the same change in
frontend/vite.config.ts for its alias resolution so both configs use
fileURLToPath/import.meta.url rather than __dirname.
|
Note Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it. Fixes Applied SuccessfullyFixed 5 file(s) based on 4 unresolved review comments. Files modified:
Commit: The changes have been pushed to the Time taken: |
Fixed 5 file(s) based on 4 unresolved review comments. Co-authored-by: CodeRabbit <noreply@coderabbit.ai>
Summary
JSON fallback, SQL wildcard escaping
Test plan
uv run pytest tests/— 13 passed locallySummary by CodeRabbit
Bug Fixes
Tests
Chores