test(ci): make full test suite reliable across all packages#437
Merged
Conversation
Comment on lines
+63
to
+97
| runs-on: ubuntu-latest | ||
| timeout-minutes: 15 | ||
| steps: | ||
| - uses: actions/checkout@v4 | ||
|
|
||
| - id: submodule | ||
| uses: ./.github/actions/setup-submodule | ||
| with: | ||
| deploy-key: ${{ secrets.OWLETTO_WEB_DEPLOY_KEY }} | ||
|
|
||
| - uses: oven-sh/setup-bun@v2 | ||
| with: | ||
| bun-version: 1.3.5 | ||
|
|
||
| - name: Install dependencies | ||
| if: steps.submodule.outputs.stubbed != 'true' | ||
| run: bun install | ||
|
|
||
| - name: Build owletto-sdk (owletto-web imports its compiled dist) | ||
| if: steps.submodule.outputs.stubbed != 'true' | ||
| run: cd packages/owletto-sdk && bun run build | ||
|
|
||
| - name: owletto-web tests (vitest) | ||
| # owletto-web doesn't define a `test` script in its package.json yet | ||
| # (that change ships as a separate submodule PR). Invoke vitest | ||
| # directly via the workspace-installed binary so this works on | ||
| # whatever submodule SHA the parent repo points at. | ||
| if: steps.submodule.outputs.stubbed != 'true' | ||
| run: cd packages/owletto-web && ../../node_modules/.bin/vitest run | ||
|
|
||
| # Backend integration tests need a real Postgres + pgvector. We run them | ||
| # under Node (not bun) for two reasons: (1) the vitest suite uses Node-only | ||
| # APIs in places, and (2) the sandbox runtime requires isolated-vm which | ||
| # is a V8 native addon that bun cannot load. | ||
| integration: |
b97d7d5 to
33c9b66
Compare
Contributor
|
Triage decision: Reasons:
Next: Human review required for workflow changes. PR has been assigned to buremba for review. |
buremba
added a commit
that referenced
this pull request
Apr 28, 2026
… add packages Codex review of #437 surfaced concrete issues; addressed each: - **classifier handler**: fixed two bugs in manage_classifiers.handleCreate — missing organization_id on event_classifiers INSERT, and stale string literal 'api' for created_by (FK to user.id). Pass ctx through routeAction and use ctx.userId. Un-skipped the classifier integration test. - **TestApiClient ToolContext**: now sets tokenType, scopedToOrg, and allowCrossOrg so handlers gating on those fields (e.g. cross-org reads via client.org()) see realistic values. - **TestMcpClient comment**: dropped the misleading "both layers share client.entities, etc." line — only TestApiClient exposes the SDK namespaces. The two layers are deliberately not interchangeable. - **tools/list skips**: the 5 anonymous/auth'd tools/list cases are now rewritten to use mcpListTools() (which does the initialize handshake first per spec) instead of being skipped. Added cookie + orgSlug forwarding to mcpListTools/mcpRequest so the cookie-auth case works. - **403-vs-400 skip**: un-skipped the public-org REST mutation test with the correct 403 expectation. The 403 is right (forbidden, not malformed). - **JSON-RPC org-context test**: rewrote to assert the actual auth-layer behavior — anonymous tools/call returns HTTP 401 with a WWW-Authenticate challenge, not a JSON-RPC -32001 error. - **CI scope**: added packages/owletto-cli + packages/owletto-worker to the unit job. owletto-openclaw e2e tests need a live backend (docker compose) and stay in smoke-example.yml. - **biome ignore**: ignore .connector-child-*.mjs subprocess temp files the owletto-worker integration tests leave behind. Net: 471 vitest tests pass (was 462), 1114 bun tests pass (was 1106). The PAT FK skip stays — pre-existing schema bug, out of scope.
buremba
added a commit
that referenced
this pull request
Apr 28, 2026
… add packages Codex review of #437 surfaced concrete issues; addressed each: - **classifier handler**: fixed two bugs in manage_classifiers.handleCreate — missing organization_id on event_classifiers INSERT, and stale string literal 'api' for created_by (FK to user.id). Pass ctx through routeAction and use ctx.userId. Un-skipped the classifier integration test. - **TestApiClient ToolContext**: now sets tokenType, scopedToOrg, and allowCrossOrg so handlers gating on those fields (e.g. cross-org reads via client.org()) see realistic values. - **TestMcpClient comment**: dropped the misleading "both layers share client.entities, etc." line — only TestApiClient exposes the SDK namespaces. The two layers are deliberately not interchangeable. - **tools/list skips**: the 5 anonymous/auth'd tools/list cases are now rewritten to use mcpListTools() (which does the initialize handshake first per spec) instead of being skipped. Added cookie + orgSlug forwarding to mcpListTools/mcpRequest so the cookie-auth case works. - **403-vs-400 skip**: un-skipped the public-org REST mutation test with the correct 403 expectation. The 403 is right (forbidden, not malformed). - **JSON-RPC org-context test**: rewrote to assert the actual auth-layer behavior — anonymous tools/call returns HTTP 401 with a WWW-Authenticate challenge, not a JSON-RPC -32001 error. - **CI scope**: added packages/owletto-cli + packages/owletto-worker to the unit job. owletto-openclaw e2e tests need a live backend (docker compose) and stay in smoke-example.yml. - **biome ignore**: ignore .connector-child-*.mjs subprocess temp files the owletto-worker integration tests leave behind. Net: 471 vitest tests pass (was 462), 1114 bun tests pass (was 1106). The PAT FK skip stays — pre-existing schema bug, out of scope.
0f101f8 to
193a2ab
Compare
Contributor
|
Triage decision: Reasons:
Next: Human review required for infrastructure changes. The PR has been assigned to @buremba for manual review and merge. |
- Split CI into unit / frontend / integration jobs so unit failures surface
in <2 min instead of waiting behind Postgres setup.
- Drop the worker cherry-pick (8 of 18 files); full 187-test suite passes.
- Wire owletto-web's 14 vitest files into CI via the workspace binary.
- Stand up a real pgvector/pg16 service for the owletto-backend integration
job so the vitest suite (previously 0 of 38 in CI) runs end-to-end under
Node 22. Add a fail-fast pgvector extension check.
Compose-able test client primitives (packages/owletto-backend/src/__tests__/setup/test-mcp-client.ts):
- TestMcpClient — full HTTP/JSON-RPC round-trip, exercises auth, session
init, and the isolated-vm sandbox. Adds orgSlug pinning so tools that
need org context land on /mcp/{slug} instead of unscoped /mcp.
- TestApiClient — direct namespace-handler calls. Same fluent surface
(entities, watchers, classifiers, knowledge, …), no HTTP hop, no sandbox.
Replace 33 stale post-#348 manage_* integration tests with 8 focused new
files using the primitives:
entity-schema/entity-types, entities/entity-crud, watchers/watchers-crud,
classifiers/classifiers-crud, cross-org/isolation, mcp/auth-wire,
sandbox/execute-wire. Same conceptual coverage in 1/3 the line count.
Sync 30 columns of QUERYABLE_SCHEMA drift detected by the existing
table-schema test — the test was a real bug catcher; the columns now
flow through query_sql validation.
Other fixes folded in: worktree-relative paths in migrations-format,
owletto-guidance-sync, connector-catalog; stale URL-builder assertions;
search-cross-org's missing workspace-provider init and stale .entities
field reference; mcp-install-targets label drift.
Test status (post-PR, on Linux Node 22):
37 owletto-backend vitest files / 389 tests pass
187 worker tests pass (full suite)
76 owletto-web tests pass
720 core/gateway/cli tests pass
5 skipped: 2 pre-existing bugs (PAT FK, classifier org_id), 3 documented
The whatsapp connector imports `npm:baileys@7.0.0-rc.9` (Deno-style), which is rewritten by the connector compiler at install time. CI runs unit tests via `bun test` without the compiler, so the import fails to resolve. The test was never actually running in CI before — it landed under my expanded `bun test packages/owletto-backend/src/__tests__/unit` pattern, which exposed it. Gate the `describe`s on `RUN_CONNECTOR_TESTS=1` and short-circuit the beforeAll so the import doesn't run on CI. The integration test (whatsapp-entity-links.test.ts) covers the broader entityLinks surface.
… add packages Codex review of #437 surfaced concrete issues; addressed each: - **classifier handler**: fixed two bugs in manage_classifiers.handleCreate — missing organization_id on event_classifiers INSERT, and stale string literal 'api' for created_by (FK to user.id). Pass ctx through routeAction and use ctx.userId. Un-skipped the classifier integration test. - **TestApiClient ToolContext**: now sets tokenType, scopedToOrg, and allowCrossOrg so handlers gating on those fields (e.g. cross-org reads via client.org()) see realistic values. - **TestMcpClient comment**: dropped the misleading "both layers share client.entities, etc." line — only TestApiClient exposes the SDK namespaces. The two layers are deliberately not interchangeable. - **tools/list skips**: the 5 anonymous/auth'd tools/list cases are now rewritten to use mcpListTools() (which does the initialize handshake first per spec) instead of being skipped. Added cookie + orgSlug forwarding to mcpListTools/mcpRequest so the cookie-auth case works. - **403-vs-400 skip**: un-skipped the public-org REST mutation test with the correct 403 expectation. The 403 is right (forbidden, not malformed). - **JSON-RPC org-context test**: rewrote to assert the actual auth-layer behavior — anonymous tools/call returns HTTP 401 with a WWW-Authenticate challenge, not a JSON-RPC -32001 error. - **CI scope**: added packages/owletto-cli + packages/owletto-worker to the unit job. owletto-openclaw e2e tests need a live backend (docker compose) and stay in smoke-example.yml. - **biome ignore**: ignore .connector-child-*.mjs subprocess temp files the owletto-worker integration tests leave behind. Net: 471 vitest tests pass (was 462), 1114 bun tests pass (was 1106). The PAT FK skip stays — pre-existing schema bug, out of scope.
3361831 to
8221e6c
Compare
Contributor
|
Triage decision: Reasons:
Next: Human review required for infrastructure changes. The PR has been assigned to @buremba for manual review and merge. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Going from "CI runs ~75 tests of 195" to "CI runs everything reliably". Codex-reviewed.
What CI runs now (vs. before):
CI is split into
unit/frontend/integrationjobs so unit failures surface in <2 min instead of waiting behind Postgres setup.Composable test primitives
packages/owletto-backend/src/__tests__/setup/test-mcp-client.ts— two layers, same fluent surface (entities,watchers,classifiers, …):TestMcpClient— full HTTP/JSON-RPC round-trip, exercises auth, session init, sandbox.orgSlugpinning so org-scoped tools land on/mcp/{slug}.TestApiClient— direct namespace handlers, no HTTP/sandbox. Fast white-box CRUD and access-control tests.This is the pattern future integration tests should follow — see
entity-schema/entity-types.test.tsas the canonical example.Replaced 33 stale
manage_*tests with 8 focused new filesThe 33 deleted files all called
manage_entity_schema,manage_entity,manage_watchers, etc. — tools removed in #348. New files cover the same conceptual surface in 1/3 the line count:entity-schema/entity-types.test.tsentities/entity-crud.test.tswatchers/watchers-crud.test.tsclassifiers/classifiers-crud.test.tscross-org/isolation.test.tsmcp/auth-wire.test.tssandbox/execute-wire.test.tsReal bug found and fixed
The existing
table-schema.test.tswas a real schema-drift detector — and was finding 30 columns that had been added to the DB but not registered inQUERYABLE_SCHEMA, which means valid SQL JOINs were being rejected. Synced now.Documented limitations
it.skip+ comments: 2 pre-existing bugs (PAT FK to oauth_clients, classifier-create missing organization_id), 3 sandbox tests that need Node 22's isolated-vm prebuild (CI pins Node 22 — they pass there).testscript + jsdom setup left for the submodule PR; CI invokes vitest directly via the workspace binary so no submodule bump is required for this PR.Test plan
skip-size-checklabel (PR is mostly deletions: -10398 / +1092)unitjob (core/gateway/cli/worker/owletto-backend bun units)frontendjob (owletto-web vitest)integrationjob (owletto-backend vitest under Node 22 + pgvector/pg16)format-lint,typecheck,migrations*.test.tsunderintegration/<area>/, instantiateTestApiClient.for(...), assert)