feat: feat(api): Add RFC9457-compliant error handling and state machines#139
Conversation
…s monitoring, hash list management, presigned upload workflow, RFC9457 middleware extension, resource cleanup job, resource file CRUD, results and batch operations, state machine classes, and template import/export functionality. Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
- Fix type errors in RFC9457 middleware and exception handler by using explicit dict type annotations and dict comprehensions - Add ClassVar annotations to state machine class attributes - Separate domain exceptions from API exceptions: services now raise InvalidStateTransitionError, Web API converts to HTTPException, Control API uses RFC9457 middleware - Update unit tests to expect InvalidStateTransitionError from services - Fix frontend eslint no-var false positive in TypeScript global augmentation Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Bump python-multipart 0.0.21 → 0.0.22 (fixes CVE arbitrary file write) - Bump pypdf 6.6.0 → 6.6.2 (fixes infinite loop vulnerability) Resolves Dependabot alerts #27 and #28. Note: Alert #26 (ecdsa timing attack) remains open but is mitigated by using python-jose[cryptography] which performs crypto ops via cryptography library, not ecdsa. Alert #15 (js-yaml) is already at fixed version 4.1.1. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds Control-plane RFC9457 Problem Details handling, declarative Campaign/Attack state machines with transition validation, pending-resource cleanup (helpers, periodic task, cancel endpoint), state-machine integration into services/handlers, new tests, and extensive Control API docs and specs. Changes
Sequence Diagram(s)sequenceDiagram
rect rgba(220,235,255,0.5)
participant Client as Client / Control API
participant API as Control Resources API
participant Service as Resource Service
participant DB as Database
participant Storage as MinIO
end
Client->>API: DELETE /api/v1/control/resources/{id}/cancel
API->>Service: cancel_pending_resource(resource_id, db, user)
Service->>DB: SELECT ... FOR UPDATE SKIP LOCKED (resource by id)
DB-->>Service: AttackResourceFile
Service->>Service: Validate access & ensure is_uploaded == False
Service->>Storage: Delete object if exists
Storage-->>Service: OK / NotFound
Service->>DB: DELETE resource row
DB-->>Service: OK
Service-->>API: success
API-->>Client: 204 No Content
sequenceDiagram
rect rgba(235,220,255,0.5)
participant Scheduler as App Lifespan
participant Task as Periodic Cleanup Task
participant DB as Database
participant Service as Resource Service
participant Storage as MinIO
end
Scheduler->>Task: start run_periodic_cleanup()
loop every RESOURCE_CLEANUP_INTERVAL_HOURS
Task->>DB: open fresh session
Task->>Service: cleanup_stale_pending_resources(db)
Service->>DB: SELECT is_uploaded=false AND created_at < cutoff FOR UPDATE SKIP LOCKED
DB-->>Service: list of stale resources
loop each resource
Service->>Storage: delete object if present
Service->>DB: delete resource row
end
Service-->>Task: {deleted:N, errors:M}
Task->>DB: commit & close
end
sequenceDiagram
rect rgba(255,235,220,0.5)
participant Client as HTTP Client
participant Router as Control Router
participant Service as Domain Service / State Machine
participant Middleware as Control RFC9457 Middleware
participant Response as Client Response
end
Client->>Router: POST /api/v1/control/campaigns/{id}/start
Router->>Service: validate_action(current_state, "start")
alt valid
Service-->>Router: target_state
Router->>Service: perform start
Router-->>Response: 200 OK
else invalid
Service-->>Router: raises InvalidStateTransitionError
Router->>Middleware: exception propagated
Middleware->>Middleware: build problem+json (type,title,status,detail,extensions)
Middleware-->>Response: application/problem+json (e.g., 409)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~55 minutes Possibly related PRs
Poem
Caution Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional.
❌ Failed checks (1 error)
✅ Passed checks (4 passed)
Comment |
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
There was a problem hiding this comment.
Pull request overview
This PR advances the Control API foundation by standardizing error responses to RFC9457 Problem Details, adding a manual cancel endpoint for pending resource uploads (plus a periodic cleanup task), and enforcing campaign/attack lifecycle transitions via dedicated state machines.
Changes:
- Add RFC9457 error normalization for Control API (middleware + HTTPException handler) and introduce a state-transition Problem type with extension fields.
- Implement resource upload cancellation and periodic cleanup of stale pending resources, configurable via settings.
- Add Campaign/Attack state machine classes and integrate them into service-layer lifecycle operations, with expanded unit/integration tests and supporting Control API epic docs.
Reviewed changes
Copilot reviewed 38 out of 39 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| uv.lock | Bumps locked dependency versions (python-multipart, pypdf). |
| pyproject.toml | Updates python-multipart minimum version. |
| frontend/vitest-setup-client.ts | Adds eslint suppression for TS global augmentation. |
| app/models/attack.py | Adds PAUSED to AttackState. |
| app/main.py | Starts/stops periodic resource cleanup task in lifespan. |
| app/core/tasks/resource_tasks.py | Adds stale-pending-resource cleanup and periodic runner. |
| app/core/state_machines.py | Introduces Campaign/Attack state machines + transition validation errors. |
| app/core/services/resource_service.py | Adds per-resource cleanup helper + manual cancel logic. |
| app/core/services/campaign_service.py | Uses CampaignStateMachine to validate lifecycle actions. |
| app/core/services/attack_service.py | Uses AttackStateMachine to validate abort transition on delete. |
| app/core/control_rfc9457_middleware.py | Extends Control RFC9457 middleware to handle HTTPException + state-transition Problem extensions. |
| app/core/control_exceptions.py | Adds InvalidResourceStateError + InvalidStateTransitionProblem. |
| app/core/config.py | Adds cleanup job interval/age settings. |
| app/api/v1/endpoints/web/campaigns.py | Converts InvalidStateTransitionError into HTTP 400 for web endpoints. |
| app/api/v1/endpoints/control/router.py | Registers new Control resources router. |
| app/api/v1/endpoints/control/resources.py | Adds Control endpoint to cancel pending resource uploads. |
| app/api/v1/endpoints/agent/v1_http_exception_handler.py | Adds RFC9457 conversion for Control API HTTPException paths. |
| tests/unit/test_state_machines.py | Adds comprehensive unit tests for state machine behavior and RFC9457 conversion. |
| tests/unit/test_campaign_service.py | Adds service-level tests asserting invalid campaign transitions are rejected. |
| tests/unit/test_attack_service.py | Adds service-level tests for abort/delete behavior and invalid transitions. |
| tests/integration/control/test_control_error_handling.py | Adds integration tests for RFC9457 formatting + HTTPException normalization + path scoping. |
| docs/epics/control-api/tickets/Template_Import_Export.md | Adds planning ticket doc for template endpoints/behavior. |
| docs/epics/control-api/tickets/State_Machine_Classes.md | Adds planning ticket doc for state machine approach. |
| docs/epics/control-api/tickets/Results_&_Batch_Operations.md | Adds planning ticket doc for results/batch operations. |
| docs/epics/control-api/tickets/Resource_File_CRUD.md | Adds planning ticket doc for resource CRUD endpoints. |
| docs/epics/control-api/tickets/Resource_Cleanup_Job.md | Adds planning ticket doc for periodic cleanup + cancel endpoint. |
| docs/epics/control-api/tickets/RFC9457_Middleware_Extension.md | Adds planning ticket doc for HTTPException normalization strategy. |
| docs/epics/control-api/tickets/Presigned_Upload_Workflow.md | Adds planning ticket doc for initiate/confirm upload flow. |
| docs/epics/control-api/tickets/Hash_List_CRUD.md | Adds planning ticket doc for hash list CRUD endpoints. |
| docs/epics/control-api/tickets/Campaign_Status_&_Metrics.md | Adds planning ticket doc for monitoring/caching patterns. |
| docs/epics/control-api/tickets/Campaign_Lifecycle_Actions.md | Adds planning ticket doc for lifecycle endpoints and validation. |
| docs/epics/control-api/tickets/Campaign_CRUD_&_Validation.md | Adds planning ticket doc for campaign CRUD + preflight validation. |
| docs/epics/control-api/tickets/Attack_Lifecycle_&_Reordering.md | Adds planning ticket doc for attack lifecycle + reordering. |
| docs/epics/control-api/tickets/Attack_CRUD_&_Validation.md | Adds planning ticket doc for attack CRUD + validation/estimation. |
| docs/epics/control-api/tickets/Agent_&_Task_Monitoring.md | Adds planning ticket doc for agent/task monitoring endpoints. |
| docs/epics/control-api/specs/Tech_Plan__Control_API_Completion.md | Adds broader architectural tech plan for Control API completion. |
| docs/epics/control-api/specs/Epic_Brief__Control_API_Completion.md | Adds epic brief describing motivation/success criteria. |
| docs/epics/control-api/specs/Core_Flows__Control_API_User_Journeys.md | Adds detailed user journeys/flows for Control API usage. |
| except HTTPException as exc: | ||
| # Convert HTTPException to RFC9457 format for Control API routes | ||
| title = _STATUS_TITLES.get(exc.status_code, "HTTP Error") | ||
|
|
||
| # Build base problem response | ||
| problem: dict = { | ||
| "type": "about:blank", | ||
| "title": title, | ||
| "status": exc.status_code, | ||
| "instance": str(request.url.path), | ||
| } | ||
|
|
||
| # Support problem extensions when detail is a dictionary | ||
| if isinstance(exc.detail, dict): | ||
| detail_dict: dict[str, Any] = exc.detail | ||
| # Merge extension fields, but preserve required RFC9457 fields | ||
| extensions = { | ||
| key: value | ||
| for key, value in detail_dict.items() | ||
| if key not in ("type", "title", "status", "instance") | ||
| } | ||
| problem.update(extensions) | ||
| # Always set detail: use 'detail' from dict if present, otherwise fallback to title | ||
| problem["detail"] = detail_dict.get("detail", title) | ||
| else: | ||
| problem["detail"] = str(exc.detail) if exc.detail else title | ||
|
|
||
| return JSONResponse( | ||
| status_code=exc.status_code, | ||
| content=problem, | ||
| headers={"Content-Type": "application/problem+json"}, | ||
| ) |
There was a problem hiding this comment.
ControlRFC9457Middleware now converts HTTPException to RFC9457, but the app also registers a global HTTPException handler (v1_http_exception_handler) that already converts /api/v1/control/* exceptions to RFC9457. Keeping both conversion paths risks divergence and may make this middleware branch effectively unreachable depending on middleware order. Consider consolidating HTTPException→RFC9457 conversion into a single mechanism (middleware or exception handler).
| except InvalidStateTransitionError as e: | ||
| raise HTTPException( | ||
| status_code=status.HTTP_400_BAD_REQUEST, | ||
| detail=f"Cannot start an {e.from_state.value} campaign.", | ||
| ) from e |
There was a problem hiding this comment.
These error messages hardcode the article "an" (e.g., "Cannot start an {state} campaign"), which is grammatically incorrect for some states (e.g., "completed"). Consider removing the article or generating it based on the state to avoid awkward messages.
| responses={ | ||
| 204: {"description": "Resource cancelled successfully"}, | ||
| 400: { | ||
| "description": "Invalid resource state - resource is already uploaded", | ||
| "content": { | ||
| "application/problem+json": { | ||
| "example": { | ||
| "type": "about:blank", | ||
| "title": "Invalid Resource State", | ||
| "status": 400, | ||
| "detail": "Cannot cancel resource that is already uploaded. Use DELETE to remove uploaded resources.", | ||
| } | ||
| } | ||
| }, | ||
| }, | ||
| 403: { | ||
| "description": "User does not have access to the resource's project", | ||
| "content": { | ||
| "application/problem+json": { | ||
| "example": { | ||
| "type": "about:blank", | ||
| "title": "Project Access Denied", | ||
| "status": 403, | ||
| "detail": "User does not have access to project 123", | ||
| } | ||
| } | ||
| }, | ||
| }, | ||
| 404: { | ||
| "description": "Resource not found", | ||
| "content": { | ||
| "application/problem+json": { | ||
| "example": { | ||
| "type": "about:blank", | ||
| "title": "Resource Not Found", | ||
| "status": 404, | ||
| "detail": "Resource 123e4567-e89b-12d3-a456-426614174000 not found", | ||
| } |
There was a problem hiding this comment.
The OpenAPI responses examples use "type": "about:blank" for 400/403/404 cases, but the Control API's custom Problem classes (e.g., ResourceNotFoundError, InvalidResourceStateError) emit a kebab-case type string (per the Control error-handling tests). Updating these examples to match the actual response shape will prevent clients from being misled by the docs.
| class AttackState(str, Enum): | ||
| PENDING = "pending" | ||
| RUNNING = "running" | ||
| PAUSED = "paused" | ||
| COMPLETED = "completed" |
There was a problem hiding this comment.
Adding AttackState.PAUSED changes the SQLAlchemy Enum backing type. On Postgres this typically requires an enum ALTER / migration; otherwise persisting or querying rows with the new value can fail at runtime ("invalid input value for enum"). Please add the corresponding DB migration (or ensure the column is stored as a non-native enum) before relying on this state in services/state machines.
| # Check project access | ||
| if resource.project_id: | ||
| accessible_projects = [ | ||
| assoc.project_id for assoc in (current_user.project_associations or []) | ||
| ] | ||
| if ( | ||
| resource.project_id not in accessible_projects | ||
| and not current_user.is_superuser | ||
| ): | ||
| raise ProjectAccessDeniedError( |
There was a problem hiding this comment.
cancel_pending_resource() only enforces project access checks when resource.project_id is set. Since AttackResourceFile.project_id is nullable, a pending unrestricted resource (project_id=None) could be cancelled by any authenticated user via the Control API. Please add an explicit authorization rule for unrestricted resources (e.g., superuser/admin only) before allowing deletion.
…rom review Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
There was a problem hiding this comment.
Actionable comments posted: 19
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
app/main.py (1)
289-291:⚠️ Potential issue | 🟡 MinorDuplicate exception handler registration.
Line 291 registers the same
HTTPExceptionhandler that was already registered at line 206. This second registration is redundant and should be removed to avoid confusion.Proposed fix
# Register v1 error handler for all /api/v1/client/* and /api/v1/agent/* endpoints (contract compliance) # The handler passes any non-Agent API endpoints to the default handler -app.add_exception_handler(HTTPException, v1_http_exception_handler)app/core/services/campaign_service.py (1)
457-460:⚠️ Potential issue | 🟠 MajorEnforce project_id scoping in campaign state transitions.
These services load campaigns by ID only, which breaks multi-tenancy if IDs are guessed. Pass project_id (or user context) into the service and filter the queries accordingly.As per coding guidelines, "app/**/*.py: Always filter data queries by project_id for project isolation and multi-tenancy" and "app/**/*service.py: Enforce project-level isolation for all data access in multi-tenant system".Proposed fix (apply similarly to stop and archive)
-async def start_campaign_service(campaign_id: int, db: AsyncSession) -> CampaignRead: +async def start_campaign_service( + campaign_id: int, + project_id: int, + db: AsyncSession, +) -> CampaignRead: @@ - result = await db.execute(select(Campaign).where(Campaign.id == campaign_id)) + result = await db.execute( + select(Campaign).where( + Campaign.id == campaign_id, + Campaign.project_id == project_id, + ) + )Also applies to: 493-496, 588-591
🤖 Fix all issues with AI agents
In `@app/api/v1/endpoints/control/resources.py`:
- Around line 81-95: Update the docstring for the "Cancel a pending resource
upload" endpoint to include structured sections: Args (describe parameters such
as request path param resource_id and auth/user context), Returns (204 No
Content on success), and Raises (HTTPException cases for 403/404/409 or any
storage/db deletion errors). Make sure to mention preconditions checked (project
access verification and is_uploaded=False), side-effects (MinIO object deletion
and DB record deletion), and any error conditions thrown by the deletion logic
so generated docs and readers can see inputs, outputs, and exceptions clearly.
In `@app/api/v1/endpoints/web/campaigns.py`:
- Line 52: The InvalidStateTransitionError import is fine, but the error detail
strings that build messages like "an {state}" should be made neutral and
grammatically correct for any state; locate where InvalidStateTransitionError is
raised in this module (e.g., the places around the current import and the two
blocks around lines referenced) and change the message construction to use a
neutral phrase such as "state '{state}'" or "the {state} state" (for example:
raise InvalidStateTransitionError(detail=f"Cannot transition to state
'{new_state}'") ) so the wording reads correctly for all state names.
In `@app/core/config.py`:
- Around line 196-204: Add validation bounds to the new Pydantic Fields to
prevent zero/negative values: update RESOURCE_CLEANUP_INTERVAL_HOURS and
RESOURCE_CLEANUP_AGE_HOURS (the Field definitions) to include appropriate ge
constraints (e.g., ge=1) and a sensible description update; this ensures bad env
vars like 0 or negatives are rejected at model validation time and prevents
immediate/fast cleanup loops.
In `@app/core/control_rfc9457_middleware.py`:
- Around line 149-151: The middleware currently re-raises all Exception types
which then bypasses v1_http_exception_handler; update the except block inside
the control_rfc9457_middleware middleware to only catch HTTPException (from
fastapi) and re-raise it, and for any other Exception convert it into a
sanitized fastapi.HTTPException(status_code=500, detail="Internal server error")
before raising so no internal stack traces are exposed; alternatively add an
app-level catch-all that maps non-HTTPException errors to a sanitized
HTTPException and ensure v1_http_exception_handler still handles HTTPException
types.
In `@app/core/services/attack_service.py`:
- Around line 1104-1106: The delete endpoint currently only catches
AttackNotFoundError and lets InvalidStateTransitionError bubble up causing a
500; update the endpoint that calls delete_attack_service to also catch
InvalidStateTransitionError (import from app.core.state_machines) and re-raise
an HTTPException with status_code=status.HTTP_400_BAD_REQUEST and a descriptive
detail like "Cannot abort an {e.from_state.value} attack."; keep the existing
AttackNotFoundError handling (map to 404) and ensure you import
InvalidStateTransitionError and use the same pattern as the campaign endpoints.
In `@app/core/services/resource_service.py`:
- Around line 1796-1801: Logging currently includes current_user.email which may
be PII; change the structured log to use a non-PII identifier (e.g.,
current_user.id or current_user.user_id) instead. Update the logger.bind call in
the manual cancel flow (the block that binds resource_id, user_email,
project_id, action="manual_cancel") to replace user_email=current_user.email
with user_id=str(current_user.id) (or the canonical user ID attribute), and
adjust the key name from "user_email" to "user_id" so logs no longer contain
emails but still tie actions to a user identifier.
- Around line 1718-1727: The delete currently calls await db.delete(resource)
but never commits, which can leave the deletion unpersisted; after await
db.delete(resource) add an explicit await db.commit() (or otherwise ensure the
transaction is committed) before logging and returning True so the removal won't
be rolled back; update the code around db.delete(resource) / logger.bind(...) to
commit the transaction (or add a clear docstring comment if you decide the
caller should be responsible).
In `@app/core/state_machines.py`:
- Around line 134-140: Replace the non-ASCII Unicode arrow characters in the
state transition docstrings with ASCII "->" sequences; locate the "Valid
Transitions:" sections in app/core/state_machines.py (the docstring blocks that
list transitions like "DRAFT → ACTIVE (start)") and replace each "→" with "->"
(also update the other occurrence around lines 262-268). Ensure both docstring
blocks and any similar transition lists in the module use only ASCII characters
and keep the original wording/parentheses intact.
- Around line 377-386: The method is_terminal_state currently treats unknown
AttackState values as terminal due to using cls.TRANSITIONS.get(state, []);
change it to first check whether state is a known key in cls.TRANSITIONS and
raise a clear exception (e.g., KeyError or ValueError) if it's unmapped,
otherwise return whether the associated transition list is empty; update the
logic in is_terminal_state and reference cls.TRANSITIONS and the AttackState
argument to locate and fix the check.
In `@app/models/attack.py`:
- Line 24: The Attack model is missing a project_id which breaks multi-tenant
isolation; add a new field project_id: Mapped[int] to the Attack SQLAlchemy
model (class Attack) with a ForeignKey to the projects table (same pattern used
by Campaign, HashList, AttackResourceFile), make it non-nullable, then update
every place that queries by campaign_id (e.g., methods/functions that reference
Attack and filter by campaign_id) to also filter by project_id; finally, create
an Alembic migration that adds the non-nullable project_id column (populate it
for existing rows if needed or add a default/temporary nullable -> backfill ->
set non-nullable) and include the FK constraint so migrations remain consistent.
In `@docs/epics/control-api/specs/Tech_Plan__Control_API_Completion.md`:
- Line 31: Replace all non-ASCII punctuation in
Tech_Plan__Control_API_Completion.md (e.g., curly quotes “ and ”) with ASCII
equivalents (straight quotes " and ') throughout the document; scan for other
non-ASCII punctuation characters (em dashes, bullets, emojis) and normalize them
to ASCII counterparts, ensuring phrases like “Running” vs “Queued” use straight
quotes and no emojis remain, and then run a quick lint/encoding check to confirm
the file uses plain ASCII punctuation only.
In `@docs/epics/control-api/tickets/Campaign_CRUD_`&_Validation.md:
- Around line 66-80: The code block showing the validation response sample uses
the wrong language identifier (`python`) even though the payload is JSON (e.g.,
the object starting with "valid": false and the errors/warnings arrays); update
that fenced code block to use the `json` identifier so syntax highlighting and
boolean literals render correctly (replace the opening ```python with ```json
for the JSON response example in Campaign_CRUD_&_Validation.md).
In `@docs/epics/control-api/tickets/Campaign_Lifecycle_Actions.md`:
- Around line 36-48: The acceptance criteria claim "any state → archived" but
the Mermaid state diagram only shows transitions to archived from draft, active,
and paused (states: draft, active, paused, completed, archived); decide whether
completed → archived is allowed and make the docs consistent by either adding
the missing transition "completed --> archived: archive" to the stateDiagram-v2
block or changing the acceptance criteria to exclude completed, and ensure both
the Mermaid diagram (states draft, active, paused, completed, archived) and the
written acceptance criteria reflect the chosen behavior.
In `@docs/epics/control-api/tickets/Resource_Cleanup_Job.md`:
- Around line 35-63: The diagram shows committing the DB transaction
per-resource but the code in resource_tasks.py (resource cleanup job) commits
once after processing all resources; update the sequence diagram so the
"Job->>DB: Commit transaction" arrow is moved outside the inner "For each stale
resource" loop (i.e., after the loop completes) to reflect the batch-commit
behavior used by the Cleanup Job implementation.
In `@docs/epics/control-api/tickets/Resource_File_CRUD.md`:
- Line 83: Update the acceptance criterion currently written as "All operations
respect project scoping (query parameter)" to make project isolation mandatory:
replace that line so it requires that every operation MUST filter and enforce
the authenticated user's project context (not optional via a query parameter),
e.g., "All operations MUST filter by and enforce the authenticated user's
project context (project scoping is mandatory)." Ensure any mention of a query
parameter clarifies it cannot be used to override or bypass the authenticated
project and that requests missing a matching project must be rejected.
In `@docs/epics/control-api/tickets/Template_Import_Export.md`:
- Around line 48-85: The fenced example blocks labeled `python` contain
JSON-like payloads with lowercase booleans and a Python comment; change both
fences to ```json (the top example and the "Import Response (Partial Import)"
example) and remove or convert the Python-style inline comment ("# Name, not
ID") because JSON does not allow comments so the examples remain valid JSON with
true/false booleans.
In `@tests/integration/control/test_control_error_handling.py`:
- Around line 100-102: The test uses a non-ASCII apostrophe in the string "I'm a
teapot" inside the route handler test_http_unknown_status (and the other test
block around the later occurrence), so replace the curly/non-ASCII apostrophe
with a straight ASCII apostrophe in the string literals (e.g., change to "I'm a
teapot") in the test_http_unknown_status function and the other occurrence to
satisfy the ASCII-only guideline; search for the exact phrase and update both
string occurrences to use the ASCII apostrophe.
In `@tests/unit/test_campaign_service.py`:
- Around line 696-842: Add a new unit test mirroring the ARCHIVED rejection
tests to assert start_campaign_service also rejects the COMPLETED state: create
a project and hash_list via ProjectFactory and HashListFactory (use
get_or_create_hash_type), create a CampaignFactory entry with
state=CampaignState.COMPLETED, patch
app.core.services.campaign_service._broadcast_campaign_update, then call
start_campaign_service(campaign.id, db_session) inside
pytest.raises(InvalidStateTransitionError) and assert exc_info.value.from_state
== CampaignState.COMPLETED and exc_info.value.action == "start".
In `@tests/unit/test_state_machines.py`:
- Around line 699-726: Replace manual enum assignments in
test_campaign_full_lifecycle with factory/fixture-generated states: use the test
factory or helper that creates Campaign instances or campaign states (e.g., the
campaign factory in tests/factories or helper in tests/utils/test_helpers.py)
and call CampaignStateMachine.can_transition against the factory-produced
object's state rather than hardcoding CampaignState.DRAFT / ACTIVE / PAUSED /
COMPLETED / ARCHIVED; update references inside test_campaign_full_lifecycle and
any assertions to pull the current state from the factory object (or helper)
before invoking CampaignStateMachine.can_transition so the test follows the
project's factory-based test data pattern.
| except Exception: | ||
| # Let other exceptions bubble up to be handled by existing handlers | ||
| raise |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n app/core/control_rfc9457_middleware.py | head -160 | tail -30Repository: unclesp1d3r/Ouroboros
Length of output: 1544
🏁 Script executed:
cat -n app/main.py | grep -A10 -B2 "exception\|Exception\|@app"Repository: unclesp1d3r/Ouroboros
Length of output: 4590
🏁 Script executed:
rg -n "exception_handler|@.*exception|ExceptionMiddleware" app/ --type py -A3Repository: unclesp1d3r/Ouroboros
Length of output: 2123
🏁 Script executed:
cat -n app/api/v1/endpoints/agent/v1_http_exception_handler.pyRepository: unclesp1d3r/Ouroboros
Length of output: 3732
🏁 Script executed:
rg -n "ServerErrorMiddleware|RequestValidationError|starlette.exceptions" app/ --type py -A5Repository: unclesp1d3r/Ouroboros
Length of output: 47
🏁 Script executed:
rg -n "def.*exception.*handler|@app.exception_handler" app/ --type py -B1 -A5Repository: unclesp1d3r/Ouroboros
Length of output: 1169
Ensure generic exceptions re-raised from middleware are handled securely downstream.
The middleware re-raises unhandled exceptions (lines 149–151) to be processed by global handlers. However, v1_http_exception_handler (lines 58–59) explicitly re-raises non-HTTPException types, meaning unhandled generic exceptions bypass the dedicated handler and fall to FastAPI's default ServerErrorMiddleware. In debug mode or misconfigured environments, this can expose stack traces or internal error details to clients.
Recommend either:
- Restrict the middleware to only catch and handle
HTTPException - Convert generic exceptions to
HTTPExceptionwith sanitized error messages before re-raising - Add a catch-all exception handler at the app level that explicitly prevents stack trace leakage for Control API clients
🤖 Prompt for AI Agents
In `@app/core/control_rfc9457_middleware.py` around lines 149 - 151, The
middleware currently re-raises all Exception types which then bypasses
v1_http_exception_handler; update the except block inside the
control_rfc9457_middleware middleware to only catch HTTPException (from fastapi)
and re-raise it, and for any other Exception convert it into a sanitized
fastapi.HTTPException(status_code=500, detail="Internal server error") before
raising so no internal stack traces are exposed; alternatively add an app-level
catch-all that maps non-HTTPException errors to a sanitized HTTPException and
ensure v1_http_exception_handler still handles HTTPException types.
| - [ ] Users can update resource metadata (name, description, tags) | ||
| - [ ] Users can delete unused resources | ||
| - [ ] Delete validation prevents deletion if resource is used by attacks | ||
| - [ ] All operations respect project scoping (query parameter) |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Strengthen multi-tenancy isolation requirement.
The acceptance criterion states "operations respect project scoping (query parameter)" but for a multi-tenant system, project isolation should be mandatory, not optional via query parameter. Consider rewording to emphasize that all queries MUST filter by the authenticated user's project context.
Suggested clarification
-- [ ] All operations respect project scoping (query parameter)
+- [ ] All operations enforce project isolation (queries automatically filtered by authenticated user's project_id)
+- [ ] Optional project_id query parameter allows further filtering within accessible projects📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - [ ] All operations respect project scoping (query parameter) | |
| - [ ] All operations enforce project isolation (queries automatically filtered by authenticated user's project_id) | |
| - [ ] Optional project_id query parameter allows further filtering within accessible projects |
🤖 Prompt for AI Agents
In `@docs/epics/control-api/tickets/Resource_File_CRUD.md` at line 83, Update the
acceptance criterion currently written as "All operations respect project
scoping (query parameter)" to make project isolation mandatory: replace that
line so it requires that every operation MUST filter and enforce the
authenticated user's project context (not optional via a query parameter), e.g.,
"All operations MUST filter by and enforce the authenticated user's project
context (project scoping is mandatory)." Ensure any mention of a query parameter
clarifies it cannot be used to override or bypass the authenticated project and
that requests missing a matching project must be rejected.
| ```python | ||
| { | ||
| "schema_version": "1.0", | ||
| "campaign": { | ||
| "name": "Corporate Password Audit", | ||
| "description": "Standard audit template", | ||
| "attacks": [ | ||
| { | ||
| "name": "Dictionary Attack", | ||
| "attack_mode": 0, | ||
| "wordlist_name": "rockyou.txt", # Name, not ID | ||
| "rule_list_name": "best64.rule", | ||
| } | ||
| ], | ||
| }, | ||
| } | ||
| ``` | ||
|
|
||
| **Import Response (Partial Import):** | ||
|
|
||
| ```python | ||
| { | ||
| "campaign_id": 789, | ||
| "imported": { | ||
| "campaign": true, | ||
| "attacks": [{"name": "Dictionary Attack", "imported": true, "attack_id": 101}], | ||
| }, | ||
| "skipped": { | ||
| "attacks": [ | ||
| { | ||
| "name": "Mask Attack", | ||
| "reason": "missing_resource", | ||
| "missing_resources": ["custom_masks.hcmask"], | ||
| } | ||
| ] | ||
| }, | ||
| } | ||
| ``` |
There was a problem hiding this comment.
Fix example code block language/booleans to avoid invalid Python.
The examples are JSON-like but fenced as python and use true/false, which are invalid in Python. This can mislead implementers. Prefer json fences (or change booleans to True/False if you truly want Python).
Proposed documentation fix
-```python
+```json
{
"schema_version": "1.0",
"campaign": {
"name": "Corporate Password Audit",
"description": "Standard audit template",
"attacks": [
{
"name": "Dictionary Attack",
"attack_mode": 0,
"wordlist_name": "rockyou.txt", # Name, not ID
"rule_list_name": "best64.rule",
}
],
},
}@@
-python +json
{
"campaign_id": 789,
"imported": {
-
"campaign": true, -
"attacks": [{"name": "Dictionary Attack", "imported": true, "attack_id": 101}],
-
"campaign": true, -
},
"attacks": [{"name": "Dictionary Attack", "imported": true, "attack_id": 101}],
"skipped": {
"attacks": [
{
"name": "Mask Attack",
"reason": "missing_resource",
"missing_resources": ["custom_masks.hcmask"],
}
]
},
}
🤖 Prompt for AI Agents
In `@docs/epics/control-api/tickets/Template_Import_Export.md` around lines 48 -
85, The fenced example blocks labeled `python` contain JSON-like payloads with
lowercase booleans and a Python comment; change both fences to ```json (the top
example and the "Import Response (Partial Import)" example) and remove or
convert the Python-style inline comment ("# Name, not ID") because JSON does not
allow comments so the examples remain valid JSON with true/false booleans.
| @app.get("/api/v1/control/test/http-unknown-status", response_model=None) | ||
| async def test_http_unknown_status() -> Never: | ||
| raise HTTPException(status_code=418, detail="I'm a teapot") |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Minor: Non-ASCII character in test string.
Line 102 and 382 use "I'm a teapot" which contains a curly apostrophe or similar character. As per coding guidelines: "Avoid using emojis and other non-ASCII characters in code, comments, or documentation."
While this is test data representing the HTTP 418 response, consider using a straight ASCII apostrophe for consistency.
Suggested fix
- raise HTTPException(status_code=418, detail="I'm a teapot")
+ raise HTTPException(status_code=418, detail="I'm a teapot")- assert data["detail"] == "I'm a teapot"
+ assert data["detail"] == "I'm a teapot"Also applies to: 371-382
🤖 Prompt for AI Agents
In `@tests/integration/control/test_control_error_handling.py` around lines 100 -
102, The test uses a non-ASCII apostrophe in the string "I'm a teapot" inside
the route handler test_http_unknown_status (and the other test block around the
later occurrence), so replace the curly/non-ASCII apostrophe with a straight
ASCII apostrophe in the string literals (e.g., change to "I'm a teapot") in the
test_http_unknown_status function and the other occurrence to satisfy the
ASCII-only guideline; search for the exact phrase and update both string
occurrences to use the ASCII apostrophe.
|
|
||
|
|
||
| # --- State machine integration tests --- | ||
|
|
||
|
|
||
| @pytest.mark.asyncio | ||
| @patch("app.core.services.campaign_service._broadcast_campaign_update") | ||
| async def test_start_campaign_service_rejects_archived_state( | ||
| mock_broadcast: AsyncMock, | ||
| db_session: AsyncSession, | ||
| ) -> None: | ||
| """Test that start_campaign_service rejects transition from ARCHIVED state.""" | ||
| from app.core.state_machines import InvalidStateTransitionError | ||
| from tests.factories.hash_list_factory import HashListFactory | ||
| from tests.utils.hash_type_utils import get_or_create_hash_type | ||
|
|
||
| # Set factory sessions | ||
| CampaignFactory.__async_session__ = db_session | ||
| ProjectFactory.__async_session__ = db_session | ||
| HashListFactory.__async_session__ = db_session | ||
|
|
||
| project = await ProjectFactory.create_async() | ||
| hash_type = await get_or_create_hash_type(db_session, 0, "MD5") | ||
| hash_list = await HashListFactory.create_async( | ||
| project_id=project.id, hash_type_id=hash_type.id | ||
| ) | ||
|
|
||
| campaign = await CampaignFactory.create_async( | ||
| project_id=project.id, | ||
| hash_list_id=hash_list.id, | ||
| state=CampaignState.ARCHIVED, | ||
| ) | ||
|
|
||
| # Attempt to start an archived campaign - should raise InvalidStateTransitionError | ||
| with pytest.raises(InvalidStateTransitionError) as exc_info: | ||
| await start_campaign_service(campaign.id, db_session) | ||
|
|
||
| assert exc_info.value.from_state == CampaignState.ARCHIVED | ||
| assert exc_info.value.action == "start" | ||
|
|
||
|
|
||
| @pytest.mark.asyncio | ||
| @patch("app.core.services.campaign_service._broadcast_campaign_update") | ||
| async def test_stop_campaign_service_rejects_archived_state( | ||
| mock_broadcast: AsyncMock, | ||
| db_session: AsyncSession, | ||
| ) -> None: | ||
| """Test that stop_campaign_service rejects transition from ARCHIVED state.""" | ||
| from app.core.state_machines import InvalidStateTransitionError | ||
| from tests.factories.hash_list_factory import HashListFactory | ||
| from tests.utils.hash_type_utils import get_or_create_hash_type | ||
|
|
||
| # Set factory sessions | ||
| CampaignFactory.__async_session__ = db_session | ||
| ProjectFactory.__async_session__ = db_session | ||
| HashListFactory.__async_session__ = db_session | ||
|
|
||
| project = await ProjectFactory.create_async() | ||
| hash_type = await get_or_create_hash_type(db_session, 0, "MD5") | ||
| hash_list = await HashListFactory.create_async( | ||
| project_id=project.id, hash_type_id=hash_type.id | ||
| ) | ||
|
|
||
| campaign = await CampaignFactory.create_async( | ||
| project_id=project.id, | ||
| hash_list_id=hash_list.id, | ||
| state=CampaignState.ARCHIVED, | ||
| ) | ||
|
|
||
| # Attempt to stop an archived campaign - should raise InvalidStateTransitionError | ||
| with pytest.raises(InvalidStateTransitionError) as exc_info: | ||
| await stop_campaign_service(campaign.id, db_session) | ||
|
|
||
| assert exc_info.value.from_state == CampaignState.ARCHIVED | ||
| assert exc_info.value.action == "stop" | ||
|
|
||
|
|
||
| @pytest.mark.asyncio | ||
| @patch("app.core.services.campaign_service._broadcast_campaign_update") | ||
| async def test_stop_campaign_service_rejects_completed_state( | ||
| mock_broadcast: AsyncMock, | ||
| db_session: AsyncSession, | ||
| ) -> None: | ||
| """Test that stop_campaign_service rejects transition from COMPLETED state.""" | ||
| from app.core.state_machines import InvalidStateTransitionError | ||
| from tests.factories.hash_list_factory import HashListFactory | ||
| from tests.utils.hash_type_utils import get_or_create_hash_type | ||
|
|
||
| # Set factory sessions | ||
| CampaignFactory.__async_session__ = db_session | ||
| ProjectFactory.__async_session__ = db_session | ||
| HashListFactory.__async_session__ = db_session | ||
|
|
||
| project = await ProjectFactory.create_async() | ||
| hash_type = await get_or_create_hash_type(db_session, 0, "MD5") | ||
| hash_list = await HashListFactory.create_async( | ||
| project_id=project.id, hash_type_id=hash_type.id | ||
| ) | ||
|
|
||
| campaign = await CampaignFactory.create_async( | ||
| project_id=project.id, | ||
| hash_list_id=hash_list.id, | ||
| state=CampaignState.COMPLETED, | ||
| ) | ||
|
|
||
| # Attempt to stop a completed campaign - should raise InvalidStateTransitionError | ||
| with pytest.raises(InvalidStateTransitionError) as exc_info: | ||
| await stop_campaign_service(campaign.id, db_session) | ||
|
|
||
| assert exc_info.value.from_state == CampaignState.COMPLETED | ||
| assert exc_info.value.action == "stop" | ||
|
|
||
|
|
||
| @pytest.mark.asyncio | ||
| @patch("app.core.services.campaign_service._broadcast_campaign_update") | ||
| async def test_archive_campaign_service_rejects_error_state( | ||
| mock_broadcast: AsyncMock, | ||
| db_session: AsyncSession, | ||
| ) -> None: | ||
| """Test that archive_campaign_service rejects transition from ERROR state.""" | ||
| from app.core.state_machines import InvalidStateTransitionError | ||
| from tests.factories.hash_list_factory import HashListFactory | ||
| from tests.utils.hash_type_utils import get_or_create_hash_type | ||
|
|
||
| # Set factory sessions | ||
| CampaignFactory.__async_session__ = db_session | ||
| ProjectFactory.__async_session__ = db_session | ||
| HashListFactory.__async_session__ = db_session | ||
|
|
||
| project = await ProjectFactory.create_async() | ||
| hash_type = await get_or_create_hash_type(db_session, 0, "MD5") | ||
| hash_list = await HashListFactory.create_async( | ||
| project_id=project.id, hash_type_id=hash_type.id | ||
| ) | ||
|
|
||
| campaign = await CampaignFactory.create_async( | ||
| project_id=project.id, | ||
| hash_list_id=hash_list.id, | ||
| state=CampaignState.ERROR, | ||
| ) | ||
|
|
||
| # Attempt to archive an ERROR campaign - should raise InvalidStateTransitionError | ||
| with pytest.raises(InvalidStateTransitionError) as exc_info: | ||
| await archive_campaign_service(campaign.id, db_session) | ||
|
|
||
| assert exc_info.value.from_state == CampaignState.ERROR | ||
| assert exc_info.value.action == "archive" |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider adding test for start from COMPLETED state.
The current state machine tests cover ARCHIVED rejections well. For completeness, consider adding a test for start_campaign_service rejecting COMPLETED state, as this is also a terminal state that shouldn't allow restart without explicit relaunch flow.
Would you like me to generate a test case for test_start_campaign_service_rejects_completed_state?
🧰 Tools
🪛 Ruff (0.14.14)
[warning] 704-704: Unused function argument: mock_broadcast
(ARG001)
[warning] 740-740: Unused function argument: mock_broadcast
(ARG001)
[warning] 776-776: Unused function argument: mock_broadcast
(ARG001)
[warning] 812-812: Unused function argument: mock_broadcast
(ARG001)
🤖 Prompt for AI Agents
In `@tests/unit/test_campaign_service.py` around lines 696 - 842, Add a new unit
test mirroring the ARCHIVED rejection tests to assert start_campaign_service
also rejects the COMPLETED state: create a project and hash_list via
ProjectFactory and HashListFactory (use get_or_create_hash_type), create a
CampaignFactory entry with state=CampaignState.COMPLETED, patch
app.core.services.campaign_service._broadcast_campaign_update, then call
start_campaign_service(campaign.id, db_session) inside
pytest.raises(InvalidStateTransitionError) and assert exc_info.value.from_state
== CampaignState.COMPLETED and exc_info.value.action == "start".
| def test_campaign_full_lifecycle(self) -> None: | ||
| """Test a campaign going through its full lifecycle.""" | ||
| state = CampaignState.DRAFT | ||
|
|
||
| # Start campaign | ||
| assert CampaignStateMachine.can_transition(state, CampaignState.ACTIVE) | ||
| state = CampaignState.ACTIVE | ||
|
|
||
| # Pause campaign | ||
| assert CampaignStateMachine.can_transition(state, CampaignState.PAUSED) | ||
| state = CampaignState.PAUSED | ||
|
|
||
| # Resume campaign | ||
| assert CampaignStateMachine.can_transition(state, CampaignState.ACTIVE) | ||
| state = CampaignState.ACTIVE | ||
|
|
||
| # Complete campaign (system transition) | ||
| assert CampaignStateMachine.can_transition(state, CampaignState.COMPLETED) | ||
| state = CampaignState.COMPLETED | ||
|
|
||
| # Archive campaign | ||
| assert CampaignStateMachine.can_transition(state, CampaignState.ARCHIVED) | ||
| state = CampaignState.ARCHIVED | ||
|
|
||
| # Unarchive campaign | ||
| assert CampaignStateMachine.can_transition(state, CampaignState.DRAFT) | ||
| state = CampaignState.DRAFT | ||
|
|
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Use factories for test data instead of inline values.
This suite directly constructs enum/state values and action strings; per test guidelines, please use factories/fixtures from tests/factories/ or helpers from tests/utils/test_helpers.py to generate test data consistently.
As per coding guidelines, "tests/**/*.py: Use factories for all test data instead of manual creation".
🤖 Prompt for AI Agents
In `@tests/unit/test_state_machines.py` around lines 699 - 726, Replace manual
enum assignments in test_campaign_full_lifecycle with factory/fixture-generated
states: use the test factory or helper that creates Campaign instances or
campaign states (e.g., the campaign factory in tests/factories or helper in
tests/utils/test_helpers.py) and call CampaignStateMachine.can_transition
against the factory-produced object's state rather than hardcoding
CampaignState.DRAFT / ACTIVE / PAUSED / COMPLETED / ARCHIVED; update references
inside test_campaign_full_lifecycle and any assertions to pull the current state
from the factory object (or helper) before invoking
CampaignStateMachine.can_transition so the test follows the project's
factory-based test data pattern.
- Document exception architecture pattern (services → domain exceptions, Web API → HTTPException, Control API → RFC9457 middleware) - Document TypeScript global augmentation eslint workaround Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add skip_locked=True to cleanup query to prevent race conditions
- Fix grammar in state transition errors ("from state 'X'" not "an X")
- Add InvalidStateTransitionError handling to attack delete endpoint
- Wrap blocking stat_object in asyncio.to_thread for async safety
- Add authorization check for unrestricted resources (require superuser)
- Change user_email to user_id in logs for PII/GDPR compliance
- Add ge=1 validation constraints to cleanup job settings
- Update OpenAPI examples to match RFC9457 response format
- Add structured docstrings with Args/Returns/Raises sections
- Replace Unicode arrows with ASCII in state machine docstrings
- Guard against unknown states in is_terminal_state()
- Fix duplicate contextlib import in main.py
- Fix unused variable warnings in test files
- Update test assertions for new error message format
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In `@app/api/v1/endpoints/control/resources.py`:
- Around line 79-111: Add per-user and per-IP rate limiting to the Control API
by integrating a rate limiter (e.g., slowapi) and applying it to this
router/endpoints: import and configure the Limiter and LimiterMiddleware in the
app setup, register the limiter with the Control API router, then annotate the
cancel_pending_resource_upload endpoint (and other control endpoints) with the
limiter decorator (e.g., `@limiter.limit`(...)) so requests authenticated via
get_current_control_user and anonymous IPs are rate limited; ensure limits are
keyed per-user when current_user is present and fall back to per-IP, and confirm
cancel_pending_resource_upload and its underlying cancel_pending_resource
respect the limit.
In `@app/core/services/resource_service.py`:
- Around line 1796-1801: The current cancel_pending_resource path calls
cleanup_stale_resource(resource, db) and raises InternalServerError on a False
return, losing the original exception context; update the flow so
cleanup_stale_resource either re-raises the caught exception (so
cancel_pending_resource can let it bubble or wrap it) or change
cleanup_stale_resource to return (success, error) or an exception object instead
of just False; then adjust cancel_pending_resource to include that exception
detail when raising InternalServerError (reference cleanup_stale_resource and
cancel_pending_resource) so logs/error messages contain the original exception
information for debugging.
In `@app/core/state_machines.py`:
- Around line 377-390: Create a new exception class UnknownAttackStateError
(subclassing ValueError) whose __init__ sets the message "Unknown attack state:
{state}" and use it in is_terminal_state instead of raising ValueError directly;
update the is_terminal_state method to raise UnknownAttackStateError(state) when
state not in cls.TRANSITIONS, and ensure the exception class is defined in the
same module (near other state-related types) so callers of is_terminal_state can
import/handle UnknownAttackStateError.
In `@app/core/tasks/resource_tasks.py`:
- Around line 90-97: The stale-resource query in resource_tasks.py selects
AttackResourceFile rows across all projects; scope it by project_id to preserve
tenant isolation. Modify the db.execute select(AttackResourceFile)... query used
in the cleanup routine (the call that uses cutoff_time and
with_for_update(skip_locked=True)) to include a
.where(AttackResourceFile.project_id == project_id) filter (or, if this task
must run across projects, refactor to iterate projects under an explicit system
context and apply the same query per project). Ensure the function that calls
db.execute accepts or resolves a project_id and use that identifier when
building the select to enforce project-scoped deletes.
In `@tests/unit/test_state_machines.py`:
- Around line 782-803: Add an assertion checking the RFC9457 problem carries the
HTTP conflict status: after constructing InvalidStateTransitionProblem in
test_error_conversion_to_rfc9457, assert the problem's HTTP status attribute
from the ConflictProblem base (e.g., problem.status_code == 409 or
problem.status == 409 depending on the project convention) to ensure transitions
produce a 409 Conflict; locate this check immediately after the
InvalidStateTransitionProblem creation in the test and use the existing symbols
InvalidStateTransitionError, InvalidStateTransitionProblem,
CampaignStateMachine.validate_transition, and ConflictProblem to guide
placement.
| async def cancel_pending_resource_upload( | ||
| resource_id: UUID, | ||
| db: Annotated[AsyncSession, Depends(get_db)], | ||
| current_user: Annotated[User, Depends(get_current_control_user)], | ||
| ) -> Response: | ||
| """ | ||
| Cancel a pending resource upload. | ||
|
|
||
| This endpoint allows users to manually cancel a resource upload that is still | ||
| in the pending state (not yet marked as uploaded). This is useful for cleaning | ||
| up abandoned uploads without waiting for the automatic cleanup job. | ||
|
|
||
| The operation will: | ||
| 1. Validate the user has access to the resource's project | ||
| 2. Verify the resource is in pending state (is_uploaded=False) | ||
| 3. Delete any associated object from MinIO storage | ||
| 4. Delete the resource record from the database | ||
|
|
||
| Args: | ||
| resource_id: UUID of the resource to cancel. | ||
| db: Database session (injected). | ||
| current_user: Authenticated user from API key (injected). | ||
|
|
||
| Returns: | ||
| Response: 204 No Content on success. | ||
|
|
||
| Raises: | ||
| ResourceNotFoundError: If resource doesn't exist. | ||
| ProjectAccessDeniedError: If user lacks project access. | ||
| InvalidResourceStateError: If resource is already uploaded. | ||
| """ | ||
| await cancel_pending_resource(resource_id, db, current_user) | ||
| return Response(status_code=status.HTTP_204_NO_CONTENT) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if rate limiting is configured for control API routes
rg -n "rate_limit|RateLimit|throttle" app/api/v1/endpoints/control/ app/core/deps.py --type py -C 2Repository: unclesp1d3r/Ouroboros
Length of output: 47
🏁 Script executed:
#!/bin/bash
# Search more broadly for rate limiting configuration in the app
rg -n "rate" app/ --type py -i | head -20
# Also check if there's any middleware configuration
fd -e py -path "*/middleware*" app/
# Check for any decorator patterns that might be rate limiting
rg -n "@.*limit|@.*throttle|@.*rate" app/ --type py -C 1 | head -30Repository: unclesp1d3r/Ouroboros
Length of output: 2239
🏁 Script executed:
#!/bin/bash
# Check for middleware configuration in main app setup
fd -name "main.py" app/ -o -name "app.py" app/ | xargs cat
# Check pyproject.toml for rate limiting dependencies
grep -i "rate\|limit\|throttle" pyproject.toml
# Check control router configuration
rg -n "router\|APIRouter" app/api/v1/endpoints/control/ --type py -A 3 | head -30Repository: unclesp1d3r/Ouroboros
Length of output: 295
🏁 Script executed:
#!/bin/bash
# Find main app setup files
find app -maxdepth 2 -name "main.py" -o -name "app.py" | xargs cat 2>/dev/null | head -100
# Check pyproject.toml for rate limiting dependencies
echo "=== Checking pyproject.toml for rate limiting packages ==="
grep -i "rate\|limit\|throttle\|slowapi" pyproject.toml
# Check control router configuration
echo "=== Control router configuration ==="
rg "router\|APIRouter" app/api/v1/endpoints/control/ --type py -A 3 | head -50Repository: unclesp1d3r/Ouroboros
Length of output: 3799
🏁 Script executed:
#!/bin/bash
# Find and check the control router setup
find app/api/v1 -name "*.py" | xargs grep -l "APIRouter\|router.*=" | head -10
# Check the main router file that includes all endpoints
cat app/api/v1/router.py 2>/dev/null | head -50
# Search for rate limiting more broadly (slowing, limiting requests, etc)
rg "SlowAPI|limiter|HTTPException.*429" app/ --type py -C 2Repository: unclesp1d3r/Ouroboros
Length of output: 1532
🏁 Script executed:
#!/bin/bash
# Check the control router file
cat app/api/v1/endpoints/control/router.py 2>/dev/null
# Also check if there are any rate limiting dependencies at all
cat pyproject.toml | grep -A 200 "dependencies\|tool.poetry" | grep -i "rate\|limit\|slowapi\|throttle"Repository: unclesp1d3r/Ouroboros
Length of output: 1063
Rate limiting not implemented for Control API endpoints; required per coding guidelines.
Control API endpoints currently lack rate limiting despite the coding guideline requirement to "apply per-user and per-IP rate limiting" for all /api/** endpoints. This has been verified to be missing at:
- Router level (no decorators or middleware in endpoint routers)
- Middleware level (no rate limiting middleware configured in main app.py)
- Dependency level (no rate limiting packages in pyproject.toml)
Implement per-user and per-IP rate limiting using a suitable library (e.g., slowapi) and apply it to the Control API router and/or individual endpoints to comply with project guidelines.
🤖 Prompt for AI Agents
In `@app/api/v1/endpoints/control/resources.py` around lines 79 - 111, Add
per-user and per-IP rate limiting to the Control API by integrating a rate
limiter (e.g., slowapi) and applying it to this router/endpoints: import and
configure the Limiter and LimiterMiddleware in the app setup, register the
limiter with the Control API router, then annotate the
cancel_pending_resource_upload endpoint (and other control endpoints) with the
limiter decorator (e.g., `@limiter.limit`(...)) so requests authenticated via
get_current_control_user and anonymous IPs are rate limited; ensure limits are
keyed per-user when current_user is present and fall back to per-IP, and confirm
cancel_pending_resource_upload and its underlying cancel_pending_resource
respect the limit.
| # Perform cleanup | ||
| success = await cleanup_stale_resource(resource, db) | ||
| if not success: | ||
| from app.core.control_exceptions import InternalServerError | ||
|
|
||
| raise InternalServerError(detail="Failed to clean up resource") |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Cleanup failure silently caught and re-raised as InternalServerError.
When cleanup_stale_resource returns False, an InternalServerError is raised. However, the original exception context is lost since cleanup_stale_resource catches all exceptions and returns False. This makes debugging production issues harder.
Consider either:
- Having
cleanup_stale_resourcere-raise after logging, or - Returning more context (e.g., the exception) so
cancel_pending_resourcecan include it in the error detail.
Alternative: Propagate exception for better observability
- # Perform cleanup
- success = await cleanup_stale_resource(resource, db)
- if not success:
- from app.core.control_exceptions import InternalServerError
-
- raise InternalServerError(detail="Failed to clean up resource")
+ # Perform cleanup - let exceptions propagate for better error context
+ try:
+ success = await cleanup_stale_resource(resource, db)
+ if not success:
+ from app.core.control_exceptions import InternalServerError
+ raise InternalServerError(detail="Failed to clean up resource")
+ except Exception as e:
+ from app.core.control_exceptions import InternalServerError
+ raise InternalServerError(detail=f"Failed to clean up resource: {e}") from e🤖 Prompt for AI Agents
In `@app/core/services/resource_service.py` around lines 1796 - 1801, The current
cancel_pending_resource path calls cleanup_stale_resource(resource, db) and
raises InternalServerError on a False return, losing the original exception
context; update the flow so cleanup_stale_resource either re-raises the caught
exception (so cancel_pending_resource can let it bubble or wrap it) or change
cleanup_stale_resource to return (success, error) or an exception object instead
of just False; then adjust cancel_pending_resource to include that exception
detail when raising InternalServerError (reference cleanup_stale_resource and
cancel_pending_resource) so logs/error messages contain the original exception
information for debugging.
| def is_terminal_state(cls, state: AttackState) -> bool: | ||
| """Check if a state is terminal (no outgoing transitions). | ||
|
|
||
| Args: | ||
| state: The state to check. | ||
|
|
||
| Returns: | ||
| True if the state is terminal, False otherwise. | ||
|
|
||
| Raises: | ||
| ValueError: If the state is not a known AttackState. | ||
| """ | ||
| if state not in cls.TRANSITIONS: | ||
| raise ValueError(f"Unknown attack state: {state}") |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, check if the file exists and examine the relevant section
if [ -f "app/core/state_machines.py" ]; then
echo "=== File found. Checking lines 377-390 ==="
sed -n '377,390p' app/core/state_machines.py
echo ""
echo "=== Context: lines 370-395 ==="
sed -n '370,395p' app/core/state_machines.py
else
echo "File not found at app/core/state_machines.py"
find . -name "state_machines.py" -type f
fiRepository: unclesp1d3r/Ouroboros
Length of output: 1241
🏁 Script executed:
# Run the requested ruff check for TRY003
if [ -f "app/core/state_machines.py" ]; then
echo "=== Running: ruff check app/core/state_machines.py --select TRY003 ==="
ruff check app/core/state_machines.py --select TRY003 2>&1 || true
echo ""
echo "=== For context, running full ruff check to see all violations ==="
ruff check app/core/state_machines.py 2>&1 | head -30 || true
fiRepository: unclesp1d3r/Ouroboros
Length of output: 1263
🏁 Script executed:
# Check the class structure to understand the context
if [ -f "app/core/state_machines.py" ]; then
echo "=== Searching for class definition and is_terminal_state method ==="
rg -n "class.*State|def is_terminal_state" app/core/state_machines.py -A 2
fiRepository: unclesp1d3r/Ouroboros
Length of output: 592
Move the exception message into a dedicated exception class to fix TRY003 linting violation.
Ruff flags the inline ValueError message in is_terminal_state() at line 390. Create a dedicated UnknownAttackStateError exception class with the message in its __init__ method, then raise that instead. This centralizes exception messages and satisfies ruff's TRY003 rule.
Proposed fix
+class UnknownAttackStateError(ValueError):
+ def __init__(self, state: AttackState) -> None:
+ super().__init__(f"Unknown attack state: {state}")
+
@@
- if state not in cls.TRANSITIONS:
- raise ValueError(f"Unknown attack state: {state}")
+ if state not in cls.TRANSITIONS:
+ raise UnknownAttackStateError(state)🧰 Tools
🪛 Ruff (0.14.14)
[warning] 390-390: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
In `@app/core/state_machines.py` around lines 377 - 390, Create a new exception
class UnknownAttackStateError (subclassing ValueError) whose __init__ sets the
message "Unknown attack state: {state}" and use it in is_terminal_state instead
of raising ValueError directly; update the is_terminal_state method to raise
UnknownAttackStateError(state) when state not in cls.TRANSITIONS, and ensure the
exception class is defined in the same module (near other state-related types)
so callers of is_terminal_state can import/handle UnknownAttackStateError.
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
AGENTS.md (1)
1-288:⚠️ Potential issue | 🟡 MinorAdd language specifiers to 7 fenced code blocks and replace non-ASCII box-drawing characters.
AGENTS.md has two documentation quality issues:
Missing language specifiers (MD040) – 7 code blocks lack language identifiers at lines 61, 89, 124, 144, 211, 232, 240. Add language tags (e.g.,
\``python,```bash,```text`) for syntax highlighting.Non-ASCII characters – The project structure diagram (lines 47–55) uses box-drawing characters (├, │, └, ─) which violate the coding guideline requiring ASCII-only content in
.mdfiles. Replace with ASCII equivalents (e.g.,|--for tree branching).
🤖 Fix all issues with AI agents
In `@AGENTS.md`:
- Around line 136-143: The fenced JSON example in the AGENTS.md snippet is
missing a language specifier which triggers markdownlint MD040; update the
fenced code block surrounding the JSON example (the block starting with ``` and
the JSON object containing "type","title","status","detail","instance") to
include a language specifier (use "json") so it becomes ```json and enables
proper syntax highlighting and linting compliance.
In `@CLAUDE.md`:
- Line 1: Add a new section to AGENTS.md (or .cursor/rules) documenting the
`@filename` convention, explicitly describing the `@AGENTS.md` token and which AI
systems/tools consume it (e.g., Claude via CLAUDE.md), stating that CLAUDE.md is
the Claude counterpart to WARP.md and GEMINI.md (for Gemini), and explaining how
these files tie into API v1 contract expectations and agent behavior governance;
update CLAUDE.md to reference AGENTS.md as the authoritative rules and include a
short note on how tooling should parse `@filename` tokens and resolve
agent-specific guidance.
| ```python | ||
| { | ||
| "type": "https://example.com/problems/invalid-request", | ||
| "title": "Invalid Request", | ||
| "status": 400, | ||
| "detail": "The request parameters are invalid", | ||
| "instance": "/api/v1/control/campaigns/123", | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider adding language specifier to JSON code block.
The JSON example would benefit from a language specifier for better syntax highlighting and markdown compliance.
📝 Proposed improvement for markdown quality
-```python
+```json
{
"type": "https://example.com/problems/invalid-request",
"title": "Invalid Request",As per static analysis hints, markdownlint-cli2 reports fenced code blocks should have a language specified (MD040).
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ```python | |
| { | |
| "type": "https://example.com/problems/invalid-request", | |
| "title": "Invalid Request", | |
| "status": 400, | |
| "detail": "The request parameters are invalid", | |
| "instance": "/api/v1/control/campaigns/123", | |
| } |
🧰 Tools
🪛 markdownlint-cli2 (0.20.0)
[warning] 137-137: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
In `@AGENTS.md` around lines 136 - 143, The fenced JSON example in the AGENTS.md
snippet is missing a language specifier which triggers markdownlint MD040;
update the fenced code block surrounding the JSON example (the block starting
with ``` and the JSON object containing
"type","title","status","detail","instance") to include a language specifier
(use "json") so it becomes ```json and enables proper syntax highlighting and
linting compliance.
| @@ -0,0 +1 @@ | |||
| @AGENTS.md | |||
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for similar `@-reference` patterns in markdown files
echo "=== Searching for `@-reference` patterns ==="
rg '@[A-Z]+\.md' -g '*.md' || echo "No matches found"
echo ""
echo "=== Searching for documentation about CLAUDE.md ==="
rg 'CLAUDE\.md' -A 3 -B 3 -g '*.md' || echo "No matches found"
echo ""
echo "=== Searching for AGENTS.md references ==="
rg 'AGENTS\.md' -A 3 -B 3 -g '*.md' || echo "No matches found"
echo ""
echo "=== Checking if CLAUDE.md is newly created ==="
git ls-files CLAUDE.md 2>/dev/null && echo "File exists in repo" || echo "File may be new"
echo ""
echo "=== List all markdown files in repo root ==="
git ls-files '*.md' 2>/dev/null | head -20Repository: unclesp1d3r/Ouroboros
Length of output: 3261
Document the @AGENTS.md convention in AGENTS.md or .cursor/rules.
The @AGENTS.md syntax in CLAUDE.md appears to be agent-specific guidance (likely for Claude), but lacks documentation. This creates discoverability and maintenance issues. Given AGENTS.md is defined as the "authoritative rules" for AI assistants and contributors, add a section documenting:
- The
@filenameconvention and which AI systems/tools consume it - That CLAUDE.md is the counterpart to WARP.md (GEMINI.md rules for Gemini, CLAUDE.md rules for Claude)
- How this relates to API contracts and agent behavior governance
This ensures new contributors and tooling understand the convention and its role in API v1 contract integrity.
🤖 Prompt for AI Agents
In `@CLAUDE.md` at line 1, Add a new section to AGENTS.md (or .cursor/rules)
documenting the `@filename` convention, explicitly describing the `@AGENTS.md` token
and which AI systems/tools consume it (e.g., Claude via CLAUDE.md), stating that
CLAUDE.md is the Claude counterpart to WARP.md and GEMINI.md (for Gemini), and
explaining how these files tie into API v1 contract expectations and agent
behavior governance; update CLAUDE.md to reference AGENTS.md as the
authoritative rules and include a short note on how tooling should parse
`@filename` tokens and resolve agent-specific guidance.
Add comprehensive tests for the new get_valid_actions method on both CampaignStateMachine and AttackStateMachine. Tests verify correct action lists are returned for each state, ensuring error messages can accurately inform users of valid actions. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add get_valid_actions() to CampaignStateMachine and AttackStateMachine to return valid action names for a given state - Update error messages in campaigns.py and attacks.py to include valid actions when state transitions fail - Add missing db.commit() after resource deletion in cancel_pending_resource - Handle asyncio.CancelledError in cleanup_stale_resource to allow graceful task cancellation - Add InvalidResourceStateError to RFC9457 middleware exception handler - Simplify extension field handling in middleware (remove hasattr checks) - Narrow exception handling in attack_service to catch only ValueError - Improve InvalidResourceStateError docstring - Remove orphaned Raises documentation from list_campaigns_service Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
app/api/v1/endpoints/web/campaigns.py (1)
419-429:⚠️ Potential issue | 🟠 MajorAdd missing InvalidStateTransitionError handler to archive_campaign endpoint.
The
archive_campaign_serviceraisesInvalidStateTransitionErrorfor invalid state transitions (documented at line 582 in campaign_service.py), but this endpoint only catchesCampaignNotFoundError. Unhandled exceptions will propagate as HTTP 500, violating the error handling requirement to never leak internal errors to API clients.Other similar endpoints (
start_campaign,stop_campaign) already handle this exception with a 400 Bad Request response. Add the same exception handler here:Proposed fix
async def archive_campaign( campaign_id: int, db: Annotated[AsyncSession, Depends(get_db)], current_user: Annotated[User, Depends(get_current_user)], ) -> Response: # Return Response for 204 No Content try: await _check_user_has_access_to_campaign(campaign_id, "write", db, current_user) await archive_campaign_service(campaign_id, db) return Response(status_code=status.HTTP_204_NO_CONTENT) except CampaignNotFoundError as e: raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=str(e)) from e + except InvalidStateTransitionError as e: + from app.models.campaign import CampaignState + + if isinstance(e.from_state, CampaignState): + valid_actions = CampaignStateMachine.get_valid_actions(e.from_state) + detail = f"Cannot archive campaign from state '{e.from_state.value}'. Valid actions: {valid_actions}" + else: + detail = f"Cannot archive campaign from state '{e.from_state.value}'." + raise HTTPException( + status_code=status.HTTP_400_BAD_REQUEST, + detail=detail, + ) from e
🤖 Fix all issues with AI agents
In `@app/api/v1/endpoints/web/attacks.py`:
- Around line 520-531: Move the local import of AttackState to the module-level
imports: remove the inline "from app.models.attack import AttackState" inside
the except block and add "from app.models.attack import AttackState" near the
other top-level imports; leave the except handler logic unchanged (it should
still catch InvalidStateTransitionError, reference e.from_state, call
AttackStateMachine.get_valid_actions, and raise the HTTPException) so
AttackState is available for the isinstance check without performing an inline
import.
In `@app/core/services/resource_service.py`:
- Around line 1666-1668: The service function names don't follow the
{action}_{resource}_service convention: rename cleanup_stale_resource to
cleanup_resource_service and cancel_pending_resource to
cancel_pending_resource_service (or the chosen {action}_{resource}_service
variants) in app/core/services/resource_service.py and update all call sites,
imports, and tests to reference the new names; ensure any async signatures
(async def cleanup_stale_resource/ cancel_pending_resource and their references)
remain unchanged except for the identifier, and run project-wide grep/IDE
refactor to replace usages so there are no broken imports or unresolved
references.
- Around line 1731-1742: The exception handler that logs failures when cleaning
up stale pending resources must rollback the AsyncSession to avoid leaving it in
a failed state; modify the except Exception block (the one that logs with
logger.bind(resource_id=resource_id, age_hours=...,
project_id=resource.project_id, ...).error(...)) to call await
session.rollback() (or session.rollback() if synchronous) before logging and
returning False, and for more precision consider catching
sqlalchemy.exc.SQLAlchemyError separately to ensure rollback is performed for
DB-related errors while still re-raising asyncio.CancelledError as before.
- Around line 1770-1792: The resource is being loaded via
db.get(AttackResourceFile, resource_id) before any tenant scoping, risking
cross-tenant visibility; change the lookup to include project scoping by
querying AttackResourceFile filtered to the caller's allowed project_ids
(derived from current_user.project_associations) and include unrestricted
resources (project_id IS NULL) only when current_user.is_superuser; if the
caller provided an explicit project restriction, use that to filter as well, and
preserve the current error semantics (raise ResourceNotFoundError when no result
after scoped lookup, or raise ProjectAccessDeniedError only when a non-superuser
attempts to access an unrestricted resource) so that ownership checks occur in
the query rather than after db.get.
In `@tests/unit/test_state_machines.py`:
- Around line 862-883: The test test_error_conversion_to_rfc9457 currently
swallows failures by using try/except; replace the try/except with a
pytest.raises(InvalidStateTransitionError) context around
CampaignStateMachine.validate_transition to assert the exception is raised,
capture the exception via the context manager (e.g., as excinfo) and then use
excinfo.value (the InvalidStateTransitionError) to construct the
InvalidStateTransitionProblem and run the same assertions; this ensures
CampaignStateMachine.validate_transition actually raises
InvalidStateTransitionError and the RFC9457 conversion is validated.
| except InvalidStateTransitionError as e: | ||
| from app.models.attack import AttackState | ||
|
|
||
| if isinstance(e.from_state, AttackState): | ||
| valid_actions = AttackStateMachine.get_valid_actions(e.from_state) | ||
| detail = f"Cannot abort attack from state '{e.from_state.value}'. Valid actions: {valid_actions}" | ||
| else: | ||
| detail = f"Cannot abort attack from state '{e.from_state.value}'." | ||
| raise HTTPException( | ||
| status_code=status.HTTP_400_BAD_REQUEST, | ||
| detail=detail, | ||
| ) from e |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Move inline import to module level for consistency.
The from app.models.attack import AttackState import at line 521 is functional but deviates from the module's import organization. Since AttackState is used for type checking and this handler may be invoked frequently, consider moving it to the module-level imports.
Proposed fix
At module level (around line 48):
from app.core.state_machines import AttackStateMachine, InvalidStateTransitionError
from app.db.session import get_db
-from app.models.attack import Attack
+from app.models.attack import Attack, AttackState
from app.models.campaign import CampaignThen in the handler:
except InvalidStateTransitionError as e:
- from app.models.attack import AttackState
-
if isinstance(e.from_state, AttackState):🤖 Prompt for AI Agents
In `@app/api/v1/endpoints/web/attacks.py` around lines 520 - 531, Move the local
import of AttackState to the module-level imports: remove the inline "from
app.models.attack import AttackState" inside the except block and add "from
app.models.attack import AttackState" near the other top-level imports; leave
the except handler logic unchanged (it should still catch
InvalidStateTransitionError, reference e.from_state, call
AttackStateMachine.get_valid_actions, and raise the HTTPException) so
AttackState is available for the isinstance check without performing an inline
import.
| async def cleanup_stale_resource( | ||
| resource: AttackResourceFile, db: AsyncSession | ||
| ) -> bool: |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Rename service functions to the {action}_{resource}_service convention.
cleanup_stale_resource and cancel_pending_resource do not follow the required service naming pattern. Consider renaming to cleanup_resource_service and cancel_pending_resource_service (or similar) and update call sites.
As per coding guidelines, "app/core/services/**: Service Layer - focus on business logic patterns: Functions named {action}_{resource}_service()."
Also applies to: 1745-1747
🤖 Prompt for AI Agents
In `@app/core/services/resource_service.py` around lines 1666 - 1668, The service
function names don't follow the {action}_{resource}_service convention: rename
cleanup_stale_resource to cleanup_resource_service and cancel_pending_resource
to cancel_pending_resource_service (or the chosen {action}_{resource}_service
variants) in app/core/services/resource_service.py and update all call sites,
imports, and tests to reference the new names; ensure any async signatures
(async def cleanup_stale_resource/ cancel_pending_resource and their references)
remain unchanged except for the identifier, and run project-wide grep/IDE
refactor to replace usages so there are no broken imports or unresolved
references.
| except asyncio.CancelledError: | ||
| # Re-raise to allow graceful task cancellation | ||
| raise | ||
| except Exception as e: # noqa: BLE001 - Defensive catch-all for background cleanup | ||
| logger.bind( | ||
| resource_id=resource_id, | ||
| age_hours=round(age_hours, 2), | ||
| project_id=resource.project_id, | ||
| error=str(e), | ||
| error_type=type(e).__name__, | ||
| ).error("Failed to clean up stale pending resource") | ||
| return False |
There was a problem hiding this comment.
Rollback the session on cleanup failures.
If a SQLAlchemy error occurs, returning False without a rollback can leave the AsyncSession in a failed state for subsequent operations.
Proposed fix
except Exception as e: # noqa: BLE001 - Defensive catch-all for background cleanup
+ await db.rollback()
logger.bind(
resource_id=resource_id,
age_hours=round(age_hours, 2),
project_id=resource.project_id,🤖 Prompt for AI Agents
In `@app/core/services/resource_service.py` around lines 1731 - 1742, The
exception handler that logs failures when cleaning up stale pending resources
must rollback the AsyncSession to avoid leaving it in a failed state; modify the
except Exception block (the one that logs with
logger.bind(resource_id=resource_id, age_hours=...,
project_id=resource.project_id, ...).error(...)) to call await
session.rollback() (or session.rollback() if synchronous) before logging and
returning False, and for more precision consider catching
sqlalchemy.exc.SQLAlchemyError separately to ensure rollback is performed for
DB-related errors while still re-raising asyncio.CancelledError as before.
| def test_error_conversion_to_rfc9457(self) -> None: | ||
| """Test converting InvalidStateTransitionError to RFC9457 problem.""" | ||
| try: | ||
| CampaignStateMachine.validate_transition( | ||
| CampaignState.DRAFT, CampaignState.PAUSED, action="pause" | ||
| ) | ||
| except InvalidStateTransitionError as e: | ||
| problem = InvalidStateTransitionProblem( | ||
| from_state=e.from_state.value, | ||
| to_state=e.to_state.value, | ||
| action=e.action, | ||
| entity_type="campaign", | ||
| valid_transitions=[ | ||
| s.value | ||
| for s in CampaignStateMachine.get_valid_transitions(e.from_state) | ||
| ], | ||
| ) | ||
| assert problem.current_state == "draft" | ||
| assert problem.attempted_state == "paused" | ||
| assert problem.action == "pause" | ||
| assert "active" in problem.valid_transitions | ||
| assert "archived" in problem.valid_transitions |
There was a problem hiding this comment.
Ensure the RFC9457 conversion test fails when no exception is raised.
The current try/except lets the test pass if validate_transition stops raising, which creates a false positive. Use pytest.raises to enforce the expectation.
Proposed fix
- try:
- CampaignStateMachine.validate_transition(
- CampaignState.DRAFT, CampaignState.PAUSED, action="pause"
- )
- except InvalidStateTransitionError as e:
- problem = InvalidStateTransitionProblem(
- from_state=e.from_state.value,
- to_state=e.to_state.value,
- action=e.action,
- entity_type="campaign",
- valid_transitions=[
- s.value
- for s in CampaignStateMachine.get_valid_transitions(e.from_state)
- ],
- )
- assert problem.current_state == "draft"
- assert problem.attempted_state == "paused"
- assert problem.action == "pause"
- assert "active" in problem.valid_transitions
- assert "archived" in problem.valid_transitions
+ with pytest.raises(InvalidStateTransitionError) as exc_info:
+ CampaignStateMachine.validate_transition(
+ CampaignState.DRAFT, CampaignState.PAUSED, action="pause"
+ )
+ e = exc_info.value
+ problem = InvalidStateTransitionProblem(
+ from_state=e.from_state.value,
+ to_state=e.to_state.value,
+ action=e.action,
+ entity_type="campaign",
+ valid_transitions=[
+ s.value
+ for s in CampaignStateMachine.get_valid_transitions(e.from_state)
+ ],
+ )
+ assert problem.current_state == "draft"
+ assert problem.attempted_state == "paused"
+ assert problem.action == "pause"
+ assert "active" in problem.valid_transitions
+ assert "archived" in problem.valid_transitions🤖 Prompt for AI Agents
In `@tests/unit/test_state_machines.py` around lines 862 - 883, The test
test_error_conversion_to_rfc9457 currently swallows failures by using
try/except; replace the try/except with a
pytest.raises(InvalidStateTransitionError) context around
CampaignStateMachine.validate_transition to assert the exception is raised,
capture the exception via the context manager (e.g., as excinfo) and then use
excinfo.value (the InvalidStateTransitionError) to construct the
InvalidStateTransitionProblem and run the same assertions; this ensures
CampaignStateMachine.validate_transition actually raises
InvalidStateTransitionError and the RFC9457 conversion is validated.
Add skill file capturing Ouroboros development patterns: - Commit conventions (conventional commits format) - Three-API architecture (Agent, Web UI, Control) - Service layer patterns with domain exceptions - RFC9457 error handling for Control API - State machine patterns for Campaign/Attack lifecycle - Testing tier selection guidance - Protected files documentation - Required library substitutions
- Delete frontend/.eslintrc.local.js (not referenced anywhere) - Add .reports/dead-code-analysis.md documenting findings - Backend Python code is clean (no unused imports/variables) - 156 unused Shadcn-UI files kept for future use
- Remove duplicate HTTPException handler registration in main.py - Improve resource cleanup task to minimize lock duration: - Query resource IDs first without holding locks - Lock each resource individually with skip_locked=True - Commit after each successful deletion - Handle race conditions where resource was deleted/uploaded by another worker
- Add InvalidStateTransitionError handler to archive_campaign endpoint - Move AttackState import to module level in attacks.py - Add session rollback in cleanup_stale_resource exception handler
- block-push-main: Block direct pushes to main branch - warn-stdlib-logging: Warn when using stdlib logging instead of loguru - warn-utcnow: Warn when using deprecated datetime.utcnow() - warn-lru-cache: Warn when using functools.lru_cache instead of cashews
Error handling improvements: - Narrow exception handlers in resource cleanup to specific types (S3Error, SQLAlchemyError, OSError, ConnectionError) - Add error_type to all exception logs for better debugging - Add debug logging to RFC9457 middleware for unhandled exceptions Race condition fix: - Add with_for_update() row locking in cancel_pending_resource to prevent concurrent modification Logging improvements: - Add debug logging for silent return paths in verify_upload_and_cleanup - Add structured logging with bind() for all error handlers
Add type guard to narrow CampaignState | AttackState union before calling CampaignStateMachine.get_valid_actions, matching the pattern used in start_campaign and stop_campaign handlers. Fixes basedpyright error in CI.
Add Alembic migration to add 'PAUSED' value to the PostgreSQL attackstate enum type. This is required for the attack state machine to support pause/resume functionality. Also clean up minor test code style issue in state machine tests.
The blocked items are repetitive and just documentation.
This pull request introduces RFC9457-compliant error handling for the Control API, adds a new endpoint for manual cancellation of pending resource uploads, and improves state-machine-based validation of state transitions for campaigns and attacks. The most important changes are grouped below by theme.
RFC9457 Problem Details Error Handling for Control API:
application/problem+jsonformat with extension fields. (app/core/control_rfc9457_middleware.py,app/api/v1/endpoints/agent/v1_http_exception_handler.py) [1] [2] [3] [4]InvalidStateTransitionProblemandInvalidResourceStateErrorexceptions to provide detailed machine-readable error information for invalid state transitions and resource errors. (app/core/control_exceptions.py)Resource Management Improvements:
app/api/v1/endpoints/control/resources.py,app/api/v1/endpoints/control/router.py) [1] [2] [3]app/core/config.py)State Machine Validation for Campaigns and Attacks:
CampaignStateMachine, raisingInvalidStateTransitionErrorfor invalid actions instead of generic HTTP exceptions. (app/core/services/campaign_service.py) [1] [2] [3] [4] [5]AttackStateMachine, ensuring only valid state changes are allowed. (app/core/services/attack_service.py) [1] [2]InvalidStateTransitionErrorin campaign endpoints to return appropriate HTTP 400 errors with clear messages. (app/api/v1/endpoints/web/campaigns.py) [1] [2] [3]