34 changes: 34 additions & 0 deletions .claude/skills/aurelio-review-pr/SKILL.md
@@ -94,6 +94,40 @@ Based on changed files, launch applicable review agents **in parallel** using th
| **type-design-analyzer** | Type annotations or classes added/modified | `pr-review-toolkit:type-design-analyzer` |
| **logging-audit** | Any `.py` file in `src/` changed | `pr-review-toolkit:code-reviewer` |
| **resilience-audit** | Provider-layer `.py` files changed (`src/ai_company/providers/`) | `pr-review-toolkit:code-reviewer` |
| **docs-consistency** | **ALWAYS** — runs on every PR regardless of change type | `pr-review-toolkit:code-reviewer` |

The **docs-consistency** agent ensures project documentation never drifts from the codebase. It runs on **every PR** — code changes, config changes, docs-only changes, all of them.

**What to check:**

Read the current `DESIGN_SPEC.md`, `CLAUDE.md`, and `README.md` in full. Then compare them against the PR diff and the actual current state of the codebase. Flag anything that is now inaccurate, incomplete, or missing.

**DESIGN_SPEC.md (CRITICAL — this is the project's source of truth):**
1. §15.3 Project Structure — does it match the actual files/directories under `src/ai_company/`? Any new modules missing? Any listed files that no longer exist? (CRITICAL)
2. §3.1 Agent Identity Card — does the config/runtime split documentation match the actual model code? (MAJOR)
3. §15.4 Key Design Decisions — are technology choices and rationale still accurate? (MAJOR)
4. §15.5 Pydantic Model Conventions — do the documented conventions match how models are actually written in code? Are "Adopted" vs "Planned" labels still accurate? (MAJOR)
5. §10.2 Cost Tracking — does the implementation note match the actual `TokenUsage` and spending summary models? (MAJOR)
6. §11.1.1 Tool Execution Model — does it match actual `ToolInvoker` behavior? (MAJOR)
7. §15.2 Technology Stack — are versions, libraries, and rationale current? (MEDIUM)
8. §9.2 Provider Configuration — are model IDs, provider capability examples, and config/runtime mapping still representative? (MEDIUM)
9. §9.3 LiteLLM Integration — does the integration status match reality? (MEDIUM)
10. Any other section that describes behavior, structure, or patterns that have changed (MAJOR)

**CLAUDE.md (CRITICAL — this guides all future development):**
11. Code Conventions — do documented patterns match what's actually in the code? New patterns used but not documented? Documented patterns no longer followed? (CRITICAL)
12. Logging section — are event import paths, logger patterns, and rules accurate? (CRITICAL)
13. Resilience section — does it match the actual retry/rate-limit implementation? (MAJOR)
14. Package Structure — does it match the actual directory layout? (MAJOR)
15. Testing section — are markers, commands, and conventions current? (MEDIUM)
16. Any other section that gives instructions that don't match reality (CRITICAL)

**README.md:**
17. Installation, usage, and getting-started instructions — still accurate? (MAJOR)
18. Feature descriptions — do they match what's actually built? (MEDIUM)
19. Links — any dead links or references to things that moved? (MINOR)

**Key principle:** It is better to flag a false positive than to let documentation drift silently. When in doubt, flag it.
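
Check 1 above (documented project structure versus the actual tree) can be sketched mechanically. This is a minimal illustration, not part of the skill: `structure_drift` is a hypothetical helper, the file names in the demo are made up, and extracting the documented paths from `DESIGN_SPEC.md` is assumed to have happened already.

```python
import tempfile
from pathlib import Path


def structure_drift(documented: set[str], root: Path) -> tuple[set[str], set[str]]:
    """Return (undocumented, stale) relative .py paths under root.

    Hypothetical helper: `documented` is assumed to be the set of paths
    already parsed out of the spec's project-structure section.
    """
    actual = {p.relative_to(root).as_posix() for p in root.rglob("*.py")}
    undocumented = actual - documented  # on disk, missing from the spec
    stale = documented - actual  # listed in the spec, gone from disk
    return undocumented, stale


# Demo on a throwaway tree; the file names are illustrative only.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "providers").mkdir()
    (root / "providers" / "base.py").write_text("")
    (root / "budget.py").write_text("")
    documented = {"providers/base.py", "tools/invoker.py"}
    undocumented, stale = structure_drift(documented, root)
    print(sorted(undocumented))  # ['budget.py']
    print(sorted(stale))  # ['tools/invoker.py']
```

Both directions matter for the check: a new module missing from the spec and a listed file that no longer exist are each flagged as drift.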

The **logging-audit** agent prompt must check for these violations (see CLAUDE.md `## Logging`):

36 changes: 36 additions & 0 deletions .claude/skills/pre-pr-review/SKILL.md
@@ -164,6 +164,42 @@ This captures committed-but-unpushed changes AND any uncommitted/untracked work
| **logging-audit** | Any `src_py` changed | `pr-review-toolkit:code-reviewer` (custom prompt below) |
| **resilience-audit** | Files in `src/ai_company/providers/` changed | `pr-review-toolkit:code-reviewer` (custom prompt below) |
| **security-reviewer** | Files in `src/ai_company/api/`, `src/ai_company/security/`, `src/ai_company/tools/`, `src/ai_company/config/` changed, OR diff contains `subprocess`, `eval`, `exec`, `pickle`, `yaml.load`, auth/credential patterns | `everything-claude-code:security-reviewer` |
| **docs-consistency** | **ALWAYS** — runs on every PR regardless of change type | `pr-review-toolkit:code-reviewer` (custom prompt below) |

### Docs-consistency custom prompt

The docs-consistency agent ensures project documentation never drifts from the codebase. It runs on **every PR** — code changes, config changes, docs-only changes, all of them.

**What to check:**

Read the current `DESIGN_SPEC.md`, `CLAUDE.md`, and `README.md` in full. Then compare them against the PR diff and the actual current state of the codebase. Flag anything that is now inaccurate, incomplete, or missing.

**DESIGN_SPEC.md (CRITICAL — this is the project's source of truth):**
1. §15.3 Project Structure — does it match the actual files/directories under `src/ai_company/`? Any new modules missing? Any listed files that no longer exist? (CRITICAL)
2. §3.1 Agent Identity Card — does the config/runtime split documentation match the actual model code? (MAJOR)
3. §15.4 Key Design Decisions — are technology choices and rationale still accurate? (MAJOR)
4. §15.5 Pydantic Model Conventions — do the documented conventions match how models are actually written in code? Are "Adopted" vs "Planned" labels still accurate? (MAJOR)
5. §10.2 Cost Tracking — does the implementation note match the actual `TokenUsage` and spending summary models? (MAJOR)
6. §11.1.1 Tool Execution Model — does it match actual `ToolInvoker` behavior? (MAJOR)
7. §15.2 Technology Stack — are versions, libraries, and rationale current? (MEDIUM)
8. §9.2 Provider Configuration — are model IDs, provider capability examples, and config/runtime mapping still representative? (MEDIUM)
9. §9.3 LiteLLM Integration — does the integration status match reality? (MEDIUM)
10. Any other section that describes behavior, structure, or patterns that have changed (MAJOR)

**CLAUDE.md (CRITICAL — this guides all future development):**
11. Code Conventions — do documented patterns match what's actually in the code? New patterns used but not documented? Documented patterns no longer followed? (CRITICAL)
12. Logging section — are event import paths, logger patterns, and rules accurate? (CRITICAL)
13. Resilience section — does it match the actual retry/rate-limit implementation? (MAJOR)
14. Package Structure — does it match the actual directory layout? (MAJOR)
15. Testing section — are markers, commands, and conventions current? (MEDIUM)
16. Any other section that gives instructions that don't match reality (CRITICAL)

**README.md:**
17. Installation, usage, and getting-started instructions — still accurate? (MAJOR)
18. Feature descriptions — do they match what's actually built? (MEDIUM)
19. Links — any dead links or references to things that moved? (MINOR)

**Key principle:** It is better to flag a false positive than to let documentation drift silently. When in doubt, flag it.
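
Check 19 (dead links in `README.md`) is the most mechanical item in the list. The sketch below covers only relative links, under the assumption that links use the inline `[text](target)` Markdown form; `dead_relative_links` and the sample paths are hypothetical and not part of the skill.

```python
import re
import tempfile
from pathlib import Path

# Inline Markdown links: [text](target)
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def dead_relative_links(markdown: str, base: Path) -> list[str]:
    """Return relative link targets that do not exist under base."""
    dead = []
    for target in LINK_RE.findall(markdown):
        if target.startswith(("http://", "https://", "mailto:", "#")):
            continue  # external links and in-page anchors are out of scope
        path = target.split("#", 1)[0]  # drop any fragment before checking
        if not (base / path).exists():
            dead.append(target)
    return dead


# Demo on a throwaway directory; the linked files are illustrative only.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "DESIGN_SPEC.md").write_text("# spec")
    readme = (
        "See [spec](DESIGN_SPEC.md#15-3), [guide](docs/guide.md), "
        "and [site](https://example.com)."
    )
    result = dead_relative_links(readme, base)
    print(result)  # ['docs/guide.md']
```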

### Logging-audit custom prompt

10 changes: 6 additions & 4 deletions CLAUDE.md
@@ -66,8 +66,10 @@ src/ai_company/
- **PEP 758 except syntax**: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14
- **Type hints**: all public functions, mypy strict mode
- **Docstrings**: Google style, required on public classes/functions (enforced by ruff D rules)
- **Immutability**: create new objects, never mutate existing ones
- **Models**: Pydantic v2 (`BaseModel`, `model_validator`, `ConfigDict`)
- **Immutability**: create new objects, never mutate existing ones. For `dict`/`list` fields in frozen Pydantic models, use `MappingProxyType` wrapping at construction (not `deepcopy` on access). Deep-copy only at system boundaries (e.g. passing data to `tool.execute()`, serializing for persistence).
- **Config vs runtime state**: frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state that evolves (e.g. agent execution state, task progress). Never mix static config fields with mutable runtime fields in one model.
- **Models**: Pydantic v2 (`BaseModel`, `model_validator`, `ConfigDict`). Planned conventions for new code: use `@computed_field` for derived values instead of storing + validating redundant fields; use `NotBlankStr` (from `core.types`) for non-optional identifier/name fields instead of manual whitespace validators. Existing models are being migrated incrementally.
- **Async concurrency**: prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`. Existing code is being migrated incrementally.
- **Line length**: 88 characters (ruff)
- **Functions**: < 50 lines, files < 800 lines
- **Errors**: handle explicitly, never silently swallow
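
The immutability and config-versus-runtime conventions above can be illustrated with a small sketch. To stay stdlib-only it uses frozen dataclasses as a stand-in for frozen Pydantic models, and `dataclasses.replace` as a stand-in for `model_copy(update=...)`; the class and field names are hypothetical and are not taken from the codebase.

```python
from dataclasses import dataclass, field, replace
from types import MappingProxyType
from typing import Mapping


@dataclass(frozen=True)
class AgentConfig:
    """Static config/identity (hypothetical fields)."""

    name: str
    settings: Mapping[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Wrap once at construction: copy the caller's dict, then expose
        # a read-only view. No deepcopy is needed on later reads.
        object.__setattr__(self, "settings", MappingProxyType(dict(self.settings)))


@dataclass(frozen=True)
class AgentRuntimeState:
    """Runtime state that evolves by copy, never in place (hypothetical)."""

    agent_name: str
    turns: int = 0


config = AgentConfig(name="reviewer", settings={"model": "example-model"})
try:
    config.settings["model"] = "other"  # type: ignore[index]
    mutated = True
except TypeError:
    mutated = False  # mappingproxy rejects item assignment

state = AgentRuntimeState(agent_name=config.name)
next_state = replace(state, turns=state.turns + 1)  # new object, old one untouched
print(mutated, state.turns, next_state.turns)  # False 0 1
```

Keeping `AgentConfig` and `AgentRuntimeState` as separate types mirrors the rule that static config fields and evolving runtime fields should not share one model.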
@@ -78,7 +80,7 @@ src/ai_company/
- **Every module** with business logic MUST have: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`
- **Never** use `import logging` / `logging.getLogger()` / `print()` in application code
- **Variable name**: always `logger` (not `_logger`, not `log`)
- **Event names**: always use constants from `ai_company.observability.events`
- **Event names**: always use constants from `ai_company.observability.events` (e.g. `PROVIDER_CALL_START`, `BUDGET_RECORD_ADDED`, `TOOL_INVOKE_START`). Import directly: `from ai_company.observability.events import EVENT_CONSTANT`
- **Structured kwargs**: always `logger.info(EVENT, key=value)` — never `logger.info("msg %s", val)`
- **All error paths** must log at WARNING or ERROR with context before raising
- **All state transitions** must log at INFO
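
Two of the logging rules above (no `import logging`, no `print()`) lend themselves to a static check. The following is a rough sketch using the standard `ast` module; `logging_violations` is a hypothetical helper, not the actual logging-audit agent, and it covers only these two rules.

```python
import ast


def logging_violations(source: str) -> list[tuple[int, str]]:
    """Return (line, message) pairs for banned logging patterns."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            if any(alias.name == "logging" for alias in node.names):
                findings.append((node.lineno, "import logging"))
        elif isinstance(node, ast.ImportFrom) and node.module == "logging":
            findings.append((node.lineno, "from logging import ..."))
        elif isinstance(node, ast.Call):
            if isinstance(node.func, ast.Name) and node.func.id == "print":
                findings.append((node.lineno, "print() call"))
    return sorted(findings)


# The sample is only parsed, never imported or executed.
sample = '''import logging
from ai_company.observability import get_logger

logger = get_logger(__name__)
print("debug")
'''
findings = logging_violations(sample)
print(findings)  # [(1, 'import logging'), (5, 'print() call')]
```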
@@ -131,5 +133,5 @@ src/ai_company/
## Dependencies

- **Pinned**: all versions use `==` in `pyproject.toml`
- **Groups**: `test` (pytest + plugins), `dev` (includes test + ruff, mypy, pre-commit, commitizen, pydantic)
- **Groups**: `test` (pytest + plugins), `dev` (includes test + ruff, mypy, pre-commit, commitizen)
- **Install**: `uv sync` installs everything (dev group is default)
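
The pinning rule can be checked with a few lines of Python. This sketch assumes the requirement strings have already been read out of `pyproject.toml`; `unpinned` is a hypothetical helper, and the version numbers in the demo are invented for illustration.

```python
import re

# Package name, optional extras, exactly one "==" clause with a concrete
# version (no wildcard, no extra clauses), then an optional environment marker.
PINNED_RE = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^,;*\s]+(\s*;.*)?$")


def unpinned(requirements: list[str]) -> list[str]:
    """Return the requirement strings that are not pinned with `==`."""
    return [r for r in requirements if not PINNED_RE.match(r.strip())]


# Illustrative input; versions are made up.
deps = [
    "ruff==0.1.0",
    "mypy>=1.0",
    "pytest",
    "commitizen==3.0.0 ; python_version >= '3.12'",
]
loose = unpinned(deps)
print(loose)  # ['mypy>=1.0', 'pytest']
```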