Skip to content
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
32 changes: 32 additions & 0 deletions .claude/skills/aurelio-review-pr/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -94,6 +94,38 @@ Based on changed files, launch applicable review agents **in parallel** using th
| **type-design-analyzer** | Type annotations or classes added/modified | `pr-review-toolkit:type-design-analyzer` |
| **logging-audit** | Any `.py` file in `src/` changed | `pr-review-toolkit:code-reviewer` |
| **resilience-audit** | Provider-layer `.py` files changed (`src/ai_company/providers/`) | `pr-review-toolkit:code-reviewer` |
| **docs-consistency** | **ALWAYS** — runs on every PR regardless of change type | `pr-review-toolkit:code-reviewer` |

The **docs-consistency** agent ensures project documentation never drifts from the codebase. It runs on **every PR** — code changes, config changes, docs-only changes, all of them.

**What to check:**

Read the current `DESIGN_SPEC.md`, `CLAUDE.md`, and `README.md` in full. Then compare them against the PR diff and the actual current state of the codebase. Flag anything that is now inaccurate, incomplete, or missing.

**DESIGN_SPEC.md (CRITICAL — this is the project's source of truth):**
1. §15.3 Project Structure — does it match the actual files/directories under `src/ai_company/`? Any new modules missing? Any listed files that no longer exist? (CRITICAL)
2. §15.5 Pydantic Model Conventions — do the documented conventions match how models are actually written in code? (MAJOR)
3. §15.4 Key Design Decisions — are technology choices and rationale still accurate? (MAJOR)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For better readability and to align with the document's structure, it would be clearer to list the DESIGN_SPEC.md checklist items in the same order as their section numbers. Section §15.4 should come before §15.5.

Suggested change
2. §15.5 Pydantic Model Conventionsdo the documented conventions match how models are actually written in code? (MAJOR)
3. §15.4 Key Design Decisionsare technology choices and rationale still accurate? (MAJOR)
2. §15.4 Key Design Decisionsare technology choices and rationale still accurate? (MAJOR)
3. §15.5 Pydantic Model Conventionsdo the documented conventions match how models are actually written in code? (MAJOR)

4. §3.1 Agent Identity Card — does the config/runtime split documentation match the actual model code? (MAJOR)
5. §10.2 Cost Tracking — does the implementation note match the actual `TokenUsage` and spending summary models? (MAJOR)
6. §11.1.1 Tool Execution Model — does it match actual `ToolInvoker` behavior? (MAJOR)
7. §15.2 Technology Stack — are versions, libraries, and rationale current? (MEDIUM)
8. Any other section that describes behavior, structure, or patterns that have changed (MAJOR)

**CLAUDE.md (CRITICAL — this guides all future development):**
9. Code Conventions — do documented patterns match what's actually in the code? New patterns used but not documented? Documented patterns no longer followed? (CRITICAL)
10. Logging section — are event import paths, logger patterns, and rules accurate? (CRITICAL)
11. Resilience section — does it match the actual retry/rate-limit implementation? (MAJOR)
12. Package Structure — does it match the actual directory layout? (MAJOR)
13. Testing section — are markers, commands, and conventions current? (MEDIUM)
14. Any other section that gives instructions that don't match reality (CRITICAL)

**README.md:**
15. Installation, usage, and getting-started instructions — still accurate? (MAJOR)
16. Feature descriptions — do they match what's actually built? (MEDIUM)
17. Links — any dead links or references to things that moved? (MINOR)

**Key principle:** It is better to flag a false positive than to let documentation drift silently. When in doubt, flag it.

The **logging-audit** agent prompt must check for these violations (see CLAUDE.md `## Logging`):

Expand Down
34 changes: 34 additions & 0 deletions .claude/skills/pre-pr-review/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -164,6 +164,40 @@ This captures committed-but-unpushed changes AND any uncommitted/untracked work
| **logging-audit** | Any `src_py` changed | `pr-review-toolkit:code-reviewer` (custom prompt below) |
| **resilience-audit** | Files in `src/ai_company/providers/` changed | `pr-review-toolkit:code-reviewer` (custom prompt below) |
| **security-reviewer** | Files in `src/ai_company/api/`, `src/ai_company/security/`, `src/ai_company/tools/`, `src/ai_company/config/` changed, OR diff contains `subprocess`, `eval`, `exec`, `pickle`, `yaml.load`, auth/credential patterns | `everything-claude-code:security-reviewer` |
| **docs-consistency** | **ALWAYS** — runs on every PR regardless of change type | `pr-review-toolkit:code-reviewer` (custom prompt below) |

### Docs-consistency custom prompt

The docs-consistency agent ensures project documentation never drifts from the codebase. It runs on **every PR** — code changes, config changes, docs-only changes, all of them.

**What to check:**

Read the current `DESIGN_SPEC.md`, `CLAUDE.md`, and `README.md` in full. Then compare them against the PR diff and the actual current state of the codebase. Flag anything that is now inaccurate, incomplete, or missing.

**DESIGN_SPEC.md (CRITICAL — this is the project's source of truth):**
1. §15.3 Project Structure — does it match the actual files/directories under `src/ai_company/`? Any new modules missing? Any listed files that no longer exist? (CRITICAL)
2. §15.5 Pydantic Model Conventions — do the documented conventions match how models are actually written in code? (MAJOR)
3. §15.4 Key Design Decisions — are technology choices and rationale still accurate? (MAJOR)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To improve clarity and consistency with the document's layout, I suggest reordering these checklist items to follow the section numbering in DESIGN_SPEC.md. This places §15.4 before §15.5.

Suggested change
2. §15.5 Pydantic Model Conventionsdo the documented conventions match how models are actually written in code? (MAJOR)
3. §15.4 Key Design Decisionsare technology choices and rationale still accurate? (MAJOR)
2. §15.4 Key Design Decisionsare technology choices and rationale still accurate? (MAJOR)
3. §15.5 Pydantic Model Conventionsdo the documented conventions match how models are actually written in code? (MAJOR)

4. §3.1 Agent Identity Card — does the config/runtime split documentation match the actual model code? (MAJOR)
5. §10.2 Cost Tracking — does the implementation note match the actual `TokenUsage` and spending summary models? (MAJOR)
6. §11.1.1 Tool Execution Model — does it match actual `ToolInvoker` behavior? (MAJOR)
7. §15.2 Technology Stack — are versions, libraries, and rationale current? (MEDIUM)
8. Any other section that describes behavior, structure, or patterns that have changed (MAJOR)

**CLAUDE.md (CRITICAL — this guides all future development):**
9. Code Conventions — do documented patterns match what's actually in the code? New patterns used but not documented? Documented patterns no longer followed? (CRITICAL)
10. Logging section — are event import paths, logger patterns, and rules accurate? (CRITICAL)
11. Resilience section — does it match the actual retry/rate-limit implementation? (MAJOR)
12. Package Structure — does it match the actual directory layout? (MAJOR)
13. Testing section — are markers, commands, and conventions current? (MEDIUM)
14. Any other section that gives instructions that don't match reality (CRITICAL)

**README.md:**
15. Installation, usage, and getting-started instructions — still accurate? (MAJOR)
16. Feature descriptions — do they match what's actually built? (MEDIUM)
17. Links — any dead links or references to things that moved? (MINOR)

**Key principle:** It is better to flag a false positive than to let documentation drift silently. When in doubt, flag it.
Comment thread
coderabbitai[bot] marked this conversation as resolved.

### Logging-audit custom prompt

Expand Down
8 changes: 5 additions & 3 deletions CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -66,8 +66,10 @@ src/ai_company/
- **PEP 758 except syntax**: use `except A, B:` (no parentheses) — ruff enforces this on Python 3.14
- **Type hints**: all public functions, mypy strict mode
- **Docstrings**: Google style, required on public classes/functions (enforced by ruff D rules)
- **Immutability**: create new objects, never mutate existing ones
- **Models**: Pydantic v2 (`BaseModel`, `model_validator`, `ConfigDict`)
- **Immutability**: create new objects, never mutate existing ones. For `dict`/`list` fields in frozen Pydantic models, use `MappingProxyType` wrapping at construction (not `deepcopy` on access). Deep-copy only at system boundaries (e.g. passing data to `tool.execute()`, serializing for persistence).
- **Config vs runtime state**: frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state that evolves (e.g. agent execution state, task progress). Never mix static config fields with mutable runtime fields in one model.
- **Models**: Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `@computed_field` for derived values (e.g. `total_tokens = input + output`) instead of storing + validating redundant fields. Use `NotBlankStr` (from `core.types`) for all non-optional identifier/name fields instead of manual whitespace validators.
- **Async concurrency**: use `asyncio.TaskGroup` for fan-out/fan-in parallel operations (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`.
Comment thread
coderabbitai[bot] marked this conversation as resolved.
Outdated
- **Line length**: 88 characters (ruff)
- **Functions**: < 50 lines, files < 800 lines
- **Errors**: handle explicitly, never silently swallow
Expand All @@ -78,7 +80,7 @@ src/ai_company/
- **Every module** with business logic MUST have: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`
- **Never** use `import logging` / `logging.getLogger()` / `print()` in application code
- **Variable name**: always `logger` (not `_logger`, not `log`)
- **Event names**: always use constants from `ai_company.observability.events`
- **Event names**: always use constants from the appropriate domain module under `ai_company.observability.events/` (e.g. `events.provider`, `events.budget`, `events.tool`, `events.config`, `events.template`, `events.routing`, `events.task`, `events.prompt`)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if domain-specific event submodules exist
echo "=== Checking events/ directory structure ==="
ls -la src/ai_company/observability/events/ 2>/dev/null || echo "events/ directory does not exist or is a file"

echo ""
echo "=== Checking if events.py is a file or directory ==="
file src/ai_company/observability/events.py

echo ""
echo "=== Files importing from ai_company.observability.events ==="
rg -l "from ai_company.observability.events import" src/ai_company/ 2>/dev/null | head -20

echo ""
echo "=== Count of import statements ==="
rg "from ai_company.observability.events import" src/ai_company/ 2>/dev/null | wc -l

echo ""
echo "=== Sample imports from flat structure ==="
rg -n "from ai_company.observability.events import" src/ai_company/ 2>/dev/null | head -15

echo ""
echo "=== Check CLAUDE.md line 83 context ==="
sed -n '80,90p' CLAUDE.md

Repository: Aureliolo/ai-company

Length of output: 3530


🏁 Script executed:

#!/bin/bash
echo "=== Content of events.py (first 60 lines) ==="
head -60 src/ai_company/observability/events.py

echo ""
echo "=== Show import pattern in one file ==="
head -30 src/ai_company/providers/base.py | grep -A 10 "from ai_company.observability.events"

Repository: Aureliolo/ai-company

Length of output: 3392


CLAUDE.md line 83 documents a non-existent module structure. The documentation prescribes domain-specific submodules (events.provider, events.budget, events.tool, etc.), but the actual implementation uses a flat module structure (ai_company.observability.events) where all constants are defined at the module level. This is confirmed by all 20 files across the codebase, which correctly import directly from the flat module (e.g., from ai_company.observability.events import PROVIDER_CALL_ERROR). The learnings also explicitly recommend the flat structure.

Update CLAUDE.md line 83 to reflect the actual flat module pattern instead of describing non-existent submodules.

🧰 Tools
🪛 LanguageTool

[style] ~83-~83: A comma is missing here.
Context: ...der ai_company.observability.events/ (e.g. events.provider, events.budget, `ev...

(EG_NO_COMMA)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@CLAUDE.md` at line 83, Update the documentation sentence that references
non-existent submodules (e.g., "events.provider", "events.budget", etc.) to
describe the actual flat module pattern: state that event constants live
directly in the ai_company.observability.events module and show the typical
import style (for example, importing PROVIDER_CALL_ERROR from
ai_company.observability.events). Replace the list of domain submodules with
guidance to use the single flat module and an example constant name like
PROVIDER_CALL_ERROR to make the correct usage explicit.

- **Structured kwargs**: always `logger.info(EVENT, key=value)` — never `logger.info("msg %s", val)`
- **All error paths** must log at WARNING or ERROR with context before raising
- **All state transitions** must log at INFO
Expand Down
Loading