feat(examples): company-aware world model for personal-finance#358
Merged
buremba
added a commit
that referenced
this pull request
Apr 25, 2026
…are world model
After the world-model refactor (#358), employer becomes a company entity. PAYE ref lives on the employed_by relationship metadata (one company can pay one person via different PAYE schemes). Director status is the director_of relationship between the user's $member and the company. SA105 properties join through owned_by / co_owned_by relationships and filter on the new use enum.
Also: HMRC identifiers (UTR, NI) live in entity_identities, not in $member.metadata. Step 0 now resolves the user's $member + reads UTR + NI from entity_identities. Output header reads from these instead of metadata. Adds residence_status to the header.
buremba
added a commit
that referenced
this pull request
Apr 25, 2026
When a document yields a UTR, NI number, PAYE ref, Companies House number, or VAT number, write them to entity_identities (not metadata). Adds a Step 5b table mapping each identifier kind to its destination, plus a duplicate-check SQL pattern. PAYE reference is the exception — it's per-employment, so it stays on the employed_by relationship metadata. Also updates Step 5 to set account ownership via owned_by / co_owned_by (matching the world-model refactor in #358) and to link income_source.employed_by → company for employment income.
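The routing rule described above can be sketched roughly as follows. This is only an illustration, not the playbook's actual Step 5b table: the function name `route_identifier`, the `Destination` labels, and the in-memory set standing in for `entity_identities` rows are all hypothetical. The namespace strings are the ones named in this PR's identity convention, and the PAYE reference follows this commit's rule (it goes to the relationship, not to `entity_identities`).

```python
from typing import NamedTuple

# Namespaces written to entity_identities (names from the PR's identity convention).
IDENTITY_NAMESPACES = {"hmrc_utr", "hmrc_ni_number", "companies_house_number", "vat_number"}
# Per-employment identifier: stays on the employed_by relationship metadata.
RELATIONSHIP_NAMESPACES = {"hmrc_paye_reference"}


class Destination(NamedTuple):
    target: str      # "entity_identities" or "employed_by.metadata" (illustrative labels)
    duplicate: bool  # True when the same value is already recorded


def route_identifier(namespace, value, existing):
    """Decide where an extracted identifier should be written.

    `existing` is a set of (namespace, value) pairs, a hypothetical in-memory
    stand-in for the rows already present in entity_identities.
    """
    if namespace in RELATIONSHIP_NAMESPACES:
        return Destination("employed_by.metadata", False)
    if namespace in IDENTITY_NAMESPACES:
        return Destination("entity_identities", (namespace, value) in existing)
    raise ValueError(f"unknown identifier namespace: {namespace}")
```

Raising on an unknown namespace, rather than defaulting to metadata, mirrors the intent that identifiers should never silently land in the wrong place.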
buremba
added a commit
that referenced
this pull request
Apr 26, 2026
* feat(examples): add SA100 assembly playbook for personal-finance agent
New ASSEMBLY.md captures the full end-to-end flow for producing a UK Self Assessment return from the entities the agent has ingested over the year:
- Tax-year constants (personal allowance, dividend allowance, PSA bands, CGT annual exempt amount, rate bands) for 2024-25 and 2025-26.
- Six SQL query templates, one per SA100 section (main dividends/interest, SA102 employment, SA105 UK property, SA108 CGT, contributions, relief claims), that the agent runs via query_sql.
- Calculation rules where HMRC logic bites (SA105 mortgage interest restricted to 20% basic-rate credit; gift aid gross-up; relief-at-source pension higher-rate claim; CGT exempt amount then rate by marginal band).
- Markdown output layout the agent produces for the user to paste into HMRC online, including explicit "⚠️ Gaps to resolve" section so missing fields surface rather than getting fabricated.
Implemented as an agent prompt/playbook rather than a TypeScript connector: connectors can't query the local DB, while the agent already has query_sql as a first-class MCP tool. This keeps the domain knowledge (UK tax logic) in the agent's prompt stack where it can be reviewed and edited without a deploy.
SOUL.md now points at ASSEMBLY.md instead of a made-up assemble_self_assessment operation.
* feat(examples): align ASSEMBLY.md SA102/SA105 SQL with the company-aware world model
After the world-model refactor (#358), employer becomes a company entity. PAYE ref lives on the employed_by relationship metadata (one company can pay one person via different PAYE schemes). Director status is the director_of relationship between the user's $member and the company. SA105 properties join through owned_by / co_owned_by relationships and filter on the new use enum.
Also: HMRC identifiers (UTR, NI) live in entity_identities, not in $member.metadata. Step 0 now resolves the user's $member + reads UTR + NI from entity_identities. Output header reads from these instead of metadata. Adds residence_status to the header.
* fix(examples): split finance_costs from allowable_expenses in SA105 query
The note above the SA105 query said to surface finance costs separately, but the SQL aggregated all expense_of rows into allowable_expenses with no carve-out, so the agent had no source for the finance-costs column and would either invent it or silently drop the basic-rate credit.
- Filter the existing expense subquery to exclude metadata.tax_category = 'finance'.
- Add a parallel subquery summing only finance-tagged expenses.
- Replace the prose-only SA105 output section with a concrete table row that includes the finance-costs column.
- Tighten the note to call out the mis-tagged-finance-row gap path.
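The finance-cost carve-out described in the SA105 fix can be illustrated with a small sketch. It is not the SQL from ASSEMBLY.md: the dict shape standing in for expense_of rows and the function name `split_property_expenses` are hypothetical, and only the `tax_category = 'finance'` tag and the 20% basic-rate credit come from the commit message. HMRC's real reducer is additionally capped (for example by property profits), which this sketch deliberately ignores.

```python
from decimal import Decimal

BASIC_RATE = Decimal("0.20")  # finance costs are relieved as a 20% basic-rate credit


def split_property_expenses(expenses):
    """Split expense rows into allowable expenses and finance costs.

    Each row is a dict with an `amount` (Decimal) and an optional
    `tax_category`; this shape is a hypothetical stand-in for expense_of rows.
    """
    finance = sum(
        (e["amount"] for e in expenses if e.get("tax_category") == "finance"),
        Decimal("0"),
    )
    allowable = sum(
        (e["amount"] for e in expenses if e.get("tax_category") != "finance"),
        Decimal("0"),
    )
    return {
        "allowable_expenses": allowable,
        "finance_costs": finance,
        # Uncapped credit: a simplification, see the note above the block.
        "basic_rate_credit": (finance * BASIC_RATE).quantize(Decimal("0.01")),
    }
```

Keeping the two sums separate is what gives the output table a real source for its finance-costs column; a mis-tagged mortgage-interest row would still land in `allowable_expenses`, which is the gap path the commit asks the agent to flag.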
buremba
added a commit
that referenced
this pull request
Apr 26, 2026
* feat(examples): add statement ingestion playbook for personal-finance agent
New INGESTION.md covers the full flow when a user uploads a bank
statement, broker contract note, or payslip through WhatsApp:
1. Fetch the signed downloadUrl via curl (through the gateway proxy).
2. Extract text — pdftotext for PDFs, csvtk for CSVs, direct read
for OFX/QIF.
3. Apply the same extraction schema the gmail-tx watcher uses.
4. Post-validate: opening + movements ≈ closing; all dates within
the statement period; de-duplicate against existing transactions.
5. Create entities with a parsed_from provenance link to a document
entity.
6. Surface gaps + delta mismatches as questions to the user rather
than silently committing.
Declares `nix_packages = ["poppler_utils", "csvtk"]` in the agent's
lobu.toml so pdftotext and csvtk are available to the sandboxed worker.
SOUL.md now points at INGESTION.md instead of a made-up
parse_statement tool — same approach as the ASSEMBLY.md playbook for
SA100 assembly. The agent's general-purpose Bash + existing MCP tools
are enough; the value is in the prompt + Nix declaration.
* feat(examples): teach INGESTION.md the identity-namespace convention
When a document yields a UTR, NI number, PAYE ref, Companies House
number, or VAT number, write them to entity_identities (not metadata).
Adds a Step 5b table mapping each identifier kind to its destination,
plus a duplicate-check SQL pattern.
PAYE reference is the exception — it's per-employment, so it stays on
the employed_by relationship metadata.
Also updates Step 5 to set account ownership via owned_by / co_owned_by
(matching the world-model refactor in #358) and to link
income_source.employed_by → company for employment income.
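The post-validation step of the ingestion flow (opening + movements ≈ closing, dates within the statement period, de-duplication against existing transactions) might look something like the sketch below. The function name, the transaction dict shape, the de-duplication key, and the one-penny tolerance are all assumptions made for illustration; INGESTION.md itself is not shown in this PR page.

```python
from datetime import date
from decimal import Decimal


def validate_statement(opening, closing, period_start, period_end, transactions,
                       existing_keys=frozenset(), tolerance=Decimal("0.01")):
    """Return (new_transactions, problems) for a parsed statement.

    `transactions` is a list of dicts with `date`, `amount` and `description`
    (hypothetical shape). `existing_keys` holds (date, amount, description)
    tuples for transactions already stored.
    """
    problems = []

    # 1. Balance check: opening + movements should equal closing within tolerance.
    movements = sum((t["amount"] for t in transactions), Decimal("0"))
    delta = opening + movements - closing
    if abs(delta) > tolerance:
        problems.append(f"balance mismatch: off by {delta}")

    # 2. Every transaction date must fall inside the statement period.
    for t in transactions:
        if not (period_start <= t["date"] <= period_end):
            problems.append(f"date outside statement period: {t['date'].isoformat()}")

    # 3. Drop transactions that are already stored.
    fresh = [
        t for t in transactions
        if (t["date"], t["amount"], t["description"]) not in existing_keys
    ]
    return fresh, problems
```

Returning the problems as a list, instead of raising, matches the playbook's stated intent of surfacing mismatches to the user as questions rather than silently committing or aborting.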
Promotes company to a first-class peer of $member and adds the relationship vocabulary needed to model real users: most freelancers and contractors are both an SA100 individual filer AND the director of their own Ltd, with money flowing between the two.
Schema additions:
- company entity (covers Ltd, PLC, LLP, sole-trader, partnership, trust, charity, foreign; discriminate via company_type). Carries Companies House ref, accounting period, VAT scheme, PSC flag.
- Subject relationships: director_of, shareholder_of, employee_of, partner_in, spouse_of (symmetric), controls (PSC register), and accountant_for (for hired-accountant access later).
- Asset ownership: owned_by (account|holding|asset_lot|property → $member|company) and co_owned_by (joint accounts/properties with share_pct).
- transfer_pair: links the two legs of an internal transfer so neither side counts as taxable income or as an allowable expense.
Schema removals:
- employer, trade, trade_of: collapsed into company. A sole trader is now a company(company_type=sole_trader); their employer is a company; SA103 self-employment continues to flow through expense_of → company.
Schema updates:
- account: explicit "owner via owned_by/co_owned_by" rule. Adds business_current, business_savings, workplace_pension, loan to the wrapper enum. Adds iban for non-UK accounts.
- property: adds use enum (primary_residence | let | FHL | commercial_let | mixed_use | investment_held), which drives tax treatment more than physical type does. Adds purchase_date / purchase_cost for PRR computations on disposal. Drops joint_share_pct in favour of co_owned_by.
- employed_by, expense_of: descriptions updated to reference company.
Identity convention (taught in SOUL.md, not yet enforced in code):
- hmrc_utr (works for both $member and company), hmrc_ni_number, hmrc_paye_reference, companies_house_number, vat_number all live in entity_identities, NOT in metadata.
- Other durable personal-tax facts (DOB, student_loan_plan, domicile_status, marital_status) ride on save_knowledge events with semantic_type=identity.
This is the long-term world model. CT600/SA800/SA900/VAT support slots in additively in v2: same entities, different assembly playbook.
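To make the purpose of transfer_pair concrete, here is a minimal sketch of how paired legs could be excluded before income or expenses are aggregated. The data shapes (a dict of transaction id to amount, and pairs as id tuples) and the function name `taxable_flows` are hypothetical; the real schema stores these as entities and relationships.

```python
def taxable_flows(transactions, transfer_pairs):
    """Drop both legs of every internal transfer before aggregating.

    `transactions` maps a transaction id to an amount; `transfer_pairs` is an
    iterable of (outgoing_id, incoming_id) tuples. Both shapes are hypothetical.
    """
    paired = {tx_id for pair in transfer_pairs for tx_id in pair}
    return {
        tx_id: amount
        for tx_id, amount in transactions.items()
        if tx_id not in paired
    }
```

For example, with a 500 move between two accounts recorded as `t1` and `t2` and a 2000 receipt recorded as `t3`, only `t3` remains after filtering on the pair `("t1", "t2")`.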
Residence status can change year to year (someone moving in/out of the UK has different status). It's a property of the tax year, not of the $member. Drives SA109 routing. Adds residence_status enum (uk_resident | non_resident | split_year_arriver | split_year_leaver | dual_resident) plus arrival_date / departure_date for split-year cases.
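The residence_status enum and its split-year dates could be modelled along these lines. The enum values are taken from the commit message; the class and function names are hypothetical, and the pairing of arriver with arrival_date and leaver with departure_date is an assumption about how the two date fields are meant to be used.

```python
from enum import Enum


class ResidenceStatus(str, Enum):
    UK_RESIDENT = "uk_resident"
    NON_RESIDENT = "non_resident"
    SPLIT_YEAR_ARRIVER = "split_year_arriver"
    SPLIT_YEAR_LEAVER = "split_year_leaver"
    DUAL_RESIDENT = "dual_resident"


def residence_gaps(status, arrival_date=None, departure_date=None):
    """List the date fields still missing for a split-year status.

    Assumed pairing: arrivers need arrival_date, leavers need departure_date.
    """
    gaps = []
    if status is ResidenceStatus.SPLIT_YEAR_ARRIVER and arrival_date is None:
        gaps.append("arrival_date")
    if status is ResidenceStatus.SPLIT_YEAR_LEAVER and departure_date is None:
        gaps.append("departure_date")
    return gaps
```

Because the status belongs to the tax year rather than to the $member, a check like this would run once per year being assembled.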
- transfer_pair.yaml: drop the "or related subjects" wording that contradicted the SOUL.md teaching. Make explicit that cross-subject flows (Ltd → personal salary, etc.) are NOT internal transfers.
- SOUL.md: spell out the SA100 isolation invariant (filter on account.owner_type before aggregating) and the shareholding sanity check (sum shareholding_pct = 100, otherwise flag a gap).
- company.yaml: forbid hmrc_* / companies_house_number / vat_number in metadata via a `not` constraint, so accidental writes hit a schema error instead of silently shadowing the entity_identities row.
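The two SOUL.md checks mentioned above (the SA100 isolation invariant and the shareholding sanity check) can be sketched as below. The function names, the dict shapes, and the `"$member"` value for `owner_type` are assumptions for illustration; SOUL.md expresses these as instructions to the agent rather than as code.

```python
from decimal import Decimal


def shareholding_gap(shareholdings):
    """Return None when shareholding_pct values sum to 100, else a gap message.

    `shareholdings` maps a shareholder label to a percentage (hypothetical shape).
    """
    total = sum((Decimal(str(pct)) for pct in shareholdings.values()), Decimal("0"))
    if total == Decimal("100"):
        return None
    return f"shareholding_pct sums to {total}, expected 100"


def personal_accounts(accounts):
    """Keep only accounts owned by a $member before SA100 aggregation.

    The "$member" owner_type value is an assumed encoding.
    """
    return [a for a in accounts if a["owner_type"] == "$member"]
```

Filtering by owner before summing keeps company money out of the individual's return, and returning a gap message rather than normalising the percentages keeps missing shareholders visible instead of papering over them.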
Contributor
|
Triage decision: Reasons:
Next: Manual review and merge required after final approval |
Summary
Promotes `company` to a first-class peer of `$member` so the agent can model the real shape of UK contractors / freelancers / PSC owners — individual filer AND director-of-own-Ltd, with money flowing between the two.
Schema additions
Schema removals
Schema updates
Identity convention (in SOUL.md)
Stacked on
Targets `feat/personal-finance-example` (#350).
Test plan
Follow-up
Companion PR (`feat/install-identity-provisioning`) extends signup + install to populate `$member` entity + `entity_identities` (auth_user_id, email, optional wa_jid/phone) so the lookups taught in SOUL.md actually resolve.