
feat(examples): company-aware world model for personal-finance #358

Merged
buremba merged 3 commits into main from feat/world-model-companies on Apr 26, 2026

buremba commented Apr 25, 2026

Summary

Promotes `company` to a first-class peer of `$member` so the agent can model the real shape of UK contractors / freelancers / PSC owners — individual filer AND director-of-own-Ltd, with money flowing between the two.

Schema additions

  • `company` entity — Ltd, PLC, LLP, sole-trader, partnership, trust, charity, foreign (discriminated via `company_type`). Carries Companies House ref, accounting period, VAT scheme, PSC flag.
  • Subject relationships: `director_of`, `shareholder_of`, `employee_of`, `partner_in`, `spouse_of` (symmetric), `controls` (PSC register), `accountant_for`.
  • Asset ownership: `owned_by` (polymorphic to `$member` or `company`) and `co_owned_by` (joint accounts/properties with `share_pct`).
  • `transfer_pair`: links the two legs of an internal transfer so neither counts as income or expense.
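
A minimal sketch of how `transfer_pair` could be applied when totalling income and expenses. The `Transaction` shape, the `owner` field and the `taxable_flows` helper are hypothetical names for illustration, not the schema's own. The same-owner check reflects the rule stated in a later commit on this PR, that cross-subject flows (for example Ltd to personal salary) are not internal transfers.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Transaction:
    id: str
    owner: str       # hypothetical: the $member or company that owns the account
    amount: Decimal  # positive = money in, negative = money out


def taxable_flows(transactions, transfer_pairs):
    """Drop both legs of every internal transfer.

    A pair only counts as internal when both legs belong to the same owner;
    a pair that crosses owners is left in, because it is real income or
    expense for each side.
    """
    by_id = {t.id: t for t in transactions}
    internal_ids = set()
    for leg_a, leg_b in transfer_pairs:
        if by_id[leg_a].owner == by_id[leg_b].owner:
            internal_ids.update((leg_a, leg_b))
    return [t for t in transactions if t.id not in internal_ids]
```

With a personal current-to-savings move and a salary payment from the user's Ltd both recorded as pairs, only the first would be dropped from the totals.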

Schema removals

  • `employer`, `trade`, `trade_of` — collapsed into `company`. A sole trader is now `company(company_type=sole_trader)`.

Schema updates

  • `account` — explicit "owner via owned_by/co_owned_by" rule; `business_current`/`business_savings`/`workplace_pension`/`loan` added to wrapper enum; `iban` for non-UK.
  • `property` — `use` enum (`primary_residence | let | FHL | commercial_let | mixed_use | investment_held`) — drives tax treatment more than physical `type`. Adds `purchase_date`/`purchase_cost` for future PRR. Drops `joint_share_pct` in favour of `co_owned_by`.
  • `employed_by`, `expense_of` — descriptions updated to reference company.

Identity convention (in SOUL.md)

  • `hmrc_utr`, `hmrc_ni_number`, `hmrc_paye_reference`, `companies_house_number`, `vat_number` live in `entity_identities`, not metadata.
  • Durable personal-tax facts (DOB, student_loan_plan, domicile_status, marital_status) ride on `save_knowledge` events with `semantic_type=identity`.

Stacked on

Targets `feat/personal-finance-example` (#350).

Test plan

  • All 32 model YAMLs validate against `@lobu/owletto-cli`'s `validateModel`.
  • Pre-commit Biome + tsc pass.
  • Manual: seed the personal-finance template org with `owletto seed`, confirm company + relationships appear via `manage_entity_schema(action=list)`.
  • Manual: tell the agent "I'm a director of Acme Ltd which I own 100% of", confirm it creates a `company` entity + `director_of` + `shareholder_of` (`shareholding_pct=100`) relationships.

Follow-up

Companion PR (`feat/install-identity-provisioning`) extends signup + install to populate `$member` entity + `entity_identities` (auth_user_id, email, optional wa_jid/phone) so the lookups taught in SOUL.md actually resolve.


buremba added a commit that referenced this pull request Apr 25, 2026
…are world model

After the world-model refactor (#358), employer becomes a company entity.
PAYE ref lives on the employed_by relationship metadata (one company can
pay one person via different PAYE schemes). Director status is the
director_of relationship between the user's $member and the company.
SA105 properties join through owned_by / co_owned_by relationships and
filter on the new use enum.

Also: HMRC identifiers (UTR, NI) live in entity_identities, not in
$member.metadata. Step 0 now resolves the user's $member + reads UTR
+ NI from entity_identities. Output header reads from these instead of
metadata. Adds residence_status to the header.
buremba added a commit that referenced this pull request Apr 25, 2026
When a document yields a UTR, NI number, PAYE ref, Companies House
number, or VAT number, write them to entity_identities (not metadata).
Adds a Step 5b table mapping each identifier kind to its destination,
plus a duplicate-check SQL pattern.

PAYE reference is the exception — it's per-employment, so it stays on
the employed_by relationship metadata.

Also updates Step 5 to set account ownership via owned_by / co_owned_by
(matching the world-model refactor in #358) and to link
income_source.employed_by → company for employment income.
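
The routing rule in this commit message can be sketched as a small lookup. This is an illustration only: the destination strings, the `destination_for` and `is_duplicate` helpers and the normalisation step are assumptions, and the Python duplicate check merely stands in for the SQL pattern the commit mentions, which is not shown on this page.

```python
# Identifier kinds named in the PR; the destination labels are illustrative.
ENTITY_IDENTITY_KINDS = {
    "hmrc_utr",
    "hmrc_ni_number",
    "companies_house_number",
    "vat_number",
}
# Per this commit, the PAYE reference is per-employment, so it is the exception.
RELATIONSHIP_METADATA_KINDS = {"hmrc_paye_reference"}


def destination_for(kind: str) -> str:
    """Return where an extracted identifier should be written."""
    if kind in RELATIONSHIP_METADATA_KINDS:
        return "employed_by.metadata"
    if kind in ENTITY_IDENTITY_KINDS:
        return "entity_identities"
    raise ValueError(f"unknown identifier kind: {kind}")


def is_duplicate(existing: set, kind: str, value: str) -> bool:
    """Check an identifier against already-stored (kind, value) pairs.

    Values are compared with whitespace removed and case folded, which is
    an assumption about how identifiers should be normalised.
    """
    normalised = "".join(value.split()).upper()
    return (kind, normalised) in existing
```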
Base automatically changed from feat/personal-finance-example to main April 26, 2026 16:23
buremba added a commit that referenced this pull request Apr 26, 2026
)

* feat(examples): add SA100 assembly playbook for personal-finance agent

New ASSEMBLY.md captures the full end-to-end flow for producing a UK
Self Assessment return from the entities the agent has ingested over
the year:

- Tax-year constants (personal allowance, dividend allowance, PSA bands,
  CGT annual exempt amount, rate bands) for 2024-25 and 2025-26.
- Six SQL query templates — one per SA100 section (main dividends/
  interest, SA102 employment, SA105 UK property, SA108 CGT,
  contributions, relief claims) — that the agent runs via query_sql.
- Calculation rules where HMRC logic bites (SA105 mortgage interest
  restricted to 20% basic-rate credit; gift aid gross-up; relief-at-
  source pension higher-rate claim; CGT exempt amount then rate by
  marginal band).
- Markdown output layout the agent produces for the user to paste into
  HMRC online, including explicit "⚠️ Gaps to resolve" section so
  missing fields surface rather than getting fabricated.

Implemented as an agent prompt/playbook rather than a TypeScript
connector — connectors can't query the local DB, while the agent
already has query_sql as a first-class MCP tool. This keeps the
domain knowledge (UK tax logic) in the agent's prompt stack where it
can be reviewed and edited without a deploy.

SOUL.md now points at ASSEMBLY.md instead of a made-up
assemble_self_assessment operation.
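
Two of the calculation rules listed above, worked as a minimal sketch. The function names are hypothetical and ASSEMBLY.md itself is not reproduced on this page. The 20% figure comes from the commit message; capping the relievable amount at the property profit is a simplification, and the fuller HMRC rule applies further caps that this sketch does not model.

```python
from decimal import Decimal

BASIC_RATE = Decimal("0.20")
PENNY = Decimal("0.01")


def gift_aid_gross(net_donation: Decimal) -> Decimal:
    """Gross up a Gift Aid donation paid net of basic-rate tax."""
    return (net_donation / (1 - BASIC_RATE)).quantize(PENNY)


def finance_cost_credit(finance_costs: Decimal, property_profit: Decimal) -> Decimal:
    """Basic-rate tax credit on residential finance costs (simplified).

    Finance costs are not deducted from rental profit; they earn a 20%
    credit instead, limited here to the profit actually made.
    """
    relievable = min(finance_costs, max(property_profit, Decimal("0")))
    return (relievable * BASIC_RATE).quantize(PENNY)
```

For example, an £80 net donation grosses up to £100, and £5,000 of mortgage interest against £12,000 of rental profit gives a £1,000 credit.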

* feat(examples): align ASSEMBLY.md SA102/SA105 SQL with the company-aware world model

After the world-model refactor (#358), employer becomes a company entity.
PAYE ref lives on the employed_by relationship metadata (one company can
pay one person via different PAYE schemes). Director status is the
director_of relationship between the user's $member and the company.
SA105 properties join through owned_by / co_owned_by relationships and
filter on the new use enum.

Also: HMRC identifiers (UTR, NI) live in entity_identities, not in
$member.metadata. Step 0 now resolves the user's $member + reads UTR
+ NI from entity_identities. Output header reads from these instead of
metadata. Adds residence_status to the header.

* fix(examples): split finance_costs from allowable_expenses in SA105 query

The note above the SA105 query said to surface finance costs separately,
but the SQL aggregated all expense_of rows into allowable_expenses with
no carve-out — so the agent had no source for the finance-costs column
and would either invent it or silently drop the basic-rate credit.

- Filter the existing expense subquery to exclude
  metadata.tax_category = 'finance'.
- Add a parallel subquery summing only finance-tagged expenses.
- Replace the prose-only SA105 output section with a concrete table
  row that includes the finance-costs column.
- Tighten the note to call out the mis-tagged-finance-row gap path.
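
The carve-out described in this fix can be sketched in Python as a single pass over the expense rows. The row shape (dicts with `amount` and `tax_category` keys) and the helper name are assumptions; the real change lives in the SA105 SQL, which is not shown on this page.

```python
from decimal import Decimal


def split_property_expenses(expenses):
    """Sum finance-tagged expenses separately from other allowable ones.

    Rows whose tax_category is "finance" feed the finance-costs column;
    every other row, including untagged ones, counts as an allowable expense.
    """
    totals = {"allowable_expenses": Decimal("0"), "finance_costs": Decimal("0")}
    for row in expenses:
        if row.get("tax_category") == "finance":
            totals["finance_costs"] += row["amount"]
        else:
            totals["allowable_expenses"] += row["amount"]
    return totals
```

A mortgage-interest row that was never tagged would land in `allowable_expenses` here, which is the mis-tagging gap the tightened note calls out.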
buremba added a commit that referenced this pull request Apr 26, 2026
* feat(examples): add statement ingestion playbook for personal-finance agent

New INGESTION.md covers the full flow when a user uploads a bank
statement, broker contract note, or payslip through WhatsApp:

  1. Fetch the signed downloadUrl via curl (through the gateway proxy).
  2. Extract text — pdftotext for PDFs, csvtk for CSVs, direct read
     for OFX/QIF.
  3. Apply the same extraction schema the gmail-tx watcher uses.
  4. Post-validate: opening + movements ≈ closing; all dates within
     the statement period; de-duplicate against existing transactions.
  5. Create entities with a parsed_from provenance link to a document
     entity.
  6. Surface gaps + delta mismatches as questions to the user rather
     than silently committing.
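
Step 4 and step 6 above can be sketched as one function that returns the movements worth keeping plus a list of questions for the user. The `Movement` shape, the one-penny tolerance and the `(date, amount, description)` de-duplication key are all assumptions; INGESTION.md is not reproduced on this page and may do this differently.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Movement:
    booked_on: date
    amount: Decimal  # positive = credit, negative = debit
    description: str


def validate_statement(opening, closing, period_start, period_end,
                       movements, existing_keys, tolerance=Decimal("0.01")):
    """Return (new_movements, questions) without committing anything."""
    questions = []
    total = sum((m.amount for m in movements), Decimal("0"))
    delta = opening + total - closing
    if abs(delta) > tolerance:
        questions.append(f"Opening + movements differs from closing by {delta}")
    for m in movements:
        if not period_start <= m.booked_on <= period_end:
            questions.append(
                f"{m.description} dated {m.booked_on} is outside the statement period"
            )
    # Skip rows that match a transaction already stored.
    fresh = [
        m for m in movements
        if (m.booked_on, m.amount, m.description) not in existing_keys
    ]
    return fresh, questions
```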

Declares `nix_packages = ["poppler_utils", "csvtk"]` in the agent's
lobu.toml so pdftotext and csvtk are available to the sandboxed worker.

SOUL.md now points at INGESTION.md instead of a made-up
parse_statement tool — same approach as the ASSEMBLY.md playbook for
SA100 assembly. The agent's general-purpose Bash + existing MCP tools
are enough; the value is in the prompt + Nix declaration.

* feat(examples): teach INGESTION.md the identity-namespace convention

When a document yields a UTR, NI number, PAYE ref, Companies House
number, or VAT number, write them to entity_identities (not metadata).
Adds a Step 5b table mapping each identifier kind to its destination,
plus a duplicate-check SQL pattern.

PAYE reference is the exception — it's per-employment, so it stays on
the employed_by relationship metadata.

Also updates Step 5 to set account ownership via owned_by / co_owned_by
(matching the world-model refactor in #358) and to link
income_source.employed_by → company for employment income.
buremba added 3 commits April 26, 2026 17:32
Promotes company to a first-class peer of $member and adds the
relationship vocabulary needed to model real users — most freelancers
and contractors are both an SA100 individual filer AND the director of
their own Ltd, with money flowing between the two.

Schema additions:

- company entity (covers Ltd, PLC, LLP, sole-trader, partnership, trust,
  charity, foreign — discriminate via company_type). Carries Companies
  House ref, accounting period, VAT scheme, PSC flag.
- Subject relationships: director_of, shareholder_of, employee_of,
  partner_in, spouse_of (symmetric), controls (PSC register), and
  accountant_for (for hired-accountant access later).
- Asset ownership: owned_by (account|holding|asset_lot|property →
  $member|company) and co_owned_by (joint accounts/properties with
  share_pct).
- transfer_pair: links the two legs of an internal transfer so neither
  side counts as taxable income or as an allowable expense.

Schema removals:

- employer, trade, trade_of: collapsed into company. A sole trader is
  now a company(company_type=sole_trader); their employer is a
  company; SA103 self-employment continues to flow through expense_of
  → company.

Schema updates:

- account: explicit "owner via owned_by/co_owned_by" rule. Adds
  business_current, business_savings, workplace_pension, loan to the
  wrapper enum. Adds iban for non-UK accounts.
- property: adds use enum (primary_residence | let | FHL |
  commercial_let | mixed_use | investment_held) — drives tax treatment
  more than physical type does. Adds purchase_date / purchase_cost for
  PRR computations on disposal. Drops joint_share_pct in favour of
  co_owned_by.
- employed_by, expense_of: descriptions updated to reference company.

Identity convention (taught in SOUL.md, not yet enforced in code):

- hmrc_utr (works for both $member and company), hmrc_ni_number,
  hmrc_paye_reference, companies_house_number, vat_number all live in
  entity_identities, NOT in metadata. Other durable personal-tax facts
  (DOB, student_loan_plan, domicile_status, marital_status) ride on
  save_knowledge events with semantic_type=identity.

This is the long-term world model. CT600/SA800/SA900/VAT support
slots in additively in v2: same entities, different assembly playbook.

Residence status can change from year to year (for example, when someone
moves into or out of the UK), so it is a property of the tax year, not of
the $member. It drives SA109 routing.

Adds residence_status enum (uk_resident | non_resident |
split_year_arriver | split_year_leaver | dual_resident) plus
arrival_date / departure_date for split-year cases.
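
A sketch of the per-tax-year residence record, for illustration. The enum values are the ones listed above; the `TaxYearResidence` class, its field layout and the rule that each split-year status needs its matching date are assumptions, not taken from the schema YAML.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class ResidenceStatus(str, Enum):
    UK_RESIDENT = "uk_resident"
    NON_RESIDENT = "non_resident"
    SPLIT_YEAR_ARRIVER = "split_year_arriver"
    SPLIT_YEAR_LEAVER = "split_year_leaver"
    DUAL_RESIDENT = "dual_resident"


@dataclass(frozen=True)
class TaxYearResidence:
    tax_year: str  # e.g. "2025-26"
    status: ResidenceStatus
    arrival_date: Optional[date] = None
    departure_date: Optional[date] = None

    def gaps(self) -> list:
        """List the date fields a split-year status still needs (assumed rule)."""
        missing = []
        if self.status is ResidenceStatus.SPLIT_YEAR_ARRIVER and self.arrival_date is None:
            missing.append("arrival_date")
        if self.status is ResidenceStatus.SPLIT_YEAR_LEAVER and self.departure_date is None:
            missing.append("departure_date")
        return missing
```

Keeping the status on the tax year rather than on the `$member` lets one person hold a different status in each year.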

- transfer_pair.yaml: drop the "or related subjects" wording that
  contradicted the SOUL.md teaching. Make explicit that cross-subject
  flows (Ltd → personal salary, etc.) are NOT internal transfers.
- SOUL.md: spell out the SA100 isolation invariant (filter on
  account.owner_type before aggregating) and the shareholding sanity
  check (sum shareholding_pct = 100, otherwise flag a gap).
- company.yaml: forbid hmrc_* / companies_house_number / vat_number in
  metadata via a `not` constraint, so accidental writes hit a schema
  error instead of silently shadowing the entity_identities row.
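
The shareholding sanity check from the SOUL.md bullet above could look roughly like this. The `shareholding_gap` helper and its input shape (a mapping from shareholder to `shareholding_pct`) are hypothetical; SOUL.md teaches the rule in prose, and the PR does not show code for it.

```python
from decimal import Decimal


def shareholding_gap(shareholdings):
    """Return a gap message when shareholding_pct does not sum to 100.

    Returns None when the known shareholders account for the whole company,
    so the caller can flag a gap instead of guessing the missing share.
    """
    total = sum(shareholdings.values(), Decimal("0"))
    if total == Decimal("100"):
        return None
    return f"shareholding_pct sums to {total}, expected 100"
```

A sole director holding 100% passes; two known holders at 60% and 30% would produce a gap message for the agent to raise with the user.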
buremba force-pushed the feat/world-model-companies branch from 81adb4a to 865a027 April 26, 2026 16:32
buremba merged commit 0df7e19 into main Apr 26, 2026
10 checks passed
buremba deleted the feat/world-model-companies branch April 26, 2026 16:33
github-actions bot added the triage:needs-human label (Triage agent escalated for human review) Apr 26, 2026
@github-actions

Triage decision: needs-human

Reasons:

  • PR size exceeds auto-merge threshold (424 lines, 19 files; limits 300 lines, 10 files)

Next: Manual review and merge required after final approval

buremba restored the feat/world-model-companies branch May 12, 2026 00:23