diff --git a/CHANGELOG.md b/CHANGELOG.md
index abdac25..d8611f5 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -6,6 +6,71 @@ and this project follows [Semantic Versioning](https://semver.org/spec/v2.0.0.ht
 
 ## [Unreleased]
 
+## [0.9.0] - 2026-05-14
+
+**Theme: policy artifact validate + test framework.** v0.9.0 ships the
+two CLI surfaces that turn the YAML / JSON policy from "a config file
+the pipeline loads" into a policy-as-code artifact that compliance
+teams can validate and test in CI, independently from the agent code
+it governs.
+
+### Added
+- **`vaara.policy.validate` module.** `validate(Policy)` returns a
+  `ValidationReport` with structured `PolicyIssue` records (level,
+  code, path, message). Semantic warnings emitted: empty
+  `action_classes`, threshold bands narrower than 0.05 (default and
+  per-class merged), threshold overrides targeting an action class
+  not declared, sequence-pattern steps that do not name a declared
+  action class, escalation routes whose `if_articles` overlap no
+  emitted regulatory tag, missing default escalation route.
+  `validate_source(source, fmt="auto")` combines load and check so a
+  single call yields `(policy, report)` or `(None,
+  report-with-error)`. Stable JSON shape via `ValidationReport.to_dict()`.
+- **`vaara.policy.test_cases` module — Conftest analog for Vaara
+  policies.** `evaluate(policy, action_class, risk_score,
+  matched_sequences=())` is the underlying primitive: applies any
+  matched sequence pattern boosts (capped at 1.0), resolves the
+  merged threshold for the action class, returns
+  `EvaluationResult(verdict, boosted_risk, route)`. `PolicyTestCase`
+  captures the inputs plus an expected verdict and (for
+  `escalate`) an expected operator route. `run_test_cases(policy,
+  cases)` runs the list, captures evaluation errors as failed cases
+  rather than raising, and returns `PolicyTestResult` rows. The
+  evaluator validates inputs at the boundary (risk score in `[0,1]`,
+  action class declared, matched sequences known).
+- **`vaara.policy.test_cases_io` module.** `load_test_cases(path)`
+  reads a YAML or JSON cases document and returns a list of
+  `PolicyTestCase`. Document shape mirrors typical OPA / Conftest
+  test files: a top-level `cases:` list with `action_class`,
+  `risk_score`, optional `matched_sequences`, and an `expect:` block
+  carrying `verdict` and optional `route`.
+- **`vaara policy validate POLICY_PATH [--json]`** and **`vaara
+  policy test POLICY_PATH --cases CASES_PATH [--json]`** CLI
+  subcommands. Both honour standard CI exit codes: validate returns
+  1 on parse errors (warnings do not flip it), test returns 1 on any
+  failed case and 2 if the policy or the cases file fails to load.
+- **`examples/policies/test_cases.yaml`** — six worked test cases
+  exercising thresholds, sequence-pattern boost, default and
+  article-matched escalation routes against
+  `examples/policies/full.yaml`.
+- **48 new tests** (`tests/test_policy_validate.py`,
+  `tests/test_policy_test_cases.py`,
+  `tests/test_policy_test_cases_io.py`, `tests/test_policy_cli.py`)
+  covering report shape, every warning code, evaluator edges
+  (threshold-equal-escalate, boost cap at 1.0, unknown action class,
+  unknown sequence, out-of-range risk), case-construction validation
+  (bad verdict, route without escalate), YAML and JSON case files,
+  the worked example end-to-end, and CLI smoke for each subcommand
+  including `--json`. Full suite 472 / 472 pass.
+- **COMPLIANCE.md** gains a *Policy artifact review* subsection under
+  Article 14 documenting both CLI surfaces as the path to reviewing
+  the policy artifact independently from the agent code.
+
+### Note
+Backwards-compatible. Pure addition. No existing module signatures
+change. `Policy` and the load path are unchanged; the new modules
+sit beside them under `vaara.policy.*`.
+
 ## [0.8.0] - 2026-05-14
 
 **Theme: Article 73 serious-incident export (interim).** Adds the export
diff --git a/COMPLIANCE.md b/COMPLIANCE.md
index e1fe67e..e786e5f 100644
--- a/COMPLIANCE.md
+++ b/COMPLIANCE.md
@@ -45,6 +45,33 @@ whether the model is confident. A conformal interval of [0.58, 0.62]
 versus [0.2, 0.95] tells them whether to trust the number. Vaara
 surfaces both on every escalation.
 
+### Policy artifact review
+
+The Vaara policy is a declarative YAML / JSON document loaded via
+`vaara.policy.from_yaml()` or `from_json()`. As of v0.9, two CLI
+surfaces let a compliance team review the policy artifact
+independently from the agent code that uses it:
+
+- `vaara policy validate POLICY_PATH` runs structured semantic checks
+  (parse errors plus warnings for narrow threshold bands, dangling
+  per-class overrides, unreachable escalation routes, sequence steps
+  not naming a declared action class, missing default escalation
+  route). Exit code 0 if no errors; warnings print without flipping
+  the exit code.
+- `vaara policy test POLICY_PATH --cases CASES_PATH` runs a YAML/JSON
+  cases file against the policy (Conftest analog for Vaara). Each
+  case names an action class, a risk score, any sequence patterns to
+  treat as matched, an expected verdict, and optionally a route. Exit
+  code 0 if every case passes, 1 if any fails, 2 on a load failure.
+
+Both commands carry a `--json` flag so CI pipelines can consume the
+output directly. The policy document and its cases file live in the
+deployer's source-control tree, version-controlled and diffable,
+alongside any other policy-as-code artifacts (Rego, Cedar, Casbin)
+used in the same governance stack. Worked example at
+`examples/policies/test_cases.yaml` exercises
+`examples/policies/full.yaml`.
+
 ### Article 26 (deployer obligations)
 
 Article 26 obligations sit on the deployer, not on Vaara. The evidence
diff --git a/README.md b/README.md
index f0f15f1..ad97d6e 100644
--- a/README.md
+++ b/README.md
@@ -62,6 +62,7 @@ else:
 - [Article 14 runtime: why oversight of agentic AI has to be evidenced as action, not model](https://futurium.ec.europa.eu/ga/apply-ai-alliance/community-content/article-14-runtime-why-oversight-agentic-ai-has-be-evidenced-action-not-model): why this exists. Posted on the EU Apply AI Alliance Futurium.
 - `src/vaara/integrations/`: LangChain, OpenAI Agents SDK, CrewAI, MCP server.
 - `src/vaara/audit/`: hash-chain trail, SQLite backend, append-only WAL.
+- `src/vaara/policy/`: declarative YAML / JSON policy schema with `vaara policy validate` (semantic checks) and `vaara policy test` (Conftest-style cases-file runner) for reviewing the policy artifact in CI independently from agent code.
 - `src/vaara/sandbox/`: synthetic-trace cold-start calibration.
 
 > Vaara helps deployers assemble evidence for their own conformity work. It does not certify compliance or constitute legal advice. Deployers own their obligations under the EU AI Act and other applicable law.
diff --git a/examples/policies/test_cases.yaml b/examples/policies/test_cases.yaml
new file mode 100644
index 0000000..ddaaf87
--- /dev/null
+++ b/examples/policies/test_cases.yaml
@@ -0,0 +1,58 @@
+# Worked example: test cases for `examples/policies/full.yaml`.
+# Run with:
+#     vaara policy test examples/policies/full.yaml --cases examples/policies/test_cases.yaml
+# Compliance teams can author and version-control these files independently
+# from agent code (EU AI Act Article 14 — independently-reviewable policy
+# artifact). Exit code is 0 iff every case passes.
+
+cases:
+  # Baseline allow — risk below the escalate threshold.
+  - name: fs.write_file low risk allows
+    action_class: fs.write_file
+    risk_score: 0.30
+    expect:
+      verdict: allow
+
+  # Default escalate threshold (0.55) reached, no per-class override on escalate.
+  # fs.write_file does NOT declare aiact:14 so escalation falls through to default.
+  - name: fs.write_file mid risk escalates to default route
+    action_class: fs.write_file
+    risk_score: 0.60
+    expect:
+      verdict: escalate
+      route: on_call
+
+  # Per-class override tightens deny to 0.75 (default is 0.85).
+  - name: fs.write_file deny via tighter override
+    action_class: fs.write_file
+    risk_score: 0.80
+    expect:
+      verdict: deny
+
+  # tx.sign override is stricter: escalate=0.40, deny=0.65.
+  # Its regulatory tags include aiact:14 so the first escalation route wins.
+  - name: tx.sign escalates to ai_oversight_team on aiact 14 match
+    action_class: tx.sign
+    risk_score: 0.50
+    expect:
+      verdict: escalate
+      route: ai_oversight_team
+
+  # Sequence boost pushes an at-threshold (escalate) score over deny:
+  #   0.40 (risk) + 0.30 (config_then_signal boost) = 0.70 >= 0.65 deny override.
+  - name: tx.sign deny after config_then_signal sequence boost
+    action_class: tx.sign
+    risk_score: 0.40
+    matched_sequences: [config_then_signal]
+    expect:
+      verdict: deny
+
+  # email.send uses default thresholds (escalate 0.55, deny 0.85).
+  # Its only regulatory tag is aiact:13 — no escalation route matches that,
+  # so escalations fall through to the default 'on_call' route.
+  - name: email.send escalates to on_call when no specific route matches
+    action_class: email.send
+    risk_score: 0.70
+    expect:
+      verdict: escalate
+      route: on_call
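Review note (not part of the patch): the arithmetic behind the six cases above can be checked standalone. The sketch below is a minimal stand-in, not Vaara's API. `verdict_for`, `DEFAULT`, `OVERRIDES` and `BOOSTS` are hypothetical names, and the threshold and boost values are copied from the comments in the cases file on the assumption that they match `examples/policies/full.yaml`, which this diff does not show. It mirrors only the comparison order visible in `evaluate()` (deny first, then escalate, both with `>=`) and ignores route selection.

```python
# Stand-in for the verdict comparison in evaluate(); not the real module.
# Values below are assumptions taken from the comments in test_cases.yaml.
DEFAULT = {"escalate": 0.55, "deny": 0.85}
OVERRIDES = {
    "fs.write_file": {"deny": 0.75},
    "tx.sign": {"escalate": 0.40, "deny": 0.65},
}
BOOSTS = {"config_then_signal": 0.30}


def verdict_for(action_class, risk_score, matched_sequences=()):
    """Merge per-class overrides onto the defaults, apply boosts, pick a verdict."""
    thr = {**DEFAULT, **OVERRIDES.get(action_class, {})}
    boosted = risk_score
    for name in matched_sequences:
        # Each matched pattern adds its boost; the sum is capped at 1.0.
        boosted = min(1.0, boosted + BOOSTS[name])
    if boosted >= thr["deny"]:
        return "deny"
    if boosted >= thr["escalate"]:
        return "escalate"
    return "allow"


CASES = [
    ("fs.write_file", 0.30, (), "allow"),
    ("fs.write_file", 0.60, (), "escalate"),
    ("fs.write_file", 0.80, (), "deny"),
    ("tx.sign", 0.50, (), "escalate"),
    ("tx.sign", 0.40, ("config_then_signal",), "deny"),
    ("email.send", 0.70, (), "escalate"),
]

for action_class, risk, seqs, expected in CASES:
    got = verdict_for(action_class, risk, seqs)
    print(f"{action_class} risk={risk} seqs={list(seqs)} -> {got} (expected {expected})")
```

Under these assumed values, `tx.sign` at 0.40 with no matched sequence already sits on the escalate threshold, so the boost in the fifth case moves it from escalate to deny rather than from allow to deny.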
diff --git a/pyproject.toml b/pyproject.toml
index ecf79ce..d7f95f9 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "vaara"
-version = "0.8.0"
+version = "0.9.0"
 description = "Adaptive AI Agent Execution Layer for risk scoring, audit trails, and regulatory compliance"
 requires-python = ">=3.10"
 license = "Apache-2.0"
diff --git a/src/vaara/__init__.py b/src/vaara/__init__.py
index 49219c7..89d7d4c 100644
--- a/src/vaara/__init__.py
+++ b/src/vaara/__init__.py
@@ -6,7 +6,7 @@
 oversight.
 """
 
-__version__ = "0.8.0"
+__version__ = "0.9.0"
 
 from vaara.pipeline import InterceptionPipeline, InterceptionResult
 
diff --git a/src/vaara/cli.py b/src/vaara/cli.py
index 825ff63..cd1e185 100644
--- a/src/vaara/cli.py
+++ b/src/vaara/cli.py
@@ -45,6 +45,19 @@
         Mark pending items older than ``timeout-seconds`` as expired.
         Claimed items are left alone.
 
+    vaara policy validate POLICY_PATH [--json]
+        Load a YAML/JSON policy and run semantic checks. Exit 0 if no
+        errors. Warnings (narrow threshold bands, dangling overrides,
+        unreachable escalation routes, missing default route, sequence
+        steps not naming a declared action class) print without
+        flipping the exit code.
+
+    vaara policy test POLICY_PATH --cases CASES_PATH [--json]
+        Run a YAML/JSON cases file against a policy (Conftest analog).
+        Each case names an action_class, a risk_score, optional
+        matched_sequences, and an expected verdict / route. Exit 0 if
+        every case passes, 1 if any fails, 2 on a load failure.
+
     vaara version
         Print the installed Vaara version.
 
@@ -465,6 +478,78 @@ def _cmd_review_expire(args: argparse.Namespace) -> int:
     return 0
 
 
+def _render_validation_report(report, *, source_label: str, as_json: bool) -> str:
+    if as_json:
+        return json.dumps(report.to_dict(), indent=2)
+    if not report.issues:
+        return f"{source_label}: ok (no issues)"
+    lines = [
+        f"  [{i.level.value}] {i.code}"
+        f"{(' at ' + i.path) if i.path else ''}: {i.message}"
+        for i in report.issues
+    ]
+    header = (
+        f"{source_label}: {len(report.errors)} error(s), "
+        f"{len(report.warnings)} warning(s)"
+    )
+    return header + "\n" + "\n".join(lines)
+
+
+def _cmd_policy_validate(args: argparse.Namespace) -> int:
+    from vaara.policy.validate import validate_source
+
+    policy_path = Path(args.policy).expanduser()
+    _policy, report = validate_source(policy_path)
+    print(_render_validation_report(
+        report, source_label=str(policy_path), as_json=args.json,
+    ))
+    return 0 if report.ok else 1
+
+
+def _render_test_results(results, *, as_json: bool) -> str:
+    if as_json:
+        return json.dumps({
+            "total": len(results),
+            "passed": sum(1 for r in results if r.passed),
+            "failed": sum(1 for r in results if not r.passed),
+            "results": [r.to_dict() for r in results],
+        }, indent=2)
+    lines = []
+    for r in results:
+        mark = "PASS" if r.passed else "FAIL"
+        suffix = "" if r.passed else f" — {r.diagnostic}"
+        lines.append(f"  [{mark}] {r.case.name}{suffix}")
+    failed = sum(1 for r in results if not r.passed)
+    header = f"{len(results)} case(s), {len(results) - failed} passed, {failed} failed"
+    return header + "\n" + "\n".join(lines)
+
+
+def _cmd_policy_test(args: argparse.Namespace) -> int:
+    from vaara.policy.test_cases import run_test_cases
+    from vaara.policy.test_cases_io import load_test_cases
+    from vaara.policy.validate import validate_source
+
+    policy_path = Path(args.policy).expanduser()
+    cases_path = Path(args.cases).expanduser()
+
+    policy, report = validate_source(policy_path)
+    if policy is None:
+        print(_render_validation_report(
+            report, source_label=str(policy_path), as_json=args.json,
+        ), file=sys.stderr)
+        return 2
+
+    try:
+        cases = load_test_cases(cases_path)
+    except Exception as e:
+        print(f"failed to load cases from {cases_path}: {e}", file=sys.stderr)
+        return 2
+
+    results = run_test_cases(policy, cases)
+    print(_render_test_results(results, as_json=args.json))
+    return 0 if all(r.passed for r in results) else 1
+
+
 def build_parser() -> argparse.ArgumentParser:
     p = argparse.ArgumentParser(prog="vaara", description="Vaara AI Agent Execution Layer")
     sub = p.add_subparsers(dest="cmd", required=True)
@@ -636,6 +721,38 @@ def build_parser() -> argparse.ArgumentParser:
     )
     re_.set_defaults(func=_cmd_review_expire)
 
+    pp_policy = sub.add_parser(
+        "policy",
+        help="Policy artifact commands (validate, test)",
+    )
+    psub = pp_policy.add_subparsers(dest="policy_cmd", required=True)
+
+    pvalid = psub.add_parser(
+        "validate",
+        help="Load a policy and report parse errors plus semantic warnings",
+    )
+    pvalid.add_argument("policy", help="Path to a YAML or JSON policy file")
+    pvalid.add_argument(
+        "--json", action="store_true",
+        help="Emit the report as JSON (stable shape for CI)",
+    )
+    pvalid.set_defaults(func=_cmd_policy_validate)
+
+    ptest = psub.add_parser(
+        "test",
+        help="Run a YAML/JSON cases file against a policy (Conftest analog)",
+    )
+    ptest.add_argument("policy", help="Path to a YAML or JSON policy file")
+    ptest.add_argument(
+        "--cases", required=True,
+        help="Path to a YAML or JSON file containing a 'cases:' list",
+    )
+    ptest.add_argument(
+        "--json", action="store_true",
+        help="Emit results as JSON (stable shape for CI)",
+    )
+    ptest.set_defaults(func=_cmd_policy_test)
+
     return p
 
 
diff --git a/src/vaara/policy/__init__.py b/src/vaara/policy/__init__.py
index ebcb746..e0ee8e0 100644
--- a/src/vaara/policy/__init__.py
+++ b/src/vaara/policy/__init__.py
@@ -12,6 +12,16 @@
 The companion JSON Schema document at `docs/policy_schema.json` is the citable
 spec for compliance-evidence purposes. Hand-rolled validation in `loader.py`
 mirrors the schema, with clean error paths for human readers.
+
+Beyond load and parse, two surfaces support reviewing the policy artifact
+independently from agent code:
+
+- ``validate`` / ``validate_source`` (in ``vaara.policy.validate``) returns a
+  structured report with parse errors and semantic warnings — usable in CI.
+- ``evaluate`` + ``run_test_cases`` (in ``vaara.policy.test_cases``) let a
+  team write synthetic action contexts and assert expected verdicts against a
+  policy, Conftest-style. YAML/JSON case files load via
+  ``load_test_cases`` (in ``vaara.policy.test_cases_io``).
 """
 
 from vaara.policy.schema import (
@@ -24,16 +34,43 @@
     Thresholds,
 )
 from vaara.policy.loader import from_dict, from_json, from_yaml
+from vaara.policy.validate import (
+    IssueLevel,
+    PolicyIssue,
+    ValidationReport,
+    validate,
+    validate_source,
+)
+from vaara.policy.test_cases import (
+    EvaluationResult,
+    PolicyTestCase,
+    PolicyTestResult,
+    evaluate,
+    run_test_cases,
+)
+from vaara.policy.test_cases_io import load_test_cases, parse_cases
 
 __all__ = [
     "SCHEMA_VERSION",
     "ActionClassDef",
     "EscalationRoute",
+    "EvaluationResult",
+    "IssueLevel",
     "Policy",
     "PolicyError",
+    "PolicyIssue",
+    "PolicyTestCase",
+    "PolicyTestResult",
     "SequencePattern",
     "Thresholds",
+    "ValidationReport",
+    "evaluate",
     "from_dict",
     "from_json",
     "from_yaml",
+    "load_test_cases",
+    "parse_cases",
+    "run_test_cases",
+    "validate",
+    "validate_source",
 ]
diff --git a/src/vaara/policy/test_cases.py b/src/vaara/policy/test_cases.py
new file mode 100644
index 0000000..e54194b
--- /dev/null
+++ b/src/vaara/policy/test_cases.py
@@ -0,0 +1,158 @@
+"""Policy test framework — Conftest analog for Vaara YAML/JSON policies.
+
+A test case is a synthetic action context plus an expected verdict.
+`run_test_cases(policy, cases)` evaluates each case against the policy
+and returns pass/fail results. `evaluate(policy, action_class,
+risk_score, matched_sequences=())` is the underlying primitive: given a
+risk score and any matched sequence patterns, what does the policy say
+the verdict and (if escalating) the route should be.
+
+YAML/JSON case-file loading lives in ``vaara.policy.test_cases_io`` —
+this module stays a pure data + evaluation layer.
+"""
+
+from __future__ import annotations
+
+from dataclasses import dataclass, field
+from typing import Iterable
+
+from vaara.policy.schema import Policy
+
+
+_VALID_VERDICTS = frozenset({"allow", "escalate", "deny"})
+
+
+@dataclass(frozen=True)
+class EvaluationResult:
+    verdict: str
+    boosted_risk: float
+    route: str | None
+
+
+def evaluate(
+    policy: Policy,
+    action_class: str,
+    risk_score: float,
+    matched_sequences: Iterable[str] = (),
+) -> EvaluationResult:
+    if not 0.0 <= risk_score <= 1.0:
+        raise ValueError(f"risk_score must be in [0,1], got {risk_score}")
+    if action_class not in policy.action_classes:
+        raise ValueError(f"action_class {action_class!r} not declared in policy")
+
+    matched = set(matched_sequences)
+    known = {s.name for s in policy.sequences}
+    unknown = matched - known
+    if unknown:
+        raise ValueError(
+            f"matched_sequences references unknown pattern(s): {sorted(unknown)!r}"
+        )
+
+    matched_seq_articles: set[str] = set()
+    boosted = float(risk_score)
+    for seq in policy.sequences:
+        if seq.name in matched:
+            boosted = min(1.0, boosted + seq.risk_boost)
+            matched_seq_articles.update(seq.regulatory)
+
+    thr = policy.threshold_for(action_class)
+    if boosted >= thr.deny:
+        verdict = "deny"
+    elif boosted >= thr.escalate:
+        verdict = "escalate"
+    else:
+        verdict = "allow"
+
+    route: str | None = None
+    if verdict == "escalate":
+        articles = set(policy.action_classes[action_class].regulatory)
+        articles.update(matched_seq_articles)
+        route = policy.escalation_route_for(articles)
+
+    return EvaluationResult(verdict=verdict, boosted_risk=boosted, route=route)
+
+
+@dataclass(frozen=True)
+class PolicyTestCase:
+    name: str
+    action_class: str
+    risk_score: float
+    matched_sequences: tuple[str, ...] = field(default_factory=tuple)
+    expected_verdict: str = "allow"
+    expected_route: str | None = None
+
+    def __post_init__(self) -> None:
+        if self.expected_verdict not in _VALID_VERDICTS:
+            raise ValueError(
+                f"case {self.name!r}: expected_verdict must be one of "
+                f"{sorted(_VALID_VERDICTS)}, got {self.expected_verdict!r}"
+            )
+        if self.expected_route is not None and self.expected_verdict != "escalate":
+            raise ValueError(
+                f"case {self.name!r}: expected_route only meaningful for "
+                f"expected_verdict='escalate'"
+            )
+
+
+@dataclass(frozen=True)
+class PolicyTestResult:
+    case: PolicyTestCase
+    passed: bool
+    actual: EvaluationResult | None
+    diagnostic: str
+
+    def to_dict(self) -> dict:
+        d: dict = {
+            "name": self.case.name,
+            "passed": self.passed,
+            "diagnostic": self.diagnostic,
+        }
+        if self.actual is not None:
+            d["actual"] = {
+                "verdict": self.actual.verdict,
+                "boosted_risk": self.actual.boosted_risk,
+                "route": self.actual.route,
+            }
+        return d
+
+
+def run_test_cases(
+    policy: Policy, cases: Iterable[PolicyTestCase],
+) -> list[PolicyTestResult]:
+    results: list[PolicyTestResult] = []
+    for case in cases:
+        try:
+            actual = evaluate(
+                policy, case.action_class, case.risk_score, case.matched_sequences,
+            )
+        except (ValueError, KeyError) as e:
+            results.append(PolicyTestResult(
+                case=case, passed=False, actual=None,
+                diagnostic=f"evaluation error: {e}",
+            ))
+            continue
+
+        verdict_ok = actual.verdict == case.expected_verdict
+        route_ok = case.expected_route is None or actual.route == case.expected_route
+        passed = verdict_ok and route_ok
+
+        if passed:
+            diag = "ok"
+        else:
+            parts = []
+            if not verdict_ok:
+                parts.append(
+                    f"verdict: expected {case.expected_verdict!r}, "
+                    f"got {actual.verdict!r}"
+                )
+            if not route_ok:
+                parts.append(
+                    f"route: expected {case.expected_route!r}, "
+                    f"got {actual.route!r}"
+                )
+            diag = "; ".join(parts)
+
+        results.append(PolicyTestResult(
+            case=case, passed=passed, actual=actual, diagnostic=diag,
+        ))
+    return results
diff --git a/src/vaara/policy/test_cases_io.py b/src/vaara/policy/test_cases_io.py
new file mode 100644
index 0000000..a14d5a8
--- /dev/null
+++ b/src/vaara/policy/test_cases_io.py
@@ -0,0 +1,89 @@
+"""YAML / JSON loaders for policy test-case files.
+
+Cases document shape::
+
+    cases:
+      - name: baseline allow
+        action_class: fs.write_file
+        risk_score: 0.3
+        matched_sequences: []       # optional
+        expect:
+          verdict: allow
+          route: ai_oversight_team  # optional, only for verdict=escalate
+"""
+
+from __future__ import annotations
+
+import json
+from pathlib import Path
+from typing import Union
+
+from vaara.policy.schema import PolicyError
+from vaara.policy.test_cases import PolicyTestCase
+
+
+def parse_cases(data: object) -> list[PolicyTestCase]:
+    if not isinstance(data, dict):
+        raise PolicyError(
+            f"cases document must be a mapping, got {type(data).__name__}"
+        )
+    raw = data.get("cases")
+    if not isinstance(raw, list):
+        raise PolicyError("cases document must have a 'cases:' list")
+
+    out: list[PolicyTestCase] = []
+    for i, entry in enumerate(raw):
+        if not isinstance(entry, dict):
+            raise PolicyError(f"cases[{i}]: must be a mapping")
+        name = entry.get("name") or f"case_{i}"
+        try:
+            action_class = entry["action_class"]
+            risk_score = float(entry["risk_score"])
+        except (KeyError, TypeError, ValueError) as e:
+            raise PolicyError(
+                f"cases[{i}] ({name}): missing required field or bad value: {e}"
+            ) from None
+        matched = tuple(entry.get("matched_sequences") or ())
+        expect = entry.get("expect") or {}
+        if not isinstance(expect, dict):
+            raise PolicyError(f"cases[{i}] ({name}): 'expect' must be a mapping")
+        try:
+            out.append(PolicyTestCase(
+                name=name,
+                action_class=action_class,
+                risk_score=risk_score,
+                matched_sequences=matched,
+                expected_verdict=expect.get("verdict", "allow"),
+                expected_route=expect.get("route"),
+            ))
+        except ValueError as e:
+            raise PolicyError(f"cases[{i}]: {e}") from None
+    return out
+
+
+def load_test_cases(source: Union[str, Path]) -> list[PolicyTestCase]:
+    if isinstance(source, str) and "\n" in source:
+        return parse_cases(_parse_text(source))
+    path = Path(source) if not isinstance(source, Path) else source
+    text = path.read_text(encoding="utf-8")
+    prefer_yaml = path.suffix.lower() in {".yaml", ".yml"}
+    return parse_cases(_parse_text(text, prefer_yaml=prefer_yaml))
+
+
+def _parse_text(text: str, *, prefer_yaml: bool = False) -> object:
+    if not prefer_yaml and text.lstrip().startswith("{"):
+        try:
+            return json.loads(text)
+        except json.JSONDecodeError as e:
+            raise PolicyError(f"invalid JSON in cases document: {e}") from None
+    try:
+        import yaml  # type: ignore[import-untyped]
+    except ImportError as e:
+        raise ImportError(
+            "loading a YAML cases document requires the [yaml] extra. "
+            "Install with: pip install 'vaara[yaml]'"
+        ) from e
+    try:
+        return yaml.safe_load(text)
+    except yaml.YAMLError as e:
+        raise PolicyError(f"invalid YAML in cases document: {e}") from None
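Review note (not part of the patch): the module docstring shows the cases document in YAML only, so here is a rough JSON equivalent together with the defaults `parse_cases` applies. The sketch uses only the standard library and does not import Vaara. `normalise` is a hypothetical helper that mirrors the field handling visible in this diff: a missing `name` becomes `case_<index>`, a missing `expect` block defaults the verdict to `allow`, and `matched_sequences` defaults to an empty tuple. The case contents are invented.

```python
import json

# JSON form of the documented cases shape; the second case omits name and expect.
CASES_JSON = """
{"cases": [
  {"name": "tx.sign escalates", "action_class": "tx.sign", "risk_score": 0.5,
   "expect": {"verdict": "escalate", "route": "ai_oversight_team"}},
  {"action_class": "fs.write_file", "risk_score": 0.3}
]}
"""


def normalise(entry: dict, index: int) -> dict:
    """Apply the same defaults parse_cases applies, returning a plain dict."""
    expect = entry.get("expect") or {}
    return {
        "name": entry.get("name") or f"case_{index}",
        "action_class": entry["action_class"],
        "risk_score": float(entry["risk_score"]),
        "matched_sequences": tuple(entry.get("matched_sequences") or ()),
        "expected_verdict": expect.get("verdict", "allow"),
        "expected_route": expect.get("route"),
    }


normalised = [
    normalise(entry, i)
    for i, entry in enumerate(json.loads(CASES_JSON)["cases"])
]

for case in normalised:
    print(case["name"], "->", case["expected_verdict"], case["expected_route"])
```

Because an absent `expect` block silently means "expect allow", a cases file author may prefer to always spell the verdict out so a typo in the key name does not turn into a passing test.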
diff --git a/src/vaara/policy/validate.py b/src/vaara/policy/validate.py
new file mode 100644
index 0000000..0fa503b
--- /dev/null
+++ b/src/vaara/policy/validate.py
@@ -0,0 +1,190 @@
+"""Structured policy validation.
+
+`from_dict` / `from_json` / `from_yaml` raise PolicyError on the first
+parse-time failure. That is right for the load path but coarse for a
+CI / compliance workflow that wants every issue at once, surfaces
+non-blocking warnings, and routes output to JSON. `validate(policy)`
+runs semantic checks and returns a ValidationReport. `validate_source`
+combines load and check so a single call yields (policy, report) or
+(None, report-with-error).
+
+Output rendering (text / JSON) lives in the CLI, not here, so the
+core module stays a pure data layer.
+"""
+
+from __future__ import annotations
+
+from dataclasses import asdict, dataclass, field
+from enum import Enum
+from pathlib import Path
+from typing import Union
+
+from vaara.policy.loader import from_dict, from_json, from_yaml
+from vaara.policy.schema import Policy, PolicyError
+
+
+class IssueLevel(str, Enum):
+    ERROR = "error"
+    WARNING = "warning"
+
+
+@dataclass(frozen=True)
+class PolicyIssue:
+    level: IssueLevel
+    code: str
+    path: str
+    message: str
+
+    def to_dict(self) -> dict:
+        d = asdict(self)
+        d["level"] = self.level.value
+        return d
+
+
+@dataclass(frozen=True)
+class ValidationReport:
+    issues: tuple[PolicyIssue, ...] = field(default_factory=tuple)
+
+    @property
+    def errors(self) -> tuple[PolicyIssue, ...]:
+        return tuple(i for i in self.issues if i.level is IssueLevel.ERROR)
+
+    @property
+    def warnings(self) -> tuple[PolicyIssue, ...]:
+        return tuple(i for i in self.issues if i.level is IssueLevel.WARNING)
+
+    @property
+    def ok(self) -> bool:
+        return not self.errors
+
+    def to_dict(self) -> dict:
+        return {
+            "ok": self.ok,
+            "error_count": len(self.errors),
+            "warning_count": len(self.warnings),
+            "issues": [i.to_dict() for i in self.issues],
+        }
+
+
+_MIN_THRESHOLD_GAP = 0.05
+
+
+def validate(policy: Policy) -> ValidationReport:
+    issues: list[PolicyIssue] = []
+    action_class_names = set(policy.action_classes)
+
+    if not action_class_names:
+        issues.append(PolicyIssue(
+            IssueLevel.WARNING, "no_action_classes", "action_classes",
+            "policy declares no action classes — pipeline cannot route any tool",
+        ))
+
+    default = policy.thresholds_default
+    if default.deny - default.escalate < _MIN_THRESHOLD_GAP:
+        issues.append(PolicyIssue(
+            IssueLevel.WARNING, "narrow_threshold_band", "thresholds.default",
+            f"escalate={default.escalate} and deny={default.deny} differ by less "
+            f"than {_MIN_THRESHOLD_GAP} — operators get a narrow review band",
+        ))
+
+    for name in policy.thresholds_overrides:
+        path = f"thresholds.{name}"
+        if name not in action_class_names:
+            issues.append(PolicyIssue(
+                IssueLevel.WARNING, "threshold_override_dangling", path,
+                f"threshold override targets action class {name!r} that is not "
+                f"declared in action_classes — override will never fire",
+            ))
+        merged = policy.threshold_for(name)
+        if merged.deny - merged.escalate < _MIN_THRESHOLD_GAP:
+            issues.append(PolicyIssue(
+                IssueLevel.WARNING, "narrow_threshold_band", path,
+                f"merged escalate={merged.escalate} and deny={merged.deny} differ "
+                f"by less than {_MIN_THRESHOLD_GAP} for {name!r}",
+            ))
+
+    for seq in policy.sequences:
+        for i, step in enumerate(seq.pattern):
+            if step not in action_class_names:
+                issues.append(PolicyIssue(
+                    IssueLevel.WARNING, "sequence_step_unknown_class",
+                    f"sequences.{seq.name}.pattern[{i}]",
+                    f"sequence step {step!r} does not name a declared action class "
+                    f"— if this is a deployer-side tool name, ignore",
+                ))
+
+    emitted: set[str] = set()
+    for ac in policy.action_classes.values():
+        emitted.update(ac.regulatory)
+    for seq in policy.sequences:
+        emitted.update(seq.regulatory)
+
+    has_default_route = False
+    for i, route in enumerate(policy.escalation_routes):
+        path = f"escalation.routes[{i}]"
+        if not route.if_articles:
+            has_default_route = True
+            continue
+        if not emitted.intersection(route.if_articles):
+            issues.append(PolicyIssue(
+                IssueLevel.WARNING, "escalation_route_unreachable", path,
+                f"route to {route.operator_group!r} fires on "
+                f"{list(route.if_articles)!r} but no action class or sequence "
+                f"emits any of those articles",
+            ))
+
+    if policy.escalation_routes and not has_default_route:
+        issues.append(PolicyIssue(
+            IssueLevel.WARNING, "no_default_escalation_route", "escalation.routes",
+            "no fallback route (route with empty `if`) declared — unmatched "
+            "escalations fall through to the implicit 'on_call' group",
+        ))
+
+    return ValidationReport(issues=tuple(issues))
+
+
+def validate_source(
+    source: Union[str, Path, dict], *, fmt: str = "auto",
+) -> tuple[Policy | None, ValidationReport]:
+    try:
+        policy = _load(source, fmt)
+    except PolicyError as e:
+        return None, ValidationReport(issues=(PolicyIssue(
+            IssueLevel.ERROR, "parse_error", "", str(e),
+        ),))
+    except ImportError as e:
+        return None, ValidationReport(issues=(PolicyIssue(
+            IssueLevel.ERROR, "missing_extra", "", str(e),
+        ),))
+    return policy, validate(policy)
+
+
+def _load(source: Union[str, Path, dict], fmt: str) -> Policy:
+    if isinstance(source, dict):
+        return from_dict(source)
+    if fmt == "auto":
+        fmt = _sniff_format(source)
+    if fmt == "json":
+        return from_json(source)
+    if fmt == "yaml":
+        return from_yaml(source)
+    raise PolicyError(f"unknown format {fmt!r} (expected 'json' or 'yaml')")
+
+
+def _sniff_format(source: Union[str, Path]) -> str:
+    path: Path | None = None
+    if isinstance(source, Path):
+        path = source
+    elif isinstance(source, str) and "\n" not in source:
+        candidate = Path(source)
+        if not source.lstrip().startswith("{") and candidate.is_file():
+            path = candidate
+    if path is not None:
+        suffix = path.suffix.lower()
+        if suffix in {".yaml", ".yml"}:
+            return "yaml"
+        if suffix == ".json":
+            return "json"
+    if isinstance(source, str) and source.lstrip().startswith("{"):
+        return "json"
+    return "yaml" if isinstance(source, str) else "json"
diff --git a/tests/test_policy_cli.py b/tests/test_policy_cli.py
new file mode 100644
index 0000000..8ba5b07
--- /dev/null
+++ b/tests/test_policy_cli.py
@@ -0,0 +1,152 @@
+"""End-to-end CLI tests for `vaara policy validate` and `vaara policy test`."""
+
+from __future__ import annotations
+
+import json
+from pathlib import Path
+
+import pytest
+
+from vaara.cli import main
+
+
+@pytest.fixture
+def policy_file(tmp_path: Path) -> Path:
+    pytest.importorskip("yaml")
+    path = tmp_path / "policy.yaml"
+    path.write_text(
+        "version: '0.1'\n"
+        "domains: [eu_ai_act]\n"
+        "action_classes:\n"
+        "  fs.write_file:\n"
+        "    category: data\n"
+        "    reversibility: partially_reversible\n"
+        "    blast_radius: local\n"
+        "    urgency: timely\n"
+        "    regulatory: [aiact:9]\n"
+        "thresholds: {default: {escalate: 0.55, deny: 0.85}}\n"
+        "sequences: {}\n"
+        "escalation: {routes: [{default: on_call}]}\n",
+        encoding="utf-8",
+    )
+    return path
+
+
+@pytest.fixture
+def cases_file(tmp_path: Path) -> Path:
+    pytest.importorskip("yaml")
+    path = tmp_path / "cases.yaml"
+    path.write_text(
+        "cases:\n"
+        "  - name: allow_low\n"
+        "    action_class: fs.write_file\n"
+        "    risk_score: 0.3\n"
+        "    expect: {verdict: allow}\n"
+        "  - name: escalate_mid\n"
+        "    action_class: fs.write_file\n"
+        "    risk_score: 0.7\n"
+        "    expect: {verdict: escalate, route: on_call}\n",
+        encoding="utf-8",
+    )
+    return path
+
+
+class TestValidateCLI:
+    def test_clean_policy_exit_zero(
+        self, policy_file: Path, capsys: pytest.CaptureFixture[str],
+    ) -> None:
+        rc = main(["policy", "validate", str(policy_file)])
+        captured = capsys.readouterr()
+        assert rc == 0
+        assert "ok" in captured.out or "0 error" in captured.out
+
+    def test_parse_error_exit_one(
+        self, tmp_path: Path, capsys: pytest.CaptureFixture[str],
+    ) -> None:
+        bad = tmp_path / "bad.json"
+        bad.write_text('{"version": "9.9"}', encoding="utf-8")
+        rc = main(["policy", "validate", str(bad)])
+        captured = capsys.readouterr()
+        assert rc == 1
+        assert "error" in captured.out.lower()
+
+    def test_json_output_shape(
+        self, policy_file: Path, capsys: pytest.CaptureFixture[str],
+    ) -> None:
+        rc = main(["policy", "validate", str(policy_file), "--json"])
+        captured = capsys.readouterr()
+        assert rc == 0
+        payload = json.loads(captured.out)
+        assert payload["ok"] is True
+        assert "error_count" in payload
+        assert "warning_count" in payload
+        assert isinstance(payload["issues"], list)
+
+
+class TestTestCLI:
+    def test_all_pass_exit_zero(
+        self,
+        policy_file: Path,
+        cases_file: Path,
+        capsys: pytest.CaptureFixture[str],
+    ) -> None:
+        rc = main([
+            "policy", "test", str(policy_file), "--cases", str(cases_file),
+        ])
+        captured = capsys.readouterr()
+        assert rc == 0
+        assert "2 passed" in captured.out
+
+    def test_failure_exit_one(
+        self,
+        policy_file: Path,
+        tmp_path: Path,
+        capsys: pytest.CaptureFixture[str],
+    ) -> None:
+        pytest.importorskip("yaml")
+        bad_cases = tmp_path / "bad_cases.yaml"
+        bad_cases.write_text(
+            "cases:\n"
+            "  - name: wrong\n"
+            "    action_class: fs.write_file\n"
+            "    risk_score: 0.3\n"
+            "    expect: {verdict: deny}\n",
+            encoding="utf-8",
+        )
+        rc = main([
+            "policy", "test", str(policy_file), "--cases", str(bad_cases),
+        ])
+        captured = capsys.readouterr()
+        assert rc == 1
+        assert "FAIL" in captured.out
+
+    def test_policy_parse_error_exits_two(
+        self,
+        tmp_path: Path,
+        cases_file: Path,
+        capsys: pytest.CaptureFixture[str],
+    ) -> None:
+        bad = tmp_path / "bad.json"
+        bad.write_text('{"version": "9.9"}', encoding="utf-8")
+        rc = main([
+            "policy", "test", str(bad), "--cases", str(cases_file),
+        ])
+        assert rc == 2
+
+    def test_json_output_shape(
+        self,
+        policy_file: Path,
+        cases_file: Path,
+        capsys: pytest.CaptureFixture[str],
+    ) -> None:
+        rc = main([
+            "policy", "test", str(policy_file),
+            "--cases", str(cases_file), "--json",
+        ])
+        captured = capsys.readouterr()
+        assert rc == 0
+        payload = json.loads(captured.out)
+        assert payload["total"] == 2
+        assert payload["passed"] == 2
+        assert payload["failed"] == 0
+        assert len(payload["results"]) == 2
diff --git a/tests/test_policy_test_cases.py b/tests/test_policy_test_cases.py
new file mode 100644
index 0000000..e4fe0cc
--- /dev/null
+++ b/tests/test_policy_test_cases.py
@@ -0,0 +1,158 @@
+"""Tests for vaara.policy.test_cases — evaluator + Conftest-analog test runner."""
+
+from __future__ import annotations
+
+import pytest
+
+from vaara.policy import SCHEMA_VERSION, from_dict
+from vaara.policy.test_cases import (
+    EvaluationResult,
+    PolicyTestCase,
+    evaluate,
+    run_test_cases,
+)
+
+
+def _policy() -> dict:
+    return {
+        "version": SCHEMA_VERSION,
+        "domains": ["eu_ai_act", "dora"],
+        "action_classes": {
+            "fs.write_file": {
+                "category": "data", "reversibility": "partially_reversible",
+                "blast_radius": "local", "urgency": "timely",
+                "regulatory": ["aiact:9", "aiact:12"],
+            },
+            "tx.sign": {
+                "category": "financial", "reversibility": "irreversible",
+                "blast_radius": "shared", "urgency": "irrevocable",
+                "regulatory": ["aiact:14", "dora:10"],
+            },
+        },
+        "thresholds": {
+            "default": {"escalate": 0.55, "deny": 0.85},
+            "fs.write_file": {"deny": 0.75},
+            "tx.sign": {"escalate": 0.40, "deny": 0.65},
+        },
+        "sequences": {
+            "config_then_signal": {
+                "pattern": ["config.write", "tx.sign"],
+                "risk_boost": 0.3, "window_seconds": 60,
+                "regulatory": ["aiact:14"],
+            },
+        },
+        "escalation": {
+            "routes": [
+                {"if": ["aiact:14"], "operator_group": "ai_oversight_team"},
+                {"if": ["dora:10"], "operator_group": "ict_risk_team"},
+                {"default": "on_call"},
+            ],
+        },
+    }
+
+
+class TestEvaluate:
+    def test_allow_below_escalate(self) -> None:
+        p = from_dict(_policy())
+        r = evaluate(p, "fs.write_file", 0.30)
+        assert r == EvaluationResult(verdict="allow", boosted_risk=0.30, route=None)
+
+    def test_escalate_at_threshold(self) -> None:
+        p = from_dict(_policy())
+        r = evaluate(p, "fs.write_file", 0.55)
+        assert r.verdict == "escalate"
+        assert r.route == "on_call"
+
+    def test_deny_via_per_class_override(self) -> None:
+        p = from_dict(_policy())
+        r = evaluate(p, "fs.write_file", 0.80)
+        assert r.verdict == "deny"
+
+    def test_tx_sign_escalate_routes_aiact14(self) -> None:
+        p = from_dict(_policy())
+        r = evaluate(p, "tx.sign", 0.50)
+        assert r.verdict == "escalate"
+        assert r.route == "ai_oversight_team"
+
+    def test_sequence_boost_lifts_to_deny(self) -> None:
+        p = from_dict(_policy())
+        r = evaluate(p, "tx.sign", 0.40, ["config_then_signal"])
+        assert r.verdict == "deny"
+        assert r.boosted_risk == pytest.approx(0.70)
+
+    def test_unknown_action_class_raises(self) -> None:
+        p = from_dict(_policy())
+        with pytest.raises(ValueError, match="not declared"):
+            evaluate(p, "nope", 0.5)
+
+    def test_unknown_sequence_raises(self) -> None:
+        p = from_dict(_policy())
+        with pytest.raises(ValueError, match="unknown pattern"):
+            evaluate(p, "tx.sign", 0.4, ["ghost_pattern"])
+
+    def test_out_of_range_risk_raises(self) -> None:
+        p = from_dict(_policy())
+        with pytest.raises(ValueError, match="risk_score"):
+            evaluate(p, "tx.sign", 1.5)
+
+    def test_boosted_risk_caps_at_one(self) -> None:
+        p = from_dict(_policy())
+        r = evaluate(p, "tx.sign", 0.90, ["config_then_signal"])
+        assert r.boosted_risk == 1.0
+        assert r.verdict == "deny"
+
+
+class TestRunCases:
+    def test_all_pass(self) -> None:
+        p = from_dict(_policy())
+        cases = [
+            PolicyTestCase("low", "fs.write_file", 0.30, expected_verdict="allow"),
+            PolicyTestCase(
+                "high", "tx.sign", 0.50,
+                expected_verdict="escalate", expected_route="ai_oversight_team",
+            ),
+        ]
+        results = run_test_cases(p, cases)
+        assert all(r.passed for r in results)
+        assert results[0].diagnostic == "ok"
+
+    def test_verdict_mismatch_fails(self) -> None:
+        p = from_dict(_policy())
+        results = run_test_cases(p, [
+            PolicyTestCase("wrong", "fs.write_file", 0.30, expected_verdict="deny"),
+        ])
+        assert not results[0].passed
+        assert "verdict" in results[0].diagnostic
+
+    def test_route_mismatch_fails(self) -> None:
+        p = from_dict(_policy())
+        results = run_test_cases(p, [
+            PolicyTestCase(
+                "wrong_route", "tx.sign", 0.50,
+                expected_verdict="escalate", expected_route="ghost_team",
+            ),
+        ])
+        assert not results[0].passed
+        assert "route" in results[0].diagnostic
+
+    def test_evaluation_error_is_captured_not_raised(self) -> None:
+        p = from_dict(_policy())
+        results = run_test_cases(p, [
+            PolicyTestCase("bad_class", "ghost.tool", 0.5),
+        ])
+        assert not results[0].passed
+        assert results[0].actual is None
+        assert "evaluation error" in results[0].diagnostic
+
+
+class TestCaseValidation:
+    def test_bad_verdict_rejected_at_construction(self) -> None:
+        with pytest.raises(ValueError, match="expected_verdict"):
+            PolicyTestCase("x", "tx.sign", 0.5, expected_verdict="banana")
+
+    def test_route_without_escalate_rejected(self) -> None:
+        with pytest.raises(ValueError, match="expected_route"):
+            PolicyTestCase(
+                "x", "tx.sign", 0.5,
+                expected_verdict="allow", expected_route="team",
+            )
diff --git a/tests/test_policy_test_cases_io.py b/tests/test_policy_test_cases_io.py
new file mode 100644
index 0000000..60661c0
--- /dev/null
+++ b/tests/test_policy_test_cases_io.py
@@ -0,0 +1,107 @@
+"""Tests for vaara.policy.test_cases_io — YAML/JSON cases-file loading."""
+
+from __future__ import annotations
+
+import json
+from pathlib import Path
+
+import pytest
+
+from vaara.policy.loader import from_yaml
+from vaara.policy.schema import PolicyError
+from vaara.policy.test_cases import run_test_cases
+from vaara.policy.test_cases_io import load_test_cases, parse_cases
+
+
+class TestParseCases:
+    def test_minimal_case(self) -> None:
+        cases = parse_cases({
+            "cases": [{
+                "name": "c1", "action_class": "tx.sign", "risk_score": 0.5,
+                "expect": {"verdict": "escalate", "route": "ai_oversight_team"},
+            }],
+        })
+        assert len(cases) == 1
+        assert cases[0].expected_route == "ai_oversight_team"
+
+    def test_missing_required_field_raises(self) -> None:
+        with pytest.raises(PolicyError, match="missing required field"):
+            parse_cases({"cases": [{"name": "x", "action_class": "tx.sign"}]})
+
+    def test_must_have_cases_list(self) -> None:
+        with pytest.raises(PolicyError, match="cases:"):
+            parse_cases({"version": "0.1"})
+
+    def test_top_level_must_be_mapping(self) -> None:
+        with pytest.raises(PolicyError, match="must be a mapping"):
+            parse_cases(["not a mapping"])
+
+    def test_each_entry_must_be_mapping(self) -> None:
+        with pytest.raises(PolicyError, match="must be a mapping"):
+            parse_cases({"cases": ["bad"]})
+
+    def test_default_name_filled_in(self) -> None:
+        cases = parse_cases({
+            "cases": [{
+                "action_class": "tx.sign", "risk_score": 0.3,
+            }],
+        })
+        assert cases[0].name == "case_0"
+
+    def test_invalid_expect_block_rejected(self) -> None:
+        with pytest.raises(PolicyError, match="expect"):
+            parse_cases({
+                "cases": [{
+                    "action_class": "tx.sign", "risk_score": 0.3,
+                    "expect": "allow",
+                }],
+            })
+
+
+class TestLoadFromFile:
+    def test_load_json_file(self, tmp_path: Path) -> None:
+        path = tmp_path / "cases.json"
+        path.write_text(json.dumps({
+            "cases": [{
+                "name": "c1", "action_class": "tx.sign", "risk_score": 0.3,
+                "expect": {"verdict": "allow"},
+            }],
+        }), encoding="utf-8")
+        cases = load_test_cases(path)
+        assert cases[0].name == "c1"
+        assert cases[0].expected_verdict == "allow"
+
+    def test_load_yaml_file(self, tmp_path: Path) -> None:
+        pytest.importorskip("yaml")
+        path = tmp_path / "cases.yaml"
+        path.write_text(
+            "cases:\n"
+            "  - name: c1\n"
+            "    action_class: tx.sign\n"
+            "    risk_score: 0.5\n"
+            "    expect: {verdict: escalate, route: ai_oversight_team}\n",
+            encoding="utf-8",
+        )
+        cases = load_test_cases(path)
+        assert cases[0].expected_route == "ai_oversight_team"
+
+    def test_load_yaml_invalid_raises_policy_error(self, tmp_path: Path) -> None:
+        pytest.importorskip("yaml")
+        path = tmp_path / "bad.yaml"
+        path.write_text("cases: [: : invalid", encoding="utf-8")
+        with pytest.raises(PolicyError, match="invalid YAML"):
+            load_test_cases(path)
+
+
+class TestExampleCasesFile:
+    def test_full_example_passes_against_full_policy(self) -> None:
+        pytest.importorskip("yaml")
+        policy_path = Path("examples/policies/full.yaml")
+        cases_path = Path("examples/policies/test_cases.yaml")
+        if not policy_path.is_file() or not cases_path.is_file():
+            pytest.skip("example files not present in working tree")
+        policy = from_yaml(policy_path)
+        cases = load_test_cases(cases_path)
+        results = run_test_cases(policy, cases)
+        failed = [r for r in results if not r.passed]
+        assert not failed, [r.diagnostic for r in failed]
diff --git a/tests/test_policy_validate.py b/tests/test_policy_validate.py
new file mode 100644
index 0000000..42fe824
--- /dev/null
+++ b/tests/test_policy_validate.py
@@ -0,0 +1,196 @@
+"""Tests for vaara.policy.validate — structured semantic checks on a loaded Policy."""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+import pytest
+
+from vaara.policy import SCHEMA_VERSION, from_dict
+from vaara.policy.validate import (
+    IssueLevel,
+    PolicyIssue,
+    ValidationReport,
+    validate,
+    validate_source,
+)
+
+
+def _base_policy() -> dict:
+    return {
+        "version": SCHEMA_VERSION,
+        "domains": ["eu_ai_act"],
+        "action_classes": {
+            "fs.write_file": {
+                "category": "data", "reversibility": "partially_reversible",
+                "blast_radius": "local", "urgency": "timely",
+                "regulatory": ["aiact:9", "aiact:12"],
+            },
+        },
+        "thresholds": {"default": {"escalate": 0.55, "deny": 0.85}},
+        "sequences": {},
+        "escalation": {"routes": [{"default": "on_call"}]},
+    }
+
+
+class TestReportShape:
+    def test_empty_report_is_ok(self) -> None:
+        report = ValidationReport()
+        assert report.ok
+        assert report.errors == ()
+        assert report.warnings == ()
+
+    def test_to_dict_has_stable_keys(self) -> None:
+        issue = PolicyIssue(IssueLevel.WARNING, "x", "p", "m")
+        report = ValidationReport(issues=(issue,))
+        d = report.to_dict()
+        assert set(d) == {"ok", "error_count", "warning_count", "issues"}
+        assert d["ok"] is True
+        assert d["warning_count"] == 1
+        assert d["issues"][0] == {
+            "level": "warning", "code": "x", "path": "p", "message": "m",
+        }
+
+    def test_error_makes_report_not_ok(self) -> None:
+        report = ValidationReport(issues=(
+            PolicyIssue(IssueLevel.ERROR, "e", "", "boom"),
+        ))
+        assert not report.ok
+        assert len(report.errors) == 1
+        assert len(report.warnings) == 0
+
+
+class TestSemanticChecks:
+    def test_clean_policy_passes(self) -> None:
+        report = validate(from_dict(_base_policy()))
+        assert report.ok
+        assert report.warnings == ()
+
+    def test_no_action_classes_warns(self) -> None:
+        data = _base_policy()
+        data["action_classes"] = {}
+        report = validate(from_dict(data))
+        assert report.ok
+        codes = [i.code for i in report.issues]
+        assert "no_action_classes" in codes
+
+    def test_narrow_default_threshold_warns(self) -> None:
+        data = _base_policy()
+        data["thresholds"]["default"] = {"escalate": 0.60, "deny": 0.62}
+        report = validate(from_dict(data))
+        assert any(
+            i.code == "narrow_threshold_band" and i.path == "thresholds.default"
+            for i in report.issues
+        )
+
+    def test_threshold_override_dangling_warns(self) -> None:
+        data = _base_policy()
+        data["thresholds"]["nonexistent.tool"] = {"deny": 0.70}
+        report = validate(from_dict(data))
+        assert any(
+            i.code == "threshold_override_dangling"
+            and "nonexistent.tool" in i.path
+            for i in report.issues
+        )
+
+    def test_narrow_override_band_warns(self) -> None:
+        data = _base_policy()
+        data["thresholds"]["fs.write_file"] = {"escalate": 0.60, "deny": 0.62}
+        report = validate(from_dict(data))
+        assert any(
+            i.code == "narrow_threshold_band" and "fs.write_file" in i.path
+            for i in report.issues
+        )
+
+    def test_sequence_step_unknown_class_warns(self) -> None:
+        data = _base_policy()
+        data["sequences"] = {
+            "exfil": {
+                "pattern": ["read_data", "fs.write_file"],
+                "risk_boost": 0.2, "window_seconds": 60,
+                "regulatory": [],
+            },
+        }
+        report = validate(from_dict(data))
+        unknown = [
+            i for i in report.issues
+            if i.code == "sequence_step_unknown_class"
+        ]
+        assert any("[0]" in i.path for i in unknown)
+        assert not any("[1]" in i.path for i in unknown)
+
+    def test_unreachable_escalation_route_warns(self) -> None:
+        data = _base_policy()
+        data["escalation"]["routes"] = [
+            {"if": ["aiact:99"], "operator_group": "ghost_team"},
+            {"default": "on_call"},
+        ]
+        report = validate(from_dict(data))
+        assert any(
+            i.code == "escalation_route_unreachable"
+            for i in report.issues
+        )
+
+    def test_no_default_route_warns(self) -> None:
+        data = _base_policy()
+        data["escalation"]["routes"] = [
+            {"if": ["aiact:9"], "operator_group": "data_team"},
+        ]
+        report = validate(from_dict(data))
+        assert any(
+            i.code == "no_default_escalation_route"
+            for i in report.issues
+        )
+
+
+class TestValidateSource:
+    def test_parse_error_yields_error_report(self, tmp_path: Path) -> None:
+        path = tmp_path / "bad.json"
+        path.write_text('{"version": "9.9"}', encoding="utf-8")
+        policy, report = validate_source(path)
+        assert policy is None
+        assert not report.ok
+        assert report.errors[0].code == "parse_error"
+
+    def test_clean_yaml_round_trip(self, tmp_path: Path) -> None:
+        pytest.importorskip("yaml")
+        path = tmp_path / "clean.yaml"
+        path.write_text(
+            "version: '0.1'\n"
+            "domains: [eu_ai_act]\n"
+            "action_classes:\n"
+            "  fs.write_file:\n"
+            "    category: data\n"
+            "    reversibility: partially_reversible\n"
+            "    blast_radius: local\n"
+            "    urgency: timely\n"
+            "    regulatory: [aiact:9]\n"
+            "thresholds: {default: {escalate: 0.55, deny: 0.85}}\n"
+            "sequences: {}\n"
+            "escalation: {routes: [{default: on_call}]}\n",
+            encoding="utf-8",
+        )
+        policy, report = validate_source(path)
+        assert policy is not None
+        assert report.ok
+
+    def test_dict_input_is_accepted(self) -> None:
+        policy, report = validate_source(_base_policy())
+        assert policy is not None
+        assert report.ok
+
+    def test_explicit_format_overrides_sniff(self, tmp_path: Path) -> None:
+        pytest.importorskip("yaml")
+        path = tmp_path / "no_suffix"
+        path.write_text(
+            "version: '0.1'\n"
+            "domains: [eu_ai_act]\n"
+            "action_classes: {}\n"
+            "thresholds: {default: {escalate: 0.55, deny: 0.85}}\n"
+            "sequences: {}\n"
+            "escalation: {routes: [{default: on_call}]}\n",
+            encoding="utf-8",
+        )
+        policy, report = validate_source(path, fmt="yaml")
+        assert policy is not None
+        assert report.ok