| 1 | +# CLAUDE TASK: Run NL/T editing tests for Unity MCP repo and emit JUnit |
| 2 | + |
| 3 | +You are running in CI at the repository root. Use only the tools that are allowed by the workflow: |
| 4 | +- View, GlobTool, GrepTool for reading. |
| 5 | +- Bash for local shell (git is allowed). |
| 6 | +- BatchTool for grouping. |
| 7 | +- MCP tools from server "unity" (exposed as mcp__unity__*). |
| 8 | + |
| 9 | +## Test target |
| 10 | +- Primary file: `Assets/Scripts/Interaction/SmartReach.cs` |
| 11 | +- For each operation, prefer structured edit tools (`replace_method`, `insert_method`, `delete_method`, `anchor_insert`, `apply_text_edits`, `regex_replace`) via the MCP server. |
| 12 | +- Include `precondition_sha256` for any text path write. |
| 13 | + |
| 14 | +## Output requirements |
| 15 | +- Create a JUnit XML at `reports/claude-nl-tests.xml`. |
| 16 | +- Each test = one `<testcase>` with `classname="UnityMCP.NL"` or `classname="UnityMCP.T"`. |
| 17 | +- On failure, include a `<failure>` node with a concise message and the last evidence snippet (10–20 lines). |
| 18 | +- Also write a human summary at `reports/claude-nl-tests.md` with checkboxes and the windowed reads. |
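One possible way to assemble the report described above uses only the Python standard library. This is a minimal sketch, not a required implementation: the `build_junit` helper and the shape of the `results` dictionaries are assumptions made for illustration, and the `<testsuite>` attributes shown are the commonly used ones rather than anything this prompt mandates.

```python
import xml.etree.ElementTree as ET


def build_junit(results):
    """Build a JUnit XML string.

    results: list of dicts with 'name' and 'classname', plus optional
    'failure' (short message) and 'evidence' (last 10-20 lines of output).
    """
    failures = sum(1 for r in results if r.get("failure"))
    suite = ET.Element(
        "testsuite",
        name="UnityMCP",
        tests=str(len(results)),
        failures=str(failures),
    )
    for r in results:
        case = ET.SubElement(
            suite, "testcase", classname=r["classname"], name=r["name"]
        )
        if r.get("failure"):
            node = ET.SubElement(case, "failure", message=r["failure"])
            node.text = r.get("evidence", "")
    return ET.tostring(suite, encoding="unicode")
```

The returned string would then be written to `reports/claude-nl-tests.xml`, with the same `results` list driving the checkbox summary in `reports/claude-nl-tests.md`.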
| 19 | + |
| 20 | +## Safety & hygiene |
| 21 | +- Make edits in-place, then revert them at the end (`git stash -u`/`git reset --hard` or balanced counter-edits) so the workspace is clean for subsequent steps. |
| 22 | +- Never push commits from CI. |
| 23 | +- If a write fails midway, ensure the file is restored before proceeding. |
| 24 | + |
| 25 | +## NL-0. Sanity Reads (windowed) |
| 26 | +- Tail 120 lines of SmartReach.cs. |
| 27 | +- Show 40 lines around method `DeactivateIK`. |
| 28 | +- **Pass** if both windows render with expected anchors present. |
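The two windowed reads can be sketched as plain list slicing over the file's lines. The helper names `tail` and `window` are hypothetical and stand in for whatever read tool is actually used; the sketch only shows what "tail 120 lines" and "40 lines around an anchor" mean in terms of indices.

```python
def tail(lines, n=120):
    """Return the last n lines (or all lines if there are fewer)."""
    return lines[-n:]


def window(lines, anchor, radius=20):
    """Return up to `radius` lines on each side of the first line containing anchor.

    Returns an empty list when the anchor is absent, which a test can
    treat as a failed sanity read.
    """
    for i, line in enumerate(lines):
        if anchor in line:
            lo = max(0, i - radius)
            hi = min(len(lines), i + radius + 1)
            return lines[lo:hi]
    return []
```

A radius of 20 yields roughly the 40-line window requested for `DeactivateIK`.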
| 29 | + |
| 30 | +## NL-1. Method replace/insert/delete (natural-language) |
| 31 | +- Replace `HasTarget` with a block-bodied version that returns `currentTarget != null`. |
| 32 | +- Insert `PrintSeries()` after `GetCurrentTarget` logging `1,2,3`. |
| 33 | +- Verify by reading 20 lines around the anchor. |
| 34 | +- Delete `PrintSeries()` and verify removal. |
| 35 | +- **Pass** if diffs match and verification windows show expected content. |
| 36 | + |
| 37 | +## NL-2. Anchor comment insertion |
| 38 | +- Add a comment `Build marker OK` immediately above the `TestSelectObjectToPlace` attribute line. |
| 39 | +- **Pass** if the comment appears directly above `[ContextMenu("Test SelectObjectToPlace")]`. |
| 40 | + |
| 41 | +## NL-3. End-of-class insertion |
| 42 | +- Insert a 3-line comment `Tail test A/B/C` before the last method (preview, then apply). |
| 43 | +- **Pass** if windowed read shows the three lines at the intended location. |
| 44 | + |
| 45 | +## NL-4. Compile trigger |
| 46 | +- After any NL edit, ensure no stale compiler errors remain: |
| 47 | + - Write a short marker edit, then **revert** after validating. |
| 48 | + - The CI job will run Unity compile separately; record your local check (e.g., file parity and syntax sanity) as INFO, but do not attempt to invoke Unity here. |
| 49 | + |
| 50 | +## T-A. Anchor insert (text path) |
| 51 | +- Insert after `GetCurrentTarget`: `private int __TempHelper(int a, int b) => a + b;` |
| 52 | +- Verify via read; then delete with a `regex_replace` targeting only that helper block. |
| 53 | +- **Pass** if round-trip leaves the file exactly as before. |
| 54 | + |
| 55 | +## T-B. Replace method body with minimal range |
| 56 | +- Identify the `HasTarget` body lines; use a single `replace_range` to change only the text inside the braces; then revert. |
| 57 | +- **Pass** on exact-range change + revert. |
| 58 | + |
| 59 | +## T-C. Attribute preservation |
| 60 | +- For `DumpTargetingSnapshot`, change only interior `Debug.Log` lines via `replace_range`; attributes must remain untouched (inline or previous-line variants). |
| 61 | +- **Pass** if attributes unchanged. |
| 62 | + |
| 63 | +## T-D. End-of-class insertion (anchor) |
| 64 | +- Find the final class brace; insert with `position: before` to append a temporary helper; then remove it. |
| 65 | +- **Pass** if insert/remove verified. |
| 66 | + |
| 67 | +## T-E. Temporary method lifecycle |
| 68 | +- Insert helper (T-A), update helper implementation via `apply_text_edits`, then delete with `regex_replace`. |
| 69 | +- **Pass** if lifecycle completes and file returns to original checksum. |
| 70 | + |
| 71 | +## T-F. Multi-edit atomic batch |
| 72 | +- In one call, perform two `replace_range` tweaks and one comment insert at the class end; verify all-or-nothing behavior. |
| 73 | +- **Pass** if either all 3 apply or none. |
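All-or-nothing behavior can be modeled by validating every range before touching the text. The sketch below is an assumption-laden illustration, not the server's algorithm: it uses half-open character offsets and a hypothetical `apply_batch` helper, whereas the real tool may address edits by line and column.

```python
def apply_batch(text, edits):
    """Apply (start, end, replacement) edits atomically.

    Offsets are half-open character ranges into the ORIGINAL text.
    Any invalid or overlapping range raises ValueError before anything
    is applied, so the caller keeps the untouched original.
    """
    ordered = sorted(edits, key=lambda e: (e[0], e[1]))
    for start, end, _ in ordered:
        if not (0 <= start <= end <= len(text)):
            raise ValueError("range out of bounds")
    for prev, cur in zip(ordered, ordered[1:]):
        if cur[0] < prev[1]:
            raise ValueError("overlapping ranges")
    out = text
    # Apply from the end backwards so earlier offsets stay valid.
    for start, end, replacement in reversed(ordered):
        out = out[:start] + replacement + out[end:]
    return out
```

The same overlap check corresponds to the "two overlapping ranges" rejection listed under T-I.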
| 74 | + |
| 75 | +## T-G. Path normalization |
| 76 | +- Run the same edit once with `unity://path/Assets/...` and once with `Assets/...` (if supported). |
| 77 | +- **Pass** if both target the same file and no `Assets/Assets` duplication occurs. |
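The expected equivalence of the two path forms can be expressed as a small normalization function. This is a sketch under the assumption that `unity://path/` is a simple prefix in front of a project-relative path; `normalize_asset_path` is a hypothetical helper, not part of the MCP server.

```python
import posixpath


def normalize_asset_path(uri: str) -> str:
    """Map 'unity://path/Assets/...' and 'Assets/...' to one relative path."""
    prefix = "unity://path/"
    path = uri[len(prefix):] if uri.startswith(prefix) else uri
    # Use forward slashes and collapse '.', '..' and duplicate separators.
    path = posixpath.normpath(path.replace("\\", "/"))
    return path.lstrip("/")
```

Comparing the normalized results of both calls, and checking that neither contains `Assets/Assets`, is one way to record the T-G verdict.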
| 78 | + |
| 79 | +## T-H. Validation levels |
| 80 | +- After edits, run `validate` with `level: "standard"`, then `"basic"` for temporarily unbalanced text ops; final state must be valid. |
| 81 | +- **Pass** if validation OK and final file compiles in CI step. |
| 82 | + |
| 83 | +## T-I. Failure surfaces (expected) |
| 84 | +- Oversized payload: `apply_text_edits` with >15 KB aggregate → expect `{status:"too_large"}`. |
| 85 | +- Stale file: change externally, then resend with old `precondition_sha256` → expect `{status:"stale_file"}` with hashes. |
| 86 | +- Overlap: two overlapping ranges → expect rejection. |
| 87 | +- Unbalanced braces: remove a closing `}` → expect validation failure and **no write**. |
| 88 | +- Header guard: attempt insert before the first `using` → expect `{status:"header_guard"}`. |
| 89 | +- Anchor aliasing: `insert`/`content` alias → expect success (aliased to `text`). |
| 90 | +- Auto-upgrade: try a text edit overwriting a method header → prefer structured `replace_method` or return a clear error. |
| 91 | +- **Pass** when each negative case returns the expected failure without persisting changes, and the aliasing case succeeds as described. |
| 92 | + |
| 93 | +## T-J. Idempotency & no-op |
| 94 | +- Re-run the same `replace_range` with identical content → expect success with no change. |
| 95 | +- Re-run a delete of an already-removed helper via `regex_replace` → clean no-op. |
| 96 | +- **Pass** if both behave idempotently. |
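The delete-then-delete-again check can be illustrated with Python's `re` module. The pattern below is an assumption about how the helper from T-A might be matched (one whole line, including its indentation and newline); the actual `regex_replace` call and its regex dialect may differ.

```python
import re

# The helper line from T-A, with an assumed four-space indent.
HELPER = "    private int __TempHelper(int a, int b) => a + b;\n"

# Match the entire helper line so removal leaves no blank line behind.
PATTERN = re.compile(
    r"^[ \t]*private int __TempHelper\(int a, int b\) => a \+ b;\r?\n",
    re.MULTILINE,
)


def remove_helper(text):
    """Return (new_text, number_of_removals); zero removals is a clean no-op."""
    return PATTERN.subn("", text)
```

A second call on already-cleaned text returns the same text with a removal count of 0, which is the idempotent no-op this test expects.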
| 97 | + |
| 98 | +### Implementation notes |
| 99 | +- Always capture pre- and post‑windows (±20–40 lines) as evidence in the JUnit `<failure>` or as `<system-out>`. |
| 100 | +- For any file write, include `precondition_sha256` and verify the post‑hash in your log. |
| 101 | +- At the end, restore the repository to its original state (`git status` must be clean). |
| 102 | + |
| 103 | +# Emit the JUnit file to reports/claude-nl-tests.xml and a summary markdown to reports/claude-nl-tests.md. |