feat: add structured output support via json_schema#683
Conversation
json_schema
Code Review Summary - PR #683I've completed a comprehensive review using specialized agents across security, code quality, performance, testing, and documentation. Here are the key findings: 🔴 Critical Issues1. Command Injection Vulnerability (Security)
2. Unsafe GitHub Actions Output Injection (Security)
🟡 Important Issues3. Insufficient JSON Schema Validation (Security)
4. Ineffective Empty Key Validation (Code Quality)
5. Inconsistent Error Handling (Code Quality)
6. Test/Production Code Mismatch (Testing)
✅ Strengths
📚 Documentation GapsThe
Consider adding a structured outputs section to the usage guide for discoverability. 🧪 Test Coverage GapsMissing tests for:
Overall Assessment: The feature implementation is solid with good architecture, but has critical security vulnerabilities that must be addressed before merging. The command injection and output injection issues pose real security risks in GitHub Actions environments. Recommendation: Address the two critical security issues, then consider the code quality and testing improvements before release. |
c7b804d to
5990de0
Compare
Comprehensive Code Review SummaryI've completed a thorough review using specialized review agents covering code quality, performance, test coverage, documentation, and security. Overall, this is a well-implemented feature with strong security practices, but there are several noteworthy improvements needed before merging. 🔴 Critical Issues (Must Fix)1. Code Duplication - JSON Schema ValidationImpact: Maintenance burden and risk of inconsistencies The JSON schema validation logic is duplicated between 2. Tests Don't Test Production CodeImpact: False confidence in test coverage The unit tests in 3. Documentation InaccuracyImpact: User confusion The 🟡 High Priority Issues (Should Fix)4. Performance - String Concatenation in LoopFile: Use array accumulator pattern instead of string concatenation to avoid excessive memory allocations. 5. Security - Missing Size Limit on JSON Schema InputImpact: DoS through resource exhaustion Add validation to limit JSON schema input size (recommend 100KB max) to prevent unbounded memory/disk usage. 6. Resource Leak - Temp File CleanupImpact: Disk space issues in long-running workflows Temporary schema files are created but never cleaned up. Add cleanup logic in finally block. 7. Test Coverage GapsMissing critical tests:
🟢 Medium Priority Issues (Consider)8. Potential Race Condition
9. Output Name Collision DetectionFields like 10. JSON Truncation SafetyTruncating objects/arrays at 1MB can produce invalid JSON. Skip these outputs rather than setting invalid data. 11. Integration Tests DisabledThe comprehensive test suite in ✅ Positive ObservationsThe implementation demonstrates excellent security practices:
The test workflows are well-designed with explicit prompts for deterministic results, and the feature has clear separation of concerns. 📋 RecommendationsBefore Merge (Critical):
Follow-up PR (High Priority): Future Improvements: 📊 Review Metrics
See inline comments for specific code suggestions and detailed explanations. |
5990de0 to
7204a00
Compare
Code Review Summary: PR #683 - Structured Output SupportI've completed a comprehensive review using specialized agents for security, code quality, performance, testing, and documentation. Overall, this is a well-implemented feature with good test coverage and documentation, but there are critical security issues that must be addressed before merging. 🚨 Critical Issues (Must Fix)1. Command Injection Vulnerability (base-action/src/index.ts:42)The JSON schema input is directly interpolated into shell arguments without proper escaping, allowing command injection. The fix is simple: use the existing Severity: Critical - Allows arbitrary command execution in GitHub Actions runners 🔴 High Priority Issues (Should Fix)2. Prototype Pollution (base-action/src/run-claude.ts:210)Missing validation for dangerous property names like 3. Error Handling Anti-pattern (base-action/src/run-claude.ts:262)The function both calls 4. Documentation Inaccuracies (docs/usage.md:238, 241)
5. Critical Test Gap (base-action/test/structured-output.test.ts:112)The main 🟡 Medium Priority Issues (Nice to Fix)6. Performance: Unnecessary Stringification (base-action/src/run-claude.ts:238)Large objects are stringified before checking size limits, wasting CPU on objects that will be discarded. 7. Type Safety:
|
…t JSON
Since GitHub Actions composite actions cannot expose dynamic outputs,
individual field outputs were not accessible anyway and only added
complexity and collision risk.
Simplified by:
- Removing individual core.setOutput() calls for each field
- Removing RESERVED_OUTPUTS check (no longer needed)
- Removing sanitizeOutputName, convertToString, MAX_OUTPUT_SIZE helpers
- Removing related unit tests for removed functionality
Users access all fields via single structured_output JSON string:
fromJSON(steps.<id>.outputs.structured_output).field_name
Or with jq:
echo '${{ steps.<id>.outputs.structured_output }}' | jq '.field_name'
All tests pass (462 tests).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Updated documentation to reflect that structured outputs are now only
accessible via the single structured_output JSON string, not as
individual fields.
Changes:
- docs/usage.md: Updated "Accessing Structured Outputs" section
- Show fromJSON() usage in GitHub Actions expressions
- Show jq usage in bash
- Explain composite action limitation
- Remove outdated "Output Naming Rules" and size limit sections
- action.yml: Updated json_schema input description
- examples/test-failure-analysis.yml: Updated to use fromJSON() and jq
Users now access fields via:
fromJSON(steps.<id>.outputs.structured_output).field_name
Or:
echo '${{ steps.<id>.outputs.structured_output }}' | jq '.field_name'
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Fixed error handling anti-pattern identified in PR review where the function was calling core.setFailed() AND throwing errors, causing confusion about error handling flow. Changes: - parseAndSetStructuredOutputs now just throws errors without calling core.setFailed() - follows single responsibility principle - Caller (runClaude) catches errors and calls core.setFailed() once - Removed unnecessary structuredOutputSuccess boolean flag - Clearer error handling flow: function parses/throws, caller decides how to handle failures Addresses review comment: #683 (comment) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Fixed outdated documentation that incorrectly stated field naming restrictions that don't exist in the implementation. Changes: - Removed incorrect claim about field naming requirements (letter/underscore start) - Removed incorrect claim about special character sanitization - Clarified that field names can use any valid JSON property name - Updated access pattern to show fromJSON() usage - Clarified 1MB limit applies to entire structured_output string, not per-field The implementation simply does JSON.stringify(result.structured_output) without any sanitization, so any valid JSON property name works (including hyphens like "test-result", as validated by integration tests). Addresses review comment: #683 (comment) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Fixed test coverage gap where tests were only parsing JSON manually without actually invoking the parseAndSetStructuredOutputs function. Changes: - Export parseAndSetStructuredOutputs for testing - Rewrite tests to use spyOn() to mock @actions/core functions - Add tests that actually call the function and verify: - core.setOutput() called with correct JSON string - core.info() called with correct field count - Error thrown when result exists but structured_output undefined - Error thrown when no result message exists - Handles special characters in field names (hyphens, dots, @ symbols) - Handles arrays and nested objects correctly - File errors propagate correctly All 8 tests now properly test the actual implementation with full coverage of success and error paths. Addresses review comment: #683 (comment) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
| // Add JSON schema if specified (no escaping - parseShellArgs handles it) | ||
| if (process.env.JSON_SCHEMA) { | ||
| // Wrap in single quotes for parseShellArgs | ||
| claudeArgs += ` --json-schema '${process.env.JSON_SCHEMA}'`; |
There was a problem hiding this comment.
Security Issue: Inconsistent JSON Schema Escaping
The comment says "no escaping - parseShellArgs handles it", but single quotes aren't escaped. This differs from src/utils/json-schema.ts which properly escapes quotes:
const escapedSchema = schema.replace(/'/g, "'\\''");
return `${claudeArgs} --json-schema '${escapedSchema}'`;JSON schemas often contain quotes. Without proper escaping, schemas with quotes will fail or behave unexpectedly.
| // Add JSON schema if specified (no escaping - parseShellArgs handles it) | |
| if (process.env.JSON_SCHEMA) { | |
| // Wrap in single quotes for parseShellArgs | |
| claudeArgs += ` --json-schema '${process.env.JSON_SCHEMA}'`; | |
| // Add JSON schema if specified (escape single quotes for shell safety) | |
| if (process.env.JSON_SCHEMA) { | |
| const escapedSchema = process.env.JSON_SCHEMA.replace(/'/g, "'\\''"); | |
| claudeArgs += ` --json-schema '${escapedSchema}'`; | |
| } |
base-action/src/run-claude.ts
Outdated
| type ExecutionMessage = { | ||
| type: string; | ||
| structured_output?: Record<string, unknown>; |
There was a problem hiding this comment.
Code Quality: Weak Type Safety
The ExecutionMessage type is too broad. type: string allows any string, making type guards ineffective. Missing fields like subtype, is_error, etc. that are used in the code.
Recommended improvement using discriminated unions:
type ExecutionMessage =
| { type: "system"; subtype: string; model?: string }
| { type: "result"; subtype?: string; is_error?: boolean;
duration_ms?: number; cost_usd?: number;
structured_output?: Record<string, unknown> }
| { type: "turn"; content?: string }
| { type: string }; // fallback for unknown typesThis provides compile-time validation and better IDE support.
|
|
||
| if (!result?.structured_output) { | ||
| throw new Error( | ||
| `json_schema was provided but Claude did not return structured_output.\n` + | ||
| `Found ${messages.length} messages. Result exists: ${!!result}\n`, | ||
| ); |
There was a problem hiding this comment.
Code Quality: Improve Error Message Context
The error message doesn't help debug why structured_output is missing. Could be schema validation failure, Claude error, or wrong message format.
| if (!result?.structured_output) { | |
| throw new Error( | |
| `json_schema was provided but Claude did not return structured_output.\n` + | |
| `Found ${messages.length} messages. Result exists: ${!!result}\n`, | |
| ); | |
| if (!result?.structured_output) { | |
| const messageTypes = messages.map(m => m.type).join(', '); | |
| const hasResult = !!result; | |
| const hasStructuredOutput = result ? !!result.structured_output : false; | |
| throw new Error( | |
| `json_schema was provided but Claude did not return structured_output.\n` + | |
| `Found ${messages.length} messages (types: ${messageTypes}).\n` + | |
| `Result message exists: ${hasResult}${hasResult ? `, has structured_output: ${hasStructuredOutput}` : ''}.\n` + | |
| `This usually means Claude failed to comply with the schema or encountered an error.`, | |
| ); | |
| } |
| "Set structured_output with 0 field(s)", | ||
| ); | ||
| }); | ||
| }); |
There was a problem hiding this comment.
Test Coverage: Missing Edge Cases
The test suite is comprehensive but missing a few important edge cases:
- Null values in structured output fields:
test("should handle null values in structured_output", async () => {
await createMockExecutionFile({ nullable_field: null, present_field: "value" });
await parseAndSetStructuredOutputs(TEST_EXECUTION_FILE);
const parsed = JSON.parse(setOutputSpy.mock.calls[0][1]);
expect(parsed.nullable_field).toBeNull();
});- Multiple result messages (currently uses
find()which returns first match):
test("should use first result message when multiple exist", async () => {
const messages = [
{ type: "result", structured_output: { first: true } },
{ type: "result", structured_output: { second: true } }
];
await writeFile(TEST_EXECUTION_FILE, JSON.stringify(messages));
await parseAndSetStructuredOutputs(TEST_EXECUTION_FILE);
const parsed = JSON.parse(setOutputSpy.mock.calls[0][1]);
expect(parsed.first).toBe(true);
expect(parsed.second).toBeUndefined();
});beffcd8 to
84265a4
Compare
| if (process.env.JSON_SCHEMA) { | ||
| // Wrap in single quotes for parseShellArgs | ||
| claudeArgs += ` --json-schema '${process.env.JSON_SCHEMA}'`; |
There was a problem hiding this comment.
Security: Command Injection Risk
The JSON schema is wrapped in single quotes without proper escaping, creating a potential injection vector. An attacker controlling JSON_SCHEMA could include a single quote to inject malicious CLI flags.
Issue: '${process.env.JSON_SCHEMA}' doesn't escape single quotes within the schema.
Example Attack:
{"type":"object"}' --malicious-flag '
Result: --json-schema '{"type":"object"}' --malicious-flag ''
Fix: Apply the same escaping used in appendJsonSchemaArg (src/utils/json-schema.ts:15):
if (process.env.JSON_SCHEMA) {
const escapedSchema = process.env.JSON_SCHEMA.replace(/'/g, "'\\''");
claudeArgs += ` --json-schema '${escapedSchema}'`;
}Or better yet, reuse the utility function:
import { appendJsonSchemaArg } from '../../utils/json-schema';
// ...
claudeArgs = appendJsonSchemaArg(claudeArgs, process.env.JSON_SCHEMA);| const content = await readFile(executionFile, "utf-8"); | ||
| const messages = JSON.parse(content) as { | ||
| type: string; | ||
| structured_output?: Record<string, unknown>; | ||
| }[]; | ||
|
|
||
| // Search backwards - result is typically last or second-to-last message | ||
| const result = messages.findLast( | ||
| (m) => m.type === "result" && m.structured_output, | ||
| ); |
There was a problem hiding this comment.
Performance: Inefficient File Parsing
This reads and parses the entire execution file (potentially 10MB per line 344) to extract just the last result message. For long-running Claude sessions, this is 50-100x slower than necessary.
Issue: Full file read + full JSON parse + array iteration when only the last message is needed.
Optimization: Since the execution file is newline-delimited JSON, read it in reverse:
// Read last ~10KB of file (enough for typical result message)
const fd = await open(executionFile, 'r');
const stat = await fd.stat();
const readSize = Math.min(10240, stat.size);
const buffer = Buffer.alloc(readSize);
await fd.read(buffer, 0, readSize, stat.size - readSize);
await fd.close();
// Parse lines in reverse
const lines = buffer.toString('utf-8').split('\n').reverse();
for (const line of lines) {
if (!line.trim()) continue;
try {
const msg = JSON.parse(line);
if (msg.type === 'result' && msg.structured_output) {
return msg.structured_output;
}
} catch { continue; }
}Also consider: Add size limit validation before parsing (currently vulnerable to DoS via huge JSON).
| - name: Retry if flaky | ||
| if: fromJSON(steps.analyze.outputs.structured_output).is_flaky == true | ||
| run: gh workflow run CI |
There was a problem hiding this comment.
Documentation Issue: Fragile fromJSON() pattern
This pattern is fragile - if the action fails or output is malformed, the workflow will error. GitHub Actions expressions don't handle JSON parsing failures gracefully.
Better approach (add this to docs):
# Safer: check output exists before parsing
- name: Retry if flaky
if: |
steps.analyze.outputs.structured_output != '' &&
fromJSON(steps.analyze.outputs.structured_output).is_flaky == true
run: gh workflow run CIOr parse explicitly:
- name: Parse results
id: parse
run: |
OUTPUT='${{ steps.analyze.outputs.structured_output }}'
if [ -n "$OUTPUT" ]; then
echo "is_flaky=$(echo "$OUTPUT" | jq -r '.is_flaky')" >> $GITHUB_OUTPUT
else
echo "is_flaky=false" >> $GITHUB_OUTPUT
fi
- name: Retry if flaky
if: steps.parse.outputs.is_flaky == 'true'
run: gh workflow run CI| ### Accessing Structured Outputs | ||
|
|
||
| All structured output fields are available in the `structured_output` output as a JSON string: | ||
|
|
There was a problem hiding this comment.
Documentation Gap: Missing Failure Behavior
The docs don't explain what happens when Claude fails to return structured output.
Add section:
### Error Handling
If `json_schema` is provided but Claude cannot return structured output, **the action will fail** with:json_schema was provided but Claude did not return structured_output
This ensures your automation workflows don't proceed with missing data.
**To handle failures gracefully**:
```yaml
- name: Analyze code
id: analyze
continue-on-error: true
uses: anthropics/claude-code-action@v1
with:
json_schema: |
{...}
- name: Handle failure
if: steps.analyze.outputs.conclusion == 'failure'
run: echo "Analysis failed - using defaults"
| duration_ms: 1000, | ||
| structured_output: structuredOutput, | ||
| }); | ||
| } |
There was a problem hiding this comment.
Test Coverage Gap: Missing critical edge cases
The tests are comprehensive but miss several important scenarios:
structured_output: nullvsundefined- Current test only checks when key is missing, not when explicitly set to null- Multiple result messages - Should verify
findLast()returns the last one - Non-object structured_output - What if Claude returns an array/string/number at root?
Add tests:
test("should handle structured_output: null", async () => {
const messages = [
{ type: "result", structured_output: null }
];
await writeFile(TEST_EXECUTION_FILE, JSON.stringify(messages));
await expect(parseAndSetStructuredOutputs(TEST_EXECUTION_FILE))
.rejects.toThrow("did not return structured_output");
});
test("should use last result when multiple exist", async () => {
await createMockExecutionFile({ first: true });
// Add another result
const messages = JSON.parse(await readFile(TEST_EXECUTION_FILE, "utf-8"));
messages.push({ type: "result", structured_output: { last: true } });
await writeFile(TEST_EXECUTION_FILE, JSON.stringify(messages));
await parseAndSetStructuredOutputs(TEST_EXECUTION_FILE);
const output = JSON.parse(setOutputSpy.mock.calls[0][1]);
expect(output.last).toBe(true);
expect(output.first).toBeUndefined();
});| } catch (error) { | ||
| if (error instanceof Error) { | ||
| throw error; // Preserve original error and stack trace | ||
| } | ||
| throw new Error(`Failed to parse structured outputs: ${error}`); | ||
| } | ||
| } |
There was a problem hiding this comment.
Code Quality: Misleading error handling pattern
This pattern claims to "preserve original error and stack trace" but that's misleading:
- Re-throwing doesn't preserve the catch site in the stack
- The non-Error case (second branch) is effectively unreachable since all errors thrown in the try block are Error instances:
readFilethrows ErrorJSON.parsethrows SyntaxError (extends Error)- Line 146 throws Error
Options:
A) Remove the pattern (simple):
} catch (error) {
// All errors in try block are Error instances
throw error instanceof Error
? error
: new Error(`Unexpected error: ${error}`);
}B) Document it's defensive (if you want to keep it):
} catch (error) {
// Defensive: handle exotic non-Error throws (strings, numbers, etc.)
// In practice, all errors from readFile/JSON.parse are Error instances
if (error instanceof Error) {
throw error;
}
throw new Error(`Failed to parse structured outputs: ${error}`);
}
Summary: Comprehensive Code ReviewI've completed a thorough review of PR #683 using specialized review agents. Here are the key findings: 🔴 Critical Issues1. Security: Command Injection VulnerabilityFile: base-action/src/index.ts:40-42 The JSON schema is concatenated into shell arguments without proper escaping, creating a potential command injection vector. See inline comment for details and fix. 2. Missing Output in Main ActionFile: action.yml The structured_output is generated by the base-action but not exposed in the main action's outputs section. Users cannot access this critical feature when using the full action. 🟡 High Priority Issues3. Performance: Inefficient File Parsing (base-action/src/run-claude.ts:134-143)4. Documentation: Fragile fromJSON() Pattern (docs/usage.md:215-217)5. Documentation: Missing Failure Behavior (docs/usage.md:230)🟢 Medium Priority Issues6. Test Coverage Gaps (base-action/test/structured-output.test.ts)7. Code Quality: Misleading Error Handling (base-action/src/run-claude.ts:159-165)8. Example Uses Wrong Version (examples/test-failure-analysis.yml:27)✅ Positive Aspects
Recommendations by PriorityMust fix before merge:
Should fix: Review Statistics: 13 files reviewed, 7 inline comments posted, 2 critical issues, 3 high priority issues, 3 medium priority issues |
|
Closing to create fresh PR without review noise |
Add support for Agent SDK structured outputs.
New Feature
json_schema- JSON schema for validated outputsDocumentation
https://docs.claude.com/en/docs/agent-sdk/structured-outputs
Examples
examples/test-failure-analysis.yml- Auto-retry flaky tests.github/workflows/test-structured-output.yml- Integration tests (disabled until feature ships)Tests