-
Notifications
You must be signed in to change notification settings - Fork 2.3k
chore: cover code mode with end to end provider tests #6183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR extends the provider test script to support testing the new code execution mode alongside the existing developer extension mode. The test can now be run in two modes: normal mode (direct tool calls) and code execution mode (JS batching), allowing verification that code execution mode works equivalently to the stock developer extension across all providers.
- Adds
--code-execflag to enable code execution mode testing - Configures mode-specific settings for builtins and success patterns
- Parameterizes test assertions to work with both modes
scripts/test_providers.sh
Outdated
| case $arg in | ||
| --code-exec) | ||
| CODE_EXEC_MODE=true | ||
| shift |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The shift command has no effect here because the loop iterates over "$@" directly, not positional parameters. Remove this line as it serves no purpose in this context.
| shift |
scripts/test_providers.sh
Outdated
| if [ "$CODE_EXEC_MODE" = true ]; then | ||
| echo "Mode: code_execution (JS batching)" | ||
| BUILTINS="developer,code_execution" | ||
| SUCCESS_PATTERN="execute_code | code_execution|read_module | code_execution" |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The regex pattern mixes literal pipes (with spaces) and regex OR operator (pipe without spaces), making it unclear. Consider using explicit grouping like "(execute_code | code_execution)|(read_module | code_execution)" to make the intent clearer, or escape the literal pipes if they should be matched literally.
| SUCCESS_PATTERN="execute_code | code_execution|read_module | code_execution" | |
| SUCCESS_PATTERN="(execute_code | code_execution)|(read_module | code_execution)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| case $arg in | ||
| --code-exec) | ||
| CODE_EXEC_MODE=true | ||
| ;; |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The shift command inside the loop will cause the loop to skip arguments. Since you're iterating over "$@" with "for arg in", the shift modifies the positional parameters but doesn't affect the loop iteration. Remove the shift command as it serves no purpose here and could cause unexpected behavior if additional arguments are added in the future.
| ;; |
codefromthecrypt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah so the main change here is we are still doing "list the files...", just that we are adding code execution to do it which forces the other extensions inside. Same test
|
Nice work @michaelneale ! |
Attempting to cover the new code mode with same tests for providers. Should work the same as stock developer extension