@michaelneale
Collaborator

Attempting to cover the new code mode with the same tests we run for providers. It should work the same as the stock developer extension.

Contributor

Copilot AI left a comment

Pull request overview

This PR extends the provider test script to support testing the new code execution mode alongside the existing developer extension mode. The test can now be run in two modes: normal mode (direct tool calls) and code execution mode (JS batching), allowing verification that code execution mode works equivalently to the stock developer extension across all providers.

  • Adds --code-exec flag to enable code execution mode testing
  • Configures mode-specific settings for builtins and success patterns
  • Parameterizes test assertions to work with both modes
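The three points above can be sketched roughly as follows. This is a minimal illustration, not the script from the PR: the `--code-exec` flag, `CODE_EXEC_MODE`, `BUILTINS`, and the code execution `SUCCESS_PATTERN` come from the diff excerpts in this conversation, while the `configure_mode` function name and the normal-mode values are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Sketch of mode selection for a provider test script.
# configure_mode is a hypothetical helper; the normal-mode values are assumed.

configure_mode() {
  CODE_EXEC_MODE=false
  for arg in "$@"; do
    case $arg in
      --code-exec)
        CODE_EXEC_MODE=true
        ;;
    esac
  done

  if [ "$CODE_EXEC_MODE" = true ]; then
    # Values taken from the diff excerpt in this PR.
    BUILTINS="developer,code_execution"
    SUCCESS_PATTERN="execute_code | code_execution|read_module | code_execution"
  else
    # Assumed values: the PR excerpt does not show the normal-mode settings.
    BUILTINS="developer"
    SUCCESS_PATTERN="normal_mode_marker"   # hypothetical placeholder
  fi
}

configure_mode "$@"
echo "code_exec=$CODE_EXEC_MODE builtins=$BUILTINS"
```

Keeping both settings behind one switch lets the later assertions refer only to `$BUILTINS` and `$SUCCESS_PATTERN`, so the same test body runs in either mode.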

case $arg in
--code-exec)
CODE_EXEC_MODE=true
shift

Copilot AI Dec 19, 2025

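The shift command does not affect the iteration here: for arg in "$@" expands the argument list once before the loop starts, so shifting the positional parameters does not change which arguments the loop visits. Remove this line, as it serves no purpose in this context.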

Suggested change
shift
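A small standalone check of this behavior, assuming bash. The function name `visit_with_shift` and the sample arguments are made up for the illustration:

```shell
#!/usr/bin/env bash
# Shows that shift inside `for arg in "$@"` does not change which arguments
# the loop visits; it only mutates the positional parameters.

visit_with_shift() {
  local seen=""
  for arg in "$@"; do
    seen="$seen$arg,"
    shift   # changes $1.. and $#, but the loop's word list is already expanded
  done
  # Every argument was visited, and all positional parameters are now gone.
  echo "seen=$seen remaining=$#"
}

visit_with_shift --code-exec a b
# prints: seen=--code-exec,a,b, remaining=0
```

The loop still sees all three arguments, so the `shift` is harmless for iteration, but any later code that reads `$1` or `$#` would find them emptied.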

if [ "$CODE_EXEC_MODE" = true ]; then
echo "Mode: code_execution (JS batching)"
BUILTINS="developer,code_execution"
SUCCESS_PATTERN="execute_code | code_execution|read_module | code_execution"

Copilot AI Dec 19, 2025

The regex pattern mixes pipes that look intended as literal text (those surrounded by spaces) with the regex OR operator (the pipe without spaces), which makes the intent unclear. Consider using explicit grouping like "(execute_code | code_execution)|(read_module | code_execution)" to make the intent clearer, or escape the literal pipes if they should be matched literally.

Suggested change
SUCCESS_PATTERN="execute_code | code_execution|read_module | code_execution"
SUCCESS_PATTERN="(execute_code | code_execution)|(read_module | code_execution)"
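How these pipes are interpreted depends on how the pattern is consumed, which the excerpt does not show. The sketch below assumes it is passed to `grep -E`; in an extended regex every unescaped `|` is alternation regardless of surrounding spaces, so grouping alone does not make the inner pipes literal. The `matches` helper and the sample lines are hypothetical.

```shell
#!/usr/bin/env bash
# Compares how grep -E treats unescaped, grouped, and escaped pipes.
# Assumes the pattern is used with grep -E (not confirmed by the PR excerpt).

matches() {  # usage: matches <pattern> <text>; prints yes or no
  if printf '%s\n' "$2" | grep -qE -- "$1"; then echo yes; else echo no; fi
}

line="execute_code | code_execution"   # hypothetical sample output line
partial="xx code_execution"            # lacks "execute_code |"

pattern="execute_code | code_execution|read_module | code_execution"
grouped="(execute_code | code_execution)|(read_module | code_execution)"
escaped="execute_code \| code_execution|read_module \| code_execution"

matches "$pattern" "$partial"   # yes: " code_execution" alone is an alternative
matches "$grouped" "$partial"   # yes: the inner pipes are still alternation
matches "$escaped" "$partial"   # no: \| requires a literal pipe
matches "$escaped" "$line"      # yes
```

If the intent is to match the literal text `execute_code | code_execution`, the escaped form is the one that enforces it under `grep -E`.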

Copilot AI review requested due to automatic review settings December 19, 2025 01:02
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

case $arg in
--code-exec)
CODE_EXEC_MODE=true
;;

Copilot AI Dec 19, 2025

The shift command inside the loop will not cause the loop to skip arguments. Since you're iterating over "$@" with "for arg in", shift modifies the positional parameters but doesn't affect the loop iteration. Remove the shift command, as it serves no purpose here and could cause unexpected behavior if additional arguments are added in the future.

Suggested change
;;
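For contrast, `shift` is meaningful when the loop consumes the positional parameters directly, which is a common way to support flags that take a value. The following is a sketch under assumptions, not code from the PR: `parse_args` and the `--provider` flag are hypothetical and only illustrate the idiom.

```shell
#!/usr/bin/env bash
# while/shift argument parsing: here shift drives the loop forward.
# parse_args and --provider are hypothetical, used only for illustration.

parse_args() {
  CODE_EXEC_MODE=false
  PROVIDER=""
  while [ "$#" -gt 0 ]; do
    case $1 in
      --code-exec)
        CODE_EXEC_MODE=true
        shift
        ;;
      --provider)
        shift                     # drop the flag itself
        PROVIDER=${1:-}           # take its value, if one was given
        if [ "$#" -gt 0 ]; then
          shift                   # drop the value
        fi
        ;;
      *)
        shift                     # ignore unknown arguments
        ;;
    esac
  done
}

parse_args --provider example --code-exec
echo "provider=$PROVIDER code_exec=$CODE_EXEC_MODE"
# prints: provider=example code_exec=true
```

With a plain `for arg in "$@"` loop and a single boolean flag, as in this PR, no `shift` is needed at all.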

Collaborator

@codefromthecrypt codefromthecrypt left a comment

Ah, so the main change here is that we are still doing "list the files...", just adding code execution to do it, which forces the other extensions inside. Same test.

@codefromthecrypt codefromthecrypt merged commit d481404 into main Dec 19, 2025
20 checks passed
@codefromthecrypt codefromthecrypt deleted the micn/test-code-mode branch December 19, 2025 04:02
@alexhancock
Collaborator

Nice work @michaelneale!
