Skip to content

Conversation

@michaelneale
Copy link
Collaborator

@michaelneale michaelneale commented May 19, 2025

for: #2573

models that specialise editing can be used to reduce the load on the main LLM providers, increase accuracy and quality and speed and lower cost.
This supports both morphllm and relace models (and any openai compatible endpoint)

@michaelneale michaelneale changed the title WIP: first pass at trying the morph model WIP: first pass at trying the morph model for fast editing May 19, 2025
@michaelneale
Copy link
Collaborator Author

I ran it through the benchmark ("system" is default, local is with morph):

Summary of Benchmark Results

I've run the developer_search_replace benchmark using both the system goose binary and the local ./target/release/goose binary. Here are the results:

System goose Binary Results

Scores:

  • 3 out of 4 runs scored 1.0 (success)
  • 1 run scored 0.0 (failure)

Execution Times:

  • 32.86 seconds
  • 70.13 seconds
  • 37.41 seconds
  • 41.97 seconds

Average Execution Time: 45.59 seconds
Success Rate: 75%

Local ./target/release/goose Binary Results

Scores:

  • 4 out of 4 runs scored 1.0 (success)

Execution Times:

  • 50.65 seconds
  • 64.52 seconds
  • 75.51 seconds
  • 68.33 seconds

Average Execution Time: 64.75 seconds
Success Rate: 100%

Comparison

  1. Success Rate:

    • Local binary: 100% (4/4 successful runs)
    • System binary: 75% (3/4 successful runs)
  2. Execution Time:

    • Local binary: Slower with an average of 64.75 seconds
    • System binary: Faster with an average of 45.59 seconds
  3. Consistency:

    • Local binary: More consistent in terms of success rate (all runs succeeded)
    • System binary: Less consistent (one run failed)

The local build appears to be more reliable in completing the benchmark successfully, but it takes longer to execute. The system binary is faster on average but had one failed run.

@michaelneale
Copy link
Collaborator Author

@bhaktatejas922 I will take another look - don't worry too much about those times, the fact it is did stuff which wasn't reliably working before so measuring work vs not is tricky! we need to get better at these benchmarks

@michaelneale michaelneale requested a review from baxen May 22, 2025 09:08
* main: (82 commits)
  feat: lead/worker model (#2719)
  fix: pass ref in pr comment workflow (#2777)
  feat: goose web for local terminal alternative (#2718)
  chore: run CI on merge_group (#2786)
  fix: Don't break from consuming subprocess output in shell tool until both streams are done (#2771)
  Add retries w/ exponential backoff for databricks provider (#2764)
  Fix paths in google drive mcp documentation (#2775)
  testing windows build (#2770)
  docs: Add Context7 YouTube Video (#2779)
  cli(command): Add `export` command to CLI for markdown export of sessions (#2533)
  fix(copilot): gh copilot auth token conflicts w/ gh mcp env var (#2743)
  feat(providers): Add support for Gemini 2.5 Flash Preview and Pro Preview models (#2780)
  fix: pr comment build cli workflow (#2774)
  hotfix: don't always run prompt (#2773)
  Lifei/test workflow (#2772)
  chore: use hermit to install node, rust and protoc (#2766)
  Feat: Refined the documentation for Goose (#2751)
  mcp(developer): add fallback on .gitignore if no .gooseignore is present (#2661)
  cli(ux): Show active context length in CLI (#2315)
  cli(config): Add GOOSE_CONTEXT_STRATEGY setting (#2666)
  ...
@michaelneale michaelneale changed the title WIP: first pass at trying the morph model for fast editing enabling optional fast edit models Jun 5, 2025
michaelneale and others added 9 commits June 5, 2025 15:02
* main:
  claude 4 listing (#2843)
  fix: Use the existing spinner in interactive mode (#2829)
  chore(release): release version 1.0.27 (#2844)
  Revert "Mnovich/temporal scheduler (#2745)" (#2839)
  chore(release): release version 1.0.26 (#2833)
  Removed ui-v2 directory and updated project to use node in hermit and readme (#2831)
  Mnovich/temporal scheduler (#2745)
  fix: intel builds (#2832)
  chore(release): release version 1.0.25 (#2811)
  Nostrbook MCP is now on npm (#2816)
  Update macOS install guide with Homebrew instructions (#2823)
  remember window position (#2808)
  feat(ui): put the scheduler behind an alpha (#2810)
  debug config issues on windows (#2809)
  Add Speech MCP extension to extensions directory (#2807)
  Iand/blog goosehints metadata update (#2800)
  Iand/blog goosehints (#2798)
  blog post about goosehints and persistent context (#2796)
  [goose-llm] system prompt override (#2791)
  chore: small bit of a cleanup - removing unused dir (#2761)
* main: (51 commits)
  Docs: Fetch MCP doesnt work with Gemini (#2940)
  feat: add Help & Feedback section in App Settings (#2935)
  docs: blog update (#2937)
  docs: fixing blog image (#2936)
  docs: lead/worker tutorial and blog post (#2930)
  chore(deps): bump golang.org/x/net from 0.14.0 to 0.38.0 in /temporal-service (#2836)
  chore(deps): bump google.golang.org/grpc from 1.57.0 to 1.57.1 in /temporal-service (#2834)
  fix updater download text (#2919)
  chore(release): release version 1.0.28 (#2906)
  Enable updater and remove unzipping and installing update text (#2918)
  docs: updates for lead-worker model (#2916)
  fix: correct spelling in error messages and documentation (#2840)
  Change updater to use platform agnostic and secure zip library (#2913)
  Docs: Edit recipes on Goose desktop (#2912)
  Disable updater until we can debug more in release (#2908)
  fix router trait error (#2910)
  fix: Check for stderr error in receive() (#2905)
  Damien/sagemaker tgi (#2888)
  feat: (tool router) llm tool selector (#2866)
  feat: (tool router) adds extension name in vector db & search tool (#2855)
  ...
@michaelneale michaelneale marked this pull request as ready for review June 16, 2025 09:53
@michaelneale michaelneale changed the title enabling optional fast edit models feat: optional fast edit models Jun 16, 2025
@michaelneale michaelneale requested a review from ahau-square June 16, 2025 09:56
* main: (26 commits)
  chore(release): release version 1.0.29 (#2978)
  [fix][small] Replaced goose prompt unicode quotations with ascii quotations (#2972)
  fix: goose recipe prompt is not shown again when switch the view from settings to chat (#2870)
  fix: remove computer controller presentation (#2956)
  Fix GitHub Copilot Provider Config (#2955)
  Blog: Why I Used Goose to Build a Chaotic Emotion Detection App (#2959)
  Docs: Recipe settings (#2970)
  feat(ui): Add confirmation dialog for unsaved changes in extension modal (#2971)
  feat: alphabetize extensions in goose CLI (#2966)
  switch roles on condition for windows (#2975)
  fix version param for canary (#2974)
  enabling windows builds with code signing (#2968)
  feat(cli): add system prompt parameter to run command (#2253)
  Fix window not showing for some users (#2967)
  Add documentation for running with Ramalama local model serving in OCI Containers (#1973)
  Reddit MCP Server Tutorial (#2949)
  [fix] goose not quitting app completely (#2950)
  Opopadich/issue 1625 (#2904)
  chore(deps): bump go.temporal.io/api from 1.24.0 to 1.44.1 in /temporal-service (#2837)
  feat: add newline at end of file writes (#2221)
  ...
Copy link
Contributor

@Kvadratni Kvadratni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

@michaelneale michaelneale merged commit 2a4a0e1 into main Jun 18, 2025
7 checks passed
@michaelneale michaelneale deleted the micn/fast-edit-morph-model branch June 18, 2025 05:12
laanak08 added a commit that referenced this pull request Jun 18, 2025
* main: (28 commits)
  feat: optional fast edit models (#2580)
  feat: Add lead-worker model selection and real-time model display in GUI (#2964)
  chore(release): release version 1.0.29 (#2978)
  [fix][small] Replaced goose prompt unicode quotations with ascii quotations (#2972)
  fix: goose recipe prompt is not shown again when switch the view from settings to chat (#2870)
  fix: remove computer controller presentation (#2956)
  Fix GitHub Copilot Provider Config (#2955)
  Blog: Why I Used Goose to Build a Chaotic Emotion Detection App (#2959)
  Docs: Recipe settings (#2970)
  feat(ui): Add confirmation dialog for unsaved changes in extension modal (#2971)
  feat: alphabetize extensions in goose CLI (#2966)
  switch roles on condition for windows (#2975)
  fix version param for canary (#2974)
  enabling windows builds with code signing (#2968)
  feat(cli): add system prompt parameter to run command (#2253)
  Fix window not showing for some users (#2967)
  Add documentation for running with Ramalama local model serving in OCI Containers (#1973)
  Reddit MCP Server Tutorial (#2949)
  [fix] goose not quitting app completely (#2950)
  Opopadich/issue 1625 (#2904)
  ...
lifeizhou-ap added a commit that referenced this pull request Jun 20, 2025
* main:
  Blog: Add video to container use blog (#3008)
  Use official logo in Goose web (#3012)
  fix shims for extensions on windows (#3009)
  fix powershell executions (#3006)
  Docs linux desktop (#3007)
  Platform Tool for Scheduler: Allow Goose to Manage Its Own Schedule (#2944)
  docs: container use blog and guide (#2962)
  Fix: Workflow syntax (#3002)
  Added just lint-ui for linting front end code (#2997)
  fix typo in secret name (#2994)
  feat(ui): add chain-of-thought panel above assistant messages (#2899)
  feat(cli): Add `--quiet /-q` flag to goose run (#2939)
  Feat: Recipe Library (#2946)
  Docs: Goose on Windows Installation (#2990)
  Fixes : Workflow error on issue comment (#2958)
  Add a setting for the quit confirmation dialog (#2901)
  Update bundle-desktop-windows.yml (#2988)
  feat: optional fast edit models (#2580)
  feat: Add lead-worker model selection and real-time model display in GUI (#2964)
btdeviant pushed a commit to btdeviant/goose that referenced this pull request Jun 25, 2025
s-soroosh pushed a commit to s-soroosh/goose that referenced this pull request Jul 18, 2025
Co-authored-by: Eitan Borgnia <[email protected]>
Signed-off-by: Soroosh <[email protected]>
cbruyndoncx pushed a commit to cbruyndoncx/goose that referenced this pull request Jul 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants