Support using ollama as an inline_completion_provider #15968
Comments
Some considerations that come to mind regarding this, keeping in mind the current support for Ollama:

In contrast to something like GitHub Copilot, whose entire purpose is to provide inline completions, some of the most popular models used with Ollama, such as llama3.1, do not support insert (fill-in-the-middle) completions. For a model that does, such as deepseek-coder-v2, we can send a request like:

    {
      "model": "deepseek-coder-v2:latest",
      "prompt": "time := time.",
      "suffix": " return time;",
      "options": {
        "temperature": 0
      },
      "keep_alive": -1,
      "stream": false
    }

And get back a response such as:

    {
      "model": "deepseek-coder-v2:latest",
      "created_at": "2024-09-07T14:28:17.013718016Z",
      "response": "Now().Unix()\n", // our inline completion, inserted between prompt and suffix
      "done": true,
      "done_reason": "stop",
      // metadata fields omitted
    }

For models that do not support inline completions, the same request results in an error, i.e.:

    {
      "error": "llama3.1:latest does not support insert"
    }

However, it would still be desirable to use, say, llama3.1 for chat while a model that does support insert handles inline completions, so the chat model and the completion model would need to be configurable separately.

Another consideration is how much sense it would make to support remote Ollama instances for inline completions. I've got Ollama running both locally on my laptop and on a server I've got on my LAN to get access to more powerful models.
This repo was mentioned in #14134: ollama-copilot, a proxy that allows you to use Ollama as a GitHub Copilot-style completion provider. Written in Go. https://github.com/bernardo-bruning/ollama-copilot
As mentioned in #16030, to address your concern @MatejLach, we should be able to configure an autocomplete model alongside a chat model. Copy-pasting from that issue: here's an example of the config.json in Continue.dev to change the autocomplete model:
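A sketch of that kind of configuration, assuming Continue's tabAutocompleteModel block; the model name here is just the one used earlier in this thread:

    {
      "tabAutocompleteModel": {
        "title": "DeepSeek Coder v2",
        "provider": "ollama",
        "model": "deepseek-coder-v2:latest"
      }
    }

The chat models are configured separately (in Continue's case in its own list), which is exactly the chat/completion split that would be useful in Zed.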
@tlvenn this config is for Continue.dev, and that extension is not available in Zed right now, so how would this work in Zed?
I just started testing Zed and I think this is a really necessary and important feature for transitioning from VS Code with Continue.dev.
This feature is a must if I am going to move from Cursor. GitHub Copilot is not good enough, so I at least want to have more options.
Not sure if there are any specific concerns with Ollama, but generally speaking I think it makes perfect sense to support remote instances. For instance, I sometimes use an old quad-core Celeron laptop (with 4 GB RAM and no GPU) for coding. It's completely useless for LLM inference, but I have a 12-core Linux server with 32 GB RAM and a GTX 1080 GPU on my LAN, and it serves up 7B models in llama.cpp without breaking a sweat. I'm also quite confident that my LAN does not add any noticeable lag. I for one would prefer running models on a locally hosted remote server.
I think, given that Ollama is just an HTTP API, the specifics of where you host your Ollama instance are not really Zed's concern, right?
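To make that concrete: what this thread seems to be asking for would presumably look something like the sketch below in Zed's settings.json. This is purely a hypothetical mockup; "ollama" is not an accepted inline_completion_provider value today, and the nested block with api_url and inline_completion_model is invented here for illustration, not an existing Zed setting.

    {
      "inline_completion_provider": "ollama", // hypothetical value, not supported today
      "ollama": {
        "api_url": "http://192.168.1.50:11434", // a local or LAN-hosted Ollama instance (11434 is Ollama's default port)
        "inline_completion_model": "deepseek-coder-v2:latest"
      }
    }

Whether api_url points at localhost or a LAN server would then be irrelevant from Zed's side, since either way it is just another HTTP endpoint.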
Check for existing issues
Describe the feature
I am successfully using my local Ollama models via the assistant panel. I would love to be able to use them as an inline_completion_provider as well. Currently, only the none, copilot, or supermaven values are supported.

If applicable, add mockups / screenshots to help present your vision of the feature
No response