
Add support for other LLM services #29

Closed
yaroslavyaroslav opened this issue Oct 6, 2023 · 19 comments · Fixed by #40
Labels: enhancement (New feature or request)

Comments

@yaroslavyaroslav
Owner

yaroslavyaroslav commented Oct 6, 2023

A few competing services have released their APIs just recently.

@yaroslavyaroslav changed the title from "Add others services LLM support" to "Add support for other LLM services" Oct 6, 2023
@yaroslavyaroslav pinned this issue Oct 6, 2023
@yigitkonur

Prompt engineers need a dedicated IDE, or someone like you needs to take the initiative and bring these features to Sublime Text through a plugin, which would be truly wonderful for all of us. I hope you never lose your motivation for this project; I will become a sponsor as soon as possible. I sincerely thank you for your contribution.

@yaroslavyaroslav added the enhancement label Oct 10, 2023
@ishaan-jaff

Hi @yaroslavyaroslav @yigitkonur - I believe we can make this easier.
I'm the maintainer of LiteLLM; it lets you deploy an LLM proxy to call 100+ LLMs in one format (PaLM, Bedrock, OpenAI, Anthropic, etc.): https://github.com/BerriAI/litellm/tree/main/openai-proxy.

If this looks useful (we're used in production), please let me know how we can help.

Usage

PaLM request

curl http://0.0.0.0:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
     "model": "palm/chat-bison",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'

gpt-3.5-turbo request

curl http://0.0.0.0:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
     "model": "gpt-3.5-turbo",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'

claude-2 request

curl http://0.0.0.0:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
     "model": "claude-2",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'

@yaroslavyaroslav
Owner Author

@ishaan-jaff wow, thanks for highlighting this. Indeed, this could be a simpler solution than implementing all of those myself. There are some caveats on the Sublime Text side, though, e.g. the list of dependencies we can rely on within plugin code is quite limited. That said, I've seen plugins that rely on a completely third-party solution, like running Node.js code, so this should be solvable.

Just a few questions that I hope will save some time on both sides:

  1. Is it fully cross-platform when run outside a container, or are there pitfalls to overcome to make it work on Windows, Linux, and macOS?
  2. Do I understand correctly that this is essentially a local server that manages all the networking on its own based on the content of the requests it receives, i.e. the model field? I took a quick look at the docs; I just want to confirm this point specifically.

@ishaan-jaff

  • Yes, the proxy is cross-platform.
  • Yes, it's a proxy server that allows you to call all LLMs in one format. You can choose to deploy it or run it locally.
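
For illustration, here is a minimal sketch of how a Sublime Text plugin could call such a locally running proxy with nothing but Python's standard library (no extra dependencies); the host, port, and model name are assumptions carried over from the curl examples above, not settings from this plugin.

```python
import json
from http.client import HTTPConnection

def chat_via_proxy(model: str, prompt: str) -> str:
    # Assumes the proxy is already running on localhost:8000, as in the curl examples
    conn = HTTPConnection("localhost", 8000)
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    })
    conn.request(
        "POST",
        "/v1/chat/completions",
        body=payload,
        headers={"Content-Type": "application/json"},
    )
    response = json.loads(conn.getresponse().read())
    conn.close()
    # OpenAI-style responses keep the text under choices[0].message.content
    return response["choices"][0]["message"]["content"]

print(chat_via_proxy("gpt-3.5-turbo", "Say this is a test!"))
```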

@yaroslavyaroslav
Owner Author

@ishaan-jaff Thanks, I'll look into it in depth once I get closer to implementing this one.

@james2doyle

It would be cool if this plugin supported Ollama. You can run it locally as a standalone server, and make API calls to it: https://github.com/jmorganca/ollama/blob/main/docs/api.md
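
For reference, a minimal sketch of calling Ollama's local API with the standard library only; it assumes Ollama is running on its default port 11434 and that the example model (codellama here) has already been pulled with `ollama pull codellama`.

```python
import json
from http.client import HTTPConnection

def ollama_generate(model: str, prompt: str) -> str:
    # Ollama listens on localhost:11434 by default
    conn = HTTPConnection("localhost", 11434)
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    conn.request("POST", "/api/generate", body=payload,
                 headers={"Content-Type": "application/json"})
    result = json.loads(conn.getresponse().read())
    conn.close()
    # With "stream": false the generated text comes back in a single "response" field
    return result["response"]

print(ollama_generate("codellama", "Write a hello world in Python"))
```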

@ishaan-jaff

Hi @james2doyle, litellm already supports Ollama.

@james2doyle

@ishaan-jaff Oh nice. I misunderstood what litellm was; I thought it was a hosted service.

@ishaan-jaff

No worries. litellm is a Python package for calling 100+ LLMs in the same I/O format. We also offer a proxy server if you don't want to make code changes to your app.
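
As a rough sketch of what that unified interface looks like in code (the model names and the local Ollama endpoint below are examples, not recommendations from this thread):

```python
# litellm exposes one completion() call for many providers; responses follow the OpenAI schema.
from litellm import completion

messages = [{"role": "user", "content": "Say this is a test!"}]

# Hosted model: reads OPENAI_API_KEY from the environment
openai_resp = completion(model="gpt-3.5-turbo", messages=messages)

# Local model served by Ollama: point api_base at the local server
ollama_resp = completion(
    model="ollama/codellama",
    messages=messages,
    api_base="http://localhost:11434",
)

print(openai_resp["choices"][0]["message"]["content"])
print(ollama_resp["choices"][0]["message"]["content"])
```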

@yaroslavyaroslav
Owner Author

yaroslavyaroslav commented Nov 3, 2023

FYI: for now there's no way to use arbitrary dependencies within an ST plugin, but TIL that the Package Control 4.0 beta allows exactly that; judging by its current state, I believe it will be released within a quarter or so.

So all work on this task will start right after that release, along with some other missing features, like precise token counting.

UPD: I found the PC 4.0 beta project, and I now believe it's quite far from being released in the next quarter.

@yaroslavyaroslav
Owner Author

The good news is that PC 4.0.0 has just been released, which means it will soon be possible to add support for custom Python libraries in the package. We're not there yet, because packagecontrol.io itself still doesn't fully support the 4.0.0 scheme (e.g. arbitrary libraries as dependencies), but I believe it will take about a month or two to make that happen.

@rubjo

rubjo commented Feb 22, 2024

Any news on this? For now I'm using https://github.com/icebaker/nano-bots-api via https://github.com/icebaker/sublime-nano-bots to talk to Ollama / Mistral locally. That works, but I'd like to see whether something built on this plugin could be better.

@yaroslavyaroslav
Owner Author

Unfortunately not. Every time I've tried a local LLM as an assistant for my language of interest, the suggestion quality was way below GPT-4. More than that, I'm observing the same picture with all the competing services, such as Perplexity.

So honestly, I have no plans to implement this until things change.

I'm keeping an eye on the latest Bard/Gemini 1M context window, though. Maybe it'll be worth it.

@Aiq0
Contributor

Aiq0 commented Feb 25, 2024

Any news on this? For now I'm using https://github.com/icebaker/nano-bots-api via https://github.com/icebaker/sublime-nano-bots to talk to Ollama / Mistral locally. That works, but I'd like to see whether something built on this plugin could be better.

I was able to get Ollama working with a few small changes, since the Ollama API is compatible with the OpenAI API:

  • install the package manually, so you can make changes to it
  • change the request in openai_network_client.py to use self.connection = HTTPConnection('localhost:11434') (initially written here as HTTPClient, which was a typo) instead of the logic present there
  • add a dummy token "ollama-dummy-longer-than-10-characters" (or remove the token check)
  • change the models in assistants:
"assistants": [
	{
		"assistant_role": "Apply the change requested by the user to the code with respect to senior knowledge of programming",
		"chat_model": "codellama",
		"max_tokens": 4000,
		"name": "Replace",
		"prompt_mode": "replace"
	},
	{
		"assistant_role": "Insert code or whatever user will request with the following command instead of placeholder with respect to senior knowledge of programming",
		"chat_model": "codellama",
		"max_tokens": 4000,
		"name": "Insert",
		"prompt_mode": "insert",
		"placeholder": "## placeholder"
	},
	{
		"assistant_role": "Append code or whatever user will request with the following command instead of placeholder with respect to senior knowledge of programming",
		"chat_model": "codellama",
		"max_tokens": 4000,
		"name": "Append",
		"prompt_mode": "append"
	},
]

So it would work nicely if there were config options to:

  • toggle between HTTP and HTTPS
  • change the URL
  • not require a token when the URL is not api.openai.com (or just advise adding some dummy token)
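
A rough sketch of how those options could be covered on the plugin side with a single configurable URL (the function name and default below are hypothetical, not what eventually landed in the plugin): parse the URL and pick HTTPConnection or HTTPSConnection accordingly, which handles both the scheme toggle and the address change with one setting.

```python
# Hypothetical sketch: derive the connection from one configurable URL.
from http.client import HTTPConnection, HTTPSConnection
from urllib.parse import urlparse

def make_connection(url: str = "https://api.openai.com"):
    parsed = urlparse(url)
    if parsed.scheme == "https":
        return HTTPSConnection(parsed.hostname, parsed.port or 443)
    # Plain HTTP, e.g. a local Ollama or LiteLLM proxy endpoint
    return HTTPConnection(parsed.hostname, parsed.port or 80)

conn_local = make_connection("http://localhost:11434")   # local Ollama
conn_openai = make_connection()                          # default OpenAI endpoint
```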

@rubjo

rubjo commented Feb 26, 2024

@Aiq0 Confirmed working, thank you!
(Used HTTPConnection, not HTTPClient)

@Aiq0
Contributor

Aiq0 commented Feb 26, 2024

@Aiq0 Confirmed working, thank you! (Used HTTPConnection, not HTTPClient)

You're welcome. (Sorry, that was a typo.)

@yaroslavyaroslav
Owner Author

@rubjo @Aiq0 glad to hear it, folks!

It would be just awesome if you'd go the extra mile and open a PR with this functionality. The network-layer code is probably the least confusing piece of the whole code base, so I believe it could be done without too much effort.

@Aiq0
Contributor

Aiq0 commented Feb 26, 2024

@rubjo @Aiq0 glad to hear it, folks!

It would be just awesome if you'd go the extra mile and open a PR with this functionality. The network-layer code is probably the least confusing piece of the whole code base, so I believe it could be done without too much effort.

OK, I am going to add some config settings for tweaking the connection and create a PR (most likely tomorrow). Is there anything else that should be considered?

@yaroslavyaroslav
Owner Author

I believe not much. Just please try to avoid overcomplicating things, i.e. don't add extra settings if they can be avoided (e.g. I believe it's perfectly fine for users to provide a dummy token for local models rather than adding a separate toggle for that).

If you're about to add some global settings options, please consider putting them at the first level if possible.

A few words about this new feature in the Readme would definitely be worth it as well.
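
Purely as an illustration of the "first-level settings plus dummy token" idea (the key names below are hypothetical; the options that actually landed in the referenced PR may differ), a settings file could carry the server address and placeholder token at the top level, next to the existing assistants list:

```json
{
    "url": "http://localhost:11434",
    "token": "ollama-dummy-longer-than-10-characters"
}
```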
