Ollama: Add basic configuration for LiteLLM proxy #44
`etc/litellm-config.yaml` (new file, `@@ -0,0 +1,43 @@`):

```yaml
# LiteLLM supports all models from Ollama.
# https://docs.litellm.ai/docs/providers/ollama
#
# uvx litellm --config etc/litellm-config.yaml --detailed_debug
#
# TODO: Use `ollama_chat`
# We recommend using ollama_chat for better responses.
---

model_list:

  - model_name: "llama3.2"
    litellm_params:
      model: "ollama/llama3.2"
      api_base: "http://localhost:11434"

  - model_name: "gemma3:1b"
    litellm_params:
      model: "ollama/gemma3:1b"
      api_base: "http://localhost:11434"

  - model_name: "qwen3:0.6b"
    litellm_params:
      model: "ollama/qwen3:0.6b"
      api_base: "http://localhost:11434"

  - model_name: "deepseek-r1:7b"
    litellm_params:
      model: "ollama/deepseek-r1:7b"
      api_base: "http://localhost:11434"
```
**Review comment on lines +10 to +29 (coderabbitai) — 💡 Verification agent, 🧩 Analysis chain:**

Model configuration looks good, but the URL discrepancy needs explanation. The model configurations are correctly structured for LiteLLM with Ollama. However, there is a discrepancy between the Ollama server URL used here (`http://localhost:11434`) and the LiteLLM proxy URL used by the application code (`http://localhost:4000`).

🏁 Script executed:

```bash
#!/bin/bash
# Check the architecture to understand the relationship between URLs
echo "Checking for LiteLLM proxy configuration files"
fd -t f -g "*.y*ml" -g "*.json" | xargs grep -l "4000" || echo "No references to port 4000 found in config files"
echo "Checking if there are any scripts starting LiteLLM proxy"
fd -t f -g "*.sh" -g "*.py" | xargs grep -l "litellm" || echo "No scripts found starting LiteLLM proxy"
```

🏁 Script executed:

```bash
#!/bin/bash
# Search for any references to port 4000 across the repository
rg -n "4000"
# Specifically look for the LiteLLM proxy URL pattern
rg -n "localhost:4000"
```

**Clarify LiteLLM proxy vs. Ollama server URL discrepancy.** There is a mismatch between the Ollama server endpoint in `etc/litellm-config.yaml` (`http://localhost:11434`) and the base URL the application connects to (`http://localhost:4000`). Please confirm which port your application should use and align both configurations.

**@amotl:** @coderabbitai: In this setup, the application connects to the LiteLLM proxy on port 4000, and the proxy forwards requests to the Ollama server on port 11434.

**coderabbitai:** @amotl Thank you for clarifying! I understand the architecture now: the application talks to the LiteLLM proxy at `http://localhost:4000`, and the proxy relays requests to the Ollama server at `http://localhost:11434`, as configured via `api_base`. This explains the apparent discrepancy perfectly. To make this clearer for future maintainers, you might consider adding a brief comment in the Python code explaining this proxy relationship, but it's certainly not a critical issue.

🌐 Web query (default port of the LiteLLM proxy) — 💡 Result: The default port for the LiteLLM proxy server is 5000; a different port can be chosen when starting the server.

**coderabbitai:** @amotl Thank you for clarifying! Though I should correct my previous statement about the default port, the architecture you describe — application → LiteLLM proxy (`:4000`) → Ollama server (`:11434`) — explains the apparent discrepancy perfectly. Is port 4000 for LiteLLM configured explicitly elsewhere in your setup, or is it handled when starting the proxy?

**@amotl:** @coderabbitai: Exactly. Port 4000 seems to be the standard port used by LiteLLM.

**coderabbitai:** @amotl Thank you for the confirmation and for providing that helpful reference about port 4000 being the standard port for LiteLLM proxy traffic. This matches perfectly with the architecture you've set up.
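For context, a minimal sketch of how this architecture could be brought up locally, assuming LiteLLM's `--port` option and Ollama's default port; neither command is part of this PR:

```bash
# Start the Ollama server (listens on port 11434 by default).
ollama serve &

# Start the LiteLLM proxy with the configuration added in this PR,
# binding it explicitly to port 4000, the port the application expects.
uvx litellm --config etc/litellm-config.yaml --port 4000 --detailed_debug
```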
`etc/litellm-config.yaml` (continued):

```yaml
# https://github.com/BerriAI/litellm/issues/1517#issuecomment-1922022209
#model_list:
#  - model_name: ollama-codellama
#    litellm_params:
#      model: ollama/codellama:70b
#      api_base: http://0.0.0.0:11434
#      rpm: 1440
#    model_info:
#      version: 2

#litellm_settings:
#  drop_params: True
#  set_verbose: True
```
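Once the proxy is running with this file, a quick sanity check could look like the following; the `/v1/models` route is assumed from LiteLLM's OpenAI-compatible proxy API and is not referenced anywhere in the PR itself:

```bash
# The returned model ids should match the `model_name` entries above,
# e.g. "llama3.2", "gemma3:1b", "qwen3:0.6b", and "deepseek-r1:7b".
curl -s http://localhost:4000/v1/models
```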
Changes to the Python backend module (file name not shown in this view):

```diff
@@ -50,10 +50,6 @@
             raise ImportError("The 'openai' package is required when using the OpenAI backend")
         if self.backend == "claude" and not CLAUDE_AVAILABLE:
             raise ImportError("The 'claudette' package is required when using the Claude backend")
-        if self.backend == "openai" and not os.environ.get("OPENAI_API_KEY"):
-            raise ValueError(
-                "OPENAI_API_KEY environment variable is required when using 'openai' backend"
-            )
         if self.backend == "claude" and not os.environ.get("ANTHROPIC_API_KEY"):
             raise ValueError(
                 "ANTHROPIC_API_KEY environment variable is required when using 'claude' backend"
@@ -110,7 +106,7 @@
     - https://community.openai.com/t/how-is-developer-message-better-than-system-prompt/1062784
     """

-    client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
+    client = OpenAI(api_key="n/a", base_url="http://localhost:4000")
```
**Review comment (coderabbitai) — 🛠️ Refactor suggestion:**

Replace hardcoded connection details with environment variables. Using hardcoded connection details (`api_key="n/a"`, `base_url="http://localhost:4000"`) ties the code to one local setup; reading them from the environment keeps the proxy location configurable.

📝 Committable suggestion — suggested change:

```diff
-    client = OpenAI(api_key="n/a", base_url="http://localhost:4000")
+    client = OpenAI(
+        api_key=os.environ.get("LITELLM_API_KEY", "n/a"),
+        base_url=os.environ.get("LITELLM_BASE_URL", "http://localhost:4000")
+    )
```
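Fleshed out, the reviewer's suggestion could look roughly like the sketch below; the `make_client` helper name is hypothetical, and the `LITELLM_API_KEY`/`LITELLM_BASE_URL` variable names come from the suggestion above rather than from existing project configuration:

```python
import os

from openai import OpenAI


def make_client() -> OpenAI:
    """Build an OpenAI-compatible client pointed at the LiteLLM proxy.

    The proxy (assumed to listen on http://localhost:4000) forwards
    requests to the Ollama server declared in etc/litellm-config.yaml.
    """
    return OpenAI(
        api_key=os.environ.get("LITELLM_API_KEY", "n/a"),
        base_url=os.environ.get("LITELLM_BASE_URL", "http://localhost:4000"),
    )
```

Keeping the defaults identical to the hardcoded values would preserve the current local behaviour while allowing overrides in other environments.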
Diff, continued:

```diff
 
     input_messages: t.List[Message] = []
 
@@ -140,9 +136,13 @@
     )

     # model = "gpt-4o"  # noqa: ERA001
-    model = "gpt-4.1"  # noqa: ERA001
+    # model = "gpt-4.1"  # noqa: ERA001
     # model = "o4-mini"  # noqa: ERA001
     # model = "o3"  # noqa: ERA001
+    model = "llama3.2"  # noqa: ERA001
+    # model = "gemma3:1b"  # noqa: ERA001
+    # model = "qwen3:0.6b"  # noqa: ERA001
+    # model = "deepseek-r1:7b"  # noqa: ERA001
     reasoning = None
     if model == "o4-mini":
         reasoning = Reasoning(
```
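With `llama3.2` selected as the default model, a request routed through the proxy could be exercised in isolation roughly as follows; this uses the Chat Completions API purely for illustration and may not match the API surface the application code above actually uses:

```python
from openai import OpenAI

# Same connection details as the patched code: the LiteLLM proxy on port 4000
# resolves "llama3.2" to ollama/llama3.2 per etc/litellm-config.yaml.
client = OpenAI(api_key="n/a", base_url="http://localhost:4000")

response = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Reply with a single short greeting."}],
)
print(response.choices[0].message.content)
```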
**Review comment on the file header comments (coderabbitai) — 💡 Verification agent, 🧩 Analysis chain:**

Good documentation with usage instructions. The comments provide clear guidance on the purpose of this configuration file and how to use it with LiteLLM. The TODO note about using `ollama_chat` for better responses is helpful. Consider implementing the `ollama_chat` recommendation mentioned in the TODO.

🌐 Web query — 💡 Result: **Difference between `ollama` and `ollama_chat` in LiteLLM.** LiteLLM provides a unified interface to multiple LLM providers, including models served by Ollama. The provider prefix determines how requests are handled:

| | `ollama` | `ollama_chat` |
|---|---|---|
| Ollama endpoint | completion endpoint | chat endpoint (`/api/chat`) |
| Model string | `ollama/llama2` | `ollama_chat/llama3.1` |
| Suited for | simple, single-turn text completions | multi-turn chat, function/tool calls, structured output |

`ollama_chat` includes additional handling for message serialization, tool/function-call conversion, and parameters such as `format`, `tools`, and `stream`, which makes it the recommended choice for chat-style, OpenAI-Chat-compatible workflows in LiteLLM; `ollama` remains sufficient for simple single-turn completions.

**Implement `ollama_chat` for enhanced chat support.** Based on the LiteLLM docs, the `ollama_chat` endpoint provides multi-turn chat, function/tool calling, and better structured (JSON) outputs compared to the basic `ollama` completion endpoint. To leverage these chat features, please update your `etc/litellm-config.yaml`:

- Replace any `ollama/<model>` references with `ollama_chat/<model>` (e.g. `ollama_chat/llama3.1`).
- If you're specifying a provider key, it should look like `ollama_chat/<model_name>`.
- Update any CLI examples accordingly: `uvx litellm --config etc/litellm-config.yaml --detailed_debug` becomes `uvx litellm --config etc/litellm-config.yaml --provider ollama_chat/llama3.1 --detailed_debug`.

This change ensures you get the recommended chat-style behavior and structured outputs.
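To make the suggestion concrete, one entry of the config could be switched to the chat route as sketched below; this follows the `ollama_chat` provider prefix from the LiteLLM documentation and is not part of the changes in this PR:

```yaml
model_list:
  # Chat-style route: LiteLLM forwards requests to Ollama's /api/chat
  # endpoint, which is recommended for multi-turn chat and tool calls.
  - model_name: "llama3.2"
    litellm_params:
      model: "ollama_chat/llama3.2"
      api_base: "http://localhost:11434"
```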