diff --git a/documentation/docs/getting-started/providers.md b/documentation/docs/getting-started/providers.md
index 6a3f670d95b9..db61effdbc59 100644
--- a/documentation/docs/getting-started/providers.md
+++ b/documentation/docs/getting-started/providers.md
@@ -27,6 +27,7 @@ Goose relies heavily on tool calling capabilities and currently works best with
 | [GCP Vertex AI](https://cloud.google.com/vertex-ai) | Google Cloud's Vertex AI platform, supporting Gemini and Claude models. **Credentials must be configured in advance. Follow the instructions at https://cloud.google.com/vertex-ai/docs/authentication.** | `GCP_PROJECT_ID`, `GCP_LOCATION` and optional `GCP_MAX_RETRIES` (6), `GCP_INITIAL_RETRY_INTERVAL_MS` (5000), `GCP_BACKOFF_MULTIPLIER` (2.0), `GCP_MAX_RETRY_INTERVAL_MS` (320_000). |
 | [Groq](https://groq.com/) | High-performance inference hardware and tools for LLMs. | `GROQ_API_KEY` |
 | [Ollama](https://ollama.com/) | Local model runner supporting Qwen, Llama, DeepSeek, and other open-source models. **Because this provider runs locally, you must first [download and run a model](/docs/getting-started/providers#local-llms-ollama).** | `OLLAMA_HOST` |
+| [Ramalama](https://ramalama.ai/) | Local model runner using native [OCI](https://opencontainers.org/) container runtimes and [CNCF](https://www.cncf.io/) tools, with support for models as OCI artifacts. Ramalama's API is compatible with Ollama's, so it can be used with the Goose Ollama provider. Supports Qwen, Llama, DeepSeek, and other open-source models. **Because this provider runs locally, you must first [download and run a model](/docs/getting-started/providers#local-llms-ollama).** | `OLLAMA_HOST` |
 | [OpenAI](https://platform.openai.com/api-keys) | Provides gpt-4o, o1, and other advanced language models. Also supports OpenAI-compatible endpoints (e.g., self-hosted LLaMA, vLLM, KServe). **o1-mini and o1-preview are not supported because Goose uses tool calling.** | `OPENAI_API_KEY`, `OPENAI_HOST` (optional), `OPENAI_ORGANIZATION` (optional), `OPENAI_PROJECT` (optional), `OPENAI_CUSTOM_HEADERS` (optional) |
 | [OpenRouter](https://openrouter.ai/) | API gateway for unified access to various models with features like rate-limiting management. | `OPENROUTER_API_KEY` |
@@ -260,9 +261,11 @@ To set up Google Gemini with Goose, follow these steps:

-### Local LLMs (Ollama)
+### Local LLMs (Ollama or Ramalama)

-Ollama provides local LLMs, which requires a bit more set up before you can use it with Goose.
+Ollama and Ramalama both provide local LLMs, and each requires a bit more setup before you can use it with Goose.
+
+#### Ollama

 1. [Download Ollama](https://ollama.com/download).
 2. Run any [model supporting tool-calling](https://ollama.com/search?c=tools):
@@ -357,6 +360,102 @@ For Ollama, if you don't provide a host, we set it to `localhost:11434`. When co
 └ Configuration saved successfully
 ```

+#### Ramalama
+
+1. [Download Ramalama](https://github.com/containers/ramalama?tab=readme-ov-file#install).
+2. Run any Ollama [model supporting tool-calling](https://ollama.com/search?c=tools) or [GGUF-format Hugging Face model](https://huggingface.co/search/full-text?q=%22tools+support%22+%2B+%22gguf%22&type=model):
+
+:::warning Limited support for models without tool calling
+Goose extensively uses tool calling, so models without it (e.g. `DeepSeek-r1`) can only do chat completion.
+If using models without tool calling, all Goose [extensions must be disabled](/docs/getting-started/using-extensions#enablingdisabling-extensions). As an alternative, you can use a [custom DeepSeek-r1 model](/docs/getting-started/providers#deepseek-r1) we've made specifically for Goose.
+:::
+
+Example:
+
+```sh
+# NOTE: the --runtime-args="--jinja" flag is required for Ramalama to work with the Goose Ollama provider.
+ramalama serve --runtime-args="--jinja" ollama://qwen2.5
+```
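+
+Optionally, you can confirm the server is reachable before configuring Goose. This is only a sketch of such a check: it assumes Ramalama's default llama.cpp runtime, which exposes an OpenAI-compatible `/v1/models` route, and Ramalama's default port of 8080; adjust the URL if your setup differs.
+
+```sh
+# Optional sanity check: list the model(s) the local server is exposing.
+# Assumes the default llama.cpp runtime (OpenAI-compatible /v1/models route)
+# and Ramalama's default port of 8080 -- adjust if your setup differs.
+curl http://localhost:8080/v1/models
+```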
+
+3. In a separate terminal window, configure with Goose:
+
+```sh
+goose configure
+```
+
+4. Choose to `Configure Providers`
+
+```
+┌ goose-configure
+│
+◆ What would you like to configure?
+│ ● Configure Providers (Change provider or update credentials)
+│ ○ Toggle Extensions
+│ ○ Add Extension
+└
+```
+
+5. Choose `Ollama` as the model provider; because Ramalama is API-compatible with Ollama, it works through the Goose Ollama provider.
+
+```
+┌ goose-configure
+│
+◇ What would you like to configure?
+│ Configure Providers
+│
+◆ Which model provider should we use?
+│ ○ Anthropic
+│ ○ Databricks
+│ ○ Google Gemini
+│ ○ Groq
+│ ● Ollama (Local open source models)
+│ ○ OpenAI
+│ ○ OpenRouter
+└
+```
+
+6. Enter the host where your model is running
+
+:::info Endpoint
+For the Ollama provider, if you don't provide a host, we set it to `localhost:11434`. When constructing the URL, we prepend `http://` if the scheme is not `http` or `https`. Since Ramalama serves on port 8080 by default, set `OLLAMA_HOST=http://0.0.0.0:8080`.
+:::
+
+```
+┌ goose-configure
+│
+◇ What would you like to configure?
+│ Configure Providers
+│
+◇ Which model provider should we use?
+│ Ollama
+│
+◆ Provider Ollama requires OLLAMA_HOST, please enter a value
+│ http://0.0.0.0:8080
+└
+```
+
+7. Enter the model you have running
+
+```
+┌ goose-configure
+│
+◇ What would you like to configure?
+│ Configure Providers
+│
+◇ Which model provider should we use?
+│ Ollama
+│
+◇ Provider Ollama requires OLLAMA_HOST, please enter a value
+│ http://0.0.0.0:8080
+│
+◇ Enter a model from that provider:
+│ qwen2.5
+│
+◇ Welcome! You're all set to explore and utilize my capabilities. Let's get started on solving your problems together!
+│
+└ Configuration saved successfully
+```
+
 ### DeepSeek-R1

 Ollama provides open source LLMs, such as `DeepSeek-r1`, that you can install and run locally.
@@ -464,4 +563,4 @@ If you have any questions or need help with a specific provider, feel free to re

 [providers]: /docs/getting-started/providers

-[function-calling-leaderboard]: https://gorilla.cs.berkeley.edu/leaderboard.html
\ No newline at end of file
+[function-calling-leaderboard]: https://gorilla.cs.berkeley.edu/leaderboard.html