Native Ollama LLM Integration + Example Project + Full Unit Tests #3570
base: main
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Summary of Changes

Hello @ayman3000, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances ADK's LLM provider ecosystem by introducing a native Ollama integration. This new backend directly addresses and fixes critical tool-calling failures that previously affected cloud-based Ollama models when used via LiteLLM. By providing robust and reliable tool support, this change makes ADK fully compatible with enterprise cloud inference, hybrid local and cloud deployments, and advanced agent workflows that depend on accurate function calling, thereby broadening ADK's accessibility and utility.
Response from ADK Triaging Agent

Hello @ayman3000, thank you for this significant contribution! Before we can proceed with the review, could you please sign the Contributor License Agreement (CLA)? The CLA check on this pull request is currently failing. This will help us move forward with your PR. Thanks!
Code Review
This is a fantastic contribution that adds native Ollama support to ADK, filling a significant gap. The implementation is well-structured and the inclusion of a full example project and unit tests is excellent.
I've found a few issues, mainly related to the completeness of the response parsing and corresponding tests. Specifically:
- The `LlmResponse` is missing usage metadata and the model version from the Ollama response.
- The unit tests for these features are present but will fail due to the missing implementation and use an incorrect mock response structure.
- There's a potential bug in handling malformed JSON in tool call arguments.
My review includes suggestions to address these points. Once these are fixed, this will be a very solid addition to the project. Great work!
```python
except json.JSONDecodeError:
  logger.debug(
      'Failed to parse tool call arguments as JSON: %s', arguments
  )
```
If `json.loads(arguments)` fails, `arguments` remains a string. This string is then passed as the `args` parameter to `types.FunctionCall`, which expects a dictionary-like object. This will likely cause a validation error downstream. It would be safer to treat unparseable arguments as an empty dictionary and log a warning.
Suggested change:

```diff
 except json.JSONDecodeError:
-  logger.debug(
-      'Failed to parse tool call arguments as JSON: %s', arguments
-  )
+  logger.warning(
+      'Failed to parse tool call arguments as JSON: %s. Defaulting to empty arguments.',
+      arguments,
+  )
+  arguments = {}
```
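For context on why a plain string breaks here: the parsed arguments end up as the `args` field of a `types.FunctionCall`, which takes a mapping. A minimal illustration (the function name and arguments below are made up, not from the PR):

```python
from google.genai import types

# `args` must be a dict-like mapping; passing the raw unparsed string is what
# the suggestion above guards against. Name and arguments are hypothetical.
part = types.Part(
    function_call=types.FunctionCall(
        name="get_weather",
        args={"city": "Cairo"},
    )
)
```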
```python
def test_to_llm_response_usage_metadata():
  o = Ollama()
  resp = mock_response_ok(
      text="Hi",
      usage={"prompt_tokens": 10, "completion_tokens": 5, "total_tokens": 15}
  )

  out = o._to_llm_response(resp)

  assert out.usage_metadata.prompt_token_count == 10
  assert out.usage_metadata.candidates_token_count == 5
  assert out.usage_metadata.total_token_count == 15
```
This test for usage metadata will fail because the implementation in `_to_llm_response` does not parse it. Additionally, the mock response structure in this test does not match the actual Ollama API response. The Ollama API returns `prompt_eval_count` and `eval_count` at the top level of the JSON response, not a nested `usage` dictionary. The test should be updated to reflect the correct API response format and to work with the corrected implementation.
Suggested change:

```diff
 def test_to_llm_response_usage_metadata():
   o = Ollama()
-  resp = mock_response_ok(
-      text="Hi",
-      usage={"prompt_tokens": 10, "completion_tokens": 5, "total_tokens": 15}
-  )
+  resp = mock_response_ok(text="Hi")
+  resp["prompt_eval_count"] = 10
+  resp["eval_count"] = 5
   out = o._to_llm_response(resp)
+  assert out.usage_metadata is not None
   assert out.usage_metadata.prompt_token_count == 10
   assert out.usage_metadata.candidates_token_count == 5
   assert out.usage_metadata.total_token_count == 15
```
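For reference, a non-streaming Ollama `/api/chat` reply reports these token counts at the top level, roughly in the following shape (a sketch; the values and model name are illustrative):

```python
# Approximate shape of a non-streaming Ollama /api/chat reply; note that
# prompt_eval_count / eval_count sit at the top level, not under a "usage" key.
example_ollama_reply = {
    "model": "llama3.1",
    "message": {"role": "assistant", "content": "Hi"},
    "done": True,
    "prompt_eval_count": 10,
    "eval_count": 5,
}
```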
```python
async def test_model_override(monkeypatch):
  resp = mock_response_ok("Hello")

  async def fake_thread(fn, *args):
    return resp

  monkeypatch.setattr("asyncio.to_thread", fake_thread)

  o = Ollama(model="default")
  req = LlmRequest(
      model="override",
      contents=[types.Content(role="user", parts=[types.Part.from_text("X")])]
  )

  out = [r async for r in o.generate_content_async(req)][0]
  assert out.model_version == "override"
```
This test will fail because `_to_llm_response` does not set `model_version` from the Ollama response. The test should also be improved to assert that the correct model name is sent in the request payload, and the mock response should include the `model` field, as the real Ollama API would.
Suggested change:

```diff
 async def test_model_override(monkeypatch):
   resp = mock_response_ok("Hello")
+  resp['model'] = 'override'
   async def fake_thread(fn, *args):
+    payload = args[0]
+    assert payload['model'] == 'override'
     return resp
   monkeypatch.setattr("asyncio.to_thread", fake_thread)
   o = Ollama(model="default")
   req = LlmRequest(
       model="override",
       contents=[types.Content(role="user", parts=[types.Part.from_text("X")])]
   )
   out = [r async for r in o.generate_content_async(req)][0]
   assert out.model_version == "override"
```
```python
  return LlmResponse(
      content=types.Content(role='model', parts=parts),
  )
```
The implementation is missing parsing of usage metadata and the model version from the Ollama response. The Ollama API provides `prompt_eval_count`, `eval_count`, and `model`, which should be mapped to `usage_metadata` and `model_version` in the `LlmResponse` to provide complete information and fix the failing tests.
Suggested change:

```diff
-  return LlmResponse(
-      content=types.Content(role='model', parts=parts),
-  )
+  usage_metadata = None
+  prompt_tokens = response_json.get('prompt_eval_count')
+  completion_tokens = response_json.get('eval_count')
+  if prompt_tokens is not None and completion_tokens is not None:
+    usage_metadata = types.GenerateContentResponseUsageMetadata(
+        prompt_token_count=prompt_tokens,
+        candidates_token_count=completion_tokens,
+        total_token_count=prompt_tokens + completion_tokens,
+    )
+  return LlmResponse(
+      content=types.Content(role='model', parts=parts),
+      usage_metadata=usage_metadata,
+      model_version=response_json.get('model'),
+  )
```
…parsing, updated tests
Thanks for the review! Please let me know if you would like any additional adjustments.
Hi @ayman3000, thank you for your work on this pull request. We appreciate the effort you've invested.
🚀 PR: Native Ollama LLM Integration + Example Project + Full Unit Tests
Includes: Critical Fix for Ollama Cloud Tool-Calling + Comparison Test with LiteLLM
🔗 Link to Issue or Description of Change
No existing issue.
Submitting this as a major feature contribution that fills a critical gap in ADK’s LLM provider ecosystem.
❗ Problem
Google ADK currently relies on LiteLLM for Ollama usage (`ollama_chat/...`). However, LiteLLM still fails with Ollama when using cloud models such as:

- `glm-4.6:cloud`
- `gpt-oss:20b-cloud`

The failures include:
❌ LiteLLM + Ollama Cloud → Broken Tool Calling
In short:
This makes ADK incomplete for:
🔥 Why This Feature Is Critical
✔ This PR provides the first working, reliable tool-calling path for Ollama Cloud models inside ADK.
Native Ollama backend → Stable.
LiteLLM backend → Broken for cloud models.
The ADK ecosystem must support Ollama Cloud + tools — and this PR provides exactly that.
✅ Solution — Full Native Ollama Support
This PR adds:
✔ 1. New Native Backend — `ollama_llm.py`

Implements `BaseLlm` with robust support for:

🧩 Core Features

- `POST /api/chat`
- `google_adk.xxx`
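The PR's actual `ollama_llm.py` is not reproduced in this thread, so the following is only a rough sketch of the core flow described above: a blocking call to `POST /api/chat` dispatched via `asyncio.to_thread` (as the monkeypatched unit tests suggest), with the reply mapped into the fields discussed in the review. The HTTP client (`requests`) and the helper name `chat_once` are assumptions, not the PR's API.

```python
# Illustrative sketch only -- not the PR's ollama_llm.py. Assumes a blocking
# `requests` call run via asyncio.to_thread and the google.genai `types`
# objects used elsewhere in this thread.
import asyncio

import requests
from google.genai import types


async def chat_once(base_url: str, model: str, messages: list[dict]):
  """Calls Ollama's /api/chat once (non-streaming) and maps the reply."""
  payload = {"model": model, "messages": messages, "stream": False}

  def _post():
    r = requests.post(f"{base_url}/api/chat", json=payload, timeout=120)
    r.raise_for_status()
    return r.json()

  response_json = await asyncio.to_thread(_post)

  # Text content comes back under message.content; token counts are top level.
  content = types.Content(
      role="model",
      parts=[types.Part(text=response_json["message"]["content"])],
  )
  usage = None
  prompt_tokens = response_json.get("prompt_eval_count")
  completion_tokens = response_json.get("eval_count")
  if prompt_tokens is not None and completion_tokens is not None:
    usage = types.GenerateContentResponseUsageMetadata(
        prompt_token_count=prompt_tokens,
        candidates_token_count=completion_tokens,
        total_token_count=prompt_tokens + completion_tokens,
    )
  # A real backend would wrap these in ADK's LlmResponse, as in the review
  # suggestion above.
  return content, usage, response_json.get("model")
```

Tool declarations and tool-call replies would be mapped along the same lines, which is where the JSON-argument handling discussed in the review comes in.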
🎛 Model Routing

Supports all of these: