Conversation

@romartin
Contributor

@romartin romartin commented Aug 12, 2025

Jira Issue: https://issues.redhat.com/browse/AAP-51301

Description

Sets up the Gemini inference provider in the stack. It will be used by the Ansible AI Assisted Installer; ALIA will keep using Granite.
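For context, a remote Gemini inference provider in Llama Stack is typically declared in the run configuration roughly as below. This is a hedged sketch, not the exact diff from this PR; the provider type name and config keys are assumptions based on Llama Stack's remote-provider conventions, and the environment variable name may differ:

```yaml
providers:
  inference:
    - provider_id: gemini
      provider_type: remote::gemini
      config:
        # API key is read from the environment so it never lands in the repo;
        # without it, the Gemini models are listed but not usable.
        api_key: ${env.GEMINI_API_KEY}
```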

Note that from this point on, Gemini models are also listed by the v1/models endpoint (but they only work if an API key is properly set):

        {
            "identifier": "gemini/gemini-1.5-flash",
            "metadata": {},
            "api_model_type": "llm",
            "provider_id": "gemini",
            "type": "model",
            "provider_resource_id": "gemini-1.5-flash",
            "model_type": "llm"
        },
        {
            "identifier": "gemini/gemini-1.5-pro",
            "metadata": {},
            "api_model_type": "llm",
            "provider_id": "gemini",
            "type": "model",
            "provider_resource_id": "gemini-1.5-pro",
            "model_type": "llm"
        },
        {
            "identifier": "gemini/gemini-2.0-flash",
            "metadata": {},
            "api_model_type": "llm",
            "provider_id": "gemini",
            "type": "model",
            "provider_resource_id": "gemini-2.0-flash",
            "model_type": "llm"
        },
        {
            "identifier": "gemini/gemini-2.5-flash",
            "metadata": {},
            "api_model_type": "llm",
            "provider_id": "gemini",
            "type": "model",
            "provider_resource_id": "gemini-2.5-flash",
            "model_type": "llm"
        },
        {
            "identifier": "gemini/gemini-2.5-pro",
            "metadata": {},
            "api_model_type": "llm",
            "provider_id": "gemini",
            "type": "model",
            "provider_resource_id": "gemini-2.5-pro",
            "model_type": "llm"
        },
        {
            "identifier": "gemini/text-embedding-004",
            "metadata": {
                "embedding_dimension": 768.0,
                "context_length": 2048.0
            },
            "api_model_type": "llm",
            "provider_id": "gemini",
            "type": "model",
            "provider_resource_id": "text-embedding-004",
            "model_type": "llm"
        },

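As a quick illustration, a client consuming the v1/models response could filter it down to the Gemini entries before picking an inference model. This is a minimal sketch using a truncated copy of the entries above; it is not code from this PR:

```python
import json

# Truncated sample of the v1/models response shown above (two of the entries).
response_body = """
[
  {"identifier": "gemini/gemini-2.5-pro", "provider_id": "gemini", "model_type": "llm"},
  {"identifier": "gemini/text-embedding-004", "provider_id": "gemini", "model_type": "llm"}
]
"""

models = json.loads(response_body)

# Keep only models served by the Gemini provider.
gemini_models = [m["identifier"] for m in models if m["provider_id"] == "gemini"]
print(gemini_models)  # → ['gemini/gemini-2.5-pro', 'gemini/text-embedding-004']
```

In a real client the body would come from a GET against the running stack's v1/models endpoint rather than an inline string.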
Testing

Tested locally.

Production deployment

  • This code change is ready for production on its own
  • This code change requires the following considerations before going to production:

@romartin romartin force-pushed the AAP-51301-gemini-inference-provider branch 2 times, most recently from 230d2f2 to 2276579 Compare August 12, 2025 20:33
@TamiTakamiya
Collaborator

I have tested with the container image built in the Konflux PR build. The Gemini (gemini-2.5-light) LLM was used properly, but for some reason the knowledge_search failed. It may be caused by a configuration issue in my test environment...

@TamiTakamiya
Collaborator

> I have tested with the container image built in the Konflux PR build. The Gemini (gemini-2.5-light) LLM was used properly, but for some reason the knowledge_search failed. It may be caused by a configuration issue in my test environment...

It was a configuration issue in my test environment setup. I could call the /streaming_query endpoint and get the referenced-documents list with both Granite 3.3 and gemini-2.5-flash using the container image quay.io/ansible/ansible-chatbot-stack:on-pr-86-22765790b0b63c98cf25042309d3a08033654fa5

Collaborator

@TamiTakamiya TamiTakamiya left a comment


Although I left a comment on the uv export format, it's very minor. I have verified that the built container image works with the Gemini (and Granite) LLMs.

After the conflict is resolved, I will re-approve this. Thanks!

@romartin romartin force-pushed the AAP-51301-gemini-inference-provider branch from 2276579 to f968069 Compare August 13, 2025 12:51
@romartin
Copy link
Contributor Author

@ldjebran @TamiTakamiya all changes from your comments have been applied, and the branch has been rebased on main. Ready for final review, please!

Contributor

@ldjebran ldjebran left a comment


LGTM

Collaborator

@TamiTakamiya TamiTakamiya left a comment


LGTM

@romartin romartin merged commit 1c116c7 into ansible:main Aug 13, 2025
2 checks passed

4 participants