
[ML] VoyageAI Integration - Clean Version #139812

Closed
fzowl wants to merge 4 commits into elastic:main from voyage-ai:voyageai-integration-v3

Conversation

@fzowl
Contributor

@fzowl fzowl commented Dec 19, 2025

Summary
This PR contains the cleaned-up VoyageAI integration with the following improvements:

Testing:
All tests compile successfully
Text embeddings functionality verified
Multimodal embeddings support added
Rerank functionality included

 - text embedding models
 - contextual models
 - multimodal models
@fzowl fzowl changed the title from "Voyage integration:" to "[ML] VoyageAI Integration - Clean Version" Dec 19, 2025
@elasticsearchmachine
Collaborator

@fzowl please enable the option "Allow edits and access to secrets by maintainers" on your PR. For more information, see the documentation.

@elasticsearchmachine elasticsearchmachine added the needs:triage, v9.4.0 and external-contributor labels Dec 19, 2025
fzowl and others added 2 commits December 19, 2025 13:44
 - text embedding models
 - contextual models
 - multimodal models
@fzowl
Contributor Author

fzowl commented Dec 19, 2025

@DonalEvans Can you please take a look?

 - text embedding models
 - contextual models
 - multimodal models
@PeteGillinElastic PeteGillinElastic added the :SearchOrg/Inference label Dec 23, 2025
@elasticsearchmachine elasticsearchmachine added the Team:Search - Inference label and removed the needs:triage label Dec 23, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/search-inference-team (Team:Search - Inference)

@DonalEvans DonalEvans self-assigned this Jan 5, 2026
@DonalEvans
Contributor

@fzowl Sorry for taking so long to get to this, I'm just coming back from vacation. I've had a quick look at the changes in this PR, and I feel that the approach used to implement multimodal embeddings for Voyage is at odds with the way we plan to implement them for other providers and will need to be reworked.

This PR, which adds support for the embedding task to the Jina AI integration, should be used as a template for the approach we want to be using. There, the existing *EmbeddingsModel class is used for all dense embeddings (both text-only and multimodal). The difference in behaviour between text-only and multimodal embeddings is handled by having different service settings classes for the text_embedding and embedding tasks, and by making some modifications to the *EmbeddingsRequestEntity class to format the request sent to the model appropriately depending on whether it supports multimodal input or not. Taking this approach means that we don't need to duplicate huge amounts of code between almost identical "flavours" of classes, which this PR is currently doing.

I had hoped to get that PR merged or at least ready for review before the Christmas break because I knew that this PR would need to be based on it, but I ran out of time, so I apologise for asking you to rework this PR as a result. Until the Jina AI embedding PR is merged, I don't think it will be possible to really complete this PR, because the Jina PR includes some changes to the embedding task framework to allow it to support task settings and to allow us to determine whether a given set of inputs contains any non-text (i.e. image) values which will be necessary for any other implementations of the embedding task.
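
To illustrate the input check mentioned above, here is a minimal sketch of how one might detect whether a set of inputs contains any non-text values. The `DataFormat`, `InferenceString` and `InferenceStringGroup` types below are simplified, hypothetical stand-ins that only borrow the names used in this discussion. The `IMAGE` value and the `InputInspector` helper are invented for illustration and are not taken from the Elasticsearch codebase.

```java
import java.util.List;

// Hypothetical, simplified stand-ins for the types named in this discussion.
// The real Elasticsearch classes have different shapes and more fields.
enum DataFormat { TEXT, IMAGE } // IMAGE is a placeholder for any non-text format

record InferenceString(DataFormat format, String value) {}

record InferenceStringGroup(List<InferenceString> content) {}

final class InputInspector {
    private InputInspector() {}

    /** Returns true if any item in any group is not plain text. */
    static boolean containsNonText(List<InferenceStringGroup> groups) {
        return groups.stream()
            .flatMap(group -> group.content().stream())
            .anyMatch(item -> item.format() != DataFormat.TEXT);
    }
}
```

A request builder could use a check like this to decide whether the text-only or the multimodal request format is needed for a given call.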

Some other broad concerns around the approach in this PR after a quick look at it are:

  • The embedding task does not yet support URLs for images, but when support for them is added, it will be by adding a new URL value to the DataFormat enum, not by reusing the TEXT value. If you want to add support for URLs for Voyage multimodal embeddings, this change to add the URL value to the DataFormat enum should be made first.
  • Adding support for contextual models without also supporting the unique input format they require (a list of lists of String) is not ideal. The existing input format used by the embedding task supports providing a list of lists (see below), so it is possible for a user to structure the inputs correctly. However, care will need to be taken, since all of the code paths that use InferenceStringGroup currently assume that it will only contain a single InferenceString, which would not be true if the nested list format was used. Some changes might be needed in places that make that assumption.
  • Rather than using the model name to determine what format a request should take (which will break if model naming conventions change in future), a more robust solution is to add a field to the service settings that lets the user specify what type of model it is. For the Jina AI embedding task, this is a boolean multimodal_model field which indicates whether a model supports the multimodal request format or not. For Voyage it could be an enum with values text_embedding, multimodal and contextual, which gets used when creating the request to determine what URL to use and how to format the request body in the VoyageAIEmbeddingsRequestEntity class.
  • Instead of creating new RequestManager implementations for each task type, follow the pattern used in OpenAiActionCreator, CohereActionCreator and some others where a GenericRequestManager is used instead. (This pattern isn't used in JinaAIActionCreator, so don't copy that class).
Using the `embedding` task input format to specify a list of lists of String for a contextual embedding:
"input": [
  {
    "content": [
      {"type": "text", "value": "doc 1 chunk 1"},
      {"type": "text", "value": "doc 1 chunk 2"}
    ]
  },
  {
    "content": [
      {"type": "text", "value": "doc 2 chunk 1"},
      {"type": "text", "value": "doc 2 chunk 2"}
    ]
  }
],
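
As a rough sketch of the model-type suggestion above, the following shows how an enum stored in the service settings could drive both the request path and the shape of the `input` value, instead of inspecting the model name. Everything here is hypothetical: `VoyageModelType`, `VoyageRequestShaper`, and the placeholder paths are invented for illustration and do not reflect the actual Elasticsearch classes or the real Voyage endpoints. Each inner list stands for one content group of text chunks.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: none of these names come from the Elasticsearch codebase.
enum VoyageModelType {
    TEXT_EMBEDDING, MULTIMODAL, CONTEXTUAL;

    /** Parses the user-facing setting value, e.g. "contextual". */
    static VoyageModelType fromString(String value) {
        return valueOf(value.trim().toUpperCase(Locale.ROOT));
    }
}

final class VoyageRequestShaper {
    private VoyageRequestShaper() {}

    /** Placeholder paths only; the real endpoint paths are not asserted here. */
    static String path(VoyageModelType type) {
        return switch (type) {
            case TEXT_EMBEDDING -> "/placeholder/text";
            case MULTIMODAL -> "/placeholder/multimodal";
            case CONTEXTUAL -> "/placeholder/contextual";
        };
    }

    /** Shapes the "input" value from groups of text chunks. */
    static Object shapeInput(VoyageModelType type, List<List<String>> groups) {
        return switch (type) {
            // Contextual models keep the nesting: one inner list per document.
            case CONTEXTUAL -> groups;
            // Text models need a flat list, so each group must hold exactly one chunk.
            case TEXT_EMBEDDING -> groups.stream().map(VoyageRequestShaper::singleChunk).toList();
            // Multimodal formatting is out of scope for this sketch.
            case MULTIMODAL -> throw new UnsupportedOperationException("not covered by this sketch");
        };
    }

    private static String singleChunk(List<String> group) {
        if (group.size() != 1) {
            throw new IllegalArgumentException("expected exactly one chunk per input, got " + group.size());
        }
        return group.get(0);
    }
}
```

With this shape, a text embedding request that receives a multi-chunk group fails loudly rather than silently dropping chunks, which is one way to surface the single-item assumption around InferenceStringGroup noted in the list above.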

@fzowl
Contributor Author

fzowl commented Feb 13, 2026

@DonalEvans I'm closing this one and will open a fresh PR.

@fzowl fzowl closed this Feb 13, 2026

Labels

>enhancement, external-contributor, :SearchOrg/Inference, Team:Search - Inference, v9.4.0


4 participants