[ML] VoyageAI Integration - Clean Version by fzowl · Pull Request #137519 · elastic/elasticsearch

fzowl · 2025-11-03T13:28:02Z

Summary

This PR contains the cleaned-up VoyageAI integration with the following improvements:

Testing:

- All tests compile successfully
- Text embeddings functionality verified
- Multimodal embeddings support added
- Rerank functionality included

- embeddings works, tested
- initial rerank code

What's missing:
- unit and integration tests
- rerank request/response mapping and verification

- embeddings works, tested
- rerank works, tested (https://www.elastic.co/search-labs/blog/elasticsearch-cohere-rerank)

What's missing:
- unit and integration tests

Moving dimensions to ServiceSettings

- Add text, multimodal, and contextual embeddings support
- Add rerank functionality
- All code in services/voyageai directory
- Includes comprehensive test coverage
- Removed 3 files from external/voyageai directory
- Removed 3 files from external/http/sender directory
- All VoyageAI code now in services/voyageai directory only

- Add text, multimodal, and contextual embeddings support
- Add rerank functionality
- All code in services/voyageai directory
- Includes comprehensive test coverage
- Removed test-voyageai-e2e.sh script
- All VoyageAI code now properly organized in services/voyageai only

- Add text, multimodal, and contextual embeddings support
- Add rerank functionality
- All code in services/voyageai directory
- Includes comprehensive test coverage
- Removed test-voyageai-e2e.sh script
- Deleted ALL voyageai files from external directory (22 files)
- All VoyageAI code now properly organized in services/voyageai only

* Fix irregular spaces
* Update analysis-keyword-repeat-tokenfilter.md
* Update search-suggesters.md
* Update search-profile.md

elasticsearchmachine · 2025-11-05T13:49:05Z

Pinging @elastic/ml-core (Team:ML)

fzowl · 2025-11-24T17:28:21Z

@DonalEvans @jonathan-buttner Can you please take a look?

DonalEvans · 2025-11-24T17:34:04Z

> Can you please take a look?

I will try to get to this PR this week, but I have a lot on my plate right now with other large PRs to review. Thank you for your patience.

DonalEvans · 2025-12-09T21:20:22Z

@fzowl Hi, sorry for taking so long to get to this, I've been out sick recently.

Before I review these changes thoroughly, I have some concerns related to the multimodal model support being added in this PR. We recently added a new task type to the inference API, embedding, which is intended to support both multimodal and non-multimodal models. Rather than adding support for Voyage's multimodal embedding models to the text_embedding task, I think it would be better to keep text_embedding only supporting non-multimodal models and add an integration for the embedding task type to the VoyageAI service in order to support multimodal models.

We don't currently have any integrations that support the embedding task type since it's brand new, but I plan to create one for the JinaAI service very soon which can be used as a template for other integrations. Once that's done, it should be relatively straightforward to implement the equivalent support for VoyageAI.

One other broad concern I have about the approach in this PR is that using the model name to determine the request format may be brittle and difficult to maintain. Can we guarantee that for a given model name prefix the behaviour will always be consistent? Might it be better to allow the user to explicitly set something in the service settings to indicate the type of model being used rather than trying to infer it ourselves?
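To make the suggestion above concrete, here is a minimal sketch of reading an explicit model type from the service settings instead of inferring it from the model name. Everything in it is an assumption for illustration: the `model_type` key, the `VoyageModelType` enum, and the `ModelTypeResolver` class are hypothetical names and do not come from this PR or from the Elasticsearch codebase.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical enum: the kinds of VoyageAI embedding models discussed in this PR.
enum VoyageModelType {
    TEXT,
    MULTIMODAL,
    CONTEXTUAL
}

// Hypothetical helper: resolves the model type from a parsed service settings map.
final class ModelTypeResolver {
    // Assumed setting name; a real integration might use a different key.
    static final String MODEL_TYPE = "model_type";

    private ModelTypeResolver() {}

    static VoyageModelType resolve(Map<String, ?> serviceSettings) {
        Object raw = serviceSettings.get(MODEL_TYPE);
        if (raw == null) {
            // Assumption: plain text embeddings are the default when nothing is set.
            return VoyageModelType.TEXT;
        }
        if (!(raw instanceof String)) {
            throw new IllegalArgumentException("[" + MODEL_TYPE + "] must be a string");
        }
        String normalized = ((String) raw).trim().toUpperCase(Locale.ROOT);
        try {
            return VoyageModelType.valueOf(normalized);
        } catch (IllegalArgumentException e) {
            // Reject unknown values up front rather than guessing a request format.
            throw new IllegalArgumentException("unknown [" + MODEL_TYPE + "] value [" + raw + "]", e);
        }
    }
}
```

With this shape, a newly released model would only need the user to set the right type; no name matching has to be updated on the Elasticsearch side.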

fzowl · 2025-12-13T13:25:50Z

@DonalEvans Thanks for your reply! I will probably close this PR and open a new one (even cleaner, with fewer commits) that follows the suggested approach.

Regarding your concern about the approach to determine the request format: yes, the model name determines the request format, so at a high level, this should be fine. We can work on a better condition, e.g. the model name contains a substring (e.g. 'multimodal'), or we list the text embedding models and list the multimodal models, e.g.
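For comparison, the name-based alternatives mentioned in this reply (explicit model lists, with a substring check as a fallback) could look roughly like the sketch below. This is not the code from the PR: `RequestFormat` and `ModelNameClassifier` are hypothetical names, and the model identifiers in the sets are illustrative entries only, not a complete or verified list.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical enum naming the request shapes the integration would have to build.
enum RequestFormat {
    TEXT_EMBEDDING,
    MULTIMODAL_EMBEDDING,
    CONTEXTUAL_EMBEDDING
}

// Hypothetical classifier: maps a model id to a request format by name.
final class ModelNameClassifier {
    // Illustrative entries only; a real list would have to track VoyageAI's catalogue.
    private static final Set<String> TEXT_MODELS = Set.of("voyage-3.5", "voyage-3.5-lite");
    private static final Set<String> MULTIMODAL_MODELS = Set.of("voyage-multimodal-3");
    private static final Set<String> CONTEXTUAL_MODELS = Set.of("voyage-context-3");

    private ModelNameClassifier() {}

    static RequestFormat classify(String modelId) {
        String id = modelId.trim().toLowerCase(Locale.ROOT);
        // 1. Exact matches against the explicit lists.
        if (MULTIMODAL_MODELS.contains(id)) {
            return RequestFormat.MULTIMODAL_EMBEDDING;
        }
        if (CONTEXTUAL_MODELS.contains(id)) {
            return RequestFormat.CONTEXTUAL_EMBEDDING;
        }
        if (TEXT_MODELS.contains(id)) {
            return RequestFormat.TEXT_EMBEDDING;
        }
        // 2. Substring fallback for unlisted names; this only works while
        //    the vendor keeps its naming scheme consistent.
        if (id.contains("multimodal")) {
            return RequestFormat.MULTIMODAL_EMBEDDING;
        }
        if (id.contains("context")) {
            return RequestFormat.CONTEXTUAL_EMBEDDING;
        }
        // 3. Assumption: anything else is treated as a text embedding model.
        return RequestFormat.TEXT_EMBEDDING;
    }
}
```

The fallback in step 2 is where the brittleness raised in the review shows up: a future model whose name breaks the pattern would silently be sent the wrong request format.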

fzowl · 2025-12-19T12:56:14Z

@DonalEvans I'm closing this PR in favor of #139812. Can you please take a look?

fzowl added 17 commits October 18, 2025 12:02

VoyageAI embeddings and rerank:

8ce9dac

- embeddings works, tested
- initial rerank code

What's missing:
- unit and integration tests
- rerank request/response mapping and verification

VoyageAI embeddings and rerank:

17a7b28

- embeddings works, tested
- rerank works, tested (https://www.elastic.co/search-labs/blog/elasticsearch-cohere-rerank)

What's missing:
- unit and integration tests

VoyageAI embeddings and rerank:

61409ea

- embeddings works, tested
- rerank works, tested (https://www.elastic.co/search-labs/blog/elasticsearch-cohere-rerank)

What's missing:
- unit and integration tests

Adding initial tests

3b9cc3e

Moving dimensions to ServiceSettings

Correcting due to comments

e6937bc

Adding BIT support

480ccc9

Initial tests

438045e

More tests

6dea3f9

More tests/corrections

90ec511

Removing warnings

eb8824b

Further tests

08d73c9

Adding changelog and correcting TransportVersions

64e7fcf

Spotless tests

f3f43ed

Changes due to the comments

e067465

Changes due to the comments

aac1d48

Adding VoyageAI's v3.5 models

3206862

Adding VoyageAI's v3.5 models

cd926f1

elasticsearchmachine added the v9.3.0, external-contributor (pull request authored by a developer outside the Elasticsearch team), and needs:triage (requires assignment of a team area label) labels Nov 3, 2025

fzowl force-pushed the voyageai-integration-v2 branch 3 times, most recently from c480404 to 3c2400a, November 3, 2025 13:59

fzowl force-pushed the voyageai-integration-v2 branch from 3c2400a to c2e11ea, November 3, 2025 14:00

fzowl force-pushed the voyageai-integration-v2 branch from 370aba7 to 72f6dd3, November 3, 2025 14:13

fzowl and others added 3 commits November 3, 2025 15:22

Merge branch 'main' into voyageai-integration-v2

100f19f

Fix irregular spaces (elastic#137014)

15896ab

* Fix irregular spaces
* Update analysis-keyword-repeat-tokenfilter.md
* Update search-suggesters.md
* Update search-profile.md

elasticsearchmachine added the Team:ML (meta label for the ML team) label and removed the needs:triage (requires assignment of a team area label) label Nov 5, 2025

davidkyle added the >enhancement label Nov 5, 2025

davidkyle self-assigned this Nov 5, 2025

davidkyle requested review from davidkyle and jonathan-buttner November 6, 2025 15:25

davidkyle assigned jonathan-buttner and unassigned davidkyle Nov 13, 2025

davidkyle requested review from DonalEvans and removed request for davidkyle November 13, 2025 13:36

fzowl added 7 commits November 14, 2025 12:24

Merge branch 'main' into voyageai-integration-v2

9ef9b35

Merge branch 'main' into voyageai-integration-v2

d55b7cc

Merge branch 'elastic:main' into voyageai-integration-v2

d170168

Merge branch 'main' into voyageai-integration-v2

2a72e89

Merge branch 'main' into voyageai-integration-v2

4e4d069

Merge branch 'main' into voyageai-integration-v2

7c0dbb0

Merge branch 'main' into voyageai-integration-v2

18a145d

Merge branch 'main' into voyageai-integration-v2

e417195

fzowl added 3 commits December 15, 2025 17:40

Merge branch 'main' into voyageai-integration-v2

bbaf163

Merge branch 'elastic:main' into voyageai-integration-v2

783405c

Merge branch 'elastic:main' into voyageai-integration-v2

4a02911

elasticsearchmachine added v9.4.0 and removed v9.3.0 labels Dec 17, 2025

fzowl closed this Dec 19, 2025

fzowl wants to merge 41 commits into elastic:main from voyage-ai:voyageai-integration-v2

7 participants
