Contributor

@Evgenii-Kazannik commented Jan 9, 2026

RERANK

request
PUT {{base-url}}/_inference/rerank/mixedbread
{
    "service": "mixedbread",
    "service_settings": {
        "api_key": "{{mb-api-key}}",
        "model_id": "mixedbread-ai/mxbai-rerank-xsmall-v1"
    },
    "task_settings": {
        "return_documents": true,
        "top_k": 1
    }
}

response
{
    "inference_id": "mixedbread",
    "task_type": "rerank",
    "service": "mixedbread",
    "service_settings": {
        "model_id": "mixedbread-ai/mxbai-rerank-xsmall-v1",
        "rate_limit": {
            "requests_per_minute": 240
        }
    },
    "task_settings": {
        "return_documents": true
    }
}

request
POST {{base-url}}/_inference/rerank/mixedbread
{
    "input": ["Luke", "like", "leia", "chewy", "r2d2", "star", "wars"],
    "query": "star wars main character",
    "top_n": 2,
    "return_documents": true
}

response
{
    "rerank": [
        {
            "index": 0,
            "relevance_score": 0.083740234,
            "text": "Luke"
        },
        {
            "index": 2,
            "relevance_score": 0.06994629,
            "text": "leia"
        }
    ]
}

direct request
POST https://api.mixedbread.com/v1/reranking
{
    "model": "mixedbread-ai/mxbai-rerank-xsmall-v1",
    "query": "Who is the author of To Kill a Mockingbird?",
    "input": [
        "To Kill a Mockingbird is a novel by Harper Lee",
        "The novel Moby-Dick was written by Herman Melville",
        "Harper Lee, an American novelist",
        "Jane Austen was an English novelist",
        "The Harry Potter series written by British author J.K. Rowling",
        "The Great Gatsby, a novel written by American author F. Scott Fitzgerald"
    ],
    "top_k": 3,
    "return_input": true
}

response
{
    "usage": {
        "prompt_tokens": 162,
        "total_tokens": 162,
        "completion_tokens": 0
    },
    "model": "mixedbread-ai/mxbai-rerank-xsmall-v1",
    "data": [
        {
            "index": 0,
            "score": 0.98291015625,
            "input": "To Kill a Mockingbird is a novel by Harper Lee",
            "object": "rank_result"
        },
        {
            "index": 2,
            "score": 0.61962890625,
            "input": "Harper Lee, an American novelist",
            "object": "rank_result"
        },
        {
            "index": 3,
            "score": 0.36328125,
            "input": "Jane Austen was an English novelist",
            "object": "rank_result"
        }
    ],
    "object": "list",
    "top_k": 3,
    "return_input": true
}

@elasticsearchmachine added the v9.4.0 and external-contributor labels Jan 9, 2026
@Evgenii-Kazannik force-pushed the Add-Mixedbread-AI-Rerank-support branch from 2683926 to 6133d64 January 13, 2026 14:56
@Evgenii-Kazannik added and removed the external-contributor label Jan 14, 2026
@Evgenii-Kazannik marked this pull request as ready for review January 14, 2026 17:24
@elasticsearchmachine added the needs:triage label Jan 14, 2026
@jonathan-buttner self-assigned this Jan 20, 2026
@jonathan-buttner added the :SearchOrg/Inference, Team:Search - Inference, and >enhancement labels and removed the needs:triage label Jan 20, 2026
@elasticsearchmachine
Collaborator

Pinging @elastic/search-inference-team (Team:Search - Inference)

Contributor

Copilot AI left a comment

Pull request overview

This pull request adds support for Mixedbread AI's rerank API to Elasticsearch's inference plugin. The implementation follows the established pattern for inference service providers and includes comprehensive test coverage.

Changes:

  • Implements Mixedbread rerank service with model, request/response handling, and action creators
  • Adds service settings and task settings with configurable parameters (top_n, return_documents)
  • Registers the new service in InferencePlugin and InferenceNamedWriteablesProvider

Reviewed changes

Copilot reviewed 27 out of 27 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| MixedbreadService.java | Main service implementation for Mixedbread rerank with configuration and inference methods |
| MixedbreadRerankModel.java | Model class defining rerank-specific configuration and URI building |
| MixedbreadRerankRequest.java | Request builder for Mixedbread rerank API calls |
| MixedbreadRerankResponseEntity.java | Response parser for Mixedbread rerank API responses |
| MixedbreadRerankTaskSettings.java | Task-level settings (top_n, return_documents) |
| MixedbreadRerankServiceSettings.java | Service-level settings (model_id, rate limits) |
| MixedbreadActionCreator.java | Creates executable actions for rerank operations |
| MixedbreadConstants.java | Shared constants for field names and API paths |
| MixedbreadAccount.java | Account credentials and URI management |
| InferencePlugin.java | Registers the Mixedbread service factory |
| InferenceNamedWriteablesProvider.java | Registers named writeables for serialization |
| Test files (8 files) | Comprehensive test coverage for all components |


assertThat(thrownException.getMessage(), containsString("field [top_n] is not of the expected type"));
}

public void UpdatedTaskSettings_WithEmptyMap_ReturnsSameSettings() {
Copilot AI Jan 21, 2026

The test name has inconsistent capitalization. It should start with 'test' (lowercase) to match Java naming conventions and be consistent with other test methods in the same file.

Suggested change
- public void UpdatedTaskSettings_WithEmptyMap_ReturnsSameSettings() {
+ public void testUpdatedTaskSettings_WithEmptyMap_ReturnsSameSettings() {

Contributor Author

Done


import java.util.Map;

import static org.elasticsearch.xpack.inference.services.jinaai.rerank.JinaAIRerankTaskSettingsTests.getTaskSettingsMap;
Copilot AI Jan 21, 2026

The test is incorrectly using a JinaAI import for the helper method. This should use the Mixedbread equivalent method 'MixedbreadRerankTaskSettingsTests.getTaskSettingsMap' instead of importing from JinaAI rerank task settings tests.

Suggested change
- import static org.elasticsearch.xpack.inference.services.jinaai.rerank.JinaAIRerankTaskSettingsTests.getTaskSettingsMap;
+ import static org.elasticsearch.xpack.inference.services.mixedbread.rerank.MixedbreadRerankTaskSettingsTests.getTaskSettingsMap;

Contributor Author

Done

XContentParser.Token token = parser.currentToken();
ensureExpectedToken(XContentParser.Token.START_OBJECT, token, parser);

positionParserAtTokenAfterField(parser, "data", "FAILED_TO_FIND_FIELD_TEMPLATE"); // TODO error message
Copilot AI Jan 21, 2026

There's a TODO comment with an incomplete error message placeholder. The error message template should be properly defined, as this is used when parsing fails if the 'data' field is not found in the response.

Suggested change
- positionParserAtTokenAfterField(parser, "data", "FAILED_TO_FIND_FIELD_TEMPLATE"); // TODO error message
+ positionParserAtTokenAfterField(parser, "data", "Failed to find [data] field in Mixedbread rerank response");

Contributor Author

Done

Comment on lines 144 to 154
public Boolean getDoesReturnDocuments() {
return returnDocuments;
}

public Integer getTopNDocumentsOnly() {
return topNDocumentsOnly;
}

public Boolean getReturnDocuments() {
return returnDocuments;
}
Copilot AI Jan 21, 2026

The class has two redundant getter methods for the same field. Both 'getDoesReturnDocuments()' and 'getReturnDocuments()' return the same 'returnDocuments' field. One of these methods should be removed to avoid confusion and maintain clean API design.

Contributor

I agree, let's remove one of these.

Contributor Author

Done


@Override
public int rerankerWindowSize(String modelId) {
// Cohere rerank model truncates at 4096 tokens https://docs.cohere.com/reference/rerank
Copilot AI Jan 21, 2026

The comment incorrectly mentions "Cohere rerank model" when this is a Mixedbread service implementation. The comment should be updated to reference Mixedbread's actual model token limits or window size documentation.

Suggested change
- // Cohere rerank model truncates at 4096 tokens https://docs.cohere.com/reference/rerank
+ // Mixedbread rerank models currently support context windows of up to 4096 tokens (see Mixedbread documentation)

Contributor

This is pointing to cohere, I think we want the numbers posted here: https://www.mixedbread.com/docs/models/reranking/mxbai-rerank-large-v2

Looks like the older models have a window size of 512. We should make this configurable though. Let's add an optional field to the service settings that can control this value and default it to 8k.

Contributor Author

@Evgenii-Kazannik Jan 27, 2026

The newly added settings field would need to be used in

@Override
public int rerankerWindowSize(String modelId) {

so we would probably need to pass the model as a parameter instead of modelId. That requires a refactoring that affects other services: changing TransportGetRerankerWindowSizeAction and every service that overrides rerankerWindowSize.

Should we do that, or is it better to revert some changes and key the window size on model_id, like this?

MixedbreadService

```
private static final Map<String, Integer> RERANKERS_INPUT_SIZE = Map.of(
    // Window size:
    // the v1 models: 512
    // the v2 models: at least 8k
    // https://www.mixedbread.com/docs/models/reranking/mxbai-rerank-large-v1
    "mixedbread-ai/mxbai-rerank-xsmall-v1",
    512,
    "mixedbread-ai/mxbai-rerank-base-v1",
    512,
    "mixedbread-ai/mxbai-rerank-large-v1",
    512
);

@Override
public int rerankerWindowSize(String modelId) {
    Integer inputSize = RERANKERS_INPUT_SIZE.get(modelId);
    return inputSize != null ? inputSize : DEFAULT_RERANKER_INPUT_SIZE_WORDS;
}
```

@jonathan-buttner @DonalEvans 

Contributor

The window size is not something a user can configure, it's an unchanging property of the model, so we don't need to store it in service settings. The current approach of having a map with the model IDs that don't use the 8k default is fine, but the rerankerWindowSize() method returns the size in words, not in tokens, so we'll need to translate from the 512/8000 values in tokens to smaller values in words by multiplying by 0.75 and rounding down a bit, which is the approach we use for other providers.

For consistency, the PR that originally introduced this feature can be used as a guide, with 512 tokens translating to a window size of 300 words, and 8000 tokens translating to 5500 words.
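
The token-to-word translation described in this comment could look roughly like the sketch below. It is only an illustration under stated assumptions: the class name, constant names, and null handling are hypothetical, and the 300 and 5500 word values are the ones quoted in the comment above, not values read from the merged code.

```java
import java.util.Map;

// Hypothetical stand-in for the relevant part of MixedbreadService.
final class MixedbreadWindowSizeSketch {
    // rerankerWindowSize() reports words, not tokens, so the token limits are
    // scaled by roughly 0.75 and rounded down (values quoted in the review).
    static final int V1_WINDOW_SIZE_WORDS = 300;       // 512 tokens
    static final int DEFAULT_WINDOW_SIZE_WORDS = 5500; // 8000 tokens

    // Only models that differ from the default need an entry.
    private static final Map<String, Integer> WINDOW_SIZE_WORDS_BY_MODEL = Map.of(
        "mixedbread-ai/mxbai-rerank-xsmall-v1", V1_WINDOW_SIZE_WORDS,
        "mixedbread-ai/mxbai-rerank-base-v1", V1_WINDOW_SIZE_WORDS,
        "mixedbread-ai/mxbai-rerank-large-v1", V1_WINDOW_SIZE_WORDS
    );

    static int rerankerWindowSize(String modelId) {
        // Map.of(...) rejects null keys on lookup, so guard explicitly.
        if (modelId == null) {
            return DEFAULT_WINDOW_SIZE_WORDS;
        }
        return WINDOW_SIZE_WORDS_BY_MODEL.getOrDefault(modelId, DEFAULT_WINDOW_SIZE_WORDS);
    }
}
```

Keeping the map limited to the exceptions means a newly released model falls back to the default without a code change.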


public class MixedbreadConstants {
public static final String VERSION_1 = "v1";
public static final String RERANK_PATH = "rerank";
Copilot AI Jan 21, 2026

The constant RERANK_PATH is defined as "rerank" here but in MixedbreadRerankModel.java it's defined as "reranking". This inconsistency could lead to incorrect API paths being constructed. These should be unified to use the same value.

Suggested change
- public static final String RERANK_PATH = "rerank";
+ public static final String RERANK_PATH = "reranking";

Contributor Author

Mixedbread supports both. I left "rerank". Done

Contributor

While both may work, the documentation for Mixedbread reranking uses reranking as the endpoint, so that's what we should be using.
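
As a rough illustration of the path being discussed, the sketch below joins the constants into the URL used by the direct request in the PR description. The class and method names are hypothetical, and BASE_URL is taken from that direct request rather than from the PR's code.

```java
import java.net.URI;

// Hypothetical helper; only VERSION_1 and RERANK_PATH mirror names from MixedbreadConstants.
final class MixedbreadUriSketch {
    static final String BASE_URL = "https://api.mixedbread.com"; // assumption: taken from the direct request
    static final String VERSION_1 = "v1";
    static final String RERANK_PATH = "reranking"; // the documented endpoint name

    static URI rerankUri() {
        // Joins the segments with single slashes.
        return URI.create(String.join("/", BASE_URL, VERSION_1, RERANK_PATH));
    }
}
```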

Contributor

@jonathan-buttner left a comment

Thanks for putting this together, I left some feedback.

private static final String INVALID_REQUEST_TYPE_MESSAGE = "Invalid request type: expected Mixedbread %s request but got %s";

private static final ResponseHandler RERANK_HANDLER = new MixedbreadRerankResponseHandler("mixedbread rerank", (request, response) -> {
if ((request instanceof MixedbreadRerankRequest) == false) {
Contributor

I don't believe we typically check the request type unless we need to use it. If the response format is invalid, the parsing logic will throw an error, which should be good enough. Let's remove the if-block.

Contributor Author

Thanks. I deleted the if-block

),
QueryAndDocsInputs.class
);
var errorMessage = buildErrorMessage(TaskType.RERANK, model.getInferenceEntityId());
Contributor

Let's use the helper method constructFailedToSendRequestMessage. Take a look at OpenAiActionCreator for example usage.

Contributor Author

Cleaned up and used the helper method instead. Thx

public static void decorateWithAuthHeader(HttpPost request, MixedbreadAccount account) {
request.setHeader(HttpHeaders.CONTENT_TYPE, XContentType.JSON.mediaType());
request.setHeader(createAuthBearerHeader(account.apiKey()));
request.setHeader(new BasicHeader(REQUEST_SOURCE_HEADER, ELASTIC_REQUEST_SOURCE));
Contributor

Can you point me to documentation as to why we need this header?

Contributor Author

I used other implementations as references, mainly Cohere, Jina AI, and the ones you suggested in the comments.
That left me with an unnecessary header; I found nothing indicating it is required.
The class has since been deleted as part of changes for other comments, and the header is no longer used in my implementation.

* Write 360 120 1-minute
* Update 480 160 1-minute
* Delete 240 80 1-minute
* <a href="https://www.mixedbread.com/api-reference/rate-limits">Rate Limiting</a>.
Contributor

These rate limits are for their storage operations. I'm not really sure what that is. If you go to the pricing page we can see that the free tier is limited to 100 requests per minute: https://www.mixedbread.com/pricing

Can you update the value to 100 and add the url I linked?

Contributor Author

I updated the url and added the link. Thank you

public static MixedbreadRerankServiceSettings fromMap(Map<String, Object> map, ConfigurationParseContext context) {
ValidationException validationException = new ValidationException();

String url = extractOptionalString(map, URL, ModelConfigurations.SERVICE_SETTINGS, validationException);
Contributor

Does mixedbread allow users to spin up deployments? From poking around it seems like requests are only made to https://api.mixedbread.com. Can we remove this? For testing we'll need a way to pass a local URL. For examples of how to do that take a look at https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/mistral/MistralModel.java

We basically just need a setter on the base model class.

Contributor Author

Yep, it seems it can be removed, so I removed it. I also added the suggested method.

}

@Override
protected void validateInputType(InputType inputType, Model model, ValidationException validationException) {
Contributor

I think we can remove this.

Contributor Author

I removed the implementation from the method


@Override
public TransportVersion getMinimalSupportedVersion() {
return TransportVersion.minimumCompatible();
Contributor

Switch this to use the new style.

Contributor Author

Done


@Override
public Set<TaskType> supportedStreamingTasks() {
return COMPLETION_ONLY;
Contributor

We don't support streaming tasks for Mixedbread yet so let's remove this.

Contributor Author

Done


@jonathan-buttner
Contributor

Also please add a change log entry.


// should only be used for testing
MixedbreadRerankModel(
String modelId,
Contributor

I believe this should be inferenceEntityId or inferenceId to avoid confusion with the actual model ID. That's a common mix-up across the code base, so you might find such naming elsewhere, but it should be consistent.

Contributor Author

replaced it with inferenceId

return new MixedbreadRerankServiceSettings(model, rateLimitSettings);
}

private final String model;
Contributor

Shouldn't it be modelId? The same applies to the other strings named model instead of modelId; the name model is strongly associated with actual model objects.

Contributor Author

Thanks. Done

context
);

if (validationException.validationErrors().isEmpty() == false) {
Contributor

I suggest using
validationException.throwIfValidationErrorsExist();
Readable, easy to understand and it is there exactly for throwing validation exceptions.

Contributor Author

Thank you. Done.
We may also want to make this change in other services upon some refactoring

public static MixedbreadRerankServiceSettings fromMap(Map<String, Object> map, ConfigurationParseContext context) {
ValidationException validationException = new ValidationException();

String model = extractRequiredString(map, MODEL_ID, ModelConfigurations.SERVICE_SETTINGS, validationException);
Contributor

Same related to naming of model instead of modelId

Contributor Author

Done

*/
public abstract class MixedbreadModel extends RateLimitGroupingModel {
private final SecureString apiKey;
private final RateLimitSettings rateLimitServiceSettings;
Contributor

I'm sorry if this was already answered elsewhere, but why do we have fields with RateLimitSettings both in Model class and in ServiceSettings class? Seems like unnecessary duplication. We could avoid having to store field in the model by accessing it through service settings.

Contributor Author

Thank you. Done

return uri;
}

public RateLimitSettings rateLimitSettings() {
Contributor

Why do we need 2 identical methods? rateLimitServiceSettings and rateLimitSettings

Contributor Author

Deleted one

Map<String, Object> serviceSettings,
Map<String, Object> taskSettings,
ChunkingSettings chunkingSettings,
Map<String, Object> secretSettings,
Contributor

I believe it should remain nullable because createModel is used by parsePersistedConfig that usually passes null for secret settings. Compared to parsePersistedConfigWithSecrets that passes the actual secret settings. You might want to check Nvidia integration for reference

Contributor Author

Thanks. Done

Map<String, Object> secretSettings,
ConfigurationParseContext context
) {
if (taskType != TaskType.RERANK) {
Contributor

task type check is performed within retrieveModelCreatorFromMapOrThrow method and can be removed from here.

Contributor Author

Done

* This class extends RateLimitGroupingModel and provides common functionality for Mixedbread models.
*/
public abstract class MixedbreadModel extends RateLimitGroupingModel {
private final SecureString apiKey;
Contributor

I believe apiKey should be a part of ModelSecrets/SecretSettings, not a field in the model.

Contributor Author

Thanks, done

Contributor

@jonathan-buttner left a comment

Looking good, left a few more suggestions.

private final RateLimitSettings rateLimitSettings;

public MixedbreadRerankServiceSettings(String modelId, @Nullable RateLimitSettings rateLimitSettings) {
this.modelId = modelId;
Contributor

Since we're requiring modelId let's add an Objects.requireNonNull.

Contributor Author

Done

}

public MixedbreadRerankServiceSettings(StreamInput in) throws IOException {
this.modelId = in.readOptionalString();
Contributor

Model id will be required so we can remove the optional part here.

Contributor Author

Done


@Override
protected XContentBuilder toXContentFragmentOfExposedFields(XContentBuilder builder, Params params) throws IOException {
if (modelId != null) {
Contributor

model id is required so we don't need this if check.

Contributor Author

Done


@Override
public void writeTo(StreamOutput out) throws IOException {
out.writeOptionalString(modelId);
Contributor

Let's switch to writeString() instead.

Contributor Author

Done
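
Taken together, the comments above about the required model ID amount to: reject null at construction, then read and write the value as a plain (non-optional) string. The sketch below illustrates that idea only. It uses java.io.DataInputStream and DataOutputStream as stand-ins for Elasticsearch's StreamInput and StreamOutput, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Objects;

// Hypothetical stand-in for a settings class with a required modelId.
final class RequiredModelIdSketch {
    private final String modelId;

    RequiredModelIdSketch(String modelId) {
        // Fail fast: a required field is never allowed to be null.
        this.modelId = Objects.requireNonNull(modelId, "modelId must not be null");
    }

    // Stand-in for the StreamInput constructor: no optional read.
    static RequiredModelIdSketch readFrom(DataInputStream in) throws IOException {
        return new RequiredModelIdSketch(in.readUTF());
    }

    // Stand-in for writeTo(StreamOutput): no optional write and no null check.
    void writeTo(DataOutputStream out) throws IOException {
        out.writeUTF(modelId);
    }

    String modelId() {
        return modelId;
    }

    // Serializes and deserializes an instance in memory.
    static RequiredModelIdSketch roundTrip(RequiredModelIdSketch original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                original.writeTo(out);
            }
            try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return readFrom(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```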

private static Document parseDocument(XContentParser parser) throws IOException {
var token = parser.currentToken();
if (token == XContentParser.Token.START_OBJECT) {
return new Document(DocumentObject.PARSER.apply(parser, null).text());
Contributor

Have you been able to retrieve results where the object field is an object? The api docs online seem to indicate that it'll be a string, or potentially an array? https://www.mixedbread.com/api-reference/endpoints/reranking/rerank-documents

If you have been able to get an object back, can you share the request you made to do that?

Contributor Author

@Evgenii-Kazannik Feb 7, 2026

It seems that "object" is just a type label, e.g.

"object": "list"
"object": "rank_result" / "text_document"

The string type in the spec describes how the value is represented: it is always a string. I believe we will always get back a list of ranked documents, each carrying its own object type.

This is what we typically get as a response:
...
"data": [
    {
        "index": 0,
        "score": 0.98291015625,
        "input": "To Kill a Mockingbird is a novel by Harper Lee",
        "object": "rank_result"
    },
    {
        "index": 2,
        "score": 0.61962890625,
        "input": "Harper Lee, an American novelist",
        "object": "rank_result"
    },
    {
        "index": 3,
        "score": 0.3642578125,
        "input": "Jane Austen was an English novelist",
        "object": "rank_result"
    }
],
"object": "list",

I agree that the wording in that documentation is a bit confusing.

/**
* Apart from v1 all other models have a context length of at least 8k.
*/
private static final int DEFAULT_RERANKER_INPUT_SIZE_WORDS = 8000;
Contributor

From their docs it seems to be 32k. Are we using a conservative value here instead?

https://www.mixedbread.com/docs/models/reranking

Contributor Author

I had assumed we should stick to the lowest value, but I should have asked; my mistake.
I changed it to 32k, converted tokens to words, and added a link to the source I used for the conversion.

taskType,
serviceSettingsMap,
taskSettingsMap,
chunkingSettings,
Contributor

Let's just pass null here

Contributor Author

Done

taskType,
serviceSettingsMap,
taskSettingsMap,
chunkingSettings,
Contributor

Let's pass null here

Contributor Author

Done


ChunkingSettings chunkingSettings = null;

return parsePersistedConfigWithSecrets(inferenceEntityId, taskType, serviceSettingsMap, taskSettingsMap, chunkingSettings, null);
Contributor

Let's pass null instead of defining chunking settings.

Contributor Author

Done

@@ -0,0 +1,846 @@
/*
Contributor

We have parameterized tests for some of the tests in this file.

Take a look at AbstractInferenceServiceParameterizedParsingTests and AbstractInferenceServiceParameterizedModelCreationTests.

Let's follow some examples for how we've leveraged those.

Contributor Author

Thank you. Added tests


Labels

>enhancement, external-contributor, :SearchOrg/Inference, Team:Search - Inference, v9.4.0


5 participants