
[ML] Enable built-in Inference Endpoints and default for Semantic Text #116931

Merged
davidkyle merged 5 commits into elastic:main from davidkyle:remove-defaults-ff on Nov 18, 2024

@davidkyle
Member

Adds built-in inference endpoints for the ELSER (.elser-2-elasticsearch) and multilingual-e5-small (.multilingual-e5-small-elasticsearch) models. These endpoints always appear in the GET _inference API, and the models are automatically downloaded and deployed on a call to POST _inference. The endpoints use adaptive allocations with scale to 0 enabled (min_number_of_allocations: 0). After 15 minutes of inactivity, a model deployment scales down to 0 allocations, at which point it uses no resources and the node it was running on may scale down. Model deployments scale up again when the models are used. Built-in endpoint IDs start with a . prefix and end with the name of the service that hosts them, in this case elasticsearch.
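
For illustration, a minimal sketch of what a call to POST _inference against the built-in ELSER endpoint could look like from a client. It only builds the request and does not send it. The host `http://localhost:9200`, the helper name `build_inference_request`, and the sample input are assumptions for the example, not part of this PR.

```python
import json
import urllib.request

# Assumption: a local, unsecured cluster. Adjust host and auth for a real one.
ES_URL = "http://localhost:9200"


def build_inference_request(task_type, inference_id, text):
    """Build (but do not send) a POST _inference request for a single input."""
    body = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{ES_URL}/_inference/{task_type}/{inference_id}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The built-in ELSER endpoint is a sparse_embedding endpoint.
req = build_inference_request(
    "sparse_embedding", ".elser-2-elasticsearch", "hello world"
)
print(req.get_method(), req.full_url)
```

Sending the request (for example with `urllib.request.urlopen(req)`) is what would trigger the model download and deployment described above, so the first call may take noticeably longer than later ones.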

The Semantic Text field mapping defaults the inference_id option to the built-in ELSER inference endpoint. Indexing a document with a semantic text field mapping will trigger the download and deployment of the model.
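
A short sketch of the two mapping shapes this implies: one that omits inference_id and so relies on the new default, and one that names the multilingual-e5-small endpoint explicitly. The field name `content` and the helper `semantic_text_mapping` are hypothetical; the bodies are what would be sent when creating an index.

```python
import json


def semantic_text_mapping(field, inference_id=None):
    """Return a create-index body with a single semantic_text field.

    When inference_id is None the option is omitted, so the mapping
    falls back to the default endpoint.
    """
    field_mapping = {"type": "semantic_text"}
    if inference_id is not None:
        field_mapping["inference_id"] = inference_id
    return {"mappings": {"properties": {field: field_mapping}}}


# Relies on the default (the built-in ELSER endpoint).
default_body = semantic_text_mapping("content")

# Opts in to the built-in multilingual-e5-small endpoint instead.
e5_body = semantic_text_mapping("content", ".multilingual-e5-small-elasticsearch")

print(json.dumps(default_body, indent=2))
print(json.dumps(e5_body, indent=2))
```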

GET _inference/_all
{
  "endpoints": [
    {
      "inference_id": ".elser-2-elasticsearch",
      "task_type": "sparse_embedding",
      "service": "elasticsearch",
      "service_settings": {
        "num_threads": 1,
        "model_id": ".elser_model_2",
        "adaptive_allocations": {
          "enabled": true,
          "min_number_of_allocations": 0,
          "max_number_of_allocations": 32
        }
      },
      "chunking_settings": {
        "strategy": "word",
        "max_chunk_size": 250,
        "overlap": 100
      }
    },
    {
      "inference_id": ".multilingual-e5-small-elasticsearch",
      "task_type": "text_embedding",
      "service": "elasticsearch",
      "service_settings": {
        "num_threads": 1,
        "model_id": ".multilingual-e5-small",
        "adaptive_allocations": {
          "enabled": true,
          "min_number_of_allocations": 0,
          "max_number_of_allocations": 32
        }
      },
      "chunking_settings": {
        "strategy": "word",
        "max_chunk_size": 250,
        "overlap": 100
      }
    }
  ]
}
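
As a rough sketch of how a client might read this response, the snippet below flags endpoints that look built-in and checks whether they can scale to zero. The "built-in" check is only a heuristic based on the naming convention described above (a . prefix and a service-name suffix), and the embedded response is a trimmed copy that keeps just the fields used here.

```python
import json

# Trimmed copy of the GET _inference/_all response, keeping only the fields used below.
RESPONSE = """
{
  "endpoints": [
    {
      "inference_id": ".elser-2-elasticsearch",
      "task_type": "sparse_embedding",
      "service": "elasticsearch",
      "service_settings": {
        "adaptive_allocations": {
          "enabled": true,
          "min_number_of_allocations": 0,
          "max_number_of_allocations": 32
        }
      }
    },
    {
      "inference_id": ".multilingual-e5-small-elasticsearch",
      "task_type": "text_embedding",
      "service": "elasticsearch",
      "service_settings": {
        "adaptive_allocations": {
          "enabled": true,
          "min_number_of_allocations": 0,
          "max_number_of_allocations": 32
        }
      }
    }
  ]
}
"""


def summarize(response):
    """Return one summary row per endpoint in a GET _inference response."""
    rows = []
    for endpoint in response["endpoints"]:
        inference_id = endpoint["inference_id"]
        allocations = endpoint["service_settings"].get("adaptive_allocations", {})
        rows.append({
            "inference_id": inference_id,
            "task_type": endpoint["task_type"],
            # Heuristic only: "." prefix plus "-<service>" suffix.
            "built_in": inference_id.startswith(".")
            and inference_id.endswith("-" + endpoint["service"]),
            # Scale to zero means adaptive allocations with a minimum of 0.
            "scales_to_zero": bool(allocations.get("enabled", False))
            and allocations.get("min_number_of_allocations") == 0,
        })
    return rows


rows = summarize(json.loads(RESPONSE))
for row in rows:
    print(row)
```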


@elasticsearchmachine added the Team:ML and Team:Search Relevance labels Nov 18, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine
Collaborator

Hi @davidkyle, I've created a changelog YAML for you.

@davidkyle added the auto-backport label Nov 18, 2024
@davidkyle merged commit 9790cc4 into elastic:main Nov 18, 2024
@elasticsearchmachine
Collaborator

💔 Backport failed

The backport operation could not be completed due to the following error:

An unexpected error occurred when attempting to backport this PR.

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 116931

davidkyle added a commit to davidkyle/elasticsearch that referenced this pull request Nov 18, 2024
elastic#116931)

Adds built-in inference endpoints for the ELSER (.elser-2-elasticsearch)
and multilingual-e5-small models (.multilingual-e5-small-elasticsearch).
The semantic text inference Id field now defaults to elser-2-elasticsearch
# Conflicts:
#	test/test-clusters/src/main/java/org/elasticsearch/test/cluster/FeatureFlag.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceFeatures.java
alexey-ivanov-es pushed a commit to alexey-ivanov-es/elasticsearch that referenced this pull request Nov 28, 2024
elastic#116931)

Adds built-in inference endpoints for the ELSER (.elser-2-elasticsearch)
and multilingual-e5-small models (.multilingual-e5-small-elasticsearch).
The semantic text inference Id field now defaults to elser-2-elasticsearch

Labels

auto-backport, backport pending, >enhancement, :ml, :Search Relevance/Vectors, Team:ML, Team:Search Relevance, v8.17.0, v9.0.0
