Add Jina-Embeddings-V3 Model #44251 (Merged)
74 commits:
- 05e2cb3 Added Model Documentation. (Sai-Suraj-27)
- 331f3b4 Added conversion_mapping weight renamings (Sai-Suraj-27)
- 1c32317 Added Auto Mappings. (Sai-Suraj-27)
- 4014ee3 init (Sai-Suraj-27)
- efe5a39 Modular jina_embeddings_v3 (Sai-Suraj-27)
- 2524630 modular -> modeling + config (Sai-Suraj-27)
- 3a33e40 __init__.py (Sai-Suraj-27)
- 2dafe59 Created folder for tests (Sai-Suraj-27)
- baa7d91 Added documentation for the jina-embeddings-v3 Model (Sai-Suraj-27)
- 5dcd47f Tests (Sai-Suraj-27)
- 6fb7d95 Update Tests (Sai-Suraj-27)
- a9575dd Merge branch 'main' of github.com:huggingface/transformers into add_j… (Sai-Suraj-27)
- 1ba2c54 Update Tests (Sai-Suraj-27)
- 632820f Update modular (Sai-Suraj-27)
- b2a4ec5 Fix failing test (Sai-Suraj-27)
- 284ffc6 scope (Sai-Suraj-27)
- caee0ba Merge branch 'main' of github.com:huggingface/transformers into add_j… (Sai-Suraj-27)
- bad2fc2 Update modular, Add docstring for adapter_mask (Sai-Suraj-27)
- ec81999 Testing (Sai-Suraj-27)
- ba3390f Fix failing test (Sai-Suraj-27)
- e0a90bf Added IntegrationTests (Sai-Suraj-27)
- ade3631 Updated model doc date (Sai-Suraj-27)
- c6ed80f Merge branch 'main' of github.com:huggingface/transformers into add_j… (Sai-Suraj-27)
- a199bf6 post_init() (Sai-Suraj-27)
- b44f90b make style. (Sai-Suraj-27)
- 6cf10ed Merge branch 'main' of github.com:huggingface/transformers into add_j… (Sai-Suraj-27)
- fe0c7cb adapter_mask gone (Sai-Suraj-27)
- 00b93db Better Modular (Sai-Suraj-27)
- af6f908 Add conversion_mapping (Sai-Suraj-27)
- 05d4190 Modular -> Modeling + Config (Sai-Suraj-27)
- 8bc7dfa Update model doc (Sai-Suraj-27)
- 95e05f6 Update tests (Sai-Suraj-27)
- c92493a Small fix (Sai-Suraj-27)
- e3a7c79 Merge branch 'main' of github.com:huggingface/transformers into add_j… (Sai-Suraj-27)
- d5e49ce make fix-repo (Sai-Suraj-27)
- 081af3d fix _tied_weights_keys (Sai-Suraj-27)
- 32a1722 self.is_causal=False (Sai-Suraj-27)
- 5dee40f Add tie_word_embeddings in configuration class (Sai-Suraj-27)
- c3a26e9 Merge branch 'main' of github.com:huggingface/transformers into add_j… (Sai-Suraj-27)
- 486aae1 small fix in configuration doc-string (Sai-Suraj-27)
- 74922d3 config update (Sai-Suraj-27)
- 0bb18c9 fix check_docstrings.py (Sai-Suraj-27)
- 45844f4 Merge branch 'main' of github.com:huggingface/transformers into add_j… (Sai-Suraj-27)
- 03f4375 ruff: Reformat (Sai-Suraj-27)
- 901d7a9 Remove extra args from config (Sai-Suraj-27)
- b0cba60 Merge branch 'main' of github.com:huggingface/transformers into add_j… (Sai-Suraj-27)
- 8988ceb update tests + model doc (Sai-Suraj-27)
- 66b3b2e Better, modern modular (Sai-Suraj-27)
- a0d4011 make fix-repo (Sai-Suraj-27)
- ca3a9d5 Merge branch 'main' of github.com:huggingface/transformers into add_j… (Sai-Suraj-27)
- f95b90d Update conversion mapping (Sai-Suraj-27)
- 457e75e fix dropout (Sai-Suraj-27)
- d561f86 Better modular (Sai-Suraj-27)
- 76a2807 Update conversion mapping (Sai-Suraj-27)
- a8fabb2 Update tests (Sai-Suraj-27)
- 75ffec8 Update docs (Sai-Suraj-27)
- 9c53b0f Merge branch 'main' of github.com:huggingface/transformers into add_j… (Sai-Suraj-27)
- d31585f Better modular (Sai-Suraj-27)
- 1b7540c Merge branch 'main' of github.com:huggingface/transformers into add_j… (Sai-Suraj-27)
- 94e3b2d Fix license (Sai-Suraj-27)
- c4f7cc6 Fix date (Sai-Suraj-27)
- 0da360e Merge branch 'main' of github.com:huggingface/transformers into add_j… (Sai-Suraj-27)
- 0d36657 Better modular, Configuration (Sai-Suraj-27)
- c4072d8 make fix-repo (Sai-Suraj-27)
- b71bda5 Fix config (Sai-Suraj-27)
- 5bce77a Merge branch 'main' of github.com:huggingface/transformers into add_j… (Sai-Suraj-27)
- ae8ee84 Use autodocstring (Sai-Suraj-27)
- b0df7cd lets use auto (vasqu)
- 3c2efc2 hmm is it this (vasqu)
- f71b4ba make hf version (vasqu)
- 3d38dbe my bad... (vasqu)
- 6a1c9dc Merge branch 'main' into add_jina_v3_model (vasqu)
- 0f37b73 retry whats up with ci (vasqu)
- e5c6f35 ci pls (vasqu)
New file (+165 lines) — model documentation:
<!--Copyright 2026 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.

⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.

-->

*This model was released on 2024-09-16 and added to Hugging Face Transformers on 2026-03-18.*

<div style="float: right;">
    <div class="flex flex-wrap space-x-1">
        <img alt="PyTorch" src="https://img.shields.io/badge/PyTorch-DE3412?style=flat&logo=pytorch&logoColor=white" >
        <img alt="FlashAttention" src="https://img.shields.io/badge/%E2%9A%A1%EF%B8%8E%20FlashAttention-eae0c8?style=flat">
        <img alt="SDPA" src="https://img.shields.io/badge/SDPA-DE3412?style=flat&logo=pytorch&logoColor=white">
    </div>
</div>

# JinaEmbeddingsV3
[Jina-Embeddings-v3](https://huggingface.co/papers/2409.10173) is a multilingual, multi-task text embedding model designed for a variety of NLP applications. Based on the XLM-RoBERTa architecture, it replaces absolute position embeddings with **Rotary Position Embeddings (RoPE)** to support input sequences of up to 8192 tokens. It also features 5 built-in **task-specific LoRA adapters** that allow the model to generate task-specific embeddings (e.g., for retrieval vs. classification) without significantly increasing inference latency.
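RoPE rotates each even/odd pair of query/key channels by a position-dependent angle instead of adding a learned position vector, so relative offsets are encoded directly in the attention dot product. The sketch below illustrates the general mechanism in plain PyTorch; it is not the model's actual implementation — the tensor layout and pairing convention are assumptions for demonstration, with the base frequency set to the configuration's `default_theta` of 20000.0.

```python
import torch

def rotary_embed(x: torch.Tensor, theta: float = 20000.0) -> torch.Tensor:
    """Apply rotary position embeddings to x of shape (seq_len, head_dim).

    Each channel pair (x[2i], x[2i+1]) at position p is rotated by
    the angle p * theta**(-2i / head_dim).
    """
    seq_len, head_dim = x.shape
    inv_freq = theta ** (-torch.arange(0, head_dim, 2).float() / head_dim)  # (head_dim/2,)
    angles = torch.arange(seq_len).float()[:, None] * inv_freq[None, :]     # (seq_len, head_dim/2)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = torch.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin  # 2D rotation applied per channel pair
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

q = torch.randn(8, 64)
q_rot = rotary_embed(q)
# Rotation preserves each position's vector norm; position 0 is left unchanged.
```

Because only angles change with position, no per-position parameters are learned, which is what lets the model extrapolate to long sequences.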

You can find the original Jina Embeddings v3 checkpoints under the [Jina AI](https://huggingface.co/jinaai) organization.

> [!TIP]
> Click on the Jina Embeddings v3 models in the right sidebar for more examples of how to apply the model to different language tasks.
The examples below demonstrate how to extract features (embeddings) with [`Pipeline`] and [`AutoModel`].

<hfoptions id="usage">
<hfoption id="Pipeline">
```py
import torch
from transformers import pipeline

pipeline = pipeline(
    task="feature-extraction",
    model="jinaai/jina-embeddings-v3-hf",
)
# Returns a list of lists containing the embeddings for each token
embeddings = pipeline("Jina Embeddings V3 is great for semantic search.")
```
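The pipeline returns one vector per token as nested Python lists (depending on the version, there may be a leading batch axis). To collapse this into a single sentence vector, average over the token axis. A minimal sketch on dummy data whose shape mimics that output — no model is loaded here, and the sizes are made up for illustration:

```python
import numpy as np

# Dummy stand-in for the pipeline's nested-list output: 1 sequence, 6 tokens, hidden size 4
token_embeddings = np.random.rand(1, 6, 4).tolist()

arr = np.asarray(token_embeddings)
arr = arr.reshape(-1, arr.shape[-1])                      # flatten any leading batch axis: (tokens, hidden)
sentence_embedding = arr.mean(axis=0)                     # average over tokens
sentence_embedding /= np.linalg.norm(sentence_embedding)  # unit length, ready for cosine similarity
print(sentence_embedding.shape)  # (4,)
```

Note this simple average does not mask padding; the mean-pooling helper shown later in this document handles padded batches correctly.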

</hfoption>
<hfoption id="AutoModel">
```py
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("jinaai/jina-embeddings-v3-hf")
model = AutoModel.from_pretrained("jinaai/jina-embeddings-v3-hf", device_map="auto")

prompt = "Jina Embeddings V3 is great for semantic search."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model(**inputs)
# The base AutoModel returns the raw hidden states for all tokens
last_hidden_states = outputs.last_hidden_state

print(f"Features shape: {last_hidden_states.shape}")
```

</hfoption>
</hfoptions>
## Task-Specific LoRA Adapters

A key feature of `JinaEmbeddingsV3` is its LoRA adapters, which let you tailor the output embeddings to specific use cases without the overhead of loading entirely different models.

The following tasks are supported:

* **`retrieval.query`**: Used for query embeddings in asymmetric retrieval tasks (e.g., search queries).
* **`retrieval.passage`**: Used for passage embeddings in asymmetric retrieval tasks (e.g., the documents being searched).
* **`separation`**: Used for embeddings in clustering and re-ranking applications.
* **`classification`**: Used for embeddings in classification tasks.
* **`text-matching`**: Used for embeddings in tasks that quantify similarity between two texts, such as Semantic Textual Similarity (STS) or symmetric retrieval tasks.
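A LoRA adapter is cheap because it is a low-rank update: rather than swapping the full weight matrix `W`, each task contributes a small delta `B @ A` with rank `r` much smaller than the hidden size. The following is a schematic illustration in plain PyTorch — the rank, initialization, and which layers get adapters are made-up assumptions here, not the model's actual adapter configuration:

```python
import torch

d, r = 1024, 4  # d matches the model's hidden size; rank r is illustrative
W = torch.randn(d, d) / d**0.5  # frozen base projection weight

# One low-rank (B, A) pair per task; B starts at zero so the delta is initially a no-op
adapters = {
    task: (torch.zeros(d, r), torch.randn(r, d))
    for task in ["retrieval.query", "retrieval.passage", "separation",
                 "classification", "text-matching"]
}

def forward(x: torch.Tensor, task: str) -> torch.Tensor:
    B, A = adapters[task]
    return x @ (W + B @ A).T  # base projection plus the task-specific low-rank delta

x = torch.randn(2, d)
out = forward(x, "retrieval.query")
print(out.shape)  # torch.Size([2, 1024])
```

Each adapter stores only `2 * d * r` extra parameters versus `d * d` for a full matrix, which is why five task variants fit in one checkpoint with negligible latency cost.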

To generate high-quality sentence or paragraph embeddings, you need to apply **mean pooling** to the model's token embeddings. Mean pooling takes all token embeddings from the model's output and averages them, masking out the padding tokens.

Here is how you can generate sentence embeddings tailored for a retrieval query task using the `AutoModel` API.
```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel


def mean_pooling(model_output, attention_mask):
    # First element of model_output contains all token embeddings
    token_embeddings = model_output[0]
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()

    # Sum the embeddings and divide by the number of non-padding tokens
    sum_embeddings = torch.sum(token_embeddings * input_mask_expanded, 1)
    sum_mask = torch.clamp(input_mask_expanded.sum(1), min=1e-9)
    return sum_embeddings / sum_mask


sentences = [
    "How is the weather today?",
    "What is the current weather like today?"
]

tokenizer = AutoTokenizer.from_pretrained("jinaai/jina-embeddings-v3-hf")
model = AutoModel.from_pretrained("jinaai/jina-embeddings-v3-hf")

encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt").to(model.device)

# Set up the adapter for your specific task
task = 'retrieval_query'  # Can be any of (retrieval_passage, separation, classification, text_matching) depending on the use case.

model.load_adapter("jinaai/jina-embeddings-v3-hf", adapter_name=task, adapter_kwargs={"subfolder": task})
model.set_adapter(task)

with torch.no_grad():
    model_output = model(**encoded_input)

embeddings = mean_pooling(model_output, encoded_input["attention_mask"])
embeddings = F.normalize(embeddings, p=2, dim=1)

print(embeddings.shape)
# Output: torch.Size([2, 1024])
```
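Because the embeddings above are L2-normalized with `F.normalize`, cosine similarity reduces to a plain dot product. The sketch below demonstrates this on dummy unit vectors of the same shape — no model is downloaded; only the 1024-dimensional shape matches the example above:

```python
import torch
import torch.nn.functional as F

# Stand-ins for two L2-normalized sentence embeddings, shape (2, 1024)
embeddings = F.normalize(torch.randn(2, 1024), p=2, dim=1)

# For unit vectors, the matrix of dot products IS the cosine-similarity matrix
similarity = embeddings @ embeddings.T
print(similarity.shape)  # torch.Size([2, 2])
# The diagonal is ~1.0: each embedding is maximally similar to itself
```

In a retrieval setting you would compute `query_embeddings @ passage_embeddings.T` and rank passages by the resulting scores.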

## JinaEmbeddingsV3Config

[[autodoc]] JinaEmbeddingsV3Config

## JinaEmbeddingsV3Model

[[autodoc]] JinaEmbeddingsV3Model
- forward

## JinaEmbeddingsV3ForMaskedLM

[[autodoc]] JinaEmbeddingsV3ForMaskedLM
- forward

## JinaEmbeddingsV3ForSequenceClassification

[[autodoc]] JinaEmbeddingsV3ForSequenceClassification
- forward

## JinaEmbeddingsV3ForTokenClassification

[[autodoc]] JinaEmbeddingsV3ForTokenClassification
- forward

## JinaEmbeddingsV3ForQuestionAnswering

[[autodoc]] JinaEmbeddingsV3ForQuestionAnswering
- forward
New file (+29 lines) — package `__init__.py`:
```python
# Copyright 2026 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from typing import TYPE_CHECKING

from ...utils import _LazyModule
from ...utils.import_utils import define_import_structure


if TYPE_CHECKING:
    from .configuration_jina_embeddings_v3 import *
    from .modeling_jina_embeddings_v3 import *
else:
    import sys

    _file = globals()["__file__"]
    sys.modules[__name__] = _LazyModule(__name__, _file, define_import_structure(_file), module_spec=__spec__)
```
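`_LazyModule` replaces the package's module object so that the heavy submodule imports only happen when an attribute is first accessed, keeping `import transformers` fast. A simplified stand-in for the same pattern — this is not the real `_LazyModule` API, just a sketch of the idea using a module subclass with `__getattr__`:

```python
import importlib
import types


class LazyModule(types.ModuleType):
    """Minimal sketch: resolve attributes from their source module on first access."""

    def __init__(self, name: str, attr_to_module: dict):
        super().__init__(name)
        self._attr_to_module = attr_to_module

    def __getattr__(self, attr):
        # Called only when normal lookup fails, i.e. before the attribute is cached
        try:
            module_name = self._attr_to_module[attr]
        except KeyError:
            raise AttributeError(attr) from None
        value = getattr(importlib.import_module(module_name), attr)
        setattr(self, attr, value)  # cache so later lookups bypass __getattr__
        return value


lazy = LazyModule("demo", {"sqrt": "math"})
print(lazy.sqrt(16.0))  # the source module is resolved on this first access
```

The real `_LazyModule` additionally derives the attribute-to-submodule map from `define_import_structure(_file)` and preserves the module spec, but the deferral mechanism is the same.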
New file (+72 lines) — `src/transformers/models/jina_embeddings_v3/configuration_jina_embeddings_v3.py`:
````python
# 🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨
# This file was automatically generated from src/transformers/models/jina_embeddings_v3/modular_jina_embeddings_v3.py.
# Do NOT edit this file manually as any edits will be overwritten by the generation of
# the file from the modular. If any change should be done, please apply the change to the
# modular_jina_embeddings_v3.py file directly. One of our CI enforces this.
# 🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨
# Copyright 2026 The Jina-AI and HuggingFace Inc. teams. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from huggingface_hub.dataclasses import strict

from ...configuration_utils import PreTrainedConfig
from ...modeling_rope_utils import RopeParameters
from ...utils import auto_docstring


@auto_docstring(checkpoint="jinaai/jina-embeddings-v3-hf")
@strict(accept_kwargs=True)
class JinaEmbeddingsV3Config(PreTrainedConfig):
    r"""
    Examples:

    ```python
    >>> from transformers import JinaEmbeddingsV3Config, JinaEmbeddingsV3Model

    >>> # Initializing a Jina-Embeddings-V3 jinaai/jina-embeddings-v3-hf style configuration
    >>> configuration = JinaEmbeddingsV3Config()

    >>> # Initializing a model (with random weights) from the jinaai/jina-embeddings-v3-hf style configuration
    >>> model = JinaEmbeddingsV3Model(configuration)

    >>> # Accessing the model configuration
    >>> configuration = model.config
    ```"""

    model_type = "jina_embeddings_v3"

    vocab_size: int = 250002
    hidden_size: int = 1024
    num_hidden_layers: int = 24
    num_attention_heads: int = 16
    intermediate_size: int = 4096
    hidden_act: str = "gelu"
    hidden_dropout_prob: float = 0.1
    attention_probs_dropout_prob: float = 0.1
    max_position_embeddings: int = 8194
    type_vocab_size: int = 1
    initializer_range: float = 0.02
    layer_norm_eps: float = 1e-5
    pad_token_id: int | None = 1
    bos_token_id: int | None = 0
    eos_token_id: int | None = 2
    use_cache: bool = True
    classifier_dropout: float | int | None = None
    tie_word_embeddings: bool = True
    default_theta = 20000.0
    rope_parameters: RopeParameters | dict | None = None


__all__ = ["JinaEmbeddingsV3Config"]
````
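A few quantities follow directly from these defaults, e.g. the per-head dimension and the rotary inverse-frequency table. A quick check in plain Python, assuming the standard `head_dim = hidden_size / num_attention_heads` split (note also that `max_position_embeddings = 8194` appears to be the 8192-token limit plus the offset of 2 that XLM-RoBERTa-style models reserve for the padding index):

```python
hidden_size = 1024
num_attention_heads = 16
default_theta = 20000.0

head_dim = hidden_size // num_attention_heads
print(head_dim)  # 64

# RoPE defines one inverse frequency per channel pair, i.e. head_dim // 2 of them
inv_freq = [default_theta ** (-2 * i / head_dim) for i in range(head_dim // 2)]
print(len(inv_freq), inv_freq[0])  # 32 1.0
```

The first frequency is always 1.0 (theta to the power 0), and they decay geometrically toward slower-rotating channels that encode coarser position information.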