System Info
HF Inference Endpoint with the custom container ghcr.io/huggingface/text-embeddings-inference:1.7.2, running Qwen/Qwen3-Embedding-4B with the task "sentence-embeddings" on a single AWS L4 instance.
Information
Tasks
Reproduction
Deploy an HF Inference Endpoint with Qwen/Qwen3-Embedding-4B using the ghcr.io/huggingface/text-embeddings-inference:1.7.2 image and make a test embedding request.
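For reference, a minimal sketch of the test request I mean, using TEI's `/embed` route. The endpoint URL and token below are placeholders for your own deployment:

```python
import json
import urllib.request

ENDPOINT_URL = "https://your-endpoint.endpoints.huggingface.cloud"  # placeholder


def build_embed_request(text: str, url: str, token: str) -> urllib.request.Request:
    """Build a POST request for TEI's /embed route."""
    return urllib.request.Request(
        url.rstrip("/") + "/embed",
        data=json.dumps({"inputs": text}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # placeholder HF token
        },
    )


def embed(text: str, url: str = ENDPOINT_URL, token: str = "hf_xxx") -> list:
    """Send the request and return the first embedding vector."""
    with urllib.request.urlopen(build_embed_request(text, url, token)) as resp:
        return json.loads(resp.read())[0]


if __name__ == "__main__":
    vec = embed("hello world")
    print(len(vec))  # prints the embedding dimension returned by the endpoint
```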
Expected behavior
The embeddings I get back have dimension 1024, but this model should support up to 2560 dimensions, so I expected 2560-dimensional vectors by default.