[Bug]: openai_embedding_client returns len 8192 embedding not 4096 #6744

Closed
ehuaa opened this issue Jul 24, 2024 · 3 comments · Fixed by #6755
Labels
bug Something isn't working

Comments

@ehuaa

ehuaa commented Jul 24, 2024

Your current environment

Collecting environment information...
PyTorch version: 2.3.1+cu121

GPU models and configuration:
GPU 0: NVIDIA A40
GPU 1: NVIDIA A40
GPU 2: NVIDIA A40
GPU 3: NVIDIA A40

Nvidia driver version: 535.161.08
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] flashinfer==0.0.9+cu121torch2.3
[pip3] numpy==1.26.4
[pip3] nvidia-nccl-cu12==2.20.5
[pip3] onnx==1.14.1
[pip3] onnxruntime==1.18.1
[pip3] sentence-transformers==3.0.1
[pip3] torch==2.3.1
[pip3] torchvision==0.18.1
[pip3] transformers==4.42.4
[pip3] triton==2.3.1

vLLM Version: 0.5.3

🐛 Describe the bug

My vLLM version is the latest, v0.5.3.post1.
First, I launch an embedding server as below:
python3 -m vllm.entrypoints.openai.api_server --model Salesforce/SFR-Embedding-Mistral --dtype bfloat16 --enforce-eager --max-model-len 8192
Salesforce/SFR-Embedding-Mistral is an embedding model with the same architecture as intfloat/e5-mistral.

Then I use https://github.com/vllm-project/vllm/blob/main/examples/openai_embedding_client.py to test the online embedding result,
and it returns an embedding of length 8192, not 4096, which is MistralModel's hidden size.
I also ran two other tests:
a. Ran tests/entrypoints/openai/test_embedding.py: all three tests pass, and the embedding size is exactly 4096.
b. Ran examples/offline_inference_embedding.py: the embedding size is also exactly 4096.

Can you have a look at what's going wrong with openai_embedding_client.py? Thanks.
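
For reference, a minimal sketch of the request the example client makes, assuming the OpenAI Python client is pointed at the local vLLM server started above (the base URL, API key, and input text are placeholders, not the exact contents of openai_embedding_client.py):

```python
from openai import OpenAI

# Point the OpenAI client at the local vLLM server started above.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.embeddings.create(
    model="Salesforce/SFR-Embedding-Mistral",
    input=["Hello my name is"],
)

# Expected 4096 (MistralModel's hidden size), but 8192 is observed.
print(len(response.data[0].embedding))
```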

@ehuaa ehuaa added the bug Something isn't working label Jul 24, 2024
@CatherineSue
Contributor

CatherineSue commented Jul 24, 2024

Just checked OpenAI's Python lib: it encodes the float data as "base64" by default when encoding_format is not given (see here). So in openai_embedding_client.py the encoding of the returned embedding became "base64" instead of "float", hence the 8192 dimensions. If we add encoding_format="float", the returned dimensions will be 4096. Will add a fix soon.
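
A minimal sketch of that workaround, assuming the same client setup as in the reproduction above; the only change is requesting floats explicitly:

```python
# Explicitly ask for raw floats instead of the library's default base64 encoding.
response = client.embeddings.create(
    model="Salesforce/SFR-Embedding-Mistral",
    input=["Hello my name is"],
    encoding_format="float",
)

print(len(response.data[0].embedding))  # 4096, as expected
```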

@hibukipanim

Setting encoding_format="float" indeed resolves the issue. However, maybe there is still a bug with base64 in the vLLM server? Since it's the default encoding_format used by the OpenAI Python API, it should still return the correct size, I guess? The reason it's 8192 is that every second element is 0.
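
A hedged sketch of how to check that observation, assuming the same client setup as above and the library default (no encoding_format passed):

```python
# With no encoding_format, the OpenAI client requests base64 under the hood
# and decodes it back into a list of floats.
response = client.embeddings.create(
    model="Salesforce/SFR-Embedding-Mistral",
    input=["Hello my name is"],
)

emb = response.data[0].embedding
print(len(emb))                        # 8192 instead of 4096
print(all(x == 0 for x in emb[1::2]))  # True, per the observation above
```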

@HollowMan6
Contributor


@hibukipanim This should hopefully be fixed by #7855.
