
Precision does not work when requesting token_embeddings #2882


Description

@kacperlukawski

Hi! I've been experimenting with token embeddings recently and found that I cannot reduce the precision when requesting them. The model below is only an example; any SentenceTransformer model reproduces the problem.

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # example model; any model shows the issue

emb = model.encode(
    [
        "Hello World, a test sentence",
        "Here comes another sentence",
        "My final sentence",
    ],
    output_value="token_embeddings",
    precision="uint8",
)

The quantize_embeddings function raises a ValueError because the token embedding arrays have one row per token, so sentences of different lengths produce arrays of different shapes that cannot be combined into a single NumPy array. Here is part of the stack trace:

  File "/home/kacper/.cache/pypoetry/virtualenvs/beir-qdrant-wY0sLiQM-py3.10/lib/python3.10/site-packages/sentence_transformers/SentenceTransformer.py", line 553, in encode
    all_embeddings = quantize_embeddings(all_embeddings, precision=precision)
  File "/home/kacper/.cache/pypoetry/virtualenvs/beir-qdrant-wY0sLiQM-py3.10/lib/python3.10/site-packages/sentence_transformers/quantization.py", line 400, in quantize_embeddings
    embeddings = np.array(embeddings)
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (4,) + inhomogeneous part.
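The failure can be reproduced with plain NumPy (a minimal sketch; the token counts and the 384-dimensional size are made up for illustration, and it assumes a recent NumPy that rejects ragged inputs):

import numpy as np

# One row per token: sentences of different lengths yield arrays with
# different first dimensions.
a = np.zeros((7, 384))  # 7 tokens
b = np.zeros((5, 384))  # 5 tokens

# np.array() cannot combine them into one homogeneous array and raises
# "ValueError: setting an array element with a sequence. ..."
np.array([a, b])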

@tomaarsen I'm happy to provide a PR fixing this.
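Until this is fixed, one possible workaround is to quantize each sentence's token-embedding matrix separately (a minimal sketch, not the library's documented path for this case; the model name is only an example):

from sentence_transformers import SentenceTransformer
from sentence_transformers.quantization import quantize_embeddings

model = SentenceTransformer("all-MiniLM-L6-v2")  # example model

token_embeddings = model.encode(
    [
        "Hello World, a test sentence",
        "Here comes another sentence",
        "My final sentence",
    ],
    output_value="token_embeddings",  # one (num_tokens, dim) matrix per sentence
)

# Quantize each matrix on its own; they cannot be stacked because the
# token counts differ between sentences.
quantized = [quantize_embeddings(emb, precision="uint8") for emb in token_embeddings]
for q in quantized:
    print(q.shape, q.dtype)

Note that without shared calibration ranges, each call presumably derives the uint8 ranges from that sentence's tokens alone, so the quantization scales differ between sentences; a proper fix inside quantize_embeddings would need to handle the ragged input (or shared ranges) directly.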
