Use correct type of pooling for embedding models #5500

iamlemec · 2024-02-15T07:12:55Z

Previously we used sum pooling as an approximation to mean pooling in all cases. This PR changes that:

- On conversion, read the model's pooling configuration (MEAN, CLS, or NONE)
- Actually do MEAN pooling by precomputing sequence lengths
- Use ggml_get_rows to speed up CLS pooling (which takes only the first token)
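
For readers unfamiliar with the three modes, here is a minimal pure-Python sketch of what each one computes for a single sequence. It is an illustration only, not the code in this PR: the `PoolingType` enum, its numeric values, and the `pool` and `sum_pool` helpers are hypothetical names made up for this example.

```python
from enum import Enum


class PoolingType(Enum):
    # Hypothetical enum for illustration; values are arbitrary.
    NONE = 0
    MEAN = 1
    CLS = 2


def sum_pool(token_embeddings):
    """Element-wise sum over the token vectors of one sequence."""
    dim = len(token_embeddings[0])
    sums = [0.0] * dim
    for vec in token_embeddings:
        for i, x in enumerate(vec):
            sums[i] += x
    return sums


def pool(token_embeddings, pooling):
    """Pool the per-token vectors of ONE sequence according to `pooling`."""
    if pooling is PoolingType.NONE:
        # No pooling: return the per-token embeddings unchanged.
        return [list(vec) for vec in token_embeddings]
    if pooling is PoolingType.CLS:
        # CLS pooling: the embedding of the first token only.
        return list(token_embeddings[0])
    if pooling is PoolingType.MEAN:
        # MEAN pooling: the sum divided by the sequence length.
        n_tokens = len(token_embeddings)
        return [s / n_tokens for s in sum_pool(token_embeddings)]
    raise ValueError(f"unknown pooling type: {pooling!r}")


if __name__ == "__main__":
    tokens = [[1.0, 2.0], [3.0, 4.0]]
    print(pool(tokens, PoolingType.MEAN))
    print(pool(tokens, PoolingType.CLS))
    print(sum_pool(tokens))
```

For one sequence, the sum is the mean scaled by the token count, which is why the sequence length has to be known to get true MEAN pooling.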

I tested with a couple of different types of models, and the numbers look very close to SentenceTransformers. The remaining differences are due to the various tokenization issues discussed.
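
One way to quantify "very close" is to compare the two embeddings for the same input by cosine similarity and maximum absolute difference. The sketch below assumes both embeddings are already available as plain lists of floats. The helper names and the example vectors are placeholders, not results from this PR.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    # Placeholder vectors standing in for the two implementations' outputs.
    emb_a = [0.10, -0.20, 0.30]
    emb_b = [0.11, -0.19, 0.30]
    print(cosine_similarity(emb_a, emb_b))
    print(max_abs_diff(emb_a, emb_b))
```

Cosine similarity ignores overall scale, so the absolute difference is worth checking as well when the output is not normalized.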

convert-hf-to-gguf.py

gguf-py/gguf/constants.py

llama.cpp

iamlemec · 2024-02-15T17:15:23Z

@ggerganov Should I merge this? Don't know what the proper etiquette is around here.

ggerganov · 2024-02-15T17:18:45Z

Yes, squash merge. I'm not at PC atm

Use correct type of pooling for embedding models

use correct type of pooling for embedding models

ed749b8

s-kostyaev mentioned this pull request Feb 15, 2024

Embedding model support ollama/ollama#327

Closed

ggerganov approved these changes Feb 15, 2024

small typing fix from linter

d2b77cc

cebtenzzre reviewed Feb 15, 2024

convert-hf-to-gguf.py (outdated, resolved)

convert-hf-to-gguf.py (outdated, resolved)

gguf-py/gguf/constants.py (outdated, resolved)

llama.cpp (resolved)

convert script fixes

34aa045

cebtenzzre approved these changes Feb 15, 2024

cebtenzzre mentioned this pull request Feb 15, 2024

DOC: GPT4ALL Chat App Embedding model nomic-ai/gpt4all#1597

Closed

iamlemec merged commit 4524290 into ggerganov:master Feb 15, 2024
44 of 54 checks passed

jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024

Use correct type of pooling for embedding models (ggerganov#5500)

43e83a5

hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024

Use correct type of pooling for embedding models (ggerganov#5500)

1a9df17
