UPSTREAM PR #18658: model-conversion : add detect_pooling script #843

Open
loci-dev wants to merge 1 commit into main from upstream-PR18658-branch_danbev-model-conversion-detect-pooling
loci-dev commented Jan 7, 2026

Mirrored from ggml-org/llama.cpp#18658

This commit adds a Python script to automatically detect the pooling configuration from a sentence-transformers model directory.

The motivation for this change is that I made a mistake when adding the sentence-transformers support: I incorrectly assumed that if an embedding model uses sentence-transformers, it always uses pooling. With the recent addition of support for late interaction models, which can have a down-projection but do not use pooling (like LFM2-ColBert-350M), that assumption no longer holds.

This commit builds upon ggml-org/llama.cpp#18464 which needs to be merged first.

Refs: ggml-org/llama.cpp#18607 (comment)
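
For readers unfamiliar with the layout, here is a minimal sketch of how such detection could work. It is not the script added by this PR. It assumes the usual sentence-transformers directory layout, where `modules.json` lists the pipeline modules and a Pooling module's subdirectory holds a `config.json` with `pooling_mode_*` flags. The function name `detect_pooling` and its return values are hypothetical.

```python
import json
from pathlib import Path


def detect_pooling(model_dir):
    """Return the enabled pooling mode (e.g. "mean_tokens"), "unknown" if a
    Pooling module exists but no mode can be read, or None if the model
    directory lists no Pooling module at all.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    model_path = Path(model_dir)
    modules_file = model_path / "modules.json"
    if not modules_file.is_file():
        return None  # no sentence-transformers module list found

    modules = json.loads(modules_file.read_text(encoding="utf-8"))
    for module in modules:
        # Module types look like "sentence_transformers.models.Pooling".
        if not module.get("type", "").endswith(".Pooling"):
            continue
        config_file = model_path / module.get("path", "") / "config.json"
        if not config_file.is_file():
            return "unknown"
        config = json.loads(config_file.read_text(encoding="utf-8"))
        # Report the first pooling_mode_* flag that is set to true.
        for key, enabled in config.items():
            if key.startswith("pooling_mode_") and enabled is True:
                return key[len("pooling_mode_"):]
        return "unknown"

    # Modules such as Dense (a down-projection) may be present without any
    # Pooling module, which is the late interaction case described above.
    return None
```

With this shape, a caller can tell apart "uses pooling" from "has a Dense down-projection but no pooling" instead of inferring pooling from the mere presence of sentence-transformers files.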

loci-review bot commented Jan 7, 2026

Explore the complete analysis inside the Version Insights

Perfect! I've retrieved the summary report for your project. Here's what the analysis shows:

Summary Report for llama.cpp PR #843

Key Findings:

  • No significant performance changes detected
  • The analysis compared two versions and found no modified functions with performance changes greater than 2% in either response time or throughput time
  • This indicates a neutral performance impact - the changes maintain performance stability

Project Information:

This is a positive result indicating that your code changes don't introduce any performance regressions or unexpected improvements, maintaining consistent performance characteristics across the codebase.

Would you like more detailed information about specific functions or any other aspect of this performance comparison?

@loci-dev loci-dev force-pushed the main branch 26 times, most recently from 7515e5e to 5dbcd6b Compare January 10, 2026 17:07
@loci-dev loci-dev force-pushed the main branch 26 times, most recently from 048ad94 to 6c1fde6 Compare February 3, 2026 13:32
@loci-dev loci-dev force-pushed the main branch 4 times, most recently from d4c3480 to f998d1f Compare February 15, 2026 02:17