UPSTREAM PR #18658: model-conversion : add detect_pooling script by loci-dev · Pull Request #843 · auroralabs-loci/llama.cpp

loci-dev · 2026-01-07T09:40:46Z

This commit adds a Python script to automatically detect the pooling configuration from a sentence-transformers model directory.

The motivation for this change is that I made a mistake when adding the sentence-transformers support: I incorrectly assumed that if an embedding model uses sentence-transformers, it always uses pooling. With the recent addition of support for late interaction models, which can have a down-projection but do not use pooling (like LFM2-ColBert-350M), that assumption no longer holds.

This commit builds upon ggml-org/llama.cpp#18464 which needs to be merged first.

Refs: ggml-org/llama.cpp#18607 (comment)
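
To illustrate the distinction described above, here is a minimal sketch of how pooling could be detected from a model directory. It is not the script added by this PR. It assumes the usual sentence-transformers layout, where `modules.json` lists the pipeline modules and a `Pooling` module keeps its settings in `<path>/config.json` under `pooling_mode_*` boolean keys. The function name `detect_pooling` and its return convention are hypothetical.

```python
import json
from pathlib import Path


def detect_pooling(model_dir):
    """Return the enabled pooling mode keys, or None if no Pooling module is listed.

    Hypothetical helper: assumes the sentence-transformers directory layout
    (modules.json plus a per-module config.json), not the PR's actual script.
    """
    modules_path = Path(model_dir) / "modules.json"
    if not modules_path.is_file():
        # No modules.json: not a sentence-transformers style directory.
        return None

    modules = json.loads(modules_path.read_text(encoding="utf-8"))
    for module in modules:
        # Module types look like "sentence_transformers.models.Pooling".
        if not module.get("type", "").endswith(".Pooling"):
            continue
        config_path = Path(model_dir) / module.get("path", "") / "config.json"
        if not config_path.is_file():
            # A Pooling module is declared but its config is missing.
            return []
        config = json.loads(config_path.read_text(encoding="utf-8"))
        # Keep only the pooling_mode_* flags that are switched on.
        return sorted(
            key for key, value in config.items()
            if key.startswith("pooling_mode_") and value is True
        )

    # Modules exist (for example Transformer + Dense) but none of them pools,
    # which is the late interaction case described above.
    return None
```

Under these assumptions, a late interaction model whose `modules.json` lists only a transformer and a dense down-projection yields `None`, while a model with mean pooling yields `["pooling_mode_mean_tokens"]`.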

loci-review · 2026-01-07T10:27:00Z

Explore the complete analysis inside the Version Insights

Perfect! I've retrieved the summary report for your project. Here's what the analysis shows:

Summary Report for llama.cpp PR #843

Key Findings:

✅ No significant performance changes detected
The analysis compared two versions and found no modified functions with performance changes greater than 2% in either response time or throughput time.
This indicates a neutral performance impact: the changes maintain performance stability.

Project Information:

Repository: llama.cpp (auroralabs-loci)
Pull Request: UPSTREAM PR #18658: model-conversion : add detect_pooling script #843
Project ID: 2621b8c0-b5ce-11f0-b333-453f42058aa1
Report ID: 7956da10-ebad-11f0-81f2-dbb430499cb5

This is a positive result indicating that your code changes don't introduce any performance regressions or unexpected improvements, maintaining consistent performance characteristics across the codebase.

Would you like more detailed information about specific functions or any other aspect of this performance comparison?

loci-dev temporarily deployed to PROD__AL_DEMO January 7, 2026 09:40 — with GitHub Actions Inactive

loci-dev force-pushed the main branch from c6d4b6b to 47ebd13 on January 7, 2026 10:10

loci-dev force-pushed the main branch 26 times, most recently from 7515e5e to 5dbcd6b on January 10, 2026 17:07

loci-dev force-pushed the main branch 26 times, most recently from 048ad94 to 6c1fde6 on February 3, 2026 13:32

loci-dev force-pushed the main branch 4 times, most recently from d4c3480 to f998d1f on February 15, 2026 02:17

loci-dev wants to merge 1 commit into main from upstream-PR18658-branch_danbev-model-conversion-detect-pooling
