UPSTREAM PR #19170: Add Kimi-K2.5 support (#1068)

Open
loci-dev wants to merge 2 commits into main from upstream-PR19170-branch_AesSedai-kimi-k2.5

@loci-dev

Mirrored from ggml-org/llama.cpp#19170

Adding support for https://huggingface.co/moonshotai/Kimi-K2.5

Since this model includes compressed-tensors (INT4 for the conditional experts), I moved the dequant_model to the prepare_tensors call at @compilade's suggestion. The model conversion fails otherwise because the quantization_config is nested under the text_config in the config.json.
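
As a rough illustration of the nesting problem described above, the sketch below looks for `quantization_config` at the top level of a parsed `config.json` and falls back to the copy nested under `text_config`. The helper name `find_quantization_config` and the sample values are assumptions made for this example; this is not the code from the PR or from the llama.cpp conversion script.

```python
from typing import Any, Optional


def find_quantization_config(hparams: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return quantization_config from the top level, else from text_config.

    Hypothetical helper: it only shows the lookup order, not the PR's logic.
    """
    top_level = hparams.get("quantization_config")
    if top_level is not None:
        return top_level
    # Some multimodal configs keep the language-model settings in text_config.
    text_config = hparams.get("text_config") or {}
    return text_config.get("quantization_config")


# Illustrative config shaped like the nesting described above (values are made up).
config = {
    "text_config": {
        "quantization_config": {"quant_method": "compressed-tensors"},
    },
}
quant = find_quantization_config(config)
```

A lookup that only checks the top level would return `None` for this config, which is one way a converter could miss the quantization settings.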

Additionally, this model adds some new keys for the vision tower, prefixed as vt_, and the preprocessor_config.json has the expected fields nested in the media_proc_cfg key.
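
The two layout differences mentioned above can be sketched as follows, assuming both files have already been parsed into dictionaries. The helper names and the sample keys (`vt_hidden_size`, `image_mean`, `image_std`) are illustrative assumptions, not a list of the keys the model actually ships or the code the PR adds.

```python
from typing import Any


def unwrap_preprocessor_config(cfg: dict[str, Any]) -> dict[str, Any]:
    """Lift fields nested under media_proc_cfg to the top level, if present."""
    nested = cfg.get("media_proc_cfg")
    if not isinstance(nested, dict):
        return dict(cfg)
    flat = {k: v for k, v in cfg.items() if k != "media_proc_cfg"}
    flat.update(nested)  # nested values win over same-named top-level ones
    return flat


def collect_vision_tower_keys(hparams: dict[str, Any]) -> dict[str, Any]:
    """Gather keys prefixed with vt_ and return them without the prefix."""
    prefix = "vt_"
    return {k[len(prefix):]: v for k, v in hparams.items() if k.startswith(prefix)}


# Made-up sample data, only to show the shapes involved.
preprocessor = unwrap_preprocessor_config(
    {"media_proc_cfg": {"image_mean": [0.5, 0.5, 0.5], "image_std": [0.5, 0.5, 0.5]}}
)
vision = collect_vision_tower_keys({"vt_hidden_size": 1024, "hidden_size": 4096})
```

Normalizing both dictionaries up front lets the rest of a converter read fields by their usual names without caring which layout a given checkpoint uses.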

This PR does not include the "hacked" Q4_0 changes by @jukofyork, referred to in this comment.

While the mmproj conversion appears to work and the model loads and can decode images, I've got some weird output when using the vision component that leads me to believe there is a conversion issue somewhere or some other missing component. I think I need some review from @ngxson to help get it working correctly.

Add new kimi-k2.5 keys to mtmd convert
Update V_MMPROJ tensor mapping for new mm_projector.proj keys
Update V_M_IMP_NORM for new mm_projector.pre_norm key
@loci-review

loci-review bot commented Jan 29, 2026

Based on the analysis, no functions were identified with meaningful performance changes between the base and target versions. The function_insights_topk tool returned empty results for both response time and throughput time metrics, indicating that the code changes in this version do not introduce measurable performance impacts.

This suggests that the modifications between versions are either:

  • Non-performance-affecting changes (documentation, comments, formatting)
  • Refactoring that maintains equivalent performance characteristics
  • Changes to non-critical code paths with negligible execution time
  • Additions or modifications that were not exercised in the analysis workload

Conclusion: No performance regression or improvement was detected. The changes appear performance-neutral from a static analysis perspective.

See the complete breakdown in Version Insights
Have questions? Tag @loci-dev to ask about this PR.

@loci-dev force-pushed the main branch 25 times, most recently from 96d29ac to dbad616, on January 31, 2026 05:22
@loci-dev force-pushed the main branch 30 times, most recently from 5330dfe to ff4fb1d, on February 2, 2026 01:41