Commit 6d4a32e

vinkal-chudgar authored and CISC committed
model : make minicpm embedding_scale, residual_scale and logit_scale optional with legacy defaults (ggml-org#16273)
* minicpm: make GGUF scaling keys optional with legacy defaults

Older MiniCPM GGUFs do not include the scaling metadata keys (minicpm.embedding_scale, minicpm.residual_scale, minicpm.logit_scale). The loader currently treats these as required, so quantization fails with:

    key not found in model: minicpm.embedding_scale

This change restores backward compatibility by treating these keys as optional in the loader and using the older MiniCPM scaling values:

    embedding_scale = 12.0f
    residual_scale  = 1.4f / sqrt(n_layer)
    logit_scale     = 256.0f / n_embd

When the GGUF provides the keys, their values override the defaults; otherwise the legacy defaults are used. Newer GGUFs that already include these keys are unaffected.

Fixes: ggml-org#16192

Signed-off-by: Vinkal Chudgar <[email protected]>

* Update src/llama-model.cpp

Committed as suggested. Thanks!

Co-authored-by: Sigbjørn Skjæret <[email protected]>

---------

Signed-off-by: Vinkal Chudgar <[email protected]>
Co-authored-by: Sigbjørn Skjæret <[email protected]>
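The legacy defaults named in the commit message can be worked through in isolation. The following is a minimal sketch, not the loader code itself: the struct `minicpm_scales` and the function `legacy_minicpm_scales` are hypothetical names introduced here for illustration, and only the three formulas (including the zero check on `n_embd`) come from the commit.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical container for the three MiniCPM scaling factors
// (not a type from llama.cpp).
struct minicpm_scales {
    float embedding_scale;
    float residual_scale;
    float logit_scale;
};

// Computes the legacy defaults described in the commit message.
// n_layer is assumed to be non-zero, as the commit does not guard it;
// n_embd == 0 falls back to a logit scale of 1.0f, mirroring the diff.
minicpm_scales legacy_minicpm_scales(uint32_t n_layer, uint32_t n_embd) {
    minicpm_scales s;
    s.embedding_scale = 12.0f;
    s.residual_scale  = 1.4f / std::sqrt(float(n_layer));
    s.logit_scale     = n_embd ? 256.0f / float(n_embd) : 1.0f;
    return s;
}
```

For an illustrative model with 16 layers and an embedding width of 512, this gives a residual scale of 1.4 / 4 = 0.35 and a logit scale of 256 / 512 = 0.5, while the embedding scale stays at 12.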
1 parent 807f6f6 commit 6d4a32e

1 file changed: +10 -3 lines changed

src/llama-model.cpp

Lines changed: 10 additions & 3 deletions
@@ -764,10 +764,17 @@ void llama_model::load_hparams(llama_model_loader & ml) {
             } break;
         case LLM_ARCH_MINICPM:
             {
+                // Backward-compatible defaults for older MiniCPM GGUFs
+                hparams.f_embedding_scale = 12.0f;
+                hparams.f_residual_scale  = 1.4f / sqrtf(float(hparams.n_layer));
+                hparams.f_logit_scale     = hparams.n_embd ? (256.0f / float(hparams.n_embd)) : 1.0f;
+
                 ml.get_key(LLM_KV_ATTENTION_LAYERNORM_RMS_EPS, hparams.f_norm_rms_eps);
-                ml.get_key(LLM_KV_EMBEDDING_SCALE, hparams.f_embedding_scale);
-                ml.get_key(LLM_KV_RESIDUAL_SCALE, hparams.f_residual_scale);
-                ml.get_key(LLM_KV_LOGIT_SCALE, hparams.f_logit_scale);
+
+                // Optional KV reads, override defaults if present in newer GGUF exports
+                ml.get_key(LLM_KV_EMBEDDING_SCALE, hparams.f_embedding_scale, /*required=*/false);
+                ml.get_key(LLM_KV_RESIDUAL_SCALE, hparams.f_residual_scale, /*required=*/false);
+                ml.get_key(LLM_KV_LOGIT_SCALE, hparams.f_logit_scale, /*required=*/false);
 
                 // MiniCPM uses rope by default, unlike Granite which uses it as a switch
                 hparams.rope_finetuned = true;
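The change relies on a "set the default, then optionally override" pattern: a read with `required=false` leaves the destination untouched when the key is absent. The sketch below imitates that behavior in a self-contained way. It is an assumption-laden stand-in, not the `llama_model_loader` implementation: `kv_store` and `read_float_key` are hypothetical names, and the metadata is modeled as a plain `std::map`.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for GGUF metadata: a flat map from key name to float.
using kv_store = std::map<std::string, float>;

// Writes the stored value into `result` and returns true when the key exists.
// When the key is missing: throws if `required` is true, otherwise leaves
// `result` untouched and returns false, so a previously assigned default survives.
bool read_float_key(const kv_store & kv, const std::string & key,
                    float & result, bool required = true) {
    auto it = kv.find(key);
    if (it == kv.end()) {
        if (required) {
            throw std::runtime_error("key not found in model: " + key);
        }
        return false;
    }
    result = it->second;
    return true;
}
```

With an empty store, assigning `scale = 12.0f` and then calling `read_float_key(kv, "minicpm.embedding_scale", scale, /*required=*/false)` keeps `scale` at 12; the same call with `required` left at its default throws, which corresponds to the failure mode the commit message describes for older GGUFs.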
