
Add Olmo3 implementation #16015

Merged
CISC merged 6 commits into ggml-org:master from 2015aroras:shanea/olmo3 on Sep 17, 2025

Conversation

2015aroras (Contributor) commented Sep 15, 2025

This PR adds the upcoming Olmo 3. The main architectural differences from Olmo 2 are:

  • Sliding window attention is used for 3 out of 4 layers. RoPE scaling is not applied to sliding window attention layers.

Since the architecture is very similar to Olmo 2's, this PR merges the Olmo 3 changes into the Olmo 2 implementation (similar to vllm-project/vllm#24534). I can create a separate Olmo 3 implementation instead if preferred.
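The layer pattern described above can be sketched as follows. This is a minimal illustration, not the actual llama.cpp code, and the assumption that every fourth layer is the full-attention one is mine — the PR only states the 3-out-of-4 ratio:

```python
# Hypothetical sketch of the Olmo 3 attention pattern described above.
# Assumption: every fourth layer is a full-attention layer; the PR only
# specifies that 3 out of 4 layers use sliding-window attention (SWA).
SWA_PATTERN = 4  # one full-attention layer per group of 4

def is_swa_layer(layer_idx: int) -> bool:
    """True for the 3-of-4 layers that use sliding-window attention."""
    return (layer_idx + 1) % SWA_PATTERN != 0

def applies_rope_scaling(layer_idx: int) -> bool:
    # Per the PR description, RoPE scaling is skipped on SWA layers.
    return not is_swa_layer(layer_idx)

if __name__ == "__main__":
    # Three SWA layers followed by one full-attention layer, repeating.
    print(["swa" if is_swa_layer(i) else "full" for i in range(8)])
```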

@github-actions github-actions bot added the python python script changes label Sep 15, 2025
@2015aroras 2015aroras marked this pull request as ready for review September 15, 2025 20:08
2015aroras (Contributor, Author) commented Sep 15, 2025

I used the model conversion example for testing. I got the following results with bf16 on shanearora/2025-sep-a-base-model, modified to have YaRN RoPE scaling enabled.

📈 METRICS
==============================
MSE (Mean Squared Error):     1.592396e-02
Reference Variance:           6.831117e+00
NMSE:                         2.331092e-03
Max Absolute Error:           0.438750
Mean Absolute Error:          0.116665
NMSE (dB):                    -26.32 dB

🎯 INTERPRETATION
==============================
👍 Good match

📋 GUIDANCE
==============================
👍 GOOD: Your GGML conversion is working well.
   Small differences are likely due to precision/quantization.

📚 NMSE BENCHMARKS
==============================
✅ RESULT: PASS (NMSE = 2.33e-03)

Also, below are the results for allenai/OLMo-2-0425-1B with fp32.

📈 METRICS
==============================
MSE (Mean Squared Error):     1.594746e-03
Reference Variance:           9.219801e+00
NMSE:                         1.729697e-04
Max Absolute Error:           0.168732
Mean Absolute Error:          0.033951
NMSE (dB):                    -37.62 dB

🎯 INTERPRETATION
==============================
👍 Very good match

📋 GUIDANCE
==============================
✅ EXCELLENT: Your GGML conversion is working very well!
   The differences are negligible for practical use.

📚 NMSE BENCHMARKS
==============================
✅ RESULT: PASS (NMSE = 1.73e-04)
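For context, the NMSE in the reports above is the MSE normalized by the reference output's variance, with the dB value as 10·log10(NMSE). Here is a minimal sketch of that relationship — my reconstruction from the reported numbers, not the conversion tool's actual code:

```python
import math

def nmse(mse: float, ref_variance: float) -> tuple[float, float]:
    """Return (NMSE, NMSE in dB) given MSE and reference variance."""
    value = mse / ref_variance
    return value, 10.0 * math.log10(value)

# Numbers from the bf16 Olmo 3 run above:
value, db = nmse(1.592396e-02, 6.831117e+00)
print(f"NMSE = {value:.6e}, {db:.2f} dB")  # matches 2.331092e-03 and -26.32 dB
```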

2015aroras and others added 2 commits September 15, 2025 15:13
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
pwilkin (Member) commented Sep 16, 2025

@2015aroras What tool are you using to compare the conversion?

2015aroras (Contributor, Author) commented

@pwilkin I am using the model conversion tools inside this repo. They were created to help verify that HF to llama.cpp conversion is accurate. The logs above are from the model logits verification step.

pwilkin (Member) commented Sep 16, 2025

Ah, that's nice, I haven't used that specific one yet :)

2015aroras (Contributor, Author) commented Sep 16, 2025

All the check failures seem to be unrelated to this change; before merging master again, an iOS check was failing instead. So IMO this is ready to merge.

CISC (Member) commented Sep 16, 2025

> All the check failures seem to be unrelated to this change. Before merging master again, an ios check was failing instead. So imo ready to merge.

Yes, sorry for the delay, just a minor cosmetic change and we'll merge. :)

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
@CISC CISC merged commit 85286f3 into ggml-org:master Sep 17, 2025
52 checks passed
blime4 referenced this pull request in blime4/llama.cpp Feb 5, 2026
* Add HF to gguf conversion logic for Olmo3

* Add Olmo3 implementation

* Update rope comment

* Fix indentation

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Apply suggestion from @CISC

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

Labels

python python script changes


3 participants