Add support for Microsoft Phi-4 model #674

Open
genert opened this issue Jan 15, 2025 · 2 comments

genert commented Jan 15, 2025

Contact Details

No response

What happened?

llamafile crashes when running Microsoft's Phi-4 model (https://huggingface.co/microsoft/phi-4-gguf):

INFO: Running command line: llamafile -m phi-4-q4.gguf --server --v2 --listen 0.0.0.0:9810 --ctx-size 16384--nologo
Apple Metal GPU support successfully loaded
llama.cpp/llama.cpp:8491: GGML_ASSERT(hparams.n_swa > 0) failed

error: Uncaught SIGABRT (SI_0) on XXXXXX pid 43736 tid 43736
llamafile
 Darwin Cosmopolitan 4.0.2 MODE=aarch64; Darwin Kernel Version 24.0.0: Tue Sep 24 23:39:07 PDT 2024; root:xnu-11215.1.12~1/RELEASE_ARM64_T6000 MacBookPro-2.lan 24.0.0
 cosmoaddr2line llamafile 1854c6600 1853d3c18 10215ab6c 800330f08 8000007dc 8000be8cc 8001a747c 800191f0c 80013225c 8001319d4 80001a4fc 800013730 8000127c8 800004880 800003d4c 800000140
 0000000000000000 x0 35636678f75a1276 x8  0000000000000148 x16 00000000003a0950 x24
 0000000000000000 x1 356366791e99e036 x9  00000001f1e18018 x17 0000000114870b30 x25
 0000000000000008 x2 000000000000000a x10 0000000000000000 x18 0000000000000000 x26
 0000000802bb2000 x3 0000000000000002 x11 0000000000000006 x19 0000000114830010 x27
 0000000000000000 x4 0000000000000080 x12 0000000000000103 x20 0000000802ba7540 x28
 000000000000aad8 x5 0000000000000500 x13 00000001e9c3f320 x21 000000016dca35f0 x29
 0000000000000008 x6 0000000000000500 x14 00000001027aed10 x22 00000001854fef70 x30
 0000000000000021 x7 000000011551d0e0 x15 0000000000000000 x23 000000016dca35d0 x31
 000000016dca35d0 sp 1854c6600 pc NULL-2058850552
 000000016dca35f0 fp 1853d3c18 lr NULL-2059844320
 000000016dca3610 fp 10215ab6c lr g_events+29288564
 000000016dca3620 fp 800330f08 lr raise+48
 000000016dca3630 fp 8000007dc lr abort+48
 000000016dca3650 fp 8000be8cc lr NULL+520660
 000000016dca3670 fp 8001a747c lr llm_build_context::build_inp_KQ_mask_swa(bool)+248
 000000016dca37a0 fp 800191f0c lr llm_build_context::build_phi3()+280
 000000016dca3858 fp 80013225c lr llama_build_graph(llama_context&, llama_batch const&, bool)+828
 000000016dca39f0 fp 8001319d4 lr llama_new_context_with_model+6532
 000000016dca4490 fp 80001a4fc lr lf::server::Slot::start()+584
 000000016dca4600 fp 800013730 lr lf::server::Slots::start(int)+136
 000000016dca4640 fp 8000127c8 lr lf::server::main(int, char**)+356
 000000016dca46e0 fp 800004880 lr main+188
 000000016dca5740 fp 800003d4c lr cosmo+1144
 000000016dca57a0 fp 800000140 lr _start

Version

v0.9.0

What operating system are you seeing the problem on?

Linux

Relevant log output

No response

genert commented Jan 15, 2025

The issue is already resolved upstream in llama.cpp: ggerganov/llama.cpp#10817

BradHutchings commented Jan 17, 2025

Support for Phi-4 needs to be added around line 4665 of llama.cpp/llama.cpp:

                } else if (hparams.n_layer == 40 && hparams.n_ctx_train == 16384) {
                    // default value for Phi-4
                    hparams.n_swa = 16384;
                }
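As a rough standalone sketch of that fallback, the logic can be isolated as below. The struct and function names are hypothetical stand-ins, not llama.cpp's actual definitions, and treating `n_swa == 0` as "the GGUF carried no sliding-window size" is an assumption based on the failed `GGML_ASSERT(hparams.n_swa > 0)` in the crash log.

```cpp
#include <cstdint>

// Hypothetical stand-in for the few hyperparameter fields involved here;
// the real llama.cpp hparams struct has many more members.
struct PhiHParams {
    uint32_t n_layer     = 0;
    uint32_t n_ctx_train = 0;
    uint32_t n_swa       = 0; // assumed: 0 means no sliding-window size was loaded
};

// Apply a default sliding-window size only when the model file did not
// supply one. Only the Phi-4 case from the snippet above is covered.
inline void apply_default_swa(PhiHParams & hp) {
    if (hp.n_swa != 0) {
        return; // the model is configured correctly, leave it alone
    }
    if (hp.n_layer == 40 && hp.n_ctx_train == 16384) {
        // default value for Phi-4
        hp.n_swa = 16384;
    }
}
```

Calling `apply_default_swa` on a struct with `n_layer = 40` and `n_ctx_train = 16384` leaves `n_swa` at 16384, which would satisfy the assertion; any other shape is left untouched.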

However, it looks like hparams gets overwritten after that default is applied. I verified with some debug output that the new branch does run. The Phi-3 cases don't matter in practice because those models are configured correctly, but Phi-4 is not. I had to change its config.json and rebuild the GGUF:

  "sliding_window": 16384,
  "xx_sliding_window": null,

That gave me a working Phi-4 with llamafile. I will post the GGUF and a start script here later today:

https://huggingface.co/bradhutchings/Brads-LLMs

HTH.
