(COMPATIBILITY) [v1.54 Smooth Sampling] - unknown model architecture: 'orion' #638

Closed
SabinStargem opened this issue Jan 25, 2024 · 7 comments

Comments

@SabinStargem

I was trying to use 14b Orion LongChat, but it threw an error. Presumably it is simply a new architecture. Here you go.


Welcome to KoboldCpp - Version 1.54
For command line arguments, please refer to --help


Attempting to use CuBLAS library for faster prompt ingestion. A compatible CuBLAS will be required.
Initializing dynamic library: koboldcpp_cublas.dll

Namespace(model=None, model_param='C:/KoboldCPP/Models/14b Orion LongChat - q6k.gguf', port=5001, port_param=5001, host='', launch=True, lora=None, config=None, threads=31, blasthreads=31, highpriority=False, contextsize=32768, blasbatchsize=512, ropeconfig=[0.0, 10000.0], smartcontext=False, noshift=False, bantokens=None, forceversion=0, nommap=False, usemlock=True, noavx2=False, debugmode=0, skiplauncher=False, hordeconfig=None, noblas=False, useclblast=None, usecublas=['normal', '0', 'mmq'], gpulayers=99, tensor_split=None, onready='', multiuser=1, remotetunnel=False, foreground=False, preloadstory=None, quiet=False, ssl=None)

Loading model: C:\KoboldCPP\Models\14b Orion LongChat - q6k.gguf
[Threads: 31, BlasThreads: 31, SmartContext: False, ContextShift: True]

The reported GGUF Arch is: orion


Identified as LLAMA model: (ver 6)
Attempting to Load...

Using automatic RoPE scaling. If the model has customized RoPE settings, they will be used directly instead!
System Info: AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 0 | VSX = 0 |
ggml_init_cublas: found 1 CUDA devices:
Device 0: NVIDIA GeForce RTX 4090, compute capability 8.9, VMM: yes
llama_model_loader: loaded meta data with 21 key-value pairs and 444 tensors from C:\KoboldCPP\Models\14b Orion LongChat - q6k.gguf
error loading model: unknown model architecture: 'orion'
llama_load_model_from_file: failed to load model
Traceback (most recent call last):
File "koboldcpp.py", line 2519, in
File "koboldcpp.py", line 2366, in main
File "koboldcpp.py", line 310, in load_model
OSError: exception: access violation reading 0x0000000000000064
[28744] Failed to execute script 'koboldcpp' due to unhandled exception!

[process exited with code 1 (0x00000001)]
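
For reference, the architecture string the loader is rejecting can be read straight from the GGUF metadata. A minimal sketch, assuming the gguf Python package that ships with llama.cpp is installed (the field-access details may differ between gguf-py versions):

# Sketch: print the architecture declared in a GGUF file's header metadata.
# The path is the one from the log above; the ReaderField layout is assumed.
from gguf import GGUFReader

reader = GGUFReader("C:/KoboldCPP/Models/14b Orion LongChat - q6k.gguf")
field = reader.fields["general.architecture"]
# For string fields, data[0] indexes the part holding the raw UTF-8 bytes.
arch = bytes(field.parts[field.data[0]]).decode("utf-8")
print(arch)  # expected to print: orion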

@LostRuins
Owner

I don't think the "Orion" architecture is supported, don't see any references to it. Where and how did you get this model?

@Tangweirui2021

I have this problem, too. I converted this model manually following this guide (rough sketch of the conversion step below):
https://github.com/OrionStarAI/Orion?tab=readme-ov-file#45-inference-by-llamacpp
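
If I recall the llama.cpp workflow correctly, the conversion step in that guide boils down to running convert-hf-to-gguf.py on the downloaded HF weights. A hedged sketch only; the paths, output type, and exact script name are assumptions, not taken from the guide:

# Sketch of the conversion step, assuming a llama.cpp checkout that already
# includes the Orion support from PR ggerganov#5118.
import subprocess

model_dir = "Orion-14B-LongChat"            # hypothetical local HF model directory
outfile = "orion-14b-longchat-f16.gguf"     # hypothetical output file

subprocess.run(
    ["python", "convert-hf-to-gguf.py",     # converter script in the llama.cpp repo
     model_dir, "--outfile", outfile, "--outtype", "f16"],
    check=True,  # without the PR this step fails, since 'orion' is unknown to the converter
)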

@LostRuins
Owner

Ah that makes sense. It relies on a pull request ggerganov#5118 that has not yet been merged, so it won't work until that happens.

@Tangweirui2021

Tangweirui2021 commented Jan 26, 2024

You are right. I pulled the PR before converting the model. In fact, the conversion fails without this PR, and the converted model does run once the PR is applied.

@SabinStargem
Author

Here is the GGUF for Orion Longchat 14b.

https://huggingface.co/demonsu/orion-14b-longchat-gguf/tree/main

@LostRuins
Owner

Should be fixed now in v1.57, can you check?

@Tangweirui2021

Yes, it seems to work fine. Thanks for your work!
