[Do Not Merge] model : LFM2.5-Audio-1.5B #18641
tdakhran wants to merge 32 commits into ggml-org:master from tarek/feat/os-lfm2.5-audio-1.5b-upstream
Conversation
Force-pushed from c275436 to e1a8fd1
|
If the string |
Or that. We just have to remember to remove them all from the merge message. :) |
… tarek/feat/os-lfm2.5-audio-1.5b-upstream [no ci]
Change is decoupled from #18641. [LFM2.5-Audio-1.5B](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B) needs a streaming ISTFT for generating output audio.
* add a streaming ISTFT class (`mtmd_audio_streaming_istft`) with overlap-add for audio reconstruction
* replace the global audio cache with a per-instance cache; the model requires two independent caches, one for preprocessing (audio input) and one for ISTFT (audio output)
* unify the templated FFT/IFFT implementation to support both forward and inverse transforms
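For context on the overlap-add step, here is a minimal sketch of streaming ISTFT reconstruction. The class name, normalization, and buffer handling are illustrative only and are not the PR's `mtmd_audio_streaming_istft`; it assumes each incoming frame has already been inverse-transformed and multiplied by the synthesis window.

```cpp
// Minimal streaming overlap-add sketch (illustrative, not the PR's actual class).
// Each pushed frame is assumed to be one inverse-FFT'd, synthesis-windowed STFT column.
#include <cstddef>
#include <vector>

class streaming_overlap_add {
public:
    streaming_overlap_add(std::size_t n_fft, std::size_t hop)
        : n_fft(n_fft), hop(hop), acc(n_fft, 0.0f), win_sum(n_fft, 0.0f) {}

    // Feed one windowed time-domain frame of length n_fft; returns the `hop`
    // samples that are now fully overlapped and safe to emit.
    std::vector<float> push_frame(const std::vector<float> & frame,
                                  const std::vector<float> & window) {
        for (std::size_t i = 0; i < n_fft; ++i) {
            acc[i]     += frame[i];
            win_sum[i] += window[i] * window[i]; // squared-window sum for COLA normalization
        }
        std::vector<float> out(hop);
        for (std::size_t i = 0; i < hop; ++i) {
            out[i] = win_sum[i] > 1e-8f ? acc[i] / win_sum[i] : acc[i];
        }
        // Shift both accumulators left by `hop` and zero-fill the tail.
        acc.erase(acc.begin(), acc.begin() + hop);
        acc.resize(n_fft, 0.0f);
        win_sum.erase(win_sum.begin(), win_sum.begin() + hop);
        win_sum.resize(n_fft, 0.0f);
        return out;
    }

private:
    std::size_t n_fft, hop;
    std::vector<float> acc;     // overlap-add accumulator
    std::vector<float> win_sum; // normalization accumulator
};
```

A streaming consumer would call `push_frame` once per generated spectrogram column and append the returned samples to the output audio.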
|
Hello Tarek, I am trying to build your WIP PR. With the last commit, 'Read n_layer from gguf', and with LTO enabled, the build fails at the very end: llama-server and llama-liquid-audio-server are successfully built, but the CLI fails. If there is anything I can do to help with testing, let me know. Thank you so much. |
|
@elfarolab, the mentioned commit didn't change anything related to compilation or LTO; could it be that there are stale object files somewhere? Tested that the clean build in
UPD: it's related to the miniaudio implementation define in the CLI here https://github.com/ggml-org/llama.cpp/pull/18641/changes#diff-73f13371b37801825dc2cdbfacadf9af40aef9dca4770d9dacbbe3534c7a7dacR13 ; another implementation is defined in mtmd audio. Try commenting out this line. |
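For readers unfamiliar with this failure mode, here is a sketch of the single-header pattern involved; the file names are hypothetical, not the PR's actual layout. miniaudio, like other single-header libraries, must have its implementation macro defined in exactly one translation unit; defining it twice emits the function bodies twice and the final (LTO) link fails with duplicate-symbol errors.

```cpp
// Illustrative sketch (file names are hypothetical, not the PR's actual layout).
//
// audio_impl.cpp -- exactly one translation unit provides the implementation:
#define MINIAUDIO_IMPLEMENTATION
#include "miniaudio.h"

// every_other_file.cpp -- all other users include the header only:
// #include "miniaudio.h"
//
// If a second file also defines MINIAUDIO_IMPLEMENTATION (here, one in the CLI
// and one in mtmd's audio code), both object files contain the miniaudio
// function bodies and the link step reports duplicate symbols.
```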
|
Before building, I delete the build destination directory every time. I always build llama.cpp the same way with the options above and never get failures. |
|
@elfarolab, it should work now; there were two implementations of miniaudio. |
rebuilding |
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
…5-audio-1.5b-upstream
|
Hello Tarek, I hope you are doing well. I was testing the server a bit further today, with the updated models of 2 weeks ago. Running on nVidia AGX Orin dev kit. Thank you so much. Loading and crash log:ggml_cuda_init: found 1 CUDA devices: Device 0: Orin, compute capability 8.7, VMM: no build: 1 (4a2f68a) with GNU 11.4.0 for Linux aarch64 Loading model common_init_result: fitting params to device memory, for bugs during this step try to reproduce them with -fit off, or provide --verbose logs if the bug only occurs with -fit on llama_params_fit_impl: projected to use 2416 MiB of device memory vs. 8291 MiB of free device memory llama_params_fit_impl: will leave 5874 >= 1024 MiB of free device memory, no changes needed llama_params_fit: successfully fit params to free device memory llama_params_fit: fitting params to free memory took 0.23 seconds llama_model_load_from_file_impl: using device CUDA0 (Orin) (0000:00:00.0) - 8293 MiB free llama_model_loader: loaded meta data with 38 key-value pairs and 148 tensors from /opt/usbhd/models/LFM2.5-Audio-1.5B-GGUF/LFM2.5-Audio-1.5B-F16.gguf (version GGUF V3 (latest)) llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output. llama_model_loader: - kv 0: general.architecture str = lfm2 llama_model_loader: - kv 1: general.type str = model llama_model_loader: - kv 2: general.name str = LFM2.5 Audio 1.5B llama_model_loader: - kv 3: general.basename str = LFM2.5-Audio llama_model_loader: - kv 4: general.size_label str = 1.5B llama_model_loader: - kv 5: general.license str = other llama_model_loader: - kv 6: general.license.name str = lfm1.0 llama_model_loader: - kv 7: general.license.link str = LICENSE llama_model_loader: - kv 8: general.base_model.count u32 = 1 llama_model_loader: - kv 9: general.base_model.0.name str = LFM2 1.2B llama_model_loader: - kv 10: general.base_model.0.organization str = LiquidAI llama_model_loader: - kv 11: general.base_model.0.repo_url str = https://huggingface.co/LiquidAI/LFM2-... llama_model_loader: - kv 12: general.tags arr[str,7] = ["liquid", "lfm2", "audio", "lfm2-aud... llama_model_loader: - kv 13: general.languages arr[str,1] = ["en"] llama_model_loader: - kv 14: lfm2.block_count u32 = 16 llama_model_loader: - kv 15: lfm2.context_length u32 = 128000 llama_model_loader: - kv 16: lfm2.embedding_length u32 = 2048 llama_model_loader: - kv 17: lfm2.feed_forward_length u32 = 8192 llama_model_loader: - kv 18: lfm2.attention.head_count u32 = 32 llama_model_loader: - kv 19: lfm2.attention.head_count_kv arr[i32,16] = [0, 0, 8, 0, 0, 8, 0, 0, 8, 0, 8, 0, ... llama_model_loader: - kv 20: lfm2.rope.freq_base f32 = 1000000.000000 llama_model_loader: - kv 21: lfm2.attention.layer_norm_rms_epsilon f32 = 0.000010 llama_model_loader: - kv 22: general.file_type u32 = 1 llama_model_loader: - kv 23: lfm2.vocab_size u32 = 65536 llama_model_loader: - kv 24: lfm2.shortconv.l_cache u32 = 3 llama_model_loader: - kv 25: general.quantization_version u32 = 2 llama_model_loader: - kv 26: tokenizer.ggml.model str = gpt2 llama_model_loader: - kv 27: tokenizer.ggml.pre str = lfm2 llama_model_loader: - kv 28: tokenizer.ggml.tokens arr[str,65536] = ["<|pad|>", "<|startoftext|>", "<|end... llama_model_loader: - kv 29: tokenizer.ggml.token_type arr[i32,65536] = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, ... llama_model_loader: - kv 30: tokenizer.ggml.merges arr[str,63683] = ["Ċ Ċ", "Ċ ĊĊ", "ĊĊ Ċ", "Ċ �... 
llama_model_loader: - kv 31: tokenizer.ggml.bos_token_id u32 = 1 llama_model_loader: - kv 32: tokenizer.ggml.eos_token_id u32 = 7 llama_model_loader: - kv 33: tokenizer.ggml.padding_token_id u32 = 0 llama_model_loader: - kv 34: tokenizer.ggml.add_bos_token bool = true llama_model_loader: - kv 35: tokenizer.ggml.add_sep_token bool = false llama_model_loader: - kv 36: tokenizer.ggml.add_eos_token bool = false llama_model_loader: - kv 37: tokenizer.chat_template str = {{- bos_token -}}{%- set system_promp... llama_model_loader: - type f32: 55 tensors llama_model_loader: - type f16: 93 tensors print_info: file format = GGUF V3 (latest) print_info: file type = F16 print_info: file size = 2.18 GiB (16.00 BPW) load: 0 unused tokens load: printing all EOG tokens: load: - 2 ('<|endoftext|>') load: - 7 ('<|im_end|>') load: special tokens cache size = 507 load: token to piece cache size = 0.3756 MB print_info: arch = lfm2 print_info: vocab_only = 0 print_info: no_alloc = 0 print_info: n_ctx_train = 128000 print_info: n_embd = 2048 print_info: n_embd_inp = 2048 print_info: n_layer = 16 print_info: n_head = 32 print_info: n_head_kv = [0, 0, 8, 0, 0, 8, 0, 0, 8, 0, 8, 0, 8, 0, 8, 0] print_info: n_rot = 64 print_info: n_swa = 0 print_info: is_swa_any = 0 print_info: n_embd_head_k = 64 print_info: n_embd_head_v = 64 print_info: n_gqa = [0, 0, 4, 0, 0, 4, 0, 0, 4, 0, 4, 0, 4, 0, 4, 0] print_info: n_embd_k_gqa = [0, 0, 512, 0, 0, 512, 0, 0, 512, 0, 512, 0, 512, 0, 512, 0] print_info: n_embd_v_gqa = [0, 0, 512, 0, 0, 512, 0, 0, 512, 0, 512, 0, 512, 0, 512, 0] print_info: f_norm_eps = 0.0e+00 print_info: f_norm_rms_eps = 1.0e-05 print_info: f_clamp_kqv = 0.0e+00 print_info: f_max_alibi_bias = 0.0e+00 print_info: f_logit_scale = 0.0e+00 print_info: f_attn_scale = 0.0e+00 print_info: n_ff = 8192 print_info: n_expert = 0 print_info: n_expert_used = 0 print_info: n_expert_groups = 0 print_info: n_group_used = 0 print_info: causal attn = 1 print_info: pooling type = 0 print_info: rope type = 2 print_info: rope scaling = linear print_info: freq_base_train = 1000000.0 print_info: freq_scale_train = 1 print_info: n_ctx_orig_yarn = 128000 print_info: rope_yarn_log_mul= 0.0000 print_info: rope_finetuned = unknown print_info: model type = 1.2B print_info: model params = 1.17 B print_info: general.name = LFM2.5 Audio 1.5B print_info: vocab type = BPE print_info: n_vocab = 65536 print_info: n_merges = 63683 print_info: BOS token = 1 '<|startoftext|>' print_info: EOS token = 7 '<|im_end|>' print_info: EOT token = 2 '<|endoftext|>' print_info: PAD token = 0 '<|pad|>' print_info: LF token = 708 'Ċ' print_info: EOG token = 2 '<|endoftext|>' print_info: EOG token = 7 '<|im_end|>' print_info: max token length = 30 load_tensors: loading model tensors, this can take a while... (mmap = false) load_tensors: offloading output layer to GPU load_tensors: offloading 15 repeating layers to GPU load_tensors: offloaded 17/17 layers to GPU load_tensors: CUDA0 model buffer size = 2232.50 MiB load_tensors: CUDA_Host model buffer size = 256.00 MiB .................................................................. 
common_init_result: added <|endoftext|> logit bias = -inf common_init_result: added <|im_end|> logit bias = -inf llama_context: constructing llama_context llama_context: n_seq_max = 1 llama_context: n_ctx = 4096 llama_context: n_ctx_seq = 4096 llama_context: n_batch = 2048 llama_context: n_ubatch = 512 llama_context: causal_attn = 1 llama_context: flash_attn = auto llama_context: kv_unified = false llama_context: freq_base = 1000000.0 llama_context: freq_scale = 1 llama_context: n_ctx_seq (4096) < n_ctx_train (128000) -- the full capacity of the model will not be utilized llama_context: CUDA_Host output buffer size = 0.25 MiB llama_kv_cache: CUDA0 KV buffer size = 48.00 MiB llama_kv_cache: size = 48.00 MiB ( 4096 cells, 6 layers, 1/1 seqs), K (f16): 24.00 MiB, V (f16): 24.00 MiB llama_memory_recurrent: CUDA0 RS buffer size = 0.16 MiB llama_memory_recurrent: size = 0.16 MiB ( 1 cells, 16 layers, 1 seqs), R (f32): 0.16 MiB, S (f32): 0.00 MiB llama_context: Flash Attention was auto, set to enabled llama_context: CUDA0 compute buffer size = 136.00 MiB llama_context: CUDA_Host compute buffer size = 12.01 MiB llama_context: graph nodes = 549 llama_context: graph splits = 2 common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable) common_init_result: fitting params to device memory, for bugs during this step try to reproduce them with -fit off, or provide --verbose logs if the bug only occurs with -fit on llama_params_fit_impl: projected to use 268 MiB of device memory vs. 8241 MiB of free device memory llama_params_fit_impl: will leave 7972 >= 1024 MiB of free device memory, no changes needed llama_params_fit: successfully fit params to free device memory llama_params_fit: fitting params to free memory took 0.17 seconds llama_model_load_from_file_impl: using device CUDA0 (Orin) (0000:00:00.0) - 8241 MiB free llama_model_loader: loaded meta data with 29 key-value pairs and 77 tensors from /opt/usbhd/models/LFM2.5-Audio-1.5B-GGUF/tokenizer-LFM2.5-Audio-1.5B-F16.gguf (version GGUF V3 (latest)) llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output. llama_model_loader: - kv 0: general.architecture str = lfm2 llama_model_loader: - kv 1: general.type str = model llama_model_loader: - kv 2: general.name str = Audio_Detokenizer llama_model_loader: - kv 3: general.size_label str = 70M llama_model_loader: - kv 4: lfm2.block_count u32 = 8 llama_model_loader: - kv 5: lfm2.context_length u32 = 128000 llama_model_loader: - kv 6: lfm2.embedding_length u32 = 512 llama_model_loader: - kv 7: lfm2.feed_forward_length u32 = 2304 llama_model_loader: - kv 8: lfm2.attention.head_count u32 = 16 llama_model_loader: - kv 9: lfm2.attention.head_count_kv arr[i32,8] = [0, 0, 8, 0, 8, 0, 8, 0] llama_model_loader: - kv 10: lfm2.rope.freq_base f32 = 1000000.000000 llama_model_loader: - kv 11: lfm2.attention.layer_norm_rms_epsilon f32 = 0.000010 llama_model_loader: - kv 12: general.file_type u32 = 1 llama_model_loader: - kv 13: lfm2.vocab_size u32 = 65536 llama_model_loader: - kv 14: lfm2.shortconv.l_cache u32 = 3 llama_model_loader: - kv 15: lfm2.attention.sliding_window u32 = 30 llama_model_loader: - kv 16: lfm2.embedding_length_out u32 = 1282 llama_model_loader: - kv 17: general.quantization_version u32 = 2 llama_model_loader: - kv 18: tokenizer.ggml.model str = gpt2 llama_model_loader: - kv 19: tokenizer.ggml.pre str = lfm2 llama_model_loader: - kv 20: tokenizer.ggml.tokens arr[str,65536] = ["<|pad|>", "<|startoftext|>", "<|end... 
llama_model_loader: - kv 21: tokenizer.ggml.token_type arr[i32,65536] = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, ... llama_model_loader: - kv 22: tokenizer.ggml.merges arr[str,63683] = ["Ċ Ċ", "Ċ ĊĊ", "ĊĊ Ċ", "Ċ �... llama_model_loader: - kv 23: tokenizer.ggml.bos_token_id u32 = 1 llama_model_loader: - kv 24: tokenizer.ggml.eos_token_id u32 = 7 llama_model_loader: - kv 25: tokenizer.ggml.padding_token_id u32 = 0 llama_model_loader: - kv 26: tokenizer.ggml.add_bos_token bool = true llama_model_loader: - kv 27: tokenizer.ggml.add_sep_token bool = false llama_model_loader: - kv 28: tokenizer.ggml.add_eos_token bool = false llama_model_loader: - type f32: 29 tensors llama_model_loader: - type f16: 48 tensors print_info: file format = GGUF V3 (latest) print_info: file type = F16 print_info: file size = 133.82 MiB (16.00 BPW) load: 0 unused tokens load: printing all EOG tokens: load: - 2 ('<|endoftext|>') load: - 7 ('<|im_end|>') load: special tokens cache size = 507 load: token to piece cache size = 0.3756 MB print_info: arch = lfm2 print_info: vocab_only = 0 print_info: no_alloc = 0 print_info: n_ctx_train = 128000 print_info: n_embd = 512 print_info: n_embd_inp = 512 print_info: n_layer = 8 print_info: n_head = 16 print_info: n_head_kv = [0, 0, 8, 0, 8, 0, 8, 0] print_info: n_rot = 32 print_info: n_swa = 30 print_info: is_swa_any = 1 print_info: n_embd_head_k = 32 print_info: n_embd_head_v = 32 print_info: n_gqa = [0, 0, 2, 0, 2, 0, 2, 0] print_info: n_embd_k_gqa = [0, 0, 256, 0, 256, 0, 256, 0] print_info: n_embd_v_gqa = [0, 0, 256, 0, 256, 0, 256, 0] print_info: f_norm_eps = 0.0e+00 print_info: f_norm_rms_eps = 1.0e-05 print_info: f_clamp_kqv = 0.0e+00 print_info: f_max_alibi_bias = 0.0e+00 print_info: f_logit_scale = 0.0e+00 print_info: f_attn_scale = 0.0e+00 print_info: n_ff = 2304 print_info: n_expert = 0 print_info: n_expert_used = 0 print_info: n_expert_groups = 0 print_info: n_group_used = 0 print_info: causal attn = 1 print_info: pooling type = 0 print_info: rope type = 2 print_info: rope scaling = linear print_info: freq_base_train = 1000000.0 print_info: freq_scale_train = 1 print_info: freq_base_swa = 10000.0 print_info: freq_scale_swa = 1 print_info: n_ctx_orig_yarn = 128000 print_info: rope_yarn_log_mul= 0.0000 print_info: rope_finetuned = unknown print_info: model type = ?B print_info: model params = 70.14 M print_info: general.name = Audio_Detokenizer print_info: vocab type = BPE print_info: n_vocab = 65536 print_info: n_merges = 63683 print_info: BOS token = 1 '<|startoftext|>' print_info: EOS token = 7 '<|im_end|>' print_info: EOT token = 2 '<|endoftext|>' print_info: PAD token = 0 '<|pad|>' print_info: LF token = 708 'Ċ' print_info: EOG token = 2 '<|endoftext|>' print_info: EOG token = 7 '<|im_end|>' print_info: max token length = 30 load_tensors: loading model tensors, this can take a while... (mmap = false) load_tensors: offloading output layer to GPU load_tensors: offloading 7 repeating layers to GPU load_tensors: offloaded 9/9 layers to GPU load_tensors: CUDA0 model buffer size = 133.82 MiB load_tensors: CUDA_Host model buffer size = 64.00 MiB .................................. 
common_init_result: added <|endoftext|> logit bias = -inf common_init_result: added <|im_end|> logit bias = -inf llama_context: constructing llama_context llama_context: n_seq_max = 1 llama_context: n_ctx = 4096 llama_context: n_ctx_seq = 4096 llama_context: n_batch = 2048 llama_context: n_ubatch = 512 llama_context: causal_attn = 1 llama_context: flash_attn = auto llama_context: kv_unified = false llama_context: freq_base = 1000000.0 llama_context: freq_scale = 1 llama_context: n_ctx_seq (4096) < n_ctx_train (128000) -- the full capacity of the model will not be utilized llama_context: CUDA_Host output buffer size = 0.25 MiB llama_kv_cache_iswa: creating non-SWA KV cache, size = 4096 cells llama_kv_cache: size = 0.00 MiB ( 4096 cells, 0 layers, 1/1 seqs), K (f16): 0.00 MiB, V (f16): 0.00 MiB llama_kv_cache_iswa: creating SWA KV cache, size = 768 cells llama_kv_cache: CUDA0 KV buffer size = 2.25 MiB llama_kv_cache: size = 2.25 MiB ( 768 cells, 3 layers, 1/1 seqs), K (f16): 1.12 MiB, V (f16): 1.12 MiB llama_memory_recurrent: CUDA0 RS buffer size = 0.02 MiB llama_memory_recurrent: size = 0.02 MiB ( 1 cells, 8 layers, 1 seqs), R (f32): 0.02 MiB, S (f32): 0.00 MiB llama_context: layer 2 is assigned to device CUDA0 but the Flash Attention tensor is assigned to device CPU (usually due to missing support) llama_context: Flash Attention was auto, set to disabled llama_context: CUDA0 compute buffer size = 132.50 MiB llama_context: CUDA_Host compute buffer size = 2.51 MiB llama_context: graph nodes = 295 llama_context: graph splits = 2 common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable) common_chat_params_init_lfm2: Using content relying on the template init: chat template example: <|im_start|>system You are a helpful assistant<|im_end|> <|im_start|>user Hello<|im_end|> <|im_start|>assistant Hi there<|im_end|> <|im_start|>user How are you?<|im_end|> <|im_start|>assistantclip_model_loader: model name: LFM2.5 Audio 1.5B clip_model_loader: has audio encoder --- audio hparams --- load_hparams: model size: 437.52 MiB audio_tokens->n_tokens = 26 encoding audio slice... What's your name? llama_perf_context_print: load time = 28150.55 ms llama_perf_context_print: load time = 28150.55 ms audio_tokens->n_tokens = 36 encoding audio slice... Please tell me a joke. llama_perf_context_print: load time = 28150.55 ms |
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
… into tarek/feat/os-lfm2.5-audio-1.5b-upstream [no ci]
Force-pushed from c872bec to d64c8d4
|
@ngxson, I tried porting the changes from |
|
Building this PR again, with ccache cleaned; the repo dir includes the last commits of today. On Jetson AGX Orin.
Build options: some are obvious, and some are not used during this build, because I am using a universal, automatically configurable build script.
The build error:
If you need anything else to understand the error, I will stay available. |
Force-pushed from 309960e to 9960b91
|
@elfarolab, thanks for reporting; the CLI should work now. |
rebuilding.. |
…2.5-audio-1.5b-upstream
|
Built successfully. |
|
After a successful ASR turn, at the TTS turn llama-liquid-audio-server is not sending audio chunks anymore. The server receives the text to convert to audio but then appears to stop working. Command: ./llama-liquid-audio-server --port 8086 --no-direct-io --no-mmap -m /opt/usbhd/models/LFM2.5-Audio-1.5B-GGUF/LFM2.5-Audio-1.5B-F16.gguf -mm /opt/usbhd/models/LFM2.5-Audio-1.5B-GGUF/mmproj-LFM2.5-Audio-1.5B-F16.gguf -mv /opt/usbhd/models/LFM2.5-Audio-1.5B-GGUF/vocoder-LFM2.5-Audio-1.5B-F16.gguf --tts-speaker-file /opt/usbhd/models/LFM2.5-Audio-1.5B-GGUF/tokenizer-LFM2.5-Audio-1.5B-F16.gguf |
|
@elfarolab, the sequence below works for me. Did you update the inference example scripts? They changed a bit. |
|
I found the error, it was at my side. Please tell me if you need any other information to debug this issue, I would like to help. CommandMy code log outputLong debug llama-liquid-audio-server output``` clip_model_loader: tensor[80]: n_dims = 2, name = a.blk.1.attn_v.weight, tensor_size=524288, offset=239890432, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[81]: n_dims = 2, name = a.blk.1.pos_bias_u, tensor_size=2048, offset=240414720, shape:[64, 8, 1, 1], type = f32 clip_model_loader: tensor[82]: n_dims = 2, name = a.blk.1.pos_bias_v, tensor_size=2048, offset=240416768, shape:[64, 8, 1, 1], type = f32 clip_model_loader: tensor[83]: n_dims = 1, name = a.blk.10.conv_norm.weight, tensor_size=2048, offset=240418816, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[84]: n_dims = 1, name = a.blk.10.conv_norm.bias, tensor_size=2048, offset=240420864, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[85]: n_dims = 1, name = a.blk.10.conv_dw.bias, tensor_size=2048, offset=240422912, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[86]: n_dims = 2, name = a.blk.10.conv_dw.weight, tensor_size=18432, offset=240424960, shape:[9, 512, 1, 1], type = f32 clip_model_loader: tensor[87]: n_dims = 1, name = a.blk.10.conv_pw1.bias, tensor_size=4096, offset=240443392, shape:[1024, 1, 1, 1], type = f32 clip_model_loader: tensor[88]: n_dims = 2, name = a.blk.10.conv_pw1.weight, tensor_size=2097152, offset=240447488, shape:[512, 1024, 1, 1], type = f32 clip_model_loader: tensor[89]: n_dims = 1, name = a.blk.10.conv_pw2.bias, tensor_size=2048, offset=242544640, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[90]: n_dims = 2, name = a.blk.10.conv_pw2.weight, tensor_size=1048576, offset=242546688, shape:[512, 512, 1, 1], type = f32 clip_model_loader: tensor[91]: n_dims = 1, name = a.blk.10.ffn_up.bias, tensor_size=8192, offset=243595264, shape:[2048, 1, 1, 1], type = f32 clip_model_loader: tensor[92]: n_dims = 2, name = a.blk.10.ffn_up.weight, tensor_size=2097152, offset=243603456, shape:[512, 2048, 1, 1], type = f16 clip_model_loader: tensor[93]: n_dims = 1, name = a.blk.10.ffn_down.bias, tensor_size=2048, offset=245700608, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[94]: n_dims = 2, name = a.blk.10.ffn_down.weight, tensor_size=2097152, offset=245702656, shape:[2048, 512, 1, 1], type = f16 clip_model_loader: tensor[95]: n_dims = 1, name = a.blk.10.ffn_up_1.bias, tensor_size=8192, offset=247799808, shape:[2048, 1, 1, 1], type = f32 clip_model_loader: tensor[96]: n_dims = 2, name = a.blk.10.ffn_up_1.weight, tensor_size=2097152, offset=247808000, shape:[512, 2048, 1, 1], type = f16 clip_model_loader: tensor[97]: n_dims = 1, name = a.blk.10.ffn_down_1.bias, tensor_size=2048, offset=249905152, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[98]: n_dims = 2, name = a.blk.10.ffn_down_1.weight, tensor_size=2097152, offset=249907200, shape:[2048, 512, 1, 1], type = f16 clip_model_loader: tensor[99]: n_dims = 1, name = a.blk.10.norm_conv.bias, tensor_size=2048, offset=252004352, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[100]: n_dims = 1, name = a.blk.10.norm_conv.weight, tensor_size=2048, offset=252006400, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[101]: n_dims = 1, name = a.blk.10.ffn_norm.bias, tensor_size=2048, offset=252008448, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[102]: n_dims = 1, name = a.blk.10.ffn_norm.weight, tensor_size=2048, offset=252010496, shape:[512, 1, 1, 1], type = f32 
clip_model_loader: tensor[103]: n_dims = 1, name = a.blk.10.ffn_norm_1.bias, tensor_size=2048, offset=252012544, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[104]: n_dims = 1, name = a.blk.10.ffn_norm_1.weight, tensor_size=2048, offset=252014592, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[105]: n_dims = 1, name = a.blk.10.ln2.bias, tensor_size=2048, offset=252016640, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[106]: n_dims = 1, name = a.blk.10.ln2.weight, tensor_size=2048, offset=252018688, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[107]: n_dims = 1, name = a.blk.10.ln1.bias, tensor_size=2048, offset=252020736, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[108]: n_dims = 1, name = a.blk.10.ln1.weight, tensor_size=2048, offset=252022784, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[109]: n_dims = 1, name = a.blk.10.attn_k.bias, tensor_size=2048, offset=252024832, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[110]: n_dims = 2, name = a.blk.10.attn_k.weight, tensor_size=524288, offset=252026880, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[111]: n_dims = 1, name = a.blk.10.attn_out.bias, tensor_size=2048, offset=252551168, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[112]: n_dims = 2, name = a.blk.10.attn_out.weight, tensor_size=524288, offset=252553216, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[113]: n_dims = 2, name = a.blk.10.linear_pos.weight, tensor_size=524288, offset=253077504, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[114]: n_dims = 1, name = a.blk.10.attn_q.bias, tensor_size=2048, offset=253601792, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[115]: n_dims = 2, name = a.blk.10.attn_q.weight, tensor_size=524288, offset=253603840, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[116]: n_dims = 1, name = a.blk.10.attn_v.bias, tensor_size=2048, offset=254128128, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[117]: n_dims = 2, name = a.blk.10.attn_v.weight, tensor_size=524288, offset=254130176, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[118]: n_dims = 2, name = a.blk.10.pos_bias_u, tensor_size=2048, offset=254654464, shape:[64, 8, 1, 1], type = f32 clip_model_loader: tensor[119]: n_dims = 2, name = a.blk.10.pos_bias_v, tensor_size=2048, offset=254656512, shape:[64, 8, 1, 1], type = f32 clip_model_loader: tensor[120]: n_dims = 1, name = a.blk.11.conv_norm.weight, tensor_size=2048, offset=254658560, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[121]: n_dims = 1, name = a.blk.11.conv_norm.bias, tensor_size=2048, offset=254660608, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[122]: n_dims = 1, name = a.blk.11.conv_dw.bias, tensor_size=2048, offset=254662656, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[123]: n_dims = 2, name = a.blk.11.conv_dw.weight, tensor_size=18432, offset=254664704, shape:[9, 512, 1, 1], type = f32 clip_model_loader: tensor[124]: n_dims = 1, name = a.blk.11.conv_pw1.bias, tensor_size=4096, offset=254683136, shape:[1024, 1, 1, 1], type = f32 clip_model_loader: tensor[125]: n_dims = 2, name = a.blk.11.conv_pw1.weight, tensor_size=2097152, offset=254687232, shape:[512, 1024, 1, 1], type = f32 clip_model_loader: tensor[126]: n_dims = 1, name = a.blk.11.conv_pw2.bias, tensor_size=2048, offset=256784384, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[127]: n_dims = 2, name = a.blk.11.conv_pw2.weight, 
tensor_size=1048576, offset=256786432, shape:[512, 512, 1, 1], type = f32 clip_model_loader: tensor[128]: n_dims = 1, name = a.blk.11.ffn_up.bias, tensor_size=8192, offset=257835008, shape:[2048, 1, 1, 1], type = f32 clip_model_loader: tensor[129]: n_dims = 2, name = a.blk.11.ffn_up.weight, tensor_size=2097152, offset=257843200, shape:[512, 2048, 1, 1], type = f16 clip_model_loader: tensor[130]: n_dims = 1, name = a.blk.11.ffn_down.bias, tensor_size=2048, offset=259940352, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[131]: n_dims = 2, name = a.blk.11.ffn_down.weight, tensor_size=2097152, offset=259942400, shape:[2048, 512, 1, 1], type = f16 clip_model_loader: tensor[132]: n_dims = 1, name = a.blk.11.ffn_up_1.bias, tensor_size=8192, offset=262039552, shape:[2048, 1, 1, 1], type = f32 clip_model_loader: tensor[133]: n_dims = 2, name = a.blk.11.ffn_up_1.weight, tensor_size=2097152, offset=262047744, shape:[512, 2048, 1, 1], type = f16 clip_model_loader: tensor[134]: n_dims = 1, name = a.blk.11.ffn_down_1.bias, tensor_size=2048, offset=264144896, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[135]: n_dims = 2, name = a.blk.11.ffn_down_1.weight, tensor_size=2097152, offset=264146944, shape:[2048, 512, 1, 1], type = f16 clip_model_loader: tensor[136]: n_dims = 1, name = a.blk.11.norm_conv.bias, tensor_size=2048, offset=266244096, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[137]: n_dims = 1, name = a.blk.11.norm_conv.weight, tensor_size=2048, offset=266246144, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[138]: n_dims = 1, name = a.blk.11.ffn_norm.bias, tensor_size=2048, offset=266248192, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[139]: n_dims = 1, name = a.blk.11.ffn_norm.weight, tensor_size=2048, offset=266250240, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[140]: n_dims = 1, name = a.blk.11.ffn_norm_1.bias, tensor_size=2048, offset=266252288, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[141]: n_dims = 1, name = a.blk.11.ffn_norm_1.weight, tensor_size=2048, offset=266254336, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[142]: n_dims = 1, name = a.blk.11.ln2.bias, tensor_size=2048, offset=266256384, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[143]: n_dims = 1, name = a.blk.11.ln2.weight, tensor_size=2048, offset=266258432, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[144]: n_dims = 1, name = a.blk.11.ln1.bias, tensor_size=2048, offset=266260480, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[145]: n_dims = 1, name = a.blk.11.ln1.weight, tensor_size=2048, offset=266262528, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[146]: n_dims = 1, name = a.blk.11.attn_k.bias, tensor_size=2048, offset=266264576, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[147]: n_dims = 2, name = a.blk.11.attn_k.weight, tensor_size=524288, offset=266266624, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[148]: n_dims = 1, name = a.blk.11.attn_out.bias, tensor_size=2048, offset=266790912, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[149]: n_dims = 2, name = a.blk.11.attn_out.weight, tensor_size=524288, offset=266792960, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[150]: n_dims = 2, name = a.blk.11.linear_pos.weight, tensor_size=524288, offset=267317248, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[151]: n_dims = 1, name = a.blk.11.attn_q.bias, tensor_size=2048, offset=267841536, shape:[512, 
1, 1, 1], type = f32 clip_model_loader: tensor[152]: n_dims = 2, name = a.blk.11.attn_q.weight, tensor_size=524288, offset=267843584, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[153]: n_dims = 1, name = a.blk.11.attn_v.bias, tensor_size=2048, offset=268367872, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[154]: n_dims = 2, name = a.blk.11.attn_v.weight, tensor_size=524288, offset=268369920, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[155]: n_dims = 2, name = a.blk.11.pos_bias_u, tensor_size=2048, offset=268894208, shape:[64, 8, 1, 1], type = f32 clip_model_loader: tensor[156]: n_dims = 2, name = a.blk.11.pos_bias_v, tensor_size=2048, offset=268896256, shape:[64, 8, 1, 1], type = f32 clip_model_loader: tensor[157]: n_dims = 1, name = a.blk.12.conv_norm.weight, tensor_size=2048, offset=268898304, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[158]: n_dims = 1, name = a.blk.12.conv_norm.bias, tensor_size=2048, offset=268900352, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[159]: n_dims = 1, name = a.blk.12.conv_dw.bias, tensor_size=2048, offset=268902400, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[160]: n_dims = 2, name = a.blk.12.conv_dw.weight, tensor_size=18432, offset=268904448, shape:[9, 512, 1, 1], type = f32 clip_model_loader: tensor[161]: n_dims = 1, name = a.blk.12.conv_pw1.bias, tensor_size=4096, offset=268922880, shape:[1024, 1, 1, 1], type = f32 clip_model_loader: tensor[162]: n_dims = 2, name = a.blk.12.conv_pw1.weight, tensor_size=2097152, offset=268926976, shape:[512, 1024, 1, 1], type = f32 clip_model_loader: tensor[163]: n_dims = 1, name = a.blk.12.conv_pw2.bias, tensor_size=2048, offset=271024128, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[164]: n_dims = 2, name = a.blk.12.conv_pw2.weight, tensor_size=1048576, offset=271026176, shape:[512, 512, 1, 1], type = f32 clip_model_loader: tensor[165]: n_dims = 1, name = a.blk.12.ffn_up.bias, tensor_size=8192, offset=272074752, shape:[2048, 1, 1, 1], type = f32 clip_model_loader: tensor[166]: n_dims = 2, name = a.blk.12.ffn_up.weight, tensor_size=2097152, offset=272082944, shape:[512, 2048, 1, 1], type = f16 clip_model_loader: tensor[167]: n_dims = 1, name = a.blk.12.ffn_down.bias, tensor_size=2048, offset=274180096, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[168]: n_dims = 2, name = a.blk.12.ffn_down.weight, tensor_size=2097152, offset=274182144, shape:[2048, 512, 1, 1], type = f16 clip_model_loader: tensor[169]: n_dims = 1, name = a.blk.12.ffn_up_1.bias, tensor_size=8192, offset=276279296, shape:[2048, 1, 1, 1], type = f32 clip_model_loader: tensor[170]: n_dims = 2, name = a.blk.12.ffn_up_1.weight, tensor_size=2097152, offset=276287488, shape:[512, 2048, 1, 1], type = f16 clip_model_loader: tensor[171]: n_dims = 1, name = a.blk.12.ffn_down_1.bias, tensor_size=2048, offset=278384640, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[172]: n_dims = 2, name = a.blk.12.ffn_down_1.weight, tensor_size=2097152, offset=278386688, shape:[2048, 512, 1, 1], type = f16 clip_model_loader: tensor[173]: n_dims = 1, name = a.blk.12.norm_conv.bias, tensor_size=2048, offset=280483840, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[174]: n_dims = 1, name = a.blk.12.norm_conv.weight, tensor_size=2048, offset=280485888, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[175]: n_dims = 1, name = a.blk.12.ffn_norm.bias, tensor_size=2048, offset=280487936, shape:[512, 1, 1, 1], type = f32 clip_model_loader: 
tensor[176]: n_dims = 1, name = a.blk.12.ffn_norm.weight, tensor_size=2048, offset=280489984, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[177]: n_dims = 1, name = a.blk.12.ffn_norm_1.bias, tensor_size=2048, offset=280492032, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[178]: n_dims = 1, name = a.blk.12.ffn_norm_1.weight, tensor_size=2048, offset=280494080, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[179]: n_dims = 1, name = a.blk.12.ln2.bias, tensor_size=2048, offset=280496128, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[180]: n_dims = 1, name = a.blk.12.ln2.weight, tensor_size=2048, offset=280498176, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[181]: n_dims = 1, name = a.blk.12.ln1.bias, tensor_size=2048, offset=280500224, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[182]: n_dims = 1, name = a.blk.12.ln1.weight, tensor_size=2048, offset=280502272, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[183]: n_dims = 1, name = a.blk.12.attn_k.bias, tensor_size=2048, offset=280504320, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[184]: n_dims = 2, name = a.blk.12.attn_k.weight, tensor_size=524288, offset=280506368, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[185]: n_dims = 1, name = a.blk.12.attn_out.bias, tensor_size=2048, offset=281030656, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[186]: n_dims = 2, name = a.blk.12.attn_out.weight, tensor_size=524288, offset=281032704, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[187]: n_dims = 2, name = a.blk.12.linear_pos.weight, tensor_size=524288, offset=281556992, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[188]: n_dims = 1, name = a.blk.12.attn_q.bias, tensor_size=2048, offset=282081280, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[189]: n_dims = 2, name = a.blk.12.attn_q.weight, tensor_size=524288, offset=282083328, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[190]: n_dims = 1, name = a.blk.12.attn_v.bias, tensor_size=2048, offset=282607616, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[191]: n_dims = 2, name = a.blk.12.attn_v.weight, tensor_size=524288, offset=282609664, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[192]: n_dims = 2, name = a.blk.12.pos_bias_u, tensor_size=2048, offset=283133952, shape:[64, 8, 1, 1], type = f32 clip_model_loader: tensor[193]: n_dims = 2, name = a.blk.12.pos_bias_v, tensor_size=2048, offset=283136000, shape:[64, 8, 1, 1], type = f32 clip_model_loader: tensor[194]: n_dims = 1, name = a.blk.13.conv_norm.weight, tensor_size=2048, offset=283138048, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[195]: n_dims = 1, name = a.blk.13.conv_norm.bias, tensor_size=2048, offset=283140096, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[196]: n_dims = 1, name = a.blk.13.conv_dw.bias, tensor_size=2048, offset=283142144, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[197]: n_dims = 2, name = a.blk.13.conv_dw.weight, tensor_size=18432, offset=283144192, shape:[9, 512, 1, 1], type = f32 clip_model_loader: tensor[198]: n_dims = 1, name = a.blk.13.conv_pw1.bias, tensor_size=4096, offset=283162624, shape:[1024, 1, 1, 1], type = f32 clip_model_loader: tensor[199]: n_dims = 2, name = a.blk.13.conv_pw1.weight, tensor_size=2097152, offset=283166720, shape:[512, 1024, 1, 1], type = f32 clip_model_loader: tensor[200]: n_dims = 1, name = a.blk.13.conv_pw2.bias, tensor_size=2048, 
offset=285263872, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[201]: n_dims = 2, name = a.blk.13.conv_pw2.weight, tensor_size=1048576, offset=285265920, shape:[512, 512, 1, 1], type = f32 clip_model_loader: tensor[202]: n_dims = 1, name = a.blk.13.ffn_up.bias, tensor_size=8192, offset=286314496, shape:[2048, 1, 1, 1], type = f32 clip_model_loader: tensor[203]: n_dims = 2, name = a.blk.13.ffn_up.weight, tensor_size=2097152, offset=286322688, shape:[512, 2048, 1, 1], type = f16 clip_model_loader: tensor[204]: n_dims = 1, name = a.blk.13.ffn_down.bias, tensor_size=2048, offset=288419840, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[205]: n_dims = 2, name = a.blk.13.ffn_down.weight, tensor_size=2097152, offset=288421888, shape:[2048, 512, 1, 1], type = f16 clip_model_loader: tensor[206]: n_dims = 1, name = a.blk.13.ffn_up_1.bias, tensor_size=8192, offset=290519040, shape:[2048, 1, 1, 1], type = f32 clip_model_loader: tensor[207]: n_dims = 2, name = a.blk.13.ffn_up_1.weight, tensor_size=2097152, offset=290527232, shape:[512, 2048, 1, 1], type = f16 clip_model_loader: tensor[208]: n_dims = 1, name = a.blk.13.ffn_down_1.bias, tensor_size=2048, offset=292624384, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[209]: n_dims = 2, name = a.blk.13.ffn_down_1.weight, tensor_size=2097152, offset=292626432, shape:[2048, 512, 1, 1], type = f16 clip_model_loader: tensor[210]: n_dims = 1, name = a.blk.13.norm_conv.bias, tensor_size=2048, offset=294723584, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[211]: n_dims = 1, name = a.blk.13.norm_conv.weight, tensor_size=2048, offset=294725632, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[212]: n_dims = 1, name = a.blk.13.ffn_norm.bias, tensor_size=2048, offset=294727680, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[213]: n_dims = 1, name = a.blk.13.ffn_norm.weight, tensor_size=2048, offset=294729728, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[214]: n_dims = 1, name = a.blk.13.ffn_norm_1.bias, tensor_size=2048, offset=294731776, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[215]: n_dims = 1, name = a.blk.13.ffn_norm_1.weight, tensor_size=2048, offset=294733824, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[216]: n_dims = 1, name = a.blk.13.ln2.bias, tensor_size=2048, offset=294735872, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[217]: n_dims = 1, name = a.blk.13.ln2.weight, tensor_size=2048, offset=294737920, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[218]: n_dims = 1, name = a.blk.13.ln1.bias, tensor_size=2048, offset=294739968, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[219]: n_dims = 1, name = a.blk.13.ln1.weight, tensor_size=2048, offset=294742016, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[220]: n_dims = 1, name = a.blk.13.attn_k.bias, tensor_size=2048, offset=294744064, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[221]: n_dims = 2, name = a.blk.13.attn_k.weight, tensor_size=524288, offset=294746112, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[222]: n_dims = 1, name = a.blk.13.attn_out.bias, tensor_size=2048, offset=295270400, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[223]: n_dims = 2, name = a.blk.13.attn_out.weight, tensor_size=524288, offset=295272448, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[224]: n_dims = 2, name = a.blk.13.linear_pos.weight, tensor_size=524288, offset=295796736, shape:[512, 512, 1, 1], type 
= f16 clip_model_loader: tensor[225]: n_dims = 1, name = a.blk.13.attn_q.bias, tensor_size=2048, offset=296321024, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[226]: n_dims = 2, name = a.blk.13.attn_q.weight, tensor_size=524288, offset=296323072, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[227]: n_dims = 1, name = a.blk.13.attn_v.bias, tensor_size=2048, offset=296847360, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[228]: n_dims = 2, name = a.blk.13.attn_v.weight, tensor_size=524288, offset=296849408, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[229]: n_dims = 2, name = a.blk.13.pos_bias_u, tensor_size=2048, offset=297373696, shape:[64, 8, 1, 1], type = f32 clip_model_loader: tensor[230]: n_dims = 2, name = a.blk.13.pos_bias_v, tensor_size=2048, offset=297375744, shape:[64, 8, 1, 1], type = f32 clip_model_loader: tensor[231]: n_dims = 1, name = a.blk.14.conv_norm.weight, tensor_size=2048, offset=297377792, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[232]: n_dims = 1, name = a.blk.14.conv_norm.bias, tensor_size=2048, offset=297379840, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[233]: n_dims = 1, name = a.blk.14.conv_dw.bias, tensor_size=2048, offset=297381888, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[234]: n_dims = 2, name = a.blk.14.conv_dw.weight, tensor_size=18432, offset=297383936, shape:[9, 512, 1, 1], type = f32 clip_model_loader: tensor[235]: n_dims = 1, name = a.blk.14.conv_pw1.bias, tensor_size=4096, offset=297402368, shape:[1024, 1, 1, 1], type = f32 clip_model_loader: tensor[236]: n_dims = 2, name = a.blk.14.conv_pw1.weight, tensor_size=2097152, offset=297406464, shape:[512, 1024, 1, 1], type = f32 clip_model_loader: tensor[237]: n_dims = 1, name = a.blk.14.conv_pw2.bias, tensor_size=2048, offset=299503616, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[238]: n_dims = 2, name = a.blk.14.conv_pw2.weight, tensor_size=1048576, offset=299505664, shape:[512, 512, 1, 1], type = f32 clip_model_loader: tensor[239]: n_dims = 1, name = a.blk.14.ffn_up.bias, tensor_size=8192, offset=300554240, shape:[2048, 1, 1, 1], type = f32 clip_model_loader: tensor[240]: n_dims = 2, name = a.blk.14.ffn_up.weight, tensor_size=2097152, offset=300562432, shape:[512, 2048, 1, 1], type = f16 clip_model_loader: tensor[241]: n_dims = 1, name = a.blk.14.ffn_down.bias, tensor_size=2048, offset=302659584, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[242]: n_dims = 2, name = a.blk.14.ffn_down.weight, tensor_size=2097152, offset=302661632, shape:[2048, 512, 1, 1], type = f16 clip_model_loader: tensor[243]: n_dims = 1, name = a.blk.14.ffn_up_1.bias, tensor_size=8192, offset=304758784, shape:[2048, 1, 1, 1], type = f32 clip_model_loader: tensor[244]: n_dims = 2, name = a.blk.14.ffn_up_1.weight, tensor_size=2097152, offset=304766976, shape:[512, 2048, 1, 1], type = f16 clip_model_loader: tensor[245]: n_dims = 1, name = a.blk.14.ffn_down_1.bias, tensor_size=2048, offset=306864128, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[246]: n_dims = 2, name = a.blk.14.ffn_down_1.weight, tensor_size=2097152, offset=306866176, shape:[2048, 512, 1, 1], type = f16 clip_model_loader: tensor[247]: n_dims = 1, name = a.blk.14.norm_conv.bias, tensor_size=2048, offset=308963328, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[248]: n_dims = 1, name = a.blk.14.norm_conv.weight, tensor_size=2048, offset=308965376, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[249]: 
n_dims = 1, name = a.blk.14.ffn_norm.bias, tensor_size=2048, offset=308967424, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[250]: n_dims = 1, name = a.blk.14.ffn_norm.weight, tensor_size=2048, offset=308969472, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[251]: n_dims = 1, name = a.blk.14.ffn_norm_1.bias, tensor_size=2048, offset=308971520, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[252]: n_dims = 1, name = a.blk.14.ffn_norm_1.weight, tensor_size=2048, offset=308973568, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[253]: n_dims = 1, name = a.blk.14.ln2.bias, tensor_size=2048, offset=308975616, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[254]: n_dims = 1, name = a.blk.14.ln2.weight, tensor_size=2048, offset=308977664, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[255]: n_dims = 1, name = a.blk.14.ln1.bias, tensor_size=2048, offset=308979712, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[256]: n_dims = 1, name = a.blk.14.ln1.weight, tensor_size=2048, offset=308981760, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[257]: n_dims = 1, name = a.blk.14.attn_k.bias, tensor_size=2048, offset=308983808, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[258]: n_dims = 2, name = a.blk.14.attn_k.weight, tensor_size=524288, offset=308985856, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[259]: n_dims = 1, name = a.blk.14.attn_out.bias, tensor_size=2048, offset=309510144, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[260]: n_dims = 2, name = a.blk.14.attn_out.weight, tensor_size=524288, offset=309512192, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[261]: n_dims = 2, name = a.blk.14.linear_pos.weight, tensor_size=524288, offset=310036480, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[262]: n_dims = 1, name = a.blk.14.attn_q.bias, tensor_size=2048, offset=310560768, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[263]: n_dims = 2, name = a.blk.14.attn_q.weight, tensor_size=524288, offset=310562816, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[264]: n_dims = 1, name = a.blk.14.attn_v.bias, tensor_size=2048, offset=311087104, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[265]: n_dims = 2, name = a.blk.14.attn_v.weight, tensor_size=524288, offset=311089152, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[266]: n_dims = 2, name = a.blk.14.pos_bias_u, tensor_size=2048, offset=311613440, shape:[64, 8, 1, 1], type = f32 clip_model_loader: tensor[267]: n_dims = 2, name = a.blk.14.pos_bias_v, tensor_size=2048, offset=311615488, shape:[64, 8, 1, 1], type = f32 clip_model_loader: tensor[268]: n_dims = 1, name = a.blk.15.conv_norm.weight, tensor_size=2048, offset=311617536, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[269]: n_dims = 1, name = a.blk.15.conv_norm.bias, tensor_size=2048, offset=311619584, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[270]: n_dims = 1, name = a.blk.15.conv_dw.bias, tensor_size=2048, offset=311621632, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[271]: n_dims = 2, name = a.blk.15.conv_dw.weight, tensor_size=18432, offset=311623680, shape:[9, 512, 1, 1], type = f32 clip_model_loader: tensor[272]: n_dims = 1, name = a.blk.15.conv_pw1.bias, tensor_size=4096, offset=311642112, shape:[1024, 1, 1, 1], type = f32 clip_model_loader: tensor[273]: n_dims = 2, name = a.blk.15.conv_pw1.weight, tensor_size=2097152, offset=311646208, 
shape:[512, 1024, 1, 1], type = f32 clip_model_loader: tensor[274]: n_dims = 1, name = a.blk.15.conv_pw2.bias, tensor_size=2048, offset=313743360, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[275]: n_dims = 2, name = a.blk.15.conv_pw2.weight, tensor_size=1048576, offset=313745408, shape:[512, 512, 1, 1], type = f32 clip_model_loader: tensor[276]: n_dims = 1, name = a.blk.15.ffn_up.bias, tensor_size=8192, offset=314793984, shape:[2048, 1, 1, 1], type = f32 clip_model_loader: tensor[277]: n_dims = 2, name = a.blk.15.ffn_up.weight, tensor_size=2097152, offset=314802176, shape:[512, 2048, 1, 1], type = f16 clip_model_loader: tensor[278]: n_dims = 1, name = a.blk.15.ffn_down.bias, tensor_size=2048, offset=316899328, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[279]: n_dims = 2, name = a.blk.15.ffn_down.weight, tensor_size=2097152, offset=316901376, shape:[2048, 512, 1, 1], type = f16 clip_model_loader: tensor[280]: n_dims = 1, name = a.blk.15.ffn_up_1.bias, tensor_size=8192, offset=318998528, shape:[2048, 1, 1, 1], type = f32 clip_model_loader: tensor[281]: n_dims = 2, name = a.blk.15.ffn_up_1.weight, tensor_size=2097152, offset=319006720, shape:[512, 2048, 1, 1], type = f16 clip_model_loader: tensor[282]: n_dims = 1, name = a.blk.15.ffn_down_1.bias, tensor_size=2048, offset=321103872, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[283]: n_dims = 2, name = a.blk.15.ffn_down_1.weight, tensor_size=2097152, offset=321105920, shape:[2048, 512, 1, 1], type = f16 clip_model_loader: tensor[284]: n_dims = 1, name = a.blk.15.norm_conv.bias, tensor_size=2048, offset=323203072, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[285]: n_dims = 1, name = a.blk.15.norm_conv.weight, tensor_size=2048, offset=323205120, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[286]: n_dims = 1, name = a.blk.15.ffn_norm.bias, tensor_size=2048, offset=323207168, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[287]: n_dims = 1, name = a.blk.15.ffn_norm.weight, tensor_size=2048, offset=323209216, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[288]: n_dims = 1, name = a.blk.15.ffn_norm_1.bias, tensor_size=2048, offset=323211264, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[289]: n_dims = 1, name = a.blk.15.ffn_norm_1.weight, tensor_size=2048, offset=323213312, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[290]: n_dims = 1, name = a.blk.15.ln2.bias, tensor_size=2048, offset=323215360, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[291]: n_dims = 1, name = a.blk.15.ln2.weight, tensor_size=2048, offset=323217408, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[292]: n_dims = 1, name = a.blk.15.ln1.bias, tensor_size=2048, offset=323219456, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[293]: n_dims = 1, name = a.blk.15.ln1.weight, tensor_size=2048, offset=323221504, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[294]: n_dims = 1, name = a.blk.15.attn_k.bias, tensor_size=2048, offset=323223552, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[295]: n_dims = 2, name = a.blk.15.attn_k.weight, tensor_size=524288, offset=323225600, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[296]: n_dims = 1, name = a.blk.15.attn_out.bias, tensor_size=2048, offset=323749888, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[297]: n_dims = 2, name = a.blk.15.attn_out.weight, tensor_size=524288, offset=323751936, shape:[512, 512, 1, 1], type = f16 
clip_model_loader: tensor[298]: n_dims = 2, name = a.blk.15.linear_pos.weight, tensor_size=524288, offset=324276224, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[299]: n_dims = 1, name = a.blk.15.attn_q.bias, tensor_size=2048, offset=324800512, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[300]: n_dims = 2, name = a.blk.15.attn_q.weight, tensor_size=524288, offset=324802560, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[301]: n_dims = 1, name = a.blk.15.attn_v.bias, tensor_size=2048, offset=325326848, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[302]: n_dims = 2, name = a.blk.15.attn_v.weight, tensor_size=524288, offset=325328896, shape:[512, 512, 1, 1], type = f16 clip_model_loader: tensor[303]: n_dims = 2, name = a.blk.15.pos_bias_u, tensor_size=2048, offset=325853184, shape:[64, 8, 1, 1], type = f32 clip_model_loader: tensor[304]: n_dims = 2, name = a.blk.15.pos_bias_v, tensor_size=2048, offset=325855232, shape:[64, 8, 1, 1], type = f32 clip_model_loader: tensor[305]: n_dims = 1, name = a.blk.16.conv_norm.weight, tensor_size=2048, offset=325857280, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[306]: n_dims = 1, name = a.blk.16.conv_norm.bias, tensor_size=2048, offset=325859328, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[307]: n_dims = 1, name = a.blk.16.conv_dw.bias, tensor_size=2048, offset=325861376, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[308]: n_dims = 2, name = a.blk.16.conv_dw.weight, tensor_size=18432, offset=325863424, shape:[9, 512, 1, 1], type = f32 clip_model_loader: tensor[309]: n_dims = 1, name = a.blk.16.conv_pw1.bias, tensor_size=4096, offset=325881856, shape:[1024, 1, 1, 1], type = f32 clip_model_loader: tensor[310]: n_dims = 2, name = a.blk.16.conv_pw1.weight, tensor_size=2097152, offset=325885952, shape:[512, 1024, 1, 1], type = f32 clip_model_loader: tensor[311]: n_dims = 1, name = a.blk.16.conv_pw2.bias, tensor_size=2048, offset=327983104, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[312]: n_dims = 2, name = a.blk.16.conv_pw2.weight, tensor_size=1048576, offset=327985152, shape:[512, 512, 1, 1], type = f32 clip_model_loader: tensor[313]: n_dims = 1, name = a.blk.16.ffn_up.bias, tensor_size=8192, offset=329033728, shape:[2048, 1, 1, 1], type = f32 clip_model_loader: tensor[314]: n_dims = 2, name = a.blk.16.ffn_up.weight, tensor_size=2097152, offset=329041920, shape:[512, 2048, 1, 1], type = f16 clip_model_loader: tensor[315]: n_dims = 1, name = a.blk.16.ffn_down.bias, tensor_size=2048, offset=331139072, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[316]: n_dims = 2, name = a.blk.16.ffn_down.weight, tensor_size=2097152, offset=331141120, shape:[2048, 512, 1, 1], type = f16 clip_model_loader: tensor[317]: n_dims = 1, name = a.blk.16.ffn_up_1.bias, tensor_size=8192, offset=333238272, shape:[2048, 1, 1, 1], type = f32 clip_model_loader: tensor[318]: n_dims = 2, name = a.blk.16.ffn_up_1.weight, tensor_size=2097152, offset=333246464, shape:[512, 2048, 1, 1], type = f16 clip_model_loader: tensor[319]: n_dims = 1, name = a.blk.16.ffn_down_1.bias, tensor_size=2048, offset=335343616, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[320]: n_dims = 2, name = a.blk.16.ffn_down_1.weight, tensor_size=2097152, offset=335345664, shape:[2048, 512, 1, 1], type = f16 clip_model_loader: tensor[321]: n_dims = 1, name = a.blk.16.norm_conv.bias, tensor_size=2048, offset=337442816, shape:[512, 1, 1, 1], type = f32 clip_model_loader: tensor[322]: n_dims 
= 1, name = a.blk.16.norm_conv.weight, tensor_size=2048, offset=337444864, shape:[512, 1, 1, 1], type = f32
clip_model_loader: tensor[323]..tensor[535]: per-block audio encoder weights (a.blk.2..a.blk.16: conv_norm, conv_dw, conv_pw1, conv_pw2, ffn_up, ffn_down, ffn_up_1, ffn_down_1, ffn_norm, ffn_norm_1, norm_conv, ln1, ln2, attn_q/k/v/out, linear_pos, pos_bias_u, pos_bias_v), types f32/f16 (repetitive per-tensor listing trimmed)
--- audio hparams ---
load_hparams: model size: 437.52 MiB
add_text: <|im_start|>system
formatted_chat.prompt: <|im_start|>system
audio_tokens->n_tokens = 24
encoding audio slice... I
set_embeddings: value = 0
llama_perf_context_print: load time = 21696.55 ms
formatted_chat.prompt: <|im_start|>system
add_text: <|im_start|>system <|audio_start|>
set_embeddings: value = 1
llama_perf_context_print: load time = 21696.55 ms
add_text: <|im_start|>system
formatted_chat.prompt: <|im_start|>system
audio_tokens->n_tokens = 37
encoding audio slice... G
set_embeddings: value = 0
llama_perf_context_print: load time = 21696.55 ms
add_text: <|im_start|>system
formatted_chat.prompt: <|im_start|>system <|audio_start|>
set_embeddings: value = 1
llama_perf_context_print: load time = 21696.55 ms
add_text: <|im_start|>system
formatted_chat.prompt: <|im_start|>system
audio_tokens->n_tokens = 31
encoding audio slice... Do
set_embeddings: value = 0
llama_perf_context_print: load time = 21696.55 ms
Squash-merge of ggml-org/llama.cpp PR ggml-org#18641 onto main. Adds the `llama-liquid-audio-server` and `llama-liquid-audio-cli` binaries for text-to-speech and speech-to-text with LFM2.5 models.
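For anyone driving the new CLI from a script, a minimal sketch is below. The flag names (`-m`, `--mmproj`, `--audio`) mirror other llama.cpp multimodal tools and are assumptions rather than the binary's documented interface; check `llama-liquid-audio-cli --help` for the actual options.

```python
# Hypothetical wrapper around the llama-liquid-audio-cli binary added by this PR.
# Flag names below follow common llama.cpp conventions and are assumptions.
import subprocess
from pathlib import Path

def transcribe(model: Path, mmproj: Path, audio: Path) -> str:
    cmd = [
        "./llama-liquid-audio-cli",
        "-m", str(model),          # LFM2.5-Audio language model GGUF
        "--mmproj", str(mmproj),   # audio encoder/projector GGUF
        "--audio", str(audio),     # input audio file
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout

if __name__ == "__main__":
    print(transcribe(Path("LFM2.5-Audio-1.5B-F16.gguf"),
                     Path("mmproj-LFM2.5-Audio-1.5B-F16.gguf"),
                     Path("sample.wav")))
```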
Checks daily for new llama.cpp releases. Auto-rebases the cherry-picks (audio ggml-org#18641, OuteTTS ggml-org#12794, EAGLE-3 ggml-org#18039). Creates a tagged release on a clean rebase, or opens a PR on conflicts. PR ggml-org#19460 (GLM-5 DSA) is already merged upstream, so it is not in the cherry-pick list.
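A rough sketch of that automation, for illustration only: the branch names, remote name, and tagging scheme below are placeholders, not the actual workflow configuration, and `git merge` stands in for whatever replay strategy the workflow really uses.

```python
# Sketch of the daily automation described above: fetch the latest upstream
# release, replay the cherry-picked branches on top of it, then either tag a
# release (clean replay) or stop and leave the branch for a manual PR (conflicts).
import subprocess

CHERRY_PICK_BRANCHES = ["audio-18641", "outetts-12794", "eagle3-18039"]  # placeholder names

def run(*args: str) -> None:
    subprocess.run(args, check=True)

def rebase_onto_latest_release() -> bool:
    run("git", "fetch", "upstream", "--tags")
    latest = subprocess.run(
        ["git", "describe", "--tags", "--abbrev=0", "upstream/master"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    run("git", "checkout", "-B", f"rebase-{latest}", latest)
    for branch in CHERRY_PICK_BRANCHES:
        try:
            run("git", "merge", "--no-ff", branch)
        except subprocess.CalledProcessError:
            return False                      # conflicts: open a PR manually
    run("git", "tag", f"release-{latest}")    # clean replay: cut a tagged release
    return True

if __name__ == "__main__":
    print("clean" if rebase_onto_latest_release() else "conflicts")
```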
Liquid AI released LFM2.5-Audio-1.5B.
This PR is intended to provide a functional implementation in `llama.cpp` until the necessary infrastructure is implemented. The plan is to split it up and merge it into upstream in smaller chunks, while keeping and tracking the functional implementation here. It will be rebased from time to time.
GGUFs, precompiled runners, and instructions live at https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B-GGUF.
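One way to pull those GGUFs locally is via the `huggingface_hub` Python package; the repo id comes from the link above, while the `allow_patterns` filter is an assumption about how the files are named.

```python
# Minimal sketch: download the LFM2.5-Audio GGUFs referenced above.
# allow_patterns is an assumption about file naming; adjust as needed.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="LiquidAI/LFM2.5-Audio-1.5B-GGUF",
    allow_patterns=["*.gguf"],                 # fetch only the GGUF weights
    local_dir="models/LFM2.5-Audio-1.5B-GGUF",
)
print("downloaded to", local_path)
```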
Merge plan:
* `n_embd_out`: model : add LFM2-ColBert-350M #18607

Demo of capabilities (watch with audio on)
demo.mp4
Thank you, @ngxson, for the help!