Refactoring convert-pth-to-ggml.py: more concise and readable#109

Merged
ggerganov merged 6 commits into ggml-org:master from qunash:master on Mar 19, 2023

qunash commented Mar 14, 2023

Contributor

No description provided.

@qunash qunash changed the title Refactoring: more concise and readable Refactoring convert-pth-to-ggml.py: more concise and readable Mar 14, 2023
@SuajCarrot

Contributor

Exactly what I was thinking. However, I think a better approach for building paths is to use os.path.join instead of string concatenation, simply to avoid typos by either the user or the programmer if the code changes in the future. Overall, LGTM.
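
To illustrate the point about os.path.join, here is a minimal sketch. The helper names and the "params.json" file name are assumptions made for this example and are not taken from the PR diff. It shows how plain concatenation depends on whether the user typed a trailing slash, while os.path.join does not.

```python
import os


def concat_path(dir_model: str) -> str:
    # Plain concatenation: the result depends on whether the caller
    # passed the directory with a trailing slash.
    # "params.json" is an illustrative file name (assumption).
    return dir_model + "/params.json"


def joined_path(dir_model: str) -> str:
    # os.path.join inserts a separator only when one is needed.
    return os.path.join(dir_model, "params.json")


for d in ("models/7B", "models/7B/"):
    print(f"{d!r}: concat={concat_path(d)!r} join={joined_path(d)!r}")
```

With the directory written as `models/7B/`, the concatenated form ends up with a doubled slash, which most systems tolerate but which is easy to get wrong when the code is edited later.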

@gjmulder gjmulder added the duplicate This issue or pull request already exists label Mar 18, 2023
@ggerganov ggerganov merged commit 467b149 into ggml-org:master Mar 19, 2023
ggerganov commented Mar 19, 2023

Member

@SuajCarrot

I get this error:

python3 convert-pth-to-ggml.py models/7B/ 1
{'dim': 4096, 'multiple_of': 256, 'n_heads': 32, 'n_layers': 32, 'norm_eps': 1e-06, 'vocab_size': -1}
n_parts = 1

Processing part 0

Processing variable: tok_embeddings.weight with shape: torch.Size([32000, 4096]) and type: torch.float16

Traceback (most recent call last):
  File "/Users/ggerganov/development/github/llama.cpp/convert-pth-to-ggml.py", line 157, in <module>
    main()
  File "/Users/ggerganov/development/github/llama.cpp/convert-pth-to-ggml.py", line 151, in main
    process_and_write_variables(fout, model, ftype)
  File "/Users/ggerganov/development/github/llama.cpp/convert-pth-to-ggml.py", line 127, in process_and_write_variables
    data.tofile(fout)
AttributeError: 'Tensor' object has no attribute 'tofile'. Did you mean: 'tile'?

Any ideas?

Edit: fixed
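
The thread does not say how this was fixed. As background, `tofile` is a method of NumPy arrays, and `torch.Tensor` does not provide it, so one common way to handle this class of error is to convert the tensor with `.numpy()` before writing. The sketch below uses a hypothetical helper name, `write_raw`, that is not part of the PR, and it assumes a CPU tensor with a dtype NumPy supports (such as the float16 shown in the log).

```python
def write_raw(fout, data):
    """Write `data` to the open binary file `fout` as raw bytes.

    Hypothetical helper for illustration; not taken from the PR.
    """
    # A torch.Tensor has no tofile(), but it does have numpy(), which
    # returns a numpy.ndarray for CPU tensors. Convert only when needed.
    if not hasattr(data, "tofile") and hasattr(data, "numpy"):
        data = data.numpy()
    data.tofile(fout)


# Possible usage (assumes torch is installed; not executed here):
#
#   import torch
#   with open("out.bin", "wb") as fout:
#       write_raw(fout, torch.zeros(4, dtype=torch.float16))
```

The duck-typed check keeps the helper usable for both tensors and arrays, so a caller does not need to know which of the two it is holding.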

ggerganov added a commit that referenced this pull request Mar 19, 2023
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
…-org#109)

* Refactor get_n_parts function to simplify code and improve readability

* Use f-strings instead of concatenation

* Refactoring: more concise and readable

* modularize

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
phuongncn pushed a commit to phuongncn/llama.cpp-gx10-dgx-sparks-deepseekv4 that referenced this pull request Apr 28, 2026
* iq1_bn: improve CUDA TG

On RTX-3080 TG-128(Bitnet-1.58b-3B) goes from 318 t/s to 340 t/s.
I see I have on the front page 301 t/s, so pretty nice improvement
since then.

* iq2_bn(CUDA): quants are not 4-byte aligned

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
ljubomirj pushed a commit to ljubomirj/llama.cpp that referenced this pull request May 6, 2026
…-org#109)

* Refactor get_n_parts function to simplify code and improve readability

* Use f-strings instead of concatenation

* Refactoring: more concise and readable

* modularize

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
ljubomirj pushed a commit to ljubomirj/llama.cpp that referenced this pull request May 6, 2026
Labels

duplicate This issue or pull request already exists

4 participants