
improvement(tools): optimize convert-pth-to-ggml#232

Closed
tpoisonooo wants to merge 3 commits into ggml-org:master from tpoisonooo:optimize-convert

Conversation

@tpoisonooo

Optimize the convert tool by parsing its arguments with argparse.

$ python3 convert-pth-to-ggml.py -h
usage: convert-pth-to-ggml.py [-h] dir_model {f32,f16} out_dir

Convert ckpt models to ggml models. For example: python3 convert-pth-to-ggml.py ../llama-models/7B/ f32 models/llama-7B

positional arguments:
  dir_model   Directory path of the checkpoint model
  {f32,f16}   Data type of the converted tensor, f32 or f16
  out_dir     Directory path for storing ggml model

options:
  -h, --help  show this help message and exit
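The help text above could be produced by a parser along these lines. This is a minimal sketch, not the PR's actual code: the function name `build_parser` and the destination name `ftype` are assumptions, while the positional names, choices, and help strings are copied from the output shown.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with three positionals, mirroring the help output above."""
    parser = argparse.ArgumentParser(
        prog="convert-pth-to-ggml.py",
        description=(
            "Convert ckpt models to ggml models. For example: "
            "python3 convert-pth-to-ggml.py ../llama-models/7B/ f32 models/llama-7B"
        ),
    )
    parser.add_argument("dir_model", help="Directory path of the checkpoint model")
    # `choices` makes argparse reject anything other than f32/f16 and
    # renders the positional as {f32,f16} in the usage line.
    # The destination name `ftype` is an assumption for this sketch.
    parser.add_argument(
        "ftype",
        choices=["f32", "f16"],
        help="Data type of the converted tensor, f32 or f16",
    )
    parser.add_argument("out_dir", help="Directory path for storing ggml model")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["../llama-models/7B/", "f16", "models/llama-7B"])
print(args.dir_model, args.ftype, args.out_dir)
```

With `choices` in place, a numeric ftype such as `1` is rejected by argparse with a usage error instead of being accepted.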

Tested on the 7B and 30B models; it works well.

$ tree models/
models/
├── 7B
├── llama-30B
│   ├── ggml-model-f16.bin
│   ├── ggml-model-f16.bin.1
│   ├── ggml-model-f16.bin.2
│   └── ggml-model-f16.bin.3
└── llama-7B
    ├── ggml-model.bin -> ggml-model-f16.bin
    ├── ggml-model-f16.bin
    └── ggml-model-f32.bin

@tpoisonooo

Author

cc @ggerganov

@tpoisonooo tpoisonooo changed the title from "Optimize convert" to "improvement(tools): optimize convert-pth-to-ggml" on Mar 17, 2023
@gjmulder gjmulder added the enhancement (New feature or request) label on Mar 17, 2023
@tpoisonooo

Author

Conflict fixed and tested on 7B/30B.

The master version differs from this branch by the following 3 lines:

if os.path.exists(fname_out):
    print(f"Skip conversion, it already exists: {fname_out}")
    sys.exit(0)
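For context, here is a hedged sketch of how that existence check might sit next to the output naming visible in the `tree` listing (`ggml-model-f16.bin`, `ggml-model-f16.bin.1`, ...). The helpers `output_path` and `should_skip` are hypothetical names introduced for illustration. Unlike the master snippet, which calls `sys.exit(0)`, this sketch returns a flag so a caller could continue with the remaining parts.

```python
import os


def output_path(out_dir: str, ftype: str, part: int) -> str:
    """Hypothetical helper: build the output file name for one model part.

    Part 0 gets no suffix; later parts get ".1", ".2", ... as in the
    tree listing above.
    """
    name = f"ggml-model-{ftype}.bin"
    if part > 0:
        name += f".{part}"
    return os.path.join(out_dir, name)


def should_skip(fname_out: str) -> bool:
    """Hypothetical helper: report whether the output already exists."""
    if os.path.exists(fname_out):
        print(f"Skip conversion, it already exists: {fname_out}")
        return True
    return False


# Example: list the four output names of a 4-part model.
for part in range(4):
    print(output_path("models/llama-30B", "f16", part))
```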

cc @gjmulder

Comment thread convert-pth-to-ggml.py
import numpy as np
import torch
import argparse
import os

Contributor

`os` is already imported.

@sw

sw commented Mar 18, 2023

Contributor

Can you please update the README? The choice "1" for ftype will no longer work.

@sw

sw commented Mar 18, 2023

Contributor

This is almost a duplicate of #109

@ggerganov

Member

Decided to go with #109

@ggerganov ggerganov closed this Mar 19, 2023
phuongncn pushed a commit to phuongncn/llama.cpp-gx10-dgx-sparks-deepseekv4 that referenced this pull request Apr 28, 2026
…gml-org#232)

* Give the user the option to override where model weights are stored

* Fix ggml_nbytes() problem and cleanup

For a tensor with zero elements ggml_nbytes() was returning
uint64_t::max, and this was causing graph allocation failure.

* Add timing info to CUDA graph evaluation

* Add more timing info

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

Labels

enhancement New feature or request

4 participants