improvement(tools): optimize convert-pth-to-ggml by tpoisonooo · Pull Request #232 · ggml-org/llama.cpp

tpoisonooo · 2023-03-17T09:02:11Z

Optimize the convert tool by parsing its command-line arguments with argparse.

$ python3 convert-pth-to-ggml.py -h
usage: convert-pth-to-ggml.py [-h] dir_model {f32,f16} out_dir

Convert ckpt models to ggml models. For example: python3 convert-pth-to-ggml.py ../llama-models/7B/ f32 models/llama-7B

positional arguments:
  dir_model   Directory path of the checkpoint model
  {f32,f16}   Data type of the converted tensor, f32 or f16
  out_dir     Directory path for storing ggml model

options:
  -h, --help  show this help message and exit
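
For readers who want to see how a help message like the one above can be produced, here is a minimal sketch. It is not the PR's actual code: `build_parser` and the `ftype` destination name are hypothetical, while the positional names, choices, and help strings are copied from the usage output.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical helper; the real script may structure this differently.
    parser = argparse.ArgumentParser(
        prog="convert-pth-to-ggml.py",
        description="Convert ckpt models to ggml models. For example: "
        "python3 convert-pth-to-ggml.py ../llama-models/7B/ f32 models/llama-7B",
    )
    parser.add_argument("dir_model", help="Directory path of the checkpoint model")
    # `choices` makes argparse reject anything other than f32 or f16.
    parser.add_argument(
        "ftype",
        choices=["f32", "f16"],
        help="Data type of the converted tensor, f32 or f16",
    )
    parser.add_argument("out_dir", help="Directory path for storing ggml model")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["../llama-models/7B/", "f32", "models/llama-7B"])
print(args.dir_model, args.ftype, args.out_dir)
```

Passing a value outside `choices` (for example the old numeric `1`) makes argparse print an error and exit, which is why the README invocation would need updating.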

Tested on the 7B and 30B models; both convert correctly.

$ tree models/
models/
├── 7B
├── llama-30B
│   ├── ggml-model-f16.bin
│   ├── ggml-model-f16.bin.1
│   ├── ggml-model-f16.bin.2
│   └── ggml-model-f16.bin.3
└── llama-7B
    ├── ggml-model.bin -> ggml-model-f16.bin
    ├── ggml-model-f16.bin
    └── ggml-model-f32.bin

tpoisonooo · 2023-03-17T09:02:28Z

cc @ggerganov

tpoisonooo · 2023-03-18T10:47:05Z

Conflict fixed and tested on 7B/30B.

This branch differs from the master version by the following 3 lines:

if os.path.exists(fname_out):
    print(f"Skip conversion, it already exists: {fname_out}")
    sys.exit(0)
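
As an illustration only, the same existence check can be wrapped in a function so it can be exercised without terminating the interpreter. `should_skip` is a hypothetical name that does not appear in the PR; the snippet above calls `sys.exit(0)` directly.

```python
import os


def should_skip(fname_out: str) -> bool:
    # Hypothetical helper mirroring the os.path.exists() guard above:
    # report and return True when the output file is already present.
    if os.path.exists(fname_out):
        print(f"Skip conversion, it already exists: {fname_out}")
        return True
    return False
```

A caller could then write `if should_skip(fname_out): sys.exit(0)` to reproduce the original behaviour.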

cc @gjmulder

sw · 2023-03-18T16:27:06Z

 import numpy as np
 import torch
+import argparse
+import os


os is already imported

sw · 2023-03-18T16:27:48Z

Can you please update the README? The choice "1" for ftype will no longer work.
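
To make the compatibility concern concrete, here is one possible way to keep numeric values working alongside the new names. This is a sketch and not part of the PR: `normalize_ftype` and `LEGACY_FTYPES` are hypothetical, and the mapping assumes the earlier script treated 0 as f32 and 1 as f16.

```python
# Assumed legacy mapping (0 -> f32, 1 -> f16); verify against the old script.
LEGACY_FTYPES = {"0": "f32", "1": "f16"}


def normalize_ftype(value: str) -> str:
    # Translate a legacy numeric ftype to its name; pass names through.
    name = LEGACY_FTYPES.get(value, value)
    if name not in ("f32", "f16"):
        raise ValueError(f"unsupported ftype: {value!r}")
    return name
```

Such a function could be supplied as the `type=` callable of an argparse argument, which is applied before the `choices` check.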

sw · 2023-03-18T16:30:58Z

This is almost a duplicate of #109

ggerganov · 2023-03-19T17:18:03Z

Decided to go with #109


tpoisonooo added 2 commits March 17, 2023 16:53

improvement(tools): optimize with argparse

3c7cb41

improvement(tools): add example

fb324e0

tpoisonooo changed the title ~~Optimize convert~~ improvement(tools): optimize convert-pth-to-ggml Mar 17, 2023

gjmulder added the enhancement label Mar 17, 2023

Merge branch 'master' into optimize-convert

a44ccef

sw reviewed Mar 18, 2023


ggerganov closed this Mar 19, 2023

Bearsaerker mentioned this pull request Mar 12, 2025

Eval bug: Gemma 3 extremly slow prompt processing when using quantized kv cache. #12352

Closed

tpoisonooo wants to merge 3 commits into ggml-org:master from tpoisonooo:optimize-convert

4 participants
