Add integration with gemlite weight only quant #2528

Merged
zhyncs merged 7 commits into sgl-project:main from jerryzh168:add-gemlite
Dec 20, 2024

@jerryzh168
Contributor

Summary:
gemlite is only available with nightly torchao right now (or by installing from source)

Test Plan:

python3 -m sglang.bench_one_batch --model meta-llama/Llama-3.1-8B-Instruct --batch-size 1 --input 1024 --output 512 --json-model-override-args '{"architectures": ["TorchNativeLlamaForCausalLM"]}' --enable-torch-compile --torchao-config gemlite-4-64 --tp-size 1

Reviewers:

Subscribers:

Tasks:

Tags:

Motivation

Modifications

Checklist

  • Format your code according to the Contributor Guide.
  • Add unit tests as outlined in the Contributor Guide.
  • Update documentation as needed, including docstrings or example tutorials.

@zhyncs
Collaborator

zhyncs commented Dec 19, 2024

Hi @jerryzh168, what is the release cycle of torchao? I can accept using the torchao nightly version. Maybe you can try enabling it in https://github.com/sgl-project/sglang/blob/main/python/pyproject.toml. What do you think? cc @merrymercy @Ying1123 @ispobock

@jerryzh168
Contributor Author

> Hi @jerryzh168 What is the release cycle of torchao? I can accept using the torchao nightly version, maybe you can try enabling it in the main/python/pyproject.toml. What do you think? cc @merrymercy @Ying1123 @ispobock

We have roughly monthly releases. Yes, depending on the nightly version would be better for now, and we can update to a stable version a bit later, I think.

@jerryzh168
Contributor Author

I tried `pip install torchao>=0.8.0.dev20241219`, but it doesn't work. We probably need to use

pip install --pre torchao --index-url https://download.pytorch.org/whl/nightly/cu124

to install the nightly version. Do we just want to add a version check here?

@jerryzh168
Contributor Author

jerryzh168 commented Dec 19, 2024

@zhyncs I think we can land this. It's fine to have it as an experimental feature for now, I think. I added a print asking people to use torchao nightly.

@zhyncs zhyncs merged commit feb2b76 into sgl-project:main Dec 20, 2024
zhyncs added a commit that referenced this pull request Dec 21, 2024
chosen-ox pushed a commit to chosen-ox/sglang that referenced this pull request Dec 22, 2024
@merrymercy merrymercy mentioned this pull request Dec 26, 2024
3 tasks
timethink pushed a commit to timethink/sglang that referenced this pull request Mar 9, 2025