A question about the parameter “--group-size” in qserve_benchmark.py #48

oasis-Linmi · 2024-12-16T10:20:33Z

Hi, thank you for your excellent work!
I encountered an issue while running qserve_benchmark.py:
I downloaded several models with the W4A8 per-channel quantization type provided in the QServe Model Zoo. When I tried to set --group-size to -1, I consistently ran into a RuntimeError: probability tensor contains either 'inf', 'nan' or element < 0. Interestingly, changing the parameter to 128 allowed the script to run successfully.
My understanding is that for the W4A8 per-channel quantization type, setting this parameter to -1 is the correct choice, while for the W4A8-g128 type, the correct setting should be 128.
Could you please help explain what might be causing this issue?
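For readers unfamiliar with the distinction drawn above, here is a minimal sketch of what the group size controls in weight quantization in general. The helper `num_scale_groups` is hypothetical and is not part of QServe; it only assumes the convention stated in this thread (-1 means per-channel) and uses 4096, the hidden size of Llama-3-8B, as an example input dimension. The actual scale layout in the QServe kernels may differ.

```python
def num_scale_groups(in_features: int, group_size: int) -> int:
    """Number of weight scales stored per output channel (illustrative only)."""
    if group_size == -1:
        # Per-channel: a single scale covers the whole input dimension.
        return 1
    if group_size <= 0 or in_features % group_size != 0:
        raise ValueError("group_size must be -1 or a positive divisor of in_features")
    # Per-group: one scale for every `group_size` consecutive input features.
    return in_features // group_size


print(num_scale_groups(4096, -1))   # per-channel
print(num_scale_groups(4096, 128))  # g128
```

Under this convention the two settings imply different numbers of scales for the same layer, so a loader that expects one layout and is given the other could plausibly misread the weights. Whether that is what happens here is not established in this thread.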

GCQi · 2025-01-17T09:01:50Z

Hello, I ran into the same error. Could you tell me how to fix it?

ys-2020 · 2025-01-17T20:04:52Z

Yes, setting the group size to -1 should be correct for a per-channel quantized model. Could you please tell us which checkpoint you are currently using?

oasis-Linmi · 2025-01-18T17:25:23Z

I haven't fixed this error yet. I downloaded the checkpoint from https://huggingface.co/mit-han-lab/Llama-3-8B-QServe, following the Usage and Examples section of README.md of QServe. I also noticed that similar errors occur in each of the provided per-channel models.
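The reported RuntimeError is the message PyTorch raises when sampling from a probability tensor that contains invalid entries, which suggests the logits are already corrupted before sampling. The pure-Python sketch below only illustrates how a single NaN logit invalidates every probability after softmax. The helpers `softmax` and `find_invalid` are hypothetical and are not taken from qserve_benchmark.py.

```python
import math


def softmax(logits):
    """Plain softmax; any NaN in the input makes the normalizer NaN."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def find_invalid(probs):
    """Indices of entries a sampler would reject: NaN, inf, or negative."""
    return [i for i, p in enumerate(probs) if not math.isfinite(p) or p < 0]


bad = softmax([1.0, float("nan"), 2.0])
print(find_invalid(bad))  # every index is reported, not just the NaN position
```

A check of this kind on the model's output logits, placed before the sampling step, could help narrow down whether the invalid values come from the quantized layers or from the sampling code.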

oasis-Linmi closed this as completed Dec 18, 2024

oasis-Linmi reopened this Dec 18, 2024
