GPTQ fixes #115

HDCharles · 2024-04-03T00:46:48Z

Stack from ghstack (oldest at bottom):

-> GPTQ fixes #115

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]

Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: ghstack-source-id: d29b6d73c90dec5171e12938afee25e5f42e042d Pull Request resolved: #115

cpuhrsch · 2024-04-03T00:51:20Z

@HDCharles - Changed the base to main in the hope that CI will kick off

cpuhrsch · 2024-04-03T00:54:04Z

test/quantization/test_quant_api.py

    def test_gptq_quantizer_gpt_fast(self):
        from torchao.quantization.GPTQ import Int8DynActInt4WeightGPTQQuantizer, InputRecorder
        # should be similar to TorchCompileDynamicQuantizer
        precision = torch.bfloat16
        device = "cuda"
-        checkpoint_path = Path("../gpt-fast/checkpoints/meta-llama/Llama-2-7b-chat-hf/model.pth")
+        checkpoint_path = Path("/home/cdhernandez/local/gpt-fast/checkpoints/meta-llama/Llama-2-7b-chat-hf/model.pth")


We definitely can't store this on CI. Maybe open_llama_7b works better. gpt-fast supports that as well.
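One way to address the hard-coded home directory in the diff above is to read the checkpoint location from the environment and skip the test when the file is missing. This is only a minimal sketch of that idea, not the change that landed in the PR: the environment variable name `GPT_FAST_CHECKPOINT`, the helper `resolve_checkpoint_path`, and the test class name are all hypothetical, and the default path is the relative one from the removed line of the diff.

```python
import os
import unittest
from pathlib import Path

# Hypothetical environment variable name; not part of torchao or gpt-fast.
CHECKPOINT_ENV_VAR = "GPT_FAST_CHECKPOINT"
# Relative default copied from the removed line of the diff above.
DEFAULT_CHECKPOINT = "../gpt-fast/checkpoints/meta-llama/Llama-2-7b-chat-hf/model.pth"


def resolve_checkpoint_path() -> Path:
    """Return the checkpoint path from the environment, or the relative default."""
    return Path(os.environ.get(CHECKPOINT_ENV_VAR, DEFAULT_CHECKPOINT))


class CheckpointDependentTest(unittest.TestCase):
    """Placeholder test class; the name is illustrative only."""

    def test_gptq_quantizer_gpt_fast(self):
        checkpoint_path = resolve_checkpoint_path()
        if not checkpoint_path.is_file():
            # Skip instead of failing on machines (such as CI) without the weights.
            self.skipTest(f"checkpoint not found at {checkpoint_path}")
        # The GPTQ quantization test body would run here when the file exists.
```

With this shape, a developer machine can point the variable at a local checkpoint, while CI reports the test as skipped rather than failed.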

Summary: adding int4 gptq and eval support. Also fixed a few bugs relating to quantizing the activation both during gptq calculation and when calculating the output.

Test Plan: python test/quantization/test_quant_api.py

Reviewers: Subscribers: Tasks: Tags:

ghstack-source-id: d29b6d73c90dec5171e12938afee25e5f42e042d

Pull Request resolved: #115


Summary: adding int4 gptq and eval support. Also fixed a few bugs relating to quantizing the activation both during gptq calculation and when calculating the output.

Test Plan: python test/quantization/test_quant_api.py

Reviewers: Subscribers: Tasks: Tags:

ghstack-source-id: 9d293f86255d16fb813c1d20a4c2e0dc5360e1cc

Pull Request resolved: #115
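For readers unfamiliar with what "quantizing the activation" in the summary above refers to, the following is a rough, dependency-free illustration of dynamic per-token int8 fake quantization. It is not torchao's implementation: the function name `fake_quantize_per_token` is hypothetical, and the symmetric scale per row is an assumption chosen for brevity.

```python
from typing import List


def fake_quantize_per_token(rows: List[List[float]], n_bits: int = 8) -> List[List[float]]:
    """Quantize each row (token) to signed integers and dequantize it again.

    Assumption: symmetric quantization with one scale per row, computed
    dynamically from that row's largest absolute value.
    """
    qmax = 2 ** (n_bits - 1) - 1  # 127 for int8
    qmin = -qmax - 1              # -128 for int8
    out = []
    for row in rows:
        max_abs = max((abs(v) for v in row), default=0.0)
        # Guard against an all-zero row, which would give a zero scale.
        scale = max_abs / qmax if max_abs > 0 else 1.0
        quantized = [min(qmax, max(qmin, round(v / scale))) for v in row]
        out.append([q * scale for q in quantized])
    return out


activations = [[0.0, 0.5, -1.0], [10.0, -20.0, 5.0]]
dequantized = fake_quantize_per_token(activations)
```

The point the summary makes is that the same quantize and dequantize step has to be applied to the activation in both places: while the GPTQ calculation runs on the recorded inputs, and when the layer output is computed. Otherwise the weights are calibrated against activations that differ from the ones seen at inference.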

* add int4 gptq and eval
  Summary: adding int4 gptq and eval support. Also fixed a few bugs relating to quantizing the activation both during gptq calculation and when calculating the output.
  Test Plan: python test/quantization/test_quant_api.py
  Reviewers: Subscribers: Tasks: Tags:
  ghstack-source-id: d29b6d73c90dec5171e12938afee25e5f42e042d
  Pull Request resolved: #115
* add int4 gptq and eval
  Summary: adding int4 gptq and eval support. Also fixed a few bugs relating to quantizing the activation both during gptq calculation and when calculating the output.
  Test Plan: python test/quantization/test_quant_api.py
  Reviewers: Subscribers: Tasks: Tags:
* remove debug from GPTQ
  Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:



GPTQ fixes

0785bda

Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]

HDCharles added a commit that referenced this pull request Apr 3, 2024

GPTQ fixes

769c227

Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: ghstack-source-id: d29b6d73c90dec5171e12938afee25e5f42e042d Pull Request resolved: #115

facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 3, 2024

cpuhrsch changed the base branch from gh/HDCharles/6/base to main April 3, 2024 00:51

cpuhrsch reviewed Apr 3, 2024


HDCharles mentioned this pull request Apr 3, 2024

add int4 gptq and eval #116

Merged

Update on "GPTQ fixes"

893a866

Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]

HDCharles closed this in #116 Apr 3, 2024

yanbing-j pushed a commit to yanbing-j/ao that referenced this pull request Dec 9, 2024

Update README.md (pytorch#112)

31c8cbd

Update readme Update README.md (pytorch#113) update README.md Update README.md (pytorch#114) Update README.md (pytorch#115) Update Readme.md
