The Path to v1.0.0 #348

Open
8 tasks
PanQiWei opened this issue Sep 25, 2023 · 2 comments
Labels
enhancement New feature or request

Comments

PanQiWei (Collaborator) commented Sep 25, 2023

Hi everyone, long time no see! Starting this week, I will spend about 4 weeks gradually pushing AutoGPTQ to v1.0.0. In the meantime, 2~3 minor versions will be released as optimization or feature previews, so that you can try those updates as soon as they are finished and I can hear more from the community and get more feedback.

My vision is that by the time v1.0.0 is released, AutoGPTQ can serve as an automatic, extensible and flexible quantization backend for all language models written in PyTorch.

I am opening this issue to list everything that will be done (optimizations, new features, bug fixes, etc.) and to record development progress, so the contents below will be updated frequently.

Feel free to comment in the thread to give your opinions and suggestions!

Optimizations

  • refactor the code framework for future extensions while maintaining the important interfaces.
    • separate the quantization logic into a standalone module that serves as a mixin.
    • design an automatic structure recognition strategy to better support different models (hopefully even multi-modal and diffusion models).
  • speed up model packing after quantization.
  • support kernel fusion for more models to further speed up inference.

New Features

  • model sharding: split a model checkpoint into multiple files and load it from multiple files (see "Save and Load sharded gptq checkpoint" #364).
  • tensor parallelism for all kinds of QuantLinear that are supported by AutoGPTQ.
  • CLI: run common commands such as quantization and benchmark directly.
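
As a rough illustration of the checkpoint-splitting item above, here is a minimal sketch of how tensors could be grouped into size-limited shards with an index that maps each tensor to its file. This is not AutoGPTQ's implementation: the function names `plan_shards` and `build_index`, the `model-XXXXX-of-XXXXX.bin` file naming, and the example tensor names are all assumptions made for illustration.

```python
from typing import Dict, List


def plan_shards(tensor_sizes: Dict[str, int], max_shard_bytes: int) -> List[List[str]]:
    """Greedily group tensor names into shards of at most max_shard_bytes.

    A single tensor larger than the limit still gets a shard of its own.
    """
    shards: List[List[str]] = []
    current: List[str] = []
    current_bytes = 0
    for name, size in tensor_sizes.items():
        # Close the current shard if adding this tensor would exceed the limit.
        if current and current_bytes + size > max_shard_bytes:
            shards.append(current)
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += size
    if current:
        shards.append(current)
    return shards


def build_index(shards: List[List[str]]) -> Dict[str, str]:
    """Map every tensor name to the (hypothetical) shard file that holds it."""
    total = len(shards)
    weight_map: Dict[str, str] = {}
    for i, names in enumerate(shards, start=1):
        filename = f"model-{i:05d}-of-{total:05d}.bin"  # hypothetical naming scheme
        for name in names:
            weight_map[name] = filename
    return weight_map


# Hypothetical tensor names with sizes in bytes.
sizes = {"layer0.qweight": 4, "layer0.scales": 4, "layer1.qweight": 4, "lm_head.weight": 10}
shards = plan_shards(sizes, max_shard_bytes=8)
index = build_index(shards)
```

At load time, an index like this lets the loader open only the files that contain the tensors it needs, instead of reading one monolithic checkpoint.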

Bug Fixes

PanQiWei added the enhancement label Sep 25, 2023
PanQiWei pinned this issue Sep 25, 2023
@Auth0rM0rgan

Hi @PanQiWei, any updates regarding version 1.0.0?

Qubitium (Collaborator) commented Apr 27, 2024

@PanQiWei Can you rejoin @fxmarty and be more active in code reviews? Feels like the project needs at least 2 active maintainers to keep it up to speed and not overload any single person.

Qubitium unpinned this issue Nov 1, 2024