version 2.1.0
What's New
New Features
- PyTorch and ONNX
- [BREAKING CHANGE]: AIMET QuantSim by default uses per-channel quantization for weights instead of per-tensor
- AIMET QuantSim exports encoding json schema version 1.0.0 by default
- PyTorch
- AIMET now quantizes scalar inputs of type torch.nn.Parameter; these were not quantized in prior releases
- Published a recipe for performing LoRA QAT: using LoRA adapters to recover the quantized accuracy of the base model. Includes recipes for weight-only (WQ) and weight-and-activation (QWA) QAT
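To illustrate why the switch from per-tensor to per-channel weight quantization is a breaking change, here is a minimal pure-Python sketch of the two schemes. It is not AIMET's implementation: the symmetric min-max scaling, the 8-bit width, the toy weight values, and every function name below are assumptions chosen for illustration only.

```python
def symmetric_scale(values, bitwidth=8):
    """Scale that maps the largest magnitude in `values` onto the signed integer grid."""
    qmax = 2 ** (bitwidth - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    return max_abs / qmax if max_abs > 0 else 1.0


def fake_quantize(values, scale, bitwidth=8):
    """Quantize to the integer grid, clamp, then dequantize back to floats."""
    qmax = 2 ** (bitwidth - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]


def max_abs_error(original, quantized):
    """Largest absolute difference between original and quantize-dequantized values."""
    return max(abs(a - b) for a, b in zip(original, quantized))


# Hypothetical weight with two output channels of very different magnitude.
weight = [
    [0.010, -0.020, 0.015],  # small-magnitude channel
    [1.000, -2.000, 1.500],  # large-magnitude channel
]

# Per-tensor: a single scale shared by every element of the weight.
flat = [v for row in weight for v in row]
per_tensor_scale = symmetric_scale(flat)
per_tensor = [fake_quantize(row, per_tensor_scale) for row in weight]

# Per-channel: one scale per output channel (one per row here).
per_channel_scales = [symmetric_scale(row) for row in weight]
per_channel = [fake_quantize(row, s) for row, s in zip(weight, per_channel_scales)]

# Compare the error on the small-magnitude channel under each scheme.
err_tensor_small = max_abs_error(weight[0], per_tensor[0])
err_channel_small = max_abs_error(weight[0], per_channel[0])
print(f"per-tensor error:  {err_tensor_small:.6f}")
print(f"per-channel error: {err_channel_small:.6f}")
```

In this toy case the shared per-tensor scale is dominated by the large channel, so the small channel is represented much more coarsely than it is with its own scale. Models that relied on the earlier per-tensor default may therefore produce different encodings and accuracy after upgrading.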
Bug Fixes
- PyTorch
- Fixed a bug that prevented Adaround from caching data samples with PyTorch versions 2.6 and later
Documentation
- Release main page: https://github.com/quic/aimet/releases/tag/2.1.0
- Documentation: https://quic.github.io/aimet-pages/releases/2.1.0/index.html
Packages
- aimet_torch-2.1.0+cu121-cp310-none-any.whl
- PyTorch 2.1 GPU package with Python 3.10 and CUDA 12.x
- aimet_torch-2.1.0+cpu-cp310-none-any.whl
- PyTorch 2.1 CPU package with Python 3.10 - If installing on a machine without CUDA
- aimet_onnx-2.1.0+cu118-cp310-cp310-manylinux_2_34_x86_64.whl
- ONNX 1.16 GPU package with Python 3.10 - Recommended for use with ONNX models
- aimet_onnx-2.1.0+cpu-cp310-cp310-manylinux_2_34_x86_64.whl
- ONNX 1.16 CPU package with Python 3.10 - If installing on a machine without CUDA
- aimet_tensorflow-2.1.0+cu118-cp310-cp310-manylinux_2_34_x86_64.whl
- TensorFlow 2.10 GPU package with Python 3.10 - Recommended for use with TensorFlow models
- aimet_tensorflow-2.1.0+cpu-cp310-cp310-manylinux_2_34_x86_64.whl
- TensorFlow 2.10 CPU package with Python 3.10 - If installing on a machine without CUDA
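As a rough aid for picking the right wheel from the list above, the following sketch maps a framework and a CUDA/CPU choice to a wheel filename and builds a pip command. The filenames are copied from the package list; the download URL layout (GitHub's usual `releases/download/<tag>/<file>` pattern) and the helper names `wheel_url` and `pip_command` are assumptions, so check the release page before relying on them.

```python
# Wheel filenames copied from the package list above, keyed by (framework, cuda).
WHEELS = {
    ("torch", True): "aimet_torch-2.1.0+cu121-cp310-none-any.whl",
    ("torch", False): "aimet_torch-2.1.0+cpu-cp310-none-any.whl",
    ("onnx", True): "aimet_onnx-2.1.0+cu118-cp310-cp310-manylinux_2_34_x86_64.whl",
    ("onnx", False): "aimet_onnx-2.1.0+cpu-cp310-cp310-manylinux_2_34_x86_64.whl",
    ("tensorflow", True): "aimet_tensorflow-2.1.0+cu118-cp310-cp310-manylinux_2_34_x86_64.whl",
    ("tensorflow", False): "aimet_tensorflow-2.1.0+cpu-cp310-cp310-manylinux_2_34_x86_64.whl",
}

# ASSUMPTION: release assets are served from GitHub's standard download path.
BASE_URL = "https://github.com/quic/aimet/releases/download/2.1.0"


def wheel_url(framework, cuda):
    """Return the assumed download URL for the given framework and CUDA choice."""
    try:
        name = WHEELS[(framework, cuda)]
    except KeyError:
        raise ValueError(f"no 2.1.0 wheel listed for {framework!r} (cuda={cuda})") from None
    return f"{BASE_URL}/{name}"


def pip_command(framework, cuda):
    """Build the pip invocation as an argument list (suitable for subprocess.run)."""
    return ["python3", "-m", "pip", "install", wheel_url(framework, cuda)]


# Example: CPU-only PyTorch package for a machine without CUDA.
print(" ".join(pip_command("torch", cuda=False)))
```

All listed wheels are tagged cp310, so the command should be run from a Python 3.10 environment.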