version 2.1.0

aimetci released this 11 Mar 20:50
· 84 commits to develop since this release

What's New

  • New Features

    • PyTorch and ONNX
      • [BREAKING CHANGE]: AIMET QuantSim now uses per-channel quantization for weights by default, instead of per-tensor
      • AIMET QuantSim now exports encodings using JSON schema version 1.0.0 by default
    • PyTorch
      • AIMET now quantizes scalar inputs of type torch.nn.Parameter; these were not quantized in prior releases
      • Published a recipe for LoRA QAT, which uses LoRA adapters to recover the quantized accuracy of the base model. Includes recipes for weight-only (WQ) and weight-and-activation (QWA) QAT
  • Bug Fixes

    • PyTorch
      • Fixed a bug that prevented Adaround from caching data samples with PyTorch versions 2.6 and later
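
To illustrate why the switch from per-tensor to per-channel weight quantization listed above can change results, here is a minimal sketch in plain Python. It does not use the AIMET API; the helper names (`symmetric_scale`, `fake_quantize`, `row_errors`) and the toy weight values are assumptions made for illustration only. It applies symmetric 8-bit quantize-dequantize to a small weight matrix, once with a single scale for the whole tensor and once with one scale per output channel (row).

```python
def symmetric_scale(values, bits=8):
    """Scale that maps the largest magnitude in `values` to the top of the signed range."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    return max_abs / qmax if max_abs > 0 else 1.0


def fake_quantize(values, scale, bits=8):
    """Quantize to signed integers with one shared scale, then map back to floats."""
    qmax = 2 ** (bits - 1) - 1
    qmin = -qmax - 1
    return [max(qmin, min(qmax, round(v / scale))) * scale for v in values]


def per_tensor(weight, bits=8):
    """One scale computed over every element of the weight matrix."""
    flat = [v for row in weight for v in row]
    scale = symmetric_scale(flat, bits)
    return [fake_quantize(row, scale, bits) for row in weight]


def per_channel(weight, bits=8):
    """One scale per row, where each row stands for one output channel."""
    return [fake_quantize(row, symmetric_scale(row, bits), bits) for row in weight]


def row_errors(original, quantized):
    """Largest absolute quantization error in each row."""
    return [
        max(abs(x - y) for x, y in zip(orig_row, quant_row))
        for orig_row, quant_row in zip(original, quantized)
    ]


# Hypothetical weights: channel 0 is much smaller in magnitude than channel 1.
weight = [
    [0.010, -0.020, 0.015],
    [4.000, -3.500, 2.750],
]

tensor_errors = row_errors(weight, per_tensor(weight))
channel_errors = row_errors(weight, per_channel(weight))

print("per-tensor  max error per channel:", tensor_errors)
print("per-channel max error per channel:", channel_errors)
```

With a single per-tensor scale, the large-magnitude channel sets the step size, so the small-magnitude channel is rounded coarsely. With per-channel scales, each row gets a step size matched to its own range, which is why the small channel's error drops while the large channel's error stays the same. Models whose accuracy or exported encodings were tuned under the previous per-tensor default may therefore behave differently after upgrading.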

Documentation

Packages

  • aimet_torch-2.1.0+cu121-cp310-none-any.whl
    • PyTorch 2.1 GPU package with Python 3.10 and CUDA 12.x
  • aimet_torch-2.1.0+cpu-cp310-none-any.whl
    • PyTorch 2.1 CPU package with Python 3.10 - Use when installing on a machine without CUDA
  • aimet_onnx-2.1.0+cu118-cp310-cp310-manylinux_2_34_x86_64.whl
    • ONNX 1.16 GPU package with Python 3.10 - Recommended for use with ONNX models
  • aimet_onnx-2.1.0+cpu-cp310-cp310-manylinux_2_34_x86_64.whl
    • ONNX 1.16 CPU package with Python 3.10 - Use when installing on a machine without CUDA
  • aimet_tensorflow-2.1.0+cu118-cp310-cp310-manylinux_2_34_x86_64.whl
    • TensorFlow 2.10 GPU package with Python 3.10 - Recommended for use with TensorFlow models
  • aimet_tensorflow-2.1.0+cpu-cp310-cp310-manylinux_2_34_x86_64.whl
    • TensorFlow 2.10 CPU package with Python 3.10 - Use when installing on a machine without CUDA