[Issue]: Some AITER Operators are not compatible with torch.compile #244

Description

@tjtanaa

Problem Description

torch.compile (Dynamo) is unable to trace some built-in Python functions into a graph, e.g. the `hasattr` call on the line marked with `+` below (highlighted in green in the original log):

ERROR 03-24 02:09:11 [core.py:343]     ck_config = get_CKGEMM_config(m, n, k)
ERROR 03-24 02:09:11 [core.py:343]   File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/polyfills/__init__.py", line 135, in getattr_and_trace
ERROR 03-24 02:09:11 [core.py:343]     return fn(*args[2:], **kwargs)
ERROR 03-24 02:09:11 [core.py:343]   File "/app/Quark/aiter/aiter/ops/gemm_op_a8w8.py", line 84, in get_CKGEMM_config
+ ERROR 03-24 02:09:11 [core.py:343]     if not hasattr(get_CKGEMM_config, "ckgemm_dict"):
ERROR 03-24 02:09:11 [core.py:343] 
ERROR 03-24 02:09:11 [core.py:343] Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information

The full logs can be found in the attachment.

llama_fp8-int8aiterck-v1.txt
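For reference, the failing line caches the config table as an attribute on the function object itself (`if not hasattr(get_CKGEMM_config, "ckgemm_dict"): ...` in aiter/ops/gemm_op_a8w8.py), a pattern Dynamo cannot trace. Below is a minimal sketch of the pattern and one Dynamo-friendly alternative; the names `get_config_buggy`, `get_config_fixed`, `_load_configs`, and `_CKGEMM_TABLE` are hypothetical stand-ins, not AITER's actual API, and the real loader reads tuned configs from disk rather than returning a literal dict:

```python
import torch

# Failing pattern (as in the traceback above): the config table is cached
# as an attribute on the function object itself, which Dynamo cannot trace.
def get_config_buggy(m, n, k):
    if not hasattr(get_config_buggy, "ckgemm_dict"):
        get_config_buggy.ckgemm_dict = {}  # real code loads tuned configs here
    return get_config_buggy.ckgemm_dict.get((m, n, k))

# Hypothetical Dynamo-friendly rewrite: hold the table in a plain
# module-level global, populated once at import time, so the traced
# function only does a dict lookup.
def _load_configs():
    return {}  # stand-in for reading the tuned CK GEMM configs from disk

_CKGEMM_TABLE = _load_configs()

def get_config_fixed(m, n, k):
    return _CKGEMM_TABLE.get((m, n, k))

@torch.compile(fullgraph=True)
def forward(x):
    get_config_fixed(128, 128, 128)  # traces cleanly; the buggy variant errors
    return x + 1

forward(torch.ones(4))
```

If lazy loading is required, another option (also an assumption, not verified here) is to wrap the loader in `torch._dynamo.disable`, which keeps the file I/O out of the traced graph at the cost of a graph break around the call.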

Operating System

Ubuntu 22.04.4 LTS (Jammy Jellyfish)

CPU

AMD EPYC 9654 96-Core Processor

GPU

AMD Instinct MI300X

ROCm Version

ROCm 6.3.1

ROCm Component

No response

Steps to Reproduce

Set up vLLM

git clone https://github.com/EmbeddedLLM/vllm.git --branch aiter-int8-linear aiter-int8-linear
cd aiter-int8-linear
# if you have installed vllm before
python3 -m pip uninstall -y vllm
export PYTORCH_ROCM_ARCH=gfx942
python3 setup.py develop

Run with the V1 engine
VLLM_USE_V1=1 VLLM_ROCM_USE_AITER=1 vllm serve neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a8

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response
