[Bug] Cannot build --quantization q3f16_1 model #1005

@isaac621

🐛 Bug

I hit an error while building vicuna-7b-v1.5 with q3f16_1 quantization. The same error also occurs when I try to build other models with q3f16_1.

To Reproduce

python3 -m mlc_llm.build --hf-path lmsys/vicuna-7b-v1.5 --target iphone --max-seq-len 4096 --quantization q3f16_1

Error Message

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/isaaclui/Desktop/oursky-project/mlc-llm/mlc_llm/build.py", line 13, in <module>
    main()
  File "/Users/isaaclui/Desktop/oursky-project/mlc-llm/mlc_llm/build.py", line 10, in main
    core.build_model_from_args(parsed_args)
  File "/Users/isaaclui/Desktop/oursky-project/mlc-llm/mlc_llm/core.py", line 653, in build_model_from_args
    build(mod, args)
  File "/Users/isaaclui/Desktop/oursky-project/mlc-llm/mlc_llm/core.py", line 514, in build
    mod_deploy = dl.ApplyDefaultSchedule(  # pylint: disable=not-callable
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/isaaclui/.pyenv/versions/3.11.5/lib/python3.11/site-packages/tvm/ir/transform.py", line 238, in __call__
    return _ffi_transform_api.RunPass(self, mod)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "tvm/_ffi/_cython/./packed_func.pxi", line 332, in tvm._ffi._cy3.core.PackedFuncBase.__call__
  File "tvm/_ffi/_cython/./packed_func.pxi", line 263, in tvm._ffi._cy3.core.FuncCall
  File "tvm/_ffi/_cython/./packed_func.pxi", line 252, in tvm._ffi._cy3.core.FuncCall3
  File "tvm/_ffi/_cython/./base.pxi", line 182, in tvm._ffi._cy3.core.CHECK_CALL
  File "/Users/isaaclui/.pyenv/versions/3.11.5/lib/python3.11/site-packages/tvm/_ffi/base.py", line 476, in raise_last_ffi_error
    raise py_err
  File "tvm/_ffi/_cython/./packed_func.pxi", line 56, in tvm._ffi._cy3.core.tvm_callback
  File "/Users/isaaclui/.pyenv/versions/3.11.5/lib/python3.11/site-packages/tvm/ir/transform.py", line 307, in _pass_func
    return inst.transform_module(mod, ctx)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/isaaclui/.pyenv/versions/3.11.5/lib/python3.11/site-packages/tvm/dlight/base/transform.py", line 64, in transform_module
    sch = _apply_rules(func, target, self.rules, tunable=False)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/isaaclui/.pyenv/versions/3.11.5/lib/python3.11/site-packages/tvm/dlight/base/transform.py", line 80, in _apply_rules
    space = rule.apply(func, target, tunable)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/isaaclui/.pyenv/versions/3.11.5/lib/python3.11/site-packages/tvm/dlight/gpu/gemv.py", line 191, in apply
    is_inner_reduction = normalize(sch, block_info)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/isaaclui/.pyenv/versions/3.11.5/lib/python3.11/site-packages/tvm/dlight/gpu/gemv.py", line 122, in normalize
    is_inner_reduction = iter_to_info[inner_axis].kind == "R"
                         ~~~~~~~~~~~~^^^^^^^^^^^^
KeyError: IterSplit(IterMark(v1, extent=T.int64(4096)), lower_factor=T.int64(1), extent=T.int64(4096), scale=T.int64(1))
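The last two frames show a plain mapping lookup, `iter_to_info[inner_axis]`, raising because `inner_axis` is not a key in the mapping. Below is a minimal sketch of that failure mode and a defensive variant. It is not TVM's code: `IterInfo`, `is_inner_reduction`, and the string keys are hypothetical stand-ins, and whether skipping the rule is the right fix for dlight is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class IterInfo:
    """Hypothetical stand-in for per-loop metadata: "S" = spatial, "R" = reduction."""
    kind: str


def is_inner_reduction(iter_to_info: Dict[str, IterInfo], inner_axis: str) -> Optional[bool]:
    """Return whether the inner axis is a reduction axis, or None if it is unknown.

    The failing line in the traceback indexes the mapping directly, which raises
    KeyError when the key is missing. Using .get() lets the caller decide to
    skip the schedule rule instead of crashing (an assumed policy, not TVM's).
    """
    info = iter_to_info.get(inner_axis)
    if info is None:
        return None
    return info.kind == "R"


# Illustrative data only: "v1" is deliberately missing, mirroring the KeyError above.
iter_to_info = {"v0": IterInfo("S"), "v2": IterInfo("R")}

print(is_inner_reduction(iter_to_info, "v2"))
print(is_inner_reduction(iter_to_info, "v1"))
```

A `None` result here would mean the rule cannot classify the block, so the caller could fall back to another schedule rule rather than abort the build.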

Labels: bug (Confirmed bugs)
