[Bug] TVMError: Check failed: shape[0] == fill_count (7 vs. 6) : Requested shape do not match the filled count #333

@felixslu

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

  1. python build.py --hf-path=databricks/dolly-v2-3b (this step completes without errors)
  2. python evaluate.py --artifact-path dist --model dolly-v2-3b --quantization q3f16_0

Expected behavior

Inference should complete without errors. Instead, when running inference, I got this error:

Tokenizing...
Running inference...
Traceback (most recent call last):
  File "/mlc-llm/evaluate.py", line 178, in <module>
    deploy_to_pipeline(ARGS)
  File "/mlc-llm/evaluate.py", line 136, in deploy_to_pipeline
    first_k_cache = fcache_view(kv_caches[0], ShapeTuple([7, 32, 128]))
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/mlc-llm/lib/python3.11/site-packages/tvm/_ffi/_ctypes/packed_func.py", line 238, in __call__
    raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
  2: TVMFuncCall
  1: tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<tvm::runtime::relax_vm::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)
  0: tvm::runtime::relax_vm::AttentionKVCacheObj::View(tvm::runtime::ShapeTuple const&)
  File "/workspace/tvm/src/runtime/relax_vm/lm_support.cc", line 78
TVMError: Check failed: shape[0] == fill_count (7 vs. 6) : Requested shape do not match the filled count
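
To make the failing condition concrete, here is a minimal pure-Python sketch of what the check appears to enforce, judging only from the error message: the first dimension of the requested view must equal the number of entries already written to the cache. `ToyKVCache`, `append`, and `view` are hypothetical names for illustration; this is not the TVM implementation and does not call any TVM API.

```python
# Hypothetical stand-in for the runtime KV cache; NOT the TVM implementation.
class ToyKVCache:
    def __init__(self, max_len, num_heads, head_dim):
        self.max_len = max_len
        self.num_heads = num_heads
        self.head_dim = head_dim
        self.fill_count = 0  # number of token positions appended so far

    def append(self, num_tokens):
        # Record that num_tokens more positions have been written.
        if self.fill_count + num_tokens > self.max_len:
            raise ValueError("cache overflow")
        self.fill_count += num_tokens

    def view(self, shape):
        # Mirrors the condition quoted in the error message:
        # the requested leading dimension must equal the fill count.
        if shape[0] != self.fill_count:
            raise ValueError(
                f"Check failed: shape[0] == fill_count "
                f"({shape[0]} vs. {self.fill_count})"
            )
        return tuple(shape)


cache = ToyKVCache(max_len=16, num_heads=32, head_dim=128)
cache.append(6)  # assume six positions were filled, as in the report

try:
    cache.view((7, 32, 128))  # same leading dimension as in the traceback
    message = None
except ValueError as err:
    message = str(err)

print(message)

# Requesting the view with the actual fill count passes the check.
ok_shape = cache.view((cache.fill_count, 32, 128))
```

In this toy model the view succeeds once the requested leading dimension matches the fill count. The traceback alone does not show whether evaluate.py computes the wrong length (7) or the cache holds one entry fewer than expected (6).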

Environment

  • Platform (e.g. Intel):
  • Operating system (e.g. Ubuntu):
  • Device ( PC+RTX 3090, ...)
  • How you installed MLC-LLM (source):
  • How you installed TVM-Unity (pip):
  • Python version (e.g. 3.11):
  • GPU driver version (if applicable):
  • CUDA/cuDNN version (if applicable):
  • TVM Unity Hash Tag (python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))", applicable if you compile models):
  • Any other relevant information:

Additional context

TTBWACe5vd
