Description
🐛 Bug
To Reproduce
Steps to reproduce the behavior:
- python build.py --hf-path=databricks/dolly-v2-3b (this succeeds)
- python evaluate.py --artifact-path dist --model dolly-v2-3b --quantization q3f16_0
Expected behavior
When running inference, I got this error:
Tokenizing...
Running inference...
Traceback (most recent call last):
  File "/mlc-llm/evaluate.py", line 178, in <module>
    deploy_to_pipeline(ARGS)
  File "/mlc-llm/evaluate.py", line 136, in deploy_to_pipeline
    first_k_cache = fcache_view(kv_caches[0], ShapeTuple([7, 32, 128]))
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/mlc-llm/lib/python3.11/site-packages/tvm/_ffi/_ctypes/packed_func.py", line 238, in __call__
    raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
2: TVMFuncCall
1: tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<tvm::runtime::relax_vm::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)
0: tvm::runtime::relax_vm::AttentionKVCacheObj::View(tvm::runtime::ShapeTuple const&)
File "/workspace/tvm/src/runtime/relax_vm/lm_support.cc", line 78
TVMError: Check failed: shape[0] == fill_count (7 vs. 6) : Requested shape do not match the filled count
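For context, the check that fails in lm_support.cc compares the first dimension of the requested view shape against the KV cache's fill count (how many token positions have actually been appended). The sketch below is a toy Python model of that invariant, not the real TVM API; the class and method names are illustrative assumptions:

```python
class ToyKVCache:
    """Illustrative stand-in for TVM's AttentionKVCache fill-count check."""

    def __init__(self, num_heads: int, head_dim: int):
        self.num_heads = num_heads
        self.head_dim = head_dim
        self.fill_count = 0  # token positions appended so far

    def append(self, num_tokens: int) -> None:
        # Each prefilled/decoded token advances the fill count.
        self.fill_count += num_tokens

    def view(self, shape: tuple) -> tuple:
        # Mirrors the failing check: the requested sequence length
        # (shape[0]) must equal the number of filled positions.
        if shape[0] != self.fill_count:
            raise ValueError(
                f"Check failed: shape[0] == fill_count "
                f"({shape[0]} vs. {self.fill_count}): "
                "Requested shape do not match the filled count"
            )
        return shape  # the real runtime returns an NDArray view here


cache = ToyKVCache(num_heads=32, head_dim=128)
cache.append(6)  # only 6 positions filled
try:
    cache.view((7, 32, 128))  # requesting 7, as in the report above
except ValueError as err:
    print(err)
```

In the reported run the evaluate script requests a view of 7 positions while the cache was only filled with 6, which is the same mismatch this sketch reproduces.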
Environment
- Platform (e.g. Intel):
- Operating system (e.g. Ubuntu):
- Device ( PC+RTX 3090, ...)
- How you installed MLC-LLM (source):
- How you installed TVM-Unity (pip):
- Python version (e.g. 3.11):
- GPU driver version (if applicable):
- CUDA/cuDNN version (if applicable):
- TVM Unity Hash Tag (python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))", applicable if you compile models):
- Any other relevant information:
