Closed
Labels: documentation (Improvements or additions to documentation)
Description
Hi,
I am trying to build a model for Android according to the guide. When I run
python3 build.py --model vicuna-v1-7b --quantization q4f16_0 --target android --max-seq-len 768
I get the error below:
(mlc) wj:~/work/mlc-llm$ python3 build.py --model vicuna-v1-7b --quantization q4f16_0 --target android --max-seq-len 768
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [27:22<00:00, 821.24s/it]
Total param size: 3.5307693481445312 GB
Start storing to cache dist/vicuna-v1-7b-q4f16_0/params
[0519/0519] saving param_518
All finished, 132 total shards committed, record saved to dist/vicuna-v1-7b-q4f16_0/params/ndarray-cache.json
Save a cached module to dist/vicuna-v1-7b-q4f16_0/mod_cache_before_build_android.pkl.
19 static functions: [I.GlobalVar("take_decode"), I.GlobalVar("reshape"), I.GlobalVar("rms_norm1"), I.GlobalVar("fused_decode1_fused_matmul3_multiply"), I.GlobalVar("squeeze"), I.GlobalVar("fused_decode3_fused_matmul5_cast2"), I.GlobalVar("fused_decode_matmul1"), I.GlobalVar("softmax1"), I.GlobalVar("fused_reshape2_squeeze"), I.GlobalVar("fused_decode1_fused_matmul3_silu"), I.GlobalVar("fused_decode_fused_matmul1_add"), I.GlobalVar("fused_transpose3_reshape4"), I.GlobalVar("transpose1"), I.GlobalVar("rotary_embedding1"), I.GlobalVar("divide1"), I.GlobalVar("slice1"), I.GlobalVar("reshape1"), I.GlobalVar("reshape2"), I.GlobalVar("fused_decode2_fused_matmul4_add")]
26 dynamic functions: [I.GlobalVar("fused_decode4_NT_matmul1"), I.GlobalVar("matmul2"), I.GlobalVar("transpose2"), I.GlobalVar("take_decode1"), I.GlobalVar("reshape7"), I.GlobalVar("full"), I.GlobalVar("fused_softmax2_cast4"), I.GlobalVar("transpose7"), I.GlobalVar("reshape6"), I.GlobalVar("slice"), I.GlobalVar("fused_NT_matmul2_divide2_maximum1_minimum1_cast3"), I.GlobalVar("extend_te"), I.GlobalVar("rms_norm"), I.GlobalVar("rotary_embedding"), I.GlobalVar("squeeze1"), I.GlobalVar("reshape8"), I.GlobalVar("fused_decode6_fused_NT_matmul4_add1"), I.GlobalVar("fused_decode4_fused_NT_matmul1_add1"), I.GlobalVar("matmul10"), I.GlobalVar("fused_decode5_fused_NT_matmul3_silu1"), I.GlobalVar("fused_min_max_triu_te_broadcast_to"), I.GlobalVar("reshape3"), I.GlobalVar("fused_decode5_fused_NT_matmul3_multiply1"), I.GlobalVar("reshape5"), I.GlobalVar("fused_NT_matmul_divide_maximum_minimum_cast"), I.GlobalVar("fused_softmax_cast1")]
Dump static shape TIR to dist/vicuna-v1-7b-q4f16_0/debug/mod_tir_static.py
Dump dynamic shape TIR to dist/vicuna-v1-7b-q4f16_0/debug/mod_tir_dynamic.py
- Dispatch to pre-scheduled op: matmul2
- Dispatch to pre-scheduled op: fused_softmax2_cast4
- Dispatch to pre-scheduled op: fused_NT_matmul2_divide2_maximum1_minimum1_cast3
- Dispatch to pre-scheduled op: rms_norm
- Dispatch to pre-scheduled op: matmul10
- Dispatch to pre-scheduled op: fused_min_max_triu_te_broadcast_to
- Dispatch to pre-scheduled op: fused_softmax_cast1
- Dispatch to pre-scheduled op: fused_NT_matmul_divide_maximum_minimum_cast
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /mnt/e/ubuntu_code/mlc-llm/build.py:221 in <module>                                      │
│                                                                                                  │
│   218 │   │   )                                                                                  │
│   219 │   │   mod = pickle.load(open(cache_path, "rb"))                                          │
│   220 │   dump_split_tir(mod)                                                                    │
│ ❱ 221 │   build(mod, ARGS)                                                                       │
│   222                                                                                            │
│                                                                                                  │
│ /mnt/e/ubuntu_code/mlc-llm/build.py:158 in build                                         │
│                                                                                                  │
│   155 │                                                                                          │
│   156 │   debug_dump_script(mod_deploy, "mod_build_stage.py", args)                              │
│   157 │                                                                                          │
│ ❱ 158 │   ex = relax.build(mod_deploy, args.target, system_lib=args.system_lib)                  │
│   159 │                                                                                          │
│   160 │   output_filename = (                                                                    │
│   161 │   │   f"{args.model}-{args.quantization.name}-{target_kind}.{args.lib_format}"           │
│                                                                                                  │
│ /home/wj/work/relax/python/tvm/relax/vm_build.py:337 in build                         │
│                                                                                                  │
│   334 │   builder = relax.ExecBuilder()                                                          │
│   335 │   leftover_mod = _vmcodegen(builder, new_mod, exec_mode=exec_mode)                       │
│   336 │   tir_mod = _filter_tir(leftover_mod)                                                    │
│ ❱ 337 │   return _vmlink(builder, target, tir_mod, ext_libs, params, system_lib=system_lib)      │
│   338                                                                                            │
│   339                                                                                            │
│   340 def _filter_tir(mod: tvm.IRModule) -> tvm.IRModule:                                        │
│                                                                                                  │
│ /home/wj/work/relax/python/tvm/relax/vm_build.py:242 in _vmlink                       │
│                                                                                                  │
│   239 │   │   ext_libs = []                                                                      │
│   240 │   lib = None                                                                             │
│   241 │   if tir_mod is not None:                                                                │
│ ❱ 242 │   │   lib = tvm.build(                                                                   │
│   243 │   │   │   tir_mod, target=target, runtime=_autodetect_system_lib_req(target, system_li   │
│   244 │   │   )                                                                                  │
│   245 │   return Executable(_ffi_api.VMLink(builder, target, lib, ext_libs, params))  # type:    │
│                                                                                                  │
│ /home/wj/work/relax/python/tvm/driver/build_module.py:281 in build                    │
│                                                                                                  │
│   278 │                                                                                          │
│   279 │   annotated_mods, target_host = Target.canon_target_map_and_host(annotated_mods, targe   │
│   280 │                                                                                          │
│ ❱ 281 │   rt_mod_host = _driver_ffi.tir_to_runtime(annotated_mods, target_host)                  │
│   282 │                                                                                          │
│   283 │   annotated_mods, target_host = Target.canon_target_map_and_host(annotated_mods, targe   │
│   284                                                                                            │
│                                                                                                  │
│ /home/wj/work/relax/python/tvm/_ffi/_ctypes/packed_func.py:238 in __call__            │
│                                                                                                  │
│   235 │   │   │   )                                                                              │
│   236 │   │   │   != 0                                                                           │
│   237 │   │   ):                                                                                 │
│ ❱ 238 │   │   │   raise get_last_ffi_error()                                                     │
│   239 │   │   _ = temp_args                                                                      │
│   240 │   │   _ = args                                                                           │
│   241 │   │   return RETURN_SWITCH[ret_tcode.value](ret_val)                                     │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TVMError: Traceback (most recent call last):
  11: TVMFuncCall
  10: tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<tvm::runtime::TypedPackedFunc<tvm::runtime::Module (tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&,
tvm::Target)>::AssignTypedLambda<tvm::{lambda(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target)#6}>(tvm::{lambda(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void>
const&, tvm::Target)#6}, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}>
>::Call(tvm::runtime::PackedFuncObj const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tvm::runtime::TVMRetValue)
  9: tvm::TIRToRuntime(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target const&)
  8: tvm::codegen::Build(tvm::IRModule, tvm::Target)
  7: tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<tvm::runtime::TypedPackedFunc<tvm::runtime::Module (tvm::IRModule, tvm::Target)>::AssignTypedLambda<tvm::runtime::Module
(*)(tvm::IRModule, tvm::Target)>(tvm::runtime::Module (*)(tvm::IRModule, tvm::Target), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda(tvm::runtime::TVMArgs const&,
tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)
  6: tvm::codegen::BuildOpenCL(tvm::IRModule, tvm::Target)
  5: tvm::runtime::OpenCLModuleCreate(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >,
std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tvm::runtime::FunctionInfo, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > const, tvm::runtime::FunctionInfo> > >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)
  4: tvm::runtime::OpenCLModuleNode::Init()
  3: tvm::runtime::cl::OpenCLWorkspace::Init()
  2: tvm::runtime::cl::OpenCLWorkspace::Init(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
  1: tvm::runtime::cl::GetPlatformIDs()
  0: clGetPlatformIDs
  File "/mnt/e/ubuntu_code/relax/src/runtime/opencl/opencl_wrapper/opencl_wrapper.cc", line 106
TVMError:
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
  Check failed: (m_libHandler != nullptr) is false: Error! Cannot open libOpenCL!
free(): invalid pointer
Aborted (core dumped)
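The error indicates that TVM's OpenCL wrapper failed to dlopen libOpenCL on the host. Whether any OpenCL loader is present at all can be checked with a plain linker-cache query (nothing mlc-llm-specific):

# Empty output means no libOpenCL is installed on the host
ldconfig -p | grep -i libopencl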
I am using a WSL Ubuntu environment, and Ubuntu does not have OpenCL installed. But this model is being built for Android and will run on an Android ARM phone, so it should not need the Ubuntu host's OpenCL runtime.
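In case it is useful, one workaround I am considering (this is my assumption, not something from the guide: since the failure is just a dlopen of libOpenCL on the host, installing the generic ICD loader might be enough to let codegen proceed) is:

# Generic OpenCL ICD loader: provides libOpenCL.so.1 / libOpenCL.so without any GPU driver.
# pocl-opencl-icd additionally registers a CPU OpenCL platform so clGetPlatformIDs can succeed.
# Whether this satisfies the Android cross-build is unverified on my side.
sudo apt-get install ocl-icd-libopencl1 ocl-icd-opencl-dev pocl-opencl-icd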
Could you please help look into this?
Thank you very much.