
failed to build model to library for Android #133

@huanyingjun

Description

Hi,

I am trying to build the model for Android according to the guide. When I run

python3 build.py --model vicuna-v1-7b --quantization q4f16_0 --target android --max-seq-len 768

I get the error below:

(mlc) wj:~/work/mlc-llm$ python3 build.py --model vicuna-v1-7b --quantization q4f16_0 --target android --max-seq-len 768
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [27:22<00:00, 821.24s/it]
Total param size: 3.5307693481445312 GB
Start storing to cache dist/vicuna-v1-7b-q4f16_0/params
[0519/0519] saving param_518
All finished, 132 total shards committed, record saved to dist/vicuna-v1-7b-q4f16_0/params/ndarray-cache.json
Save a cached module to dist/vicuna-v1-7b-q4f16_0/mod_cache_before_build_android.pkl.
19 static functions: [I.GlobalVar("take_decode"), I.GlobalVar("reshape"), I.GlobalVar("rms_norm1"), I.GlobalVar("fused_decode1_fused_matmul3_multiply"), I.GlobalVar("squeeze"), I.GlobalVar("fused_decode3_fused_matmul5_cast2"), I.GlobalVar("fused_decode_matmul1"), I.GlobalVar("softmax1"), I.GlobalVar("fused_reshape2_squeeze"), I.GlobalVar("fused_decode1_fused_matmul3_silu"), I.GlobalVar("fused_decode_fused_matmul1_add"), I.GlobalVar("fused_transpose3_reshape4"), I.GlobalVar("transpose1"), I.GlobalVar("rotary_embedding1"), I.GlobalVar("divide1"), I.GlobalVar("slice1"), I.GlobalVar("reshape1"), I.GlobalVar("reshape2"), I.GlobalVar("fused_decode2_fused_matmul4_add")]
26 dynamic functions: [I.GlobalVar("fused_decode4_NT_matmul1"), I.GlobalVar("matmul2"), I.GlobalVar("transpose2"), I.GlobalVar("take_decode1"), I.GlobalVar("reshape7"), I.GlobalVar("full"), I.GlobalVar("fused_softmax2_cast4"), I.GlobalVar("transpose7"), I.GlobalVar("reshape6"), I.GlobalVar("slice"), I.GlobalVar("fused_NT_matmul2_divide2_maximum1_minimum1_cast3"), I.GlobalVar("extend_te"), I.GlobalVar("rms_norm"), I.GlobalVar("rotary_embedding"), I.GlobalVar("squeeze1"), I.GlobalVar("reshape8"), I.GlobalVar("fused_decode6_fused_NT_matmul4_add1"), I.GlobalVar("fused_decode4_fused_NT_matmul1_add1"), I.GlobalVar("matmul10"), I.GlobalVar("fused_decode5_fused_NT_matmul3_silu1"), I.GlobalVar("fused_min_max_triu_te_broadcast_to"), I.GlobalVar("reshape3"), I.GlobalVar("fused_decode5_fused_NT_matmul3_multiply1"), I.GlobalVar("reshape5"), I.GlobalVar("fused_NT_matmul_divide_maximum_minimum_cast"), I.GlobalVar("fused_softmax_cast1")]
Dump static shape TIR to dist/vicuna-v1-7b-q4f16_0/debug/mod_tir_static.py
Dump dynamic shape TIR to dist/vicuna-v1-7b-q4f16_0/debug/mod_tir_dynamic.py
- Dispatch to pre-scheduled op: matmul2
- Dispatch to pre-scheduled op: fused_softmax2_cast4
- Dispatch to pre-scheduled op: fused_NT_matmul2_divide2_maximum1_minimum1_cast3
- Dispatch to pre-scheduled op: rms_norm
- Dispatch to pre-scheduled op: matmul10
- Dispatch to pre-scheduled op: fused_min_max_triu_te_broadcast_to
- Dispatch to pre-scheduled op: fused_softmax_cast1
- Dispatch to pre-scheduled op: fused_NT_matmul_divide_maximum_minimum_cast
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /mnt/e/ubuntu_code/mlc-llm/build.py:221 in <module>                                      │
│                                                                                                  │
│   218 │   │   )                                                                                  │
│   219 │   │   mod = pickle.load(open(cache_path, "rb"))                                          │
│   220 │   dump_split_tir(mod)                                                                    │
│ ❱ 221 │   build(mod, ARGS)                                                                       │
│   222                                                                                            │
│                                                                                                  │
│ /mnt/e/ubuntu_code/mlc-llm/build.py:158 in build                                         │
│                                                                                                  │
│   155 │                                                                                          │
│   156 │   debug_dump_script(mod_deploy, "mod_build_stage.py", args)                              │
│   157 │                                                                                          │
│ ❱ 158 │   ex = relax.build(mod_deploy, args.target, system_lib=args.system_lib)                  │
│   159 │                                                                                          │
│   160 │   output_filename = (                                                                    │
│   161 │   │   f"{args.model}-{args.quantization.name}-{target_kind}.{args.lib_format}"           │
│                                                                                                  │
│ /home/wj/work/relax/python/tvm/relax/vm_build.py:337 in build                         │
│                                                                                                  │
│   334 │   builder = relax.ExecBuilder()                                                          │
│   335 │   leftover_mod = _vmcodegen(builder, new_mod, exec_mode=exec_mode)                       │
│   336 │   tir_mod = _filter_tir(leftover_mod)                                                    │
│ ❱ 337 │   return _vmlink(builder, target, tir_mod, ext_libs, params, system_lib=system_lib)      │
│   338                                                                                            │
│   339                                                                                            │
│   340 def _filter_tir(mod: tvm.IRModule) -> tvm.IRModule:                                        │
│                                                                                                  │
│ /home/wj/work/relax/python/tvm/relax/vm_build.py:242 in _vmlink                       │
│                                                                                                  │
│   239 │   │   ext_libs = []                                                                      │
│   240 │   lib = None                                                                             │
│   241 │   if tir_mod is not None:                                                                │
│ ❱ 242 │   │   lib = tvm.build(                                                                   │
│   243 │   │   │   tir_mod, target=target, runtime=_autodetect_system_lib_req(target, system_li   │
│   244 │   │   )                                                                                  │
│   245 │   return Executable(_ffi_api.VMLink(builder, target, lib, ext_libs, params))  # type:    │
│                                                                                                  │
│ /home/wj/work/relax/python/tvm/driver/build_module.py:281 in build                    │
│                                                                                                  │
│   278 │                                                                                          │
│   279 │   annotated_mods, target_host = Target.canon_target_map_and_host(annotated_mods, targe   │
│   280 │                                                                                          │
│ ❱ 281 │   rt_mod_host = _driver_ffi.tir_to_runtime(annotated_mods, target_host)                  │
│   282 │                                                                                          │
│   283 │   annotated_mods, target_host = Target.canon_target_map_and_host(annotated_mods, targe   │
│   284                                                                                            │
│                                                                                                  │
│ /home/wj/work/relax/python/tvm/_ffi/_ctypes/packed_func.py:238 in __call__            │
│                                                                                                  │
│   235 │   │   │   )                                                                              │
│   236 │   │   │   != 0                                                                           │
│   237 │   │   ):                                                                                 │
│ ❱ 238 │   │   │   raise get_last_ffi_error()                                                     │
│   239 │   │   _ = temp_args                                                                      │
│   240 │   │   _ = args                                                                           │
│   241 │   │   return RETURN_SWITCH[ret_tcode.value](ret_val)                                     │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TVMError: Traceback (most recent call last):
  11: TVMFuncCall
  10: tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<tvm::runtime::TypedPackedFunc<tvm::runtime::Module (tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&,
tvm::Target)>::AssignTypedLambda<tvm::{lambda(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target)#6}>(tvm::{lambda(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void>
const&, tvm::Target)#6}, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}>
>::Call(tvm::runtime::PackedFuncObj const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tvm::runtime::TVMRetValue)
  9: tvm::TIRToRuntime(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target const&)
  8: tvm::codegen::Build(tvm::IRModule, tvm::Target)
  7: tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<tvm::runtime::TypedPackedFunc<tvm::runtime::Module (tvm::IRModule, tvm::Target)>::AssignTypedLambda<tvm::runtime::Module
(*)(tvm::IRModule, tvm::Target)>(tvm::runtime::Module (*)(tvm::IRModule, tvm::Target), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda(tvm::runtime::TVMArgs const&,
tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)
  6: tvm::codegen::BuildOpenCL(tvm::IRModule, tvm::Target)
  5: tvm::runtime::OpenCLModuleCreate(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >,
std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tvm::runtime::FunctionInfo, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > const, tvm::runtime::FunctionInfo> > >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)
  4: tvm::runtime::OpenCLModuleNode::Init()
  3: tvm::runtime::cl::OpenCLWorkspace::Init()
  2: tvm::runtime::cl::OpenCLWorkspace::Init(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
  1: tvm::runtime::cl::GetPlatformIDs()
  0: clGetPlatformIDs
  File "/mnt/e/ubuntu_code/relax/src/runtime/opencl/opencl_wrapper/opencl_wrapper.cc", line 106
TVMError:
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
  Check failed: (m_libHandler != nullptr) is false: Error! Cannot open libOpenCL!
free(): invalid pointer
Aborted (core dumped)

I am using a WSL Ubuntu environment, and this Ubuntu installation does not have OpenCL.
However, the model is being built for Android and will run on an Android ARM phone, so the Ubuntu host's OpenCL runtime should not be needed.
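For context, the traceback ends in opencl_wrapper.cc with "Cannot open libOpenCL!", which suggests TVM's OpenCL module tries to dlopen the host's libOpenCL even when the kernels are only being cross-compiled for Android. A quick host-side check you can run before build.py (a hypothetical diagnostic sketch, not part of mlc-llm):

```python
import ctypes
import ctypes.util

def host_has_opencl() -> bool:
    """Return True if the host can locate and load libOpenCL.

    The build aborts in opencl_wrapper.cc when this library is
    missing, even though the target is an Android phone.
    """
    path = ctypes.util.find_library("OpenCL")
    if path is None:
        return False
    try:
        ctypes.CDLL(path)  # same dlopen the TVM wrapper attempts
        return True
    except OSError:
        return False

print("host libOpenCL available:", host_has_opencl())
```

If this prints False, installing an OpenCL ICD loader on the WSL host (for example the ocl-icd-opencl-dev package on Ubuntu) may be enough to satisfy the dlopen; this is an assumption based on the wrapper error, not a confirmed fix.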

Could you please help check this?
Thank you very much.
