[Question] How to fix error -> RuntimeError: #if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ >= 530) #279

@chensinit

Description

❓ General Questions

Hello.
I tried to build, but I got an error.
Please help me fix it :(

command: python build.py --hf-path=databricks/dolly-v2-3b
error: RuntimeError: #if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ >= 530)

\mlc-llm$ python build.py --hf-path=databricks/dolly-v2-3b
Weights exist at dist/models/dolly-v2-3b, skipping download.
Using path "dist/models/dolly-v2-3b" for model "dolly-v2-3b"
Database paths: ['log_db/vicuna-v1-7b', 'log_db/rwkv-raven-3b', 'log_db/rwkv-raven-1b5', 'log_db/redpajama-3b-q4f16', 'log_db/dolly-v2-3b', 'log_db/rwkv-raven-7b', 'log_db/redpajama-3b-q4f32']
Target configured: cuda -keys=cuda,gpu -arch=sm_61 -max_num_threads=1024 -max_shared_memory_per_block=49152 -max_threads_per_block=1024 -registers_per_block=65536 -thread_warp_size=32
Loading HuggingFace model into memory: dist/models/dolly-v2-3b
Note: This may take a while depending on the model size and your RAM availability, and could be particularly slow if it uses swap memory.
Loading done.
Dumping independent weights dumped to: dist/dolly-v2-3b-q3f16_0/raw_params
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 388/388 [00:06<00:00, 61.00it/s]
Loading model weights from: dist/dolly-v2-3b-q3f16_0/raw_params
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 388/388 [00:02<00:00, 154.51it/s]
Loading done
Automatically using target for weight quantization: cuda -keys=cuda,gpu -arch=sm_61 -max_num_threads=1024 -max_shared_memory_per_block=49152 -max_threads_per_block=1024 -registers_per_block=65536 -thread_warp_size=32
Traceback (most recent call last):
File "/home/gihun/IdeaProjects/230530_LLM/mlc-llm/build.py", line 409, in <module>
main()
File "/home/gihun/IdeaProjects/230530_LLM/mlc-llm/build.py", line 387, in main
mod = mod_transform_before_build(mod, params, ARGS)
File "/home/gihun/IdeaProjects/230530_LLM/mlc-llm/build.py", line 270, in mod_transform_before_build
new_params = utils.transform_params(mod_transform, model_params)
File "/home/gihun/IdeaProjects/230530_LLM/mlc-llm/mlc_llm/utils.py", line 155, in transform_params
ex = relax.build(mod_transform, target=target)
File "/home/gihun/IdeaProjects/230530_LLM/relax/python/tvm/relax/vm_build.py", line 338, in build
return _vmlink(builder, target, tir_mod, ext_libs, params, system_lib=system_lib)
File "/home/gihun/IdeaProjects/230530_LLM/relax/python/tvm/relax/vm_build.py", line 242, in _vmlink
lib = tvm.build(
File "/home/gihun/IdeaProjects/230530_LLM/relax/python/tvm/driver/build_module.py", line 281, in build
rt_mod_host = _driver_ffi.tir_to_runtime(annotated_mods, target_host)
File "/home/gihun/IdeaProjects/230530_LLM/relax/python/tvm/_ffi/_ctypes/packed_func.py", line 238, in __call__
raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
7: TVMFuncCall
6: tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<tvm::runtime::TypedPackedFunc<tvm::runtime::Module (tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target)>::AssignTypedLambda<tvm::__mk_TVM22::{lambda(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target)#1}>(tvm::__mk_TVM22::{lambda(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target)#1}, std::__cxx11::basic_string<char, std::char_traits, std::allocator >)::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, std::__cxx11::basic_string<char, std::char_traits, std::allocator >, tvm::runtime::TVMRetValue)
5: tvm::TIRToRuntime(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target const&)
4: tvm::codegen::Build(tvm::IRModule, tvm::Target)
3: _ZN3tvm7runtime13PackedFuncObj
2: tvm::runtime::TypedPackedFunc<tvm::runtime::Module (tvm::IRModule, tvm::Target)>::AssignTypedLambda<tvm::runtime::Module ()(tvm::IRModule, tvm::Target)>(tvm::runtime::Module ()(tvm::IRModule, tvm::Target), std::__cxx11::basic_string<char, std::char_traits, std::allocator >)::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}::operator()(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*) const
1: tvm::codegen::BuildCUDA(tvm::IRModule, tvm::Target)
0: tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<TVMFuncCreateFromCFunc::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#2}> >::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) [clone .cold]
File "/home/gihun/IdeaProjects/230530_LLM/relax/python/tvm/_ffi/_ctypes/packed_func.py", line 82, in cfun
rv = local_pyfunc(*pyargs)
File "/home/gihun/IdeaProjects/230530_LLM/relax/python/tvm/contrib/nvcc.py", line 189, in tvm_callback_cuda_compile
ptx = compile_cuda(code, target_format="fatbin")
File "/home/gihun/IdeaProjects/230530_LLM/relax/python/tvm/contrib/nvcc.py", line 113, in compile_cuda
raise RuntimeError(msg)
RuntimeError: #if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ >= 530)

...
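For context, sm_XY corresponds to __CUDA_ARCH__ = XY0 (sm_61 compiles with __CUDA_ARCH__ = 610), so the guard in the error requires a GPU of compute capability 5.3 or newer for fp16 code. A quick sketch (the helper is hypothetical, not part of mlc-llm or TVM) that checks the -arch value in a target string like the one logged above against that threshold:

```python
import re

def arch_satisfies_guard(target_str: str, min_cuda_arch: int = 530) -> bool:
    """Parse the -arch=sm_XY field of a TVM target string and compare it
    against a __CUDA_ARCH__ threshold (sm_XY maps to __CUDA_ARCH__ = XY0)."""
    m = re.search(r"-arch=sm_(\d+)", target_str)
    if m is None:
        raise ValueError("no -arch=sm_XY field found in target string")
    return int(m.group(1)) * 10 >= min_cuda_arch

# The target reported during the build above:
target = "cuda -keys=cuda,gpu -arch=sm_61 -max_num_threads=1024"
print(arch_satisfies_guard(target))  # sm_61 -> 610 >= 530 -> True
```

Since sm_61 passes the check on paper, the failure may come from nvcc itself rather than the GPU generation.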

Thank you!

Metadata

Labels: question (Question about the usage)
