Description
"Getting Started with MLC-LLM using the Llama 2 Model" jupyter notebook is not working in colab?!
I ran the notebook and got the following error message
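For context, the notebook's flow up to the failing call is roughly the following. This is a paraphrased sketch rather than a verbatim copy, and the model name `Llama-2-7b-chat-hf-q4f16_1` is my assumption about which prebuilt artifact the notebook loads:

```python
# Sketch of the notebook flow, assuming the mlc_chat package and a prebuilt
# quantized Llama-2 7B chat model (exact model name is an assumption).
from mlc_chat import ChatModule
from mlc_chat.callback import StreamToStdout

# Loads the quantized weights and the precompiled CUDA model library.
cm = ChatModule(model="Llama-2-7b-chat-hf-q4f16_1")

# This is the call that crashes: the prefill step launches the compiled
# CUDA kernels, which is where CUDA_ERROR_NO_BINARY_FOR_GPU surfaces.
output = cm.generate(
    prompt="When was Python released?",
    progress_callback=StreamToStdout(callback_interval=2),
)
```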
```
TVMError Traceback (most recent call last)
in <cell line: 1>()
----> 1 output = cm.generate(
2 prompt="When was Python released?",
3 progress_callback=StreamToStdout(callback_interval=2),
4 )
5 frames
tvm/_ffi/_cython/./packed_func.pxi in tvm._ffi._cy3.core.PackedFuncBase.__call__()
tvm/_ffi/_cython/./packed_func.pxi in tvm._ffi._cy3.core.FuncCall()
tvm/_ffi/_cython/./base.pxi in tvm._ffi._cy3.core.CHECK_CALL()
/workspace/mlc-llm/cpp/llm_chat.cc in mlc::llm::LLMChat::ForwardTokens(std::vector<int, std::allocator<int> >, long)()
TVMError: Traceback (most recent call last):
9: mlc::llm::LLMChatModule::GetFunction(tvm::runtime::String const&, tvm::runtime::ObjectPtr<tvm::runtime::Object> const&)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#5}::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
at /workspace/mlc-llm/cpp/llm_chat.cc:1576
8: mlc::llm::LLMChat::PrefillStep(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, bool, mlc::llm::PlaceInPrompt, tvm::runtime::String)
at /workspace/mlc-llm/cpp/llm_chat.cc:885
7: mlc::llm::LLMChat::ForwardTokens(std::vector<int, std::allocator<int> >, long)
at /workspace/mlc-llm/cpp/llm_chat.cc:1272
6: tvm::runtime::relax_vm::VirtualMachineImpl::InvokeClosurePacked(tvm::runtime::ObjectRef const&, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)
5: tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<tvm::runtime::relax_vm::VirtualMachineImpl::GetClosureInternal(tvm::runtime::String const&, bool)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)
4: tvm::runtime::relax_vm::VirtualMachineImpl::InvokeBytecode(long, std::vector<tvm::runtime::TVMRetValue, std::allocator<tvm::runtime::TVMRetValue> > const&)
3: tvm::runtime::relax_vm::VirtualMachineImpl::RunLoop()
2: tvm::runtime::relax_vm::VirtualMachineImpl::RunInstrCall(tvm::runtime::relax_vm::VMFrame*, tvm::runtime::relax_vm::Instruction)
1: tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<tvm::runtime::WrapPackedFunc(int (*)(TVMValue*, int*, int, TVMValue*, int*, void*), tvm::runtime::ObjectPtr<tvm::runtime::Object> const&)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)
0: _ZN3tvm7runtime6deta
3: _ZN3tvm7runtime13PackedFuncObj9ExtractorINS0_16PackedFuncSubObjIZNS0_6detail17PackFuncVoidAddr_ILi8ENS0_15CUDAWrappedFuncEEENS0_10PackedFuncET0_RKSt6vectorINS4_1
2: tvm::runtime::CUDAWrappedFunc::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*, void**) const [clone .isra.0]
1: tvm::runtime::CUDAModuleNode::GetFunc(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
0: _ZN3tvm7runtime6deta
File "/workspace/tvm/src/runtime/cuda/cuda_module.cc", line 110
File "/workspace/tvm/src/runtime/library_module.cc", line 78
CUDAError: cuModuleLoadData(&(module[device_id]), data.c_str()) failed with error: CUDA_ERROR_NO_BINARY_FOR_GPU
```
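CUDA_ERROR_NO_BINARY_FOR_GPU generally means the compiled model library contains no kernel binary (and no JIT-compatible PTX) for the compute capability of the GPU the session was assigned, e.g. a library built for a different architecture than Colab's Tesla T4 (sm_75). A minimal way to check which GPU and compute capability the runtime actually has, assuming PyTorch (which Colab preinstalls):

```python
# Minimal check of the Colab GPU and its compute capability, assuming a
# standard Colab GPU runtime with PyTorch preinstalled.
import torch

print(torch.cuda.is_available())            # True if a CUDA device is visible
print(torch.cuda.get_device_name(0))        # e.g. "Tesla T4"
print(torch.cuda.get_device_capability(0))  # e.g. (7, 5) -> sm_75
```

If the reported compute capability does not match the architecture the prebuilt library was compiled for, that mismatch would explain this error.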
