I am trying to use lightllm with the baichuan13B model, but I get the error below. I cannot find any TrainingArguments in the code, so is there anything else that needs to be configured? The same checkpoint can be loaded by vllm and works well.
load model error: Can't get attribute 'TrainingArguments' on <module 'lightllm.server.api_server' from '/usr/local/lib/python3.10/dist-packages/lightllm-1.0.0-py3.10.egg/lightllm/server/api_server.py'> Can't get attribute 'TrainingArguments' on <module 'lightllm.server.api_server' from '/usr/local/lib/python3.10/dist-packages/lightllm-1.0.0-py3.10.egg/lightllm/server/api_server.py'> <class 'AttributeError'>
################
load model error: Can't get attribute 'TrainingArguments' on <module 'lightllm.server.api_server' from '/usr/local/lib/python3.10/dist-packages/lightllm-1.0.0-py3.10.egg/lightllm/server/api_server.py'> Can't get attribute 'TrainingArguments' on <module 'lightllm.server.api_server' from '/usr/local/lib/python3.10/dist-packages/lightllm-1.0.0-py3.10.egg/lightllm/server/api_server.py'> <class 'AttributeError'>
router init state: Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/lightllm-1.0.0-py3.10.egg/lightllm/server/router/manager.py", line 257, in start_router_process
asyncio.run(router.wait_to_model_ready())
File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "uvloop/loop.pyx", line 1517, in uvloop.loop.Loop.run_until_complete
File "/usr/local/lib/python3.10/dist-packages/lightllm-1.0.0-py3.10.egg/lightllm/server/router/manager.py", line 62, in wait_to_model_ready
await asyncio.gather(*init_model_ret)
File "/usr/local/lib/python3.10/dist-packages/lightllm-1.0.0-py3.10.egg/lightllm/server/router/model_infer/model_rpc.py", line 211, in init_model
await ans
File "/usr/local/lib/python3.10/dist-packages/lightllm-1.0.0-py3.10.egg/lightllm/server/router/model_infer/model_rpc.py", line 189, in _func
return ans.value
File "/usr/local/lib/python3.10/dist-packages/rpyc-5.3.1-py3.10.egg/rpyc/core/async_.py", line 108, in value
raise self._obj
_get_exception_class.<locals>.Derived: Can't get attribute 'TrainingArguments' on <module 'lightllm.server.api_server' from '/usr/local/lib/python3.10/dist-packages/lightllm-1.0.0-py3.10.egg/lightllm/server/api_server.py'>
========= Remote Traceback (1) =========
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/rpyc-5.3.1-py3.10.egg/rpyc/core/protocol.py", line 359, in _dispatch_request
res = self._HANDLERS[handler](self, *args)
File "/usr/local/lib/python3.10/dist-packages/rpyc-5.3.1-py3.10.egg/rpyc/core/protocol.py", line 837, in _handle_call
return obj(*args, **dict(kwargs))
File "/usr/local/lib/python3.10/dist-packages/lightllm-1.0.0-py3.10.egg/lightllm/server/router/model_infer/model_rpc.py", line 77, in exposed_init_model
raise e
File "/usr/local/lib/python3.10/dist-packages/lightllm-1.0.0-py3.10.egg/lightllm/server/router/model_infer/model_rpc.py", line 63, in exposed_init_model
self.model = Baichuan13bTpPartModel(rank_id, world_size, weight_dir, max_total_token_num, load_way, mode)
File "/usr/local/lib/python3.10/dist-packages/lightllm-1.0.0-py3.10.egg/lightllm/models/baichuan13b/model.py", line 21, in __init__
super().__init__(tp_rank, world_size, weight_dir, max_total_token_num, load_way, mode)
File "/usr/local/lib/python3.10/dist-packages/lightllm-1.0.0-py3.10.egg/lightllm/models/llama/model.py", line 30, in __init__
super().__init__(tp_rank, world_size, weight_dir, max_total_token_num, load_way, mode)
File "/usr/local/lib/python3.10/dist-packages/lightllm-1.0.0-py3.10.egg/lightllm/common/basemodel/basemodel.py", line 37, in __init__
self._init_weights()
File "/usr/local/lib/python3.10/dist-packages/lightllm-1.0.0-py3.10.egg/lightllm/common/basemodel/basemodel.py", line 70, in _init_weights
load_hf_weights(
File "/usr/local/lib/python3.10/dist-packages/lightllm-1.0.0-py3.10.egg/lightllm/common/basemodel/layer_weights/hf_load_utils.py", line 25, in load_hf_weights
weights = torch.load(os.path.join(weight_dir, file_), 'cpu')
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 789, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 1131, in _load
result = unpickler.load()
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 1124, in find_class
return super().find_class(mod_name, name)
AttributeError: Can't get attribute 'TrainingArguments' on <module 'lightllm.server.api_server' from '/usr/local/lib/python3.10/dist-packages/lightllm-1.0.0-py3.10.egg/lightllm/server/api_server.py'>
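One possible reading of the traceback, offered as an assumption rather than a confirmed diagnosis: `load_hf_weights` calls `torch.load` on a `.bin` file that is not a weight shard. Hugging Face's `Trainer` saves a pickled `TrainingArguments` object as `training_args.bin` next to the weights. If the training script defined that class in `__main__`, unpickling it inside the server process would look the class up on whichever module is `__main__` there (here `lightllm.server.api_server`) and fail exactly like this. The sketch below only inspects file names. The helper `split_bin_files` and the `NON_WEIGHT_BINS` set are hypothetical names introduced for illustration, and whether lightllm really picks up every `.bin` file in the directory is also an assumption.

```python
import os

# Assumption: training_args.bin is the file name Hugging Face's Trainer uses
# for the pickled TrainingArguments object. Extend this set if your
# checkpoint directory contains other non-weight .bin files.
NON_WEIGHT_BINS = {"training_args.bin"}


def split_bin_files(weight_dir):
    """Split the .bin files in weight_dir into (weight_files, suspect_files).

    Hypothetical helper: it only looks at file names and never unpickles
    anything, so it is safe to run on an untrusted checkpoint directory.
    """
    weight_files, suspect_files = [], []
    for name in sorted(os.listdir(weight_dir)):
        if not name.endswith(".bin"):
            continue  # config.json, tokenizer files, etc. are ignored
        if name in NON_WEIGHT_BINS:
            suspect_files.append(name)
        else:
            weight_files.append(name)
    return weight_files, suspect_files
```

If `suspect_files` comes back non-empty for the checkpoint directory, moving those files out of the directory before starting the server would be a cheap way to test this hypothesis.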