
[Feature] Is internlm2_5-20b-chat supported? I can deploy and run it, but the output is garbled #2366

Closed
zhangjiekui opened this issue Aug 24, 2024 · 17 comments


@zhangjiekui

Motivation

Support for internlm2_5-20b-chat

Related resources

No response

Additional context

No response

@lvhan028
Collaborator

It is supported.
Please provide the output of `lmdeploy check_env`
as well as a demo that reproduces the issue.

@zhangjiekui
Author

## lmdeploy check_env

```
sys.platform: linux
Python: 3.10.12 (main, Jul 29 2024, 16:56:48) [GCC 11.4.0]
CUDA available: True
MUSA available: False
numpy_random_seed: 2147483648
GPU 0,1,2,3: NVIDIA GeForce RTX 4090
CUDA_HOME: /usr
NVCC: Cuda compilation tools, release 11.5, V11.5.119
GCC: x86_64-linux-gnu-gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
PyTorch: 2.4.0+cu121
PyTorch compiling details: PyTorch built with:

  • GCC 9.3
  • C++ Version: 201703
  • Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v3.4.2 (Git Hash 1137e04ec0b5251ca2b4400a4fd3c667ce843d67)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • LAPACK is enabled (usually provided by MKL)
  • NNPACK is enabled
  • CPU capability usage: AVX512
  • CUDA Runtime 12.1
  • NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
  • CuDNN 90.1 (built against CUDA 12.4)
  • Magma 2.6.1
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=9.1.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.4.0, USE_CUDA=ON, USE_CUDNN=ON, USE_CUSPARSELT=1, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF,

TorchVision: 0.19.0+cu121
LMDeploy: 0.5.3+
transformers: 4.43.3
gradio: 4.26.0
fastapi: 0.110.3
pydantic: 2.6.4
triton: 3.0.0
NVIDIA Topology:
GPU0 GPU1 GPU2 GPU3 CPU Affinity NUMA Affinity GPU NUMA ID
GPU0 X NODE SYS SYS 0-11,24-35 0 N/A
GPU1 NODE X SYS SYS 0-11,24-35 0 N/A
GPU2 SYS SYS X NODE 12-23,36-47 1 N/A
GPU3 SYS SYS NODE X 12-23,36-47 1 N/A

Legend:

X = Self
SYS = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
PHB = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
PXB = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
PIX = Connection traversing at most a single PCIe bridge
NV# = Connection traversing a bonded set of # NVLinks
```

@zhangjiekui
Author

Because I'm using it integrated into LangChain's ChatOpenAI:

```python
ChatOpenAI(
    model_name=self.llm_model_name,
    openai_api_base=self.llm_model_base_url,
    openai_api_key=self.api_key,
    streaming=streaming,
    verbose=True,
    temperature=temperature,
    max_tokens=max_tokens,
)
```
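
A quick way to rule out LangChain as the culprit is to hit the same endpoint directly with the openai client. This is only a sketch; the base_url, api_key, and model name below are placeholder assumptions, not values from this thread:

```python
# Sketch: query the lmdeploy OpenAI-compatible server directly, bypassing
# LangChain, to see whether the raw completion is already garbled.
# base_url, api_key and model are placeholder assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:23333/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="internlm2_5-20b-chat",
    messages=[{"role": "user", "content": "Briefly introduce yourself."}],
    temperature=0.01,
    stream=False,
)
print(resp.choices[0].message.content)
```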

@zhangjiekui
Author

So all I can provide is the lmdeploy INFO log:

```
INFO: 10.1.150.105:36416 - "POST /v1/chat/completions HTTP/1.1" 200 OK
2024-08-26 00:44:46,880 - lmdeploy - INFO - prompt='<|im_start|>system\nSolve/Answer user's questions as best as you can in the language the user asked, preferred/default to Simplified Chinese. You have access to the following tools:get_flight_number,WEB_SEARCH,CODE_GENERATION,CODE_EXECUTION,VECTORSTORE.You are allowed to make multiple calls (either together or in sequence).Choice Guidelines: 'CODE_GENERATION' is preferred rather than 'python_repl_tool' for drawing images, plotting graphs/charts, data file processing/analyzing/saving.<|im_end|>\n<|im_start|>system name=<|plugin|>\n[{"description": "根据输入的出发地(origin),目的地(destination),日期(date),返回航班号。重要:当前日期是2024-08-26,日期(date)只接受python datetime.date类型输入。如果日期是不确定的描述,如明天、后天等,请先使用工具'python_repl_tool'进行计算,并且返回python datetime.date类型结果,具体格式示例如下:2024-04-03。\nArgs:\n origin:(str) 出发地/城市,例如合肥\n destination:(str) 目的地/城市,例如北京\n date:(date) 飞行日期, 例如2024-08-03.如果日期是不确定的描述,如明天、后天等,请先调用工具'python_repl_tool'或者'CODE_GENERATION'进行计算,并且返回python datetime.date类型结果", "name": "get_flight_number", "parameters": {"type": "object", "properties": {"origin": {"type": "string"}, "destination": {"type": "string"}, "date": {"type": "string", "format": "date"}}, "required": ["origin", "destination", "date"]}}, {"description": "The internet. Use web_search for questions that are related to anything else than agent memory , prompt engineering, and adversarial attacks, 特别是2024年及以后相关的问题及发生的事情.", "name": "WEB_SEARCH", "parameters": {"type": "object", "properties": {"query": {"description": "The query to use when searching the internet.", "type": "string"}}, "required": ["query"]}}, {"description": "Code Assistant/Code generation . Use programming code (preferred Python language) to slove a user question, if the question related to [code generating, file processing/analyzing, data exploring/analyzing/visualizing].", "name": "CODE_GENERATION", "parameters": {"type": "object", "properties": {"query": {"description": "The query to use when need to generate python code.", "type": "string"}}, "required": ["query"]}}, {"description": "Code execution/Code intepreter. When user providing a code snippet and asking for running/executing, w or w/o returning the result.", "name": "CODE_EXECUTION", "parameters": {"type": "object", "properties": {"query": {"description": "The query to use when asking for running/executing a code snippet.", "type": "string"}}, "required": ["query"]}}, {"description": "A vectorstore/knowledge base containing documents related to 晶奇公司、安徽晶奇网络科技股份有限公司、医疗卫生、医疗健康、医疗服务、医疗政策、medical treatment & health. 
凡是与晶奇公司、安徽晶奇网络科技股份有限公司及医疗健康相关的问题,都必须使用vectorstore工具。MUST use the vectorstore for questions on these topics.", "name": "VECTORSTORE", "parameters": {"type": "object", "properties": {"query": {"description": "The query to use when searching the vectorstore.", "type": "string"}}, "required": ["query"]}}]<|im_end|>\n<|im_start|>user\n请查询明天从合肥到上海的航班,以及南昌到北京的航班<|im_end|>\n<|im_start|>assistant\n', gen_config=EngineGenerationConfig(n=1, max_new_tokens=30000, top_p=1.0, top_k=40, temperature=0.01, repetition_penalty=1.0, ignore_eos=False, random_seed=17805185034465029694, stop_words=[92542, 92540], bad_words=None, min_new_tokens=None, skip_special_tokens=False, logprobs=None), prompt_token_id=[1, 92543, 9081, 364, 287, 4105, 301, 16272, 1341, 725, 4917, 569, 1999, 569, 629, 777, 435, 410, 4287, 410, 1341, 4751, 328, 15019, 29305, 442, 59870, 2019, 8588, 281, 1592, 746, 2775, 442, 410, 2863, 7521, 25331, 893, 4303, 5657, 328, 18985, 33780, 11442, 3043, 18839, 3656, 11442, 3043, 38076, 28951, 328, 43285, 43530, 38270, 657, 5578, 442, 1426, 5408, 6888, 451, 49152, 3942, 607, 435, 8636, 699, 25148, 46821, 334, 495, 14995, 18839, 3656, 259, 505, 15019, 4913, 1233, 495, 12819, 1433, 631, 23094, 259, 500, 13470, 5494, 328, 43487, 38804, 21513, 7188, 328, 955, 1177, 8826, 301, 43300, 20555, 2849, 2468, 281, 92542, 364, 92543, 9081, 963, 310, 92538, 364, 336, 5071, 4847, 921, 461, 68420, 68412, 60354, 71321, 60415, 44880, 833, 76581, 47433, 833, 69271, 12121, 833, 70121, 77201, 60675, 60355, 68372, 334, 69672, 69271, 60357, 638, 1311, 285, 2418, 285, 1743, 60353, 69271, 12121, 313, 60539, 68761, 12819, 9008, 10042, 68517, 68412, 60355, 68263, 69271, 69153, 90743, 69401, 60353, 60407, 70944, 60359, 82662, 60455, 60353, 60836, 60609, 68271, 68270, 259, 12819, 1433, 631, 23094, 259, 68274, 68679, 60353, 68614, 70121, 12819, 9008, 10042, 68517, 68501, 60353, 68592, 68878, 76885, 68411, 60387, 638, 1311, 285, 2469, 285, 2934, 60355, 1849, 4275, 334, 1849, 388, 8745, 3432, 626, 313, 262, 71321, 60415, 301, 68494, 60353, 69192, 73831, 1849, 388, 18139, 3432, 626, 313, 262, 76581, 301, 68494, 60353, 69192, 68516, 1849, 388, 1170, 3432, 1170, 313, 262, 71709, 69271, 60353, 262, 69192, 638, 1311, 285, 2418, 285, 2934, 281, 68263, 69271, 69153, 90743, 69401, 60353, 60407, 70944, 60359, 82662, 60455, 60353, 60836, 60609, 76395, 68270, 259, 12819, 1433, 631, 23094, 259, 68319, 259, 14995, 18839, 3656, 259, 68274, 68679, 60353, 68614, 70121, 12819, 9008, 10042, 68517, 68501, 628, 461, 738, 921, 461, 586, 893, 4303, 5657, 628, 461, 13927, 921, 5371, 1459, 921, 461, 1850, 628, 461, 13333, 921, 5371, 8745, 921, 5371, 1459, 921, 461, 1054, 14484, 461, 18139, 921, 5371, 1459, 921, 461, 1054, 14484, 461, 1170, 921, 5371, 1459, 921, 461, 1054, 628, 461, 2394, 921, 461, 1170, 9338, 2291, 461, 6436, 921, 4544, 8745, 628, 461, 18139, 628, 461, 1170, 1487, 38002, 5371, 4847, 921, 461, 918, 7742, 281, 5602, 3644, 10863, 500, 4917, 560, 657, 5594, 442, 4271, 902, 1233, 8447, 5097, 1298, 10069, 14800, 328, 454, 28750, 42369, 8914, 328, 262, 69764, 638, 1311, 60372, 60543, 68458, 68524, 68804, 60543, 68526, 68899, 10603, 461, 738, 921, 461, 18985, 33780, 628, 461, 13927, 921, 5371, 1459, 921, 461, 1850, 628, 461, 13333, 921, 5371, 1779, 921, 5371, 4847, 921, 461, 918, 3402, 442, 1130, 1119, 15164, 410, 7742, 10603, 461, 1459, 921, 461, 1054, 9338, 2291, 461, 6436, 921, 4544, 1779, 1487, 38002, 5371, 4847, 921, 461, 2229, 21624, 301, 2229, 9600, 790, 5602, 15600, 2189, 451, 1877, 5710, 13175, 4287, 313, 442, 17742, 717, 395, 1341, 
3568, 328, 552, 410, 3568, 5594, 442, 640, 2000, 23492, 328, 1177, 8826, 301, 43300, 20555, 328, 955, 24341, 301, 43300, 20555, 301, 29526, 5008, 1074, 628, 461, 738, 921, 461, 14995, 18839, 3656, 628, 461, 13927, 921, 5371, 1459, 921, 461, 1850, 628, 461, 13333, 921, 5371, 1779, 921, 5371, 4847, 921, 461, 918, 3402, 442, 1130, 1119, 1329, 442, 7076, 10273, 2189, 10603, 461, 1459, 921, 461, 1054, 9338, 2291, 461, 6436, 921, 4544, 1779, 1487, 38002, 5371, 4847, 921, 461, 2229, 11471, 301, 2229, 658, 880, 396, 596, 281, 3363, 1341, 8373, 395, 2189, 42699, 454, 10300, 500, 4463, 301, 11897, 10748, 328, 420, 607, 420, 20449, 13593, 410, 1245, 10603, 461, 738, 921, 461, 14995, 38076, 28951, 628, 461, 13927, 921, 5371, 1459, 921, 461, 1850, 628, 461, 13333, 921, 5371, 1779, 921, 5371, 4847, 921, 461, 918, 3402, 442, 1130, 1119, 10300, 500, 4463, 301, 11897, 10748, 395, 2189, 42699, 10603, 461, 1459, 921, 461, 1054, 9338, 2291, 461, 6436, 921, 4544, 1779, 1487, 38002, 5371, 4847, 921, 461, 290, 4782, 4474, 14249, 50356, 2483, 8617, 9423, 5594, 442, 262, 61954, 61112, 68280, 60359, 70239, 61954, 61112, 68441, 68636, 73067, 60359, 69432, 68890, 60359, 69432, 68515, 60359, 69432, 68320, 60359, 69432, 68843, 60359, 2213, 1076, 6535, 741, 2984, 281, 262, 83860, 60510, 61954, 61112, 68280, 60359, 70239, 61954, 61112, 68441, 68636, 73067, 60543, 69432, 68515, 68524, 68804, 60353, 92146, 68271, 3380, 4474, 68270, 60355, 307, 8685, 1130, 410, 4782, 4474, 500, 4917, 519, 1639, 13485, 10603, 461, 738, 921, 461, 43285, 43530, 628, 461, 13927, 921, 5371, 1459, 921, 461, 1850, 628, 461, 13333, 921, 5371, 1779, 921, 5371, 4847, 921, 461, 918, 3402, 442, 1130, 1119, 15164, 410, 4782, 4474, 10603, 461, 1459, 921, 461, 1054, 9338, 2291, 461, 6436, 921, 4544, 1779, 1487, 3578, 332, 92542, 364, 92543, 1008, 364, 60836, 69358, 70944, 60577, 73831, 60378, 68589, 60354, 77201, 328, 68375, 75871, 60378, 88804, 77201, 92542, 364, 92543, 525, 11353, 364], adapter_name=None.
2024-08-26 00:44:46,880 - lmdeploy - INFO - session_id=14, history_tokens=0, input_tokens=803, max_new_tokens=30000, seq_start=True, seq_end=True, step=0, prep=True
2024-08-26 00:44:46,880 - lmdeploy - INFO - Register stream callback for 14
[TM][INFO] [forward] Enqueue requests
[TM][INFO] [forward] Wait for requests to complete ...
[TM][INFO] [ProcessInferRequests] Request for 14 received.
[TM][INFO] [Forward] [0, 1), dc=0, pf=1, sum_q=803, sum_k=803, max_q=803, max_k=803
[TM][INFO] ------------------------- step = 810 -------------------------
[TM][INFO] ------------------------- step = 820 -------------------------
[TM][INFO] ------------------------- step = 830 -------------------------
[TM][INFO] ------------------------- step = 840 -------------------------
[TM][INFO] ------------------------- step = 850 -------------------------
[TM][INFO] ------------------------- step = 860 -------------------------
[TM][INFO] ------------------------- step = 870 -------------------------
[TM][INFO] ------------------------- step = 880 -------------------------
[TM][INFO] ------------------------- step = 890 -------------------------
[TM][INFO] ------------------------- step = 900 -------------------------
[TM][INFO] ------------------------- step = 910 -------------------------
[TM][INFO] ------------------------- step = 920 -------------------------
[TM][INFO] [Interrupt] slot = 0, id = 14
[TM][INFO] [forward] Request completed for 14
2024-08-26 00:44:49,126 - lmdeploy - INFO - UN-register stream callback for 14
```

@zhangjiekui
Author

But the INFO log doesn't show the model's output (I can't tell whether the model produced nothing, or the INFO log simply doesn't print the output).

@zhangjiekui
Author

Then I traced it with LangSmith:

```
ValueError('No generations found in stream.')
Traceback (most recent call last):
  File "/root/.virtualenvs/test/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 603, in generate
    self._generate_with_cache(
  File "/root/.virtualenvs/test/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 825, in _generate_with_cache
    result = self._generate(
  File "/root/.virtualenvs/test/lib/python3.10/site-packages/langchain_openai/chat_models/base.py", line 534, in _generate
    return generate_from_stream(stream_iter)
  File "/root/.virtualenvs/test/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 107, in generate_from_stream
    raise ValueError("No generations found in stream.")
ValueError: No generations found in stream.
```

@zhangjiekui
Author

I suspected it was a streaming problem, so I set streaming=False. That way it did produce results, but then I hit another error:

```
lmdeploy/model.py", line 515, in messages2prompt
    begin = box_map[role]
KeyError: 'tool'
```

My guess is that my application makes a tool_call and returns a LangChain ToolMessage, whose role is 'tool'.
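
For reference, the non-streaming workaround is just the client shown earlier with the flag flipped; a sketch reusing the same placeholder fields:

```python
# Same ChatOpenAI client as before; the only change is streaming=False.
llm = ChatOpenAI(
    model_name=self.llm_model_name,
    openai_api_base=self.llm_model_base_url,
    openai_api_key=self.api_key,
    streaming=False,  # workaround: tool responses fail in streaming mode
    temperature=temperature,
    max_tokens=max_tokens,
)
```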

@zhangjiekui
Author

So, regarding our compatibility with the OpenAI API: which parts are actually consistent, and which are not?

@zhangjiekui
Author

zhangjiekui commented Aug 26, 2024

By modifying lmdeploy/model.py to add tool = "tool":

```python
    box_map = dict(user=self.user,
                   assistant=self.assistant,
                   system=self.system,
                   environment=self.environment,
                   tool='tool')
    eox_map = dict(user=self.eoh,
                   assistant=self.eoa + self.separator,
                   system=self.eosys,
                   environment=self.eoenv,
                   tool='tool')
```

it finally runs.
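
A note on this stopgap: it maps eox_map['tool'] to the literal string "tool", so that text is emitted after the message instead of a real end-of-turn token. Since the template already defines an environment role (see self.environment / self.eoenv above), an alternative sketch would reuse that role for tool results; whether InternLM2 actually expects tool output in the environment turn is an assumption here, not something confirmed in this thread:

```python
# Hedged alternative, not an official fix: wrap role='tool' messages with the
# existing environment begin/end tokens instead of the literal string "tool".
box_map = dict(user=self.user,
               assistant=self.assistant,
               system=self.system,
               environment=self.environment,
               tool=self.environment)  # assumption: tool results belong in the environment turn
eox_map = dict(user=self.eoh,
               assistant=self.eoa + self.separator,
               system=self.eosys,
               environment=self.eoenv,
               tool=self.eoenv)
```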

@zhangjiekui
Author

After running a few more tests I hit many more problems, and couldn't go any further...

@AllentDan
Collaborator

Currently, the tool feature only supports stream=False.
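
On the client side that means sending the whole tool round trip without streaming; a minimal sketch, in which the endpoint, model name, and tool schema are placeholder assumptions:

```python
# Sketch of a non-streaming tool call against the OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:23333/v1", api_key="EMPTY")
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]
resp = client.chat.completions.create(
    model="internlm2_5-20b-chat",
    messages=[{"role": "user", "content": "What's the weather in San Francisco?"}],
    tools=tools,
    stream=False,  # tool calling currently requires stream=False
)
print(resp.choices[0].message.tool_calls)
```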

@William4711

William4711 commented Aug 30, 2024

I ran into the same error: KeyError: 'tool'.
Test procedure:

```python
# Imports added for completeness; `llm` is assumed to be a ChatOpenAI
# instance pointing at the lmdeploy server (it was not shown in the original).
from langchain_core.messages import HumanMessage, ToolMessage
from langchain_core.tools import tool

@tool
def GetWeather(location: str):
    """Get the current weather in a given location"""
    return "30 degrees Celsius"

query = "What's the weather like in San Francisco today?"
messages = [HumanMessage(query)]
tools = [GetWeather]
model_with_tools = llm.bind_tools(tools)
ans1 = model_with_tools.invoke(messages)
messages.append(ans1)  # the assistant's tool-call message must precede the tool results

for tool_call in ans1.tool_calls:
    selected_tool = {"GetWeather": GetWeather}[tool_call["name"]]
    tool_output = selected_tool.invoke(tool_call["args"])
    messages.append(ToolMessage(tool_output, tool_call_id=tool_call["id"]))

ans1 = model_with_tools.invoke(messages)
```

The error occurs on the last line.
It looks like ToolMessage cannot be used.

If ToolMessage cannot be used, how do I feed the returned temperature back to the llm so it can compose its reply?


```
2024-08-30 09:50:21 INFO: 172.18.0.1:36578 - "POST /v1/chat/completions HTTP/1.1" 200 OK
2024-08-30 09:50:32 INFO: 172.18.0.1:47358 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
2024-08-30 09:50:32 ERROR: Exception in ASGI application
2024-08-30 09:50:32 Traceback (most recent call last):
2024-08-30 09:50:32 File "/opt/py3/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 406, in run_asgi
2024-08-30 09:50:32 result = await app( # type: ignore[func-returns-value]
2024-08-30 09:50:32 File "/opt/py3/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in call
2024-08-30 09:50:32 return await self.app(scope, receive, send)
2024-08-30 09:50:32 File "/opt/py3/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in call
2024-08-30 09:50:32 await super().call(scope, receive, send)
2024-08-30 09:50:32 File "/opt/py3/lib/python3.10/site-packages/starlette/applications.py", line 123, in call
2024-08-30 09:50:32 await self.middleware_stack(scope, receive, send)
2024-08-30 09:50:32 File "/opt/py3/lib/python3.10/site-packages/starlette/middleware/errors.py", line 186, in call
2024-08-30 09:50:32 raise exc
2024-08-30 09:50:32 File "/opt/py3/lib/python3.10/site-packages/starlette/middleware/errors.py", line 164, in call
2024-08-30 09:50:32 await self.app(scope, receive, _send)
2024-08-30 09:50:32 File "/opt/py3/lib/python3.10/site-packages/starlette/middleware/cors.py", line 85, in call
2024-08-30 09:50:32 await self.app(scope, receive, send)
2024-08-30 09:50:32 File "/opt/py3/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 65, in call
2024-08-30 09:50:32 await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
2024-08-30 09:50:32 File "/opt/py3/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
2024-08-30 09:50:32 raise exc
2024-08-30 09:50:32 File "/opt/py3/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
2024-08-30 09:50:32 await app(scope, receive, sender)
2024-08-30 09:50:32 File "/opt/py3/lib/python3.10/site-packages/starlette/routing.py", line 754, in call
2024-08-30 09:50:32 await self.middleware_stack(scope, receive, send)
2024-08-30 09:50:32 File "/opt/py3/lib/python3.10/site-packages/starlette/routing.py", line 774, in app
2024-08-30 09:50:32 await route.handle(scope, receive, send)
2024-08-30 09:50:32 File "/opt/py3/lib/python3.10/site-packages/starlette/routing.py", line 295, in handle
2024-08-30 09:50:32 await self.app(scope, receive, send)
2024-08-30 09:50:32 File "/opt/py3/lib/python3.10/site-packages/starlette/routing.py", line 77, in app
2024-08-30 09:50:32 await wrap_app_handling_exceptions(app, request)(scope, receive, send)
2024-08-30 09:50:32 File "/opt/py3/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
2024-08-30 09:50:32 raise exc
2024-08-30 09:50:32 File "/opt/py3/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
2024-08-30 09:50:32 await app(scope, receive, sender)
2024-08-30 09:50:32 File "/opt/py3/lib/python3.10/site-packages/starlette/routing.py", line 74, in app
2024-08-30 09:50:32 response = await f(request)
2024-08-30 09:50:32 File "/opt/py3/lib/python3.10/site-packages/fastapi/routing.py", line 297, in app
2024-08-30 09:50:32 raw_response = await run_endpoint_function(
2024-08-30 09:50:32 File "/opt/py3/lib/python3.10/site-packages/fastapi/routing.py", line 210, in run_endpoint_function
2024-08-30 09:50:32 return await dependant.call(**values)
2024-08-30 09:50:32 File "/opt/lmdeploy/lmdeploy/serve/openai/api_server.py", line 458, in chat_completions_v1
2024-08-30 09:50:32 async for res in result_generator:
2024-08-30 09:50:32 File "/opt/lmdeploy/lmdeploy/serve/async_engine.py", line 525, in generate
2024-08-30 09:50:32 prompt_input = await self._get_prompt_input(prompt,
2024-08-30 09:50:32 File "/opt/lmdeploy/lmdeploy/serve/async_engine.py", line 471, in _get_prompt_input
2024-08-30 09:50:32 prompt = chat_template.messages2prompt(prompt,
2024-08-30 09:50:32 File "/opt/lmdeploy/lmdeploy/model.py", line 866, in messages2prompt
2024-08-30 09:50:32 ret += f'{box_map[role]}{content}{eox_map[role]}'
2024-08-30 09:50:32 KeyError: 'tool'
```

(The same 500 traceback, ending in KeyError: 'tool', repeats verbatim for the two follow-up requests at 09:50:33 and 09:50:35.)
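
Until the tool role is supported by this chat template, one hedged workaround is to fold the tool result into a plain human message; this is only a sketch reusing the names from the snippet above, and it loses the structured tool_call_id linkage:

```python
# Workaround sketch: send the tool result back as a HumanMessage, since the
# chat template rejects role='tool'. Reuses names from the snippet above.
from langchain_core.messages import HumanMessage

messages.append(HumanMessage(
    content=f"Tool {tool_call['name']} returned: {tool_output}"
))
ans2 = model_with_tools.invoke(messages)
```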

@AllentDan
Collaborator

@William4711 This error looks like a `tool` role being passed to a llama3.1 model; llama3.1 has no such role. The supported roles are user, assistant, ipython, and system.

@William4711

William4711 commented Aug 30, 2024

> @William4711 This error looks like a `tool` role being passed to a llama3.1 model; llama3.1 has no such role. The supported roles are user, assistant, ipython, and system.

ToolMessage comes from LangChain.
LangChain's tool-calling-related message types are ToolMessage and FunctionMessage.
Using ToolMessage raises KeyError: 'tool';
FunctionMessage raises KeyError: 'function'.

So for llama3.1, the ipython role is presumably the one meant to carry tool-call results back to the model.
I'll test that.
thanks
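
A hedged sketch of what that would look like as a raw request, assuming the lmdeploy server accepts an ipython turn for the llama3.1 template (the endpoint and model name are placeholder assumptions):

```python
# Sketch: return the tool result under role='ipython', which the llama3.1
# template reportedly supports; endpoint details are placeholder assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:23333/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="llama3.1-8b-instruct",
    messages=[
        {"role": "user", "content": "What's the weather like in San Francisco today?"},
        {"role": "assistant", "content": 'GetWeather(location="San Francisco")'},
        {"role": "ipython", "content": "30 degrees Celsius"},  # tool result turn
    ],
)
print(resp.choices[0].message.content)
```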

@lvhan028
Collaborator

lvhan028 commented Oct 9, 2024

@William4711 Could you help check PR #2558?


This issue is marked as stale because it has been marked as invalid or awaiting response for 7 days without any further response. It will be closed in 5 days if the stale label is not removed or if there is no further response.

@github-actions github-actions bot added the Stale label Oct 17, 2024

This issue is closed because it has been stale for 5 days. Please open a new issue if you have similar issues or you have any new updates now.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Oct 22, 2024