System Info / 系統信息
nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Tue_Feb_27_16:19:38_PST_2024
Cuda compilation tools, release 12.4, V12.4.99
Build cuda_12.4.r12.4/compiler.33961263_0
transformers 4.48.0
torch 2.5.1
flashinfer-python 0.2.1.post2
vllm 0.6.6.post1
Python 3.11.11
Running Xinference with Docker?
docker
pip install
installation from source
Version info
xinference, version 1.3.1
The command used to start Xinference
xinference-local --host 0.0.0.0 --port 9997
/root/anaconda3/envs/xinference/lib/python3.11/site-packages/xinference/model/llm/init.py:128: UserWarning: /root/.xinference/model/llm/glm4-9b.json has error, Invalid model URI /root/.cache/modelscope/hub/ZhipuAI/glm-4-9b-chat.
warnings.warn(f"{user_defined_llm_dir}/{f} has error, {e}")
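The UserWarning above points at a stale custom registration: glm4-9b.json still references a ModelScope cache path that no longer exists on disk. One way to spot such stale entries is to scan the registration directory — a minimal sketch, assuming registrations live as JSON files with a `model_specs` list; `find_broken_registrations` is a hypothetical helper, not part of Xinference:

```python
import json
from pathlib import Path

def find_broken_registrations(llm_dir: Path) -> list[tuple[str, str]]:
    """Return (file name, bad URI) pairs for custom registrations whose
    local model_uri no longer exists on disk."""
    broken = []
    for reg in sorted(llm_dir.glob("*.json")):
        spec = json.loads(reg.read_text())
        for s in spec.get("model_specs", []):
            uri = s.get("model_uri")
            # Only absolute local paths can be checked; skip hub IDs / None.
            if uri and uri.startswith("/") and not Path(uri).exists():
                broken.append((reg.name, uri))
    return broken

# Example: find_broken_registrations(Path.home() / ".xinference/model/llm")
```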
2025-03-09 15:00:21,665 xinference.core.supervisor 87106 INFO Xinference supervisor 0.0.0.0:52065 started
2025-03-09 15:00:21,990 xinference.core.worker 87106 INFO Starting metrics export server at 0.0.0.0:None
2025-03-09 15:00:21,994 xinference.core.worker 87106 INFO Checking metrics export server...
2025-03-09 15:00:24,458 xinference.core.worker 87106 INFO Metrics server is started at: http://0.0.0.0:36679
2025-03-09 15:00:24,458 xinference.core.worker 87106 INFO Purge cache directory: /root/.xinference/cache
2025-03-09 15:00:24,461 xinference.core.utils 87106 INFO Remove empty directory: /root/.xinference/cache/glm4-9b-pytorch-9b
2025-03-09 15:00:24,462 xinference.core.worker 87106 INFO Connected to supervisor as a fresh worker
2025-03-09 15:00:24,473 xinference.core.worker 87106 INFO Xinference worker 0.0.0.0:52065 started
2025-03-09 15:00:27,843 xinference.api.restful_api 86158 INFO Starting Xinference at endpoint: http://0.0.0.0:9997
2025-03-09 15:00:28,285 uvicorn.error 86158 INFO Uvicorn running on http://0.0.0.0:9997 (Press CTRL+C to quit)
Reproduction

1. The custom model cannot be deleted.
2. Unloading the model gets no response.

This is the content of qwen-2.5-7b.json, in the llm directory under .xinference:
{"version": 1, "context_length": 32768, "model_name": "qwen-2.5-7b", "model_lang": ["en", "zh"], "model_ability": ["generate", "chat"], "model_description": "This is a custom model description.", "model_family": "qwen2.5-instruct", "model_specs": [{"model_format": "pytorch", "model_size_in_billions": 7, "quantizations": ["none"], "model_id": null, "model_hub": "huggingface", "model_uri": "/home/hnjj/diskdata/models/Qwen/Qwen2___5-7B-Instruct", "model_revision": null}], "chat_template": "{%- if tools %}\n {{- '<|im_start|>system\n' }}\n {%- if messages[0]['role'] == 'system' %}\n {{- messages[0]['content'] }}\n {%- else %}\n {{- 'You are Qwen, created by Alibaba Cloud. You are a helpful assistant.' }}\n {%- endif %}\n {{- "\n\n# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within XML tags:\n" }}\n {%- for tool in tools %}\n {{- "\n" }}\n {{- tool | tojson }}\n {%- endfor %}\n {{- "\n\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": , \"arguments\": }\n</tool_call><|im_end|>\n" }}\n{%- else %}\n {%- if messages[0]['role'] == 'system' %}\n {{- '<|im_start|>system\n' + messages[0]['content'] + '<|im_end|>\n' }}\n {%- else %}\n {{- '<|im_start|>system\nYou are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>\n' }}\n {%- endif %}\n{%- endif %}\n{%- for message in messages %}\n {%- if (message.role == "user") or (message.role == "system" and not loop.first) or (message.role == "assistant" and not message.tool_calls) %}\n {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}\n {%- elif message.role == "assistant" %}\n {{- '<|im_start|>' + message.role }}\n {%- if message.content %}\n {{- '\n' + message.content }}\n {%- endif %}\n {%- for tool_call in message.tool_calls %}\n {%- if tool_call.function is defined %}\n {%- set tool_call = tool_call.function %}\n {%- endif %}\n {{- '\n<tool_call>\n{"name": "' }}\n {{- tool_call.name }}\n {{- '", "arguments": ' }}\n {{- tool_call.arguments | tojson }}\n {{- '}\n</tool_call>' }}\n {%- endfor %}\n {{- '<|im_end|>\n' }}\n {%- elif message.role == "tool" %}\n {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != "tool") %}\n {{- '<|im_start|>user' }}\n {%- endif %}\n {{- '\n<tool_response>\n' }}\n {{- message.content }}\n {{- '\n</tool_response>' }}\n {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}\n {{- '<|im_end|>\n' }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\n' }}\n{%- endif %}\n", "stop_token_ids": [151643, 151644, 151645], "stop": ["<|endoftext|>", "<|im_start|>", "<|im_end|>"]}
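Custom registrations like the JSON above are persisted as per-model files on disk. When deleting through the web page has no effect, the documented CLI (`xinference unregister --model-type LLM --model-name <name>`) may be worth trying; as a last-resort workaround (an assumption, not an official procedure), the registration file can be removed by hand and the service restarted:

```python
from pathlib import Path

def remove_custom_registration(llm_dir: Path, model_name: str) -> bool:
    """Delete the on-disk registration file for a custom model.
    Hypothetical workaround helper: restart Xinference afterwards so the
    supervisor stops listing the model."""
    target = llm_dir / f"{model_name}.json"
    if target.is_file():
        target.unlink()
        return True
    return False

# Example: remove_custom_registration(Path.home() / ".xinference/model/llm", "qwen-2.5-7b")
```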
Expected behavior

Operations triggered by clicking on the page should complete successfully.