Description
System Info
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.51.03 Driver Version: 575.51.03 CUDA Version: 12.9 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA RTX A6000 Off | 00000000:01:00.0 Off | Off |
| 48% 69C P2 153W / 300W | 43193MiB / 49140MiB | 14% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 226747 C Model: Qwen3-Embedding-8B-0 14890MiB |
| 0 N/A N/A 468632 C ...party/bin/llama-box/llama-box 898MiB |
| 0 N/A N/A 469101 C /usr/bin/python3 1434MiB |
| 0 N/A N/A 2499695 C ...lama-box/llama-box-rpc-server 260MiB |
| 0 N/A N/A 2807504 C Model: Qwen3-Reranker-8B-0 19392MiB |
| 0 N/A N/A 3292392 C python3 1228MiB |
| 0 N/A N/A 3292394 C python3 2852MiB |
| 0 N/A N/A 3695965 C /usr/bin/python3 2190MiB |
Running Xinference with Docker?
- docker
- pip install
- installation from source
Version info
xinference:v1.9.0
The command used to start Xinference
xinference launch --model-name Qwen3-Reranker-8B --model-type rerank --replica 1 --n-gpu auto --model-engine vllm --model-format pytorch --quantization none
Reproduction
After deploying xinference:v1.9.0 with Docker, I loaded the rerank model Qwen3-Reranker-8B. After it had been in use for a while, I noticed a GPU memory leak.
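To put numbers on the growth, per-process GPU memory could be logged over time while rerank requests are being served. The following is only a minimal sketch, not part of Xinference: it assumes `nvidia-smi` is on the PATH and is run on the host (so that container processes are visible), and the function names `parse_compute_apps`, `memory_growth`, and `watch` are hypothetical helpers introduced here for illustration.

```python
import subprocess
import time

# nvidia-smi query flags; run on the host so all container PIDs are visible.
NVIDIA_SMI_CMD = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text):
    """Map PID -> (process name, used MiB) from nvidia-smi CSV output."""
    usage = {}
    for line in csv_text.strip().splitlines():
        try:
            pid, rest = line.split(",", 1)
            name, used = rest.rsplit(",", 1)
            usage[int(pid)] = (name.strip(), int(used))
        except ValueError:
            continue  # skip malformed rows or rows reporting "[N/A]"
    return usage


def memory_growth(before, after):
    """MiB delta for each PID present in both samples."""
    return {pid: after[pid][1] - before[pid][1] for pid in after if pid in before}


def watch(interval_s=60, samples=10):
    """Hypothetical helper: print per-process growth relative to the first sample."""
    def sample():
        out = subprocess.run(NVIDIA_SMI_CMD, capture_output=True, text=True, check=True)
        return parse_compute_apps(out.stdout)

    baseline = sample()
    for _ in range(samples):
        time.sleep(interval_s)
        current = sample()
        for pid, delta in memory_growth(baseline, current).items():
            print(f"{time.strftime('%H:%M:%S')} pid={pid} {current[pid][0]} {delta:+d} MiB")
```

Calling `watch(interval_s=60, samples=60)` while the reranker is under load would show whether the `Model: Qwen3-Reranker-8B-0` process keeps growing or levels off after warm-up.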

Expected behavior
GPU memory usage should stay stable within a bounded range.