A service-based solution for multi-GPU parallel processing (based on LitServe) #667

randydl · 2024-09-27T03:09:37Z

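The client accepts paths to JPG, PNG, and PDF files. For batch processing, simply call the client's do_parse function from multiple threads; the server automatically processes the requests in parallel across multiple GPUs.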

pip install -U litserve python-multipart filetype
pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1
pip install -U magic-pdf[full] --extra-index-url https://wheels.myhloli.com
pip install paddlepaddle-gpu==3.0.0b1 -i https://www.paddlepaddle.org.cn/packages/stable/cu118

server.py

import torch
import filetype
import json, uuid
import litserve as ls
from unittest.mock import patch
from fastapi import HTTPException
from magic_pdf.tools.common import do_parse
from magic_pdf.model.doc_analyze_by_custom_model import ModelSingleton


class MinerUAPI(ls.LitAPI):
    def __init__(self, output_dir='/tmp'):
        self.output_dir = output_dir

    @staticmethod
    def clean_memory(device):
        import gc
        if torch.cuda.is_available():
            with torch.cuda.device(device):
                torch.cuda.empty_cache()
                torch.cuda.ipc_collect()
        gc.collect()

    def setup(self, device):
        with patch('magic_pdf.model.doc_analyze_by_custom_model.get_device') as mock_obj:
            mock_obj.return_value = device
            model_manager = ModelSingleton()
            model_manager.get_model(True, False)
            model_manager.get_model(False, False)
            mock_obj.assert_called()
            print(f'Model initialization complete!')

    def decode_request(self, request):
        file = request['file'].file.read()
        kwargs = json.loads(request['kwargs'])
        assert filetype.guess_mime(file) == 'application/pdf'
        return file, kwargs

    def predict(self, inputs):
        try:
            pdf_name = str(uuid.uuid4())
            do_parse(self.output_dir, pdf_name, inputs[0], [], **inputs[1])
            return pdf_name
        except Exception as e:
            raise HTTPException(status_code=500, detail=f'{e}')
        finally:
            self.clean_memory(self.device)

    def encode_response(self, response):
        return {'output_dir': response}


if __name__ == '__main__':
    server = ls.LitServer(MinerUAPI(), accelerator='gpu', devices=[0, 1], timeout=False)
    server.run(port=8000)

client.py

import json
import pymupdf
import requests
import numpy as np
from loguru import logger
from joblib import Parallel, delayed


def to_pdf(file_path):
    with pymupdf.open(file_path) as f:
        if f.is_pdf:
            pdf_bytes = f.tobytes()
        else:
            pdf_bytes = f.convert_to_pdf()
        return pdf_bytes


def do_parse(file_path, url='http://127.0.0.1:8000/predict', **kwargs):
    try:
        kwargs.setdefault('parse_method', 'auto')
        kwargs.setdefault('debug_able', False)

        response = requests.post(url,
            data={'kwargs': json.dumps(kwargs)},
            files={'file': to_pdf(file_path)}
        )

        if response.status_code == 200:
            output = response.json()
            output['file_path'] = file_path
            return output
        else:
            raise Exception(response.text)
    except Exception as e:
        logger.error(f'File: {file_path} - Info: {e}')


if __name__ == '__main__':
    files = ['/tmp/small_ocr.pdf']
    n_jobs = np.clip(len(files), 1, 4)
    results = Parallel(n_jobs, prefer='threads', verbose=10)(
        delayed(do_parse)(p) for p in files
    )
    print(results)
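
The same fan-out can also be written with the standard library's thread pool instead of joblib. This is a minimal sketch, not part of the original client: parse_many is a hypothetical helper name, parse_fn stands in for the client's do_parse, and a stub is used so the snippet runs without a server.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_many(paths, parse_fn, max_workers=4):
    """Call parse_fn on every path from a thread pool; results keep input order."""
    if not paths:
        return []
    workers = max(1, min(len(paths), max_workers))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(parse_fn, paths))


def fake_parse(path):
    # Stand-in for the client's do_parse, which would POST the file to the server.
    return {'file_path': path, 'output_dir': 'stub'}


if __name__ == '__main__':
    print(parse_many(['/tmp/a.pdf', '/tmp/b.png'], fake_parse))
```

Threads are a reasonable fit here because each call mostly waits for the HTTP response. The client's do_parse logs errors and returns None on failure, so filter out None entries before using the results.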

BlackMoki-bot · 2024-09-28T06:52:31Z

Hello, when I run the code the server keeps reporting Exception: Parsing error: 'Layoutlmv3_Predictor' object has no attribute 'parameters', and the client keeps reporting requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: http://127.0.0.1:8000/predict
However, http://127.0.0.1 is reachable. What could be the cause? Any help would be much appreciated!

randydl · 2024-09-30T05:50:20Z

It looks like the problem is in your processing code, not in the service.

flow3rdown · 2024-10-12T10:15:48Z

After switching to this code, table recognition became extremely slow. What could be the cause?

randydl · 2024-10-12T11:17:32Z

Is it also slow if you skip the service and use the magic-pdf CLI instead?

flow3rdown · 2024-10-14T02:39:25Z

With the CLI the speed is normal. Table recognition uses TableMaster.

PoisonousBromineChan · 2024-10-15T15:07:04Z

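I didn't really understand how to use the code, so out of habit I started server.py first, changed the file paths in client.py to my own, and then ran it. The error I got mentions small_ocr.pdf, even though none of the files I want to process is small_ocr.pdf, and I don't know how to fix it.
Is there a simpler way, such as editing magic-pdf.json directly and setting the device field to multiple CUDA devices?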

randydl · 2024-10-16T06:26:41Z

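You probably changed the code incorrectly; it runs fine on my side. If you had changed the file path, small_ocr.pdf could not still show up; it is only an example file. @PoisonousBromineChan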

flow3rdown · 2024-10-17T02:05:37Z

Is the table recognition speed normal when you run it on your side?

randydl · 2024-10-18T09:19:12Z

I haven't verified tables yet. I'll try it when I have time.

234687552 · 2024-10-19T14:03:11Z

Problem description:

When calling through LitServe as in server.py, table recognition is extremely slow.

System & environment:

PRETTY_NAME="Ubuntu 24.04 LTS"

Python 3.10.14

magic-pdf version 0.7.1

paddlepaddle-gpu 3.0.0b1

magic-pdf.json configuration:

{
    "bucket_info":{
        "bucket-name-1":["ak", "sk", "endpoint"],
        "bucket-name-2":["ak", "sk", "endpoint"]
    },
    "models-dir":"/opt/models",
    "device-mode":"cuda",
    "table-config": {
        "model": "TableMaster",
        "is_table_recog_enable": true,
        "max_time": 400
    }
}

Test PDF link:

https://github.com/opendatalab/MinerU/blob/master/demo/demo1.pdf

Using LitServe

Log output:

2024-10-19 21:10:57.105 | INFO | magic_pdf.libs.pdf_check:detect_invalid_chars:57 - cid_count: 0, text_len: 1501, cid_chars_radio: 0.0
2024-10-19 21:10:57.861 | INFO | magic_pdf.model.pdf_extract_kit:__call__:170 - layout detection cost: 0.68
Model initialization complete!
Setup complete for worker 3.

0: 1888x1344 4 embeddings, 92.2ms
Speed: 12.7ms preprocess, 92.2ms inference, 13.2ms postprocess per image at shape (1, 3, 1888, 1344)
2024-10-19 21:10:58.633 | INFO | magic_pdf.model.pdf_extract_kit:__call__:200 - formula nums: 4, mfr time: 0.2
2024-10-19 21:10:58.640 | INFO | magic_pdf.model.pdf_extract_kit:__call__:291 - ------------------table recognition processing begins-----------------
2024-10-19 21:14:13.524 | INFO | magic_pdf.model.pdf_extract_kit:__call__:300 - ------------table recognition processing ends within 194.88404989242554s-----
2024-10-19 21:14:13.525 | INFO | magic_pdf.model.pdf_extract_kit:__call__:317 - table cost: 194.89
2024-10-19 21:14:13.525 | INFO | magic_pdf.model.doc_analyze_by_custom_model:doc_analyze:124 - doc analyze cost: 196.3451521396637
2024-10-19 21:14:13.567 | INFO | magic_pdf.pdf_parse_union_core:pdf_parse_union:221 - page_id: 0, last_page_cost_time: 0.0
2024-10-19 21:14:13.663 | INFO | magic_pdf.para.para_split_v2:__detect_list_lines:143 - 发现了列表，列表行数：[(0, 1)]， [[0]]
2024-10-19 21:14:13.663 | INFO | magic_pdf.para.para_split_v2:__detect_list_lines:156 - 列表行的第0到第1行是列表
2024-10-19 21:14:13.797 | INFO | magic_pdf.pipe.UNIPipe:pipe_mk_markdown:48 - uni_pipe mk mm_markdown finished
2024-10-19 21:14:13.805 | INFO | magic_pdf.pipe.UNIPipe:pipe_mk_uni_format:43 - uni_pipe mk content list finished
2024-10-19 21:14:13.805 | INFO | magic_pdf.tools.common:do_parse:119 - local output dir is /tmp/91dc2fda-fb5c-431f-bbce-9dcdc8ce3596/auto

Using the command line

/opt/mineru_venv/bin/magic-pdf -p origin.pdf -m auto

Log output:

[10/19 21:41:53 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from /opt/models/Layout/model_final.pth ...
[10/19 21:41:53 fvcore.common.checkpoint]: [Checkpointer] Loading from /opt/models/Layout/model_final.pth ...
2024-10-19 21:41:56.518 | INFO     | magic_pdf.model.pdf_extract_kit:__init__:159 - DocAnalysis init done!
2024-10-19 21:41:56.518 | INFO     | magic_pdf.model.doc_analyze_by_custom_model:custom_model_init:98 - model init cost: 21.35542368888855
2024-10-19 21:41:57.207 | INFO     | magic_pdf.model.pdf_extract_kit:__call__:170 - layout detection cost: 0.61

0: 1888x1344 4 embeddings, 91.9ms
Speed: 9.7ms preprocess, 91.9ms inference, 1.1ms postprocess per image at shape (1, 3, 1888, 1344)
2024-10-19 21:41:57.948 | INFO     | magic_pdf.model.pdf_extract_kit:__call__:200 - formula nums: 4, mfr time: 0.19
2024-10-19 21:41:57.956 | INFO     | magic_pdf.model.pdf_extract_kit:__call__:291 - ------------------table recognition processing begins-----------------
[2024/10/19 21:41:59] ppocr DEBUG: dt_boxes num : 18, elapse : 0.045398712158203125
[2024/10/19 21:41:59] ppocr DEBUG: dt_boxes num : 18, elapse : 0.045398712158203125
[2024/10/19 21:41:59] ppocr DEBUG: rec_res num  : 18, elapse : 0.047318220138549805
[2024/10/19 21:41:59] ppocr DEBUG: rec_res num  : 18, elapse : 0.047318220138549805
2024-10-19 21:41:59.425 | INFO     | magic_pdf.model.pdf_extract_kit:__call__:300 - ------------table recognition processing ends within 1.4687747955322266s-----
2024-10-19 21:41:59.425 | INFO     | magic_pdf.model.pdf_extract_kit:__call__:317 - table cost: 1.47
2024-10-19 21:41:59.425 | INFO     | magic_pdf.model.doc_analyze_by_custom_model:doc_analyze:124 - doc analyze cost: 2.828835964202881
2024-10-19 21:41:59.467 | INFO     | magic_pdf.pdf_parse_union_core:pdf_parse_union:221 - page_id: 0, last_page_cost_time: 0.0
2024-10-19 21:42:00.020 | INFO     | magic_pdf.para.para_split_v2:__detect_list_lines:143 - 发现了列表，列表行数：[(0, 1)]， [[0]]
2024-10-19 21:42:00.020 | INFO     | magic_pdf.para.para_split_v2:__detect_list_lines:156 - 列表行的第0到第1行是列表
2024-10-19 21:42:00.154 | INFO     | magic_pdf.pipe.UNIPipe:pipe_mk_markdown:48 - uni_pipe mk mm_markdown finished
2024-10-19 21:42:00.162 | INFO     | magic_pdf.pipe.UNIPipe:pipe_mk_uni_format:43 - uni_pipe mk content list finished
2024-10-19 21:42:00.162 | INFO     | magic_pdf.tools.common:do_parse:119 - local output dir is output/origin/auto
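
The two logs differ mainly in the "table cost" line. A minimal sketch for pulling that figure out of a log and comparing the runs; the function name is hypothetical and the sample lines are abbreviated copies of the log lines above.

```python
import re

TABLE_COST = re.compile(r'table cost: ([0-9.]+)')


def table_cost(log_text):
    """Return the first 'table cost' value (seconds) found in the log, or None."""
    match = TABLE_COST.search(log_text)
    return float(match.group(1)) if match else None


# Abbreviated copies of the two 'table cost' lines from the logs above.
litserve_line = '... pdf_extract_kit:__call__:317 - table cost: 194.89'
cli_line = '... pdf_extract_kit:__call__:317 - table cost: 1.47'

slowdown = table_cost(litserve_line) / table_cost(cli_line)
print(f'table recognition slowdown under LitServe: {slowdown:.1f}x')
```

On these numbers the table step is more than 100 times slower under LitServe, while layout detection and formula recognition take about the same time in both runs.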

234687552 · 2024-10-22T03:19:22Z

I wonder whether this line is what makes table recognition so slow:

https://github.com/opendatalab/MinerU/blob/master/magic_pdf/model/ppTableModel.py#L46

myhloli · 2024-10-22T03:27:57Z

That is indeed the cause: the matching rule is hard-coded there. We will fix it.
For now, as a temporary workaround, change it to:

use_gpu = True if device.startswith("cuda") else False
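
Why the prefix match matters can be shown in isolation. The sketch below assumes the hard-coded rule is an exact comparison against "cuda" (the form quoted elsewhere in this thread) and that LitServe hands each worker a device string such as cuda:0 or cuda:1; both function names are hypothetical.

```python
def use_gpu_exact(device):
    # Assumed form of the hard-coded rule: only the bare string 'cuda' matches.
    return device == 'cuda'


def use_gpu_prefix(device):
    # Temporary fix: any 'cuda' or 'cuda:N' device string enables the GPU.
    return device.startswith('cuda')


for device in ('cuda', 'cuda:0', 'cuda:1', 'cpu'):
    print(device, use_gpu_exact(device), use_gpu_prefix(device))
```

With an exact match, a worker that receives cuda:1 ends up with use_gpu set to False, which would be consistent with the slow table step reported above.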

234687552 · 2024-10-23T12:10:28Z

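Problem description:

Serving the API as in server.py and load-testing with 15 concurrent requests on 4 GPUs, gpu[0] is always saturated while the other GPUs are relatively idle.

Expected result:

Load spread evenly across the GPUs.

Command run during the test:

nvidia-smi --loop=1

Output: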

                                                                                   
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
Wed Oct 23 19:59:02 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.12              Driver Version: 550.90.12      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L20                     On  |   00000000:D3:00.0 Off |                    0 |
| N/A   68C    P0            228W /  350W |   19876MiB /  46068MiB |    100%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA L20                     On  |   00000000:D4:00.0 Off |                    0 |
| N/A   50C    P0            146W /  350W |    9629MiB /  46068MiB |     38%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA L20                     On  |   00000000:D6:00.0 Off |                    0 |
| N/A   45C    P0            154W /  350W |    9629MiB /  46068MiB |     46%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA L20                     On  |   00000000:D7:00.0 Off |                    0 |
| N/A   45C    P0             90W /  350W |    9629MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
Wed Oct 23 19:59:04 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.12              Driver Version: 550.90.12      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L20                     On  |   00000000:D3:00.0 Off |                    0 |
| N/A   68C    P0            246W /  350W |   20234MiB /  46068MiB |    100%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA L20                     On  |   00000000:D4:00.0 Off |                    0 |
| N/A   51C    P0            155W /  350W |    9629MiB /  46068MiB |     43%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA L20                     On  |   00000000:D6:00.0 Off |                    0 |
| N/A   43C    P0            130W /  350W |    9629MiB /  46068MiB |      5%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA L20                     On  |   00000000:D7:00.0 Off |                    0 |
| N/A   45C    P0             93W /  350W |    9629MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
Wed Oct 23 19:59:05 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.12              Driver Version: 550.90.12      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L20                     On  |   00000000:D3:00.0 Off |                    0 |
| N/A   68C    P0            217W /  350W |   20234MiB /  46068MiB |    100%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA L20                     On  |   00000000:D4:00.0 Off |                    0 |
| N/A   50C    P0            158W /  350W |    9629MiB /  46068MiB |     34%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA L20                     On  |   00000000:D6:00.0 Off |                    0 |
| N/A   43C    P0             88W /  350W |    9629MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA L20                     On  |   00000000:D7:00.0 Off |                    0 |
| N/A   45C    P0             90W /  350W |    9629MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
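
To compare load across GPUs numerically rather than reading the full tables, nvidia-smi's CSV query mode can be sampled and parsed. A minimal sketch: the function names are hypothetical, and the sample text reuses the utilization and memory figures from the first snapshot above.

```python
import csv
import io

# Shell command that produces this format:
#   nvidia-smi --query-gpu=index,utilization.gpu,memory.used --format=csv,noheader,nounits
SAMPLE = """0, 100, 19876
1, 38, 9629
2, 46, 9629
3, 0, 9629
"""


def parse_gpu_csv(text):
    """Parse 'index, util %, memory MiB' rows into a list of dicts."""
    rows = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not row:
            continue  # skip blank lines
        index, util, mem = row
        rows.append({'index': int(index), 'util': int(util), 'mem_mib': int(mem)})
    return rows


def util_spread(rows):
    """Difference between the busiest and the idlest GPU, in percentage points."""
    utils = [row['util'] for row in rows]
    return max(utils) - min(utils)


gpus = parse_gpu_csv(SAMPLE)
print(gpus)
print('utilization spread:', util_spread(gpus))
```

A spread that stays large over many samples, together with one card holding roughly twice the memory of the others, points to work being placed on that card by more than one worker.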

randydl · 2024-10-24T01:42:35Z

@234687552 Do you have table recognition enabled? If so, try disabling it and test the load balancing again; that will tell us whether table recognition is the cause.

randydl · 2024-10-24T01:49:16Z

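Thanks, it works now. I additionally installed a library with pip install python-multipart, and after starting the server the request succeeded. Also, if you only want the .md file as output, to save storage space and time, you can add this import:

from magic_pdf.libs.MakeContentConfig import MakeMode  # add this line

Then modify the do_parse call: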
        do_parse(self.output_dir, pdf_name, inputs[0], [],
                 **inputs[1],
                 f_draw_span_bbox=False,
                 f_draw_layout_bbox=False,
                 f_dump_md=True,
                 f_dump_middle_json=False,
                 f_dump_model_json=False,
                 f_dump_orig_pdf=False,
                 f_dump_content_list=False,
                 f_make_md_mode=MakeMode.MM_MD,
                 f_draw_model_bbox=False)

A simpler way is to pass these parameters when calling do_parse in the client; the server code then needs no changes.
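
The client forwards its keyword arguments as a JSON string in the kwargs form field, so the output switches can be set on the client side. A minimal sketch of that payload; build_kwargs is a hypothetical helper, and f_make_md_mode is left out because the thread does not show whether MakeMode.MM_MD can be serialized with json.dumps.

```python
import json


def build_kwargs(**overrides):
    """Build the JSON string sent in the 'kwargs' form field of the request."""
    kwargs = {
        'parse_method': 'auto',
        'debug_able': False,
        # Output switches from the do_parse call above: keep only the Markdown file.
        'f_draw_span_bbox': False,
        'f_draw_layout_bbox': False,
        'f_draw_model_bbox': False,
        'f_dump_md': True,
        'f_dump_middle_json': False,
        'f_dump_model_json': False,
        'f_dump_orig_pdf': False,
        'f_dump_content_list': False,
    }
    kwargs.update(overrides)
    return json.dumps(kwargs)


payload = {'kwargs': build_kwargs()}
print(payload['kwargs'])
```

With the client from the first post, the equivalent is to pass the same names directly, for example do_parse('/tmp/small_ocr.pdf', f_dump_md=True, f_dump_middle_json=False). Do not also hard-code these names in the server's do_parse call, since a keyword passed twice raises a TypeError.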

234687552 · 2024-10-24T02:14:31Z

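@234687552 Do you have table recognition enabled? If so, try disabling it and test the load balancing again; that will tell us whether table recognition is the cause.

Situation:

Table recognition was previously enabled: "is_table_recog_enable": true,

Test after disabling it: gpu[0] is no longer constantly saturated, and the other GPUs run in a relatively balanced way.

Table recognition disabled: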

cat ~/magic-pdf.json

{
    "bucket_info":{
        "bucket-name-1":["ak", "sk", "endpoint"],
        "bucket-name-2":["ak", "sk", "endpoint"]
    },
    "models-dir":"/opt/models",
    "device-mode":"cuda",
    "table-config": {
        "model": "TableMaster",
        "is_table_recog_enable": false,
        "max_time": 400
    }
}

GPU usage:

nvidia-smi --loop=1

                                                                                        
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
Thu Oct 24 10:07:57 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.12              Driver Version: 550.90.12      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L20                     On  |   00000000:D3:00.0 Off |                    0 |
| N/A   58C    P0            169W /  350W |   15238MiB /  46068MiB |     47%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA L20                     On  |   00000000:D4:00.0 Off |                    0 |
| N/A   59C    P0            165W /  350W |    9627MiB /  46068MiB |     43%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA L20                     On  |   00000000:D6:00.0 Off |                    0 |
| N/A   54C    P0            154W /  350W |    9627MiB /  46068MiB |     22%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA L20                     On  |   00000000:D7:00.0 Off |                    0 |
| N/A   53C    P0            109W /  350W |    9619MiB /  46068MiB |     15%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
Thu Oct 24 10:07:58 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.12              Driver Version: 550.90.12      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L20                     On  |   00000000:D3:00.0 Off |                    0 |
| N/A   62C    P0            193W /  350W |   15238MiB /  46068MiB |     76%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA L20                     On  |   00000000:D4:00.0 Off |                    0 |
| N/A   60C    P0            175W /  350W |    9627MiB /  46068MiB |     48%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA L20                     On  |   00000000:D6:00.0 Off |                    0 |
| N/A   52C    P0            176W /  350W |    9627MiB /  46068MiB |     56%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA L20                     On  |   00000000:D7:00.0 Off |                    0 |
| N/A   60C    P0            192W /  350W |    9629MiB /  46068MiB |     79%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
Thu Oct 24 10:08:00 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.12              Driver Version: 550.90.12      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L20                     On  |   00000000:D3:00.0 Off |                    0 |
| N/A   57C    P0            204W /  350W |   15238MiB /  46068MiB |     42%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA L20                     On  |   00000000:D4:00.0 Off |                    0 |
| N/A   59C    P0            189W /  350W |    9627MiB /  46068MiB |     86%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA L20                     On  |   00000000:D6:00.0 Off |                    0 |
| N/A   51C    P0            114W /  350W |    9627MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA L20                     On  |   00000000:D7:00.0 Off |                    0 |
| N/A   54C    P0            114W /  350W |    9629MiB /  46068MiB |     19%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

234687552 · 2024-10-24T02:20:10Z

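In our case table recognition has to stay enabled, and I don't know how to make table recognition also use multiple GPUs on a single machine in a balanced way.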

randydl · 2024-10-24T02:46:34Z

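So my guess seems to be right: this is still caused by a bug in table recognition. Somewhere in the code the table model is probably still loaded with .cuda and does not correctly recognize devices such as cuda:1. As a result every table model is loaded onto GPU 0, which is why GPU 0 is saturated.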

randydl · 2024-10-24T06:51:39Z

For the TableMaster table recognition model, the bug is here:
https://github.com/opendatalab/MinerU/blob/master/magic_pdf/model/ppTableModel.py#L55
Changing only use_gpu = True if device == "cuda" else False is not enough; how the use_gpu variable is used also needs to be investigated.

For the struct_eqtable table model, the bug is here:
https://github.com/opendatalab/MinerU/blob/master/magic_pdf/model/pek_sub_modules/structeqtable/StructTableModel.py#L9
This one should be easy to fix: changing it to self.model = StructTable(self.model_path, self.max_new_tokens, self.max_time).to(device) should work.

@myhloli @234687552

myhloli · 2024-10-24T07:02:24Z

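Paddle specifies the GPU differently from torch, and at present Paddle always uses the first card for acceleration. Our development focus is still on improving parsing quality, so for now we cannot spare anyone to optimize the multi-GPU allocation logic. Developers who are able to solve the multi-GPU allocation problem are welcome to submit a PR.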

randydl · 2024-10-24T08:22:36Z

server.py

import os
import torch
import filetype
import json, uuid
import litserve as ls
from fastapi import HTTPException
from magic_pdf.tools.common import do_parse
from magic_pdf.model.doc_analyze_by_custom_model import ModelSingleton


class MinerUAPI(ls.LitAPI):
    def __init__(self, output_dir='/tmp'):
        self.output_dir = output_dir

    @staticmethod
    def clean_memory(device):
        import gc
        if torch.cuda.is_available():
            with torch.cuda.device(device):
                torch.cuda.empty_cache()
                torch.cuda.ipc_collect()
        gc.collect()

    def setup(self, device):
        device = torch.device(device)
        os.environ['CUDA_VISIBLE_DEVICES'] = str(device.index)
        model_manager = ModelSingleton()
        model_manager.get_model(True, False)
        model_manager.get_model(False, False)
        print(f'Model initialization complete!')

    def decode_request(self, request):
        file = request['file'].file.read()
        kwargs = json.loads(request['kwargs'])
        assert filetype.guess_mime(file) == 'application/pdf'
        return file, kwargs

    def predict(self, inputs):
        try:
            pdf_name = str(uuid.uuid4())
            do_parse(self.output_dir, pdf_name, inputs[0], [], **inputs[1])
            return pdf_name
        except Exception as e:
            raise HTTPException(status_code=500, detail=f'{e}')
        finally:
            self.clean_memory(self.device)

    def encode_response(self, response):
        return {'output_dir': response}


if __name__ == '__main__':
    server = ls.LitServer(MinerUAPI(), accelerator='gpu', devices=[0, 1], timeout=False)
    server.run(port=8000)

magic-pdf.json

{
    "bucket_info":{
        "bucket-name-1":["ak", "sk", "endpoint"],
        "bucket-name-2":["ak", "sk", "endpoint"]
    },
    "models-dir":"/opt/models",
    "device-mode":"cuda",
    "table-config": {
        "model": "TableMaster",
        "is_table_recog_enable": true,
        "max_time": 400
    }
}

Try replacing server.py with the new code above, enable table recognition, and run the load test again; it should work now. @234687552
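
The new setup() relies on CUDA_VISIBLE_DEVICES, which is read when CUDA is first initialized in a process. A worker restricted this way sees only its own card, and that card appears as device 0 inside the worker, so a framework that always picks the first card still lands on a different physical GPU per worker. The sketch below isolates that step without torch. It assumes each LitServe worker runs in its own process; the function names are hypothetical, and mapping a bare 'cuda' to index 0 is an assumption (torch.device('cuda').index is None, which the code above would write into the variable as the string 'None').

```python
import os


def gpu_index(device):
    """Extract the GPU index from strings such as 'cuda:1'; bare 'cuda' maps to 0."""
    name, _, index = str(device).partition(':')
    if name != 'cuda':
        raise ValueError(f'not a CUDA device: {device!r}')
    return int(index) if index else 0


def pin_worker_to_gpu(device):
    """Restrict this process to one GPU; call before anything initializes CUDA."""
    index = gpu_index(device)
    os.environ['CUDA_VISIBLE_DEVICES'] = str(index)
    return index


if __name__ == '__main__':
    pin_worker_to_gpu('cuda:1')
    print(os.environ['CUDA_VISIBLE_DEVICES'])
```

Setting the variable after CUDA has already been initialized in the same process has no effect, so the call belongs at the very start of setup().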

234687552 · 2024-10-24T13:00:46Z

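Situation
@randydl

GPU usage is now evenly distributed (see the logs and screenshots below for details), but clean_memory raises an exception.

The change I applied, for reference: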

    def setup(self, device):
        device = torch.device(device)
        os.environ['CUDA_VISIBLE_DEVICES'] = str(device.index)
        model_manager = ModelSingleton()
        model_manager.get_model(True, False)
        model_manager.get_model(False, False)
        print(f'Model initialization complete!')

Exception stack trace:

Please check the error trace for more details.
Traceback (most recent call last):
File "/opt/mineru_venv/lib/python3.10/site-packages/litserve/loops.py", line 134, in run_single_loop
y = _inject_context(
File "/opt/mineru_venv/lib/python3.10/site-packages/litserve/loops.py", line 55, in _inject_context
return func(*args, **kwargs)
File "/app/app.py", line 144, in predict
self.clean_memory(self.device)
File "/app/app.py", line 83, in clean_memory
with torch.cuda.device(device):
File "/opt/mineru_venv/lib/python3.10/site-packages/torch/cuda/__init__.py", line 365, in __enter__
self.prev_idx = torch.cuda._exchange_device(self.idx)
RuntimeError: CUDA error: invalid device ordinal
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

GPU usage

nvidia-smi --loop=1

                                                                                        
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
Thu Oct 24 20:54:03 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.12              Driver Version: 550.90.12      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L20                     On  |   00000000:D3:00.0 Off |                    0 |
| N/A   51C    P0            135W /  350W |   11611MiB /  46068MiB |     18%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA L20                     On  |   00000000:D4:00.0 Off |                    0 |
| N/A   54C    P0            124W /  350W |   11435MiB /  46068MiB |     23%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA L20                     On  |   00000000:D6:00.0 Off |                    0 |
| N/A   48C    P0            112W /  350W |   12227MiB /  46068MiB |     20%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA L20                     On  |   00000000:D7:00.0 Off |                    0 |
| N/A   51C    P0            124W /  350W |   11435MiB /  46068MiB |     26%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
Thu Oct 24 20:54:05 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.12              Driver Version: 550.90.12      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L20                     On  |   00000000:D3:00.0 Off |                    0 |
| N/A   51C    P0            117W /  350W |   11611MiB /  46068MiB |     23%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA L20                     On  |   00000000:D4:00.0 Off |                    0 |
| N/A   54C    P0            130W /  350W |   11435MiB /  46068MiB |     27%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA L20                     On  |   00000000:D6:00.0 Off |                    0 |
| N/A   48C    P0            118W /  350W |   12227MiB /  46068MiB |     23%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA L20                     On  |   00000000:D7:00.0 Off |                    0 |
| N/A   51C    P0            132W /  350W |   11435MiB /  46068MiB |     31%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
Thu Oct 24 20:54:06 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.12              Driver Version: 550.90.12      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L20                     On  |   00000000:D3:00.0 Off |                    0 |
| N/A   51C    P0            125W /  350W |   11611MiB /  46068MiB |     27%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA L20                     On  |   00000000:D4:00.0 Off |                    0 |
| N/A   54C    P0            138W /  350W |   11435MiB /  46068MiB |     30%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA L20                     On  |   00000000:D6:00.0 Off |                    0 |
| N/A   48C    P0            126W /  350W |   12227MiB /  46068MiB |     27%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA L20                     On  |   00000000:D7:00.0 Off |                    0 |
| N/A   52C    P0            143W /  350W |   11435MiB /  46068MiB |     36%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

randydl · 2024-10-24T13:14:23Z

Thanks, looks like progress! Try deleting the line `with torch.cuda.device(device):`. @234687552
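A likely reading of the `invalid device ordinal` error, sketched under the assumption that CUDA renumbers the devices listed in `CUDA_VISIBLE_DEVICES` starting from 0: once `setup` restricts the worker to one physical GPU, that GPU is the only one the process can address and it becomes ordinal 0, so a stored `self.device` such as `cuda:1` no longer refers to anything. The helper `is_valid_ordinal` below is hypothetical and only models that renumbering in plain Python; it does not call CUDA.

```python
def visible_device_count(cuda_visible_devices: str) -> int:
    """Count the entries listed in a CUDA_VISIBLE_DEVICES value (simplified model)."""
    return len([tok for tok in cuda_visible_devices.split(',') if tok.strip()])


def is_valid_ordinal(ordinal: int, cuda_visible_devices: str) -> bool:
    """Return True if `ordinal` can be addressed inside the process.

    Visible devices are renumbered 0..n-1, so only those ordinals exist.
    """
    return 0 <= ordinal < visible_device_count(cuda_visible_devices)


# Worker pinned to physical GPU 1: inside the process only ordinal 0 exists,
# which is why entering torch.cuda.device('cuda:1') fails in clean_memory.
print(is_valid_ordinal(1, '1'))
print(is_valid_ordinal(0, '1'))
```

With the `with torch.cuda.device(device):` line removed, `torch.cuda.empty_cache()` acts on the current device, which is the only visible one in each worker.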

234687552 · 2024-10-25T01:22:03Z

Thanks for the support. Multi-GPU now works correctly.

xcvil · 2024-12-02T10:50:01Z

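I tried it, and it does not seem to be one service distributed across multiple cards; rather, one service is deployed per card. Is there a solution for this?

I have a similar problem: after I submit a multi-GPU job, the server only runs on the first card, and all the multiprocessing workers run on that card.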

randydl · 2024-12-03T02:07:58Z

@xcvil Ensure CUDA_VISIBLE_DEVICES is set correctly.

Thanks a lot for replying. Could you please give me some hints about how to set CUDA_VISIBLE_DEVICES?

In my scenario, CUDA_VISIBLE_DEVICES=0,1,2,3 (I requested 4 GPUs). I noticed that server.py contains this code:

def setup(self, device):
    if device.startswith('cuda'):
        os.environ['CUDA_VISIBLE_DEVICES'] = device.split(':')[-1]
        if torch.cuda.device_count() > 1:
            raise RuntimeError("Remove any CUDA actions before setting 'CUDA_VISIBLE_DEVICES'.")

(MinerU/projects/multi_gpu/server.py, line 20 in b9f3435: if torch.cuda.device_count() > 1:)

These lines raise an error for me.

The check torch.cuda.device_count() > 1 verifies that CUDA_VISIBLE_DEVICES actually took effect. If any CUDA operation runs in the process before the variable is set, the setting can be ignored and the process keeps seeing every GPU. After the variable has been set to a single index, device_count() should report exactly one device; if it still reports more than one, an earlier CUDA operation interfered with the device visibility setting, and the error tells you to remove it. Passing the check means each process selects and uses only its assigned GPU.
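The guard described above can be sketched without a GPU. This is a minimal illustration, not the project's implementation: `pin_worker_to_gpu` is a hypothetical helper, and the device-count function is injected as a parameter so the logic can be exercised anywhere. In a real worker the callable would be `torch.cuda.device_count`.

```python
import os


def pin_worker_to_gpu(device: str, device_count) -> str:
    """Restrict this process to the GPU named by `device` (e.g. 'cuda:1').

    `device_count` is a zero-argument callable; in a real server it would be
    torch.cuda.device_count. It is injected here so the sketch runs without CUDA.
    Returns the GPU id that was pinned, or '' when nothing was pinned.
    """
    if not device.startswith('cuda') or ':' not in device:
        return ''
    gpu_id = device.split(':')[-1]
    os.environ['CUDA_VISIBLE_DEVICES'] = gpu_id
    # If CUDA had already been used, the variable may be ignored and the
    # process would still see every GPU, i.e. more than one device.
    if device_count() > 1:
        raise RuntimeError("Remove any CUDA actions before setting 'CUDA_VISIBLE_DEVICES'.")
    return gpu_id
```

If this error appears in practice, a reasonable first step is to look for imports or module-level code that touch CUDA before `setup` runs.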

zxwsd · 2024-12-03T15:01:51Z

@randydl I modified this code a bit. Will it support running on multiple GPUs together this way?

import os
import fitz
import torch
import base64
import litserve as ls
from uuid import uuid4
from fastapi import HTTPException
from filetype import guess_extension
from magic_pdf.tools.common import do_parse
from magic_pdf.model.doc_analyze_by_custom_model import ModelSingleton


class MinerUAPI(ls.LitAPI):
    def __init__(self, output_dir='/tmp'):
        self.output_dir = output_dir

    def setup(self, device):
        if device.startswith('cuda'):
            os.environ['CUDA_VISIBLE_DEVICES'] = device.split(':')[-1]

        model_manager = ModelSingleton()
        model_manager.get_model(True, False)
        model_manager.get_model(False, False)
        print(f'Model initialization complete on {device}!')

    def decode_request(self, request):
        file = request['file']
        file = self.to_pdf(file)
        opts = request.get('kwargs', {})
        opts.setdefault('debug_able', False)
        opts.setdefault('parse_method', 'auto')
        return file, opts

    def predict(self, inputs):
        try:
            do_parse(self.output_dir, pdf_name := str(uuid4()), inputs[0], [], **inputs[1])
            return pdf_name
        except Exception as e:
            raise HTTPException(status_code=500, detail=str(e))
        finally:
            self.clean_memory()

    def encode_response(self, response):
        return {'output_dir': response}

    def clean_memory(self):
        import gc
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
            torch.cuda.ipc_collect()
        gc.collect()

    def to_pdf(self, file_base64):
        try:
            file_bytes = base64.b64decode(file_base64)
            file_ext = guess_extension(file_bytes)
            with fitz.open(stream=file_bytes, filetype=file_ext) as f:
                if f.is_pdf: return f.tobytes()
                return f.convert_to_pdf()
        except Exception as e:
            raise HTTPException(status_code=500, detail=str(e))


if __name__ == '__main__':
    server = ls.LitServer(
        MinerUAPI(output_dir='/tmp'),
        accelerator='cuda',
        devices='auto',
        workers_per_device=1,
        timeout=False
    )
    server.run(port=8000)

xcvil · 2024-12-03T17:42:17Z

You can try this:
#1157 (comment)
I modified it so that it runs on LSF/Slurm. If it doesn't work for you, post the error message and we can look at it together!

zxwsd · 2024-12-04T13:11:23Z

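@xcvil I tried it, and it still runs on only one GPU. No matter which GPUs I specify, it only runs on the first one specified: with 1,3 it only uses 1, and with 0,1 it only uses 0.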

zxwsd · 2024-12-04T13:14:13Z

@234687552 Hi, I tried the code you posted. Why does it still run on only one GPU for me?

xcvil · 2024-12-04T15:53:20Z

Could you share your code? If you are on Slurm/LSF, please post the shell script as well.

zxwsd · 2024-12-05T06:36:38Z

@xcvil

Server side:

import os
import fitz
import torch
import base64
import litserve as ls
from uuid import uuid4
from fastapi import HTTPException
from filetype import guess_extension
from magic_pdf.tools.common import do_parse
from magic_pdf.model.doc_analyze_by_custom_model import ModelSingleton


class MinerUAPI(ls.LitAPI):
    def __init__(self, output_dir='/tmp'):
        self.output_dir = output_dir

    def setup(self, device):
        os.environ['CUDA_VISIBLE_DEVICES'] = device.split(':')[-1]
        # torch.cuda.set_device(device)
        model_manager = ModelSingleton()
        model_manager.get_model(True, False)
        model_manager.get_model(False, False)
        print(f'Model initialization complete on {device}!')

    def decode_request(self, request):
        file = request['file']
        file = self.to_pdf(file)
        opts = request.get('kwargs', {})
        opts.setdefault('debug_able', False)
        opts.setdefault('parse_method', 'auto')
        return file, opts

    def predict(self, inputs):
        try:
            do_parse(self.output_dir, pdf_name := str(uuid4()), inputs[0], [], **inputs[1])
            return pdf_name
        except Exception as e:
            raise HTTPException(status_code=500, detail=str(e))
        finally:
            self.clean_memory()

    def encode_response(self, response):
        return {'output_dir': response}

    def clean_memory(self):
        import gc
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
            torch.cuda.ipc_collect()
        gc.collect()

    def to_pdf(self, file_base64):
        try:
            file_bytes = base64.b64decode(file_base64)
            file_ext = guess_extension(file_bytes)
            with fitz.open(stream=file_bytes, filetype=file_ext) as f:
                if f.is_pdf: return f.tobytes()
                return f.convert_to_pdf()
        except Exception as e:
            raise HTTPException(status_code=500, detail=str(e))


if __name__ == '__main__':
    server = ls.LitServer(
        MinerUAPI(output_dir='/data/cuixk/output'),
        accelerator='cuda',
        devices=[0,1],
        workers_per_device=1,
        timeout=False
    )
    server.run(port=8000)

Client side:

import os
import base64
import requests
import numpy as np
from loguru import logger
from joblib import Parallel, delayed


def to_b64(file_path):
    try:
        with open(file_path, 'rb') as f:
            return base64.b64encode(f.read()).decode('utf-8')
    except Exception as e:
        raise Exception(f'File: {file_path} - Info: {e}')


def do_parse(file_path, url='http://127.0.0.1:8000/predict', **kwargs):
    try:
        response = requests.post(url, json={
            'file': to_b64(file_path),
            'kwargs': kwargs
        })

        if response.status_code == 200:
            output = response.json()
            output['file_path'] = file_path
            return output
        else:
            raise Exception(response.text)
    except Exception as e:
        logger.error(f'File: {file_path} - Info: {e}')


def process_pdf_files_concurrently(pdf_files):
    n_jobs = np.clip(len(pdf_files), 1, 2)
    results = Parallel(n_jobs = n_jobs, prefer='threads', verbose=10)(
        delayed(do_parse)(p) for p in pdf_files
    )
    print(results)


def process_files_in_batches(directory, batch_size=20):
    pdf_files = []
    for root, dirs, files in os.walk(directory):
        for file in files:
            if file.lower().endswith('.pdf'):
                pdf_files.append(os.path.join(root, file))
                if len(pdf_files) >= batch_size:
                    print(f"the pdf files are: ${pdf_files}")
                    process_pdf_files_concurrently(pdf_files)
                    pdf_files = []


if __name__ == '__main__':
    directory = "/data/cuixk/test1/knowleges"
    batch_size = 20
    process_files_in_batches(directory, batch_size=batch_size)

When running, it only shows

xcvil · 2024-12-07T10:41:26Z

@zxwsd Did you try this?
#1157 (comment)

zxwsd · 2024-12-08T05:46:23Z

@xcvil I switched versions, downgrading MinerU from 10.5 to 0.9.0, and after that multi-GPU parallel processing worked normally.

234687552 · 2024-12-18T07:36:36Z

@xcvil @zxwsd

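It is worth studying LitServer for single-machine, multi-GPU deployment. In practice, one incoming request saturates at most one card; it does not occupy the other cards.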
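The point that one request occupies at most one card can be pictured with a toy model. This sketch assumes each device has one worker pulling jobs from a shared queue; threads stand in for the per-device worker processes, `serve` is a hypothetical name, and the sleep stands in for the parse work. It is not how LitServe is implemented internally.

```python
import queue
import threading
import time


def serve(jobs, devices=(0, 1)):
    """Simulate one worker per device pulling jobs from a shared queue.

    Returns a list of (job, device) pairs. Each job is handled by exactly
    one worker, so a single job never spans several devices.
    """
    pending = queue.Queue()
    for job in jobs:
        pending.put(job)
    handled = []
    lock = threading.Lock()

    def worker(device):
        while True:
            try:
                job = pending.get_nowait()
            except queue.Empty:
                return  # nothing left for this worker
            time.sleep(0.01)  # stand-in for the parse work
            with lock:
                handled.append((job, device))

    threads = [threading.Thread(target=worker, args=(d,)) for d in devices]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handled


print(serve(['only.pdf']))
print(serve([f'{i}.pdf' for i in range(8)]))
```

Under this model, keeping several GPUs busy requires the client to have several requests in flight at once, which is what the threaded client earlier in the thread does.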

Ronass · 2024-12-20T10:48:29Z

@randydl Could you help take a look? How can I support multipart/form-data input? I want to deploy an API service and used server.py directly, but it never receives the data.

curl -i -X POST \
  -H "Content-Type:multipart/form-data" \
  -F "file=@\"./10-1.pdf\";type=application/pdf;filename=\"10-1.pdf\"" \
  -F "parse_method=auto" \
  'http://0.0.0.0:8000/predict'

This is the request format. I don't know how the decode_request method can receive these two parameters.

randydl · 2024-12-23T07:35:01Z

Please refer to https://github.com/opendatalab/MinerU/blob/master/projects/multi_gpu/client.py
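For orientation, the client that zxwsd posted earlier in this thread sends JSON rather than multipart/form-data: a base64-encoded `file` string plus a `kwargs` object. Below is a minimal sketch of building such a body with the standard library only. `build_predict_payload` is a hypothetical helper name, and the exact fields the current client.py expects should be checked against the linked file.

```python
import base64
import json


def build_predict_payload(file_bytes: bytes, **kwargs) -> str:
    """Return a JSON request body: base64 file content plus parse options."""
    return json.dumps({
        'file': base64.b64encode(file_bytes).decode('utf-8'),
        'kwargs': kwargs,
    })


body = build_predict_payload(b'%PDF-1.4 example bytes', parse_method='auto')
decoded = json.loads(body)
# The receiving side can recover the original bytes with base64.b64decode.
print(base64.b64decode(decoded['file']) == b'%PDF-1.4 example bytes')
print(decoded['kwargs'])
```

A body like this would be posted with a `Content-Type: application/json` header, which is what passing a dict through the `json=` argument of `requests.post` does in the client above.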

hzh747117982 · 2025-02-13T06:10:24Z

When a PDF file is very large, does this have any safeguard against running out of GPU memory?

seedclaimer · 2025-02-20T04:14:06Z

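I ran the example code on Windows with two 3090 cards, and both cards showed GPU memory usage.
However, the core utilization of cuda0 rose noticeably while the core utilization of cuda1 stayed at 0, with power draw at idle level. It looks like all computation is happening on cuda0?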

magic_pdf==1.1.0

plyu3 · 2025-03-04T11:05:43Z

Using it as-is, I ran into this problem:

TypeError: cannot pickle '_io.BufferedRandom' object
2025-03-04 19:02:38.703 | ERROR | main:do_parse:35 - File: tmp/small_ocr.pdf - Info: Internal Server Error
[Parallel(n_jobs=1)]: Done 1 tasks | elapsed: 0.2s
[Parallel(n_jobs=1)]: Done 1 tasks | elapsed: 0.2s
[None]

ywh-my · 2025-03-05T06:17:56Z

I talked with the LitServe developers. They said you need starlette == 0.45.3 to avoid this problem. See: Lightning-AI/LitServe#443

tongyuhome · 2025-03-08T15:28:11Z

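You probably changed the code incorrectly; it runs fine on my side. If you changed the file path, how could small_ocr.pdf still show up? It is only an example file. @PoisonousBromineChan

Is the table recognition speed normal when you run it?

After using this code, table recognition became extremely slow. What is the reason?

If you skip the service and use the magic-pdf CLI instead, is it slow?

In that case the speed is normal. Table recognition uses TableMaster.

Was this problem ever tracked down and resolved?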

randydl · 2025-03-09T02:55:16Z

Solved. Use the code here: https://github.com/opendatalab/MinerU/tree/master/projects/multi_gpu

tongyuhome · 2025-03-09T05:29:04Z

May I ask which part of the code mainly solves this problem?
I am still on an earlier MinerU version, so it is not convenient to use that code directly. If possible, I would like to replace only the part that fixes the problem.

a694724555 · 2025-03-10T03:43:56Z

Thanks, got it working. I additionally installed the library with pip install python-multipart, then started the server program, and the request succeeded. Also, if you only want to output the .md file to save storage space and time, you can add this import:

from magic_pdf.libs.MakeContentConfig import MakeMode  # add this line

and modify the do_parse call:

        do_parse(self.output_dir,
                 pdf_name,
                 inputs[0],
                 [],
                 **inputs[1],
                 f_draw_span_bbox=False,
                 f_draw_layout_bbox=False,
                 f_dump_md=True,
                 f_dump_middle_json=False,
                 f_dump_model_json=False,
                 f_dump_orig_pdf=False,
                 f_dump_content_list=False,
                 f_make_md_mode=MakeMode.MM_MD,
                 f_draw_model_bbox=False)

"No module named 'magic_pdf.libs.MakeContentConfig'": why can't I find this package?

ywh-my · 2025-03-10T03:47:31Z

That is the old version. For the new version, you will need to look up where this variable lives yourself.


cyKron613 · 2025-03-11T04:31:19Z

I am seeing the same problem that seedclaimer reported above. How can it be solved?

a694724555 · 2025-03-11T06:12:56Z

Thanks, found it.

randydl · 2025-03-11T23:46:32Z

Are you using the code in projects?

cyKron613 · 2025-03-14T08:32:04Z

#1895 (comment)
This is the specific problem I ran into; there are screenshots there. It is not exactly the same, but I wrote it following your approach.

randydl · 2025-03-14T08:51:07Z

I looked at the code you wrote, and it does not look right. Let me explain the core idea behind multi-GPU parallel processing: start multiple processes and assign a different GPU to each one (set through CUDA_VISIBLE_DEVICES, with exactly one GPU per process). Every process runs the MinerU processing, and the processes are isolated from one another.
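The idea above can be sketched with the standard library alone. In this illustration each child process is started with its own `CUDA_VISIBLE_DEVICES` value, so the variable is in place before the child runs any code at all. `launch_workers` and `CHILD` are hypothetical names, and the child only reports the variable; a real worker would load the models and run the MinerU parse there.

```python
import os
import subprocess
import sys

# Child program: report which GPU this process is allowed to see.
CHILD = "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"


def launch_workers(gpu_ids):
    """Start one isolated process per GPU and collect what each one sees."""
    procs = []
    for gpu in gpu_ids:
        # Copy the parent environment and pin exactly one GPU for this child.
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
        procs.append(subprocess.Popen(
            [sys.executable, '-c', CHILD],
            env=env, stdout=subprocess.PIPE, text=True,
        ))
    # communicate() waits for the child and returns (stdout, stderr).
    return [p.communicate()[0].strip() for p in procs]


print(launch_workers([0, 1]))
```

Because the children are separate processes, nothing one of them does with CUDA can change which device another one sees.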

caixiongjiang · 2025-03-31T13:56:51Z

#!/usr/bin/env python
# -*- coding: UTF-8 -*-
"""=================================================
@PROJECT_NAME: database-services
@File    : multi_gpu_app.py.py
@Author  : JarsonCai
@Date    : 2025/3/29 23:10
@Function: 
    MinerU multi-GPU parallel inference service
@Modify History:
         
@Copyright：Copyright(c) 2025-2027. All Rights Reserved
=================================================="""

import os
import json
import gc
import tempfile
import base64
import shutil
import fitz
import filetype
from typing import Tuple

import torch

import litserve as ls
from pathlib import Path
from fastapi import HTTPException
from glob import glob
from base64 import b64encode
from io import StringIO
from loguru import logger

import magic_pdf.model as model_config
from magic_pdf.config.enums import SupportedPdfParseMethod
from magic_pdf.data.data_reader_writer import DataWriter, FileBasedDataWriter
from magic_pdf.data.dataset import PymuDocDataset
from magic_pdf.model.doc_analyze_by_custom_model import doc_analyze
from magic_pdf.tools.cli import convert_file_to_pdf

model_config.__use_inside_model__ = True


class MemoryDataWriter(DataWriter):
    def __init__(self):
        self.buffer = StringIO()

    def write(self, path: str, data: bytes) -> None:
        if isinstance(data, str):
            self.buffer.write(data)
        else:
            self.buffer.write(data.decode("utf-8"))

    def write_string(self, path: str, data: str) -> None:
        self.buffer.write(data)

    def get_value(self) -> str:
        return self.buffer.getvalue()

    def close(self):
        self.buffer.close()


class MinerUAPI(ls.LitAPI):
    def __init__(self, output_dir='/tmp'):
        self.output_dir = Path(output_dir)

    def setup(self, device):
        """Initialize the models; supports multiple GPUs."""
        if device.startswith('cuda'):
            gpu_id = device.split(':')[-1]
            os.environ['CUDA_VISIBLE_DEVICES'] = gpu_id

        # Initialize the model-related components
        from magic_pdf.model.doc_analyze_by_custom_model import ModelSingleton
        model_manager = ModelSingleton()
        model_manager.get_model(True, False)
        model_manager.get_model(False, False)
        logger.info(f'Model initialization complete on {device}!')

    def decode_request(self, request):
        """Parse the request parameters, consistent with the original interface."""
        # Parse the FormData object correctly
        file = request.get("file", "")
        params = request.get('kwargs', {})
        params_str = json.dumps(params, ensure_ascii=False, indent=4)

        logger.info(f"params:\n {params_str}")

        return file, {
            "file_name": params.get("file_name", ""),
            "parse_method": params.get("parse_method", "ocr"),
            "start_page_id": int(params.get("start_page_id", 0)),
            "end_page_id": int(params.get("end_page_id")) if params.get("end_page_id") else None,
            "is_json_md_dump": params["is_json_md_dump"].lower() == "false",
            "output_dir": params.get("output_dir", "output"),
            "return_layout": params["return_layout"].lower() == "true",
            "return_info": params["return_info"].lower() == "true",
            "return_content_list": params["return_content_list"].lower() == "true",
            "return_images": params["return_images"].lower() == "true"
        }


    def predict(self, inputs):
        """Core processing logic, consistent with the original logic."""
        try:
            file, params = inputs[0], inputs[1]
            # Process the file
            file_name = params['file_name']
            pdf_bytes, pdf_name = self.cvt2pdf(file, file_name)
            is_json_md_dump = params["is_json_md_dump"]
            return_layout = params["return_layout"]
            return_info = params["return_info"]
            return_content_list = params["return_content_list"]
            return_images = params["return_images"]

            # Process the PDF
            output_path = f"{params['output_dir']}/{pdf_name}"
            output_image_path = f"{output_path}/images"
            image_writer = FileBasedDataWriter(output_image_path)

            # Synchronous processing
            infer_result, pipe_result = self.process_pdf(
                pdf_bytes,
                params["parse_method"],
                image_writer,
                params["start_page_id"],
                params["end_page_id"]
            )

            # Build the response data
            return self._build_response(
                infer_result, pipe_result, output_path, output_image_path,
                pdf_name, is_json_md_dump, return_layout, return_info,
                return_content_list, return_images
            )

        except Exception as e:
            raise HTTPException(status_code=500, detail=str(e))
        finally:
            self.clean_memory()

    def encode_response(self, output):
        """Return the output unchanged, keeping the same response format as the original API."""
        return output

    def process_pdf(self, pdf_bytes, parse_method, image_writer, start_page_id, end_page_id):
        """Same processing logic as the original function."""
        ds = PymuDocDataset(pdf_bytes)
        if parse_method == "ocr":
            infer_result = ds.apply(doc_analyze, ocr=True, start_page_id=start_page_id, end_page_id=end_page_id)
            pipe_result = infer_result.pipe_ocr_mode(image_writer)
        elif parse_method == "txt":
            infer_result = ds.apply(doc_analyze, ocr=False, start_page_id=start_page_id, end_page_id=end_page_id)
            pipe_result = infer_result.pipe_txt_mode(image_writer)
        else:
            if ds.classify() == SupportedPdfParseMethod.OCR:
                infer_result = ds.apply(doc_analyze, ocr=True, start_page_id=start_page_id, end_page_id=end_page_id)
                pipe_result = infer_result.pipe_ocr_mode(image_writer)
            else:
                infer_result = ds.apply(doc_analyze, ocr=False, start_page_id=start_page_id, end_page_id=end_page_id)
                pipe_result = infer_result.pipe_txt_mode(image_writer)
        return infer_result, pipe_result

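    def cvt2pdf(self, file_base64, filename: str) -> Tuple[bytes, str]:
        """Decode the base64 upload and convert it to PDF bytes if needed."""
        # Create the temp dir before `try` so the cleanup in `finally` never sees an unbound name
        temp_dir = Path(tempfile.mkdtemp())
        try:
            if not filename:
                raise Exception('No file name provided')
            pdf_name = filename.split('.')[0] + '.pdf'
            temp_file = temp_dir.joinpath('tmpfile')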
            file_bytes = base64.b64decode(file_base64)
            file_ext = filetype.guess_extension(file_bytes)

            if file_ext in ['pdf', 'jpg', 'png', 'doc', 'docx', 'ppt', 'pptx']:
                if file_ext == 'pdf':
                    return file_bytes, pdf_name
                elif file_ext in ['jpg', 'png']:
                    with fitz.open(stream=file_bytes, filetype=file_ext) as f:
                        return f.convert_to_pdf(), pdf_name
                else:
                    temp_file.write_bytes(file_bytes)
                    convert_file_to_pdf(temp_file, temp_dir)
                    return temp_file.with_suffix('.pdf').read_bytes(), pdf_name
            else:
                raise Exception('Unsupported file format')
        except Exception as e:
            raise HTTPException(status_code=500, detail=str(e))
        finally:
            shutil.rmtree(temp_dir, ignore_errors=True)



    def _build_response(self, infer_result, pipe_result, output_path, output_image_path,
                        pdf_name, is_json_md_dump, return_layout, return_info,
                        return_content_list, return_images):
        """Build the response data structure."""
        # Initialize the in-memory writers
        content_list_writer = MemoryDataWriter()
        md_content_writer = MemoryDataWriter()
        middle_json_writer = MemoryDataWriter()

        # Write the results into memory
        pipe_result.dump_content_list(content_list_writer, "", "images")
        pipe_result.dump_md(md_content_writer, "", "images")
        pipe_result.dump_middle_json(middle_json_writer, "")

        # Build the response
        data = {
            "md_content": md_content_writer.get_value(),
            "layout": json.loads(content_list_writer.get_value()) if return_layout else None,
            "info": json.loads(middle_json_writer.get_value()) if return_info else None,
            "content_list": json.loads(content_list_writer.get_value()) if return_content_list else None,
            "images": self._get_images(output_image_path) if return_images else None
        }

        # Persist the results to disk
        if is_json_md_dump:
            self._save_results(output_path, pdf_name, content_list_writer,
                               md_content_writer, middle_json_writer, infer_result)

        # Release resources
        content_list_writer.close()
        md_content_writer.close()
        middle_json_writer.close()

        return data

    def _get_images(self, image_path):
        """Encode each extracted image as a base64 data URI."""
        return {
            os.path.basename(p): f"data:image/jpeg;base64,{b64encode(Path(p).read_bytes()).decode()}"
            for p in glob(f"{image_path}/*.jpg")
        }

    def _save_results(self, output_path, pdf_name, *writers):
        """Save the result files."""
        writer = FileBasedDataWriter(output_path)
        writer.write_string(f"{pdf_name}_content_list.json", writers[0].get_value())
        writer.write_string(f"{pdf_name}.md", writers[1].get_value())
        writer.write_string(f"{pdf_name}_middle.json", writers[2].get_value())
        writer.write_string(f"{pdf_name}_model.json", json.dumps(writers[3].get_infer_res(), indent=4))

    def clean_memory(self):
        """Free cached GPU memory and run garbage collection."""
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
            torch.cuda.ipc_collect()
        gc.collect()


if __name__ == '__main__':
    # Set the number of workers per GPU from the container's environment variables
    workers_per_device = int(os.environ.get('WORKERS_PER_DEVICE', '1'))
    server_timeout = int(os.environ.get('SERVER_TIMEOUT', False))  # timed from request arrival; returns 504 on timeout
    server = ls.LitServer(
        MinerUAPI(),
        accelerator='cuda',
        devices='auto',
        workers_per_device=workers_per_device,  # workers per GPU, read from WORKERS_PER_DEVICE
        timeout=server_timeout
    )
    server.run(port=8000)
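
For reference, a request body for the server above could be built roughly like this. It is only a sketch: `build_request` is a hypothetical helper, the default option values are my own choice, and it assumes the server is reached through LitServe's default `/predict` route with a JSON body. The one thing taken directly from `decode_request` is that the boolean flags must be sent as the strings "true"/"false", since the server calls `.lower()` on them.

```python
import base64
import json


def build_request(file_bytes: bytes, file_name: str, **options) -> dict:
    """Build the JSON body that the decode_request above reads (hypothetical helper)."""
    kwargs = {
        "file_name": file_name,
        "parse_method": "auto",          # anything other than "ocr"/"txt" lets the server classify
        # The server calls .lower() on these flags, so they must be strings, not bools
        "is_json_md_dump": "false",
        "return_layout": "false",
        "return_info": "false",
        "return_content_list": "true",
        "return_images": "false",
    }
    kwargs.update(options)
    return {
        # cvt2pdf base64-decodes the file on the server side
        "file": base64.b64encode(file_bytes).decode("ascii"),
        "kwargs": kwargs,
    }


payload = build_request(b"%PDF-1.4 demo", "demo.pdf", return_images="true")
print(json.dumps(payload["kwargs"], indent=2))

# Sending it needs the third-party `requests` package and a running server:
#   import requests
#   resp = requests.post("http://127.0.0.1:8000/predict", json=payload)
```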

Hi, this is the API I wrapped following your approach. Why am I hitting the same problem he did? I set 10 workers, and all of them started on cuda:0.
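
One possible cause is that the model code picks its own default GPU instead of the device LitServe hands to `setup(device)`. A common workaround is to restrict each worker process to a single GPU before anything touches CUDA. The sketch below assumes the device arrives as a string such as `cuda:3`; `cuda_index` and `pin_worker_to_device` are hypothetical helper names, not part of LitServe or magic_pdf.

```python
import os
from typing import Optional


def cuda_index(device: str) -> Optional[str]:
    """Return the GPU index from a device string like 'cuda:3', or None for non-CUDA devices."""
    if not device.startswith("cuda"):
        return None
    _, _, index = device.partition(":")
    return index or "0"   # a bare 'cuda' is treated as GPU 0


def pin_worker_to_device(device: str) -> None:
    """Restrict this process to one GPU via CUDA_VISIBLE_DEVICES.

    This only takes effect if it runs before CUDA is initialized in the process.
    """
    index = cuda_index(device)
    if index is not None:
        os.environ["CUDA_VISIBLE_DEVICES"] = index


# Intended use (assumption): first statement of setup(), before loading any model
#   def setup(self, device):
#       pin_worker_to_device(device)
#       ...load models...
pin_worker_to_device("cuda:2")
print(os.environ["CUDA_VISIBLE_DEVICES"])
```

After pinning, the chosen GPU is the only one the process sees, so inside that worker it shows up as `cuda:0`.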

ywh-my · 2025-04-06T14:47:15Z

This part of the code:

    with patch('magic_pdf.model.doc_analyze_by_custom_model.get_device') as mock_obj:
        mock_obj.return_value = device
        model_manager = ModelSingleton()
        model_manager.get_model(True, False)
        model_manager.get_model(False, False)
        mock_obj.assert_called()
        print(f'Model initialization complete!')

should not be needed, because do_parse already initializes the model internally. This block just instantiates the model one more time.
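
Whether that block really loads the model twice depends on how `ModelSingleton` caches. As a minimal sketch, assuming it behaves like a singleton keyed by the `get_model` arguments (an assumption I have not checked against the magic_pdf source), a warm-up call and a later call with the same arguments would share one load. `CachedModels` is a hypothetical stand-in, not the real class.

```python
class CachedModels:
    """Hypothetical stand-in for a model singleton keyed by get_model arguments."""
    _instance = None

    def __new__(cls):
        # Every CachedModels() call returns the same shared instance
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._models = {}
            cls._instance.load_count = 0
        return cls._instance

    def get_model(self, ocr: bool, show_log: bool):
        key = (ocr, show_log)
        if key not in self._models:
            self.load_count += 1           # stands in for an expensive model load
            self._models[key] = object()
        return self._models[key]


warm = CachedModels().get_model(True, False)    # warm-up call, e.g. in setup()
later = CachedModels().get_model(True, False)   # later call, e.g. inside do_parse
print(warm is later, CachedModels().load_count)  # True 1
```

Under that assumption the block in `setup` acts as a warm-up rather than a second instantiation. If the real class does not cache this way, the concern above stands.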

randydl added the enhancement New feature or request label Sep 27, 2024

myhloli pinned this issue Sep 27, 2024

This was referenced Oct 1, 2024

Does MinerU support self deployment on remote GPU servers? #678

Closed

关于多显卡 About Multi-GPUs #683

Closed

randydl changed the title ~~给大家提供一个多GPU并行处理的API调用方案，基于 LitServe (FastAPI)~~ 一个服务化的可多GPU并行处理的实现方案（基于LitServe） Oct 8, 2024

randydl changed the title ~~一个服务化的可多GPU并行处理的实现方案（基于LitServe）~~ 一个服务化的可多GPU并行处理的方案（基于LitServe） Oct 8, 2024

huyiwen mentioned this issue Dec 7, 2024

单卡多进程执行 #1223

Closed
