Description
🔎 Search before asking
- I have searched the PaddleOCR Docs and found no similar bug report.
- I have searched the PaddleOCR Issues and found no similar bug report.
- I have searched the PaddleOCR Discussions and found no similar bug report.
🐛 Bug (问题描述)
I installed the CPU builds of paddlepaddle and paddleocr in a Docker container based on the ultralytics/ultralytics:latest-jetson-jetpack6 image, running on a Jetson Orin NX 16 GB, with the following commands:
pip install paddlepaddle==3.1.1 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/
pip install paddleocr
Full error log:
/usr/local/lib/python3.10/dist-packages/paddle/utils/cpp_extension/extension_utils.py:717: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
warnings.warn(warning_message)
Creating model: ('PP-OCRv5_server_det', None)
Model files already exist. Using cached files. To redownload, please delete the directory manually: `/root/.paddlex/official_models/PP-OCRv5_server_det`.
Creating model: ('en_PP-OCRv5_mobile_rec', None)
Model files already exist. Using cached files. To redownload, please delete the directory manually: `/root/.paddlex/official_models/en_PP-OCRv5_mobile_rec`.
--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
No stack trace in paddle, may be caused by external reasons.
----------------------
Error Message Summary:
----------------------
FatalError: `Segmentation fault` is detected by the operating system.
[TimeInfo: *** Aborted at 1757103493 (unix time) try "date -d @1757103493" if you are using GNU date ***]
[SignalInfo: *** SIGSEGV (@0x0) received by PID 1701 (TID 0xfffebcf6f120) from PID 0 ***]
Segmentation fault (core dumped)
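Because the C++ traceback above is empty, one way to find out which Python call was active when the crash happened is the standard library's faulthandler module. The following is only a minimal sketch: the helper name enable_crash_traceback and the log file name are hypothetical, and it assumes the helper is called at the very top of the script, before paddle is imported.

```python
import faulthandler


def enable_crash_traceback(log_path: str = "crash_traceback.log"):
    """Dump Python tracebacks of all threads to log_path on a fatal signal.

    Hypothetical helper, not part of paddle or paddleocr.
    """
    # The file object must stay open for as long as faulthandler is enabled,
    # so it is returned to the caller instead of being closed here.
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

After a crash, the log file should contain the Python stack of each thread at the moment SIGSEGV was received, which may narrow the fault down to a specific model or call.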
I wanted to test OCR performance on the Jetson.
I have already seen similar issues reported here.
Result of the installation check:
>>> paddle.utils.run_check()
Running verify PaddlePaddle program ...
I0905 19:36:05.268429 1129 pir_interpreter.cc:1524] New Executor is Running ...
I0905 19:36:05.269327 1129 pir_interpreter.cc:1547] pir interpreter is running by multi-thread mode ...
PaddlePaddle works well on 1 CPU.
PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.
>>> paddle.device.cuda.device_count()
0
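The check above reports zero CUDA devices, which is expected for the CPU wheel. The test script further down picks its device from paddle.device.is_compiled_with_cuda() alone; a slightly more defensive choice would also require at least one visible device. A minimal sketch of that logic, where pick_device is a hypothetical helper and the two arguments are assumed to come from paddle.device.is_compiled_with_cuda() and paddle.device.cuda.device_count():

```python
def pick_device(compiled_with_cuda: bool, visible_gpus: int) -> str:
    """Return "gpu" only when the build supports CUDA and a device is visible.

    Hypothetical helper; the caller supplies both values from paddle.
    """
    if compiled_with_cuda and visible_gpus > 0:
        return "gpu"
    return "cpu"
```

With the values from this report (a CPU build and a device count of 0), the helper returns "cpu", so the crash would still be reproduced on the CPU path.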
🏃♂️ Environment (运行环境)
OS: Jetpack 6.4.3
Environment: docker image ultralytics/ultralytics:latest-jetson-jetpack6
Python: 3.10.12
PaddleOCR: 3.2.0
Install: pip
RAM: 16 GB
CUDA: 12.6.68
🌰 Minimal Reproducible Example (最小可复现问题的Demo)
Performance test script.
The crash occurs on the predict call.
import logging
import os
import time
from typing import Tuple, Callable, Any

import cv2
import numpy as np
import paddle
from paddleocr import PaddleOCR

# --- LOGGING SETUP AFTER IMPORTS ---
LOG_FORMAT = "%(asctime)s - [%(levelname)s] - *%(name)s* - (%(filename)s).%(funcName)s(%(lineno)d) - %(message)s"

# 1. Remove all existing handlers from the root logger
for handler in logging.root.handlers[:]:
    logging.root.removeHandler(handler)

# 2. Create a new stream handler (StreamHandler writes to stderr by default)
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter(LOG_FORMAT))

# 3. Configure the root logger
logging.root.setLevel(logging.DEBUG)
logging.root.addHandler(handler)

# 4. Silence loggers that are not needed
removed_loggers = ("pika", "botocore", "matplotlib", "paddle", "ppocr", "ppdet")
for rm_logger in removed_loggers:
    logger_to_remove = logging.getLogger(rm_logger)
    logger_to_remove.setLevel(logging.WARNING)
    # Remove their handlers, if any
    logger_to_remove.handlers.clear()
    # Stop propagation to parent loggers
    logger_to_remove.propagate = False

logger = logging.getLogger(__name__)
logger.info("Logger init")


def performance_log(
    text: str,
    log_func: Callable[[str], Any] = print,
    number_of_avg: int = 10,
    precision: int = 4,
):
    # Decorator factory: logs the average run time every number_of_avg calls.
    assert number_of_avg > 0

    def func_wrapper(func: Callable):
        counter = 0
        time_accumulator = 0

        def call(*args, **kwargs):
            start = time.time()
            result = func(*args, **kwargs)
            finish = time.time()
            nonlocal counter, time_accumulator
            counter += 1
            time_accumulator += finish - start
            if counter >= number_of_avg:
                avg_time = time_accumulator / counter
                log_func(f"{text}: Performance of {func.__name__} = {avg_time:.{precision}f} (n={number_of_avg})")
                # Reset the window after reporting
                counter = 0
                time_accumulator = 0
            return result

        return call

    return func_wrapper


class PaddleOcr:
    def __init__(
        self,
    ):
        device_name = "gpu" if paddle.device.is_compiled_with_cuda() else "cpu"
        # To speed up the build process, the models and processor can be saved locally.
        self.__model = PaddleOCR(
            use_doc_orientation_classify=False,  # Disables document orientation classification model via this parameter
            use_doc_unwarping=False,  # Disables text image rectification model via this parameter
            use_textline_orientation=False,  # Disables text line orientation classification model via this parameter
            lang="en",
            device=device_name,
        )
        logger.info(
            "OCR: \"%s\"",
            device_name,
        )

    def __recognize_text(self, image) -> Tuple:
        # The segmentation fault is raised inside this predict call.
        result = self.__model.predict(input=image)
        rec_texts = result[0]['rec_texts']
        rec_boxes = result[0]['rec_boxes']
        return rec_texts, rec_boxes

    @performance_log('Ocr', logger.info, )
    def recognize(self, image: np.ndarray) -> Tuple:
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        rec_texts, rec_boxes = self.__recognize_text(image)
        return rec_texts, rec_boxes


def main():
    IMAGE_PATH = r"/app/ocr_images"
    repeat_number = 10
    files = [os.path.join(IMAGE_PATH, f) for f in os.listdir(IMAGE_PATH) if os.path.isfile(os.path.join(IMAGE_PATH, f))]
    if not files:
        logger.error(f"Files not found in directory \"{IMAGE_PATH}\"")
        # Nothing to process, so stop before loading the models
        return
    ocr = PaddleOcr()
    for _ in range(repeat_number):
        for f in files:
            ocr.recognize(cv2.imread(f))


if __name__ == '__main__':
    main()
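
Since a segmentation fault kills the whole test run, it may help to execute each stage (import, model creation, predict) in a separate child process and see which one dies. The following is a rough sketch using only the standard library; run_isolated is a hypothetical helper, and the interpretation of negative return codes as signals assumes a POSIX system such as the Jetson's Linux.

```python
import signal
import subprocess
import sys


def run_isolated(snippet: str, timeout: float = 600.0) -> str:
    """Run a Python snippet in a child process and describe how it ended.

    Hypothetical helper for narrowing down which stage crashes.
    """
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child was killed
        # by the signal with that number.
        return f"killed by {signal.Signals(-proc.returncode).name}"
    return f"exited with code {proc.returncode}"
```

Passing first the import lines alone, then the PaddleOCR(...) constructor call, then a single predict call would show at which stage the child process receives SIGSEGV, without taking the parent script down.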