FatalError: Segmentation fault is detected by the operating system #16402

@DGDarkKing

🔎 Search before asking

  • I have searched the PaddleOCR Docs and found no similar bug report.
  • I have searched the PaddleOCR Issues and found no similar bug report.
  • I have searched the PaddleOCR Discussions and found no similar bug report.

🐛 Bug (问题描述)

I installed the CPU build of paddlepaddle, plus paddleocr, in a Docker container based on the ultralytics/ultralytics:latest-jetson-jetpack6 image on a Jetson Orin NX 16 GB, using the following commands:

pip install paddlepaddle==3.1.1 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/
pip install paddleocr

Full error log:

/usr/local/lib/python3.10/dist-packages/paddle/utils/cpp_extension/extension_utils.py:717: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)

Creating model: ('PP-OCRv5_server_det', None)
Model files already exist. Using cached files. To redownload, please delete the directory manually: `/root/.paddlex/official_models/PP-OCRv5_server_det`.
Creating model: ('en_PP-OCRv5_mobile_rec', None)
Model files already exist. Using cached files. To redownload, please delete the directory manually: `/root/.paddlex/official_models/en_PP-OCRv5_mobile_rec`.


--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
No stack trace in paddle, may be caused by external reasons.

----------------------
Error Message Summary:
----------------------
FatalError: `Segmentation fault` is detected by the operating system.
  [TimeInfo: *** Aborted at 1757103493 (unix time) try "date -d @1757103493" if you are using GNU date ***]
  [SignalInfo: *** SIGSEGV (@0x0) received by PID 1701 (TID 0xfffebcf6f120) from PID 0 ***]

Segmentation fault (core dumped)
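
Since the log above reports "No stack trace in paddle", one way to get more detail might be Python's standard faulthandler module, which dumps the Python-level traceback of every thread when the process receives a fatal signal such as SIGSEGV. The following is a minimal sketch and was not part of the original run; the helper name enable_fault_log and the log path are hypothetical choices.

```python
import faulthandler
import os
import tempfile


def enable_fault_log(path):
    """Write Python tracebacks of all threads to `path` on a fatal signal."""
    # The file must stay open for the lifetime of the process, because
    # faulthandler writes to its file descriptor from the signal handler.
    log_file = open(path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


# Hypothetical log location (an assumption); any writable path would do.
FAULT_LOG_PATH = os.path.join(tempfile.gettempdir(), "paddleocr_fault.log")
fault_log = enable_fault_log(FAULT_LOG_PATH)
```

This would need to run before the PaddleOCR object is created. Running the script with `python -X faulthandler` should have a similar effect, writing to stderr instead of a file. In both cases only Python frames are shown, not native C++ frames, so it would narrow down which Python call triggers the crash rather than explain it.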

I wanted to test the performance on Jetson.
I have already seen similar issues reported here.
Result of the installation check:

>>> paddle.utils.run_check()
Running verify PaddlePaddle program ...
I0905 19:36:05.268429  1129 pir_interpreter.cc:1524] New Executor is Running ...
I0905 19:36:05.269327  1129 pir_interpreter.cc:1547] pir interpreter is running by multi-thread mode ...
PaddlePaddle works well on 1 CPU.
PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.
>>> paddle.device.cuda.device_count()
0

🏃‍♂️ Environment (运行环境)

OS: Jetpack 6.4.3
Environment: docker image ultralytics/ultralytics:latest-jetson-jetpack6
Python: 3.10.12
PaddleOCR: 3.2.0
Install: pip
RAM: 16 GB
CUDA: 12.6.68
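
The values above were collected by hand. A small script along these lines could report the same details from inside the container, including the CPU architecture, which matters when choosing wheels on Jetson. It is only a sketch: the function name collect_environment and the package list are assumptions, and it uses the standard library only.

```python
import platform
from importlib import metadata


def collect_environment(
    packages=("paddlepaddle", "paddleocr", "paddlex", "numpy", "opencv-python"),
):
    """Return interpreter, platform, and installed package versions as a dict."""
    info = {
        "python": platform.python_version(),
        "machine": platform.machine(),  # CPU architecture reported by the OS
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of failing.
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```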

🌰 Minimal Reproducible Example (最小可复现问题的Demo)

Code for the performance test.
The crash occurs on the predict line.

import logging
import os
import time
from typing import Tuple, Callable, Any

import cv2
import numpy as np
import paddle
from paddleocr import PaddleOCR

# --- LOGGING SETUP AFTER IMPORTS ---
LOG_FORMAT = "%(asctime)s - [%(levelname)s] - *%(name)s* - (%(filename)s).%(funcName)s(%(lineno)d) - %(message)s"

# 1. Remove all existing handlers from the root logger
for handler in logging.root.handlers[:]:
    logging.root.removeHandler(handler)

# 2. Create a new stream handler (StreamHandler() writes to stderr by default)
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter(LOG_FORMAT))

# 3. Configure the root logger
logging.root.setLevel(logging.DEBUG)
logging.root.addHandler(handler)

# 4. Silence loggers that are not needed
removed_loggers = ("pika", "botocore", "matplotlib", "paddle", "ppocr", "ppdet")
for rm_logger in removed_loggers:
    logger_to_remove = logging.getLogger(rm_logger)
    logger_to_remove.setLevel(logging.WARNING)
    # Remove their handlers, if any
    logger_to_remove.handlers.clear()
    # Disable propagation to parent loggers
    logger_to_remove.propagate = False

logger = logging.getLogger(__name__)
logger.info("Logger init")


def performance_log(
        text: str,
        log_func: Callable[[str], Any] = print,
        number_of_avg: int = 10,
        precision: int = 4,
):
    assert number_of_avg > 0

    def func_wrapper(func: Callable):
        counter = 0
        time_accumulator = 0

        def call(*args, **kwargs):
            start = time.time()
            result = func(*args, **kwargs)
            finish = time.time()

            nonlocal counter, time_accumulator
            counter += 1
            time_accumulator += finish - start

            if counter >= number_of_avg:
                avg_time = time_accumulator / counter
                log_func(f"{text}: Performance of {func.__name__} = {avg_time:.{precision}f} (n={number_of_avg})")
                counter = 0
                time_accumulator = 0

            return result

        return call

    return func_wrapper


class PaddleOcr:
    def __init__(
            self,
    ):
        device_name = "gpu" if paddle.device.is_compiled_with_cuda() else "cpu"
        # To speed up the build process, the models and processor can be saved locally.

        self.__model = PaddleOCR(
            use_doc_orientation_classify=False,  # Disables document orientation classification model via this parameter
            use_doc_unwarping=False,  # Disables text image rectification model via this parameter
            use_textline_orientation=False,  # Disables text line orientation classification model via this parameter
            lang="en",
            device=device_name
        )

        logger.info(
            "OCR: \"%s\"",
            device_name,
        )

    def __recognize_text(self, image) -> Tuple:
        result = self.__model.predict(input=image)
        rec_texts = result[0]['rec_texts']
        rec_boxes = result[0]['rec_boxes']
        return rec_texts, rec_boxes

    @performance_log('Ocr', logger.info, )
    def recognize(self, image: np.ndarray) -> Tuple:
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        rec_texts, rec_boxes = self.__recognize_text(image)
        return rec_texts, rec_boxes


def main():
    IMAGE_PATH = r"/app/ocr_images"
    repeat_number = 10

    files = [os.path.join(IMAGE_PATH, f) for f in os.listdir(IMAGE_PATH) if os.path.isfile(os.path.join(IMAGE_PATH, f))]
    if not files:
        logger.error(f"Files not found in directory \"{IMAGE_PATH}\"")
        return

    ocr = PaddleOcr()
    for _ in range(repeat_number):
        for f in files:
            ocr.recognize(cv2.imread(f))


if __name__ == '__main__':
    main()
