rapidocr_api queue implementation code #338
Comments
Good question and observation. I'll test it when I have time.
If using ...

Server

    from flask import Flask, request, jsonify
    import base64
    from queue import Queue
    from rapidocr_onnxruntime import RapidOCR
    import logging
    import os

    # Configure the log format
    logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s - %(levelname)s - %(message)s'
    )

    app = Flask(__name__)


    # Initialize the OCR engine pool
    def init_engine_pool(pool_size):
        engine_queue = Queue(maxsize=pool_size)
        for _ in range(pool_size):
            engine = RapidOCR()
            engine_queue.put(engine)
        logging.info(f"Initialized OCR engine pool with size: {pool_size}")
        return engine_queue


    # Read the configuration from environment variables
    POOL_SIZE = int(os.environ.get('OCR_ENGINE_POOL_SIZE', 1))  # default: 1 instance
    engine_pool = init_engine_pool(POOL_SIZE)


    @app.route('/ocr', methods=['POST'])
    def ocr_service():
        # Validate parameters; silent=True returns None for a missing or invalid JSON body
        data = request.get_json(silent=True)
        if not isinstance(data, dict) or 'image' not in data:
            return jsonify({"error": "Missing image parameter"}), 400

        # Base64 decoding (drops an optional "data:...;base64," prefix)
        try:
            img_b64 = data['image'].split(',')[-1]
            img_bytes = base64.b64decode(img_b64)
        except Exception as e:
            logging.error(f"Base64 decoding failed: {str(e)}")
            return jsonify({"error": f"Invalid image data: {str(e)}"}), 400

        # Borrow an OCR engine; blocks until one is free
        engine = engine_pool.get()
        try:
            # Process the binary data directly
            result, elapse = engine(img_bytes)
        except Exception as e:
            logging.error(f"OCR processing failed: {str(e)}")
            return jsonify({"error": f"OCR processing failed: {str(e)}"}), 500
        finally:
            # Always return the engine to the pool
            engine_pool.put(engine)

        # Format the result (adjust to the actual data structure)
        formatted = []
        for item in result or []:  # result may be None when no text is detected
            formatted.append({
                "coordinates": item[0],  # keep the original coordinate structure
                "text": item[1],
                "confidence": float(item[2])
            })

        return jsonify({
            "result": formatted,
            "processing_time": elapse,
            "engine_count": POOL_SIZE
        })


    if __name__ == '__main__':
        app.run(host='0.0.0.0', port=9003, threaded=True)

Running the server (starting 4 instances)

    OCR_ENGINE_POOL_SIZE=4 gunicorn -w 4 -b 0.0.0.0:9003 ocr_server:app
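The borrow-and-return pattern in the server code can be exercised without Flask or RapidOCR. The sketch below is only an illustration: `StubEngine`, `make_pool`, `borrow`, and `run_concurrently` are hypothetical names that do not appear in the comment above, and the stub just sleeps instead of running OCR. It shows that a `queue.Queue` of engines caps how many engines are in use at once, no matter how many request threads arrive.

```python
import threading
import time
from contextlib import contextmanager
from queue import Queue


class StubEngine:
    """Hypothetical stand-in for an OCR engine; it only sleeps briefly."""

    def __call__(self, data):
        time.sleep(0.01)
        return len(data)


def make_pool(size, factory=StubEngine):
    """Create a queue pre-filled with `size` engine instances."""
    pool = Queue(maxsize=size)
    for _ in range(size):
        pool.put(factory())
    return pool


@contextmanager
def borrow(pool):
    """Take an engine from the pool and always hand it back."""
    engine = pool.get()  # blocks while every engine is busy
    try:
        yield engine
    finally:
        pool.put(engine)


def run_concurrently(pool, n_requests):
    """Run n_requests threads; return the peak number of engines in use."""
    lock = threading.Lock()
    in_use = 0
    peak = 0

    def handle():
        nonlocal in_use, peak
        with borrow(pool) as engine:
            with lock:
                in_use += 1
                peak = max(peak, in_use)
            engine(b"fake image bytes")
            with lock:
                in_use -= 1

    threads = [threading.Thread(target=handle) for _ in range(n_requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak
```

One thing to keep in mind, as far as I understand gunicorn's process model: each worker process builds its own module-level pool, so `-w 4` together with `OCR_ENGINE_POOL_SIZE=4` would create 16 engines in total, and a default sync worker handles one request at a time, so it would only ever borrow one of its four.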
Problem Description
It was mentioned earlier that the worker parameter of the current rapidocr_api has no actual effect.
Runtime Environment
Code
Server
Running the server (starting 5 instances)
Client
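The client code itself is not included here. As a rough illustration only, and not the author's client, the sketch below posts a Base64-encoded image as JSON. It assumes the `/ocr` endpoint, the `image` field, and port 9003 used by the Flask server in the comment above; `build_payload` and `ocr_request` are hypothetical helper names.

```python
import base64
import json
import urllib.request


def build_payload(image_bytes):
    """Encode raw image bytes as the JSON body {"image": "<base64>"}."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image": encoded}).encode("utf-8")


def ocr_request(image_bytes, url="http://127.0.0.1:9003/ocr", timeout=30):
    """POST the image to the (assumed) /ocr endpoint and return the parsed JSON."""
    req = urllib.request.Request(
        url,
        data=build_payload(image_bytes),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a server running, `ocr_request(open("test.png", "rb").read())` would return the parsed response; `test.png` is a placeholder file name.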
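Notes
The api results and the runtime results are now unified; the api can now also return the processing time and the confidence. The server's three processing_time values can be inspected to tune performance.
openvino.runtime in utils/infer_engine.py.
RapidOCR() takes about twice as long as RapidOCR(intra_op_num_threads=1): the instances drag each other down perfectly. (I don't know what inter_op_num_threads is for; in my tests its effect on CPU usage was within the margin of error. Does it perhaps only matter on dual-CPU machines?)
rapidocr_api (OpenVINO version): with 5 instances running, each used only one core, or rather only half of the cores were used. The test CPU has 2 cores and 4 threads; in htop, of logical cores 0, 1, 2, and 3, only cores 2 and 3 reach 100% usage, while cores 0 and 1 stay at 10-15%.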
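One plausible reading of the slowdown above is oversubscription: if every instance starts as many compute threads as there are logical CPUs, several instances compete for the same cores. A minimal sketch of a per-instance thread budget follows; `threads_per_instance` is a hypothetical helper, and the even-split rule is an assumption, not something measured in this issue.

```python
import os


def threads_per_instance(instances, logical_cpus=None):
    """Split the logical CPUs evenly across instances, with at least 1 thread each."""
    if instances < 1:
        raise ValueError("instances must be at least 1")
    if logical_cpus is None:
        logical_cpus = os.cpu_count() or 1
    return max(1, logical_cpus // instances)


# Example for the 2-core / 4-thread test machine with 5 instances:
# the budget is 1 thread each, which matches the
# RapidOCR(intra_op_num_threads=1) setting reported as faster above.
budget = threads_per_instance(5, logical_cpus=4)
```

The result could then be passed as the `intra_op_num_threads` keyword in the form written in the issue; whether that keyword is honoured depends on the installed rapidocr version and backend.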