Skip to content

Commit

Permalink
Develop (PaddlePaddle#1379)
Browse files Browse the repository at this point in the history
* cp dev -> 2.3
  • Loading branch information
littletomatodonkey authored Nov 2, 2021
1 parent 796aa90 commit dfb26b6
Show file tree
Hide file tree
Showing 40 changed files with 812 additions and 302 deletions.
32 changes: 11 additions & 21 deletions README_ch.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,32 +7,26 @@
飞桨图像识别套件PaddleClas是飞桨为工业界和学术界所准备的一个图像识别任务的工具集,助力使用者训练出更好的视觉模型和应用落地。

**近期更新**
- 2021.10.31 发布[PP-ShiTu技术报告](./docs/PP_ShiTu.pdf),优化文档,新增饮料识别demo
- 2021.10.23 发布PP-ShiTu图像识别系统,cpu上200ms即可完成在10w+库的图像识别。

- 2021.11.1 发布[PP-ShiTu技术报告](https://arxiv.org/pdf/2111.00775.pdf),新增饮料识别demo
- 2021.10.23 发布轻量级图像识别系统PP-ShiTu,CPU上0.2s即可完成在10w+库的图像识别。
[点击这里](./docs/zh_CN/quick_start/quick_start_recognition.md)立即体验
- 2021.09.17 增加PaddleClas自研PP-LCNet系列模型, 这些模型在Intel CPU上有较强的竞争力。PP-LCNet的介绍可以参考[论文](https://arxiv.org/pdf/2109.15099.pdf), 或者[PP-LCNet模型介绍](docs/zh_CN/models/PP-LCNet.md),相关指标和预训练权重可以从 [这里](docs/zh_CN/ImageNet_models_cn.md)下载。
- 2021.09.17 发布PP-LCNet系列超轻量骨干网络模型, 在Intel CPU上,单张图像预测速度约5ms,ImageNet-1K数据集上Top1识别准确率达到80.82%,超越ResNet152的模型效果。PP-LCNet的介绍可以参考[论文](https://arxiv.org/pdf/2109.15099.pdf), 或者[PP-LCNet模型介绍](docs/zh_CN/models/PP-LCNet.md),相关指标和预训练权重可以从 [这里](docs/zh_CN/algorithm_introduction/ImageNet_models.md)下载。
- [more](./docs/zh_CN/others/update_history.md)

## 特性

- PP-ShiTu轻量图像识别系统:集成了目标检测、特征学习、图像检索等模块,广泛适用于各类图像识别任务。
cpu上200ms即可完成在10w+库的图像识别。
详细介绍见[PP-ShiTu: A Practical Lightweight Image Recognition System](./docs/PP_ShiTu.pdf)
- PP-ShiTu轻量图像识别系统:集成了目标检测、特征学习、图像检索等模块,广泛适用于各类图像识别任务。cpu上0.2s即可完成在10w+库的图像识别。

- PP-LCNet轻量级CPU骨干网络:专门为CPU设备打造轻量级骨干网络,速度、精度均超越竞品。
详细介绍见[PP-LCNet: A Lightweight CPU Convolutional Neural Network](https://arxiv.org/pdf/2109.15099.pdf),
或者[PP-LCNet模型介绍](docs/zh_CN/models/PP-LCNet.md)
- PP-LCNet轻量级CPU骨干网络:专门为CPU设备打造轻量级骨干网络,速度、精度均远超竞品。

- 丰富的预训练模型库:提供了35个系列共164个ImageNet预训练模型,其中6个精选系列模型支持结构快速修改
- 丰富的预训练模型库:提供了36个系列共175个ImageNet预训练模型,其中7个精选系列模型支持结构快速修改

- 全面易用的特征学习组件:集成arcmargin, triplet loss等12度量学习方法,通过配置文件即可随意组合切换。

- SSLD知识蒸馏:14个分类预训练模型,精度普遍提升3%以上;其中ResNet50_vd模型在ImageNet-1k数据集上的Top-1精度达到了84.0%,
Res2Net200_vd预训练模型Top-1精度高达85.1%。

- 数据增广:支持AutoAugment、Cutout、Cutmix等8种数据增广算法详细介绍、代码复现和在统一实验环境下的效果评估。


<div align="center">
<img src="./docs/images/recognition.gif" width = "400" />
</div>
Expand All @@ -47,6 +41,7 @@ Res2Net200_vd预训练模型Top-1精度高达85.1%。
</div>

## 快速体验

PP-ShiTu图像识别快速体验:[点击这里](./docs/zh_CN/quick_start/quick_start_recognition.md)

## 文档教程
Expand All @@ -59,9 +54,7 @@ PP-ShiTu图像识别快速体验:[点击这里](./docs/zh_CN/quick_start/quick
- [尝鲜版](./docs/zh_CN/quick_start/quick_start_classification_new_user.md)
- [进阶版](./docs/zh_CN/quick_start/quick_start_classification_professional.md)
- [PP-ShiTu图像识别系统介绍](#图像识别系统介绍)
- [主体检测](./docs/zh_CN/algorithm_introduction/mainbody_detection.md)
- [特征学习](./docs/zh_CN/algorithm_introduction/metric_learning.md)
- [向量检索](./deploy/vector_search/README.md)
- [骨干网络和预训练模型库](./docs/zh_CN/algorithm_introduction/ImageNet_models.md)
- 数据准备
- [图像分类数据集介绍](./docs/zh_CN/data_preparation/classification_dataset.md)
- [图像识别数据集介绍](./docs/zh_CN/data_preparation/recognition_dataset.md)
Expand All @@ -83,7 +76,6 @@ PP-ShiTu图像识别快速体验:[点击这里](./docs/zh_CN/quick_start/quick
- 算法介绍
- [图像分类任务介绍](./docs/zh_CN/algorithm_introduction/image_classification.md)
- [度量学习介绍](./docs/zh_CN/algorithm_introduction/metric_learning.md)
- [骨干网络和预训练模型库](./docs/zh_CN/algorithm_introduction/ImageNet_models.md)
- 高阶使用
- [数据增广](./docs/zh_CN/advanced_tutorials/DataAugmentation.md)
- [模型量化](./docs/zh_CN/advanced_tutorials/model_prune_quantization.md)
Expand All @@ -92,7 +84,7 @@ PP-ShiTu图像识别快速体验:[点击这里](./docs/zh_CN/quick_start/quick
- [社区贡献指南](./docs/zh_CN/advanced_tutorials/how_to_contribute.md)
- FAQ
- [图像识别精选问题](docs/zh_CN/faq_series/faq_2021_s2.md)
- [图像分类精选问题](docs/zh_CN/faq_series/faq.md)
- [图像分类精选问题](docs/zh_CN/faq_series/faq_selected_30.md)
- [图像分类FAQ第一季](docs/zh_CN/faq_series/faq_2020_s1.md)
- [图像分类FAQ第二季](docs/zh_CN/faq_series/faq_2021_s1.md)
- [许可证书](#许可证书)
Expand All @@ -105,9 +97,8 @@ PP-ShiTu图像识别快速体验:[点击这里](./docs/zh_CN/quick_start/quick
<img src="./docs/images/structure.jpg" width = "800" />
</div>

PP-ShiTu图像识别系统分为三步:(1)通过一个目标检测模型,检测图像物体候选区域(2)对每个候选区域进行特征提取(3)与检索库中图像进行特征匹配,提取识别结果
PP-ShiTu是一个实用的轻量级通用图像识别系统,主要由主体检测、特征学习和向量检索三个模块组成。该系统从骨干网络选择和调整、损失函数的选择、数据增强、学习率变换策略、正则化参数选择、预训练模型使用以及模型裁剪量化8个方面,采用多种策略,对各个模块的模型进行优化,最终得到在CPU上仅0.2s即可完成10w+库的图像识别的系统。更多细节请参考[PP-ShiTu技术方案](https://arxiv.org/pdf/2111.00775.pdf)

对于新的未知类别,无需重新训练模型,只需要在检索库补入该类别图像,重新建立检索库,就可以识别该类别。

<a name="识别效果展示"></a>
## PP-ShiTu图像识别系统效果展示
Expand Down Expand Up @@ -152,4 +143,3 @@ PP-ShiTu图像识别系统分为三步:(1)通过一个目标检测模型
- 非常感谢[nblib](https://github.com/nblib)修正了PaddleClas中RandErasing的数据增广配置文件。
- 非常感谢[chenpy228](https://github.com/chenpy228)修正了PaddleClas文档中的部分错别字。
- 非常感谢[jm12138](https://github.com/jm12138)为PaddleClas添加ViT,DeiT系列模型和RepVGG系列模型。
- 非常感谢[FutureSI](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/76563)对PaddleClas代码的解析与总结。
3 changes: 2 additions & 1 deletion README_en.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,8 @@ PaddleClas is an image recognition toolset for industry and academia, helping us

**Recent updates**

- 2021.09.17 Add PP-LCNet series model developed by PaddleClas, these models show strong competitiveness on Intel CPUs. The metrics and pretrained model are available [here](docs/en/ImageNet_models_en.md).
- 2021.09.17 Add PP-LCNet series model developed by PaddleClas, these models show strong competitiveness on Intel CPUs.
For the introduction of PP-LCNet, please refer to [paper](https://arxiv.org/pdf/2109.15099.pdf) or [PP-LCNet model introduction](docs/en/models/PP-LCNet_en.md). The metrics and pretrained model are available [here](docs/en/ImageNet_models_en.md).

- 2021.06.29 Add Swin-transformer series model,Highest top1 acc on ImageNet1k dataset reaches 87.2%, training, evaluation and inference are all supported. Pretrained models can be downloaded [here](docs/en/models/models_intro_en.md).
- 2021.06.16 PaddleClas release/2.2. Add metric learning and vector search modules. Add product recognition, animation character recognition, vehicle recognition and logo recognition. Added 30 pretrained models of LeViT, Twins, TNT, DLA, HarDNet, and RedNet, and the accuracy is roughly the same as that of the paper.
Expand Down
27 changes: 27 additions & 0 deletions benchmark/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
# benchmark使用说明

此目录所有shell脚本是为了测试PaddleClas中不同模型的速度指标,如单卡训练速度指标、多卡训练速度指标等。

## 相关脚本说明

一共有3个脚本:

- `prepare_data.sh`: 下载相应的测试数据,并配置好数据路径
- `run_benchmark.sh`: 执行单独一个训练测试的脚本,具体调用方式,可查看脚本注释
- `run_all.sh`: 执行所有训练测试的入口脚本

## 使用说明

**注意**:为了跟PaddleClas中其他的模块的执行目录保持一致,此模块的执行目录为`PaddleClas`的根目录。

### 1.准备数据

```shell
bash benchmark/prepare_data.sh
```

### 2.执行所有模型的测试

```shell
bash benchmark/run_all.sh
```
11 changes: 11 additions & 0 deletions benchmark/prepare_data.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
#!/bin/bash
dataset_url=$1

cd dataset
rm -rf ILSVRC2012
wget -nc ${dataset_url}
tar xf ILSVRC2012_val.tar
ln -s ILSVRC2012_val ILSVRC2012
cd ILSVRC2012
ln -s val_list.txt train_list.txt
cd ../../
25 changes: 25 additions & 0 deletions benchmark/run_all.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
# 提供可稳定复现性能的脚本,默认在标准docker环境内py37执行: paddlepaddle/paddle:latest-gpu-cuda10.1-cudnn7 paddle=2.1.2 py=37
# 执行目录:需说明
# cd **
# 1 安装该模型需要的依赖 (如需开启优化策略请注明)
# pip install ...
# 2 拷贝该模型需要数据、预训练模型
# 3 批量运行(如不方便批量,1,2需放到单个模型中)

model_mode_list=(MobileNetV1 MobileNetV2 MobileNetV3_large_x1_0 EfficientNetB0 ShuffleNetV2_x1_0 DenseNet121 HRNet_W48_C SwinTransformer_tiny_patch4_window7_224 alt_gvt_base)
fp_item_list=(fp32)
bs_list=(32 64 96 128)
for model_mode in ${model_mode_list[@]}; do
for fp_item in ${fp_item_list[@]}; do
for bs_item in ${bs_list[@]};do
echo "index is speed, 1gpus, begin, ${model_name}"
run_mode=sp
CUDA_VISIBLE_DEVICES=0 bash benchmark/run_benchmark.sh ${run_mode} ${bs_item} ${fp_item} 10 ${model_mode} # (5min)
sleep 10
echo "index is speed, 8gpus, run_mode is multi_process, begin, ${model_name}"
run_mode=mp
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash benchmark/run_benchmark.sh ${run_mode} ${bs_item} ${fp_item} 10 ${model_mode}
sleep 10
done
done
done
56 changes: 56 additions & 0 deletions benchmark/run_benchmark.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
#!/usr/bin/env bash
set -xe
# 运行示例:CUDA_VISIBLE_DEVICES=0 bash run_benchmark.sh ${run_mode} ${bs_item} ${fp_item} 500 ${model_mode}
# 参数说明
function _set_params(){
run_mode=${1:-"sp"} # 单卡sp|多卡mp
batch_size=${2:-"64"}
fp_item=${3:-"fp32"} # fp32|fp16
epochs=${4:-"10"} # 可选,如果需要修改代码提前中断
model_name=${5:-"model_name"}
run_log_path="${TRAIN_LOG_DIR:-$(pwd)}/benchmark" # TRAIN_LOG_DIR 后续QA设置该参数

# 以下不用修改
device=${CUDA_VISIBLE_DEVICES//,/ }
arr=(${device})
num_gpu_devices=${#arr[*]}
log_file=${run_log_path}/clas_${model_name}_${run_mode}_bs${batch_size}_${fp_item}_${num_gpu_devices}
}
function _train(){
echo "Train on ${num_gpu_devices} GPUs"
echo "current CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES, gpus=$num_gpu_devices, batch_size=$batch_size"

if [ ${fp_item} = "fp32" ];then
model_config=`find ppcls/configs/ImageNet -name ${model_name}.yaml`
else
model_config=`find ppcls/configs/ImageNet -name ${model_name}_fp16.yaml`
fi

train_cmd="-c ${model_config} -o DataLoader.Train.sampler.batch_size=${batch_size} -o Global.epochs=${epochs}"
case ${run_mode} in
sp) train_cmd="python -u tools/train.py ${train_cmd}" ;;
mp)
train_cmd="python -m paddle.distributed.launch --log_dir=./mylog --gpus=$CUDA_VISIBLE_DEVICES tools/train.py ${train_cmd}"
log_parse_file="mylog/workerlog.0" ;;
*) echo "choose run_mode(sp or mp)"; exit 1;
esac
rm -rf mylog
# 以下不用修改
timeout 15m ${train_cmd} > ${log_file} 2>&1
if [ $? -ne 0 ];then
echo -e "${model_name}, FAIL"
export job_fail_flag=1
else
echo -e "${model_name}, SUCCESS"
export job_fail_flag=0
fi
kill -9 `ps -ef|grep 'python'|awk '{print $2}'`

if [ $run_mode = "mp" -a -d mylog ]; then
rm ${log_file}
cp mylog/workerlog.0 ${log_file}
fi
}

_set_params $@
_train
36 changes: 36 additions & 0 deletions deploy/configs/build_general.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
Global:
rec_inference_model_dir: "./models/general_PPLCNet_x2_5_lite_v1.0_infer"
batch_size: 32
use_gpu: True
enable_mkldnn: True
cpu_num_threads: 10
enable_benchmark: True
use_fp16: False
ir_optim: True
use_tensorrt: False
gpu_mem: 8000
enable_profile: False

RecPreProcess:
transform_ops:
- ResizeImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:

RecPostProcess: null

# indexing engine config
IndexProcess:
index_method: "HNSW32" # supported: HNSW32, IVF, Flat
image_root: "./drink_dataset_v1.0/gallery/"
index_dir: "./drink_dataset_v1.0/index"
data_file: "./drink_dataset_v1.0/gallery/drink_label.txt"
index_operation: "new" # suported: "append", "remove", "new"
delimiter: "\t"
dist_type: "IP"
embedding_size: 512
55 changes: 55 additions & 0 deletions deploy/configs/inference_general.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
Global:
infer_imgs: "./drink_dataset_v1.0/test_images/nongfu_spring.jpeg"
det_inference_model_dir: "./models/picodet_PPLCNet_x2_5_mainbody_lite_v1.0_infer"
rec_inference_model_dir: "./models/general_PPLCNet_x2_5_lite_v1.0_infer"
rec_nms_thresold: 0.05

batch_size: 1
image_shape: [3, 640, 640]
threshold: 0.2
max_det_results: 5
labe_list:
- foreground

# inference engine config
use_gpu: True
enable_mkldnn: True
cpu_num_threads: 10
enable_benchmark: True
use_fp16: False
ir_optim: True
use_tensorrt: False
gpu_mem: 8000
enable_profile: False

DetPreProcess:
transform_ops:
- DetResize:
interp: 2
keep_ratio: false
target_size: [640, 640]
- DetNormalizeImage:
is_scale: true
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
- DetPermute: {}
DetPostProcess: {}

RecPreProcess:
transform_ops:
- ResizeImage:
size: 224
- NormalizeImage:
scale: 0.00392157
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
order: ''
- ToCHWImage:

RecPostProcess: null

# indexing engine config
IndexProcess:
index_dir: "./drink_dataset_v1.0/index/"
return_k: 5
score_thres: 0.5
Loading

0 comments on commit dfb26b6

Please sign in to comment.