[Feature] Add dataset analysis script (#172)
* messages

* again

* again1

* again_2

* again_3

* again_4

* again_5

* Update docs/zh_cn/user_guides/useful_tools.md

Co-authored-by: HinGwenWoong <[email protected]>

* Update docs/zh_cn/user_guides/useful_tools.md

Co-authored-by: HinGwenWoong <[email protected]>

* Update docs/en/user_guides/useful_tools.md

Co-authored-by: HinGwenWoong <[email protected]>

* Update docs/en/user_guides/useful_tools.md

Co-authored-by: HinGwenWoong <[email protected]>

* Update docs/en/user_guides/useful_tools.md

Co-authored-by: HinGwenWoong <[email protected]>

* Update docs/zh_cn/user_guides/useful_tools.md

Co-authored-by: HinGwenWoong <[email protected]>

* Update tools/analysis_tools/dataset_analysis.py

Co-authored-by: HinGwenWoong <[email protected]>

* Update tools/analysis_tools/dataset_analysis.py

Co-authored-by: HinGwenWoong <[email protected]>

* Update tools/analysis_tools/dataset_analysis.py

Co-authored-by: HinGwenWoong <[email protected]>

* Update tools/analysis_tools/dataset_analysis.py

Co-authored-by: HinGwenWoong <[email protected]>

* modify code

* Update docs/en/user_guides/useful_tools.md

Co-authored-by: HinGwenWoong <[email protected]>

* Modify document

* Modify document

* new code

* revise decuments and codes

* Revise datails

* Update tools/analysis_tools/dataset_analysis.py

Co-authored-by: HinGwenWoong <[email protected]>

* modify func name

* code

* Documentation and code

* modify error meaasge

* deleted height,

Co-authored-by: HinGwenWoong <[email protected]>
2 people authored and hhaAndroid committed Nov 3, 2022
1 parent 93e802b commit e6e6f73
Showing 3 changed files with 642 additions and 2 deletions.
80 changes: 79 additions & 1 deletion docs/en/user_guides/useful_tools.md
@@ -108,7 +108,85 @@ python tools/analysis_tools/browse_dataset.py 'configs/yolov5/yolov5_s-v61_syncb
--not-show
```
## Convert Dataset
### Visualize dataset analysis
`tools/analysis_tools/dataset_analysis.py` helps users generate plots for four analysis functions and saves the images to the `dataset_analysis` folder under the current working directory.
Description of the script's functions:
The data required by each sub function is prepared in `main()`.
Function 1: Generated by the sub function `show_bbox_num`; displays the distribution of categories and bbox instance counts.
<img src="https://user-images.githubusercontent.com/90811472/196891728-4c2f1ab3-01cb-445f-a6b8-39752387c40f.jpg"/>
Function 2: Generated by the sub function `show_bbox_wh` to display the width and height distribution of categories and bbox instances.
<img src="https://user-images.githubusercontent.com/90811472/199019573-650b9652-eb14-4bc0-a5e8-650dfc578fc8.jpg"/>
Function 3: Generated by the sub function `show_bbox_wh_ratio` to display the width to height ratio distribution of categories and bbox instances.
<img src="https://user-images.githubusercontent.com/90811472/199019593-0f810a21-18d2-41ac-b4fa-baa8288bcb23.jpg"/>
Function 4: Generated by the sub function `show_bbox_area`; displays the distribution of categories and bbox instance areas according to the area rule.
<img src="https://user-images.githubusercontent.com/90811472/199022991-5388db47-d0f3-4201-9eee-13c5fab6bca9.jpg"/>
Print List: Generated by the sub functions `show_class_list` and `show_data_list`.
<img src="https://user-images.githubusercontent.com/90811472/199090989-15109bbf-f035-477d-8566-e2a28de0935d.jpg"/>
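The statistics behind these plots can be sketched in a few lines of Python. This is only an illustration, not the script's actual implementation: the annotation layout (COCO-style `[x, y, width, height]` boxes) and the helper name `bbox_statistics` are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical annotations in COCO style: bbox is [x, y, width, height].
annotations = [
    {"category": "person", "bbox": [10, 20, 40, 80]},
    {"category": "person", "bbox": [5, 5, 30, 30]},
    {"category": "car", "bbox": [0, 0, 120, 60]},
]


def bbox_statistics(annotations):
    """Collect per-class instance counts, widths, heights and w/h ratios."""
    stats = defaultdict(
        lambda: {"num": 0, "widths": [], "heights": [], "ratios": []})
    for ann in annotations:
        _, _, w, h = ann["bbox"]
        if w <= 0 or h <= 0:
            # Skip degenerate boxes so the ratio is always defined.
            continue
        entry = stats[ann["category"]]
        entry["num"] += 1
        entry["widths"].append(w)
        entry["heights"].append(h)
        entry["ratios"].append(w / h)
    return dict(stats)


stats = bbox_statistics(annotations)
for name, entry in stats.items():
    print(name, entry["num"], entry["ratios"])
```

Each of the plotted distributions (instance count, width and height, width-to-height ratio) corresponds to one of the per-class lists collected here.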
```shell
python tools/analysis_tools/dataset_analysis.py ${CONFIG} \
[-h] \
[--val-dataset] \
[--class-name ${CLASS_NAME}] \
[--area-rule ${AREA_RULE}] \
[--func ${FUNC}] \
[--output-dir ${OUTPUT_DIR}]
```
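For readers who want to see how such options could be parsed, here is a minimal `argparse` sketch. It mirrors the option names used in this section but is not taken from the script itself: treating `--val-dataset` as a boolean switch, the `int` type of `--area-rule`, and the default values are assumptions based on how the options are used in the examples.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented options."""
    parser = argparse.ArgumentParser(
        description="Dataset analysis (illustrative sketch)")
    parser.add_argument("config", help="Path to the config file")
    # Assumed to be a switch: train_dataset unless the flag is given.
    parser.add_argument("--val-dataset", action="store_true",
                        help="Analyze val_dataset instead of train_dataset")
    parser.add_argument("--class-name", default=None,
                        help="Show only this class; all classes if omitted")
    # One or more thresholds, e.g. `--area-rule 30 70 125`.
    parser.add_argument("--area-rule", type=int, nargs="+", default=None,
                        help="Inner thresholds of the area rule")
    parser.add_argument("--func", default=None,
                        choices=["show_bbox_num", "show_bbox_wh",
                                 "show_bbox_wh_ratio", "show_bbox_area"],
                        help="Run a single function; all if omitted")
    parser.add_argument("--output-dir", default="./dataset_analysis",
                        help="Directory where the plots are saved")
    return parser


args = build_parser().parse_args([
    "configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py",
    "--area-rule", "30", "70", "125",
    "--func", "show_bbox_num",
])
print(args.val_dataset, args.area_rule, args.func, args.output_dir)
```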
Examples:
1. Use the config file `configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py` to analyze the dataset. By default, the data loading type is `train_dataset` and the area rule is `[0,32,96,1e5]`; the script generates result plots for all functions and saves them to the `./dataset_analysis` folder under the current working directory:
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py
```
2. Use the config file `configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py` to analyze the dataset and switch the data loading type from the default `train_dataset` to `val_dataset` with `--val-dataset`:
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py \
--val-dataset
```
3. Use the config file `configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py` to analyze the dataset and display a specific class instead of all classes with `--class-name`. For example, to display only the `person` class:
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py \
--class-name person
```
4. Use the config file `configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py` to analyze the dataset and redefine the area rule with `--area-rule`. For example, with `30 70 125` the area rule becomes `[0,30,70,125,1e5]`:
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py \
--area-rule 30 70 125
```
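The way user thresholds such as `30 70 125` expand into `[0,30,70,125,1e5]` and are then used to bin boxes can be sketched as below. The helper names are hypothetical, and comparing the square root of the box area against the thresholds (as in the COCO small/medium/large convention) is an assumption; the script may bin differently.

```python
import math
from bisect import bisect_left


def build_area_rule(user_thresholds=None):
    """Wrap the inner thresholds with 0 and 1e5; default is [0, 32, 96, 1e5]."""
    inner = sorted(user_thresholds) if user_thresholds else [32, 96]
    return [0, *inner, 1e5]


def area_bin(width, height, area_rule):
    """Return i such that area_rule[i] < size <= area_rule[i + 1], else None.

    `size` is the square root of the box area (an assumption, see above).
    """
    size = math.sqrt(width * height)
    idx = bisect_left(area_rule, size) - 1
    return idx if 0 <= idx < len(area_rule) - 1 else None


rule = build_area_rule([30, 70, 125])
print(rule)
print(area_bin(10, 10, rule), area_bin(40, 80, rule), area_bin(100, 200, rule))
```

With N inner thresholds there are N + 1 bins, so a three-value rule splits the instances into four area groups.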
5. Use the config file `configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py` to analyze the dataset and display a single function plot instead of all four with `--func`. For example, to display only `Function 1`:
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py \
--func show_bbox_num
```
6. Use the config file `configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py` to analyze the dataset and change the directory where the images are saved to `work_dir/dataset_analysis` with `--output-dir`:
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py \
--output-dir work_dir/dataset_analysis
```
## Dataset Conversion
The folder `tools/data_converters` currently contains two dataset conversion tools: `ballon2coco.py` and `yolo2coco.py`.
80 changes: 79 additions & 1 deletion docs/zh_cn/user_guides/useful_tools.md
@@ -14,7 +14,7 @@ mim run mmdet print_config [CONFIG]

### Visualize COCO labels

The script `tools/analysis_tools/browse_coco_json.py` visualizes how COCO labels appear on the images.

```shell
python tools/analysis_tools/browse_coco_json.py ${DATA_ROOT} \
@@ -108,6 +108,84 @@ python tools/analysis_tools/browse_dataset.py 'configs/yolov5/yolov5_s-v61_syncb
--not-show
```
### Visualize dataset analysis (original heading text: `可视化数据集分`)

Review comment by @vansin (Collaborator), Nov 4, 2022: "可视化数据集部分 ?" ("Visualize dataset section?")

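The script `tools/analysis_tools/dataset_analysis.py` helps users generate result plots for four functions and saves the images to the `dataset_analysis` folder under the current working directory.
Description of the script's functions:
The data required by each sub function is prepared in `main()`.
Function 1: Displays the distribution of categories and bbox instance counts, generated by the sub function `show_bbox_num`.
<img src="https://user-images.githubusercontent.com/90811472/196891728-4c2f1ab3-01cb-445f-a6b8-39752387c40f.jpg"/>
Function 2: Displays the width and height distribution of categories and bbox instances, generated by the sub function `show_bbox_wh`.
<img src="https://user-images.githubusercontent.com/90811472/199019573-650b9652-eb14-4bc0-a5e8-650dfc578fc8.jpg"/>
Function 3: Displays the width-to-height ratio distribution of categories and bbox instances, generated by the sub function `show_bbox_wh_ratio`.
<img src="https://user-images.githubusercontent.com/90811472/199019593-0f810a21-18d2-41ac-b4fa-baa8288bcb23.jpg"/>
Function 4: Displays the distribution of categories and bbox instance areas according to the area rule, generated by the sub function `show_bbox_area`.
<img src="https://user-images.githubusercontent.com/90811472/199022991-5388db47-d0f3-4201-9eee-13c5fab6bca9.jpg"/>
Print List: Generated by the sub functions `show_class_list` and `show_data_list` in the script.
<img src="https://user-images.githubusercontent.com/90811472/199090989-15109bbf-f035-477d-8566-e2a28de0935d.jpg"/>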
```shell
python tools/analysis_tools/dataset_analysis.py ${CONFIG} \
[-h] \
[--val-dataset] \
[--class-name ${CLASS_NAME}] \
[--area-rule ${AREA_RULE}] \
[--func ${FUNC}] \
[--output-dir ${OUTPUT_DIR}]
```
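Examples:
1. Use the config file `configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py` to analyze the dataset. By default, the data loading type is `train_dataset` and the area rule is `[0,32,96,1e5]`; the script generates result plots containing all classes and saves them to the `./dataset_analysis` folder under the current working directory: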
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py
```
2. Use the config file `configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py` to analyze the dataset and switch the data loading type from the default `train_dataset` to `val_dataset` with `--val-dataset`:
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py \
--val-dataset
```
3. Use the config file `configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py` to analyze the dataset and display a specific class instead of all classes with `--class-name`. For example, to display only `person`:
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py \
--class-name person
```
4. Use the config file `configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py` to analyze the dataset and redefine the area rule with `--area-rule`. For example, with `30 70 125` the area rule becomes `[0,30,70,125,1e5]`:
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py \
--area-rule 30 70 125
```
5. Use the config file `configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py` to analyze the dataset and display a single function plot instead of all four with `--func`. For example, to display only `Function 1`:
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py \
--func show_bbox_num
```
6. Use the config file `configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py` to analyze the dataset and change the directory where the images are saved with `--output-dir`. For example, to save them to `work_dir/dataset_analysis`:
```shell
python tools/analysis_tools/dataset_analysis.py configs/yolov5/yolov5_s-v61_syncbn_8xb16-300e_coco.py \
--output-dir work_dir/dataset_analysis
```
## Dataset Conversion
The folder `tools/data_converters/` currently contains two dataset conversion tools: `ballon2coco.py` and `yolo2coco.py`.