add TSN dygraph model #4817
Open: LiuChaoXD wants to merge 7 commits into PaddlePaddle:release/1.8 from LiuChaoXD:liuchao45-tsn
fdcd1ff add tsn
f5e723e add tsn (LiuChaoXD)
4048b3b add refined tsn model (LiuChaoXD)
ba0dd40 add refined dynamic tsn model (LiuChaoXD)
e035d91 refined dynamic tsn 2020-08-30 (LiuChaoXD)
5df6704 final dynamic tsn 2020-08-30 (LiuChaoXD)
9e963ac add tsn 2020-08-31 (LiuChaoXD)
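@@ -0,0 +1,98 @@
# TSN Video Classification Model
This directory contains a TSN video classification model implemented with the PaddlePaddle dynamic graph (dygraph). The model supports PaddlePaddle Fluid 1.8 on GPU under Linux.

---
## Contents

- [Model Overview](#model-overview)
- [Installation](#installation)
- [Data Preparation](#data-preparation)
- [Model Training](#model-training)
- [Model Evaluation](#model-evaluation)
- [References](#references)


## Model Overview

Temporal Segment Network (TSN) is a classic 2D-CNN-based solution for video classification. It targets long-range action recognition in videos: instead of sampling frames densely, it samples them sparsely, which captures the global information of the video while removing redundancy and reducing computation. The per-frame features are then averaged to obtain a video-level feature, which is used for classification. The model implemented here is a single-stream TSN that takes RGB images only, with ResNet50 as the backbone.

For details, see the ECCV 2016 paper [Temporal Segment Networks: Towards Good Practices for Deep Action Recognition](https://arxiv.org/abs/1608.00859).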
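The sparse sampling and average fusion described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code from this PR: the helper names `sample_segment_indices` and `average_consensus` are hypothetical, and picking the middle frame of each segment is an assumed convention (training code often picks a random frame per segment instead).

```python
def sample_segment_indices(num_frames, seg_num=3):
    """Pick one frame index per segment (hypothetical helper).

    The video is split into `seg_num` equal-length segments and the middle
    frame of each segment is returned, so the samples span the whole video.
    """
    if num_frames <= 0 or seg_num <= 0:
        raise ValueError("num_frames and seg_num must be positive")
    seg_len = num_frames / seg_num
    # Middle of segment i, clamped to a valid frame index.
    return [min(num_frames - 1, int(seg_len * i + seg_len / 2))
            for i in range(seg_num)]


def average_consensus(frame_scores):
    """Average per-frame class scores into one video-level score vector."""
    n = len(frame_scores)
    return [sum(col) / n for col in zip(*frame_scores)]


indices = sample_segment_indices(num_frames=90, seg_num=3)
print(indices)  # [15, 45, 75]

# Three sampled frames, two classes each (made-up scores).
video_scores = average_consensus([[0.2, 0.8], [0.4, 0.6], [0.6, 0.4]])
print(video_scores)
```

With `seg_num=3`, only three frames per video pass through the backbone, regardless of the video length.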

## Installation

### Requirements:

```
python=3.7
paddlepaddle-gpu==1.8.3.post97
opencv=4.3
CUDA >= 9.0
cudnn >= 7.5
wget
numpy
```

### Installing dependencies:

- Install the GPU version of PaddlePaddle:

``` pip3 install paddlepaddle-gpu==1.8.3.post97 -i https://mirror.baidu.com/pypi/simple```
- Install opencv 4.3:

``` pip3 install opencv-python==4.3.0.36```
- Install wget:

``` pip3 install wget```
- Install numpy:

``` pip3 install numpy```

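## Data Preparation

TSN is trained on the UCF101 action recognition dataset. For downloading and preparing the data, see the [data instructions](./data/dataset/ucf101/README.md).

## Model Training

Once the data is ready, training can be started in either of the following two ways.

1. Multi-GPU training
```bash
bash multi_gpus_run.sh
```
The GPUs used for multi-GPU training can be set as follows:
- Edit `export CUDA_VISIBLE_DEVICES=0,1,2,3` in `multi_gpus_run.sh` (the default `0,1,2,3` trains on GPUs 0, 1, 2 and 3).
- Note: the configuration file for multi-GPU training is `multi_tsn.yaml`. If you change the batch size, change the learning rate by the same factor: a larger batch size uses a proportionally larger lr. For example, the four-GPU default is batchsize=128 with lr=0.001; with batchsize=64, use lr=0.0005.


2. Single-GPU training
```bash
bash single_gpu_run.sh
```
The GPU used for single-GPU training can be set as follows:
- Edit `export CUDA_VISIBLE_DEVICES=0` in `single_gpu_run.sh` (this trains the model on GPU 0).
- Note: the configuration file for single-GPU training is `single_gpu_run.sh`. If you change the batch size, change the learning rate by the same factor: a larger batch size uses a proportionally larger lr. The single-GPU default is batchsize=64 with lr=0.0005; with batchsize=32, use lr=0.00025.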
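The batch-size and learning-rate rule in the notes above is a simple proportional scaling. The sketch below only restates that arithmetic; `scale_learning_rate` is a hypothetical helper and is not part of the training scripts in this PR.

```python
def scale_learning_rate(base_lr, base_batch_size, new_batch_size):
    """Scale the learning rate by the same factor as the batch size."""
    if base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * new_batch_size / base_batch_size


# Four-GPU default from this README: batchsize=128, lr=0.001.
print(scale_learning_rate(0.001, 128, 64))   # about 0.0005
# Single-GPU default from this README: batchsize=64, lr=0.0005.
print(scale_learning_rate(0.0005, 64, 32))   # about 0.00025
```

The resulting value still has to be written into the corresponding configuration by hand.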
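## Model Evaluation

The model can be evaluated as follows:
```bash
bash run_eval.sh ./configs/tsn_test.yaml ./weights/final.pdparams
```

- When evaluating with `run_eval.sh`, set the `weights` parameter in the script to the weights you want to evaluate.

- `./configs/tsn_test.yaml` is the configuration file used for evaluation; `./weights/final.pdparams` is the model file saved when training finishes.

- The evaluation results are printed directly to the log, including accuracy metrics such as TOP1\_ACC and TOP5\_ACC.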
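For readers unfamiliar with the two metrics, the following sketch shows how Top-1 and Top-5 style accuracy are typically computed from per-class scores. It is a plain-Python illustration under assumed names (`topk_accuracy` is hypothetical), not the evaluation code used by `run_eval.sh`, and the scores are made up.

```python
def topk_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores."""
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and equal length")
    hits = 0
    for sample_scores, label in zip(scores, labels):
        # Class indices sorted by descending score; keep the first k.
        ranked = sorted(range(len(sample_scores)),
                        key=lambda c: sample_scores[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


scores = [
    [0.1, 0.7, 0.2],    # highest score: class 1
    [0.5, 0.3, 0.2],    # highest score: class 0
    [0.2, 0.3, 0.5],    # highest score: class 2
    [0.4, 0.35, 0.25],  # highest score: class 0
]
labels = [1, 1, 2, 2]
print(topk_accuracy(scores, labels, k=1))  # 0.5
print(topk_accuracy(scores, labels, k=2))  # 0.75
```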

Experimental results: with four-GPU training and the default configuration, the accuracy on the UCF101 validation set is as follows:

| Model | seg\_num | Top-1 | Top-5 |
| :------: | :----------: | :----: | :----: |
| Paddle TSN (static graph) | 3 | 84.00% | 97.38% |
| Paddle TSN (dynamic graph) | 3 | 84.27% | 97.27% |

## References

- [Temporal Segment Networks: Towards Good Practices for Deep Action Recognition](https://arxiv.org/abs/1608.00859), Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, Luc Van Gool
@@ -0,0 +1,86 @@
# UCF101 Data Preparation
This document covers preparing the UCF101 data: downloading the data, extracting frames from the video files, and generating the file path lists.

---
## 1. Data Download
For details about UCF101, see the [UCF101](https://www.crcv.ucf.edu/data/UCF101.php) website. For convenience, we provide download scripts for the UCF101 annotation files and video files.

### Download the annotation files
First, make sure you are in the `./data/dataset/ucf101/` directory, then run the following command to download the UCF101 annotation files.
```shell
bash download_annotations.sh
```

### Download the UCF101 video files
Again from the `./data/dataset/ucf101/` directory, run the following command to download the video files.
```shell
bash download_videos.sh
```
After the download completes, the video files are stored in the `./data/dataset/ucf101/videos/` folder; they take up 6.8G.

---
## 2. Extract Frames from the Video Files
To speed up training, we first extract frames from the video files (the UCF101 videos are in avi format). Compared with training directly from video files, training from frames is faster.

Run the following command to extract the frames of the UCF101 videos:
```shell
python extract_rawframes.py ./videos/ ./rawframes/ --level 2 --ext avi
```
After extraction, the frames are stored in the `./rawframes` folder; they take up 56G.

---
## 3. Generate the Path Lists for Frames and Videos
To generate the path list for the video files, run:

```shell
python build_ucf101_file_list.py videos/ --level 2 --format videos --out_list_path ./
```

Review comment: (1) When generating the list, it is better not to shuffle; do the shuffle in the Reader instead.

To generate the path list for the frames, run:
```shell
python build_ucf101_file_list.py rawframes/ --level 2 --format rawframes --out_list_path ./
```

**Parameters**

`videos/` or `rawframes/`: the directory where the videos or frames are stored

`--level 2`: the directory depth of the files (2 means the files sit two levels below the root, i.e. `class/video`)

`--format`: whether to generate the path list for videos or for frames

`--out_list_path`: where the generated path list files are stored
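Each line of a generated `rawframes` list has the form `<path> <frame count> <label>`, and each line of a `videos` list has the form `<path> <label>` (see `build_split_list` in `build_ucf101_file_list.py`). The sketch below shows one way to read such lines and to shuffle on the reader side, once per epoch, as the review comment suggests. It is only an illustration: `parse_list_line`, `epoch_order`, the seed handling, and the frame counts in the sample lines are hypothetical and are not taken from this PR's Reader.

```python
import random


def parse_list_line(line):
    """Parse '<path> <num_frames> <label>' or '<path> <label>'."""
    fields = line.split()
    if len(fields) == 3:
        path, num_frames, label = fields
        return path, int(num_frames), int(label)
    if len(fields) == 2:
        path, label = fields
        return path, -1, int(label)  # -1: frame count unknown (video list)
    raise ValueError("unexpected list line: {!r}".format(line))


def epoch_order(records, epoch, seed=0, shuffle=True):
    """Return the records in the order used for one epoch.

    The list file itself stays in a fixed order; shuffling happens here,
    with a per-epoch seed so that runs are reproducible.
    """
    order = list(records)
    if shuffle:
        random.Random(seed + epoch).shuffle(order)
    return order


# Sample lines; the frame counts are made up for illustration.
lines = [
    "ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01 120 0\n",
    "YoYo/v_YoYo_g25_c05 95 100\n",
]
records = [parse_list_line(line) for line in lines]
for epoch in range(2):
    print(epoch, epoch_order(records, epoch))
```

Keeping the list files unshuffled makes them diffable and lets the validation list be read in a stable order with `shuffle=False`.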


# After completing the steps above, the files are organized as follows

```
├── data
│   ├── dataset
│   │   ├── ucf101
│   │   │   ├── ucf101_{train,val}_split_{1,2,3}_rawframes.txt
│   │   │   ├── ucf101_{train,val}_split_{1,2,3}_videos.txt
│   │   │   ├── annotations
│   │   │   ├── videos
│   │   │   │   ├── ApplyEyeMakeup
│   │   │   │   │   ├── v_ApplyEyeMakeup_g01_c01.avi
│
│   │   │   │   ├── YoYo
│   │   │   │   │   ├── v_YoYo_g25_c05.avi
│   │   │   ├── rawframes
│   │   │   │   ├── ApplyEyeMakeup
│   │   │   │   │   ├── v_ApplyEyeMakeup_g01_c01
│   │   │   │   │   │   ├── img_00001.jpg
│   │   │   │   │   │   ├── img_00002.jpg
│   │   │   │   │   │   ├── ...
│   │   │   │   │   │   ├── flow_x_00001.jpg
│   │   │   │   │   │   ├── flow_x_00002.jpg
│   │   │   │   │   │   ├── ...
│   │   │   │   │   │   ├── flow_y_00001.jpg
│   │   │   │   │   │   ├── flow_y_00002.jpg
│   │   │   │   ├── ...
│   │   │   │   ├── YoYo
│   │   │   │   │   ├── v_YoYo_g01_c01
│   │   │   │   │   ├── ...
│   │   │   │   │   ├── v_YoYo_g25_c05

```
157 changes: 157 additions & 0 deletions
dygraph/tsn/data/dataset/ucf101/build_ucf101_file_list.py
@@ -0,0 +1,157 @@
import argparse
import os
import glob
import fnmatch
import random


def parse_directory(path,
                    key_func=lambda x: x[-11:],
                    rgb_prefix='img_',
                    level=1):
    """
    Parse directories holding extracted frames from standard benchmarks
    """
    print('parse frames under folder {}'.format(path))
    if level == 1:
        frame_folders = glob.glob(os.path.join(path, '*'))
    elif level == 2:
        frame_folders = glob.glob(os.path.join(path, '*', '*'))
    else:
        raise ValueError('level can be only 1 or 2')

    def count_files(directory, prefix_list):
        lst = os.listdir(directory)
        cnt_list = [len(fnmatch.filter(lst, x + '*')) for x in prefix_list]
        return cnt_list

    # check RGB
    frame_dict = {}
    for i, f in enumerate(frame_folders):
        # Pass a one-element tuple: a bare string would be iterated
        # character by character.
        all_cnt = count_files(f, (rgb_prefix, ))
        k = key_func(f)

        if i % 200 == 0:
            print('{} videos parsed'.format(i))

        # (folder, number of RGB frames, number of flow frames). Optical-flow
        # images are not counted by this script, so the last field is 0.
        frame_dict[k] = (f, all_cnt[0], 0)

    print('frame folder analysis done')
    return frame_dict


def build_split_list(split, frame_info, shuffle=False):
    def build_set_list(set_list):
        rgb_list = list()
        for item in set_list:
            if item[0] not in frame_info:
                continue
            elif frame_info[item[0]][1] > 0:
                # rawframes format: "<path> <frame count> <label>"
                rgb_cnt = frame_info[item[0]][1]
                rgb_list.append('{} {} {}\n'.format(item[0], rgb_cnt, item[1]))
            else:
                # videos format: "<path> <label>"
                rgb_list.append('{} {}\n'.format(item[0], item[1]))
        if shuffle:
            random.shuffle(rgb_list)
        return rgb_list

    train_rgb_list = build_set_list(split[0])
    test_rgb_list = build_set_list(split[1])
    return (train_rgb_list, test_rgb_list)


def parse_ucf101_splits(level):
    with open('./annotations/classInd.txt') as f:
        class_ind = [x.strip().split() for x in f if x.strip()]
    # Class indices in classInd.txt start at 1; labels start at 0.
    class_mapping = {x[1]: int(x[0]) - 1 for x in class_ind}

    def line2rec(line):
        items = line.strip().split(' ')
        vid = items[0].split('.')[0]
        vid = '/'.join(vid.split('/')[-level:])
        label = class_mapping[items[0].split('/')[0]]
        return vid, label

    splits = []
    for i in range(1, 4):
        with open('./annotations/trainlist{:02d}.txt'.format(i)) as f:
            train_list = [line2rec(x) for x in f if x.strip()]
        with open('./annotations/testlist{:02d}.txt'.format(i)) as f:
            test_list = [line2rec(x) for x in f if x.strip()]
        splits.append((train_list, test_list))
    return splits


def parse_args():
    parser = argparse.ArgumentParser(description='Build file list')
    parser.add_argument(
        'frame_path', type=str, help='root directory for the frames')
    parser.add_argument('--rgb_prefix', type=str, default='img_')
    parser.add_argument('--num_split', type=int, default=3)
    parser.add_argument('--level', type=int, default=2, choices=[1, 2])
    parser.add_argument(
        '--format',
        type=str,
        default='rawframes',
        choices=['rawframes', 'videos'])
    parser.add_argument('--out_list_path', type=str, default='./')
    # With action='store_true', the default must be False; with default=True
    # the flag could never be switched off.
    parser.add_argument('--shuffle', action='store_true', default=False)
    args = parser.parse_args()

    return args


def main():
    args = parse_args()

    if args.level == 2:

        def key_func(x):
            return '/'.join(x.split('/')[-2:])
    else:

        def key_func(x):
            return x.split('/')[-1]

    if args.format == 'rawframes':
        frame_info = parse_directory(
            args.frame_path,
            key_func=key_func,
            rgb_prefix=args.rgb_prefix,
            level=args.level)
    elif args.format == 'videos':
        if args.level == 1:
            video_list = glob.glob(os.path.join(args.frame_path, '*'))
        elif args.level == 2:
            video_list = glob.glob(os.path.join(args.frame_path, '*', '*'))
        # Use splitext rather than split('.') so that a root such as
        # './videos/' is not truncated at its leading dot.
        frame_info = {
            os.path.relpath(os.path.splitext(x)[0], args.frame_path):
            (x, -1, -1)
            for x in video_list
        }

    split_tp = parse_ucf101_splits(args.level)
    assert len(split_tp) == args.num_split

    out_path = args.out_list_path

    for i, split in enumerate(split_tp):
        lists = build_split_list(split, frame_info, shuffle=args.shuffle)
        filename = 'ucf101_train_split_{}_{}.txt'.format(i + 1, args.format)

        with open(os.path.join(out_path, filename), 'w') as f:
            f.writelines(lists[0])
        filename = 'ucf101_val_split_{}_{}.txt'.format(i + 1, args.format)
        with open(os.path.join(out_path, filename), 'w') as f:
            f.writelines(lists[1])


if __name__ == "__main__":
    main()
@@ -0,0 +1,13 @@
#!/usr/bin/env bash

DATA_DIR="./annotations"

if [[ ! -d "${DATA_DIR}" ]]; then
  echo "${DATA_DIR} does not exist. Creating";
  mkdir -p "${DATA_DIR}"
fi

wget --no-check-certificate "https://www.crcv.ucf.edu/data/UCF101/UCF101TrainTestSplits-RecognitionTask.zip"

unzip -j UCF101TrainTestSplits-RecognitionTask.zip -d "${DATA_DIR}/"
rm UCF101TrainTestSplits-RecognitionTask.zip
@@ -0,0 +1,7 @@
#!/usr/bin/env bash

wget --no-check-certificate "https://www.crcv.ucf.edu/data/UCF101/UCF101.rar"
unrar x UCF101.rar
mv ./UCF-101 ./videos
rm -rf ./UCF101.rar
Review suggestion: Enter the following command to download the UCF101 annotation files.