
Multi-GPU Extraction of Video Features

This is a PyTorch module that extracts features from videos in parallel on any number of GPUs. So far, I3D and VGGish features are supported.

I3D

Please note that this implementation uses PWC-Net for optical flow instead of the TV-L1 algorithm used in the original I3D paper, as PWC-Net is much faster. That said, a Pull Request adding TV-L1 as an alternative way of forming optical flow frames is welcome (a rough sketch of the idea follows).
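
For reference, TV-L1 flow for a pair of consecutive frames can be computed with OpenCV's contrib module. This is only a sketch of what such an option might compute, not code from this repository; the frame file names are made up, and the clipping to [-20, 20] follows common I3D preprocessing practice.

import cv2
import numpy as np

# read two consecutive frames (hypothetical file names) as grayscale
prev = cv2.cvtColor(cv2.imread('frame_000001.jpg'), cv2.COLOR_BGR2GRAY)
curr = cv2.cvtColor(cv2.imread('frame_000002.jpg'), cv2.COLOR_BGR2GRAY)

# TV-L1 optical flow (requires opencv-contrib-python)
tvl1 = cv2.optflow.DualTVL1OpticalFlow_create()
flow = tvl1.calc(prev, curr, None)  # (H, W, 2) float32: x and y displacements

# I3D flow streams commonly clip the displacements before rescaling
flow = np.clip(flow, -20, 20)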

Set up the Environment for I3D

Set up the conda environment. The requirements are in the file conda_env_i3d.yml.

# it will create a new conda environment called 'i3d' on your machine
conda env create -f conda_env_i3d.yml
conda activate i3d

Examples

The following command extracts I3D features for the sample videos using the 0th and 2nd devices in parallel. The features are extracted with the default parameters. Check out python main.py --help for the available options.

python main.py --feature_type i3d --device_ids 0 2 --video_paths ./sample/v_ZNVhz7ctTq0.mp4 ./sample/v_GGSY1Qvo990.mp4

Alternatively, the video paths can be specified in a .txt file

python main.py --feature_type i3d --device_ids 0 2 --file_with_video_paths ./sample/sample_video_paths.txt
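
Such a file plausibly lists one video path per line (an assumption for illustration; check ./sample/sample_video_paths.txt for the exact format):

./sample/v_ZNVhz7ctTq0.mp4
./sample/v_GGSY1Qvo990.mp4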

The features can be saved as numpy arrays by specifying --on_extraction save_numpy. By default, it will create a folder ./output and store the features there

python main.py --feature_type i3d --device_ids 0 2 --on_extraction save_numpy --file_with_video_paths ./sample/sample_video_paths.txt

You can change the output folder using the --output_path argument.
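
Once saved, the features are ordinary .npy files and can be inspected with NumPy. The file name below is an assumption made for illustration; check the contents of ./output after a run for the actual naming scheme.

import numpy as np

# hypothetical output file name; the extractor saves one array per video
# (and, for I3D, typically separate RGB and flow streams)
features = np.load('./output/v_ZNVhz7ctTq0_rgb.npy')
print(features.shape)  # (num_stacks, feature_dim)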

You may also want to change the I3D window (stack) and step sizes

python main.py --feature_type i3d --device_ids 0 2 --stack_size 24 --step_size 24 --file_with_video_paths ./sample/sample_video_paths.txt
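
Assuming a standard sliding-window scheme (an assumption about the internals, for intuition only), the number of extracted feature stacks relates to the video length as follows:

# frames in the video, window (stack) size, and step between windows
num_frames, stack_size, step_size = 240, 24, 24

# one feature vector per window position
num_stacks = (num_frames - stack_size) // step_size + 1
print(num_stacks)  # 10 non-overlapping stacks for a 240-frame video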

By default, the frames are extracted at the original fps of a video. If you would like to extract frames at a certain fps, specify the --extraction_fps argument.

python main.py --feature_type i3d --device_ids 0 2 --extraction_fps 25 --stack_size 24 --step_size 24 --file_with_video_paths ./sample/sample_video_paths.txt
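
For intuition, at --extraction_fps 25 with --stack_size 24, each feature covers just under one second of video:

fps, stack_size = 25, 24
print(stack_size / fps)  # 0.96 seconds of video per extracted stack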

If --keep_frames is specified, the extracted frames are kept in --tmp_path, which is ./tmp by default. Be careful with --keep_frames when experimenting with --extraction_fps, as new runs may clash with frames extracted earlier into the same folder.

Credits

  1. An implementation of PWC-Net in PyTorch: https://github.com/sniklaus/pytorch-pwc
  2. A port of I3D weights from TensorFlow to PyTorch: https://github.com/hassony2/kinetics_i3d_pytorch
  3. The I3D paper: Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset.

License

Everything is MIT except for the PWC-Net implementation used in I3D. Please read the license of the PWC-Net implementation (GPL-3.0 the last time I checked).

VGGish

The extraction of VGGish features is implemented as a wrapper around the TensorFlow implementation.

Set up the Environment for VGGish

Set up the conda environment. The requirements are in the file conda_env_vggish.yml.

# it will create a new conda environment called 'vggish' on your machine
conda env create -f conda_env_vggish.yml
conda activate vggish
# download the pre-trained VGGish model; wget will put the checkpoint in ./models/vggish/checkpoints
wget https://storage.googleapis.com/audioset/vggish_model.ckpt -P ./models/vggish/checkpoints

Example

python main.py --feature_type vggish --device_ids 0 2 --video_paths ./sample/v_ZNVhz7ctTq0.mp4 ./sample/v_GGSY1Qvo990.mp4

See python main.py --help for more arguments, and the I3D section above for more usage examples
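
As with I3D, VGGish features saved via --on_extraction save_numpy are plain NumPy arrays. The file name below is a guess for illustration; VGGish produces one 128-dimensional embedding per audio segment.

import numpy as np

# hypothetical output path; check ./output after a run for the real name
vggish = np.load('./output/v_ZNVhz7ctTq0_vggish.npy')
print(vggish.shape)  # (num_segments, 128)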

Credits

  1. The TensorFlow implementation of VGGish: https://github.com/tensorflow/models/tree/master/research/audioset
  2. The VGGish paper: CNN Architectures for Large-Scale Audio Classification.

License

My code (the wrapper) is under MIT, but the TensorFlow implementation is covered by the TensorFlow license, which is Apache-2.0.