Marker-Elan

Video Analysis tool

Introduction

This program transcribes .wav files and produces .json files containing the words that include the selected characters (the filters) in a given language. It uses the whisper-timestamped library to process the audio and then applies the filter (one or more characters). The words that contain these characters are written to a .json file, together with timestamp attributes for each word.
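
The filtering step described above can be illustrated with a minimal sketch. It assumes the transcription is a dictionary with a "segments" list whose items carry a "words" list with "text", "start" and "end" keys, which is the shape whisper-timestamped produces. The function name `filter_words` and the sample data are hypothetical and are not taken from this repository.

```python
import json


def filter_words(transcription, filters):
    """Return the words that contain at least one of the filter strings."""
    matches = []
    for segment in transcription.get("segments", []):
        for word in segment.get("words", []):
            text = word["text"].lower()
            # Keep the word if any filter string occurs inside it.
            if any(f in text for f in filters):
                matches.append(
                    {"text": word["text"], "start": word["start"], "end": word["end"]}
                )
    return matches


# Hand-written sample in the assumed shape, not real program output.
sample = {
    "segments": [
        {
            "words": [
                {"text": "los", "start": 0.0, "end": 0.4},
                {"text": "datos", "start": 0.4, "end": 0.9},
                {"text": "caen", "start": 0.9, "end": 1.3},
            ]
        }
    ]
}

marked = filter_words(sample, ["s", "d"])
print(json.dumps(marked, ensure_ascii=False, indent=2))
```

In this sample, "los" and "datos" are kept and "caen" is dropped, because it contains neither 's' nor 'd'.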

Use instructions

Prerequisites

Python 3.9 or newer

Installation

Clone this repository (or download it as a zip):

git clone https://github.com/Klefur/Elan-Marker.git

Install the whisper-timestamped library:

pip3 install git+https://github.com/linto-ai/whisper-timestamped

Install ffmpeg:
- On Ubuntu or Debian:
```
sudo apt update && sudo apt install ffmpeg
```
- On Arch Linux:
```
sudo pacman -S ffmpeg
```
- On macOS using Homebrew (https://brew.sh/):
```
brew install ffmpeg
```
- On Windows using Chocolatey (https://chocolatey.org/):
```
choco install ffmpeg
```
- On Windows using Scoop (https://scoop.sh/):
```
scoop install ffmpeg
```
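
After installing, it can be useful to confirm that ffmpeg is visible on the PATH. The check below is a small sketch using only the Python standard library; the helper name `ffmpeg_available` is hypothetical and is not part of this program.

```python
import shutil


def ffmpeg_available():
    """Return True if an `ffmpeg` executable is found on the PATH."""
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    print("ffmpeg found" if ffmpeg_available() else "ffmpeg not found on PATH")
```
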
Install ONNX Runtime and torchaudio:

pip3 install onnxruntime torchaudio

Install an audio backend for torchaudio:
- SoundFile on Windows
```
pip install soundfile
```
- SoX on Linux/macOS
```
pip install sox
```
Install moviepy:

pip install moviepy

Install pympi-ling:

pip install pympi-ling

Setup files

Move all the files you want to process into the input folder. Any .mp4 files are automatically converted to .wav files. To skip the conversion (for example, when the input folder already contains .wav files), use the --use_wav flag.
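
As an illustration of the conversion step, the sketch below builds one ffmpeg command per .mp4 file in a folder, writing a .wav file next to each source. This is not necessarily how the program performs the conversion (moviepy is listed as a dependency and may be used instead). The helper names `ffmpeg_command` and `plan_conversions` are hypothetical.

```python
from pathlib import Path


def ffmpeg_command(mp4_path):
    """Build an ffmpeg command that extracts the audio track as a .wav file."""
    src = Path(mp4_path)
    dst = src.with_suffix(".wav")
    # -y overwrites an existing output file, -vn drops the video stream.
    return ["ffmpeg", "-y", "-i", str(src), "-vn", str(dst)]


def plan_conversions(input_folder, use_wav=False):
    """Return one ffmpeg command per .mp4 file, or nothing if use_wav is set."""
    if use_wav:
        return []
    return [ffmpeg_command(p) for p in sorted(Path(input_folder).glob("*.mp4"))]


if __name__ == "__main__":
    for command in plan_conversions("input"):
        print(" ".join(command))
```

Each command could then be executed with `subprocess.run(command, check=True)`, provided ffmpeg is installed.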

Run the program from the terminal

Open a terminal in the cloned repository, for example with the cd command:

cd ./{path}/Elan-Marker

The following command then runs the program and marks on the timeline the words that contain the letters 's' and 'd':

python ./marcador_elan.py --filters s d

Parameters:

--filters: List of strings to filter (use lowercase)

python ./marcador_elan.py --filters s d asa

--input_folder: Folder with the input files

python ./marcador_elan.py --input_folder mp4_folder

--output_folder: Folder for output files

python ./marcador_elan.py --output_folder elan_folder

--save_temp: Keep the temporary files

python ./marcador_elan.py --save_temp

--use_wav: Skip the .mp4 to .wav conversion

python ./marcador_elan.py --use_wav

--name_model: Select whisper model

python ./marcador_elan.py --name_model medium

--language: Select the language of the audio (run with --help to see the list; default: Spanish)

python ./marcador_elan.py --language en
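
For reference, the flags listed above could be declared with argparse roughly as follows. This is a sketch, not the code of marcador_elan.py. It treats --save_temp and --use_wav as boolean switches, as in the examples above, and the default values for --input_folder, --output_folder and --name_model are assumptions. Only the Spanish default for --language comes from this document.

```python
import argparse


def build_parser():
    """Declare the command-line flags described in the Parameters list."""
    parser = argparse.ArgumentParser(
        description="Mark the words that contain the given filter strings."
    )
    parser.add_argument("--filters", nargs="+", default=[],
                        help="strings to filter by (use lowercase)")
    parser.add_argument("--input_folder", default="input",    # assumed default
                        help="folder with the input files")
    parser.add_argument("--output_folder", default="output",  # assumed default
                        help="folder for the output files")
    parser.add_argument("--save_temp", action="store_true",
                        help="keep the temporary files")
    parser.add_argument("--use_wav", action="store_true",
                        help="skip the .mp4 to .wav conversion")
    parser.add_argument("--name_model", default="small",      # assumed default
                        help="Whisper model to use")
    parser.add_argument("--language", default="es",
                        help="language of the audio (default: Spanish)")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--filters", "s", "d", "asa", "--language", "en"])
print(args.filters, args.language, args.use_wav)
```

With `nargs="+"`, several filters can follow a single --filters flag, which matches the `--filters s d asa` example.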

The generated files are written to the output folder.

Acknowledgments

whisper-timestamped: Multilingual Automatic Speech Recognition with word-level timestamps and confidence (License AGPL-3.0).
whisper: Whisper speech recognition (License MIT).
dtw-python: Dynamic Time Warping (License GPL v3).
json-to-elan: Tools and scripts for working with ELAN (License Apache-2.0).

Authors

Lucas Mesías | Joaquín Saldivia | Nicolás Aguilera

Paper Citations

If you incorporate this in your research, reference the repository as the source.

@misc{mesias2023marcadorelan,
author = {Mesías, Lucas and Saldivia, Joaquín and Aguilera, Nicolás},
month = {6},
title = {Marcador-elan},
url = {https://github.com/Klefur/Marcador-Elan/},
year = {2023}
}

Whisper-timestamped:

@misc{lintoai2023whispertimestamped,
  title={whisper-timestamped},
  author={Louradour, J{\'e}r{\^o}me},
  journal={GitHub repository},
  year={2023},
  publisher={GitHub},
  howpublished = {\url{https://github.com/linto-ai/whisper-timestamped}}
}

OpenAI Whisper paper:

@article{radford2022robust,
  title={Robust speech recognition via large-scale weak supervision},
  author={Radford, Alec and Kim, Jong Wook and Xu, Tao and Brockman, Greg and McLeavey, Christine and Sutskever, Ilya},
  journal={arXiv preprint arXiv:2212.04356},
  year={2022}
}

Dynamic-Time-Warping:

@article{JSSv031i07,
  title={Computing and Visualizing Dynamic Time Warping Alignments in R: The dtw Package},
  author={Giorgino, Toni},
  journal={Journal of Statistical Software},
  year={2009},
  volume={31},
  number={7},
  doi={10.18637/jss.v031.i07}
}
