GitHub - zabir-nabil/audioperm: A python library for generating different permutations of audible segments from audio files.

Audioperm

A python library for generating different permutations of audible segments from audio files.

pip install audioperm

Use cases:

Silence removal from audio
Audio / speech augmentation
Word segmentation
Word-level permutation generation
Synthetic data generation for deep learning
Speaker recognition, speaker verification, audio classification, audio fingerprinting

Documentation: https://zabir-nabil.github.io/audioperm/

Source Code: https://github.com/zabir-nabil/audioperm

Word segmentation

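from audioperm import AudioPerm
from audioperm.utils import save_audio

# Load the recording and its transcript
ap = AudioPerm("i_love_cats.m4a")
label = "i love cats"

# Split the audio into audible word segments
words = ap.word_segments()
label_words = label.split()

# Save each segment under its word. zip stops at the shorter list,
# so a mismatch between segments and words cannot raise an IndexError.
for word, audio in zip(label_words, words):
    save_audio(audio, word + ".wav")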

cats.wav  i_love_cats.m4a  i.wav  love.wav
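
How audioperm locates word boundaries is not described here. Purely as an illustration of the general idea, a silence-based splitter over amplitude samples could be sketched as below. `split_on_silence`, `threshold`, and `min_silence` are hypothetical names that are not part of audioperm, and plain Python lists stand in for real audio arrays to keep the sketch dependency-free.

```python
def split_on_silence(samples, threshold=0.02, min_silence=3):
    """Split samples into audible runs separated by long quiet stretches.

    Hypothetical helper for illustration; not audioperm's implementation.
    A sample counts as quiet when its absolute value is below threshold.
    A pause shorter than min_silence samples stays inside the segment.
    """
    segments = []
    current = []   # samples of the segment being built
    pending = []   # quiet samples that may still turn out to be a short pause
    for x in samples:
        if abs(x) >= threshold:
            current.extend(pending)  # short pause: keep it in the segment
            pending = []
            current.append(x)
        elif current:
            pending.append(x)
            if len(pending) >= min_silence:
                segments.append(current)  # long pause: close the segment
                current, pending = [], []
    if current:
        segments.append(current)
    return segments


samples = [0.0, 0.0, 0.5, 0.4, 0.0, 0.3, 0.0, 0.0, 0.0, 0.6, 0.7, 0.0]
segments = split_on_silence(samples)
print(len(segments))
```

Real recordings would normally use a short-time energy measure over frames rather than single samples, with the threshold tuned to the noise floor.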

Word-level permutation

from audioperm import AudioPerm
from audioperm.utils import save_audio

ap = AudioPerm("i_love_cats.m4a")
# Run word segmentation without returning the words
ap.word_segments(return_words=False)
# Generate 5 permutations of the segmented words
perm_sentences = ap.permute(n_permutations=5)

# Write each permuted sentence to its own file
for i, s in enumerate(perm_sentences):
    save_audio(s, f"perm_{i}.wav")

cats.wav         i.wav     perm_0.wav  perm_2.wav  perm_4.wav
i_love_cats.m4a  love.wav  perm_1.wav  perm_3.wav
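
To make the idea of word-level permutation concrete, here is a minimal sketch that reorders a list of segments and concatenates each ordering into one sequence. It is an assumption about the general technique, not audioperm's code: `permute_segments` and `seed` are hypothetical names, and segments are plain lists of samples.

```python
import math
import random


def permute_segments(segments, n_permutations=5, seed=None):
    """Return distinct reorderings of segments, each concatenated into one list.

    Hypothetical helper for illustration; not audioperm's implementation.
    The request is capped at the number of possible orderings (n!).
    """
    rng = random.Random(seed)
    target = min(n_permutations, math.factorial(len(segments)))
    orders = set()
    while len(orders) < target:
        order = list(range(len(segments)))
        rng.shuffle(order)
        orders.add(tuple(order))  # the set discards repeated orderings
    # Sort the orderings so the output order does not depend on set iteration
    return [[x for i in order for x in segments[i]] for order in sorted(orders)]


words = [[1, 1], [2], [3, 3, 3]]  # three toy "word" segments
perms = permute_segments(words, n_permutations=5, seed=0)
print(len(perms))
```

Drawing shuffles until enough distinct orderings are found avoids enumerating all n! permutations, which matters once a sentence has more than a handful of words.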

`permutations` on multiple files

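from audioperm import read_audio, word_segments, permutations

# Load several recordings at once
ap = read_audio(["bangla_demo.wav", "i_love_cats.m4a"])
# Segment the loaded audio into words
out = word_segments(ap)
# Generate 5 permutations from the segments
perms = permutations(out, n_permutations=5)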

Fixed-length segments

Generate fixed-length audible segments (with optional permutation/augmentation).

from audioperm import fixed_len_segments

# Save the segments under "fls_out" instead of returning them
fixed_len_segments("bangla_demo.wav", return_segments=False, save=True, save_path="fls_out", segment_size=0.5)
# Return at most 5 permuted segments without writing any files
out = fixed_len_segments("bangla_demo.wav", return_segments=True, max_segments=5, permute=True, save=False, segment_size=0.5)
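
As a rough picture of what fixed-length segmentation involves, the sketch below cuts a sample list into consecutive chunks of equal duration. It rests on assumptions not confirmed by this README: that `segment_size` is measured in seconds and that a short trailing remainder is dropped. `fixed_length_chunks` and `sample_rate` are hypothetical names, not audioperm's API.

```python
def fixed_length_chunks(samples, sample_rate, segment_size=0.5, max_segments=None):
    """Cut samples into consecutive chunks of segment_size seconds.

    Hypothetical helper for illustration; not audioperm's implementation.
    A trailing remainder shorter than one chunk is dropped (an assumption).
    """
    chunk_len = int(round(segment_size * sample_rate))
    if chunk_len <= 0:
        raise ValueError("segment_size * sample_rate must be at least 1 sample")
    chunks = [
        samples[start:start + chunk_len]
        for start in range(0, len(samples) - chunk_len + 1, chunk_len)
    ]
    # Slicing with None keeps every chunk
    return chunks[:max_segments]


samples = list(range(11))  # 11 toy samples at 8 samples per second
chunks = fixed_length_chunks(samples, sample_rate=8, segment_size=0.5)
print(chunks)
```

Dropping the remainder keeps every chunk the same length, which is usually what a fixed-size model input needs. Padding the last chunk would be the alternative.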

Support

Tested with: Python 3.6, 3.7, and 3.8

Internal audio representation: PCM 16 float32
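
One plausible reading of this line, stated here as an assumption rather than confirmed behavior, is that 16-bit PCM samples are held as float32 values scaled to the range [-1, 1). A minimal sketch of that conversion with the standard library follows; `pcm16_to_float32` and `float32_to_pcm16` are hypothetical names, not audioperm functions.

```python
from array import array


def pcm16_to_float32(pcm):
    """Scale signed 16-bit samples to float32 values in [-1.0, 1.0)."""
    return array("f", (s / 32768.0 for s in pcm))


def float32_to_pcm16(samples):
    """Scale floats back to signed 16-bit integers, clipping out-of-range values."""
    return array(
        "h",
        (max(-32768, min(32767, round(x * 32768.0))) for x in samples),
    )


pcm = array("h", [0, 16384, -32768, 32767])
as_float = pcm16_to_float32(pcm)         # typecode "f" stores 32-bit floats
round_trip = float32_to_pcm16(as_float)  # recovers the original integers
print(list(as_float)[:3])
```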

TO-DO:

Multi-channel audio
Augmentation
Multiprocessing
GPU support

Others

To run the code: Google Colab

Any contribution is welcome.
