Update KWS, MSNoise, Signalmixer Data Loaders & Evaluation Notebook, Add New Scripts for Mixed Signals #299

Merged
108 commits, merged on Jun 28, 2024
Changes from 99 commits
518b925
latest changes
EyubogluMerve Dec 11, 2023
33ee741
-- msnoise is updated with rms values
EyubogluMerve Dec 18, 2023
3cade04
-- speechmixer notebook to test snr levels
EyubogluMerve Dec 18, 2023
da27cd9
-- msnoise dataset changed to a dynamic one
EyubogluMerve Jan 1, 2024
32dabb9
Automated Evaluation in KWS
EyubogluMerve Jan 1, 2024
8cc2bd8
Automated Evaluation on KWS
EyubogluMerve Jan 1, 2024
721154a
Automated Evaluation on KWS
EyubogluMerve Jan 1, 2024
d8b4d78
Merge branch 'develop' of https://github.com/EyubogluMerve/ai8x-train…
EyubogluMerve Jan 1, 2024
7225280
Delete codes directory
EyubogluMerve Jan 1, 2024
75b96d7
Automated Evaluation on KWS
EyubogluMerve Jan 1, 2024
084a005
final upload
EyubogluMerve Jan 1, 2024
e88510f
Automated Evaluation on KWS
EyubogluMerve Jan 1, 2024
dfc3e42
Add NAS KWS model and Dynamic Augmentation
alicangok Jan 1, 2024
d873c85
Fix line endings
alicangok Jan 1, 2024
4486295
Remove utf-8 copyright character
alicangok Jan 1, 2024
76027dd
Fix import
alicangok Jan 1, 2024
211b3b5
automated evaluation on KWS notebook is added
EyubogluMerve Jan 2, 2024
e0ce102
signal mixer updated
EyubogluMerve Jan 2, 2024
81bbdbd
comments are added
EyubogluMerve Jan 2, 2024
583fe32
Automated Evaluation on KWS
EyubogluMerve Jan 2, 2024
a7797b4
Automated Evaluation on KWS
EyubogluMerve Jan 2, 2024
91293d5
Automated Evaluation on KWS
EyubogluMerve Jan 2, 2024
1b3c9f9
parameters are changed
EyubogluMerve Jan 4, 2024
14343dc
get dataset functions are changed
EyubogluMerve Jan 4, 2024
a323093
automated evaluation KWS
EyubogluMerve Jan 4, 2024
7e4cfe1
signalmixer data loader updated
EyubogluMerve Jan 4, 2024
4e78ca8
Automated evaluation on KWS
EyubogluMerve Jan 4, 2024
a8cfe11
fixing linter errors
EyubogluMerve Jan 4, 2024
798a97a
Merge mixedkws
EyubogluMerve Jan 4, 2024
67b284a
Merge branch 'MaximIntegratedAI-develop' into develop
EyubogluMerve Jan 4, 2024
398eb16
Merge branch 'kws/dynamicaug_nas' of https://github.com/alicangok/ai8…
EyubogluMerve Jan 4, 2024
993f8aa
Lint code fixes
EyubogluMerve Jan 5, 2024
516f10a
more lint fixes
EyubogluMerve Jan 5, 2024
21c9e3c
code spell fixes
EyubogluMerve Jan 5, 2024
57b7f10
fixes
EyubogluMerve Jan 5, 2024
317f143
changed files are added
EyubogluMerve Jan 5, 2024
b873e9f
name changes are done
EyubogluMerve Jan 5, 2024
cef3c35
Update msnoise.py copyrights
alicangok Jan 8, 2024
1c89b22
Update signalmixer.py copyright notices
alicangok Jan 8, 2024
8df0c2d
Update Automated_Evaluation_KWS.ipynb copyright notices
alicangok Jan 8, 2024
86b1ed9
signalmixer parameters are updated
EyubogluMerve Jan 15, 2024
a3af468
Notebook is updated using current paths
EyubogluMerve Jan 15, 2024
044339f
Correct os.path.join usage for non-Linux operating systems
alicangok Jan 15, 2024
8530368
Define `data_path` once
alicangok Jan 15, 2024
0b0952d
filtering operations are removed
EyubogluMerve Jan 30, 2024
01e7ab6
Filtering operations are removed.
EyubogluMerve Jan 30, 2024
9675d5d
lint code fixes
EyubogluMerve Jan 30, 2024
c1a9330
Unused variables are deleted.
EyubogluMerve Jan 30, 2024
24d8bbc
Several parameters are added to signalmixer.
EyubogluMerve Feb 6, 2024
40deac8
Fixes for label differences in MSnoise train/test.
EyubogluMerve Feb 6, 2024
c04513f
lint fixes
EyubogluMerve Feb 6, 2024
fedd66e
lint fixes
EyubogluMerve Feb 6, 2024
d629aa2
default values are updated
EyubogluMerve Feb 6, 2024
aa593aa
lint fix
EyubogluMerve Feb 6, 2024
d25de74
probability normalization is added
EyubogluMerve Feb 6, 2024
c13f6b9
signalmixer script
EyubogluMerve Feb 6, 2024
1ec7b38
more datasets are added
EyubogluMerve Feb 7, 2024
dbb3349
deleted script for train_kws20_signalmixer
EyubogluMerve Feb 7, 2024
f5bd8b3
codespell fix
EyubogluMerve Feb 7, 2024
665feed
Merge branch 'develop' into kws/dynamicaug_nas
EyubogluMerve Feb 7, 2024
cae7be0
Merge pull request #7 from EyubogluMerve/kws/dynamicaug_nas
EyubogluMerve Feb 7, 2024
2a020f0
updates
EyubogluMerve Mar 6, 2024
5b57666
Merge branch 'develop'
EyubogluMerve Mar 6, 2024
3d5e0fd
mixedkws lint fix
EyubogluMerve Mar 6, 2024
4c08016
Updated Signalmixer & KWS20 Data Loaders, Notebook
EyubogluMerve Mar 7, 2024
2a14438
Updated Signalmixer & KWS20 Data Loaders, Notebook
EyubogluMerve Mar 7, 2024
ef7dd59
Updates on Data Loaders
EyubogluMerve Mar 7, 2024
60c86c6
Lint code fixes
EyubogluMerve Mar 7, 2024
874032d
update
EyubogluMerve Mar 7, 2024
ec3b726
updates
EyubogluMerve Mar 7, 2024
be47b69
Updating data loaders
EyubogluMerve Mar 7, 2024
bf2a347
Merge branch 'MaximIntegratedAI:develop' into kws/signalmixer
EyubogluMerve Mar 12, 2024
8283807
KWS Noise Evaluation Notebook is Updated
EyubogluMerve Mar 13, 2024
4a48d7d
Evaluation Scripts for KWS NAS & v3-MSnoise mixed
EyubogluMerve Mar 13, 2024
36d70a3
Update kws20.py
rotx-eva Mar 13, 2024
1a5a33b
Merge branch 'develop' into kws/signalmixer
rotx-eva Mar 13, 2024
ae451ab
Merge branch 'develop' into kws/signalmixer
rotx-eva Mar 19, 2024
df2586a
Merge branch 'MaximIntegratedAI:develop' into develop
EyubogluMerve Mar 20, 2024
973e387
Several changes are added.
EyubogluMerve Mar 20, 2024
31620f1
Several changes are added.
EyubogluMerve Mar 20, 2024
98ea3b3
Merge branch 'kws/signalmixer'
EyubogluMerve Mar 20, 2024
61ba307
lint fixes
EyubogluMerve Mar 20, 2024
9968e78
Missing commits are fixed +notebook is updated
EyubogluMerve Mar 21, 2024
82c8b90
benchmark code -v1
EyubogluMerve Apr 2, 2024
b4bd6a1
final changes for benchmark
EyubogluMerve Apr 4, 2024
00960f1
lint fix
EyubogluMerve Apr 4, 2024
76b784f
more lint fixes
EyubogluMerve Apr 4, 2024
991bdc0
Merge branch 'kws/signalmixer' into kws/benchmark
EyubogluMerve Apr 4, 2024
4a59998
Update kws20.py
EyubogluMerve Apr 4, 2024
bd574eb
lint fix
EyubogluMerve Apr 4, 2024
bc910ce
Merge pull request #8 from EyubogluMerve/kws/benchmark
EyubogluMerve Apr 4, 2024
22eaa31
dataset file name fix
EyubogluMerve May 15, 2024
4a1f063
msnoise conflict fix
EyubogluMerve May 15, 2024
1d956e7
Merge branch 'develop' into kws/signalmixer
EyubogluMerve May 15, 2024
a991acb
Merge branch 'develop' into kws/signalmixer
rotx-eva May 16, 2024
e9ffaad
Merge branch 'develop' into kws/signalmixer
rotx-eva May 21, 2024
347ad05
Minor changes
EyubogluMerve May 29, 2024
302b77c
Merge branch 'kws/signalmixer' of https://github.com/EyubogluMerve/ai…
EyubogluMerve May 29, 2024
f52a367
kws35 dataset function is added
EyubogluMerve May 31, 2024
878c3a6
scripts are updated
EyubogluMerve Jun 4, 2024
4ced40c
scripts are updated correctly
EyubogluMerve Jun 4, 2024
7f787f1
kws dataset dict update -current version
EyubogluMerve Jun 12, 2024
01e5af3
KWS dataset dict changed
EyubogluMerve Jun 14, 2024
55fe241
filter silence parameter removal
EyubogluMerve Jun 14, 2024
b921c2d
Merge branch 'develop' into kws/signalmixer
rotx-eva Jun 18, 2024
7feaf9f
patch is applied to kws20.py
EyubogluMerve Jun 19, 2024
9187094
PR patch-2 applied to kws20
EyubogluMerve Jun 25, 2024
c59db1f
Merge branch 'develop' into kws/signalmixer
rotx-eva Jun 26, 2024
394 changes: 355 additions & 39 deletions datasets/kws20.py

Large diffs are not rendered by default.

493 changes: 0 additions & 493 deletions datasets/mixedkws.py

This file was deleted.

175 changes: 76 additions & 99 deletions datasets/msnoise.py
@@ -43,11 +43,11 @@ class MSnoise:
Args:
root (string): Root directory of dataset where ``MSnoise/processed/dataset.pt``
exist.
classes(array): List of keywords to be used.
d_type(string): Option for the created dataset. ``train`` or ``test``.
dataset_len(int): Dataset length to be returned.
remove_unknowns (bool, optional): If true, unchosen classes are not gathered as
the unknown class.
classes (array): List of keywords to be used.
d_type (string): Option for the created dataset. ``train`` or ``test``.
dataset_len (int): Dataset length to be returned.
exp_len (int, optional): Expected length of the 1-sec audio samples.
desired_probs (array, optional): Desired probabilities array for each noise type specified.
transform (callable, optional): A function/transform that takes in an PIL image
and returns a transformed version.
quantize (bool, optional): If true, the datasets are prepared and saved as
@@ -67,19 +67,16 @@ class MSnoise:
'Square': 18, 'SqueakyChair': 19, 'Station': 20, 'TradeShow': 21, 'Traffic': 22,
'Typing': 23, 'VacuumCleaner': 24, 'WasherDryer': 25, 'Washing': 26}

def __init__(self, root, classes, d_type, dataset_len, exp_len=16384, fs=16000,
noise_time_step=0.25, remove_unknowns=False, transform=None,
quantize=False, download=False):
def __init__(self, root, classes, d_type, dataset_len, exp_len=16384, desired_probs=None,
transform=None, quantize=False, download=False):
self.root = root
self.classes = classes
self.d_type = d_type
self.remove_unknowns = remove_unknowns
self.transform = transform

self.dataset_len = dataset_len
self.exp_len = exp_len
self.fs = fs
self.noise_time_step = noise_time_step
self.desired_probs = desired_probs

self.noise_train_folder = os.path.join(self.raw_folder, 'noise_train')
self.noise_test_folder = os.path.join(self.raw_folder, 'noise_test')
@@ -97,9 +94,6 @@ def __init__(self, root, classes, d_type, dataset_len, exp_len=16384, fs=16000,
# rms values for each sample to be returned
self.rms = np.zeros(self.dataset_len)

self.__filter_dtype()
self.__filter_classes()

@property
def raw_folder(self):
"""Folder for the raw data.
@@ -117,6 +111,13 @@ def __download(self):
self.__download_raw(self.url_train)
self.__download_raw(self.url_test)

# Fix the naming convention mismatches
for record_name in os.listdir(self.noise_test_folder):
if 'Neighbor' in record_name.split('_')[0]:
rec_pth = f'NeighborSpeaking_{record_name.split("_")[-1]}'
rec_pth = os.path.join(self.noise_test_folder, rec_pth)
os.rename(os.path.join(self.noise_test_folder, record_name), rec_pth)

def __download_raw(self, api_url):
opener = urllib.request.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
@@ -151,52 +152,6 @@ def __makedir_exist_ok(self, dirpath):
else:
raise

def __filter_dtype(self):

if self.d_type == 'train':
bool_list = [i == 0 for i in self.data_type]
idx_to_select = [i for i, x in enumerate(bool_list) if x]
elif self.d_type == 'test':
bool_list = [i == 1 for i in self.data_type]
idx_to_select = [i for i, x in enumerate(bool_list) if x]
else:
print(f'Unknown data type: {self.d_type}')
return

self.data = [self.data[i] for i in idx_to_select]
self.targets = [self.targets[i] for i in idx_to_select]
self.rms_val = [self.rms_val[i] for i in idx_to_select]
del self.data_type

def __filter_classes(self):
print('\n')
self.targets = np.array(self.targets)
initial_new_class_label = len(self.class_dict)
new_class_label = initial_new_class_label
for c in self.classes:
if c not in self.class_dict:
print(f'Class is not in the data: {c}')
return
# else:
print(f'Class {c}, {self.class_dict[c]}')
bool_list = [self.class_dict[c] == i for i in self.targets]
idx = [i for i, x in enumerate(bool_list) if x]
self.targets[idx] = new_class_label
print(f'{c}: {new_class_label - initial_new_class_label}')
new_class_label += 1

self.targets[(self.targets < initial_new_class_label)] = new_class_label
if self.remove_unknowns:
bool_list = [i != new_class_label for i in self.targets]
idx_to_keep = [i for i, x in enumerate(bool_list) if x]

self.data = [self.data[i] for i in idx_to_keep]
self.targets = [self.targets[i] for i in idx]
self.rms_val = [self.rms_val[i] for i in idx]

self.targets = [target - initial_new_class_label for target in self.targets]
print('\n')

@staticmethod
def quantize_audio(data, num_bits=8):
"""Quantize audio
@@ -213,13 +168,10 @@ def __len__(self):

def __getitem__(self, index):

rec_num = len(self.data)

rnd_num = np.random.randint(0, rec_num)
rnd_num = np.random.choice(range(len(self.data)), p=self.final_probs)
self.rms[index] = self.rms_val[rnd_num]

rec_len = len(self.data[rnd_num])

max_start_idx = rec_len - self.exp_len
start_idx = np.random.randint(0, max_start_idx)
end_idx = start_idx + self.exp_len
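For reference, a minimal, self-contained sketch of the selection logic introduced above, using hypothetical toy recordings and an already-normalized `final_probs` vector (names mirror the diff, data is made up):

```python
import numpy as np

# Hypothetical toy data: three noise recordings of different lengths and an
# already-normalized per-recording probability vector (sums to 1).
data = [np.random.randn(40000), np.random.randn(25000), np.random.randn(30000)]
final_probs = np.array([0.5, 0.25, 0.25])
exp_len = 16384  # length of the window returned per item

# Weighted pick of a recording, replacing the previous uniform randint() draw.
rnd_num = np.random.choice(range(len(data)), p=final_probs)

# Random exp_len-sample window from the chosen recording.
max_start_idx = len(data[rnd_num]) - exp_len
start_idx = np.random.randint(0, max_start_idx)
window = data[rnd_num][start_idx:start_idx + exp_len]
print(rnd_num, window.shape)  # e.g. 0 (16384,)
```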
@@ -237,44 +189,54 @@ def __reshape_audio(self, audio, row_len=128):

return torch.transpose(torch.tensor(audio.reshape((-1, row_len))), 1, 0)

def __gen_datasets(self, exp_len=16384, row_len=128, overlap_ratio=0):
def __gen_datasets(self):

with warnings.catch_warnings():
warnings.simplefilter('error')

# PARAMETERS
overlap = int(np.ceil(row_len * overlap_ratio))
num_rows = int(np.ceil(exp_len / (row_len - overlap)))
data_len = int((num_rows*row_len - (num_rows-1)*overlap))
print(f'data_len: {data_len}')

# Cleaning the duplicate labels
labels = list(self.classes)
train_list = sorted(os.listdir(self.noise_train_folder))
test_list = sorted(os.listdir(self.noise_test_folder))
labels_train = set(sorted({i.split('_')[0] for i in train_list if '_' in i}))
labels_test = set(sorted({i.split('_')[0] for i in test_list if '_' in i}))
labels = labels_train | labels_test
labels_to_remove = set()
for label in labels:
other_labels = labels - {label}
for other_label_name in other_labels:
if label in other_label_name:
labels_to_remove.add(label)
break
labels = labels - labels_to_remove
labels = sorted(labels)
print(f'Labels: {labels}')

# Folders
train_test_folders = [self.noise_train_folder, self.noise_test_folder]

if self.d_type == 'train':
check_label = labels_train
audio_folder = [self.noise_train_folder]
elif self.d_type == 'test':
check_label = labels_test
audio_folder = [self.noise_test_folder]

for label in self.classes:
if label not in check_label:
print(f'Label {label} is not in the MSnoise {self.d_type} dataset.')
labels.remove(label)

print(f'Labels for {self.d_type}: {labels}')

if self.desired_probs is None or len(self.desired_probs) != len(labels):
self.desired_probs = []
print('Each class will be selected using the same probability!')
label_count = len(labels)
for i in range(label_count):
self.desired_probs.append(1/label_count)

elif np.sum(self.desired_probs) != 1:
print('Sum of the probabilities is not 1!\n')
print('Carrying out the normal probability distribution.')
self.desired_probs = self.desired_probs / np.sum(self.desired_probs)

print(f'Desired probabilities for each class: {self.desired_probs}')

self.data_class_count = {}
data_in = []
data_type = []
data_class = []
rms_val = []

for i, label in enumerate(labels):
for folder in train_test_folders:
count = 0
for folder in audio_folder:
for record_name in sorted(os.listdir(folder)):
if record_name.split('_')[0] in label:
record_path = os.path.join(folder, record_name)
@@ -292,12 +254,24 @@ def __gen_datasets(self, exp_len=16384, row_len=128, overlap_ratio=0):

data_class.append(i)
rms_val.append(np.mean(record**2)**0.5)
count += 1
self.data_class_count[label] = count

noise_dataset = (data_in, data_class, data_type, rms_val)

final_probs = np.zeros(len(data_in))

idx = 0
for i, label in enumerate(labels):
for _ in range(self.data_class_count[label]):
final_probs[idx] = self.desired_probs[i]/self.data_class_count[label]
idx += 1
self.final_probs = final_probs
return noise_dataset
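A worked numeric illustration of the probability bookkeeping above, with made-up class counts (not actual MS-SNSD figures): each class's desired probability is normalized if needed and then split evenly across that class's recordings, so the per-recording `final_probs` sums to 1.

```python
import numpy as np

# Hypothetical counts: two noise classes and an un-normalized desired_probs.
labels = ['AirConditioner', 'Babble']
desired_probs = np.array([3.0, 1.0])
data_class_count = {'AirConditioner': 2, 'Babble': 4}

# Normalize, as the loader does when the probabilities do not sum to 1.
desired_probs = desired_probs / np.sum(desired_probs)  # -> [0.75, 0.25]

# Spread each class probability evenly over that class's recordings.
final_probs = np.concatenate([
    np.full(data_class_count[label], desired_probs[i] / data_class_count[label])
    for i, label in enumerate(labels)
])
print(final_probs)        # [0.375  0.375  0.0625 0.0625 0.0625 0.0625]
print(final_probs.sum())  # 1.0
```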


def MSnoise_get_datasets(data, load_train=True, load_test=True):
def MSnoise_get_datasets(data, desired_probs=None, train_len=346338, test_len=11005,
load_train=True, load_test=True):
"""
Load the folded 1D version of MS Scalable Noisy Speech dataset (MS-SNSD)

@@ -316,22 +290,23 @@ def MSnoise_get_datasets(data, load_train=True, load_test=True):
'Square', 'SqueakyChair', 'Station', 'Traffic',
'Typing', 'VacuumCleaner', 'WasherDryer', 'Washing', 'TradeShow']

remove_unknowns = True
transform = transforms.Compose([
ai8x.normalize(args=args)
])
quantize = True

if load_train:
train_dataset = MSnoise(root=data_dir, classes=classes, d_type='train', dataset_len=11005,
remove_unknowns=remove_unknowns, transform=transform,
train_dataset = MSnoise(root=data_dir, classes=classes, d_type='train',
dataset_len=train_len, desired_probs=desired_probs,
transform=transform,
quantize=quantize, download=True)
else:
train_dataset = None

if load_test:
test_dataset = MSnoise(root=data_dir, classes=classes, d_type='test', dataset_len=11005,
remove_unknowns=remove_unknowns, transform=transform,
test_dataset = MSnoise(root=data_dir, classes=classes, d_type='test',
dataset_len=test_len, desired_probs=desired_probs,
transform=transform,
quantize=quantize, download=True)

if args.truncate_testset:
@@ -342,7 +317,8 @@ def MSnoise_get_datasets(data, load_train=True, load_test=True):
return train_dataset, test_dataset


def MSnoise_get_unquantized_datasets(data, load_train=True, load_test=True):
def MSnoise_get_unquantized_datasets(data, desired_probs=None, train_len=346338, test_len=11005,
load_train=True, load_test=True):
"""
Load the folded 1D and unquantized version of MS Scalable Noisy Speech dataset (MS-SNSD)

@@ -360,20 +336,21 @@ def MSnoise_get_unquantized_datasets(data, load_train=True, load_test=True):
'Square', 'SqueakyChair', 'Station', 'Traffic',
'Typing', 'VacuumCleaner', 'WasherDryer', 'Washing', 'TradeShow']

remove_unknowns = True
transform = None
quantize = False

if load_train:
train_dataset = MSnoise(root=data_dir, classes=classes, d_type='train', dataset_len=11005,
remove_unknowns=remove_unknowns, transform=transform,
train_dataset = MSnoise(root=data_dir, classes=classes, d_type='train',
dataset_len=train_len, desired_probs=desired_probs,
transform=transform,
quantize=quantize, download=True)
else:
train_dataset = None

if load_test:
test_dataset = MSnoise(root=data_dir, classes=classes, d_type='test', dataset_len=11005,
remove_unknowns=remove_unknowns, transform=transform,
test_dataset = MSnoise(root=data_dir, classes=classes, d_type='test',
dataset_len=test_len, desired_probs=desired_probs,
transform=transform,
quantize=quantize, download=True)

if args.truncate_testset:
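Closing with a hedged usage sketch of the updated entry point. It assumes the `(data_dir, args)` tuple that ai8x-training dataset loaders receive and a hypothetical `args` object; the exact fields and import path may differ in practice.

```python
# Hypothetical usage sketch -- not part of the PR diff itself.
from types import SimpleNamespace

from datasets import msnoise  # assumed import path inside ai8x-training

# Minimal stand-in for the argparse namespace the loaders expect.
args = SimpleNamespace(act_mode_8bit=False, truncate_testset=False)
data = ('/path/to/data', args)

# desired_probs=None selects every noise class with equal probability; pass a
# per-class array (one entry per selected class) to bias the sampling instead.
train_set, test_set = msnoise.MSnoise_get_datasets(
    data,
    desired_probs=None,
    train_len=346338,
    test_len=11005,
)
print(len(train_set), len(test_set))
```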