Commit_Meaning



 	- Goal of this File:
	---------------------

- This file contains a description of the commits made in this 
repository.

- Commits are sorted chronologically, from oldest to newest.

- Note: any commit related to the code is denoted by
	'commit name'_code.


	*************** Commits Description **************


1) creating_sample_code:
--------------------------------

At this point, I have managed to create 1 sample from the csv files
of SpecCence data.

The sample has the form 29 x width, where width is the number of
lines in the csv files. The sample is then reshaped into a column vector:

	- shape: ((29 x width), 1)
	- recall that each line represents a window of 125 ms

I have not yet checked the time gaps between consecutive windows.
This is to be done in the next phase.
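
As an illustration of the reshape described above, here is a minimal sketch.
The random values, the small width and the variable names are assumptions made
for the example only; numpy's default row-major order is also assumed, since
the notes do not say which ordering the real code uses.

```python
import numpy as np

n_bands = 29   # number of rows of the sample, as stated above
width = 8      # hypothetical number of csv lines (windows of 125 ms)

# Stand-in data; the real values come from the csv files.
rng = np.random.default_rng(0)
sample = rng.random((n_bands, width))

# Flatten to a column vector of shape ((29 x width), 1).
column = sample.reshape(-1, 1)
print(column.shape)
```

With row-major order, the first `width` entries of the column are the first row
of the sample.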

---------------------------------------------------------------

2) adding_header_none_code:
------------------------------

header = None must be added when the csv file
does not have a header.
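
A small sketch of why this matters, assuming the csv is read with
pandas.read_csv (the notes do not name the reader explicitly). The two-line
csv text below is made up for the example.

```python
import io
import pandas as pd

# Hypothetical header-less csv content: two data rows, three columns.
csv_text = "0.125,40.1,41.3\n0.250,39.8,42.0\n"

# Default behaviour: the first data row is consumed as column names.
default = pd.read_csv(io.StringIO(csv_text))

# With header=None every row is kept as data and columns are numbered 0..n-1.
no_header = pd.read_csv(io.StringIO(csv_text), header=None)

print(len(default), len(no_header))
```

Without header=None the first window of the file is silently lost.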

3) data_per_csv_file_code:
-----------------------------

	- The construction of the data per csv file / per hour 
		is done
	- The dataloader also works


4) saving_data_npy:
--------------------

In this step, I created a function which takes a csv file (1 hour duration),
checks the time continuity of the frames, and saves these data frames in npy format
(numpy data type).
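
The continuity check and the saving step could look roughly like the sketch
below. It is not the actual function: the name split_continuous, the tolerance,
the timestamps in seconds and the output file names are all assumptions. Only
the 125 ms frame step and the 29 values per frame come from these notes.

```python
import os
import tempfile
import numpy as np

FRAME_STEP = 0.125  # seconds, one frame every 125 ms

def split_continuous(timestamps, step=FRAME_STEP, tol=1e-3):
    """Return (start, stop) index pairs of runs with no gap between frames."""
    gaps = np.diff(timestamps)
    # A break occurs wherever the gap differs from the expected step.
    breaks = np.flatnonzero(np.abs(gaps - step) > tol) + 1
    edges = np.concatenate(([0], breaks, [len(timestamps)]))
    return list(zip(edges[:-1].tolist(), edges[1:].tolist()))

# Hypothetical data: five frames with one gap after the third frame.
timestamps = np.array([0.0, 0.125, 0.25, 1.0, 1.125])
frames = np.arange(len(timestamps) * 29, dtype=float).reshape(len(timestamps), 29)

runs = split_continuous(timestamps)

# Save each continuous run as its own .npy file in a temporary directory.
out_dir = tempfile.mkdtemp()
for i, (start, stop) in enumerate(runs):
    np.save(os.path.join(out_dir, f"frames_{i}.npy"), frames[start:stop])
```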


5) Organizing_to_class:
--------------------------


Turned the helper_function into a class and cleaned up
main_data_building.

6) Adding_feature_make_directory:
----------------------------------

In the class 'Class_Dataset_Construction.py', added the option
of automatically creating the data set folder, after checking whether
it already exists.

Also fixed a bug concerning the file_name variable by
moving it before the loop.
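
A minimal sketch of the "create the folder if it does not exist" option. The
helper name ensure_dir and the folder name "dataset" are hypothetical; the real
class may do this differently.

```python
import os
import tempfile

def ensure_dir(path):
    """Create `path` if it is missing; return True only if it was created."""
    if os.path.isdir(path):
        return False
    os.makedirs(path)
    return True

# Work inside a temporary directory so the example leaves no trace.
root = os.path.join(tempfile.mkdtemp(), "dataset")
created_first = ensure_dir(root)    # folder did not exist yet
created_second = ensure_dir(root)   # folder already exists, nothing to do
print(created_first, created_second)
```

os.makedirs(path, exist_ok=True) does the same in one call when the return
value is not needed.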

7) Data_Scannnig_Construction:
----------------------

The Main_Data_Scannning.py:

This file takes the data set directory and checks for file existence in all
sensor directories.

It uses the itertools module to avoid nested loops while testing
for file existence in the different sensor directories.

It is just a demo to see how itertools.product works.
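
The idea can be sketched as follows. The sensor and file names are made up, and
the directory layout (root/sensor/file) is an assumption for the example.

```python
import itertools
import os
import tempfile

root = tempfile.mkdtemp()
sensors = ["sensor_a", "sensor_b"]   # hypothetical sensor directories
files = ["day1.csv", "day2.csv"]     # hypothetical file names

# Build a small tree where one file is missing on purpose.
for sensor in sensors:
    os.makedirs(os.path.join(root, sensor))
for sensor, name in [("sensor_a", "day1.csv"),
                     ("sensor_a", "day2.csv"),
                     ("sensor_b", "day1.csv")]:
    open(os.path.join(root, sensor, name), "w").close()

# itertools.product yields every (sensor, file) pair, replacing a nested loop.
missing = [
    (sensor, name)
    for sensor, name in itertools.product(sensors, files)
    if not os.path.isfile(os.path.join(root, sensor, name))
]
print(missing)
```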

Then I refactored the class 'Class_Dataset_Construction.py' so that it takes into account:
	- the creation of directory/sensor_name/train_spec_id.npy
		- where sensor_name is also a subdirectory
	- train_spec_id.npy naming system: train_spec_day_hour_slice

In conclusion, the class 'Class_Dataset_Construction.py' creates a proper dataset per sensor, all in one root directory whose name
is chosen by the user.


8) Data_Building_small_changes:
------------------------------------

Instead of creating a separate directory for each sensor, I now store the sensor id in a .npy file as well.

	The sensor id is created using the command np.full(self.__width, sensor_index),
	which creates a vector of length self.__width, filled with values equal
	to sensor_index (an integer).

So we have only 1 global directory for the created data set.
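
For reference, a short sketch of what np.full produces here. The width of 10^4
and the sensor index 3 are example values, standing in for self.__width and
sensor_index.

```python
import numpy as np

width = 10_000     # stand-in for self.__width
sensor_index = 3   # hypothetical integer sensor index

# Vector of length `width` where every entry equals the sensor index.
sensor_id = np.full(width, sensor_index)
print(sensor_id.shape, sensor_id[:5])
```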

Naming convention:

	- train_id_xx.npy for the sensor index, of length 10^4
	- train_time_xx.npy for the time stamps, of length 10^4
	- train_spec_xx.npy for the spectrograms, of shape (10^4 x 29)

where xx = SensorIndex_month_day_hour_slice
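
The naming convention above can be sketched like this. The helper npy_names is
hypothetical, and the exact number formatting (for example zero padding) is an
assumption, since the notes only give the order of the fields.

```python
def npy_names(sensor_index, month, day, hour, slice_index):
    """Build the three file names sharing one SensorIndex_month_day_hour_slice suffix."""
    suffix = f"{sensor_index}_{month}_{day}_{hour}_{slice_index}"
    return {kind: f"train_{kind}_{suffix}.npy" for kind in ("id", "time", "spec")}

# Example values only.
names = npy_names(sensor_index=3, month=12, day=25, hour=14, slice_index=0)
print(names["spec"])
```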


9) Data_Building_multiple_csv_files:
------------------------------------

Extended the 'Class_Dataset_Construction.py' class to handle multiple csv files
per day.


10) Audio2Vec_Encoder_Part
----------------------------

In this part I implemented the encoder part of audio2vec.

There is still an issue with the size of the data when it passes through the conv layer.

I am working on solving it.


11) Audio2Vec_Encoder_Part_2
---------------------------------

The dimension issue has been solved.

Now committing the solution.

How I solved it:

I added a padding option to the convolution operation
so we don't lose dimensions.

Also, in the max pooling operation, I set the kernel size to 1 with
a stride of 2, so that only the frequency dimension is reduced and not the time
dimension, since the time dimension is already 1 and there is nothing left to
reduce.
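
The effect of both choices can be checked with the usual output-size formula
for convolution and pooling windows, floor((n + 2*padding - kernel) / stride) + 1.
The sketch below is only arithmetic, not the encoder itself. The input size of
29 (frequency) x 1 (time) and the convolution kernel size of 3 are assumptions
for the example; the pooling kernel of 1 and stride of 2 come from the text above.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Output length along one axis of a convolution or pooling window."""
    return (n + 2 * padding - kernel) // stride + 1

freq, time = 29, 1  # assumed input size: 29 frequency bands, 1 time step

# Without padding, a kernel of 3 does not fit in a time axis of length 1.
time_no_pad = out_size(time, kernel=3)

# With padding 1 and kernel 3, both axes keep their size.
conv_out = (out_size(freq, 3, padding=1), out_size(time, 3, padding=1))

# Pooling with kernel 1 and stride 2 halves frequency and leaves time at 1.
pool_out = (out_size(conv_out[0], 1, stride=2), out_size(conv_out[1], 1, stride=2))

print(time_no_pad, conv_out, pool_out)
```

A non-positive result, as for time_no_pad, means the layer cannot be applied to
that input size.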