README.txt
============================================================
** Contact Info
============================================================
Sangeun Kum <[email protected]>
Changheun Oh <[email protected]>
Juhan Nam <[email protected]>
Korea Advanced Institute of Science and Technology
============================================================
** Description
============================================================
This is our submission to the 2016 MIREX melody extraction task.
The algorithm is a classification-based approach using deep neural networks.
The file 'main.py' is the entry point for calling the algorithm.
It takes a parameter and the full path strings of the input file and the output file.
For details on the algorithm,
please see https://wp.nyu.edu/ismir2016/wp-content/uploads/sites/2294/2016/07/119_Paper.pdf
============================================================
** Platform and Requirements
============================================================
1. OS : LINUX
2. Programming language : Python 2.7
3. Python Library :
1) Keras (Deep Learning library for Theano)
>> http://keras.io/
2) Theano (Backend of Keras)
>> http://deeplearning.net/software/theano/install.html#install
3) Librosa (for audio analysis such as loading, STFT, and resampling)
>> http://librosa.github.io/librosa/
4) ffmpeg
>> https://www.ffmpeg.org/
>> to install (with Homebrew) : brew install ffmpeg
5) Numpy, SciPy
4. Hardware
1) GPU : GeForce GTX 980
>> https://developer.nvidia.com/cuda-toolkit
5. Expected runtime : 2~3 seconds/song
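Before running, it can help to confirm that the Python libraries listed above are importable. The following is a minimal sketch, not part of the submission: the helper name check_requirements and the import names in REQUIRED_MODULES are assumptions based on the list above, and ffmpeg (an external program) is not checked.

```python
import importlib

# Import names assumed from the library list above.
# check_requirements is a hypothetical helper, not part of main.py.
REQUIRED_MODULES = ["keras", "theano", "librosa", "numpy", "scipy"]


def check_requirements(modules=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if it imports, else False."""
    status = {}
    for name in modules:
        try:
            importlib.import_module(name)
            status[name] = True
        except Exception:
            # ImportError is the usual case; a broken install may raise other errors.
            status[name] = False
    return status


# Example (not executed here):
# for name, ok in sorted(check_requirements().items()):
#     print("%-8s %s" % (name, "OK" if ok else "MISSING"))
```

Any module reported as missing should be installed before calling main.py.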
============================================================
** Use
============================================================
The algorithm is called as follows:
(to call from the command line)
>>python main.py <parameter> <input path> <output path>
ex) >>python main.py 0.2 '/home/keums/Melody/dataset/adc2004_full_set/file/pop3.wav' './SAVE_RESULTS/pop3.txt'
or
(to call from the Python shell)
>>main(param = 0.2, PATH_LOAD_FILE='/home/keums/Melody/dataset/adc2004_full_set/file/pop4.wav', PATH_SAVE_FILE='./SAVE_RESULTS/pop4.txt')
** Default param = 0.2.
If the voicing recall rate is low, increasing the param may help (0 <= param <= 1).
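To process several files, the command-line form above can be wrapped in a small script. This is a minimal sketch, assuming main.py is in the current directory and accepts exactly the three arguments shown above; build_command and run_batch are hypothetical helper names, not part of the submission.

```python
import os
import subprocess
import sys


def build_command(param, input_path, output_path, script="main.py"):
    """Build the argument list for: python main.py <parameter> <input path> <output path>."""
    if not 0.0 <= param <= 1.0:
        raise ValueError("param must satisfy 0 <= param <= 1, got %r" % (param,))
    # sys.executable is the interpreter running this script (the README targets Python 2.7).
    return [sys.executable, script, str(param), input_path, output_path]


def run_batch(input_paths, output_dir, param=0.2):
    """Run main.py once per input file, writing <name>.txt into output_dir."""
    if not os.path.isdir(output_dir):
        os.makedirs(output_dir)
    for input_path in input_paths:
        name = os.path.splitext(os.path.basename(input_path))[0]
        output_path = os.path.join(output_dir, name + ".txt")
        # check_call raises CalledProcessError if main.py exits with a non-zero status.
        subprocess.check_call(build_command(param, input_path, output_path))


# Example (not executed here; the paths are placeholders):
# run_batch(["/path/to/pop3.wav", "/path/to/pop4.wav"], "./SAVE_RESULTS")
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.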