Source, pretrained models, and experimental setup for SleepEDF-SC and SleepEDF-ST
Showing 469 changed files with 15,310 additions and 3 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,81 @@ | ||
# sleep_transfer_learning | ||
This is the place holder for the source code and the pretrained models associated with the work "Towards More Accurate Automatic Sleep Staging via Deep Transfer Learning" | ||
|
||
The source code and the pretrained models are being prepared and will be made available soon. | ||
|
||
# Towards More Accurate Automatic Sleep Staging via Deep Transfer Learning | ||
|
||
This repository contains source code, pretrained models, and experimental setup in the manuscript: | ||
|
||
- Huy Phan, Oliver Y. Chén, Philipp Koch, Zongqing Lu, Ian McLoughlin, Alfred Mertins, and Maarten De Vos. [__Towards More Accurate Automatic Sleep Staging via Deep Transfer Learning.__](https://arxiv.org/abs/1907.13177) _arXiv preprint arXiv:1907.13177_, 2019 | ||
|
||
<img src="figure/Sleep_Transfer.png" class="center" alt="Sleep Transfer Learning" width="450"/> | ||
|
||
## Data Preparation with Matlab: | ||
------------- | ||
|
||
### SeqSleepNet | ||
- Change path to `seqsleepnet/` | ||
- Run `preprare_data_sleepedf_sc.m` to prepare SleepEDF-SC data (the path to the data must be provided, refer to the script for comments). The `.mat` files generated are stored in `mat/` directory. | ||
- Run `genlist_sleepedf_sc.m` to generate list of SleepEDF-SC files for network training based on the data split in `data_split_sleepedf_sc.mat`. The files generated are stored in `tf_data/` directory. | ||
- Run `preprare_data_sleepedf_st.m` to prepare SleepEDF-ST data (the path to the data must be provided, refer to the script for comments). The `.mat` files generated are stored in `mat/` directory. | ||
- Run `genlist_sleepedf_st.m` to generate list of SleepEDF-ST files for network training based on the data split in `data_split_sleepedf_st.mat`. The files generated are stored in `tf_data/` directory. | ||
|
||
### DeepSleepNet (likewise) | ||
|
||
## Network training and evaluation with Tensorflow: | ||
------------- | ||
### SeqSleepNet | ||
- Change path to `seqsleepnet/tensorflow/seqsleepnet/` | ||
- Run the example bash scripts: | ||
|
||
- `finetune_all.sh`: finetune an entire pretrained network | ||
- `finetune_softmax_SPB.sh`: finetune softmax + sequence processing block (SPB) | ||
- `finetune_softmax_EPB.sh`: finetune softmax + epoch processing block (EPB) | ||
- `finetune_softmax.sh`: finetune softmax | ||
- `train_scratch.sh`: train a network from scratch | ||
|
||
_Note_: when the `--pretrained_model` parameter is empty, the network will be trained from scratch. Otherwise, the specified pretrained model will be loaded and finetuned with the finetuning strategy specified in the `--finetune_mode` parameter. | ||
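
The note above can be read as a small decision rule. The following Python sketch is only an illustration of that rule, not the repository's training code: the block names (`epb`, `spb`, `softmax`) and the mode strings (`all`, `softmax_spb`, `softmax_epb`, `softmax`) are hypothetical labels chosen to mirror the example scripts, and the actual values accepted by `--finetune_mode` should be taken from the bash scripts themselves.

```python
# Hypothetical block names, chosen to mirror the README wording:
# epoch processing block (EPB), sequence processing block (SPB), softmax layer.
BLOCKS = ("epb", "spb", "softmax")

# Hypothetical mode strings; the real --finetune_mode values may differ.
TRAINABLE_BY_MODE = {
    "all": ("epb", "spb", "softmax"),
    "softmax_spb": ("spb", "softmax"),
    "softmax_epb": ("epb", "softmax"),
    "softmax": ("softmax",),
}


def plan_training(pretrained_model, finetune_mode):
    """Decide whether to load weights and which blocks to update."""
    if not pretrained_model:
        # Empty --pretrained_model: train every block from scratch.
        return {"load_weights": False, "trainable": BLOCKS}
    if finetune_mode not in TRAINABLE_BY_MODE:
        raise ValueError("unknown finetune mode: %r" % (finetune_mode,))
    # Otherwise load the pretrained model and update only the selected blocks.
    return {"load_weights": True, "trainable": TRAINABLE_BY_MODE[finetune_mode]}
```

Blocks that are not listed as trainable would keep their pretrained weights frozen during finetuning.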
### DeepSleepNet (likewise) | ||
|
||
_Note_: DeepSleepNet pretrained models are quite heavy. They were uploaded separately and can be downloaded from here: [https://zenodo.org/record/3375235](https://zenodo.org/record/3375235) | ||
|
||
## Evaluation | ||
After training/finetuning and testing the network on test data: | ||
|
||
- Change path to `seqsleepnet/` or `deepsleepnet/` | ||
- Refer to `examples_evaluation.m` for examples that calculate the performance metrics. | ||
|
||
## Some results: | ||
------------- | ||
- Finetuning results with _SeqSleepNet_: | ||
|
||
 | ||
|
||
- Finetuning results with _DeepSleepNet_: | ||
|
||
 | ||
|
||
Environment: | ||
------------- | ||
- Matlab v7.3 (for data preparation) | ||
- Python3 | ||
- Tensorflow GPU versions 1.4 - 1.14 (for network training and evaluation) | ||
- numpy | ||
- scipy | ||
- sklearn | ||
- h5py | ||
|
||
## Note on the SleepEDF Expanded Database: | ||
|
||
The SleepEDF expanded database can be downloaded from https://physionet.org/content/sleep-edfx/1.0.0/. The latest version of this database contains 153 subjects in the SC subset. This experiment was intentionally conducted with the __previous version__ of the SC subset, which contains __20 subjects__, to simulate the situation of a small cohort. If you download the new version, make sure to use the 20 subjects __SC400-SC419__. | ||
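
As a rough illustration of restricting a fresh download to the 20-subject cohort, the Python sketch below filters file names by subject number. It assumes the SC files follow the pattern `SC4ssN...`, where `ss` is a two-digit subject number and `N` is the night; this pattern is an assumption about the PhysioNet naming and should be checked against the downloaded files. The function names are hypothetical and are not part of this repository.

```python
import re

# Assumed naming pattern: "SC4" + two-digit subject number + one-digit night.
_SC_NAME = re.compile(r"^SC4(\d{2})(\d)")


def is_in_small_cohort(filename):
    """True if the file belongs to subjects SC400-SC419 (numbers 00 to 19)."""
    match = _SC_NAME.match(filename)
    return match is not None and int(match.group(1)) < 20


def select_small_cohort(filenames):
    """Keep only the files of the 20-subject cohort, sorted by name."""
    return sorted(name for name in filenames if is_in_small_cohort(name))
```

For example, under the assumed pattern a file named `SC4192E0-PSG.edf` would be kept, while `SC4201E0-PSG.edf` and any `ST` file would be dropped.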
|
||
On the ST subset of the database, the experiments were conducted with 22 placebo recordings. Make sure that you refer to https://physionet.org/content/sleep-edfx/1.0.0/ST-subjects.xls to obtain the right recordings and subjects. | ||
|
||
The experiments only used the __in-bed__ parts (from _light off_ time to _light on_ time) of the recordings to avoid the dominance of Wake stage as suggested in | ||
|
||
- S. A. Imtiaz and E. Rodriguez-Villegas, __An open-source toolbox for standardized use of PhysioNet Sleep EDF Expanded Database__. _Proc. EMBC_, pp. 6014-6017, 2015. | ||
|
||
Meta information (e.g. _light off_ and _light on_ times) needed to extract the __in-bed__ parts from the whole day-night recordings is provided in `sleepedfx_meta`. | ||
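
To make the in-bed extraction concrete, here is a minimal Python sketch that converts the two times into a range of epoch indices. It assumes that both times are given in seconds from the start of the recording and that epochs are 30 seconds long; the function name and its arguments are hypothetical, and the actual layout of `sleepedfx_meta` is not described here.

```python
import math


def in_bed_epoch_range(lights_off_s, lights_on_s, n_epochs, epoch_len_s=30):
    """Return (first, stop) so that epochs[first:stop] lie fully in bed.

    Times are assumed to be seconds from the recording start.
    """
    # First epoch that starts at or after lights-off.
    first = max(0, math.ceil(lights_off_s / epoch_len_s))
    # Exclusive index of the last epoch that ends at or before lights-on.
    stop = min(n_epochs, math.floor(lights_on_s / epoch_len_s))
    if stop <= first:
        raise ValueError("empty in-bed period")
    return first, stop
```

Only whole epochs inside the interval are kept, so an epoch that straddles the _light off_ or _light on_ time is discarded in this sketch.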
|
||
Contact: | ||
------------- | ||
Huy Phan <br> | ||
Email: huy.phan{at}ieee.org or h.phan{a}kent.ac.uk |
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,97 @@ | ||
function [acc, f1, kappa, mean_sens, mean_sel] = compute_sleepedfsc_performance(ret_path) | ||
|
||
seq_len = 20; | ||
Nfold = 20; | ||
yh = cell(Nfold,1); | ||
yt = cell(Nfold,1); | ||
mat_path = './mat/sleepedf_sc/'; | ||
% load data split | ||
load('./data_split_sleepedf_sc.mat'); | ||
|
||
for fold = 1 : Nfold | ||
fold | ||
test_s = test_sub{fold}; | ||
sample_size = []; | ||
for i = 1 : numel(test_s) | ||
i | ||
for night = 1 : 2 | ||
sname = ['n', num2str(test_s(i),'%02d'), '_', num2str(night), '_eeg.mat']; | ||
% subject 13 does not have 2 nights | ||
if(~exist([mat_path, sname], 'file')) | ||
continue | ||
end | ||
load([mat_path,sname], 'label'); | ||
% this is actual output of the network as we excluded those at the | ||
% recording ends which do not constitute a full sequence | ||
sample_size = [sample_size; numel(label) - (seq_len - 1)]; | ||
% pool ground-truth labels of all test subjects | ||
yt{fold} = [yt{fold}; double(label)]; | ||
end | ||
end | ||
|
||
|
||
if(~exist([ret_path, 'n', num2str(fold),'/test_ret.mat'],'file')) | ||
disp('Returned file does not exist:') | ||
disp([ret_path, 'n', num2str(fold),'/test_ret.mat']) | ||
end | ||
|
||
load([ret_path, 'n', num2str(fold),'/test_ret.mat']); | ||
% as we shifted by one PSG epoch when generating sequences, L (sequence | ||
% length) decisions are available for each PSG epoch. This segment is | ||
% to aggregate the decisions to derive the final one. | ||
score_ = cell(1,seq_len); | ||
for n = 1 : seq_len | ||
score_{n} = softmax(squeeze(score(n,:,:))); | ||
end | ||
score = score_; | ||
clear score_; | ||
|
||
count = 0; | ||
for i = 1 : numel(test_s) | ||
for night = 1 : 2 | ||
sname = ['n', num2str(test_s(i),'%02d'), '_', num2str(night), '_eeg.mat']; | ||
if(~exist([mat_path, sname], 'file')) | ||
continue | ||
end | ||
count = count + 1; | ||
% start and end positions of current test subject's output | ||
start_pos = sum(sample_size(1:count-1)) + 1; | ||
end_pos = sum(sample_size(1:count-1)) + sample_size(count); | ||
score_i = cell(1,seq_len); | ||
for n = 1 : seq_len | ||
score_i{n} = score{n}(start_pos:end_pos, :); | ||
N = size(score_i{n},1); | ||
% padding ones for those positions not constituting full | ||
% sequences | ||
score_i{n} = [ones(seq_len-1,5); score{n}(start_pos:end_pos, :)]; | ||
score_i{n} = circshift(score_i{n}, -(seq_len - n), 1); | ||
end | ||
|
||
% multiplicative probabilistic smoothing for aggregation | ||
% which is equivalent to summation in the log domain | ||
fused_score = log(score_i{1}); | ||
for n = 2 : seq_len | ||
fused_score = fused_score + log(score_i{n}); | ||
end | ||
|
||
% the final output labels via likelihood maximization | ||
yhat = zeros(1,size(fused_score,1)); | ||
for k = 1 : size(fused_score,1) | ||
[~, yhat(k)] = max(fused_score(k,:)); | ||
end | ||
|
||
% pool outputs of all test subjects | ||
yh{fold} = [yh{fold}; double(yhat')]; | ||
end | ||
end | ||
end | ||
|
||
yh = cell2mat(yh); | ||
yt = cell2mat(yt); | ||
|
||
[acc, kappa, f1, ~, spec] = calculate_overall_metrics(yt, yh); | ||
[sens, sel] = calculate_classwise_sens_sel(yt, yh); | ||
mean_sens = mean(sens); | ||
mean_sel = mean(sel); | ||
end | ||
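
The aggregation step in the function above (padding with ones, `circshift`, and summing logs) can be restated compactly. The Python sketch below is an illustrative re-expression under stated assumptions, not a port of the MATLAB code: `probs[s][n]` is assumed to hold the class probabilities at position `n` of sequence `s`, sequences are assumed to be shifted by one epoch, labels are 0-based here (MATLAB's are 1-based), and the small clamp before the logarithm is an addition for numerical safety.

```python
import math


def aggregate_decisions(probs, seq_len):
    """Fuse overlapping per-sequence decisions into one label per epoch.

    probs[s][n] is the class-probability list for position n of sequence s.
    Sequence s covers epochs s .. s + seq_len - 1 (shift of one epoch).
    """
    n_seq = len(probs)
    n_epochs = n_seq + seq_len - 1
    n_classes = len(probs[0][0])
    # Sum of log-probabilities per epoch, i.e. the log of their product.
    log_sum = [[0.0] * n_classes for _ in range(n_epochs)]
    for s in range(n_seq):
        for n in range(seq_len):
            epoch = s + n
            for c in range(n_classes):
                # Clamp to avoid log(0); this guard is not in the original.
                log_sum[epoch][c] += math.log(max(probs[s][n][c], 1e-12))
    # Pick the class with the largest fused log-likelihood for each epoch.
    return [max(range(n_classes), key=lambda c: row[c]) for row in log_sum]
```

Epochs near the two ends of a recording are covered by fewer than `seq_len` sequences, which is what the padding with ones (log of one is zero) achieves in the MATLAB version.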
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,93 @@ | ||
function [acc, f1, kappa, mean_sens, mean_sel] = compute_sleepedfst_performance(ret_path) | ||
|
||
seq_len = 20; | ||
Nfold = 11; | ||
yh = cell(Nfold,1); | ||
yt = cell(Nfold,1); | ||
mat_path = './mat/sleepedf_st/'; | ||
% load data split | ||
load('./data_split_sleepedf_st.mat'); | ||
|
||
for fold = 1 : Nfold | ||
fold | ||
test_s = test_sub{fold}; | ||
sample_size = []; | ||
for i = 1 : numel(test_s) | ||
i | ||
sname = ['n', num2str(test_s(i),'%02d'), '_eeg.mat']; | ||
if(~exist([mat_path, sname], 'file')) | ||
continue | ||
end | ||
load([mat_path,sname], 'label'); | ||
% this is actual output of the network as we excluded those at the | ||
% recording ends which do not constitute a full sequence | ||
sample_size = [sample_size; numel(label) - (seq_len - 1)]; | ||
% pool ground-truth labels of all test subjects | ||
yt{fold} = [yt{fold}; double(label)]; | ||
end | ||
|
||
|
||
if(~exist([ret_path, 'n', num2str(fold),'/test_ret.mat'],'file')) | ||
disp('Returned file does not exist:') | ||
disp([ret_path, 'n', num2str(fold),'/test_ret.mat']) | ||
end | ||
|
||
load([ret_path, 'n', num2str(fold),'/test_ret.mat']); | ||
% as we shifted by one PSG epoch when generating sequences, L (sequence | ||
% length) decisions are available for each PSG epoch. This segment is | ||
% to aggregate the decisions to derive the final one. | ||
score_ = cell(1,seq_len); | ||
for n = 1 : seq_len | ||
score_{n} = softmax(squeeze(score(n,:,:))); | ||
end | ||
score = score_; | ||
clear score_; | ||
|
||
count = 0; | ||
for i = 1 : numel(test_s) | ||
sname = ['n', num2str(test_s(i),'%02d'), '_eeg.mat']; | ||
if(~exist([mat_path, sname], 'file')) | ||
continue | ||
end | ||
count = count + 1; | ||
% start and end positions of current test subject's output | ||
start_pos = sum(sample_size(1:count-1)) + 1; | ||
end_pos = sum(sample_size(1:count-1)) + sample_size(count); | ||
score_i = cell(1,seq_len); | ||
%valid_ind = cell(1,seq_len); | ||
for n = 1 : seq_len | ||
score_i{n} = score{n}(start_pos:end_pos, :); | ||
N = size(score_i{n},1); | ||
% padding ones for those positions not constituting full | ||
% sequences | ||
score_i{n} = [ones(seq_len-1,5); score{n}(start_pos:end_pos, :)]; | ||
score_i{n} = circshift(score_i{n}, -(seq_len - n), 1); | ||
end | ||
|
||
% multiplicative probabilistic smoothing for aggregation | ||
% which is equivalent to summation in the log domain | ||
fused_score = log(score_i{1}); | ||
for n = 2 : seq_len | ||
fused_score = fused_score + log(score_i{n}); | ||
end | ||
|
||
% the final output labels via likelihood maximization | ||
yhat = zeros(1,size(fused_score,1)); | ||
for k = 1 : size(fused_score,1) | ||
[~, yhat(k)] = max(fused_score(k,:)); | ||
end | ||
|
||
% pool outputs of all test subjects | ||
yh{fold} = [yh{fold}; double(yhat')]; | ||
end | ||
end | ||
|
||
yh = cell2mat(yh); | ||
yt = cell2mat(yt); | ||
|
||
[acc, kappa, f1, ~, spec] = calculate_overall_metrics(yt, yh); | ||
[sens, sel] = calculate_classwise_sens_sel(yt, yh); | ||
mean_sens = mean(sens); | ||
mean_sel = mean(sel); | ||
end | ||
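
Both evaluation functions locate each recording's rows in the pooled network output via `sample_size`, `start_pos`, and `end_pos`. As a hedged illustration of that bookkeeping, the Python sketch below computes the same ranges with 0-based, half-open indices (the MATLAB code uses 1-based, inclusive ones); the function name is hypothetical.

```python
def recording_slices(n_epochs_per_recording, seq_len):
    """Return (start, stop) row ranges of each recording in the pooled output.

    A recording with n epochs yields n - (seq_len - 1) full sequences,
    because sequences are shifted by one epoch and incomplete ones are dropped.
    """
    slices = []
    start = 0
    for n_epochs in n_epochs_per_recording:
        n_sequences = n_epochs - (seq_len - 1)
        slices.append((start, start + n_sequences))
        start += n_sequences
    return slices
```

For two recordings of 100 and 50 epochs with `seq_len = 20`, this gives 81 and 31 sequences, so the ranges are rows 0 to 80 and rows 81 to 111.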
|
Binary file not shown.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
%% | ||
% Examples on how to evaluate the performance | ||
%% | ||
clear all | ||
close all | ||
clc | ||
|
||
addpath('../metrics'); | ||
|
||
%% Example 1 | ||
% path to tensorflow experiments with SleepEDF-SC and the network output saved in | ||
% test_ret.mat | ||
% finetuning 2chan EEG+EOG experiment is used as the example here | ||
ret_path = './tensorflow/seqsleepnet/finetune_all_2chan/sleepedf_sc/'; | ||
|
||
[acc, f1, kappa, mean_sens, mean_sel] = compute_sleepedfsc_performance(ret_path); | ||
|
||
|
||
%% Example 2 | ||
% path to tensorflow experiments with SleepEDF-ST and the network output saved in | ||
% test_ret.mat | ||
% finetuning 2chan EEG+EOG experiment is used as the example here | ||
ret_path = './tensorflow/seqsleepnet/finetune_all_2chan/sleepedf_st/'; | ||
|
||
[acc, f1, kappa, mean_sens, mean_sel] = compute_sleepedfst_performance(ret_path); |
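
The helper `calculate_overall_metrics` is not shown in this diff, so its exact behaviour is unknown here. As a reference point only, the Python sketch below shows how overall accuracy and Cohen's kappa are conventionally computed from pooled ground-truth and predicted labels; it is a textbook formulation under that assumption, not the repository's implementation.

```python
from collections import Counter


def accuracy_and_kappa(y_true, y_pred):
    """Overall accuracy and Cohen's kappa for two equal-length label lists."""
    n = len(y_true)
    # Observed agreement: fraction of epochs with matching labels.
    p_observed = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    # Chance agreement from the marginal label frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_chance = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    # Undefined (division by zero) when chance agreement equals 1.
    kappa = (p_observed - p_chance) / (1 - p_chance)
    return p_observed, kappa
```

With `y_true = [1, 1, 2, 2]` and `y_pred = [1, 1, 2, 1]`, the observed agreement is 0.75, the chance agreement is 0.5, and kappa is 0.5.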