NormAE (Normalization Autoencoder)

It's a novel batch effects removal method based on deep autoencoder and adversarial leanring for metabolomics data. Additional classifier and ranker are trained to provide adversarial regularizations during training AE model, and latent representations are extracted by the encoder and then decoder reconstructs data without batch effects. The schematic diagram of NormAE is shown as follow.

The NormAE method was tested in two real metabolomics datasets. We show the results of Amide dataset as follow.

The results of Amide dataset. There are PCA score plots (A), heatmaps of the PCCs (B), intensity of peak M235T294 changing with injection order (C), the cumulative RSD curve of QCs (D), the number of differential peaks (E), average AUC values using same number of peaks (F), and AUC values using selected peaks based on features selection pipeline (G) before and after applying each batch effects removal method. Four colors circles refer to different batches. The solid and open circles refer to QCs and subject samples, respectively.

Paper: NormAE: A Novel Deep Adversarial Learning Model to Remove Batch Effects in Liquid Chromatography Mass Spectrometry-Based Metabolomics Data

usage: main.py [-h] [--task TASK] [--meta_data META_DATA]
               [--sample_data SAMPLE_DATA] [-td TRAIN_DATA] [-s SAVE]
               [--ae_encoder_units AE_ENCODER_UNITS [AE_ENCODER_UNITS ...]]
               [--ae_decoder_units AE_DECODER_UNITS [AE_DECODER_UNITS ...]]
               [--disc_b_units DISC_B_UNITS [DISC_B_UNITS ...]]
               [--disc_o_units DISC_O_UNITS [DISC_O_UNITS ...]]
               [--bottle_num BOTTLE_NUM]
               [--dropouts DROPOUTS DROPOUTS DROPOUTS DROPOUTS]
               [--lambda_b LAMBDA_B] [--lambda_o LAMBDA_O] [--lr_rec LR_REC]
               [--lr_disc_b LR_DISC_B] [--lr_disc_o LR_DISC_O]
               [-e EPOCH EPOCH EPOCH]
               [--use_batch_for_order USE_BATCH_FOR_ORDER] [-bs BATCH_SIZE]
               [--load LOAD] [--visdom_env VISDOM_ENV]
               [--visdom_port VISDOM_PORT] [-nw NUM_WORKERS] [--use_log]
               [--use_batch USE_BATCH] [--sample_size SAMPLE_SIZE]
               [--random_seed RANDOM_SEED] [--device {None,CPU,GPU}]

optional arguments:
  -h, --help            show this help message and exit
  --task TASK           task, train model (train, default) or remove batch
                        effects (remove)
  --meta_data META_DATA
                        the path of metabolomics data
  --sample_data SAMPLE_DATA
                        the path of sample information
  -td TRAIN_DATA, --train_data TRAIN_DATA
                        the training data, subject or all (default)
  -s SAVE, --save SAVE  the path to save results, default ./save
  --ae_encoder_units AE_ENCODER_UNITS [AE_ENCODER_UNITS ...]
                        the hidden units of encoder, default 1000, 1000
  --ae_decoder_units AE_DECODER_UNITS [AE_DECODER_UNITS ...]
                        the hidden units of decoder, default 1000, 1000
  --disc_b_units DISC_B_UNITS [DISC_B_UNITS ...]
                        the hidden units of disc_b, default 250, 250
  --disc_o_units DISC_O_UNITS [DISC_O_UNITS ...]
                        the hidden units of disc_b, default 250, 250
  --bottle_num BOTTLE_NUM
                        the number of bottle neck units, default 500
  --dropouts DROPOUTS DROPOUTS DROPOUTS DROPOUTS
                        the dropout rates of encoder, decoder, disc_b,
                        disc_o,default 0.3, 0.1, 0.3, 0.3
  --lambda_b LAMBDA_B   the weight of adversarial loss for batch labels,
                        default 1
  --lambda_o LAMBDA_O   the weight of adversarial loss for injection order,
                        default 1
  --lr_rec LR_REC       the learning rate of AE training, default 0.0002
  --lr_disc_b LR_DISC_B
                        the leanring rate of disc_b training, default 0.005
  --lr_disc_o LR_DISC_O
                        the leanring rate of disc_o training, default 0.0005
  -e EPOCH EPOCH EPOCH, --epoch EPOCH EPOCH EPOCH
                        ae pretrain, disc pretrain, iteration train
                        epochs，default (1000, 10, 700)
  --use_batch_for_order USE_BATCH_FOR_ORDER
                        if compute rank loss with batch ?, default True
  -bs BATCH_SIZE, --batch_size BATCH_SIZE
                        batch size，default 64
  --load LOAD           load trained models, default None
  --visdom_env VISDOM_ENV
                        if use visdom, it is the env name,default main
  --visdom_port VISDOM_PORT
                        if use visdom, it is the port, default 8097
  -nw NUM_WORKERS, --num_workers NUM_WORKERS
                        the number of multi cores, default 12
  --use_log             use logrithm?
  --use_batch USE_BATCH
                        use part of batches? default None
  --sample_size SAMPLE_SIZE
                        use size of part of samples? default None
  --random_seed RANDOM_SEED
                        random seed, default 1234.
  --device {None,CPU,GPU}
                        device

Contact

For more information please contact Zhiwei Rong ([email protected])

Name		Name	Last commit message	Last commit date
Latest commit History 90 Commits
imgs		imgs
LICENSE		LICENSE
README.md		README.md
config.py		config.py
datasets.py		datasets.py
main.py		main.py
metrics.py		metrics.py
networks.py		networks.py
train.py		train.py
transfer.py		transfer.py
visual.py		visual.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

NormAE (Normalization Autoencoder)

Table of Contents

Detail informations

Running time

Hardware requirements

Recommended sample size

Number of QCs

Imput format

Requirements

How to use

Data preparation

Training

Remove batch effects

Other

Contact

About

Releases

Packages

Languages

License

muccg/NormAE

Folders and files

Latest commit

History

Repository files navigation

NormAE (Normalization Autoencoder)

Table of Contents

Detail informations

Running time

Hardware requirements

Recommended sample size

Number of QCs

Imput format

Requirements

How to use

Data preparation

Training

Remove batch effects

Other

Contact

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages