RECOD Titans participation at the ISBI 2017 challenge - Part 3

ISIC 2017 Challenge Models by "RECOD Titans"

This repository is a branch of Tensorflow/models/slim containing the models implemented by RECOD "Titans" for the IEEE ISBI 2017 Challenge presented by ISIC (ISIC 2017: Skin Lesion Analysis Towards Melanoma Detection challenge / Part 3: Lesion Classification).

RECOD "Titans" got the best ROC AUC for melanoma classification (87.4%), 3rd best ROC AUC for seborrheic keratosis classification (94.3%), and 3rd best combined/mean ROC AUC (90.8%).

There's a separate repository for the models used in Part 1: Lesion Segmentation, and a technical report detailing our participation in both tasks.

Foreword

Please note: this is a beta public release. Please help us improve this code by submitting an issue if you find any problems.

Despite the best efforts of authors, reproducing the results of today's Machine Learning is challenging, due to the complexity of the machinery, which involves millions of lines of code distributed among thousands of packages, and the management of hundreds of random factors.

We are committed to alleviating that problem. We are a very small team, and unfortunately, cannot provide help with technical issues (e.g., procuring data, installing hardware or software, etc.), but we'll do our best to share the technical and scientific details needed to reproduce the results. Please see our contacts at the end of this document.

Most of the code is a direct copy of the models posted in Tensorflow/slim, adjusted to fit the challenge (dataset, data preparation, results formatting, etc.). We created the code needed for the SVM decision layers, and the final meta-learning SVM stacking.

N.B.: Our code now lags well behind the current Tensorflow/slim. If you need to contrast our code with a reference version, the March 1st 2017 commit is a good place to start. In order to run our code you don't have to download Tensorflow/slim, you just need Tensorflow-GPU v0.12 (it won't work with versions of Tensorflow after r1.0).

If you use this code in an academic context, please cite us. The main reference is the "RECOD Titans at ISIC Challenge 2017" report. If the transfer learning aspects of this work are important to your context, you might find it appropriate to cite the ISBI 2017 paper "Knowledge transfer for melanoma screening with deep learning" as well. The report and the paper are linked at the end of this file.

Requirements

Hardware: You'll need a CUDA/cuDNN compatible GPU card with enough RAM. We tested our models on NVIDIA GeForce Titan X, Titan X (Pascal), and Tesla K40c cards, all with 12 GiB of RAM.

Software: All our tests used Linux. We ran most experiments on Ubuntu 14.04.5 LTS, and some on Debian 5.4.1-4. You'll need Python 2.7, with packages tensorflow-gpu (v0.12), numpy, scipy, and sklearn. You can install those packages with pip. You'll also need curl, git, ImageMagick, and bc (the command-line basic calculator).

Docker installation: The easiest way to install the needed software is to create an nvidia-docker container from image tensorflow/tensorflow:0.12.1-gpu, and then add the remaining packages:

nvidia-docker pull tensorflow/tensorflow:0.12.1-gpu

mkdir ~/isbi2017-part3

nvidia-docker run -ti -e OUTSIDE_USER=$USER  -e OUTSIDE_UID=$UID -e OUTSIDE_GROUP=`/usr/bin/id -ng $USER` -e OUTSIDE_GID=`/usr/bin/id -g $USER` -v $HOME/isbi2017-part3:/isbi2017-part3 --name isbichallenge2017 tensorflow/tensorflow:0.12.1-gpu /bin/bash

# Inside container:
apt-get update
apt-get install git -y
apt-get install imagemagick -y
apt-get install bc -y

groupadd --gid "$OUTSIDE_GID" "$OUTSIDE_GROUP"
useradd --create-home --uid "$OUTSIDE_UID" --gid "$OUTSIDE_GID" "$OUTSIDE_USER"

su -l $OUTSIDE_USER
ln -s /isbi2017-part3 ~/isbi2017-part3

The procedure above creates a user inside the container equivalent to your external user, and maps the external directory ~/isbi2017-part3 into /isbi2017-part3 inside the container. That is highly recommended because the Docker filesystem isn't fit for extensive data manipulation.

Cloning this repository

We're assuming that you'll use the commands below to clone this repository. If you use a different path than ~/isbi2017-part3, adapt the instructions that follow as needed.

cd ~/
git clone https://github.com/learningtitans/isbi2017-part3

Obtaining and preparing the data

External data was allowed by the challenge. We collected data from several sources, listed below. Those sources are publicly obtainable — more or less easily — some requiring a license agreement, some requiring payment, some requiring both.

If you're going to use our pre-trained models (see below) and test on the official challenge validation and test splits, you just have to procure the official challenge datasets and prepare the test sets (see below). If you want to train the models "from scratch" and reproduce our steps exactly, you'll have to procure all datasets, and prepare the training set as well.

Official challenge datasets

The official ISIC 2017 Challenge dataset has 2,000 dermoscopic images (374 melanomas, 254 seborrheic keratoses, and 1,372 benign nevi). It's freely available, after signing up at the challenge website. You have to download both training and test (and validation, if desired) sets, and unzip them to the ~/isbi2017-part3/data/challenge directory (flat in the same directory, or in subdirectories, it does not matter).

The procedure: (1) Download/unzip the challenge files into ~/isbi2017-part3/data/challenge (including ground truth data). (2) Delete or move the superpixel files. One way to accomplish it:

mkdir -p ~/isbi2017-part3/data/challenge
cd ~/isbi2017-part3/data/challenge

# Download and unzip all files here

mkdir -p ~/isbi2017-part3/extras/challenge/superpixels
find . -name '*.png' -exec mv -v "{}" ~/isbi2017-part3/extras/challenge/superpixels \;

Additional ISIC Archive Images

We used additional images from the ISIC Archive, an international consortium to improve melanoma diagnosis, containing over 13,000 dermoscopic images. They're freely available. We used a relatively small subset of the Archive.

The procedure: Download the images to ~/isbi2017-part3/data/isic. One way to accomplish it:

mkdir -p ~/isbi2017-part3/data/isic/images
cd ~/isbi2017-part3/data/isic/images
while read -r imgid; do curl -o "$imgid.jpg" "https://isic-archive.com:443/api/v1/image/$imgid/download?contentDisposition=attachment"; sleep 1; done < ~/isbi2017-part3/data/isic-ids.txt

Interactive Atlas of Dermoscopy

The Interactive Atlas of Dermoscopy has 1,000+ clinical cases (270 melanomas, 49 seborrheic keratoses), each with at least two images: dermoscopic, and close-up clinical. It's available for anyone to buy for ~250€.

The procedure: (1) Insert/Mount the Atlas CD-ROM. (2) Copy all image files to ~/isbi2017-part3/data/atlas. (3) Rename all files to lowercase. One way to accomplish it:

mkdir -p ~/isbi2017-part3/data/atlas
cd ~/isbi2017-part3/data/atlas
# Adapt the path /media/cdrom below to the CD mount point
find /media/cdrom/Images -name '*.jpg' -exec sh -c 'cp -v "$1" "$(basename "$1" | tr "A-Z" "a-z")"' sh {} \;

Dermofit Image Library

The Dermofit Image Library has 1,300 images (76 melanomas, 257 seborrheic keratoses). It's available after signing a license agreement, for a fee of ~50€.

The procedure: (1) Download/unzip all the dataset files (*.zip) to ~/isbi2017-part3/data/dermofit. (2) Delete or move the mask files. One way to accomplish it:

mkdir -p ~/isbi2017-part3/data/dermofit
cd ~/isbi2017-part3/data/dermofit

# Download and unzip all files here

mkdir -p ~/isbi2017-part3/extras/dermofit/masks
find . -name '*mask.png' -exec mv -v "{}" ~/isbi2017-part3/extras/dermofit/masks \;

The IRMA Skin Lesion Dataset

The IRMA Skin Lesion Dataset has 747 dermoscopic images (187 melanomas). This dataset is unlisted, but available upon special request and the signing of a license agreement.

The procedure: Download/unzip all the images to ~/isbi2017-part3/data/irma. One way to accomplish it:

mkdir -p ~/isbi2017-part3/data/irma
cd ~/isbi2017-part3/data/irma
# Download all images here

The PH2 Dataset

The PH2 Dataset has 200 dermoscopic images (40 melanomas). It's freely available after signing a short online registration form.

The procedure: (1) Download/unzip all the images to ~/isbi2017-part3/data/ph2. (2) Delete or move the mask files. One way to accomplish it:

mkdir -p ~/isbi2017-part3/data/ph2
cd ~/isbi2017-part3/data/ph2

# Download all images here

mkdir -p ~/isbi2017-part3/extras/ph2/masks
find . -name '*_mask.bmp' -exec mv -v "{}" ~/isbi2017-part3/extras/ph2/masks \;

Integrating the dataset

The procedure: (1) Copy all images to a single folder, while resizing them to 299×299, and converting them to JPEG. (2) Repeat on another folder for a size of 224×224. One way to accomplish it:

mkdir -p ~/isbi2017-part3/data/images299
cd ~/isbi2017-part3/data/images299
find ~/isbi2017-part3/data -name '*.jpg' -exec sh -c 'echo "{}"; convert "{}" -resize 299x299\! `basename "{}"`' \;
find ~/isbi2017-part3/data -name '*.png' -exec sh -c 'echo "{}"; convert "{}" -resize 299x299\! `basename "{}" .png`.jpg' \;
find ~/isbi2017-part3/data -name '*.bmp' -exec sh -c 'echo "{}"; convert "{}" -resize 299x299\! `basename "{}" .bmp`.jpg' \;

mkdir -p ~/isbi2017-part3/data/images224
cd ~/isbi2017-part3/data/images224
find ~/isbi2017-part3/data -name '*.jpg' -exec sh -c 'echo "{}"; convert "{}" -resize 224x224\! `basename "{}"`' \;
find ~/isbi2017-part3/data -name '*.png' -exec sh -c 'echo "{}"; convert "{}" -resize 224x224\! `basename "{}" .png`.jpg' \;
find ~/isbi2017-part3/data -name '*.bmp' -exec sh -c 'echo "{}"; convert "{}" -resize 224x224\! `basename "{}" .bmp`.jpg' \;

Converting training images and metadata to Tensorflow TF-Record format

The procedure below creates the actual training sets. The training set we called "deploy" in the technical report contains all images listed in the first column of ~/isbi2017-part3/data/deploy2017.txt. The training set we called "semi" contains those images, minus those listed in ~/isbi2017-part3/data/diff-semi.txt.
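
The relation between the two training sets amounts to a set difference. The lines below are only a sketch of that idea, assuming the image identifiers have already been read into Python lists; the function name and the example identifiers are hypothetical, and this is not the code used by convert_skin_lesions.py.

```python
def semi_ids(deploy_ids, excluded_ids):
    """Return the "deploy" identifiers that are not in the exclusion list.

    The original order of `deploy_ids` is preserved.
    """
    excluded = set(excluded_ids)  # set lookup keeps the filter linear
    return [image_id for image_id in deploy_ids if image_id not in excluded]
```

For instance, semi_ids(["img_a", "img_b", "img_c"], ["img_b"]) keeps "img_a" and "img_c".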

Each training set will actually be separated into three splits: train, validation, and a vestigial test split with a handful of images (due to our reuse of generic code that assumes three splits). The train split was used to find the weights in the deep learning models, and to train the SVM layers in the models which use it. The validation split was used to compute what we called in the report "internal validation AUC", to train the stacked SVM meta-model, and — in a few cases — to establish an early-stopping procedure for the deep-learning training. We didn't use the vestigial test split.

You can inspect which images fall in which splits by listing the *.log files in the *.tfr folders created below.

The procedure:

mkdir -p ~/isbi2017-part3/data/deploy.299.tfr
python ~/isbi2017-part3/datasets/convert_skin_lesions.py TRAIN ~/isbi2017-part3/data/deploy2017.txt ~/isbi2017-part3/data/images299 ~/isbi2017-part3/data/deploy.299.tfr ~/isbi2017-part3/data/no-blacklist.txt

mkdir -p ~/isbi2017-part3/data/semi.299.tfr
python ~/isbi2017-part3/datasets/convert_skin_lesions.py TRAIN ~/isbi2017-part3/data/deploy2017.txt ~/isbi2017-part3/data/images299 ~/isbi2017-part3/data/semi.299.tfr ~/isbi2017-part3/data/diff-semi.txt

mkdir -p ~/isbi2017-part3/data/deploy.224.tfr
python ~/isbi2017-part3/datasets/convert_skin_lesions.py TRAIN ~/isbi2017-part3/data/deploy2017.txt ~/isbi2017-part3/data/images224 ~/isbi2017-part3/data/deploy.224.tfr ~/isbi2017-part3/data/no-blacklist.txt

mkdir -p ~/isbi2017-part3/data/semi.224.tfr
python ~/isbi2017-part3/datasets/convert_skin_lesions.py TRAIN ~/isbi2017-part3/data/deploy2017.txt ~/isbi2017-part3/data/images224 ~/isbi2017-part3/data/semi.224.tfr ~/isbi2017-part3/data/diff-semi.txt

Preparing the official test dataset

The procedure below converts the official test sets to tf-record:

mkdir -p ~/isbi2017-part3/data/test.299.tfr
python ~/isbi2017-part3/datasets/convert_skin_lesions.py TEST ~/isbi2017-part3/data/isbi2017_official_test_v2.txt ~/isbi2017-part3/data/images299 ~/isbi2017-part3/data/test.299.tfr ~/isbi2017-part3/data/no-blacklist.txt

mkdir -p ~/isbi2017-part3/data/test.224.tfr
python ~/isbi2017-part3/datasets/convert_skin_lesions.py TEST ~/isbi2017-part3/data/isbi2017_official_test_v2.txt ~/isbi2017-part3/data/images224 ~/isbi2017-part3/data/test.224.tfr ~/isbi2017-part3/data/no-blacklist.txt

(Optional) Preparing the official validation dataset

The procedure below converts the official validation sets to tf-record:

mkdir -p ~/isbi2017-part3/data/val.299.tfr
python ~/isbi2017-part3/datasets/convert_skin_lesions.py TEST ~/isbi2017-part3/data/isbi2017_official_validation.txt ~/isbi2017-part3/data/images299 ~/isbi2017-part3/data/val.299.tfr ~/isbi2017-part3/data/no-blacklist.txt

mkdir -p ~/isbi2017-part3/data/val.224.tfr
python ~/isbi2017-part3/datasets/convert_skin_lesions.py TEST ~/isbi2017-part3/data/isbi2017_official_validation.txt ~/isbi2017-part3/data/images224 ~/isbi2017-part3/data/val.224.tfr ~/isbi2017-part3/data/no-blacklist.txt

Pre-trained model

We released a pre-trained, ready-to-use model. That model is exactly the one used in the Challenge, so modulo bugs, package incompatibilities and random fluctuations, you should get the same AUCs as we did.

The pre-trained model consists of 3 base deep-learning models, 3 base-model SVM layers, and one final stacked SVM layer. That's a lot of parameters! The files are too big for GitHub, and are shared on figshare with DOI: 10.6084/m9.figshare.4993931. Direct links are provided below:

File                       Size  MD5
Deep Learning Model RC.25  465M  eaf7bb10806783d54c6b72c90b42486e
Deep Learning Model RC.28  465M  700d7ef0ee53e6729e8a20bdd1acf8d8
Deep Learning Model RC.30  453M  360167526d3a52fc52f7f4dced5035f1
All SVM Models             71M   ce79acca7cf7dcdeabec32ed58e4feca
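
The MD5 values listed above can be used to check that a download completed intact. The following is a minimal sketch using only the Python standard library; the function names are hypothetical and the local file names are whatever you saved the downloads as (the original file names are not stated here).

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected_md5):
    """Return True if the file's MD5 matches `expected_md5` (case-insensitive)."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

A call such as verify_md5("my-download-of-rc25", "eaf7bb10806783d54c6b72c90b42486e") should return True for an intact copy of the RC.25 file; the path in that call is a placeholder.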

Download and unzip all files into ~/isbi2017-part3/running...

mkdir -p ~/isbi2017-part3/running
# Download and unzip all files here
mkdir ~/isbi2017-part3/running/checkpoints.rc30/best # Fix path issue with pre-trained rc.30 model
ln -s ~/isbi2017-part3/running/checkpoints.rc30/model.ckpt-22907.* ~/isbi2017-part3/running/checkpoints.rc30/best

...then proceed to predicting with the model.

Training the model "from scratch"

Strictly speaking, the training will not be purely from scratch, since we will transfer knowledge from models pre-trained on ImageNet. We do not recommend — except for scientific curiosity — training strictly from scratch, since training for ImageNet is a slow and complex endeavor in itself.

We need the ImageNet weights of two models: Resnet-101 and Inception-v4, available here (or check the exact addresses at the curl commands below). Download and unzip them to ~/isbi2017-part3/running:

mkdir -p ~/isbi2017-part3/running
cd ~/isbi2017-part3/running
curl http://download.tensorflow.org/models/inception_v4_2016_09_09.tar.gz | tar xvz
curl http://download.tensorflow.org/models/resnet_v1_101_2016_08_28.tar.gz | tar xvz

Deep Learning Component Model rc25

Inception v4 trained on "deploy" dataset for 40000 batches, with per-image normalization that erases the average of the pixels.

mkdir -p ~/isbi2017-part3/running/checkpoints.rc25
cd ~/isbi2017-part3/
python train_image_classifier.py \
    --train_dir=$HOME/isbi2017-part3/running/checkpoints.rc25 \
    --dataset_dir=$HOME/isbi2017-part3/data/deploy.299.tfr \
    --dataset_name=skin_lesions \
    --task_name=label \
    --dataset_split_name=train \
    --model_name=inception_v4 \
    --preprocessing_name=dermatologic \
    --checkpoint_path=$HOME/isbi2017-part3/running/inception_v4.ckpt  \
    --checkpoint_exclude_scopes=InceptionV4/Logits,InceptionV4/AuxLogits \
    --save_interval_secs=3600 \
    --optimizer=rmsprop \
    --normalize_per_image=1 \
    --max_number_of_steps=40000 \
    --experiment_tag="Model: Inceptionv4 Train: Deploy; Normalization: mode 1, erases mean"  \
    --experiment_file=$HOME/isbi2017-part3/running/checkpoints.rc25/experiment.meta
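
As an illustration of a normalization "that erases the average of the pixels", the sketch below subtracts an image's own mean value from every pixel. It rests on an assumption: the mean is taken over all pixel values of the image at once, given here as a flat list. The function name is hypothetical, and the actual behavior of --normalize_per_image=1 in this repository may differ (for example, it could work per channel).

```python
def subtract_image_mean(pixels):
    """Return a copy of `pixels` shifted so that its average is zero.

    `pixels` is a flat sequence of numeric pixel values for ONE image
    (an assumption made for this sketch).
    """
    values = [float(v) for v in pixels]
    if not values:
        raise ValueError("cannot normalize an empty image")
    mean = sum(values) / len(values)
    return [v - mean for v in values]
```

With the input [10, 20, 30] the mean is 20, so the result is [-10.0, 0.0, 10.0].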

Deep Learning Component Model rc28

Inception v4 trained on "semi" dataset for 40000 batches, with per-image normalization that erases the average of the pixels.

mkdir -p ~/isbi2017-part3/running/checkpoints.rc28
cd ~/isbi2017-part3/
python train_image_classifier.py \
    --train_dir=$HOME/isbi2017-part3/running/checkpoints.rc28 \
    --dataset_dir=$HOME/isbi2017-part3/data/semi.299.tfr \
    --dataset_name=skin_lesions \
    --task_name=label \
    --dataset_split_name=train \
    --model_name=inception_v4 \
    --preprocessing_name=dermatologic \
    --checkpoint_path=$HOME/isbi2017-part3/running/inception_v4.ckpt  \
    --checkpoint_exclude_scopes=InceptionV4/Logits,InceptionV4/AuxLogits \
    --save_interval_secs=3600 \
    --optimizer=rmsprop \
    --normalize_per_image=1 \
    --max_number_of_steps=40000 \
    --experiment_tag="Train: Semi; Normalization: mode 1, erases mean"  \
    --experiment_file=$HOME/isbi2017-part3/running/checkpoints.rc28/experiment.meta

Deep Learning Component Model rc30

Resnet-101 v1 trained on "semi" dataset for 40000 batches, with per-image normalization that erases the average of the pixels.

mkdir -p ~/isbi2017-part3/running/checkpoints.rc30/best
cd ~/isbi2017-part3/
python train_image_classifier.py \
      --train_dir=$HOME/isbi2017-part3/running/checkpoints.rc30 \
      --dataset_dir=$HOME/isbi2017-part3/data/semi.224.tfr \
      --dataset_name=skin_lesions \
      --task_name=label \
      --dataset_split_name=train \
      --train_image_size=224 \
      --model_name=resnet_v1_101 \
      --preprocessing_name=vgg \
      --checkpoint_path=$HOME/isbi2017-part3/running/resnet_v1_101.ckpt  \
      --checkpoint_exclude_scopes=resnet_v1_101/logits \
      --save_interval_secs=3600 \
      --normalize_per_image=1 \
      --max_number_of_steps=40000 \
      --experiment_tag="Network: Resnet101"  \
      --experiment_file=$HOME/isbi2017-part3/running/checkpoints.rc30/experiment.meta

This is the only model that requires validation for early stopping. The validation loop has to run as a separate process. You have to run the command below while the training above is running (for example, in another shell):

cd ~/isbi2017-part3/
./etc/launch_validation_loop.sh RESNET \
      $HOME/isbi2017-part3/running/checkpoints.rc30 \
      $HOME/isbi2017-part3/data/semi.224.tfr

Training the SVM layer of the component models

We start by extracting the needed features from the training set:

mkdir -p ~/isbi2017-part3/running/svm.features
cd ~/isbi2017-part3/
python predict_image_classifier.py \
    --alsologtostderr \
    --checkpoint_path=$HOME/isbi2017-part3/running/checkpoints.rc25/model.ckpt-40000 \
    --dataset_dir=$HOME/isbi2017-part3/data/deploy.299.tfr \
    --dataset_name=skin_lesions \
    --task_name=label \
    --dataset_split_name=train \
    --model_name=inception_v4 \
    --preprocessing_name=dermatologic \
    --id_field_name=id \
    --eval_replicas=50 \
    --pool_features=none \
    --pool_scores=none \
    --extract_features \
    --add_scores_to_features=logits \
    --output_file=$HOME/isbi2017-part3/running/svm.features/train.50.rc25.feats \
    --output_format=pickle \
    --normalize_per_image=1

python predict_image_classifier.py \
    --alsologtostderr \
    --checkpoint_path=$HOME/isbi2017-part3/running/checkpoints.rc28/model.ckpt-40000 \
    --dataset_dir=$HOME/isbi2017-part3/data/semi.299.tfr \
    --dataset_name=skin_lesions \
    --task_name=label \
    --dataset_split_name=train \
    --model_name=inception_v4 \
    --preprocessing_name=dermatologic \
    --id_field_name=id \
    --eval_replicas=50 \
    --pool_features=none \
    --pool_scores=none \
    --extract_features \
    --add_scores_to_features=logits \
    --output_file=$HOME/isbi2017-part3/running/svm.features/train.50.rc28.feats \
    --output_format=pickle \
    --normalize_per_image=1

python etc/aggregate_pickle.py \
    $HOME/isbi2017-part3/running/svm.features/train.50.rc28.feats \
    $HOME/isbi2017-part3/running/svm.features/train.50avg.rc28.feats

Then, we train the SVM models. The train_svm_layer.py script below uses multi-threading to accelerate the training, but that has pitfalls due to joblib dealing poorly with temporary files over NFS. If you experience issues, set the environment variable JOBLIB_TEMP_FOLDER to a local directory, or change --jobs 4 to --jobs 1 (the training will be significantly slower, but still tolerable).

mkdir -p ~/isbi2017-part3/running/svm.models
cd ~/isbi2017-part3/

python train_svm_layer.py --input_training ~/isbi2017-part3/running/svm.features/train.50.rc25.feats --output_model ~/isbi2017-part3/running/svm.models/rc25.50.svm --jobs 4 --svm_method LINEAR_PRIMAL

python train_svm_layer.py --input_training ~/isbi2017-part3/running/svm.features/train.50.rc28.feats --output_model ~/isbi2017-part3/running/svm.models/rc28.50.svm --jobs 4 --svm_method LINEAR_PRIMAL

python train_svm_layer.py --input_training ~/isbi2017-part3/running/svm.features/train.50avg.rc28.feats --output_model ~/isbi2017-part3/running/svm.models/rc28.50avg.svm --jobs 4 --svm_method LINEAR_PRIMAL

Training the final SVM meta-model

We start by making the predictions in the internal validation dataset, from which the meta-model will be trained:

mkdir -p ~/isbi2017-part3/running/meta.training
cd ~/isbi2017-part3

./etc/predict_all_component_models_isbi.sh ~/isbi2017-part3/data/deploy.299.tfr ~/isbi2017-part3/data/deploy.224.tfr validation ~/isbi2017-part3/running/meta.training

Each component model is sampled thrice. The procedure below creates 100 replicas from combinations of those samples:

python etc/assemble_meta_features.py ALL_LOGITS ~/isbi2017-part3/running/meta.training ~/isbi2017-part3/running/svm.features/validation.metall.feats ~/isbi2017-part3/data/deploy2017.txt

Finally, the stacked SVM model is learned (the warnings above about joblib apply here as well):

python train_svm_layer.py --input_training ~/isbi2017-part3/running/svm.features/validation.metall.feats --output_model ~/isbi2017-part3/running/svm.models/metall.svm --jobs 4 --svm_method LINEAR_PRIMAL --max_iter_hyper 30 --preprocess NONE

Predicting with the model

The instructions below show how to make predictions with the model, by assembling the official challenge submissions.

Start by getting the predictions and features from the component models:

mkdir -p ~/isbi2017-part3/running/isbitest.features
cd ~/isbi2017-part3

./etc/predict_all_component_models_isbi.sh ~/isbi2017-part3/data/test.299.tfr ~/isbi2017-part3/data/test.224.tfr test ~/isbi2017-part3/running/isbitest.features

Each component model is sampled thrice. The procedure below creates 100 replicas from combinations of those samples:

python etc/assemble_meta_features.py ALL_LOGITS ~/isbi2017-part3/running/isbitest.features ~/isbi2017-part3/running/isbitest.features/isbitest.metall.features

Finally, get the predictions from the stacked meta-model:

mkdir -p ~/isbi2017-part3/submission/
python predict_svm_layer.py \
    --input_model ~/isbi2017-part3/running/svm.models/metall.svm  \
    --input_test ~/isbi2017-part3/running/isbitest.features/isbitest.metall.features \
    --pool_by_id xtrm \
    > ~/isbi2017-part3/submission/isbi2017-rc36xtrm.txt

(Optional) Predicting for the official validation set

Change the commands above to:

mkdir -p ~/isbi2017-part3/running/isbival.features
cd ~/isbi2017-part3

./etc/predict_all_component_models_isbi.sh ~/isbi2017-part3/data/val.299.tfr ~/isbi2017-part3/data/val.224.tfr test ~/isbi2017-part3/running/isbival.features

python etc/assemble_meta_features.py ALL_LOGITS ~/isbi2017-part3/running/isbival.features ~/isbi2017-part3/running/isbival.features/isbival.metall.features

mkdir -p ~/isbi2017-part3/submission/
python predict_svm_layer.py \
    --input_model ~/isbi2017-part3/running/svm.models/metall.svm  \
    --input_test ~/isbi2017-part3/running/isbival.features/isbival.metall.features \
    --pool_by_id xtrm \
    > ~/isbi2017-part3/submission/isbi2017-val-rc36xtrm.txt

Checking the procedure

There are two tests you can apply to check whether you've run the procedures correctly: contrast your submission files with ours, and check your submission files against the challenge ground truth.

Comparing the submission files

You cannot just diff the submission files, because the probabilities will be slightly different (due to random factors we did not control in the procedure above). However, the classification order should be the same, or almost the same between runs. Not all inversions are significant (rank inversions between images of the same class do not affect the metrics).

You can check your files against ours using the commands below:

cd ~/isbi2017-part3

python etc/count_inversions.py data/challenge/ISIC-2017_Test_v2_Part3_GroundTruth.csv data/isbi2017-titans-testv2-rc36xtrm.txt submission/isbi2017-rc36xtrm.txt
python etc/count_inversions.py data/challenge/ISIC-2017_Validation_Part3_GroundTruth.csv data/isbi2017-titans-val-rc36xtrm.txt submission/isbi2017-val-rc36xtrm.txt

In our tests we found that a few thousand significant inversions are expected.
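
To make the notion of a "significant" inversion concrete, here is a minimal sketch. It assumes binary ground-truth labels and two score lists given in the same image order, and it counts only pairs of images from different classes whose relative order differs between the two lists. The function name and inputs are hypothetical; this is not the repository's etc/count_inversions.py, whose exact counting rule may differ.

```python
from itertools import combinations


def count_significant_inversions(labels, scores_a, scores_b):
    """Count cross-class pairs ranked in opposite order by the two score lists."""
    if not (len(labels) == len(scores_a) == len(scores_b)):
        raise ValueError("all inputs must have the same length")
    count = 0
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            continue  # same-class pairs cannot change a ranking metric
        diff_a = scores_a[i] - scores_a[j]
        diff_b = scores_b[i] - scores_b[j]
        if diff_a * diff_b < 0:  # strictly opposite order
            count += 1
    return count
```

The loop is quadratic in the number of images, which is acceptable for sets with a few hundred to a few thousand entries.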

Comparing the performances

To compare performances, first download the ground truth files for the challenge test set and for the challenge validation set to ~/isbi2017-part3/data/challenge. Then run the commands below:

cd ~/isbi2017-part3

python etc/compute_metrics.py data/challenge/ISIC-2017_Test_v2_Part3_GroundTruth.csv submission/isbi2017-rc36xtrm.txt
python etc/compute_metrics.py data/challenge/ISIC-2017_Validation_Part3_GroundTruth.csv submission/isbi2017-val-rc36xtrm.txt

For the test set you should get numbers very close to:

Melanoma AUC: 0.873511
Keratosis AUC: 0.942527
Average AUC: 0.908019

For the validation set you should get numbers very close to:

Melanoma AUC: 0.907778
Keratosis AUC: 0.994929
Average AUC: 0.951354

In our tests we found that ~1 p.p. fluctuations are expected.
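
For readers who want to see how such numbers arise, the sketch below computes a ROC AUC as the fraction of (positive, negative) pairs in which the positive image gets the higher score, with ties counted as half. It is a plain-Python illustration under the assumption of binary labels (1 for the class of interest, 0 otherwise); the function name is hypothetical and this is not the repository's etc/compute_metrics.py.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC of `scores` against binary `labels` (1 = positive)."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))
```

Only cross-class pairs enter the computation, which is why reordering images within the same class leaves the AUC unchanged. The reported average is the plain mean of the two per-class values: (0.873511 + 0.942527) / 2 = 0.908019 for the test set. The sklearn package listed in the requirements offers sklearn.metrics.roc_auc_score for the same purpose.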

About us

The Learning Titans are a team of researchers led by Prof. Eduardo Valle and hosted by the RECOD Lab, at the University of Campinas, in Brazil.

Our papers and reports

A Menegola, J Tavares, M Fornaciali, LT Li, S Avila, E Valle. RECOD Titans at ISIC Challenge 2017. arXiv preprint arXiv:1703.04819 | Video presentation | PDF Presentation

A Menegola, M Fornaciali, R Pires, FV Bittencourt, S Avila, E Valle. Knowledge transfer for melanoma screening with deep learning. IEEE International Symposium on Biomedical Imaging (ISBI) 2017. arXiv preprint arXiv:1703.07479 | Video presentation | PDF Presentation

M Fornaciali, M Carvalho, FV Bittencourt, S Avila, E Valle. Towards automated melanoma screening: Proper computer vision & reliable results. arXiv preprint arXiv:1604.04024.

M Fornaciali, S Avila, M Carvalho, E Valle. Statistical learning approach for robust melanoma screening. SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI) 2014. DOI: 10.1109/SIBGRAPI.2014.48 | PDF Presentation

Robust Melanoma Screening Minisite

Copyright and license

Please check files LICENSE/AUTHORS, LICENSE/CONTRIBUTORS, and LICENSE/LICENSE.