Training of an nnUNet model #27

Open
plbenveniste opened this issue Jul 25, 2024 · 14 comments
@plbenveniste
Collaborator

plbenveniste commented Jul 25, 2024

In this issue, I detail the exploration of training an nnUNet model.
The code is in branch plb/new_nnunet.

The script nnunet/convert_msd_to_nnunet.py takes the json file of an MSD dataset and converts it to the nnUNet format.

I created the associated virtual environment on koios: conda create -n venv_nnunet python=3.9

@plbenveniste plbenveniste self-assigned this Jul 25, 2024
@plbenveniste
Collaborator Author

plbenveniste commented Jul 25, 2024

To install:

conda install pytorch torchvision pytorch-cuda=12.1 -c pytorch -c nvidia
pip install -r nnunet/requirements.txt
pip install --upgrade git+https://github.com/FabianIsensee/hiddenlayer.git

The MSD dataset was converted to the nnUNet format using the following command:

python nnunet/convert_msd_to_nnunet.py --input ~/net/ms-lesion-agnostic/msd_data/dataset_2024-07-24_seed42_lesionOnly.json -o ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/ --tasknumber 101

The environment variables were set:

export nnUNet_raw="/home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw"
export nnUNet_results="/home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results"
export nnUNet_preprocessed="/home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_preprocessed"
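
Before launching any nnUNetv2 command, it can help to confirm that the three variables above are visible to the current process. Here is a minimal sketch; the helper name `missing_nnunet_vars` is hypothetical and not part of nnUNet:

```python
import os

# The three variable names exported above.
REQUIRED_VARS = ("nnUNet_raw", "nnUNet_preprocessed", "nnUNet_results")


def missing_nnunet_vars(env=None):
    """Return the required variable names that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_nnunet_vars()
    if missing:
        print("Not set:", ", ".join(missing))
    else:
        print("All nnUNet paths are set.")
```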

The nnUNet raw data was preprocessed:

nnUNetv2_plan_and_preprocess -d 101 --verify_dataset_integrity -c 3d_fullres

I got a message saying:

...
Warning: Direction mismatch between segmentation and corresponding images. 
Direction images: (0.9998246020407097, -0.002916572623419852, 0.01850023567035481, 0.0008972689266514793, -0.9792064375432301, -0.2028643810866037, -0.018707219531766597, -0.2028453752936667, 0.9790320649327531). 
Direction seg: (0.999824551454073, -0.002916848812073564, 0.01850292634995937, 0.0008975451891515451, -0.9792064339919263, -0.20286437942668814, -0.01870990973509375, -0.20284538846575453, 0.9790320144286973). 
Image files: ['/home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset101_msLesionAgnostic/imagesTr/msLesionAgnostic_640_0000.nii.gz']. 
Seg file: /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset101_msLesionAgnostic/labelsTr/msLesionAgnostic_640.nii.gz

Traceback (most recent call last):
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/bin/nnUNetv2_plan_and_preprocess", line 8, in <module>
    sys.exit(plan_and_preprocess_entry())
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/lib/python3.9/site-packages/nnunetv2/experiment_planning/plan_and_preprocess_entrypoints.py", line 184, in plan_and_preprocess_entry
    extract_fingerprints(args.d, args.fpe, args.npfp, args.verify_dataset_integrity, args.clean, args.verbose)
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/lib/python3.9/site-packages/nnunetv2/experiment_planning/plan_and_preprocess_api.py", line 47, in extract_fingerprints
    extract_fingerprint_dataset(d, fingerprint_extractor_class, num_processes, check_dataset_integrity, clean,
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/lib/python3.9/site-packages/nnunetv2/experiment_planning/plan_and_preprocess_api.py", line 30, in extract_fingerprint_dataset
    verify_dataset_integrity(join(nnUNet_raw, dataset_name), num_processes)
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/lib/python3.9/site-packages/nnunetv2/experiment_planning/verify_dataset_integrity.py", line 220, in verify_dataset_integrity
    raise RuntimeError(
RuntimeError: Some images have errors. Please check text output above to see which one(s) and what's going on.

I need to deal with this when doing the dataset conversion (using sct_register_multimodal -identity 1).
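
To get a feel for how small this mismatch is, the two direction tuples from the warning can be compared element by element. This is only a sketch: the `tolerance` argument is a hypothetical parameter chosen by the caller, not the threshold nnUNet uses internally.

```python
# Direction cosines copied from the warning above (case msLesionAgnostic_640).
DIRECTION_IMAGE = (
    0.9998246020407097, -0.002916572623419852, 0.01850023567035481,
    0.0008972689266514793, -0.9792064375432301, -0.2028643810866037,
    -0.018707219531766597, -0.2028453752936667, 0.9790320649327531,
)
DIRECTION_SEG = (
    0.999824551454073, -0.002916848812073564, 0.01850292634995937,
    0.0008975451891515451, -0.9792064339919263, -0.20286437942668814,
    -0.01870990973509375, -0.20284538846575453, 0.9790320144286973,
)


def max_abs_difference(a, b):
    """Largest element-wise absolute difference between two direction tuples."""
    if len(a) != len(b):
        raise ValueError("direction tuples must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))


def directions_match(a, b, tolerance):
    """True if every element differs by at most `tolerance` (caller-chosen)."""
    return max_abs_difference(a, b) <= tolerance


if __name__ == "__main__":
    diff = max_abs_difference(DIRECTION_IMAGE, DIRECTION_SEG)
    print(f"largest element-wise difference: {diff:.2e}")
```

The difference is on the order of 1e-6, which suggests a rounding discrepancy in the headers rather than a genuinely different orientation.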

@valosekj
Member

I experienced something similar in the past. If you are sure that your images and labels are in the same space, which is what I would expect (you can check this, for example, by opening them in FSLeyes and checking whether a red warning appears in the bottom-left corner), then you can fix the ITK direction using this script: https://gist.github.com/valosekj/a03195d9060b0e164faff95102129feb

Alternatively, you could try setting "overwrite_image_reader_writer" to NibabelIO in dataset.json (to use nibabel instead of ITK).
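
The dataset.json change could be scripted roughly as follows. This is a sketch under the assumption that dataset.json is a flat JSON object; the function name `set_image_reader_writer` is hypothetical, and only the key and value come from the suggestion above.

```python
import json
from pathlib import Path


def set_image_reader_writer(dataset_json, reader_writer="NibabelIO"):
    """Set "overwrite_image_reader_writer" in a dataset.json file, in place."""
    path = Path(dataset_json)
    config = json.loads(path.read_text())
    # Key and value as suggested above; all other entries are left untouched.
    config["overwrite_image_reader_writer"] = reader_writer
    path.write_text(json.dumps(config, indent=4))
    return config
```

It would be called with the path to the dataset.json inside the relevant nnUNet_raw dataset folder, before re-running the preprocessing.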

@plbenveniste
Collaborator Author

plbenveniste commented Jul 31, 2024

Thanks @valosekj for your message.
I ended up using sct_register_multimodal -identity 1 so that images and labels have the same dimensions.
I then binarized every label with a threshold of 0.5 (I could have avoided this by using nearest-neighbour interpolation in the step above).
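
A toy 1D illustration of why nearest-neighbour interpolation would make the binarization step unnecessary. This is not what sct_register_multimodal does internally; it is only a sketch of the two interpolation modes applied to a binary mask (all function names are hypothetical, and `n_out` is assumed to be at least 2).

```python
def resample_linear(values, n_out):
    """Resample a 1D signal to n_out samples with linear interpolation."""
    n_in = len(values)
    out = []
    for i in range(n_out):
        x = i * (n_in - 1) / (n_out - 1)  # position in input coordinates
        lo = int(x)
        hi = min(lo + 1, n_in - 1)
        frac = x - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


def resample_nearest(values, n_out):
    """Resample a 1D signal to n_out samples by picking the nearest input sample."""
    n_in = len(values)
    return [values[int(i * (n_in - 1) / (n_out - 1) + 0.5)] for i in range(n_out)]


def binarize(values, threshold=0.5):
    """Map values strictly above the threshold to 1, everything else to 0."""
    return [1 if v > threshold else 0 for v in values]


MASK = [0, 0, 1, 1, 0]
LINEAR = resample_linear(MASK, 9)    # contains fractional values at the lesion border
NEAREST = resample_nearest(MASK, 9)  # stays strictly binary
```

Linear interpolation produces intermediate values at the mask border, which is why a threshold is needed afterwards; nearest-neighbour output only ever contains the original label values.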

Surprisingly, I still had a direction mismatch for some images, as shown below, but it didn't cause any problem.

Warning messages
Direction images: (0.9995568285353207, -4.2305939833177105e-05, 0.029768181674008378, 0.0023210087268018707, 0.9970655078825915, -0.07651788804874493, -0.02967759166276047, 0.07655306134333943, 0.9966237344998277). 
Direction seg: (0.999557527771378, 0.0011393518202445663, 0.029722897004697196, 0.001139351755611898, 0.9970662055766669, -0.07653550708132914, -0.029722896109700924, 0.0765355053186275, 0.9966237331859261). 
Image files: ['/home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/imagesTr/msLesionAgnostic_1371_0000.nii.gz']. 
Seg file: /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/labelsTr/msLesionAgnostic_1371.nii.gz

Warning: Direction mismatch between segmentation and corresponding images. 
Direction images: (0.9998727930631436, -0.01594979726329166, 4.048400076738005e-05, -0.015482158666859506, -0.9699406718540176, 0.24284850674579794, 0.003834117254632673, 0.24281823871084504, 0.9700642252615967). 
Direction seg: (0.9998746193770184, -0.015715991708088715, 0.001937302249608592, -0.01571599139066678, -0.9699406460725575, 0.24283357955510604, 0.001937302317948398, 0.24283358643519834, 0.9700660284230386). 
Image files: ['/home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/imagesTr/msLesionAgnostic_1184_0000.nii.gz']. 
Seg file: /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/labelsTr/msLesionAgnostic_1184.nii.gz

Warning: Direction mismatch between segmentation and corresponding images. 
Direction images: (0.9999650507186807, -3.2498976463678855e-05, -0.008360399655654377, -0.0019455056336788246, 0.9716359354515495, -0.23647371604735118, 0.008130950068450082, 0.23648172843373924, 0.9716019171123297). 
Direction seg: (0.9999655147443567, -0.0009890025352992376, -0.008245677081396018, -0.0009890025711708239, 0.9716363921293472, -0.2364777896768553, 0.008245677411101177, 0.2364777861953439, 0.9716019060289138). 
Image files: ['/home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/imagesTr/msLesionAgnostic_1341_0000.nii.gz']. 
Seg file: /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/labelsTr/msLesionAgnostic_1341.nii.gz

Warning: Direction mismatch between segmentation and corresponding images. 
Direction images: (0.9999461782630069, 1.4256992099983761e-05, 0.010374988239862817, 0.0021222265209242816, 0.9785737025935498, -0.20588591831855255, -0.010155625623308788, 0.2058968391915934, 0.9785210004170319). 
Direction seg: (0.9999467397067836, 0.0010682421584383053, 0.01026531114786726, 0.0010682421705256642, 0.9785742550834928, -0.20589144819860944, -0.01026531092755063, 0.20589144262566617, 0.9785209936152197). 
Image files: ['/home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/imagesTr/msLesionAgnostic_1349_0000.nii.gz']. 
Seg file: /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/labelsTr/msLesionAgnostic_1349.nii.gz

Warning: Direction mismatch between segmentation and corresponding images. 
Direction images: (0.9998737875777984, -0.015846944828360025, 0.0011327979091779968, -0.01587049846636809, -0.9995553321726754, 0.025244112225836113, -0.0007322521810542541, 0.025258904663096143, 0.9996806747991213). 
Direction seg: (0.9998742224498159, -0.01585872568560471, 0.00020027293067782567, -0.015858725394327734, -0.9995553321235096, 0.025251513487364926, 0.00020027292187368913, 0.025251511699089922, 0.9996811096331424). 
Image files: ['/home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/imagesTr/msLesionAgnostic_1199_0000.nii.gz']. 
Seg file: /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/labelsTr/msLesionAgnostic_1199.nii.gz

Warning: Direction mismatch between segmentation and corresponding images. 
Direction images: (0.999964081679984, 4.2173231885602473e-05, -0.00847546859402508, -0.001320598932424257, 0.9885497176643242, -0.15088973555037227, 0.008372058788976455, 0.15089550665014348, 0.9885142660263687). 
Direction seg: (0.9999643151625024, -0.0006392129841412547, -0.008423765271167316, -0.000639212927685442, 0.9885499483048401, -0.15089264395156346, 0.008423764504104498, 0.1508926476448042, 0.9885142640245324). 
Image files: ['/home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/imagesTr/msLesionAgnostic_1353_0000.nii.gz']. 
Seg file: /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/labelsTr/msLesionAgnostic_1353.nii.gz

Warning: Direction mismatch between segmentation and corresponding images. 
Direction images: (0.999967560007628, 2.4250750962094284e-05, -0.008054708809685418, -0.003024104554356132, 0.9279712714617492, -0.3726394708944929, 0.007465502262755987, 0.37265173923324924, 0.9279412408107873). 
Direction seg: (0.9999687649609699, -0.001499927893738001, -0.007760110282334256, -0.0014999279201193896, 0.9279724411610171, -0.37264582077690295, 0.007760110738075229, 0.3726458085930611, 0.9279412012330923). 
Image files: ['/home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/imagesTr/msLesionAgnostic_2180_0000.nii.gz']. 
Seg file: /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/labelsTr/msLesionAgnostic_2180.nii.gz

Warning: Direction mismatch between segmentation and corresponding images. 
Direction images: (0.9999675164049507, 2.735942794877295e-05, -0.008060110927302988, -0.0030159611672683187, 0.9286202535678794, -0.3710192836777322, 0.007474631305437654, 0.3710315672756014, 0.928590181808478). 
Direction seg: (0.9999687169716581, -0.0014943018706426896, -0.007767376468637484, -0.0014943019824960953, 0.9286214214917144, -0.3710256655337067, 0.007767376625409159, 0.37102563605312655, 0.9285901266856461). 
Image files: ['/home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/imagesTr/msLesionAgnostic_2183_0000.nii.gz']. 
Seg file: /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/labelsTr/msLesionAgnostic_2183.nii.gz

The dataset is being preprocessed using:

nnUNetv2_plan_and_preprocess -d 201 --verify_dataset_integrity -c 3d_fullres 2d

@plbenveniste
Collaborator Author

plbenveniste commented Jul 31, 2024

The model is now training on kronos and koios using:

  • On kronos:
CUDA_VISIBLE_DEVICES=2 nnUNetv2_train 201 3d_fullres 0 && CUDA_VISIBLE_DEVICES=2 nnUNetv2_train 201 3d_fullres 1
CUDA_VISIBLE_DEVICES=3 nnUNetv2_train 201 3d_fullres 2 && CUDA_VISIBLE_DEVICES=3 nnUNetv2_train 201 3d_fullres 3
  • On koios:
CUDA_VISIBLE_DEVICES=1 nnUNetv2_train 201 2d 0 && CUDA_VISIBLE_DEVICES=1 nnUNetv2_train 201 3d_fullres 4 
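
The chained commands above follow a fixed pattern (one GPU, several runs joined with `&&`). A small sketch that builds them; the helper name `training_chain` is hypothetical:

```python
def training_chain(gpu, dataset_id, runs):
    """Build one shell line that trains several (configuration, fold) pairs on one GPU.

    Runs are joined with '&&', so each training starts only if the previous one
    exited successfully.
    """
    return " && ".join(
        f"CUDA_VISIBLE_DEVICES={gpu} nnUNetv2_train {dataset_id} {configuration} {fold}"
        for configuration, fold in runs
    )


if __name__ == "__main__":
    # Reproduces the first kronos line above.
    print(training_chain(2, 201, [("3d_fullres", 0), ("3d_fullres", 1)]))
```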

@plbenveniste
Collaborator Author

plbenveniste commented Jul 31, 2024

I was faced with this issue:

Error message
2024-07-31 14:20:25.881668: unpacking dataset...
Error when checking /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_preprocessed/Dataset201_msLesionAgnostic/nnUNetPlans_3d_fullres/msLesionAgnostic_793.npy and /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_preprocessed/Dataset201_msLesionAgnostic/nnUNetPlans_3d_fullres/msLesionAgnostic_793_seg.npy, fixing...
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/lib/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/lib/python3.9/multiprocessing/pool.py", line 51, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/lib/python3.9/site-packages/nnunetv2/training/dataloading/utils.py", line 40, in _convert_to_npy
    np.load(seg_npy, mmap_mode='r')
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/lib/python3.9/site-packages/numpy/lib/_npyio_impl.py", line 464, in load
    raise EOFError("No data left in file")
EOFError: No data left in file
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/bin/nnUNetv2_train", line 8, in <module>
    sys.exit(run_training_entry())
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/lib/python3.9/site-packages/nnunetv2/run/run_training.py", line 274, in run_training_entry
    run_training(args.dataset_name_or_id, args.configuration, args.fold, args.tr, args.p, args.pretrained_weights,
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/lib/python3.9/site-packages/nnunetv2/run/run_training.py", line 210, in run_training
    nnunet_trainer.run_training()
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/lib/python3.9/site-packages/nnunetv2/training/nnUNetTrainer/nnUNetTrainer.py", line 1287, in run_training
    self.on_train_start()
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/lib/python3.9/site-packages/nnunetv2/training/nnUNetTrainer/nnUNetTrainer.py", line 847, in on_train_start
    unpack_dataset(self.preprocessed_dataset_folder, unpack_segmentation=True, overwrite_existing=False,
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/lib/python3.9/site-packages/nnunetv2/training/dataloading/utils.py", line 66, in unpack_dataset
    p.starmap(_convert_to_npy, zip(npz_files,
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/lib/python3.9/multiprocessing/pool.py", line 372, in starmap
    return self._map_async(func, iterable, starmapstar, chunksize).get()
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/lib/python3.9/multiprocessing/pool.py", line 771, in get
    raise self._value
EOFError: No data left in file
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/lib/python3.9/site-packages/batchgenerators/dataloading/nondet_multi_threaded_augmenter.py", line 125, in results_loop
    raise e
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/lib/python3.9/site-packages/batchgenerators/dataloading/nondet_multi_threaded_augmenter.py", line 103, in results_loop
    raise RuntimeError("One or more background workers are no longer alive. Exiting. Please check the "
RuntimeError: One or more background workers are no longer alive. Exiting. Please check the print statements above for the actual error message
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/lib/python3.9/site-packages/batchgenerators/dataloading/nondet_multi_threaded_augmenter.py", line 125, in results_loop
    raise e
  File "/home/plbenveniste/miniconda3/envs/venv_nnunet/lib/python3.9/site-packages/batchgenerators/dataloading/nondet_multi_threaded_augmenter.py", line 103, in results_loop
    raise RuntimeError("One or more background workers are no longer alive. Exiting. Please check the "
RuntimeError: One or more background workers are no longer alive. Exiting. Please check the print statements above for the actual error message

Looking into this now.

I found the solution in this issue: MIC-DKFZ/nnUNet#441. I should just delete all the .npy files and, when launching the trainings, wait for one of them to reach the GPU stage before launching the others.
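
A sketch of the cleanup step, assuming the unpacked .npy files sit somewhere under the preprocessed dataset folder. The function name `remove_unpacked_npy` is hypothetical; it defaults to a dry run so the list of files can be checked before anything is deleted.

```python
from pathlib import Path


def remove_unpacked_npy(preprocessed_dir, dry_run=True):
    """List (and optionally delete) every .npy file under `preprocessed_dir`.

    Other files in the folder (for example .npz archives) are not touched.
    """
    targets = sorted(Path(preprocessed_dir).rglob("*.npy"))
    if not dry_run:
        for path in targets:
            path.unlink()
    return targets
```

Calling it first with the default `dry_run=True` returns the files that would be removed; passing `dry_run=False` deletes them, after which the first training run unpacks the dataset again.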

@plbenveniste
Collaborator Author

plbenveniste commented Aug 2, 2024

Inference was performed with the 2d model on koios using:

CUDA_VISIBLE_DEVICES=1 nnUNetv2_predict -i /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/imagesTs/ -o /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset201_msLesionAgnostic/nnUNetTrainer__nnUNetPlans__2d/fold_0/test_set -d 201 -c 2d -f 0 -chk checkpoint_best.pth
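
The output folder in the command above follows the pattern `<trainer>__<plans>__<configuration>/fold_<n>`. A sketch that builds such paths; the helper name `prediction_dir` is hypothetical, and the `test_set` subfolder is simply the name used in this thread, not an nnUNet convention:

```python
from pathlib import PurePosixPath


def prediction_dir(results_root, dataset, configuration, fold,
                   trainer="nnUNetTrainer", plans="nnUNetPlans",
                   subdir="test_set"):
    """Build the prediction output path used in the commands of this thread."""
    model = f"{trainer}__{plans}__{configuration}"
    return PurePosixPath(results_root) / dataset / model / f"fold_{fold}" / subdir


if __name__ == "__main__":
    root = "/home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results"
    print(prediction_dir(root, "Dataset201_msLesionAgnostic", "2d", 0))
```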

The results were computed using:

python nnunet/evaluate_predictions.py -pred-folder ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset201_msLesionAgnostic/nnUNetTrainer__nnUNetPlans__2d/fold_0/test_set/ -label-folder ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/labelsTs  -image-folder ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/imagesTs/ -conversion-dict ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/conversion_dict.json -output-folder ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset201_msLesionAgnostic/nnUNetTrainer__nnUNetPlans__2d/fold_0/test_set/

And the plots were done using:

python nnunet/plot_performance.py --pred-dir-path /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset201_msLesionAgnostic/nnUNetTrainer__nnUNetPlans__2d/fold_0/test_set/ --data-json-path /home/plbenveniste/net/ms-lesion-agnostic/msd_data/dataset_2024-07-24_seed42_lesionOnly.json --split test

The results are the following:
[Figures: dice_scores_contrast, dice_scores_orientation, dice_scores_site]

@plbenveniste
Collaborator Author

plbenveniste commented Aug 2, 2024

Compared to the results of the MONAI UNet (#21 (comment)), the 2d nnUNet seems to underperform, but its performance is more consistent (lower variance).
Why? To be investigated...

I also need to look into the results of the 3D model.

@plbenveniste
Collaborator Author

plbenveniste commented Aug 9, 2024

Here we focus on the results of the 3d nnUNet. We used fold 3 for inference because it had the best Dice score:

  • fold 0 = 0.6470999717712402
  • fold 1 = 0.6467999815940857
  • fold 2 = 0.6449000239372253
  • fold 3 = 0.6585999727249146
  • fold 4 = 0.6327999830245972
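
The fold selection can be reproduced from the numbers above with a few lines; the names `FOLD_DICE` and `best_fold` are hypothetical:

```python
# Dice score per fold, copied from the list above.
FOLD_DICE = {
    0: 0.6470999717712402,
    1: 0.6467999815940857,
    2: 0.6449000239372253,
    3: 0.6585999727249146,
    4: 0.6327999830245972,
}


def best_fold(scores):
    """Return the fold index with the highest score."""
    return max(scores, key=scores.get)


if __name__ == "__main__":
    fold = best_fold(FOLD_DICE)
    print(f"best fold: {fold} (Dice {FOLD_DICE[fold]:.4f})")
```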

Inference was done using:

CUDA_VISIBLE_DEVICES=1 nnUNetv2_predict -i /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/imagesTs/ -o /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset201_msLesionAgnostic/nnUNetTrainer__nnUNetPlans__3d_fullres/fold_3/test_set -d 201 -c 3d_fullres -f 3 -chk checkpoint_best.pth

python nnunet/evaluate_predictions.py -pred-folder ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset201_msLesionAgnostic/nnUNetTrainer__nnUNetPlans__3d_fullres/fold_3/test_set/ -label-folder ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/labelsTs  -image-folder ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/imagesTs/ -conversion-dict ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/conversion_dict.json -output-folder ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset201_msLesionAgnostic/nnUNetTrainer__nnUNetPlans__3d_fullres/fold_3/test_set/

python nnunet/plot_performance.py --pred-dir-path /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset201_msLesionAgnostic/nnUNetTrainer__nnUNetPlans__3d_fullres/fold_3/test_set/ --data-json-path /home/plbenveniste/net/ms-lesion-agnostic/msd_data/dataset_2024-07-24_seed42_lesionOnly.json --split test

Here are the results for fold 3 of the 3d_fullres model:
[Figures: dice_scores_contrast, dice_scores_orientation, dice_scores_site]

The same conclusion as for the 2d model holds when comparing the 3d_fullres model to the MONAI AttentionUnet model.

@plbenveniste
Collaborator Author

plbenveniste commented Sep 25, 2024

I investigated the performance of the model on metrics other than Dice, such as F1 score, PPV and sensitivity.
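
For reference, here is a sketch of the standard definitions of these three metrics from true-positive, false-positive and false-negative counts. Whether nnunet/evaluate_predictions.py counts voxels or lesions, and how it handles empty cases, is not stated here; returning 0.0 for an empty denominator is an assumption of this sketch.

```python
def ppv(tp, fp):
    """Positive predictive value (precision): TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def sensitivity(tp, fn):
    """Sensitivity (recall): TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(tp, fp, fn):
    """F1 score: 2 * TP / (2 * TP + FP + FN)."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


if __name__ == "__main__":
    # Example counts: 8 detected correctly, 2 false alarms, 2 missed.
    print(ppv(8, 2), sensitivity(8, 2), f1_score(8, 2, 2))
```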

2D nnUNet model
To compute the performance I did the following:

python nnunet/evaluate_predictions.py -pred-folder ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset201_msLesionAgnostic/nnUNetTrainer__nnUNetPlans__2d/fold_0/test_set/ -label-folder ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/labelsTs  -image-folder ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/imagesTs/ -conversion-dict ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/conversion_dict.json -output-folder ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset201_msLesionAgnostic/nnUNetTrainer__nnUNetPlans__2d/fold_0/test_set/

To plot the performances:

python nnunet/plot_performance.py --pred-dir-path /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset201_msLesionAgnostic/nnUNetTrainer__nnUNetPlans__2d/fold_0/test_set_2/ --data-json-path /home/plbenveniste/net/ms-lesion-agnostic/msd_data/dataset_2024-07-24_seed42_lesionOnly.json --split test

Here are the results of fold 0:

[Figures: dice_scores_contrast, f1_scores_contrast, ppv_scores_contrast, sensitivity_scores_contrast]

3D nnUNet model

To perform inference on the test set with fold 0:

CUDA_VISIBLE_DEVICES=1 nnUNetv2_predict -i /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/imagesTs/ -o /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset201_msLesionAgnostic/nnUNetTrainer__nnUNetPlans__3d_fullres/fold_0/test_set  -d 201 -c 3d_fullres -f 0 -chk checkpoint_best.pth

To compute the performance I did the following:

python nnunet/evaluate_predictions.py -pred-folder ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset201_msLesionAgnostic/nnUNetTrainer__nnUNetPlans__3d_fullres/fold_0/test_set/ -label-folder ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/labelsTs  -image-folder ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/imagesTs/ -conversion-dict ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset201_msLesionAgnostic/conversion_dict.json -output-folder ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset201_msLesionAgnostic/nnUNetTrainer__nnUNetPlans__3d_fullres/fold_0/test_set/

To plot the performances:

python nnunet/plot_performance.py --pred-dir-path /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset201_msLesionAgnostic/nnUNetTrainer__nnUNetPlans__3d_fullres/fold_0/test_set/ --data-json-path /home/plbenveniste/net/ms-lesion-agnostic/msd_data/dataset_2024-07-24_seed42_lesionOnly.json --split test

Here are the results:

[Figures: dice_scores_contrast, f1_scores_contrast, ppv_scores_contrast, sensitivity_scores_contrast]

@plbenveniste
Collaborator Author

plbenveniste commented Oct 9, 2024

Based on @NathanMolinier's comment, nnUNet does not handle well cases where the images do not all have the same orientation. Therefore, I reoriented all the images to RPI during dataset conversion and then trained the model.

To convert and reorient the images:

python nnunet/convert_msd_to_nnunet_reorient.py --input ~/net/ms-lesion-agnostic/msd_data/dataset_2024-07-24_seed42_lesionOnly.json -o ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/ --tasknumber 301

To preprocess the dataset:

nnUNetv2_plan_and_preprocess -d 301 --verify_dataset_integrity -c 3d_fullres 2d

To train the model:

CUDA_VISIBLE_DEVICES=1 nnUNetv2_train 301 3d_fullres 0 

Running predictions with:

CUDA_VISIBLE_DEVICES=1 nnUNetv2_predict -i /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset301_msLesionAgnostic/imagesTs/ -o /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset301_msLesionAgnostic/nnUNetTrainer__nnUNetPlans__3d_fullres/fold_0/test_set -d 301 -c 3d_fullres -f 0 -chk checkpoint_best.pth

To evaluate the predictions:

python nnunet/evaluate_predictions.py -pred-folder /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset301_msLesionAgnostic/nnUNetTrainer__nnUNetPlans__3d_fullres/fold_0/test_set -label-folder ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset301_msLesionAgnostic/labelsTs  -image-folder ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset301_msLesionAgnostic/imagesTs/ -conversion-dict ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset301_msLesionAgnostic/conversion_dict.json -output-folder /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset301_msLesionAgnostic/nnUNetTrainer__nnUNetPlans__3d_fullres/fold_0/test_set

To plot the performances:

python nnunet/plot_performance.py --pred-dir-path /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset301_msLesionAgnostic/nnUNetTrainer__nnUNetPlans__3d_fullres/fold_0/test_set --data-json-path /home/plbenveniste/net/ms-lesion-agnostic/msd_data/dataset_2024-07-24_seed42_lesionOnly.json --split test

Output:

Dice score per contrast (mean ± std)
PSIR (n=60): 0.3566 ± 0.2495
STIR (n=11): 0.4106 ± 0.2020
T2star (n=83): 0.5161 ± 0.1918
T2w (n=358): 0.4772 ± 0.1708
UNIT1 (n=57): 0.5981 ± 0.1671
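
Because the contrasts have very different sample sizes, a single summary number has to weight each per-contrast mean by its n. A sketch of that pooled mean, using only the values listed above (the pooled value itself is derived here and was not reported separately; the names are hypothetical):

```python
# (n, mean Dice) per contrast, copied from the list above.
DICE_PER_CONTRAST = {
    "PSIR": (60, 0.3566),
    "STIR": (11, 0.4106),
    "T2star": (83, 0.5161),
    "T2w": (358, 0.4772),
    "UNIT1": (57, 0.5981),
}


def pooled_mean(per_group):
    """Sample-size-weighted mean of per-group means (equals the mean over all cases)."""
    total_n = sum(n for n, _ in per_group.values())
    return sum(n * mean for n, mean in per_group.values()) / total_n


if __name__ == "__main__":
    print(f"pooled mean Dice over all test cases: {pooled_mean(DICE_PER_CONTRAST):.4f}")
```

The pooled standard deviation cannot be recovered this way without the per-case scores, so it is left out.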

[Figure: dice_scores_contrast]

For the other metrics:
PPV score per contrast (mean ± std)
PSIR (n=60): 0.6099 ± 0.3587
STIR (n=11): 0.6107 ± 0.3465
T2star (n=83): 0.8295 ± 0.2721
T2w (n=358): 0.7411 ± 0.2799
UNIT1 (n=57): 0.9026 ± 0.2001

F1 score per contrast (mean ± std)
PSIR (n=60): 0.4982 ± 0.3108
STIR (n=11): 0.6075 ± 0.3444
T2star (n=83): 0.7550 ± 0.2446
T2w (n=358): 0.7442 ± 0.2426
UNIT1 (n=57): 0.8283 ± 0.1955

Sensitivity score per contrast (mean ± std)
PSIR (n=60): 0.5090 ± 0.3542
STIR (n=11): 0.6953 ± 0.3567
T2star (n=83): 0.7532 ± 0.2792
T2w (n=358): 0.8384 ± 0.2558
UNIT1 (n=57): 0.8170 ± 0.2392

[Figures: f1_scores_contrast, sensitivity_scores_contrast, ppv_scores_contrast]

@plbenveniste
Collaborator Author

plbenveniste commented Oct 12, 2024

Here is the training curve of the last model:
[Figure: training curve]

It seems that training could have gone on longer to improve performance. Therefore, I tried training for more epochs (2000 instead of 1000) to see whether the performance would improve.

CUDA_VISIBLE_DEVICES=3 nnUNetv2_train 301 3d_fullres 0 -tr nnUNetTrainer_2000epochs

Ran predictions using:

CUDA_VISIBLE_DEVICES=0 nnUNetv2_predict -i /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset301_msLesionAgnostic/imagesTs/ -o /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset301_msLesionAgnostic/nnUNetTrainer_2000epochs__nnUNetPlans__3d_fullres/fold_0/test_set  -d 301 -c 3d_fullres -f 0 -chk checkpoint_best.pth -tr nnUNetTrainer_2000epochs

To evaluate the predictions:

python nnunet/evaluate_predictions.py -pred-folder /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset301_msLesionAgnostic/nnUNetTrainer_2000epochs__nnUNetPlans__3d_fullres/fold_0/test_set -label-folder ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset301_msLesionAgnostic/labelsTs  -image-folder ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset301_msLesionAgnostic/imagesTs/ -conversion-dict ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset301_msLesionAgnostic/conversion_dict.json -output-folder /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset301_msLesionAgnostic/nnUNetTrainer_2000epochs__nnUNetPlans__3d_fullres/fold_0/test_set

To plot the performances:

python nnunet/plot_performance.py --pred-dir-path /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset301_msLesionAgnostic/nnUNetTrainer_2000epochs__nnUNetPlans__3d_fullres/fold_0/test_set --data-json-path /home/plbenveniste/net/ms-lesion-agnostic/msd_data/dataset_2024-07-24_seed42_lesionOnly.json --split test

Output:

Dice score per contrast (mean ± std)
PSIR (n=60): 0.3538 ± 0.2571
STIR (n=11): 0.4428 ± 0.1961
T2star (n=83): 0.4959 ± 0.2028
T2w (n=358): 0.4705 ± 0.1769
UNIT1 (n=57): 0.5865 ± 0.1715

[Figure: dice_scores_contrast]

For the other metrics:
PPV score per contrast (mean ± std)
PSIR (n=60): 0.5972 ± 0.3679
STIR (n=11): 0.6972 ± 0.3809
T2star (n=83): 0.8121 ± 0.2730
T2w (n=358): 0.7477 ± 0.2866
UNIT1 (n=57): 0.8984 ± 0.1956

F1 score per contrast (mean ± std)
PSIR (n=60): 0.4818 ± 0.3333
STIR (n=11): 0.6426 ± 0.3510
T2star (n=83): 0.7648 ± 0.2494
T2w (n=358): 0.7414 ± 0.2543
UNIT1 (n=57): 0.8266 ± 0.1892

Sensitivity score per contrast (mean ± std)
PSIR (n=60): 0.4943 ± 0.3710
STIR (n=11): 0.7284 ± 0.3621
T2star (n=83): 0.7801 ± 0.2740
T2w (n=358): 0.8262 ± 0.2714
UNIT1 (n=57): 0.8183 ± 0.2389

[Figures: f1_scores_contrast, sensitivity_scores_contrast, ppv_scores_contrast]

@plbenveniste
Collaborator Author

plbenveniste commented Oct 16, 2024

I also tried training an nnUNet with the ResEnc plans:

Here is the command:

 CUDA_VISIBLE_DEVICES=1 nnUNetv2_train 301 3d_fullres 0 -p nnUNetResEncUNetLPlans

To run predictions on the test set:

CUDA_VISIBLE_DEVICES=1 nnUNetv2_predict -i /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset301_msLesionAgnostic/imagesTs/ -o /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset301_msLesionAgnostic/nnUNetTrainer__nnUNetResEncUNetLPlans__3d_fullres/fold_0/test_set -d 301 -c 3d_fullres -f 0 -chk checkpoint_best.pth -p nnUNetResEncUNetLPlans

To evaluate the predictions:

python nnunet/evaluate_predictions.py -pred-folder /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset301_msLesionAgnostic/nnUNetTrainer__nnUNetResEncUNetLPlans__3d_fullres/fold_0/test_set -label-folder ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset301_msLesionAgnostic/labelsTs  -image-folder ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset301_msLesionAgnostic/imagesTs/ -conversion-dict ~/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw/Dataset301_msLesionAgnostic/conversion_dict.json -output-folder /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset301_msLesionAgnostic/nnUNetTrainer__nnUNetResEncUNetLPlans__3d_fullres/fold_0/test_set

To plot the performances:

python nnunet/plot_performance.py --pred-dir-path /home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results/Dataset301_msLesionAgnostic/nnUNetTrainer__nnUNetResEncUNetLPlans__3d_fullres/fold_0/test_set  --data-json-path /home/plbenveniste/net/ms-lesion-agnostic/msd_data/dataset_2024-07-24_seed42_lesionOnly.json --split test

Results:

Dice score per contrast (mean ± std)
PSIR (n=60): 0.3882 ± 0.2260
STIR (n=11): 0.4408 ± 0.1399
T2star (n=83): 0.5104 ± 0.1723
T2w (n=358): 0.4664 ± 0.1663
UNIT1 (n=57): 0.5858 ± 0.1387

[Figure: dice_scores_contrast]

For the other metrics:
PPV score per contrast (mean ± std)
PSIR (n=60): 0.6424 ± 0.3546
STIR (n=11): 0.7354 ± 0.3181
T2star (n=83): 0.8567 ± 0.2364
T2w (n=358): 0.7963 ± 0.2812
UNIT1 (n=57): 0.6546 ± 0.2443

F1 score per contrast (mean ± std)
PSIR (n=60): 0.5411 ± 0.2890
STIR (n=11): 0.6462 ± 0.3106
T2star (n=83): 0.7856 ± 0.2116
T2w (n=358): 0.7622 ± 0.2542
UNIT1 (n=57): 0.6965 ± 0.1924

Sensitivity score per contrast (mean ± std)
PSIR (n=60): 0.6182 ± 0.3294
STIR (n=11): 0.7219 ± 0.3348
T2star (n=83): 0.7915 ± 0.2508
T2w (n=358): 0.8184 ± 0.2659
UNIT1 (n=57): 0.8480 ± 0.2154

[Figures: f1_scores_contrast, sensitivity_scores_contrast, ppv_scores_contrast]
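
To compare the three Dataset301 models side by side, the mean Dice per contrast reported in this thread can be put in one table and the best model picked per contrast. This sketch only compares means and says nothing about statistical significance; the dictionary keys and the function name are hypothetical labels.

```python
# Mean Dice per contrast for the three Dataset301 models reported in this thread.
DICE_BY_MODEL = {
    "nnUNetPlans, 1000 epochs": {
        "PSIR": 0.3566, "STIR": 0.4106, "T2star": 0.5161, "T2w": 0.4772, "UNIT1": 0.5981,
    },
    "nnUNetPlans, 2000 epochs": {
        "PSIR": 0.3538, "STIR": 0.4428, "T2star": 0.4959, "T2w": 0.4705, "UNIT1": 0.5865,
    },
    "nnUNetResEncUNetLPlans": {
        "PSIR": 0.3882, "STIR": 0.4408, "T2star": 0.5104, "T2w": 0.4664, "UNIT1": 0.5858,
    },
}


def best_model_per_contrast(scores):
    """Map each contrast to the model with the highest mean Dice."""
    contrasts = next(iter(scores.values())).keys()
    return {c: max(scores, key=lambda model: scores[model][c]) for c in contrasts}


if __name__ == "__main__":
    for contrast, model in best_model_per_contrast(DICE_BY_MODEL).items():
        print(f"{contrast}: {model}")
```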

This model was used for the ACTRIMS 2025 abstract.

@plbenveniste
Collaborator Author

After aggregating more data, I used the msd data dataset_2024-11-05_seed42.json to build the nnUNet dataset 401 and msd data dataset_2024-11-06_seed42_lesionOnly.json to build the nnUNet dataset 501.

@plbenveniste
Collaborator Author

plbenveniste commented Nov 8, 2024

ResEnc nnUNet model training on more data:

I am training two new versions of ResEnc using the nnUNet framework with the newly aggregated data (see the comment just above). One is trained on all the labeled images (Dataset401); the other is trained only on the labeled images that contain lesions (Dataset501).

Code used
export nnUNet_raw="/home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_raw"
export nnUNet_results="/home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_results"
export nnUNet_preprocessed="/home/plbenveniste/net/ms-lesion-agnostic/nnunet_experiments/nnUNet_preprocessed"
nnUNetv2_plan_and_preprocess -d 401 --verify_dataset_integrity -c 3d_fullres -pl nnUNetPlannerResEncL -gpu_memory_target 47 -overwrite_plans_name nnUNetResEncUNetPlans_47G

Training

 CUDA_VISIBLE_DEVICES=1 nnUNetv2_train 401 3d_fullres 0 -p nnUNetResEncUNetPlans_47G

The same commands were used for dataset 501.

The above commands were designed to leverage the 47 GB of GPU memory. I also ran the standard planner to see whether it makes a significant difference: nnUNetv2_plan_experiment -d 401 -pl nnUNetPlannerResEncL
