09/10/2023

  • Set up the AWS datasource with a delegated-access IAM role for Lightly (see the first sketch after this list)
  • Set up the Lightly datasource under the lightly directory in ai4ls-bps-microscopy-data
  • Created the data directory in ai4ls-bps-microscopy-data
  • Create scripts to download a subset of the data locally (see the second sketch after this list)
    • 00_download_Gyhi_4hr_from_s3_source.sh
    • Write the filtered meta.csv and split its rows into per-file .json metadata files
    • Do a train/val split on the data and organize it into the directory structure required by the Lightly platform
  • Label a small validation dataset using Label Studio directly (~100 samples if possible)
  • Revisit bps_labeler/01_setup_data_local.py to divide the training and validation sets for the Lightly active learning pipeline
  • Incorporate the file metadata .json into Lightly for use and display in the Lightly platform
  • Delete the full Gy=hi, hr_post_exposure=4 subset from the AWS bucket and re-upload only the training dataset
  • Incorporate the label for the validation set under the key label, in the format that 01_setup_data_local.py expects
  • Profit 💰
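
A minimal sketch of the datasource setup for the first two items, using the Lightly Python client. The bucket paths, region, role ARN, and external ID below are placeholders, and the exact import paths can differ between lightly versions:

```python
# Hypothetical sketch: region, role ARN, and external ID are placeholders.
import os

from lightly.api import ApiWorkflowClient
from lightly.openapi_generated.swagger_client.models import (
    DatasetType,
    DatasourcePurpose,
)

client = ApiWorkflowClient(token=os.environ["LIGHTLY_TOKEN"])

# Dataset that the Lightly platform will manage.
client.create_dataset("ai4ls-bps-microscopy-data", dataset_type=DatasetType.IMAGES)

# INPUT datasource: raw images under data/ in the bucket.
client.set_s3_delegated_access_config(
    resource_path="s3://ai4ls-bps-microscopy-data/data/",
    region="us-west-2",  # placeholder
    role_arn="arn:aws:iam::123456789012:role/lightly-delegated-access",  # placeholder
    external_id="EXTERNAL_ID",  # placeholder
    purpose=DatasourcePurpose.INPUT,
)

# LIGHTLY datasource: thumbnails, metadata, and worker artifacts under lightly/.
client.set_s3_delegated_access_config(
    resource_path="s3://ai4ls-bps-microscopy-data/lightly/",
    region="us-west-2",
    role_arn="arn:aws:iam::123456789012:role/lightly-delegated-access",
    external_id="EXTERNAL_ID",
    purpose=DatasourcePurpose.LIGHTLY,
)
```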

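And a sketch of the meta.csv filtering, per-file .json metadata, and train/val split. The column names (filename, dose_Gy, particle_type, hr_post_exposure) and the train/ and val/ layout are assumptions about the BPS meta.csv and the structure the Lightly platform expects:

```python
# Minimal sketch; column names and directory layout are assumptions.
import json
import shutil
from pathlib import Path

import pandas as pd
from sklearn.model_selection import train_test_split

DATA_DIR = Path("data")
meta = pd.read_csv(DATA_DIR / "meta.csv")

# Keep the Gy=hi, hr_post_exposure=4 subset (dose column assumed categorical).
subset = meta[(meta["dose_Gy"] == "hi") & (meta["hr_post_exposure"] == 4)]
subset.to_csv(DATA_DIR / "meta_filtered.csv", index=False)

# Write each row out as its own .json metadata file, keyed by image filename.
meta_dir = DATA_DIR / "metadata"
meta_dir.mkdir(parents=True, exist_ok=True)
for _, row in subset.iterrows():
    out_path = meta_dir / f"{Path(row['filename']).stem}.json"
    out_path.write_text(json.dumps(row.to_dict(), default=str, indent=2))

# Stratified train/val split on particle_type, copied into train/ and val/.
train, val = train_test_split(
    subset, test_size=0.2, stratify=subset["particle_type"], random_state=42
)
for split_name, frame in (("train", train), ("val", val)):
    split_dir = DATA_DIR / split_name
    split_dir.mkdir(parents=True, exist_ok=True)
    for fname in frame["filename"]:
        shutil.copy2(DATA_DIR / "images" / fname, split_dir / fname)
```
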
09/11/2023-09/12/2023

  • Create scripts to download a subset of the data locally
    • 00_download_Gyhi_4hr_from_s3_source.sh
    • Write the filtered meta.csv and split its rows into per-file .json metadata files
    • Do a train/val split on the data and organize it into the directory structure required by the Lightly platform
  • Label a small validation dataset using Label Studio directly (~100 samples if possible)
  • Revisit bps_labeler/01_setup_data_local.py to divide the training and validation sets for the Lightly active learning pipeline
  • Incorporate the file metadata .json into Lightly for use and display in the Lightly platform (see the metadata-format sketch after this list)
  • Delete the full Gy=hi, hr_post_exposure=4 subset from the AWS bucket and re-upload only the training dataset
  • Incorporate the label for the validation set under the key label, in the format that 01_setup_data_local.py expects
  • Profit 💰
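
For the metadata item above, a sketch of the .lightly/metadata layout the platform reads custom metadata from. The schema fields and the per-file format follow my reading of Lightly's datasource docs and should be verified against the pinned lightly version:

```python
# Sketch of Lightly's .lightly/metadata convention (format unverified).
import json
from pathlib import Path

import pandas as pd

META_ROOT = Path("lightly/.lightly/metadata")  # under the LIGHTLY datasource prefix
META_ROOT.mkdir(parents=True, exist_ok=True)

# schema.json declares each metadata key so the platform can type and display it.
schema = [
    {"name": "Particle type", "path": "particle_type",
     "defaultValue": "unknown", "valueDataType": "CATEGORICAL_STRING"},
    {"name": "Hours post exposure", "path": "hr_post_exposure",
     "defaultValue": 0, "valueDataType": "NUMERIC_INT"},
]
(META_ROOT / "schema.json").write_text(json.dumps(schema, indent=2))

# One <image filename>.json per image, mirroring its relative path in the bucket.
meta = pd.read_csv("data/meta_filtered.csv")  # filtered meta.csv from the earlier step
for _, row in meta.iterrows():
    entry = {
        "file_name": row["filename"],
        "type": "image",
        "metadata": {
            "particle_type": row["particle_type"],
            "hr_post_exposure": int(row["hr_post_exposure"]),
        },
    }
    out_path = META_ROOT / f"{row['filename']}.json"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(json.dumps(entry, indent=2))
```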

09/12/2023

  • Completed the data setup with the final script, 02_upload_training_set_Gyhi_4hr_to_s3_dest.sh
  • Write a script to start the Lightly worker with the token read from the .env file (see the sketch after this list)
  • Troubleshoot 03_run_first_selection.py for the case where the Lightly dataset already exists, and for correctly reading the AWS S3 tokens from the .env file
  • Configure the sampling strategy to balance selection on particle_type as well as to maximize embedding diversity for self-supervised learning
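
A sketch covering the last three items, assuming python-dotenv for the .env handling. The docker command, the selection-config schema (strategy types and the BALANCE-on-metadata form), and the "X"/"Fe" particle_type values are assumptions to check against the Lightly worker docs:

```python
# Sketch: read the token from .env, register a worker, and schedule a run
# that balances particle_type while diversifying the SSL embeddings.
import os

from dotenv import load_dotenv
from lightly.api import ApiWorkflowClient

load_dotenv()  # expects LIGHTLY_TOKEN (and AWS credentials) in .env
client = ApiWorkflowClient(token=os.environ["LIGHTLY_TOKEN"])
client.set_dataset_id_by_name("ai4ls-bps-microscopy-data")  # assumed dataset name

worker_id = client.register_compute_worker(name="bps-labeler-worker")
# Start the worker container separately, roughly (exact flags per Lightly docs):
#   docker run --shm-size=1024m --gpus all --rm -it lightly/worker:latest \
#     token=$LIGHTLY_TOKEN worker.worker_id=<worker_id>

client.schedule_compute_worker_run(
    worker_config={"enable_training": True},  # train the self-supervised model
    selection_config={
        "n_samples": 100,
        "strategies": [
            # Spread selections across the embedding space.
            {"input": {"type": "EMBEDDINGS"}, "strategy": {"type": "DIVERSITY"}},
            # Balance on particle_type metadata; "X"/"Fe" are assumed categories.
            {
                "input": {"type": "METADATA", "key": "particle_type"},
                "strategy": {"type": "BALANCE", "target": {"X": 0.5, "Fe": 0.5}},
            },
        ],
    },
)
```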