This is the official repository for the KDD 2024 submission *LARP: Language Audio Relational Pre-training for Cold-Start Playlist Continuation*.

If you use this codebase, please cite us using:
```bibtex
@inproceedings{LARP2024,
  author    = {Rebecca Salganik and
               Xiaohao Liu and
               Jian Kang and
               Yunshan Ma and
               Tat{-}Seng Chua},
  title     = {LARP: Language Audio Relational Pre-training
               for Cold-Start Playlist Continuation},
  booktitle = {{KDD}},
  publisher = {{ACM}},
  year      = {2024}
}
```
This README covers:

- Building the environment
- Running the pre-trained model on your data
- Building our datasets from scratch
## Building the environment

We found that installing in two phases is most effective for package compatibility.

- Run:
  ```bash
  conda env create --name envname --file=/KDD2024-LARP/environment_setup/conda.yml
  ```
- Activate the environment, then run:
  ```bash
  pip install -r /KDD2024-LARP/environment_setup/pip.txt
  ```
## Running the pre-trained model on your data

- We have added two sample files to provide examples of the format required by our model. Audio files should be stored in `.mp3` format, and captions should be saved in a `.json` file that contains all the caption strings (a hedged sketch of this format appears after the launch step below).
- Start by downloading the pre-trained HTSAT checkpoint at https://drive.google.com/drive/folders/1SMQyzJvc6DwJNuhQ_WI8tlCFL5HG2vk6. Then open the `/KDD2024-LARP/configs/config2.yaml` file and manually change the path under the `MODELS>AUDIO_MODELS>HTSAT>checkpoint` entry.
- Download our checkpoints from: https://huggingface.co/Rsalga/LARP/tree/main
- Manually input your audio and caption folders inside the `run_pretrained.py` file.
- Launch:
  ```bash
  python run_pretrained.py --checkpoint <checkpoint path>
  ```
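For concreteness, here is a minimal sketch of inputs in the shape described above. The URI-keyed JSON schema, folder names, and file naming are our illustrative assumptions; check the sample files in the repository for the authoritative format:

```python
import json
from pathlib import Path

# Illustrative layout (placeholder paths): a folder of .mp3 files plus one captions JSON.
audio_dir = Path("my_data/mp3")
captions_path = Path("my_data/captions.json")

# Assumed schema: {"<uri>": "<caption string>", ...}
captions = {
    "spotify:track:xxxxxxxxxxxxxxxxxxxxxx": "an upbeat synth-pop track with female vocals",
}
captions_path.parent.mkdir(parents=True, exist_ok=True)
captions_path.write_text(json.dumps(captions, indent=2))

# Sanity check: every caption should have a matching audio file.
for uri in captions:
    mp3 = audio_dir / f"{uri.split(':')[-1]}.mp3"
    if not mp3.exists():
        print(f"missing audio for {uri}: {mp3}")
```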
## Building our datasets from scratch

Unfortunately, due to legal restrictions we cannot post the mp3 files that were used during our training. However, we provide all of the elements necessary to recreate our datasets exactly. If you have any issues, please feel free to reach out to our primary author, Rebecca Salganik.
We store the exact song-caption pairs in the `datasets/` folder of this repository. These URIs can be used in tandem with the code in `databuild_utils/` to download the mp3s. Furthermore, to replicate the truncation procedure, we store the exact start and end of each 10-second sample under the keys `start` and `end` (a truncation sketch follows below).
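As an illustration of how those keys can be applied, here is a minimal truncation sketch using `pydub`. The choice of library is ours, and we assume the stored `start`/`end` values are in seconds; the repository's own truncation code may differ:

```python
from pydub import AudioSegment  # pip install pydub (needs ffmpeg on the PATH)

def truncate_clip(mp3_path: str, start: float, end: float, out_path: str) -> None:
    """Cut the [start, end] window (seconds, per the stored keys) out of a full mp3."""
    audio = AudioSegment.from_mp3(mp3_path)
    clip = audio[int(start * 1000):int(end * 1000)]  # pydub slices in milliseconds
    clip.export(out_path, format="mp3")

# e.g., truncate_clip("mp3/<uri>.mp3", 42.0, 52.0, "trunc_mp3/<uri>_0.mp3")
```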
In order to facilitate your audio loading, we provide two versions of the code: one that can be batched using a scheduler such as SLURM, and another that can be run slowly on a personal computer.
- You can run `KDD2024-LARP/databuild_utils/runner.sh`
- NOTE: you will need to manually change some paths in the `runner.sh` file:
  - `track_path`: should be the exact location of the `tracks.csv` file
  - `interaction_path`: should be the exact location of the `interaction.csv` file
  - `data_folder`: should be the `<project_folder>` where you plan to store all the data. This doesn't have to be inside `KDD2024-LARP`.
- Downloading original data
  - LFM
    - Download data from: http://www.cp.jku.at/datasets/LFM-2b/
      - NOTE: the only files you need are `spotify-uris.tsv.bz2` and `listening-counts.tsv.bz2`
    - Run `bzip2 -d filename.bz2` to unzip the files
      - NOTE: the files are large, so this may take a few minutes (especially for the interactions).
    - Rename the files to follow the proper structure:
      - run:
        ```bash
        KDD2024-LARP/databuild_utils/extractions/tsv_to_csv.py -t <track path> -i <interaction path> -o <data_folder>
        ```
  - MPD
    - Download data from: https://www.aicrowd.com/challenges/spotify-million-playlist-dataset-challenge/dataset_files
      - NOTE: you will need to make an account.
      - NOTE: download both the train and test sets.
    - Rename the files to follow the proper structure:
      - run:
        ```bash
        KDD2024-LARP/databuild_utils/extractions/json_to_csv.py
        ```
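Both extraction scripts simply normalize the raw dumps into the CSVs the rest of the pipeline expects. As a rough sketch of what the LFM TSV conversion amounts to (assuming pandas; the actual `tsv_to_csv.py` may differ, e.g., in how it streams the large interaction file):

```python
import pandas as pd

def tsv_to_csv(track_path: str, interaction_path: str, data_folder: str) -> None:
    """Sketch of the -t/-i/-o conversion: raw LFM TSV dumps -> tracks.csv / interaction.csv."""
    pd.read_csv(track_path, sep="\t").to_csv(f"{data_folder}/tracks.csv", index=False)
    # The listening counts are large; chunking keeps memory bounded.
    with pd.read_csv(interaction_path, sep="\t", chunksize=1_000_000) as reader:
        for i, chunk in enumerate(reader):
            chunk.to_csv(f"{data_folder}/interaction.csv",
                         mode="w" if i == 0 else "a", header=(i == 0), index=False)
```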
- Create dataset splits:
  - run:
    ```bash
    KDD2024-LARP/dataset_build/generate_splits.py --train_size 100_000 --valid_size 1_000 --test_size 1_000 --min_playlist_length 10 --project_folder <output_folder from previous step>
    ```
- Setup Spotify credentials:
  - Make an account here: https://developer.spotify.com/dashboard
  - Create a file `/KDD2024-LARP/databuild_utils/secret.py` where you store your `client secret` and `client id`:
    ```python
    spotify_client_secret = <secret here>
    spotify_client_id = <id here>
    ```
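For reference, these credentials are consumed through Spotify's client-credentials flow; below is a minimal sketch using `spotipy`. Whether the repository's scrapers actually use spotipy is an assumption on our part:

```python
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

# secret.py is the file created in the step above
from secret import spotify_client_id, spotify_client_secret

sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(
    client_id=spotify_client_id,
    client_secret=spotify_client_secret,
))
# e.g., fetch metadata for a single track URI:
# track = sp.track("spotify:track:<uri>")
```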
- Scrape for audio samples:
  - run:
    ```bash
    python scrape_audio.py --mode scrape_audio --input <project_folder>/<dataset>/clean_org/train_uri.pkl --output <project_folder>/<dataset>/mp3/
    ```
  - NOTE: must be repeated for each split (train, valid, test); see the driver sketch after the metadata step below.
- Scrape metadata associated with audio (if not already available):
  - run:
    ```bash
    python scrape_audio.py --mode scrape_metadata --input <project_folder>/<dataset>/clean_org/train_uri.pkl --output <project_folder>/<dataset>/temp_metadata/
    ```
  - NOTE: repeat for each split (train, valid, test)
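Because both scraping modes must be run once per split, a small driver loop can save manual repetition. A sketch (the paths mirror the commands above; `project_folder` and `dataset` are placeholders):

```python
import subprocess

project_folder, dataset = "/path/to/project_folder", "MPD"  # placeholders

for split in ("train", "valid", "test"):
    for mode, out_dir in (("scrape_audio", "mp3"), ("scrape_metadata", "temp_metadata")):
        subprocess.run([
            "python", "scrape_audio.py",
            "--mode", mode,
            "--input", f"{project_folder}/{dataset}/clean_org/{split}_uri.pkl",
            "--output", f"{project_folder}/{dataset}/{out_dir}/",
        ], check=True)  # check=True stops the loop on the first failure
```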
- Sample from the full audio to create 10-second clips:
  - run:
    ```bash
    python process_data.py --input_audio_folder <project_folder>/<dataset>/mp3/ --output_audio_folder <project_folder>/<dataset>/trunc_mp3/ --meta_output_folder <project_folder>/<dataset>/trunc_meta/ --mt
    ```
- Generate captions:
  - run:
    ```bash
    python generate_captions.py --dataset_folder <project_folder>/<dataset>/ --split <project_folder>/<dataset>/train_uri.pkl --output_path <project_folder>/<dataset>/caption_sets/
    ```
- Generate interaction matrix for downstream evaluation:
  - run:
    ```bash
    python -u build_interactions.py --project_folder <project_folder>/<dataset>/
    ```
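Downstream code can load the resulting matrix directly with numpy. A quick sanity check (the path follows the directory structure shown below; the matrix's exact shape and semantics are not documented here):

```python
import numpy as np

# Placeholder path; mirrors <project_folder>/<dataset>/evaluation_sets/bundle/bi_full.npy
bi = np.load("<project_folder>/MPD/evaluation_sets/bundle/bi_full.npy")
print(bi.shape, bi.dtype)
```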
For seamless data loading, files should have the following structure:

```
project_folder
| -- MPD/
     | -- clean_org/
          | -- train_uri.pkl
          | -- valid_uri.pkl
          | -- test_uri.pkl
     | -- trunc_mp3/
          | -- <uri>/
               | -- <uri>_<sample_index>.mp3
               | -- ...
          | -- ...
     | -- caption_sets/
          | -- train_captions.pkl
          | -- valid_captions.pkl
          | -- test_captions.pkl
     | -- evaluation_sets/
          | -- bundle/
               | -- bi_full.npy
               | -- ...
     | -- pair_sets/
          | -- ...
| -- LFM/
     | -- ...
```
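To catch path mistakes early, you can verify this layout programmatically. A minimal sketch that checks only the fixed file names shown above:

```python
from pathlib import Path

def check_layout(project_folder: str, dataset: str = "MPD") -> list[str]:
    """Return the expected fixed-name files missing under <project_folder>/<dataset>."""
    root = Path(project_folder) / dataset
    expected = [
        *(root / "clean_org" / f"{s}_uri.pkl" for s in ("train", "valid", "test")),
        *(root / "caption_sets" / f"{s}_captions.pkl" for s in ("train", "valid", "test")),
        root / "evaluation_sets" / "bundle" / "bi_full.npy",
    ]
    return [str(p) for p in expected if not p.exists()]

# e.g., print(check_layout("/path/to/project_folder"))
```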