
Align-MacridVAE


This repository contains the implementation of Align-MacridVAE, a multimodal recommender that suggests items to users based on their preferences. To learn more, see the paper presented at ECIR 2024.

The project is written in PyTorch and implements a shallow Variational Autoencoder with a pre-training step that aligns image and textual representations.
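
For intuition, the snippet below is a minimal sketch of a CLIP-style symmetric contrastive alignment loss between image and text embeddings of the same items. It is only an illustration of the alignment idea, not the exact objective used in the paper, and all names are hypothetical.

# Illustrative CLIP-style symmetric contrastive alignment loss;
# the actual pre-training objective in Align-MacridVAE may differ.
import torch
import torch.nn.functional as F

def alignment_loss(image_emb: torch.Tensor, text_emb: torch.Tensor,
                   temperature: float = 0.1) -> torch.Tensor:
    """image_emb, text_emb: (batch, dim) embeddings of the same items."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature  # (batch, batch) similarities
    targets = torch.arange(logits.size(0), device=logits.device)
    # Matching image/text pairs sit on the diagonal of the similarity matrix.
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2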

Requirements

On the software side, you will need the following tools:

  • Python 3.10 or later
  • CUDA 12.1 or later, if you plan to train this on a GPU

Regarding hardware, this project is meant to run on NVIDIA GPUs, such as those found in personal laptops or in datacenters. It can also run on the CPU, but it will be much slower. We tested it on V100, A100, and RTX 20 series GPUs. The model is relatively simple and small, and we do not load larger models such as CLIP, BERT, or ViT during training or inference: items are preprocessed before running through the model to simplify training.

Installation

First, install the dependencies listed in requirements.txt:

pip install -r requirements.txt

Next, fetch the datasets. They are hosted on Kaggle and can be downloaded through the web UI or with the command-line tools. For example, if you have already set up your Kaggle credentials:

# Optional, you can download the dataset through the website
kaggle datasets download ignacioavas/alignmacrid-vae

unzip alignmacrid-vae.zip -d RecomData/
rm alignmacrid-vae.zip

The dataset contains data from Amazon Dataset subcategories, MovieLens 25M, and Bookcrossing. These datasets were prepared by adding images, filtering out missing items, and passing the textual and visual representations through encoders such as BERT, CLIP, or ViT. You can learn more by reading the README.md in the dataset root directory. The preprocessing code for building the datasets is available at Align-MacridVAE-data.
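
For illustration only, the snippet below sketches how item text and images could be encoded offline with CLIP, in the spirit of the preprocessing described above; the actual pipeline lives in Align-MacridVAE-data and may use different models, file names, and options.

# Hypothetical offline encoding of item text and images with CLIP;
# the real preprocessing code is in Align-MacridVAE-data.
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

texts = ["Electric guitar with maple neck"]   # example item description
images = [Image.open("item_0001.jpg")]        # example item image (hypothetical path)

with torch.no_grad():
    text_inputs = processor(text=texts, return_tensors="pt", padding=True, truncation=True)
    text_emb = model.get_text_features(**text_inputs)        # (n_items, 512)
    image_inputs = processor(images=images, return_tensors="pt")
    image_emb = model.get_image_features(**image_inputs)     # (n_items, 512)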

Running

Once you have downloaded the datasets, you can train a model by running the main.py script with the train argument. For example, to train on the Amazon Musical Instruments dataset encoded with CLIP for both the visual and textual modalities, run the following command:

python main.py train --data Musical_Instruments-clip_clip

The training code will generate a run directory under run/ with a name that depends on the dataset and the model parameters, for example: Musical_Instruments-clip_clip-AlignMacridVAE-50E-100B-0.001L-0.0001W-0.5D-0.2b-7k-200d-0.1t-98765s. The model.pkl file inside it contains the trained model.
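
If you want to inspect a trained model outside the scripts, something like the following should work, assuming model.pkl is a regular pickle of the PyTorch model (the exact serialization format is defined by main.py):

# Hypothetical way to load a trained checkpoint for inspection;
# assumes model.pkl is a plain pickle of a PyTorch module.
import pickle

run_dir = "run/Musical_Instruments-clip_clip-AlignMacridVAE-50E-100B-0.001L-0.0001W-0.5D-0.2b-7k-200d-0.1t-98765s"
with open(f"{run_dir}/model.pkl", "rb") as f:
    model = pickle.load(f)

model.eval()  # switch to inference mode before scoring items
print(model)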

Run python main.py --help to see all available parameters.

Evaluating

To evaluate a trained model, run the script in test mode. It will load the model from the run directory, provided it has already been trained. For example, to evaluate the model trained above, run the following command:

python main.py test --data Musical_Instruments-clip_clip
