laplace, a Graph Neural Network based Recommendation Engine

Overview

laplace is an end-to-end ML framework to train and predict on neurally-enhanced graphs for recommendation.

The pipeline is designed for self-supervised edge prediction on heterogeneous graphs.
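As a rough illustration of what self-supervised edge prediction means: the observed edges themselves serve as positive labels, and negative examples are sampled from node pairs that have no edge. The sketch below uses plain Python; the function name `make_link_prediction_examples` and the toy ids are hypothetical, and this is not laplace's actual sampling code.

```python
import random


def make_link_prediction_examples(edges, items, negatives_per_positive=1, seed=0):
    """Build (user, item, label) triples from observed edges.

    Observed edges get label 1. Sampled user-item pairs that are not in
    the graph get label 0. No external labels are needed, which is what
    makes the setup self-supervised.
    """
    rng = random.Random(seed)
    observed = set(edges)
    examples = [(user, item, 1) for user, item in edges]
    for user, _ in edges:
        # Items this user has no edge to are valid negatives.
        unseen = [item for item in items if (user, item) not in observed]
        if not unseen:
            continue  # the user is connected to every item
        for _ in range(negatives_per_positive):
            examples.append((user, rng.choice(unseen), 0))
    return examples


# Toy customer -> article purchase edges (made-up ids).
edges = [("c1", "a1"), ("c1", "a2"), ("c2", "a2")]
items = ["a1", "a2", "a3"]
examples = make_link_prediction_examples(edges, items)
```

A model trained on such triples learns to score existing edges above sampled non-edges, and that score can then be used to rank unseen candidate edges.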

Features

- Multi-step, hybrid recommendation pipeline:
  1. Candidate selection:
    - LightGCN recommendations (can also be run on its own)
    - Multiple custom heuristics
    - Strategies can be mixed and matched
  2. Ranking: graph convolutional network prediction on candidate edges
- Works on heterogeneous graphs:
  - User-based training, validation and test splitting
  - N-hop neighborhood aggregation
  - Node features
  - Any number of node types
- Advanced preprocessing of tabular data into graphs
- Neo4j integration for better visualization and handling of large graphs
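The two-stage pipeline from the feature list (candidate selection, then ranking) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the heuristics, the function names and the toy scorer are all hypothetical, and in laplace the scoring step is done by the graph network rather than a lookup table.

```python
from collections import Counter


def popular_items(transactions, user, k=2):
    """Heuristic: the k most frequently purchased items overall."""
    counts = Counter(item for _, item in transactions)
    return [item for item, _ in counts.most_common(k)]


def previously_bought(transactions, user):
    """Heuristic: items the user has already interacted with."""
    return [item for u, item in transactions if u == user]


def select_candidates(transactions, user, strategies):
    """Union the outputs of several strategies, keeping first-seen order."""
    seen, candidates = set(), []
    for strategy in strategies:
        for item in strategy(transactions, user):
            if item not in seen:
                seen.add(item)
                candidates.append(item)
    return candidates


def rank(user, candidates, score_edge, top_n):
    """Score each candidate (user, item) edge and keep the best top_n."""
    ordered = sorted(candidates, key=lambda item: score_edge(user, item), reverse=True)
    return ordered[:top_n]


transactions = [("c1", "a1"), ("c2", "a2"), ("c3", "a2"), ("c2", "a3"), ("c3", "a3")]

# Stage 1: mix and match strategies to build a small candidate set.
candidates = select_candidates(transactions, "c1", [previously_bought, popular_items])

# Stage 2: a stand-in scorer; a trained model would supply these scores.
toy_scores = {("c1", "a1"): 0.2, ("c1", "a2"): 0.9, ("c1", "a3"): 0.5}
recommendations = rank("c1", candidates, lambda u, i: toy_scores.get((u, i), 0.0), top_n=2)
print(recommendations)  # ['a2', 'a3']
```

Keeping candidate selection cheap and separate means the more expensive edge scorer only has to evaluate a short list per user instead of every item.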

Get Started

Installation & Data

Install the environment with:

conda env create -n fashion --file environment.yml

Activate the environment:

conda activate fashion

Download the required data.

Upload your data to a server. You should have a separate file for each of:
- articles.parquet
- customers.parquet
- transactions_splitted.parquet

Create a .env file in the root directory with a DATA_HOST_URL variable that points to the server hosting your data.
Run the following script from the terminal:

python run_download_data.py fashion
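For orientation, here is a hedged sketch of how a download script could use DATA_HOST_URL. It assumes the .env file holds plain KEY=VALUE lines and that the three parquet files sit directly under the host URL; both functions (`parse_env`, `build_download_urls`) are hypothetical and do not come from the laplace code base.

```python
REQUIRED_FILES = [
    "articles.parquet",
    "customers.parquet",
    "transactions_splitted.parquet",
]


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def build_download_urls(host_url, filenames=REQUIRED_FILES):
    """Join the host URL and each file name with exactly one slash."""
    base = host_url.rstrip("/")
    return [f"{base}/{name}" for name in filenames]


# example.com is a placeholder host, not a real data location.
env = parse_env('# data server\nDATA_HOST_URL="https://example.com/data/"\n')
urls = build_download_urls(env["DATA_HOST_URL"])
```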

Currently the system works with the following datasets:

- H&M Fashion Recommendation Kaggle challenge dataset: python run_download_data.py fashion
- MovieLens dataset: python run_download_data.py movielens

Main Pipeline

Running the pipeline takes four steps:

1. Adjust the configuration in config.py -> link_pred_config and config.py -> preprocessing_config
2. Run preprocessing with run_preprocessing.py
3. Run training with run_pipeline.py
4. Run inference and save the results with run_submission.py
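If you prefer to launch the scripted steps in one go, a small driver along these lines is one option. It is only a sketch: `run_steps` is a hypothetical helper, it assumes config.py has already been edited by hand, and it assumes the three scripts take no command-line arguments.

```python
import subprocess
import sys

# Script names taken from the steps listed above.
PIPELINE_SCRIPTS = ["run_preprocessing.py", "run_pipeline.py", "run_submission.py"]


def run_steps(scripts=PIPELINE_SCRIPTS, dry_run=True):
    """Build one command per script and optionally run them in order.

    With dry_run=True the commands are only returned, not executed.
    Otherwise execution stops at the first script that fails.
    """
    commands = [[sys.executable, script] for script in scripts]
    if not dry_run:
        for command in commands:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(command, check=True)
    return commands


commands = run_steps()  # dry run: nothing is executed
```

Calling `run_steps(dry_run=False)` from the repository root would then run preprocessing, training and inference back to back.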

Step 1: Preprocessing

Preprocessing turns tabular data into a graph and (optionally) loads it into a neo4j database.

1. Download the data as described in 'Get Started'
2. Set the preprocessing configuration in config.py -> preprocessing_config
3. Run run_preprocessing.py

Data will be saved under data/derived.
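To make "turning tabular data into a graph" concrete, the sketch below maps transaction rows to index-based edges between customer and article nodes. It is an illustration only: the column names `customer_id` and `article_id`, the function name and the output layout are assumptions, not the format laplace writes to data/derived.

```python
def build_bipartite_graph(transactions):
    """Turn (customer_id, article_id) rows into index-based edges.

    Returns the two id-to-index mappings and an edge list in the
    [source_indices, target_indices] layout commonly used by GNN libraries.
    """
    customer_index, article_index = {}, {}
    sources, targets = [], []
    for row in transactions:
        # setdefault assigns the next free index the first time an id is seen.
        c = customer_index.setdefault(row["customer_id"], len(customer_index))
        a = article_index.setdefault(row["article_id"], len(article_index))
        sources.append(c)
        targets.append(a)
    return customer_index, article_index, [sources, targets]


rows = [
    {"customer_id": "c1", "article_id": "a1"},
    {"customer_id": "c1", "article_id": "a2"},
    {"customer_id": "c2", "article_id": "a2"},
]
customer_index, article_index, edge_index = build_bipartite_graph(rows)
print(edge_index)  # [[0, 0, 1], [0, 1, 1]]
```

Each table row becomes one edge, and each distinct id becomes one node of its type, which is why the same approach extends to further node types.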

Note on neo4j:

Using neo4j is recommended, as it is the officially supported database of laplace. Enable it by setting these parameters in config.py:

preprocessing_config.save_to_neo4j = True 
link_pred_config.neo4j = True

You can view the graph and run queries after running the preprocessing pipeline (it starts the neo4j server automatically).

If neo4j stops running, you can restart it with neo4j start in the terminal. More info on neo4j.

Step 2: Training

1. Set the training configuration in config.py -> link_pred_config
2. Run training with run_pipeline.py

Step 3: Get Inference

Run inference by launching run_submission.py

Advanced Usage

Hyperparameter tuning

wandb is integrated into laplace.

1. Create a .env file in the root of the project and add your wandb API key: WANDB_API_KEY=12345random678letters91011example121314
2. Configure the sweep in sweep.yaml
3. Run run_sweep.py

Note: some sweep parameters are overwritten in run_sweep.py.
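The sketch below shows the general shape of a wandb sweep configuration (`method`, `metric`, `parameters`) written as a Python dict, and one way code-level overrides can take precedence over sampled values. The parameter names, the metric name and the `effective_params` helper are hypothetical; the README does not say which parameters run_sweep.py overwrites or how.

```python
# Same structure a sweep.yaml would carry, expressed as a dict.
sweep_config = {
    "method": "bayes",
    "metric": {"name": "val_loss", "goal": "minimize"},
    "parameters": {
        "learning_rate": {"min": 0.0001, "max": 0.01},
        "num_layers": {"values": [1, 2, 3]},
    },
}

# Hypothetical values fixed in code, applied after wandb proposes a run.
overrides = {"num_layers": 2}


def effective_params(sampled, overrides):
    """Merge parameters so that code-level overrides win over sampled values."""
    return {**sampled, **overrides}


params = effective_params({"learning_rate": 0.001, "num_layers": 3}, overrides)
print(params)  # {'learning_rate': 0.001, 'num_layers': 2}
```

If a parameter you sweep over appears to have no effect, checking whether it is overwritten in run_sweep.py is a good first step.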

Further todo

- Benchmark different implementations
- Additional matchers
