Code for the paper (accepted to NeurIPS 2024).
D2DMoE works by:
(a) enhancing the activation sparsity in the base model;
(b) converting the FFN layers in the model to MoE layers with routers that predict the contribution of each expert;
(c) introducing dynamic-k routing, which selects the experts for execution based on their predicted contribution.
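As an illustration of step (c), below is a minimal sketch of a dynamic-k MoE forward pass. This is a simplification under stated assumptions, not the repository's implementation: the function and module names, the relative threshold `tau`, and the per-expert loop are all hypothetical.

```python
import torch

def dynamic_k_forward(x, router, experts, tau=0.1):
    """Hypothetical dynamic-k MoE forward pass (not the repo's actual API).

    The router predicts each expert's contribution; experts whose predicted
    contribution is large relative to the maximum are executed, so the number
    of active experts varies per token.
    """
    scores = router(x)                                             # (tokens, num_experts)
    mask = scores > tau * scores.max(dim=-1, keepdim=True).values  # dynamic-k selection
    out = torch.zeros_like(x)
    for e, expert in enumerate(experts):
        selected = mask[:, e]                                      # tokens routed to expert e
        if selected.any():
            out[selected] += expert(x[selected])
    return out
```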
- Make sure that conda is installed on your server and present in `$PATH`. You may want to install it yourself, or ask whether conda is already installed somewhere and simply add it to `$PATH`.
- If `effbench_env` is not visible in conda (e.g. on a fresh conda installation), create the environment:

  ```bash
  bash create_env.sh
  ```
- Optionally, create a W&B account and add the following line to your `~/.bashrc`:

  ```bash
  export WANDB_API_KEY="<YOUR_KEY>"
  ```
- Copy the `user_example.env` file and fill in the paths:

  ```bash
  cp user_example.env user.env
  vi user.env
  ```
- Edit the submitit/SLURM run script to run the experiments you need (see the sketch after this list for what such a script typically contains):

  ```bash
  vi scripts/your_run_script_name.py
  ```
- Run the experiment using that script with SLURM:

  ```bash
  bash run.sh your_run_script_name
  ```
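For reference, the run scripts under `scripts/` are submitit launchers for SLURM. A minimal sketch of such a launcher is shown below; the partition name, resource values, and the `train_job` entry point are placeholders, not the repository's actual configuration:

```python
import submitit

def train_job():
    # Placeholder for the actual experiment entry point.
    print("training...")

# AutoExecutor writes SLURM submission files and logs to the given folder.
executor = submitit.AutoExecutor(folder="slurm_logs")
executor.update_parameters(
    slurm_partition="gpu",  # placeholder partition name
    gpus_per_node=1,
    timeout_min=60,
)
job = executor.submit(train_job)
print(f"Submitted job {job.job_id}")
```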
A few things to keep in mind:
- The code generates a unique run name based on the command-line arguments passed to the script. When adding a new CLI argument that should not affect the run name, you have to update the `generate_run_name()` function accordingly (see the sketch after this list).
- The weights are saved every N minutes.
- Training will continue from the last checkpoint if a run with the generated name is already present.
- Use the `use_wandb` flag to log to W&B.
- Remember that changing the code will not change the generated experiment name.
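To make the run-naming caveats above concrete, here is a hypothetical sketch of a `generate_run_name()`-style function. The attribute names (`model`, `dataset`), the hashing scheme, and the excluded-argument set are illustrative assumptions, not the repository's actual logic:

```python
import hashlib

# Hypothetical set of arguments that must NOT influence the run name;
# a new name-irrelevant CLI argument would be added here.
NAME_IRRELEVANT_ARGS = {"use_wandb", "num_workers"}

def generate_run_name(args):
    """Derive a deterministic run name from the parsed CLI arguments.

    If a new argument is not listed in NAME_IRRELEVANT_ARGS, it changes the
    hash and therefore the name, so the run will not resume from the existing
    checkpoint. Code changes do not enter the hash at all.
    """
    relevant = {k: v for k, v in vars(args).items() if k not in NAME_IRRELEVANT_ARGS}
    digest = hashlib.sha256(repr(sorted(relevant.items())).encode()).hexdigest()[:8]
    return f"{args.model}_{args.dataset}_{digest}"
```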