RustBERTa-SNLI

A Rust implementation of a RoBERTa classification model for the SNLI dataset, with support for fine-tuning, predicting, and serving. This is built on top of tch-rs and rust-bert.

Background

This project was the result of the AI2 2020 Employee Hackathon. The motivation was to demonstrate that Rust is already a viable alternative to Python for deep learning projects in the context of both research and production.

Setup

RustBERTa-SNLI is packaged as a Rust binary, and will work on any operating system that PyTorch supports.

The only prerequisite is that you have the Rust toolchain installed. So if you're already a Rustacean, skip ahead to the "Additional setup for CUDA" section (optional).

Now, luckily, installing Rust is nothing like installing a proper Python environment, i.e. it doesn't require a PhD in system administration or the courage to blindly run every sudo command you can find on Stack Overflow until something works or completely breaks your computer.

All you have to do is run this:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Then just make sure ~/.cargo/bin is in your $PATH, and you're good to go. You can test for success by running rustup --version and cargo --version.

rustup can be used to update your toolchain when a new version of Rust is released (which happens monthly). cargo is used to compile, run, and test your code, as well as to build documentation, publish your crate (the Rust term for a module/library) to crates.io, and install binaries from other crates on crates.io.
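For example, a typical day-to-day workflow looks like this (these are standard rustup/cargo commands, not specific to this project):

rustup update            # update to the latest stable toolchain
cargo build              # compile in debug mode
cargo build --release    # compile with optimizations
cargo test               # run the test suite
cargo doc --open         # build the docs and open them in a browser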

Additional setup for CUDA

If you have CUDA-enabled GPUs available on your machine, you'll probably want to compile this library with CUDA support.

To do that, you just need to download the right version of LibTorch from the PyTorch website: https://pytorch.org/get-started/locally/.
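For example, assuming you downloaded one of the Linux zip archives (the exact file name depends on the LibTorch version and CUDA variant you chose):

mkdir -p ~/torch
unzip libtorch-*.zip -d ~/torch  # the archive contains a top-level libtorch/ directory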

Then unzip the downloaded file to someplace safe like ~/torch/libtorch and set the environment variables:

export LIBTORCH="$HOME/torch/libtorch"  # or wherever you unzipped it
export LD_LIBRARY_PATH="$LIBTORCH/lib:$LD_LIBRARY_PATH"

‼️ NOTE: it's important that you do this step before trying to compile the Rust binary with cargo. If you accidentally started the build before setting these variables, just delete the target/ directory (or run cargo clean) and start over.

Compiling and running

To build the release binary, just run make.
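If you'd rather invoke cargo directly, the standard release build is the following (this assumes the Makefile is a thin wrapper around cargo, which is typical but worth verifying against the Makefile itself):

cargo build --release
ls target/release/  # the compiled binary lands here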

To see all of the available commands, run

./roberta-snli --help

For example, to fine-tune a pretrained RoBERTa model, run

./roberta-snli train --out weights.ot

To interactively get predictions with a fine-tuned model, run

./roberta-snli predict --weights weights.ot

To evaluate a fine-tuned model on the test set, run

./roberta-snli evaluate

And to serve a fine-tuned model as a production-grade webservice with batched prediction, run

./roberta-snli serve

This will serve on port 3030 by default. You can then test it out by running:

curl \
    -d '{"premise":"A soccer game with multiple people playing.","hypothesis":"Some people are playing a sport."}' \
    -H "Content-Type: application/json" \
    http://localhost:3030/predict

You can also test the batching functionality by sending a bunch of requests at once with:

./scripts/test_server.sh
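If you don't have that script handy, a minimal sketch of the same idea is to fire off several concurrent requests and let the server batch them (this is an illustration, not necessarily what scripts/test_server.sh actually does):

for i in $(seq 1 8); do
    curl -s \
        -d '{"premise":"A soccer game with multiple people playing.","hypothesis":"Some people are playing a sport."}' \
        -H "Content-Type: application/json" \
        http://localhost:3030/predict &
done
wait  # wait for all background requests to complete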
