FNet proposes to replace the Transformer's self-attention layers (which have O(n^2) computational complexity) with linear transformations that mix input tokens. More specifically, the authors find that replacing the attention mechanism with a standard, unparameterized Fourier Transform achieves 92% of the accuracy of BERT on the GLUE benchmark, while pre-training and running up to seven times faster on GPUs and twice as fast on TPUs.
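For reference, the token-mixing sublayer described in the FNet paper applies a 2D discrete Fourier Transform (once along the hidden dimension, once along the sequence dimension) and keeps only the real part. A minimal PyTorch sketch of that idea (this project reuses the linked jaketae/fnet implementation, so the snippet below is illustrative only):

import torch

def fourier_mix(x: torch.Tensor) -> torch.Tensor:
    # x has shape (batch, seq_len, hidden); this sublayer replaces self-attention.
    # 2D FFT: transform along the hidden dim, then the sequence dim, keep the real part.
    return torch.fft.fft(torch.fft.fft(x, dim=-1), dim=-2).real

Because the transform has no learnable parameters, token mixing costs O(n log n) instead of the O(n^2) of self-attention.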
This project leverages an existing implementation of FNet's encoder (https://github.com/jaketae/fnet.git) and completes the code to train the model on the Stanford Sentiment Treebank (SST) dataset. SST is a binary sentiment classification dataset and is part of the GLUE benchmark.
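For context, SST (the SST-2 task in GLUE) can be inspected through the HuggingFace datasets library; this is only an illustrative sketch and not necessarily how fnet.py loads the data:

from datasets import load_dataset

sst2 = load_dataset("glue", "sst2")   # train / validation / test splits
print(sst2["train"][0])               # e.g. {'sentence': '...', 'label': 0 or 1, 'idx': 0}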
The original FNet model (from the paper) is pre-trained on masked language modelling (MLM) and next sentence prediction (NSP). However, the paper was published only a month ago and the pre-trained model is not available online. In order to harness transfer learning and speed up the model's convergence, this project initialises FNet's encoder with BartForSequenceClassification parameters (loaded from HuggingFace). Additionally, BartTokenizer and BartLearnedPositionalEmbedding are used to process the input sequence before it is fed to the encoder.
In other words, the model can be seen as BART's encoder with a Fourier Transform token mixing mechanism (from FNet) instead of self-attention.
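A rough sketch of this initialisation, assuming the facebook/bart-base checkpoint and an FNet encoder class equivalent to the one in the linked repository (names and wiring here are illustrative, not the exact code in fnet.py):

from transformers import BartForSequenceClassification, BartTokenizer

# Pre-trained BART weights and tokenizer from HuggingFace (checkpoint name assumed).
bart = BartForSequenceClassification.from_pretrained("facebook/bart-base")
tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")

# BART components reused in front of the Fourier-mixing encoder.
token_embed = bart.model.encoder.embed_tokens    # token embeddings
pos_embed = bart.model.encoder.embed_positions   # BartLearnedPositionalEmbedding

# An FNet-style encoder would be built with matching dimensions, with its feed-forward /
# layer-norm weights copied from bart.model.encoder.layers, so that only the attention
# sublayers are replaced by the parameter-free Fourier mixing.
print(bart.config.d_model, bart.config.encoder_layers)   # hidden size, number of layers

inputs = tokenizer("a gripping, well-acted film", return_tensors="pt")
embeddings = token_embed(inputs["input_ids"])   # combined with positional embeddings,
                                                # then fed through the encoder and a
                                                # classification head for SST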
To run the project, clone the repo and run the following commands:
cd FNet_with_BART_classification
pip install -r requirements.txt
python fnet.py
@article{DBLP:journals/corr/abs-2105-03824,
author = {James Lee-Thorp and
Joshua Ainslie and
Ilya Eckstein and
Santiago Ontañón},
title = {FNet: Mixing Tokens with Fourier Transforms},
journal = {CoRR},
volume = {abs/2105.03824},
year = {2021},
url = {https://arxiv.org/abs/2105.03824},
archivePrefix = {arXiv},
eprint = {2105.03824},
timestamp = {Fri, 14 May 2021 12:13:30 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-2105-03824.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}