
abstractive-text-summarization

This repository and notebook contain code for an in-progress implementation of, and experiments on, the Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond paper.

Requirements

  1. Create conda environment

conda env create -f environment.yml  # GPU

conda env create -f environment-cpu.yml  # CPU

  2. Activate environment

source activate abs-sum  # GPU

source activate abs-sum-cpu  # CPU

  3. Install dependencies (PyTorch, fastai, etc.) via:

pip install -r requirements.txt

  4. Download the spaCy English model

python -m spacy download en

Dataset

The dataset used is a subset of the Gigaword dataset and can be found here.

It contains 3,803,955 parallel source & target examples for training and 189,649 examples for validation.

After downloading, we created article-title pairs, saved them in a tabular dataset format (.csv), and extracted a sample subset (80,000 examples for training & 20,000 for validation). This data preparation can be found here.

An example article-title pair looks like this:

source: the algerian cabinet chaired by president abdelaziz bouteflika on sunday adopted the #### finance bill predicated on an oil price of ## dollars a barrel and a growth rate of #.# percent , it was announced here .

target: algeria adopts #### finance bill with oil put at ## dollars a barrel

Experimenting on the complete dataset (~3.8M training examples) would take a long time and be expensive, so in order to train and experiment faster we use our sample subset of 80,000.
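The pairing and sampling step described above is straightforward; below is a minimal sketch of it using pandas. The file names, column names, and helper function are assumptions for illustration, not the repository's actual preparation script.

```python
# Minimal sketch of pairing source articles with target titles and sampling a
# subset. File names and the build_pairs_csv helper are hypothetical.
import pandas as pd

def build_pairs_csv(article_path, title_path, out_csv, n_samples=None, seed=42):
    """Pair source articles with target titles and save them as a .csv file."""
    with open(article_path, encoding="utf-8") as f_src, \
         open(title_path, encoding="utf-8") as f_tgt:
        sources = [line.strip() for line in f_src]
        targets = [line.strip() for line in f_tgt]

    df = pd.DataFrame({"source": sources, "target": targets})
    if n_samples is not None:
        # Extract a random subset, e.g. 80,000 training / 20,000 validation pairs.
        df = df.sample(n=n_samples, random_state=seed)
    df.to_csv(out_csv, index=False)
    return df

# Example usage (paths are placeholders):
# build_pairs_csv("train.article.txt", "train.title.txt", "train.csv", n_samples=80000)
# build_pairs_csv("valid.article.txt", "valid.title.txt", "valid.csv", n_samples=20000)
```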

Current Features

  • model architecture supports LSTM & GRU (biLSTM-to-uniLSTM or biGRU-to-uniGRU)
  • implements batch data processing
  • implements attention mechanisms (Bahdanau et al. & Luong et al. (global dot)); see the attention sketch after this list
  • implements scheduled sampling (teacher forcing)
  • implements tied embeddings
  • initializes encoder-decoder with pretrained vectors (glove.6B.200d)
  • implements custom training callbacks (TensorBoard visualization for PyTorch, save best model & log checkpoint)
  • implements attention plots
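As referenced in the features list, below is a minimal sketch of additive (Bahdanau et al.) attention computed between the bidirectional encoder outputs and the unidirectional decoder state. Layer names and dimensions are illustrative assumptions, not the notebook's exact implementation; the returned weights are the kind of values the attention plots visualize.

```python
# Minimal sketch of additive (Bahdanau-style) attention; sizes and names are
# assumptions, not the repository's exact module.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BahdanauAttention(nn.Module):
    def __init__(self, enc_dim, dec_dim, attn_dim):
        super().__init__()
        self.W_enc = nn.Linear(enc_dim, attn_dim, bias=False)
        self.W_dec = nn.Linear(dec_dim, attn_dim, bias=False)
        self.v = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, dec_hidden, enc_outputs):
        # dec_hidden:  (batch, dec_dim)          -- current decoder state
        # enc_outputs: (batch, src_len, enc_dim) -- all encoder states
        scores = self.v(torch.tanh(
            self.W_enc(enc_outputs) + self.W_dec(dec_hidden).unsqueeze(1)
        )).squeeze(-1)                        # (batch, src_len)
        weights = F.softmax(scores, dim=-1)   # attention distribution over source tokens
        context = torch.bmm(weights.unsqueeze(1), enc_outputs).squeeze(1)
        return context, weights               # context vector + plottable weights
```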

To-Do

  • Implement additional linguistic feature embeddings
  • Implement generator-pointer switch and replace unknown words by selecting the source token with the highest attention score.
  • Implement large vocabulary trick
  • Implement sentence level attention
  • Implement beam search during inference (see the sketch after this list)
  • Implement ROUGE evaluation
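For the beam-search To-Do item above, here is a minimal sketch of beam-search decoding at inference time. The decoder_step interface, names, and defaults are hypothetical assumptions, not an existing API in this repository.

```python
# Minimal beam-search sketch, assuming a decoder exposed as
# decoder_step(prev_token_id, hidden) -> (log_probs over vocab, new hidden).
import torch

def beam_search(decoder_step, init_hidden, sos_id, eos_id, beam_size=4, max_len=30):
    """Return the highest-scoring token sequence under a simple beam search."""
    # Each beam entry: (cumulative log-prob, token list, decoder hidden state).
    beams = [(0.0, [sos_id], init_hidden)]
    completed = []

    for _ in range(max_len):
        candidates = []
        for score, tokens, hidden in beams:
            if tokens[-1] == eos_id:
                completed.append((score, tokens))
                continue
            log_probs, new_hidden = decoder_step(tokens[-1], hidden)
            top_lp, top_ix = log_probs.topk(beam_size)
            for lp, ix in zip(top_lp.tolist(), top_ix.tolist()):
                candidates.append((score + lp, tokens + [ix], new_hidden))
        if not candidates:
            break
        # Keep only the beam_size best partial hypotheses.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_size]

    # Include any hypotheses that never emitted <eos> within max_len.
    completed.extend((score, tokens) for score, tokens, _ in beams)
    return max(completed, key=lambda c: c[0])[1]
```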

Baseline Training & Validation Loss

(plot: baseline training & validation loss curves)
