- Introduction
- Prerequisites
- Data Preprocessing
- Building the Transformer Model
- Training the Transformer
- Evaluation and Testing
- Conclusion
In this session, we'll apply a Transformer model, with a particular focus on the multi-head attention mechanism, to sentiment analysis: classifying text as expressing positive or negative sentiment.
The theory behind the Transformer architecture is covered in Lecture 2 of the MIT course; refer to the official course website for a more in-depth treatment.
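As a quick refresher, multi-head attention runs scaled dot-product attention several times in parallel over different learned projections of the same input. The minimal sketch below uses PyTorch's nn.MultiheadAttention with purely illustrative shapes (2 sentences, 10 tokens, 100-dimensional embeddings, 4 heads); none of these values are prescribed by the session.

import torch
import torch.nn as nn

# A toy batch: 2 sequences of 10 tokens, each token a 100-dim embedding
x = torch.randn(2, 10, 100)

# 4 heads, each attending over a 100 / 4 = 25-dim projection of the input
mha = nn.MultiheadAttention(embed_dim=100, num_heads=4, batch_first=True)

# Self-attention: the sequence attends to itself (query = key = value)
attn_output, attn_weights = mha(x, x, x)
print(attn_output.shape)   # torch.Size([2, 10, 100])
print(attn_weights.shape)  # torch.Size([2, 10, 10]), averaged over heads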
Make sure you have the following installed:
- PyTorch
- torchtext
- spaCy (for tokenization)
# Note: the torchtext.legacy API used below was removed in torchtext 0.12, so pin an older release
pip install torch "torchtext<0.12" spacy
python -m spacy download en_core_web_sm
import torch
from torchtext.legacy import data
from torchtext.legacy import datasets

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
# Tokenize with spaCy; batch_first=True gives tensors of shape [batch_size, seq_len]
TEXT = data.Field(tokenize='spacy', tokenizer_language='en_core_web_sm', batch_first=True)
LABEL = data.LabelField(dtype=torch.float)
train_data, test_data = datasets.IMDB.splits(TEXT, LABEL)
# Build the vocabulary and load pretrained 100-dimensional GloVe word embeddings
TEXT.build_vocab(train_data, max_size=25000, vectors="glove.6B.100d")
LABEL.build_vocab(train_data)
# Split data and create iterators
train_data, valid_data = train_data.split()
train_iterator, valid_iterator, test_iterator = data.BucketIterator.splits(
(train_data, valid_data, test_data),
batch_size=64,
device=device)
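As a quick sanity check on the preprocessing, you can inspect the vocabulary and pull a single batch from the iterator; the exact numbers you see will vary, and the label-to-index mapping shown in the comment is only the typical outcome.

print(f"Vocabulary size: {len(TEXT.vocab)}")                    # 25,002 = 25,000 + <unk> + <pad>
print(f"Most common tokens: {TEXT.vocab.freqs.most_common(5)}")
print(f"Label mapping: {dict(LABEL.vocab.stoi)}")               # typically {'neg': 0, 'pos': 1}

batch = next(iter(train_iterator))
print(batch.text.shape)    # [batch_size, seq_len] because batch_first=True
print(batch.label.shape)   # [batch_size]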
import torch.nn as nn

class TransformerModel(nn.Module):
    def __init__(self, input_dim, emb_dim, n_heads, hid_dim, n_layers, output_dim, dropout, pad_idx):
        super().__init__()
        self.embedding = nn.Embedding(input_dim, emb_dim, padding_idx=pad_idx)
        # batch_first=True so the encoder accepts [batch_size, seq_len, emb_dim] inputs
        self.transformer = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=emb_dim, nhead=n_heads,
                                       dim_feedforward=hid_dim, dropout=dropout,
                                       batch_first=True),
            num_layers=n_layers
        )
        self.fc = nn.Linear(emb_dim, output_dim)
        self.dropout = nn.Dropout(dropout)

    def forward(self, text):
        # text = [batch_size, seq_len]
        embedded = self.dropout(self.embedding(text))
        # embedded = [batch_size, seq_len, emb_dim]
        transformed = self.transformer(embedded)
        # transformed = [batch_size, seq_len, emb_dim]
        # Mean-pool over the sequence dimension rather than taking the last position,
        # which would often be a padding token (padding is not masked here, for simplicity)
        pooled = transformed.mean(dim=1)
        prediction = self.fc(pooled)
        # prediction = [batch_size, output_dim]
        return prediction
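The values below are illustrative hyperparameter choices, not ones fixed by the session; the only hard constraints are that the embedding dimension matches the 100-dimensional GloVe vectors loaded earlier and that it is divisible by the number of heads. Copying the pretrained vectors into the embedding layer is optional but usually speeds up convergence.

INPUT_DIM = len(TEXT.vocab)
EMB_DIM = 100          # must match glove.6B.100d
N_HEADS = 4            # EMB_DIM must be divisible by N_HEADS
HID_DIM = 256
N_LAYERS = 2
OUTPUT_DIM = 1         # a single logit for binary sentiment
DROPOUT = 0.3
PAD_IDX = TEXT.vocab.stoi[TEXT.pad_token]

model = TransformerModel(INPUT_DIM, EMB_DIM, N_HEADS, HID_DIM,
                         N_LAYERS, OUTPUT_DIM, DROPOUT, PAD_IDX).to(device)

# Initialise the embedding layer with the pretrained GloVe vectors
model.embedding.weight.data.copy_(TEXT.vocab.vectors)
model.embedding.weight.data[PAD_IDX] = torch.zeros(EMB_DIM)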
With the model instantiated, define the training hyperparameters, optimizer, and criterion, and execute the training loop.
# Hyperparameters, optimizer, and criterion
# ...
# Training loop
for epoch in range(num_epochs):
# Training and validation code
# ...
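One minimal way to fill in those placeholders is sketched below, assuming BCEWithLogitsLoss with the Adam optimizer, a common setup for a single-logit binary classifier; the learning rate, number of epochs, and the binary_accuracy helper are illustrative choices rather than part of the original session.

import torch.optim as optim

optimizer = optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.BCEWithLogitsLoss().to(device)   # applies sigmoid + binary cross-entropy

def binary_accuracy(preds, y):
    # Round the sigmoid of the logits to 0/1 and compare with the true labels
    rounded = torch.round(torch.sigmoid(preds))
    return (rounded == y).float().mean().item()

num_epochs = 5
for epoch in range(num_epochs):
    model.train()
    for batch in train_iterator:
        optimizer.zero_grad()
        predictions = model(batch.text).squeeze(1)   # [batch_size]
        loss = criterion(predictions, batch.label)
        loss.backward()
        optimizer.step()

    model.eval()
    val_acc = 0.0
    with torch.no_grad():
        for batch in valid_iterator:
            predictions = model(batch.text).squeeze(1)
            val_acc += binary_accuracy(predictions, batch.label)
    print(f"Epoch {epoch + 1}: validation accuracy {val_acc / len(valid_iterator):.3f}")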
After training, evaluate the model on your validation/test set using accuracy (or another suitable metric), then use it to predict the sentiment of new sentences.
import spacy

spacy_en = spacy.load('en_core_web_sm')

def predict_sentiment(model, sentence, text_field):
    model.eval()
    tokenized = [tok.text for tok in spacy_en.tokenizer(sentence)]
    indexed = [text_field.vocab.stoi[t] for t in tokenized]
    tensor = torch.LongTensor(indexed).unsqueeze(0).to(device)  # [1, seq_len]
    with torch.no_grad():
        prediction = torch.sigmoid(model(tensor))
    return prediction.item()
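Assuming LABEL maps 'pos' to 1 (the usual mapping for the IMDB splits), scores close to 1 indicate positive sentiment and scores close to 0 negative sentiment; the sentence and printed value here are purely illustrative.

score = predict_sentiment(model, "This film was absolutely wonderful!", TEXT)
print(f"Positive probability: {score:.3f}")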
In Session 9, you built a Transformer model to perform sentiment analysis on the IMDB dataset. Transformers are versatile and have been applied across a wide range of NLP tasks, showing strong performance, particularly at capturing contextual information in text. Continue exploring further applications and refining model architectures to deepen your understanding!