AMPeD: An Analytical Model for Performance in Distributed Training of Transformers

This repository contains the Python implementation of the AMPeD model published at ISPASS 2023. To reproduce the data presented in the paper, use the reproduce_paper branch. The main branch is experimental and may change.

Inputs

All the parameters of the model are contained in config.json. The parameters that must be calculated from other parameters each have a function in inputs.py that calculates them. These functions are stored in the calculate_functions dictionary at the top of inputs.py. Parameters can also be looked up in a lookup table (lookup_tables.json).

config.json

The config file has the following structure:

neural_network_training_parameters
|   lookup_config
|   |   lookup_table_name: string
|   |   lookup_table_row: string
|   parameters
|   |   weight_precision: par_type
|   |   activation_precision: par_type
|   |   ...
mapping_parameters
|   parameters
|   |   number_of_microbatches_per_minibatch: par_type
|   |   ...
system_architecture_parameters
|   parameters
|   |   ...
accelerator_architecture_parameters
|   lookup_config
|   |   lookup_table_name: string
|   |   lookup_table_row: string
|   parameters
|   |   ...

There are currently four categories of parameters. Each category can have a lookup_config that specifies which lookup table (see below) to use when looking up parameters of that category, and it specifies which row/element of the lookup table should be used.

The parameters are defined by a par_type object as follows:

{
    "value": any,  // the value the parameter will take if not calculated or looked up
    "calculated"?: boolean,  // set this to true to calculate this parameter from others
    "from_lookup_table"?: boolean,  // set this to true to look this parameter up in table
    "lookup_name"?: string,  // if set, this will be used as the name of this parameter in the lookup table
    "description"?: string,  // an optional description of the parameter
}

The value property accepts integers and floats, as well as unit strings such as "8 GB", which is converted internally to 8000000000. The unit itself (B in this case) is purely descriptive and can be omitted.
A question mark after a property name means that the property is optional. If you omit lookup_name, the name of the parameter is used as the lookup name.
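
The unit-string conversion described above can be illustrated with a minimal sketch. The function name parse_value and the set of decimal prefixes it accepts are assumptions made for this example; the actual conversion code in AMPeD may differ.

```python
import re

# Decimal prefixes; assumed because "8 GB" maps to 8000000000 in the text above.
SI_PREFIXES = {"": 1, "k": 1e3, "K": 1e3, "M": 1e6, "G": 1e9, "T": 1e12, "P": 1e15}

# number, optional prefix, optional descriptive unit (ignored)
VALUE_PATTERN = re.compile(
    r"\s*([0-9]*\.?[0-9]+(?:[eE][+-]?[0-9]+)?)\s*([kKMGTP]?)([A-Za-z/]*)\s*"
)


def parse_value(raw):
    """Convert a config value such as "8 GB" to a number (hypothetical helper)."""
    if isinstance(raw, (int, float)):
        return raw  # plain numbers pass through unchanged
    match = VALUE_PATTERN.fullmatch(raw)
    if match is None:
        raise ValueError(f"cannot parse value: {raw!r}")
    number, prefix, _unit = match.groups()
    result = float(number) * SI_PREFIXES[prefix]
    # Return an int when the result is a whole number, a float otherwise.
    return int(result) if result.is_integer() else result


print(parse_value("8 GB"))  # 8000000000
print(parse_value("8 G"))   # 8000000000 (unit omitted)
```

Because the unit is only descriptive, this sketch discards it after matching and applies just the prefix as a multiplier.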

Lookup tables

The lookup tables file (lookup_tables.json) has the following structure:

accelerator_specifications
|   V100
|   |   frequency: int
|   |   number_of_cores: int
|   |   ...
|   A100
|   |   ...
transformer_network_parameters
|   GPT-2 small
|   |   tokens: int
|   |   dimensionality: int
|   |   ...
|   GPT-2 medium
|   |   ...

It currently has two tables. If a parameter has from_lookup_table set to true, then the program will first determine which table that parameter's category uses and which row in that table should be used. Secondly, the program will determine which name to use to look up the parameter's value: either lookup_name if provided, or the name of the parameter itself in config.json. Finally, the parameter's value will be looked up.
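
The three lookup steps above can be sketched as follows. This is an illustration only: the function resolve_lookup, the parameter name cores, and the numeric values in the table are placeholders invented for the example, not real accelerator specifications or AMPeD code.

```python
def resolve_lookup(config, lookup_tables, category, parameter_name):
    """Return a parameter's value, looking it up in a table if requested."""
    section = config[category]
    parameter = section["parameters"][parameter_name]
    if not parameter.get("from_lookup_table", False):
        return parameter["value"]
    # Step 1: the category's lookup_config selects the table and the row.
    lookup_config = section["lookup_config"]
    table = lookup_tables[lookup_config["lookup_table_name"]]
    row = table[lookup_config["lookup_table_row"]]
    # Step 2: prefer lookup_name, fall back to the parameter's own name.
    key = parameter.get("lookup_name", parameter_name)
    # Step 3: read the value from the selected row.
    return row[key]


# Placeholder data with the structure described above (values are not real specs).
lookup_tables = {
    "accelerator_specifications": {
        "V100": {"frequency": 111, "number_of_cores": 222},
    }
}
config = {
    "accelerator_architecture_parameters": {
        "lookup_config": {
            "lookup_table_name": "accelerator_specifications",
            "lookup_table_row": "V100",
        },
        "parameters": {
            "frequency": {"value": None, "from_lookup_table": True},
            # "cores" is a hypothetical name that needs lookup_name to find its column.
            "cores": {"value": None, "from_lookup_table": True,
                      "lookup_name": "number_of_cores"},
        },
    }
}

category = "accelerator_architecture_parameters"
print(resolve_lookup(config, lookup_tables, category, "frequency"))  # 111
print(resolve_lookup(config, lookup_tables, category, "cores"))      # 222
```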

Calculating parameters

Setting calculated to true will tell the program that this parameter must be calculated. The dictionary calculate_functions at the top of inputs.py contains, for each parameter to be calculated, the function that calculates it. These functions are given all the parameters (p) that have been supplied by the user (value in config.json), looked up, or previously calculated. If a calculation depends on yet uncalculated parameters, then those will be recursively calculated. If you want to set a parameter to be calculated, make sure there is a corresponding function in calculate_functions.
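
One way to picture the recursive calculation is a dictionary that computes missing entries on demand. The sketch below is an assumption about how such a mechanism could work, not the code in inputs.py; the Parameters class and the parameter names bytes_per_weight, number_of_weights, and weight_memory are hypothetical, and weight_precision is assumed to be given in bits.

```python
class Parameters(dict):
    """Dictionary that calculates missing parameters on first access (hypothetical)."""

    def __init__(self, known, calculate_functions):
        super().__init__(known)
        self._calculate_functions = calculate_functions

    def __missing__(self, name):
        if name not in self._calculate_functions:
            raise KeyError(f"no value and no calculate function for {name!r}")
        # The function may read p[...] entries that are themselves missing,
        # which calls __missing__ again: this is the recursion.
        value = self._calculate_functions[name](self)
        self[name] = value  # cache so each parameter is calculated once
        return value


# Hypothetical calculate functions; each receives all parameters as p.
calculate_functions = {
    "bytes_per_weight": lambda p: p["weight_precision"] // 8,
    "weight_memory": lambda p: p["number_of_weights"] * p["bytes_per_weight"],
}

p = Parameters({"weight_precision": 16, "number_of_weights": 1000},
               calculate_functions)
print(p["weight_memory"])  # 2000
```

Note that this sketch does not detect circular dependencies; two functions that depend on each other would recurse until Python raises a RecursionError.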

Performance Model

performance_model.py contains functions that calculate performance metrics of the system based on the configuration, such as the total time to train the transformer network and the total amount of TFLOPS. The value function from common.py can be used to easily select between a user-supplied value and a calculated value.

Scenario Summary

scenario_summary.py contains the functions that display, as pie charts, breakdowns of the training time required by the current configuration.

How to run

First, install dependencies: pip install -r requirements.txt.

Then you can run any file like this:

python -m amped.main
python -m amped.scenario_summary
python -m amped.optimization
python -m amped.mat_dims
...

Some of these will generate output files. Output files will always be saved in the output_files subfolder. This folder will be created automatically if it doesn't exist. All the files saved will have the current date and time prefixed to their name.
Running amped.main will save a summary of the config parameters in config_summary.txt and it will also save a full breakdown of the training time (including all metrics) in training_time_breakdown.txt.
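
The output-file behaviour described above (automatic folder creation and a date-and-time prefix) could look roughly like the sketch below. The helper name output_path and the exact timestamp format are assumptions; AMPeD's real file names may use a different format.

```python
from datetime import datetime
from pathlib import Path


def output_path(filename, folder="output_files", now=None):
    """Return folder/<timestamp>_<filename>, creating the folder if needed."""
    now = now or datetime.now()
    directory = Path(folder)
    # Create the output folder automatically if it doesn't exist.
    directory.mkdir(parents=True, exist_ok=True)
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S")  # assumed timestamp format
    return directory / f"{stamp}_{filename}"
```

For example, output_path("config_summary.txt") returns a path inside output_files whose name starts with the current date and time.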

Commandline arguments

Different config file

To use a different config file, specify its path with the --config argument, e.g. python -m amped.main --config myConfig.json. The path is interpreted relative to the amped subfolder. The config file must have the correct structure (see above).

GEMM breakdown

Adding the --GEMM flag will save a GEMM breakdown in gemm_breakdown.txt.

Compute graph

Add the --compute_graph flag together with the --GEMM flag to save the text format of the compute graph in compute_graph.txt.
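
To make the relationship between the three arguments concrete, here is a minimal argparse sketch. It is not AMPeD's actual argument handling: the default of config.json and the decision to reject --compute_graph without --GEMM are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser with the three arguments described above (illustrative)."""
    parser = argparse.ArgumentParser(prog="amped.main")
    parser.add_argument("--config", default="config.json",
                        help="path to the config file, relative to the amped subfolder")
    parser.add_argument("--GEMM", action="store_true",
                        help="save a GEMM breakdown in gemm_breakdown.txt")
    parser.add_argument("--compute_graph", action="store_true",
                        help="save the compute graph in compute_graph.txt")
    return parser


def parse_args(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: treat --compute_graph without --GEMM as a usage error.
    if args.compute_graph and not args.GEMM:
        parser.error("--compute_graph must be used together with --GEMM")
    return args


args = parse_args(["--config", "myConfig.json", "--GEMM", "--compute_graph"])
print(args.config, args.GEMM, args.compute_graph)  # myConfig.json True True
```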

Design space exploration (optional feature)

The optimization.py file contains functions that allow you to evaluate all valid permutations of the batch size and the six parallelization degrees ({DP, TP, PP} ⨯ {intra, inter}). The function print_best_parallelizations at the bottom of the file allows you to find and print these permutations. Change the arguments given to this function as desired.
To save the output to a file, redirect it when running from the terminal: python -m amped.optimization > myFile.txt. To save the output to a file and also see it in the terminal, run python -m amped.optimization | tee myFile.txt.
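
As a rough picture of the search space, the sketch below enumerates combinations of batch size and the six parallelization degrees. The validity rule used here is an assumption: the three intra-node degrees must multiply to the number of accelerators per node, and the three inter-node degrees must multiply to the number of nodes. The function names and dictionary keys are hypothetical, and optimization.py may apply different or additional constraints.

```python
from itertools import product


def factor_triples(n):
    """Return every ordered triple (dp, tp, pp) of positive integers with dp * tp * pp == n."""
    triples = []
    for dp in range(1, n + 1):
        if n % dp:
            continue
        rest = n // dp
        for tp in range(1, rest + 1):
            if rest % tp == 0:
                triples.append((dp, tp, rest // tp))
    return triples


def candidate_mappings(accelerators_per_node, number_of_nodes, batch_sizes):
    """Yield one dict per combination of batch size and six parallelization degrees."""
    intra_options = factor_triples(accelerators_per_node)
    inter_options = factor_triples(number_of_nodes)
    for batch_size, intra, inter in product(batch_sizes, intra_options, inter_options):
        yield {
            "batch_size": batch_size,
            "DP_intra": intra[0], "TP_intra": intra[1], "PP_intra": intra[2],
            "DP_inter": inter[0], "TP_inter": inter[1], "PP_inter": inter[2],
        }


mappings = list(candidate_mappings(8, 4, [256, 512]))
print(len(mappings))  # 120
```

Each candidate would then be passed to the performance model, and the candidates with the lowest estimated training time reported.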

Matrix dimensions for DeepFlow

Running python -m amped.mat_dims will generate a file (mat_dims.txt) with matrix dimensions in the right format to be used in DeepFlow. The current config file is used for the parameters.

Tests

The tests directory contains tests that check the correctness of the program. The tests use their own config file, so changing the main config file doesn't alter their execution. There are also unit tests for other internal files.

Running tests

Run the tests with python -m unittest discover.

Package

The package can be installed using pip install amped and then imported and used as follows:

import amped

print(amped.estimateTrainingTime("H100 PCIe"))

Parameters can be set by editing the config file inside the installed package folder. For example, when using a virtualenv, the file is at venv/Lib/site-packages/amped/config.json (the exact path depends on your platform).
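
Since the location of the package folder varies between environments, a small helper can locate it. The sketch below uses only the standard library; the function name packaged_file is hypothetical and is not part of the amped package.

```python
import importlib.util
from pathlib import Path


def packaged_file(package_name, filename):
    """Return the path of a file that sits next to a package's __init__.py."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(f"package {package_name!r} is not installed")
    # spec.origin points at the package's __init__.py; its parent is the package folder.
    return Path(spec.origin).parent / filename
```

With amped installed, packaged_file("amped", "config.json") would give the path of the config file to edit.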

Building and publishing the package

You only have to do steps 1 through 4 once. If you have done them already, skip to step 5.

Step 1: Create a PyPI account: https://pypi.org/account/register/
Step 2: Create a PyPI API token: https://pypi.org/manage/account/#api-tokens
Step 3: Create a file called .pypirc in your user folder if it doesn't exist already.

Step 4: Append the following to this file:

[pypi]
username = __token__
password = <the API token value, including the 'pypi-' prefix>

Step 5: Increment the version number in pyproject.toml if the package has already been published with this version number.
Step 6: Run python package.py. This will build and publish the package.

To only build the package, comment out the last line in package.py and run it.
