GitHub - JackGittes/FAST-CNN: Fixed-point Arithmetic Simulation Toolbox for CNN

Fixed-point Arithmetic Simulation Toolbox for CNN (FAST-CNN)

The motivation for this project is creating a general library which can simulate a CNN with fixed-point operations fast. Since Python is poor in fixed-point calculation support, I try to write the project in MATLAB thoroughly. While the fi object in MATLAB can conviniently express fixed-point(FXP) operations, there are still many functions in MATLAB that don't support fi. So I have to rewrite a lot of basic functions such as conv2d etc step by step.

Project Progress

Several basic functions have been completed and carried on in a parallel way as possible. The comprehensive review of all functions as below:

Type	Status	Description
Conv2d	Completed
Depthwise Conv	Completed
Pooling	Completed	MAX/AVG*
ReLU	Completed
FC	Completed	By converting FC to Conv or MM
Pointwise Conv	Completed	Can be replaced by Conv2d**

Details

The parallelism level (PL) of the source code is from vector to tensor. The definition of different levels is as below:
- L1 Vector: 1×N array
- L2 Matrix: M×N matrix
- L3 Tensor: M×N×[H1,H2 ...], where the length of [H] is no less than 1.
Higher the PL is, faster the function runs.
Conv2d calculates 2d convolution of the input tensor, PL is L3, the input and output format are TF-compatible.
Pooling calculates 2d pooling of the input tensor, PL is L3, while pooling function doesn't support 3d pooling like TF.
Up to present, a standard ConvNet like MobileNet without ResBlock can run on this library. The MobileNet which has 28 layers with depthwise and pointwise convolution consumes about 3~5 minutes to forward once on a 224×224×3 input image @Intel Core i5-8400 CPU with 16.0 GB RAM.
NOTE: If your MATLAB supports parallel computing with Parallel Computing Toolbox (PCT), the algorithm will become faster. According to my test, fully parallel MobileNet will save about 65% time of non-parallel version, which means it takes 70 seconds to forward total 28 layers once. More cores you have on your PC, faster it runs.
The acceleration mainly benefits from the Im2col - Reshape - GEMM - Reshape - Output procedure as below:

Requirement

All codes tested in MATLAB R2017/18b and 2019a on GTX 1050Ti/1080/1080Ti and RTX 2080Ti/2060. As I say above, this library is designed for simulating FXP-CNN and it's also an experimental research which helps us understand FXP behaviors of deep neural networks. Furthermore, it can help people who want to deploy their CNN algorithms on FXP devices (FPGA/ASIC etc) to verify the effectiveness of quantization method.

GPU Support

GPU support is described as below, as for different hardware design, the GPU support is not a general implementation.

TODO

More features will be added into FAST-CNN, including:

Construct computation graph and execute forward process automatically.
- User defines basic graph structure
- Make graph and optimize the executation order
- Initialize Graph with given parameters
- Execute computation graph and report intermediate results for analyzing
Support batch input mode as a suppplement to single image input mode which is the current implementation.
Add backward function.

Name		Name	Last commit message	Last commit date
Latest commit History 67 Commits
+FAST		+FAST
.gitattributes		.gitattributes
.gitignore		.gitignore
BUG.pdf		BUG.pdf
LICENSE		LICENSE
README.md		README.md
debug.md		debug.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Fixed-point Arithmetic Simulation Toolbox for CNN (FAST-CNN)

Project Progress

Details

Requirement

GPU Support

TODO

About

Releases

Packages

Languages

License

JackGittes/FAST-CNN

Folders and files

Latest commit

History

Repository files navigation

Fixed-point Arithmetic Simulation Toolbox for CNN (FAST-CNN)

Project Progress

Details

Requirement

GPU Support

TODO

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages