This repository has been archived by the owner on Jan 15, 2024. It is now read-only.
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* Add example script to deploy BERT
* Add options to better measure performance
* Allow specification of path for exported model
* Add option to use custom graph pass
* Add optimization for MHA in custom graph pass
* Correct bug with input shapes in optimize_for
* correct typo
* fix lint
* fix lint
* Add documentation
* Add documentation for using deploy script
* Correct typo/add spaces in documentation
* Add setup.py to compile pass, update documentation
* Fix bug in path to include dir & fix pylint
* Add unit test for deploy bert script
* change CUDA version in wheel
* test latest wheel
* change path to custom pass library
* fixing trigger custom pass compilation
* fix lint
* fix lint
* Update mxnet pip version
* Only GPU versions changed
* fix lint
* change wheel to include mkl headers
* lint docstring
* remove debug print
* change include paths
* lint
* debugging lib_api.h
* debugging lib_api.h
* debugging
* Disable test for now
* skip test if mxnet_version < 1.7.0
* use pytest.mark.skipif to skip test
* test only BERT-base (fp16/fp32, SST/QA, embeddings) to avoid timeout

Co-authored-by: Leonard Lausen <[email protected]>
Co-authored-by: Leonard Lausen <[email protected]>