Releases: microsoft/nni
NNI v2.3 Release
Major Updates
Neural Architecture Search
- Retiarii Framework (NNI NAS 2.0) Beta Release with new features:
  - Support new high-level APIs: Repeat and Cell (#3481)
  - Support pure-python execution engine (#3605)
  - Support policy-based RL strategy (#3650)
  - Support nested ModuleList (#3652)
  - Improve documentation (#3785)
  Note: more Retiarii features are planned for future releases; please refer to the Retiarii Roadmap for more information.
- Add new NAS algorithm: Blockwise DNAS FBNet (#3532, thanks the external contributor @alibaba-yiwuyao)
Model Compression
- Support Auto Compression Framework (#3631)
- Support slim pruner in Tensorflow (#3614)
- Support LSQ quantizer (#3503, thanks the external contributor @chenbohua3)
- Improve APIs for iterative pruners (#3507 #3688)
Training service & Rest
- Support 3rd-party training service (#3662 #3726)
- Support setting prefix URL (#3625 #3674 #3672 #3643)
- Improve NNI manager logging (#3624)
- Remove outdated TensorBoard code on nnictl (#3613)
Hyper-Parameter Optimization
WebUI
- Improve search parameters on trial detail page (#3651 #3723 #3715)
- Make selected trials consistent after auto-refresh in detail table (#3597)
- Add trial stdout button on local mode (#3653 #3690)
Examples & Documentation
- Convert all trial examples from config v1 to config v2 (#3721 #3733 #3711 #3600)
- Add new jupyter notebook examples (#3599 #3700)
Dev Excellent
- Upgrade dependencies in Dockerfile (#3713 #3722)
- Substitute PyYAML for ruamel.yaml (#3702)
- Add pipelines for AML and hybrid training service and experiment config V2 (#3477 #3648)
- Add pipeline badge in README (#3589)
- Update issue bug report template (#3501)
Bug Fixes & Minor Updates
- Fix syntax error on Windows (#3634)
- Fix a logging related bug (#3705)
- Fix a bug in GPU indices (#3721)
- Fix a bug in FrameworkController (#3730)
- Fix a bug in export_data_url format (#3665)
- Report version check failure as a warning (#3654)
- Fix bugs and lints in nnictl (#3712)
- Fix bug of optimize_mode on WebUI (#3731)
- Fix bug of useActiveGpu in AML v2 config (#3655)
- Fix bug of experiment_working_directory in Retiarii config (#3607)
- Fix a bug in mask conflict (#3629, thanks the external contributor @Davidxswang)
- Fix a bug in model speedup shape inference (#3588, thanks the external contributor @Davidxswang)
- Fix a bug in multithread on Windows (#3604, thanks the external contributor @Ivanfangsc)
- Delete redundant code in training service (#3526, thanks the external contributor @maxsuren)
- Fix typo in DoReFa compression doc (#3693, thanks the external contributor @Erfandarzi)
- Update docstring in model compression (#3647, thanks the external contributor @ichejun)
- Fix a bug when using Kubernetes container (#3719, thanks the external contributor @rmfan)
NNI v2.2 Release
Major updates
Neural Architecture Search
- Improve NAS 2.0 (Retiarii) Framework (Alpha Release)
  - Support local debug mode (#3476)
  - Support nesting ValueChoice in LayerChoice (#3508)
  - Support dict/list type in ValueChoice (#3508)
  - Improve the format of export architectures (#3464)
  - Refactor of NAS examples (#3513)
  - Refer to https://github.com/microsoft/nni/issues/3301 for Retiarii Roadmap
Model Compression
- Support speedup for mixed precision quantization model (Experimental) (#3488 #3512)
- Support model export for quantization algorithm (#3458 #3473)
- Support model export in model compression for TensorFlow (#3487)
- Improve documentation (#3482)
nnictl & nni.experiment
- Add native support for experiment config V2 (#3466 #3540 #3552)
- Add resume and view mode in Python API nni.experiment (#3490 #3524 #3545)
Training Service
- Support umount for shared storage in remote training service (#3456)
- Support Windows as the remote training service in reuse mode (#3500)
- Remove duplicated env folder in remote training service (#3472)
- Add log information for GPU metric collector (#3506)
- Enable optional Pod Spec for FrameworkController platform (#3379, thanks the external contributor @mbu93)
WebUI
- Support launching TensorBoard on WebUI (#3454 #3361 #3531)
- Upgrade echarts-for-react to v5 (#3457)
- Add wrap for dispatcher/nnimanager log monaco editor (#3461)
Bug Fixes
- Fix bug of FLOPs counter (#3497)
- Fix bug of hyper-parameter Add/Remove axes and table Add/Remove columns button conflict (#3491)
- Fix bug that monaco editor search text is not displayed completely (#3492)
- Fix bug of Cream NAS (#3498, thanks the external contributor @AliCloud-PAI)
- Fix typos in docs (#3448, thanks the external contributor @OliverShang)
- Fix typo in NAS 1.0 (#3538, thanks the external contributor @ankitaggarwal23)
NNI v2.1 Release
Major updates
Neural architecture search
- Improve NAS 2.0 (Retiarii) Framework (Improved Experimental)
  - Improve the robustness of graph generation and code generation for PyTorch models (#3365)
  - Support the inline mutation API ValueChoice (#3349 #3382)
  - Improve the design and implementation of Model Evaluator (#3359 #3404)
  - Support Random/Grid/Evolution exploration strategies (i.e., search algorithms) (#3377)
  - Refer to the Retiarii Roadmap
Training service
- Support shared storage for reuse mode (#3354)
- Support Windows as the local training service in hybrid mode (#3353)
- Remove PAIYarn training service (#3327)
- Add "recently-idle" scheduling algorithm (#3375)
- Deprecate preCommand and enable pythonPath for remote training service (#3284 #3410)
- Refactor reuse mode temp folder (#3374)
nnictl & nni.experiment
- Migrate nnicli to new Python API nni.experiment (#3334)
- Refactor the way of specifying tuner in experiment Python API (nni.experiment), more aligned with nnictl (#3419)
WebUI
- Support showing the assigned training service of each trial in hybrid mode on WebUI (#3261 #3391)
- Support multiple selection for filter status in experiments management page (#3351)
- Improve overview page (#3316 #3317 #3352)
- Support copy trial id in the table (#3378)
Documentation
- Improve model compression examples and documentation (#3326 #3371)
- Add Python API examples and documentation (#3396)
- Add SECURITY doc (#3358)
- Add 'What's NEW!' section in README (#3395)
- Update English contributing doc (#3398, thanks external contributor @Yongxuanzhang)
Bug fixes
- Fix AML outputs path and python process not killed (#3321)
- Fix bug that an experiment launched from Python cannot be resumed by nnictl (#3309)
- Fix import path of network morphism example (#3333)
- Fix bug in the tuple unpack (#3340)
- Fix bug of security for arbitrary code execution (#3311, thanks external contributor @huntr-helper)
- Fix NoneType error on jupyter notebook (#3337, thanks external contributor @tczhangzhi)
- Fix bugs in Retiarii (#3339 #3341 #3357, thanks external contributor @tczhangzhi)
- Fix bug in AdaptDL mode example (#3381, thanks external contributor @ZeyaWang)
- Fix the spelling mistake of assessor (#3416, thanks external contributor @ByronChao)
- Fix bug in ruamel import (#3430, thanks external contributor @rushtehrani)
NNI v2.0 Release
Major updates
Neural architecture search
- Support an improved NAS framework: Retiarii (experimental)
- Support a new NAS algorithm: Cream (#2705)
- Add a new NAS benchmark for NLP model search (#3140)
Training service
- Support hybrid training service (#3097 #3251 #3252)
- Support AdlTrainingService, a new training service based on Kubernetes (#3022, thanks external contributors Petuum @pw2393)
Model compression
- Support pruning schedule for fpgm pruning algorithm (#3110)
- ModelSpeedup improvement: support torch v1.7 (updated graph_utils.py) (#3076)
- Improve model compression utility: model flops counter (#3048 #3265)
WebUI & nnictl
- Support experiments management on WebUI, add a web page for it (#3081 #3127)
- Improve the layout of overview page (#3046 #3123)
- Add navigation bar on the right for logs and configs; add expanded icons for table (#3069 #3103)
Others
- Support launching an experiment from Python code (#3111 #3210 #3263)
- Refactor builtin/customized tuner installation (#3134)
- Support new experiment configuration V2 (#3138 #3248 #3251)
- Reorganize source code directory hierarchy (#2962 #2987 #3037)
- Change SIGKILL to SIGTERM in local mode when cancelling trial jobs (#3173)
- Refactor hyperband (#3040)
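Because SIGTERM, unlike SIGKILL, can be caught by the receiving process, the SIGKILL-to-SIGTERM change listed above may give a trial script a chance to clean up when its job is cancelled in local mode. The following is a minimal sketch using only Python's standard signal module; the helper names make_sigterm_handler and install_sigterm_handler are hypothetical and not part of NNI, and whether a trial needs such a handler depends on how it manages its resources.

```python
import signal
import sys
from typing import Callable


def make_sigterm_handler(cleanup: Callable[[], None]):
    """Build a signal handler that runs cleanup, then exits.

    Hypothetical helper for illustration; not an NNI API.
    """
    def handler(signum, frame):
        cleanup()    # e.g. flush logs or save a checkpoint
        sys.exit(0)  # raises SystemExit so the interpreter shuts down normally
    return handler


def install_sigterm_handler(cleanup: Callable[[], None]) -> None:
    """Register the handler for SIGTERM (must be called from the main thread)."""
    signal.signal(signal.SIGTERM, make_sigterm_handler(cleanup))
```

A trial script would call install_sigterm_handler once near startup, passing whatever cleanup callback it needs.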
Documentation
- Port markdown docs to reStructuredText docs and introduce githublink (#3107)
- List related research and publications in doc (#3150)
- Add tutorial of saving and loading quantized model (#3192)
- Remove paiYarn doc and add description of reuse config in remote mode (#3253)
- Update EfficientNet doc to clarify repo versions (#3158, thanks external contributor @ahundt)
Bug fixes
- Fix exp-duration pause timing under NO_MORE_TRIAL status (#3043)
- Fix bug in NAS SPOS trainer, apply_fixed_architecture (#3051, thanks external contributor @HeekangPark)
- Fix _compute_hessian bug in NAS DARTS (PyTorch version) (#3058, thanks external contributor @hroken)
- Fix bug of conv1d in the cdarts utils (#3073, thanks external contributor @athaker)
- Fix the handling of unknown trials when resuming an experiment (#3096)
- Fix bug of kill command under Windows (#3106)
- Fix lazy logging (#3108, thanks external contributor @HarshCasper)
- Fix checkpoint load and save issue in QAT quantizer (#3124, thanks external contributor @eedalong)
- Fix quant grad function calculation error (#3160, thanks external contributor @eedalong)
- Fix device assignment bug in quantization algorithm (#3212, thanks external contributor @eedalong)
- Fix bug in ModelSpeedup and enhance UT for it (#3279)
- and others
NNI v1.9 Release
Release 1.9 - 10/22/2020
Major updates
Neural architecture search
- Support regularized evolution algorithm for NAS scenario (#2802)
- Add NASBench201 in search space zoo (#2766)
Model compression
- AMC pruner improvement: support resnet, support reproduction of the experiments (default parameters in our example code) in AMC paper (#2876 #2906)
- Support constraint-aware on some of our pruners to improve model compression efficiency (#2657)
- Support "tf.keras.Sequential" in model compression for TensorFlow (#2887)
- Support customized op in the model flops counter (#2795)
- Support quantizing bias in QAT quantizer (#2914)
Training service
- Support configuring python environment using "preCommand" in remote mode (#2875)
- Support AML training service in Windows (#2882)
- Support reuse mode for remote training service (#2923)
WebUI & nnictl
- The "Overview" page on WebUI is redesigned with new layout (#2914)
- Upgraded node, yarn and FabricUI, and enabled Eslint (#2894 #2873 #2744)
- Add/Remove columns in hyper-parameter chart and trials table in "Trials detail" page (#2900)
- Beautify the JSON format utility on WebUI (#2863)
- Support nnictl command auto-completion (#2857)
UT & IT
- Add integration test for experiment import and export (#2878)
- Add integration test for user installed builtin tuner (#2859)
- Add unit test for nnictl (#2912)
Documentation
- Refactor of the document for model compression (#2919)
Bug fixes
- Fix naïve evolution tuner to correctly handle trial failures (#2695)
- Resolve the warning "WARNING (nni.protocol) IPC pipeline not exists, maybe you are importing tuner/assessor from trial code?" (#2864)
- Fix search space issue in experiment save/load (#2886)
- Fix bug in experiment import data (#2878)
- Fix annotation in remote mode (python 3.8 ast update issue) (#2881)
- Support boolean type for "choice" hyper-parameter when customizing trial configuration on WebUI (#3003)
NNI v1.8 Release
Release 1.8 - 8/27/2020
Major updates
Training service
- Access trial log directly on WebUI (local mode only) (#2718)
- Add OpenPAI trial job detail link (#2703)
- Support GPU scheduler in reusable environment (#2627) (#2769)
- Add timeout for web_channel in trial_runner (#2710)
- Show environment error message in AzureML mode (#2724)
- Add more log information when copying data in OpenPAI mode (#2702)
WebUI, nnictl and nnicli
- Improve hyper-parameter parallel coordinates plot (#2691) (#2759)
- Add pagination for trial job list (#2738) (#2773)
- Enable panel close when clicking overlay region (#2734)
- Remove support for Multiphase on WebUI (#2760)
- Support save and restore experiments (#2750)
- Add intermediate results in export result (#2706)
- Add command to list trial results with highest/lowest metrics (#2747)
- Improve the user experience of nnicli with examples (#2713)
Neural architecture search
- Search space zoo: ENAS and DARTS (#2589)
- API to query intermediate results in NAS benchmark (#2728)
Model compression
- Support the List/Tuple Construct/Unpack operation for TorchModuleGraph (#2609)
- Model speedup improvement: Add support of DenseNet and InceptionV3 (#2719)
- Support the multiple successive tuple unpack operations (#2768)
- Doc of comparing the performance of supported pruners (#2742)
- New pruners: Sensitivity pruner (#2684) and AMC pruner (#2573) (#2786)
- TensorFlow v2 support in model compression (#2755)
Backward incompatible changes
- Update the default experiment folder from $HOME/nni/experiments to $HOME/nni-experiments. If you want to view the experiments created by previous NNI releases, you can move the experiments folders from $HOME/nni/experiments to $HOME/nni-experiments manually. (#2686) (#2753)
- Dropped support for Python 3.5 and scikit-learn 0.20 (#2778) (#2777) (#2783) (#2787) (#2788) (#2790)
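The manual move of experiment folders described above could also be scripted. The following is a minimal sketch rather than an official NNI tool: the function name migrate_experiments is hypothetical, and it assumes each experiment is a subdirectory of the old folder. Experiments whose names already exist in the new folder are left where they are.

```python
import shutil
from pathlib import Path
from typing import List


def migrate_experiments(old_root: Path, new_root: Path) -> List[str]:
    """Move experiment folders from old_root into new_root.

    Hypothetical helper (not part of NNI). Returns the names of the folders
    that were moved; folders already present in new_root are not touched.
    """
    moved: List[str] = []
    if not old_root.is_dir():
        return moved  # nothing to migrate
    new_root.mkdir(parents=True, exist_ok=True)
    for entry in sorted(old_root.iterdir()):
        target = new_root / entry.name
        if entry.is_dir() and not target.exists():
            shutil.move(str(entry), str(target))
            moved.append(entry.name)
    return moved
```

For the default locations, the arguments would be Path.home() / "nni" / "experiments" and Path.home() / "nni-experiments".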
Others
Examples
- Remove gpuNum in assessor examples (#2641)
Documentation
- Improve customized tuner documentation (#2628)
- Fix several typos and grammar mistakes in documentation (#2637 #2638, thanks @tomzx)
- Improve AzureML training service documentation (#2631)
- Improve CI of Chinese translation (#2654)
- Improve OpenPAI training service documentation (#2685)
- Improve documentation of community sharing (#2640)
- Add tutorial of Colab support (#2700)
- Improve documentation structure for model compression (#2676)
Bug fixes
- Fix mkdir error in training service (#2673)
- Fix bug when using chmod in remote training service (#2689)
- Fix dependency issue by making _graph_utils imported inline (#2675)
- Fix mask issue in SimulatedAnnealingPruner (#2736)
- Fix intermediate graph zooming issue (#2738)
- Fix issue when dict is unordered when querying NAS benchmark (#2728)
- Fix import issue for gradient selector dataloader iterator (#2690)
- Fix support of adding tens of machines in remote training service (#2725)
- Fix several styling issues in WebUI (#2762 #2737)
- Fix support of unusual types in metrics including NaN and Infinity (#2782)
- Fix nnictl experiment delete (#2791)
NNI v1.7.1 Release
NNI v1.7 Release
Release 1.7 - 7/8/2020
Major Features
Training Service
- Support AML(Azure Machine Learning) platform as NNI training service.
- OpenPAI jobs can be reused. When a trial completes, the OpenPAI job does not stop but waits for the next trial. Refer to the reuse flag in the OpenPAI config.
- Support ignoring files and folders in code directory with .nniignore when uploading code directory to training service.
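To illustrate the idea behind the .nniignore item above, the sketch below filters relative paths against a list of ignore patterns. It is a deliberately simplified illustration built on Python's standard fnmatch module, not NNI's actual matching logic: the function names parse_ignore_file and is_ignored and the matching rules (comments start with "#", a trailing "/" marks a directory pattern) are assumptions made for this example.

```python
from fnmatch import fnmatch
from typing import List


def parse_ignore_file(text: str) -> List[str]:
    """Return the patterns in text, skipping blank lines and '#' comments."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(rel_path: str, patterns: List[str]) -> bool:
    """Simplified matching (an assumption, not NNI's implementation).

    A pattern ending in '/' matches a directory name among the path's parent
    components; any other pattern is tried against the whole relative path
    and against each individual path component.
    """
    parts = rel_path.split("/")
    for pattern in patterns:
        if pattern.endswith("/"):
            if any(fnmatch(part, pattern[:-1]) for part in parts[:-1]):
                return True
        elif fnmatch(rel_path, pattern) or any(fnmatch(p, pattern) for p in parts):
            return True
    return False
```

With the patterns data/ and *.ckpt, for example, data/train.csv and models/best.ckpt would be skipped while train.py would still be uploaded.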
Neural Architecture Search (NAS)
- Provide NAS Open Benchmarks (NasBench101, NasBench201, NDS) with friendly APIs.
- Support Classic NAS (i.e., non-weight-sharing mode) on TensorFlow 2.X.
Model Compression
- Improve Model Speedup: track more dependencies among layers and automatically resolve mask conflict, support the speedup of pruned resnet.
- Added new pruners, including three auto model pruning algorithms (NetAdapt Pruner, SimulatedAnnealing Pruner, AutoCompress Pruner) and ADMM Pruner.
- Added model sensitivity analysis tool to help users find the sensitivity of each layer to the pruning.
- Update lottery ticket pruner to export winning ticket.
Examples
- Automatically optimize tensor operators on NNI with a new customized tuner OpEvo.
Built-in tuners/assessors/advisors
WebUI
- Support friendlier visualization of nested search spaces.
- Show trial's dict keys in hyper-parameter graph.
- Enhancements to trial duration display.
Others
- Provide utility function to merge parameters received from NNI
- Support setting paiStorageConfigName in pai mode
Documentation
- Improve documentation for model compression
- Improve documentation and examples for NAS benchmarks.
- Improve documentation for AzureML training service
- Migrate homepage to Read the Docs.
Bug Fixes
- Fix bug for model graph with shared nn.Module
- Fix nodejs OOM during make build
- Fix NASUI bugs
- Fix duration and intermediate results pictures update issue.
- Fix minor WebUI table style issues.
NNI v1.6 Release
Release 1.6 - 5/26/2020
Major Features
New Features and improvement
- Support __version__ for SDK version
- Support Windows dev install
- Improve IPC limitation to 100W
- improve code storage upload logic among trials in non-local platform
HPO Updates
- Improve PBT on failure handling and support experiment resume for PBT
NAS Updates
- NAS support for TensorFlow 2.0 (preview); see the TF2.0 NAS examples
- Use OrderedDict for LayerChoice
- Prettify the format of export
- Replace layer choice with selected module after applied fixed architecture
Model Compression Updates
- Model compression PyTorch 1.4 support
Training Service Updates
- update pai yaml merge logic
- Support Windows as remote machine in remote mode (see Remote Mode)
Web UI new supports or improvements
- Show trial error message
- finalize homepage layout
- Refactor overview's best trials module
- Remove multiphase from webui
- add tooltip for trial concurrency in the overview page
- Show top trials for hyper-parameter graph
Bug Fix
- fix dev install
- SPOS example crash when the checkpoints do not have state_dict
- Fix table sort issue when experiment had failed trial
- Support multi python env (conda, pyenv etc)
NNI v1.5 Release
New Features and Documentation
Hyper-Parameter Optimizing
- New tuner: Population Based Training (PBT)
- Trials can now report infinity and NaN as result
Neural Architecture Search
- New NAS algorithm: TextNAS
- ENAS and DARTS now support visualization through web UI.
Model Compression
- New Pruner: GradientRankFilterPruner
- Compressors will validate configuration by default
- Refactor: Add optimizer as an input argument of pruner, for easier support of DataParallel and more efficient iterative pruning. This is a breaking change for the usage of iterative pruning algorithms.
- Model compression examples are refactored and improved
- Added documentation for implementing compressing algorithm
Training Service
- Kubeflow now supports pytorchjob crd v1 (thanks external contributor @jiapinai)
- Experimental DLTS support
Overall Documentation Improvement
- Documentation is significantly improved on grammar, spelling, and wording (thanks external contributor @AHartNtkn)
Fixed Bugs
- ENAS cannot have more than one LSTM layer (thanks external contributor @marsggbo)
- NNI manager's timers will never unsubscribe (thanks external contributor @guilhermehn)
- NNI manager may exhaust heap memory (thanks external contributor @Sundrops)
- Batch tuner does not support customized trials (#2075)
- Experiment cannot be killed if it failed on start (#2080)
- Non-number type metrics break web UI (#2278)
- A bug in lottery ticket pruner
- Other minor glitches