one-shot/few-shot LaNAS

Pros: fast to get a working result.

Cons: the inaccurate prediction from the supernet degrades the final network's performance.

One-shot LaNAS uses a pretrained supernet to predict the performance of a proposed architecture via masking. The following figure illustrates the search procedure.

The training of the supernet is the same as regular training, except that we apply a random mask at each iteration, as sketched below.
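For illustration, here is a minimal sketch of one such training iteration. The mask-accepting forward() is an assumption mirroring supernet_train.py, not the exact code:

import torch
import torch.nn.functional as F

def train_step(supernet, optimizer, images, labels, mask_len):
    # Sample a fresh random binary mask so that each iteration trains
    # a different sub-network of the shared supernet weights.
    mask = torch.randint(0, 2, (mask_len,)).float()
    optimizer.zero_grad()
    logits = supernet(images, mask)  # assumed: forward() takes a mask
    loss = F.cross_entropy(logits, labels)
    loss.backward()
    optimizer.step()
    return loss.item()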

Evaluating search algorithms on the supernet

NASBench-101 contains a very limited number of architectures (~420K), whose performance can easily be fitted by a predictor. The supernet is a great alternative for solving this problem, as it renders a search space of 10^21 architectures. Therefore, our supernet can also be used as a benchmark to evaluate different search algorithms; see Fig. 6 in the LaNAS paper. Please check how LaNAS interacts with the supernet and samples an architecture and its accuracy; a rough sketch of that interaction follows.
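This is only an illustration (the real logic lives in train.py, and the mask-accepting forward() is an assumption): a search algorithm proposes a 0/1 encoding and reads back a predicted accuracy.

import torch

@torch.no_grad()
def predict_accuracy(supernet, encoding, val_loader):
    # Turn the 0/1 architecture encoding into a mask and evaluate the
    # corresponding sub-network on the validation set via weight sharing.
    mask = torch.tensor(encoding, dtype=torch.float32)
    correct, total = 0, 0
    for images, labels in val_loader:
        logits = supernet(images, mask)
        correct += (logits.argmax(dim=1) == labels).sum().item()
        total += labels.size(0)
    return 100.0 * correct / total  # predicted accuracy of the encoding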

Training the supernet

You can skip this step if you use our pre-trained supernet.

Our supernet is designed for the NASNet search space, and changing it to a new design space requires some changes to the code. We're working on this issue and will update later. Training the supernet is fairly easy; simply run

python supernet_train.py

  • Training on ImageNet

Please use the training pipeline from Pytorch-Image-Models. Here are the steps:

  1. Get the supernet model from supernet_train.py, line 94.
  2. Go to Pytorch-Image-Models.
  3. Find pytorch-image-models/blob/master/timm/models/factory.py and replace line 57 as follows:
# model = create_fn(**model_args, **kwargs)  # the original line 57
model = our_supernet  # placeholder: the supernet model obtained in step 1

Searching with a supernet

You can download our pre-trained supernet from here. Place it in the same folder, and start searching with

python train.py

The search results will be written to results.txt, and you can read them with

python read_result.py

The program outputs every sample with its test accuracy, e.g.

[[1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]] 81.69 3774

[1.0 .. 0.0] is the architecture encoding, which can be used to train a network later.

81.69 is the test accuracy predicted by the supernet via weight sharing.

3774 means this is the 3774th sample.
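If you want to post-process results.txt yourself, a minimal parsing sketch for the line format above could look like this (read_result.py already does this for you):

import ast

def parse_result_line(line):
    # "[[1.0, 0.0, ...]] 81.69 3774" -> (encoding, accuracy, sample index)
    encoding_str, acc, idx = line.strip().rsplit(' ', 2)
    encoding = ast.literal_eval(encoding_str)[0]  # the inner 0/1 list
    return encoding, float(acc), int(idx)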

Training a searched network

Once you pick a network after reading the results, you can train the network in the Evaluate folder.

cd Evaluate
# Note: you need to supply the encoding of the target architecture in the masked_code argument
python super_individual_train.py --cutout --auxiliary --batch_size=16 --init_ch=36 --masked_code='[1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0]'

Improving with few-shot NAS

One-shot NAS substantially reduces the computation cost by training only one supernet to approximate the performance of every architecture in the search space via weight sharing. However, the performance estimation can be very inaccurate due to the co-adaption among operations. Recently, we proposed few-shot NAS, which uses multiple supernets, called sub-supernets, each covering a different region of the search space, to alleviate the undesired co-adaption. Since each sub-supernet only covers a small search space, few-shot NAS improves the accuracy of architecture evaluation with only a small increase in evaluation cost compared to one-shot NAS. Please see the following paper for details.

Few-shot Neural Architecture Search
in submission
Yiyang Zhao (WPI), Linnan Wang (Brown), Yuandong Tian (FAIR), Rodrigo Fonseca (Brown), Tian Guo (WPI)

To evaluate few-shot NAS, please check this repository. The following figures show the performance improvement of few-shot NAS.

These figures show that few-shot NAS is an effective trade-off between one-shot NAS and vanilla NAS (i.e., training every network from scratch): it retains both the accurate performance estimation of vanilla NAS and the fast speed of one-shot NAS.
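To make the partitioning idea concrete, here is a simplified sketch, assuming a hypothetical fix_first_choice() helper that restricts one operation choice; this is illustrative only, not the actual few-shot NAS code (see the linked repository):

import copy

def make_sub_supernets(supernet, num_choices):
    # Split the search space on the first operation choice and give each
    # region its own supernet copy, so weights are no longer co-adapted
    # across the whole space.
    subs = []
    for op in range(num_choices):
        sub = copy.deepcopy(supernet)  # independent weights per region
        sub.fix_first_choice(op)       # hypothetical helper: pin the choice to op
        subs.append(sub)
    return subs  # each sub-supernet is then trained on its own region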