Block-wisely Supervised Neural Architecture Search with Knowledge Distillation (CVPR 2020)
No Data
This repository provides the code of our paper: Blockwisely Supervised Neural Architecture Search with Knowledge Distillation.
Illustration of DNA. Each cell of the supernet is trained independently to mimic the behavior of the corresponding teacher block.
Comparison of model ranking for DNA vs. DARTS, SPOS and MnasNet under two different hyper-parameters.
Our searched models have been trained from scratch and can be found in: https://drive.google.com/drive/folders/1Oqc2gq8YysrJq2i6RmPMLKqheGfB9fWH.
Here is a summary of our searched models:
| Model | FLOPs | Params | [email protected] | [email protected] | |:---------:|:---------:|:---------:|:---------:|:---------:| | DNA-a | 348M | 4.2M | 77.1% | 93.3% | | DNA-b | 394M | 4.9M | 77.5% | 93.3% | | DNA-c | 466M | 5.3M | 77.8% | 93.7% | | DNA-d | 611M | 6.4M | 78.4% | 94.0% |
pip install timm==0.1.14We use this pytorch-image-models codebase to validate our models.
The code for supernet training, evaluation and searching is under
searchingdirectory. -
cd searching
initialize/data.yamlto your ImageNet path.
dist_train.shto suit your GPU number. The default batch size is 64 for 8 GPUs, you can change batch size and learning rate in
initialize/train_pipeline.yaml
start_stagein
initialize/train_pipeline.yamlto force start from a intermediate stage without loading checkpoint.
sh dist_train.sh
Our traversal search can handle a search space with 6 ops in each layer, 6 layers in each stage, 6 stages in total. A search process like this should finish in half an hour with a single cpu. To perform search over a larger search space, you can manually divide the search space or use other search algorithms such as Evolution Algorithms to process our evaluated architecture potential files.
Copy the path of architecture potential files generated in step i) to
potential_yamlin
process_potential.py. Modify the constraint in
process_potential.py.
python process_potential.py
The retraining code is simplified from the repo: pytorch-image-models and is under
retrainingdirectory.
cd retraining
Retrain our models or your searched models
run_example.sh: change data path and hyper-params according to your requirements
model.py. You can also use our searched and predefined DNA models.
sh run_example.sh
You can evaluate our models with the following command:\
python validate.py PATH/TO/ImageNet/validation --model DNA_a --checkpoint PATH/TO/model.pth.tar
PATH/TO/ImageNet/validationshould be replaced by your validation data path.
--model:
DNA_acan be replaced by
DNA_b,
DNA_c,
DNA_dfor our different models.
--checkpoint: Suggest the path of your downloaded checkpoint here.