Block-wisely Supervised Neural Architecture Search with Knowledge Distillation (CVPR 2020)

This repository provides the code of our paper: Blockwisely Supervised Neural Architecture Search with Knowledge Distillation.

Illustration of DNA. Each cell of the supernet is trained independently to mimic the behavior of the corresponding teacher block.

Comparison of model ranking for DNA vs. DARTS, SPOS and MnasNet under two different hyper-parameters.

Our Trained Models

  • Our searched models have been trained from scratch and can be found in:

  • Here is a summary of our searched models:

    | Model | FLOPs | Params | Acc@1 | Acc@5 |
    |:---------:|:---------:|:---------:|:---------:|:---------:|
    | DNA-a | 348M | 4.2M | 77.1% | 93.3% |
    | DNA-b | 394M | 4.9M | 77.5% | 93.3% |
    | DNA-c | 466M | 5.3M | 77.8% | 93.7% |
    | DNA-d | 611M | 6.4M | 78.4% | 94.0% |


1. Requirements

2. Searching

The code for supernet training, evaluation and searching is under the searching directory.

  • cd searching

i) Train & evaluate the block-wise supernet with knowledge distillation

  • Modify datadir in
    to your ImageNet path.
  • Modify nprocpernode in
    to suit your GPU number. The default batch size is 64 for 8 GPUs; you can change the batch size and learning rate in
  • By default, the supernet is trained sequentially from stage 1 to stage 6 and evaluated after each stage. This takes about 2 days on 8 GPUs with EfficientNet-B7 as the teacher. Resuming from checkpoints is supported. You can also change
    to force a start from an intermediate stage without loading a checkpoint.
  • sh
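
To make the block-wise supervision in step i) concrete, here is a minimal pure-Python sketch. It is not the repository's training code. It assumes that each student block is fed the teacher's previous feature map and is scored by a mean squared error against the teacher block's output; the function names and the toy "blocks" (plain functions on lists) are hypothetical.

```python
def mse(pred, target):
    """Mean squared error between two equally sized feature vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def blockwise_losses(teacher_blocks, student_blocks, x):
    """Return one distillation loss per block (illustrative only).

    Assumption: every student block receives the teacher's feature map
    from the previous block, so the blocks can be trained independently.
    """
    losses = []
    feat = x
    for t_block, s_block in zip(teacher_blocks, student_blocks):
        target = t_block(feat)   # teacher block output for this stage
        pred = s_block(feat)     # student block sees the same input
        losses.append(mse(pred, target))
        feat = target            # the next stage starts from the teacher's output
    return losses


# Toy blocks: the first student block matches its teacher, the second does not.
teacher = [lambda v: [2 * a for a in v], lambda v: [a + 1 for a in v]]
student = [lambda v: [2 * a for a in v], lambda v: list(v)]

losses = blockwise_losses(teacher, student, [1.0, 2.0])
print(losses)
```

Because each loss depends only on the teacher's features, an error in one student block does not propagate into the losses of later blocks.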

    ii) Search for the best architecture under constraint.

    Our traversal search can handle a search space with 6 ops in each layer, 6 layers in each stage, and 6 stages in total. A search of this size should finish in half an hour on a single CPU. To search over a larger space, you can manually divide the search space or use other search algorithms, such as evolutionary algorithms, to process our evaluated architecture potential files.
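
The size of this search space, and the idea of a constrained search over per-stage results, can be sketched as follows. This is an illustration, not the repository's search script. It assumes that the potential of a full architecture is the sum of its per-stage losses and that the constraint is a single additive cost such as FLOPs. The candidate tuples and the function `best_under_budget` are hypothetical.

```python
from itertools import product

OPS_PER_LAYER = 6
LAYERS_PER_STAGE = 6
NUM_STAGES = 6

# Candidates that must be evaluated when each stage is rated on its own.
per_stage = OPS_PER_LAYER ** LAYERS_PER_STAGE
blockwise_total = NUM_STAGES * per_stage
# Size of the joint space if whole architectures were rated instead.
joint_total = per_stage ** NUM_STAGES


def best_under_budget(stage_candidates, max_cost):
    """Brute-force search over one candidate per stage.

    stage_candidates: list (one entry per stage) of lists of
    (name, loss, cost) tuples. Returns (total_loss, total_cost, names)
    for the lowest-loss combination within max_cost, or None.
    """
    best = None
    for combo in product(*stage_candidates):
        cost = sum(c[2] for c in combo)
        if cost > max_cost:
            continue  # violates the constraint
        loss = sum(c[1] for c in combo)
        if best is None or loss < best[0]:
            best = (loss, cost, [c[0] for c in combo])
    return best


# Toy example with two stages and two candidates per stage.
toy = [
    [("a0", 0.5, 100), ("b0", 0.25, 200)],
    [("a1", 1.0, 150), ("b1", 0.5, 300)],
]
best = best_under_budget(toy, max_cost=400)
print(per_stage, blockwise_total, best)
```

A plain product over all stages only works for toy inputs like this one; for the full space you would need pruning or one of the alternatives mentioned above.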

  • Copy the path of architecture potential files generated in step i) to

    . Modify the constraint in
  • python

    iii) Searching with multiple cells in each block.

    Please refer to the clarification from @MohanadOdema in this issue.

3. Retraining

The retraining code is simplified from the repo pytorch-image-models and is under the retraining directory.

  • cd retraining
  • Retrain our models or your searched models

    • Modify the
      : change data path and hyper-params according to your requirements
    • Add your searched model architecture to
      . You can also use our searched and predefined DNA models.
    • sh
  • You can evaluate our models with the following command:

    python PATH/TO/ImageNet/validation --model DNA_a --checkpoint PATH/TO/model.pth.tar
    • PATH/TO/ImageNet/validation
      should be replaced by your validation data path.
    • --model
      can be replaced by
      for our different models.
    • --checkpoint
      : Specify the path of your downloaded checkpoint here.
