MR-CNNs for Large-Scale Scene Recognition
Here we provide the code and models for the following paper (Arxiv Preprint):
Knowledge Guided Disambiguation for Large-Scale Scene Classification with Multi-Resolution CNNs Limin Wang, Sheng Guo, Weilin Huang, Yuanjun Xiong, and Yu Qiao in IEEE Transactions on Image Processing, 2017
We have made two efforts to exploit CNNs for large-scale scene recognition: - We design a modular framework to capture multi-level visual information for scene understanding by training CNNs from different resolutions - We propose a knowledge disambiguation strategy by using soft labels from extra networks to deal with the label ambiguity issue of scene recognition.
These two efforts are the core part of team "SIAT_MMLAB" for the following large-scale scene recogntion challenges.
| Challenge | Rank | Performance | |:-------------------:|:--------------:|:--------------:| | Places2 challenge 2015 | 2nd place | 0.1736 top5-error | | Places2 challenge 2016 | 4th place | 0.1042 top5-error | | LSUN challenge 2015 | 2nd place | 0.9030 top1-accuracy | | LSUN challenge 2016 | 1st place | 0.9161 top1-accuracy |
We first release the learned models on the Places365 dataset. - Models learned at resolution of 256 * 256
| Model | Top5 Error Rate | |:-------------------:|:--------------:| | (A0) Normal BN-Inception | 0.143 | | (A1) Normal BN-Inception + object networks | 0.141 | | (A2) Normal BN-Inception + scene networks | 0.134 |
| Model | Top5 Error Rate | |:-------------------:|:--------------:| | (B0) Deeper BN-Inception | 0.140 | | (B1) Deeper BN-Inception + object networks | 0.136 | | (B2) Deeper BN-Inception + scene networks | 0.130 |
We release the scripts at the directory of
bash scripts/get_init_models.shto downdload knowldege models.
bash scripts/get_reference_models.shto download reference models.
We release the testing code on the Places365 validation dataset at the directory of
We also release a demo code to use our Places365 model as generic feature extraction and perform scene recognition on the MIT Indoor67 dataset at the directory of
We release the models at the directory of
models/and the training scripts at the directory of
bash scripts/256_inception2_train.shto train standard CNNs.
bash scripts/256_kd_object_inception2_train.shto train knowledge disambiguation networks (by object network).
bash scripts/256_kd_scene_inception2_train.shto train knowledge disambiguation netowrks (by scene network).
The training code is based on our modified Caffe toolbox. It is a efficient parallel caffe with MPI implementation. Meanwhile, we implement a new kl-divergence loss layer for our knowledge disambiguation methods;