dmlc / gluon-cv: Gluon CV Toolkit

4.3K Stars · 993 Forks · Last release: 2 months ago (v0.8.0) · Apache License 2.0 · 774 Commits · 8 Releases


Gluon CV Toolkit



| Installation | Documentation | Tutorials |

GluonCV provides implementations of the state-of-the-art (SOTA) deep learning models in computer vision.

It is designed to help engineers, researchers, and students quickly prototype products and research ideas based on these models. This toolkit offers four main features:

  1. Training scripts to reproduce SOTA results reported in research papers
  2. A large number of pre-trained models
  3. Carefully designed APIs that greatly reduce the implementation complexity
  4. Community support
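To make features (2) and (3) concrete, here is a minimal sketch of loading a pretrained classifier from the GluonCV model zoo and classifying one image. The model name `'resnet50_v1b'` and the `transform_eval` preset come from GluonCV's public API, but treat this as an untested illustration, not canonical usage:

```python
# Illustrative sketch only: classify one image with a pretrained GluonCV model.
# Requires `pip install mxnet gluoncv`; guarded with try/except so the snippet
# still imports cleanly when the packages are absent.
def classify(image_path):
    """Return the predicted ImageNet class name for an image,
    or None if mxnet/gluoncv are not installed."""
    try:
        import mxnet as mx
        from gluoncv import model_zoo
        from gluoncv.data.transforms.presets.imagenet import transform_eval
    except ImportError:
        return None
    # Download (once) and load pretrained ResNet50 weights from the model zoo.
    net = model_zoo.get_model('resnet50_v1b', pretrained=True)
    # Resize, crop, and normalize the image the way the model expects.
    img = transform_eval(mx.image.imread(image_path))
    prob = mx.nd.softmax(net(img))[0]
    idx = int(mx.nd.topk(prob, k=1)[0].asscalar())
    return net.classes[idx]
```

The same `model_zoo.get_model` entry point covers the other applications listed below; only the model name and the preprocessing preset change.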


Check out the HD video on YouTube or Bilibili.

Supported Applications

| Application | Illustration | Available Models |
|:---:|:---:|:---:|
| **Image Classification**: recognize an object in an image. | classification | 50+ models, including ResNet, MobileNet, DenseNet, VGG, ... |
| **Object Detection**: detect multiple objects with their bounding boxes in an image. | detection | Faster R-CNN, SSD, YOLOv3 |
| **Semantic Segmentation**: associate each pixel of an image with a categorical label. | semantic | FCN, PSP, ICNet, DeepLab-v3, DeepLab-v3+, DANet, FastSCNN |
| **Instance Segmentation**: detect objects and associate each pixel inside the object area with an instance label. | instance | Mask R-CNN |
| **Pose Estimation**: detect human pose from images. | pose | Simple Pose |
| **Video Action Recognition**: recognize human actions in a video. | action_recognition | TSN, C3D, I3D, P3D, R3D, R2+1D, Non-local, SlowFast |
| **Depth Prediction**: predict a depth map from images. | depth | Monodepth2 |
| **GAN**: generate visually deceptive images. | lsun | WGAN, CycleGAN, StyleGAN |
| **Person Re-ID**: re-identify pedestrians across scenes. | re-id | Market1501 baseline |


Installation

GluonCV supports Python 3.5 or later. The easiest way to install it is via pip.

Stable Release

The following commands install the stable version of GluonCV and MXNet:

pip install gluoncv --upgrade
pip install -U --pre mxnet -f
# if cuda 10.1 is installed
pip install -U --pre mxnet -f

The latest stable version of GluonCV is 0.8, and we recommend MXNet 1.6.0 or 1.7.0.

Nightly Release

You can get access to the latest features and bug fixes with the following commands, which install the nightly builds of GluonCV and MXNet:

pip install gluoncv --pre --upgrade
pip install -U --pre mxnet -f
# if cuda 10.1 is installed
pip install -U --pre mxnet -f

Multiple pre-built MXNet packages are available. Please refer to the mxnet packages page if you need more details about MXNet versions.
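After installing, a quick way to confirm which MXNet version pip actually picked is the small helper below (a sketch, not part of GluonCV; the recommended versions are the 1.6.0/1.7.0 mentioned above):

```python
# Optional sanity check: report the installed MXNet version, if any.
def mxnet_version():
    """Return the installed MXNet version string, or None when MXNet is absent."""
    try:
        import mxnet
    except ImportError:
        return None
    return mxnet.__version__

if __name__ == "__main__":
    v = mxnet_version()
    print(f"MXNet {v}" if v else "MXNet is not installed; see the pip commands above.")
```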

Docs 📖

GluonCV documentation is available at our website.


Tutorials

All tutorials are available at our website!


Examples

Check out how to use GluonCV for your own research or projects.


Citation

If you feel our code or models help your research, kindly cite our papers:

@article{guo2020gluoncv,
  author  = {Jian Guo and He He and Tong He and Leonard Lausen and Mu Li and Haibin Lin and Xingjian Shi and Chenguang Wang and Junyuan Xie and Sheng Zha and Aston Zhang and Hang Zhang and Zhi Zhang and Zhongyue Zhang and Shuai Zheng and Yi Zhu},
  title   = {GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing},
  journal = {Journal of Machine Learning Research},
  year    = {2020},
  volume  = {21},
  number  = {23},
  pages   = {1-7},
  url     = {}
}

@article{he2018bag,
  title   = {Bag of Tricks for Image Classification with Convolutional Neural Networks},
  author  = {He, Tong and Zhang, Zhi and Zhang, Hang and Zhang, Zhongyue and Xie, Junyuan and Li, Mu},
  journal = {arXiv preprint arXiv:1812.01187},
  year    = {2018}
}

@article{zhang2019bag,
  title   = {Bag of Freebies for Training Object Detection Neural Networks},
  author  = {Zhang, Zhi and He, Tong and Zhang, Hang and Zhang, Zhongyue and Xie, Junyuan and Li, Mu},
  journal = {arXiv preprint arXiv:1902.04103},
  year    = {2019}
}

@article{zhang2020resnest,
  title   = {ResNeSt: Split-Attention Networks},
  author  = {Zhang, Hang and Wu, Chongruo and Zhang, Zhongyue and Zhu, Yi and Zhang, Zhi and Lin, Haibin and Sun, Yue and He, Tong and Muller, Jonas and Manmatha, R. and Li, Mu and Smola, Alexander},
  journal = {arXiv preprint arXiv:2004.08955},
  year    = {2020}
}
