

A PyTorch Implementation of Single Shot MultiBox Detector


SSD: Single Shot MultiBox Object Detector, in PyTorch

A PyTorch implementation of the Single Shot MultiBox Detector from the 2016 paper by Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander C. Berg. The official and original Caffe code can be found here.




Installation

  • Install PyTorch by selecting your environment on the website and running the appropriate command.
  • Clone this repository.
    • Note: We currently only support Python 3+.
  • Then download the dataset by following the instructions below.
  • We now support Visdom for real-time loss visualization during training!
    • To use Visdom in the browser:
      # First install Python server and client
      pip install visdom
      # Start the server (probably in a screen or tmux)
      python -m visdom.server
    • Then (during training) navigate to http://localhost:8097/ (see the Train section below for training details).
  • Note: For training, we currently support VOC and COCO, and aim to add ImageNet support soon.
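
Before launching a training run with Visdom enabled, it can help to confirm that the server started above is actually listening. The following is a minimal sketch using only the Python standard library. The helper name `visdom_is_up` is hypothetical (it is not part of this repository or of Visdom), and the host and port simply mirror the default http://localhost:8097/ address mentioned above.

```python
import socket


def visdom_is_up(host="localhost", port=8097, timeout=1.0):
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper: it only checks that the port is open, not that
    the listener is really a Visdom server.
    """
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing is listening on the given address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If `visdom_is_up()` returns False, start the server with `python -m visdom.server` as described above and try again.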


To make things easy, we provide bash scripts to handle the dataset downloads and setup for you. We also provide simple dataset loaders that inherit
, making them fully compatible with the


Microsoft COCO: Common Objects in Context

Download COCO 2014
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/

VOC Dataset

PASCAL VOC: Visual Object Classes

Download VOC2007 trainval & test
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/ # 
Download VOC2012 trainval
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/ # 
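
The download scripts take an optional target directory and fall back to ~/data/ otherwise. For anyone scripting around them, the same rule can be sketched in Python. `resolve_data_root` is a hypothetical helper for illustration, not a function from this repository.

```python
import os


def resolve_data_root(directory=None):
    """Return the dataset directory: the given one, else the ~/data/ default."""
    root = directory if directory else os.path.join("~", "data")
    # Expand "~" to the user's home directory and normalise to an absolute path.
    return os.path.abspath(os.path.expanduser(root))
```

Calling `resolve_data_root()` with no argument yields the expanded form of ~/data, while `resolve_data_root("/tmp/voc")` returns the directory you passed in.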

Training SSD

  • First download the fc-reduced VGG-16 PyTorch base network weights at:
  • By default, we assume you have downloaded the file in the
mkdir weights
cd weights
  • To train SSD using the train script simply specify the parameters listed in
    as a flag or manually change them.
  • Note:
    • For training, an NVIDIA GPU is strongly recommended for speed.
    • For instructions on Visdom usage/installation, see the Installation section.
    • You can resume training from a checkpoint by specifying its path as one of the training parameters (again, see
      for options)
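
The "specify as a flag or change the default" pattern described above is commonly implemented with argparse. Below is a minimal sketch of that pattern, not the repository's actual training script: the flag names (`--batch_size`, `--lr`, `--resume`) and their defaults are assumptions chosen for illustration.

```python
import argparse


def build_parser():
    """Build an illustrative parser; flag names and defaults are assumptions."""
    parser = argparse.ArgumentParser(description="Illustrative SSD training options")
    parser.add_argument("--batch_size", type=int, default=32,
                        help="images per training batch")
    parser.add_argument("--lr", type=float, default=1e-3,
                        help="initial learning rate")
    parser.add_argument("--resume", type=str, default=None,
                        help="path to a checkpoint to resume training from")
    return parser
```

With a parser like this, passing `--resume` with a checkpoint path on the command line overrides the default of None, and editing a `default=` value in the script has the same effect as passing that flag on every run.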


To evaluate a trained network:


You can specify the parameters listed in the
file by flagging them or manually changing them.


VOC2007 Test


| Original | Converted weiliu89 weights | From scratch w/o data aug | From scratch w/ data aug |
|:-:|:-:|:-:|:-:|
| 77.2 % | 77.26 % | 58.12 % | 77.43 % |


GTX 1060: ~45.45 FPS
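
To put the throughput figure in per-image terms, frames per second can be converted to latency. This is plain arithmetic on the number reported above and assumes images are processed one at a time. `fps_to_latency_ms` is a hypothetical helper name.

```python
def fps_to_latency_ms(fps):
    """Convert frames per second to average milliseconds per frame."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# The GTX 1060 figure above works out to about 22 ms per image.
print(round(fps_to_latency_ms(45.45), 1))  # 22.0
```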


Use a pre-trained SSD network for detection

Download a pre-trained network

  • We are trying to provide PyTorch
    (dict of weight tensors) of the latest SSD model definitions trained on different datasets.
  • Currently, we provide the following PyTorch models:
    • SSD300 trained on VOC0712 (newest PyTorch weights)
    • SSD300 trained on VOC0712 (original Caffe weights)
  • Our goal is to reproduce this table from the original paper

    SSD results on multiple datasets

Try the demo notebook

  • Make sure you have Jupyter Notebook installed.
  • Two alternatives for installing Jupyter Notebook:
    1. If you installed PyTorch with conda (recommended), then you should already have it. Just navigate to the cloned ssd.pytorch repo and run:
       jupyter notebook
    2. If using pip:
       # make sure pip is upgraded
       pip3 install --upgrade pip
       # install jupyter notebook
       pip install jupyter
       # Run this inside ssd.pytorch
       jupyter notebook
  • Now navigate to
    at http://localhost:8888 (by default) and have at it!

Try the webcam demo

  • Works on CPU (may have to tweak
    for optimal fps) or on an NVIDIA GPU
  • This demo currently requires OpenCV 2+ with Python bindings and an onboard webcam
    • You can change the default webcam in
  • Install the imutils package to leverage multi-threading on CPU:
    • pip install imutils
  • Running
    python -m
    opens the webcam and begins detecting!


We have accumulated the following to-do list, which we hope to complete in the near future.

Still to come:
  • [x] Support for the MS COCO dataset
  • [ ] Support for SSD512 training and testing
  • [ ] Support for training on custom datasets


Note: Unfortunately, this is just a hobby of ours and not a full-time job, so we'll do our best to keep things up to date, but no guarantees. That being said, thanks to everyone for your continued help and feedback as it is really appreciated. We will try to address everything as soon as possible.

