# DeepLab-v3-plus Semantic Segmentation in TensorFlow

This repo attempts to reproduce Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation (DeepLabv3+) in TensorFlow, for semantic image segmentation on the PASCAL VOC and Cityscapes datasets. The implementation is largely based on my DeepLabv3 implementation, which was originally based on DrSleep's DeepLabv2 implementation and the TensorFlow models ResNet implementation.

## Setup

Requirements:

- tensorflow >= 1.6
- numpy
- matplotlib
- pillow
- opencv-python

You can install the requirements by running:

```bash
pip install -r requirements.txt
```

## Dataset Preparation

This project uses the TFRecord format to consume data in the training and evaluation process. Creating a TFRecord from raw image files is fairly straightforward and is covered here.
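
For a rough idea of what this involves, the sketch below serializes image/label file pairs into TFRecord examples. It is only an illustration: the feature keys (`image/encoded`, `label/encoded`) and helper names are assumptions, and the actual record format is defined by the `create_*_tf_record.py` scripts in this repository.

```python
# Illustrative sketch (not the repository's conversion script): serialize
# image/label file pairs into a TFRecord. The feature keys are assumptions.
import tensorflow as tf

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def write_tf_record(image_paths, label_paths, output_path):
    """Write (image, label) file pairs into a single TFRecord file."""
    with tf.python_io.TFRecordWriter(output_path) as writer:
        for image_path, label_path in zip(image_paths, label_paths):
            with open(image_path, 'rb') as f:
                image_data = f.read()  # encoded JPEG/PNG image bytes
            with open(label_path, 'rb') as f:
                label_data = f.read()  # encoded PNG label mask bytes
            example = tf.train.Example(features=tf.train.Features(feature={
                'image/encoded': _bytes_feature(image_data),
                'label/encoded': _bytes_feature(label_data),
            }))
            writer.write(example.SerializeToString())
```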

### Cityscapes

Note: This project includes scripts for creating TFRecords for Cityscapes and Pascal VOC, but not for other datasets.

#### Creating TFRecords for Cityscapes

In order to download the Cityscapes dataset, you must first register on their website. After that, make sure to download both `leftImg8bit` and `gtFine`. You should end up with a folder in the following structure:

```
+ cityscapes
  + leftImg8bit
  + gtFine
```
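
As a quick sanity check, you can verify that both folders are in place before proceeding; the path below is just a placeholder for wherever you extracted the dataset.

```python
# Check that the expected Cityscapes folders exist.
# '/path/to/cityscapes' is an illustrative placeholder.
import os

CITYSCAPES_ROOT = '/path/to/cityscapes'
for folder in ('leftImg8bit', 'gtFine'):
    path = os.path.join(CITYSCAPES_ROOT, folder)
    print(path, 'found' if os.path.isdir(path) else 'MISSING')
```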

Next, in order to generate training labels for the dataset, clone the cityscapesScripts project:

```bash
git clone https://github.com/mcordts/cityscapesScripts.git
cd cityscapesScripts
```

Then, from the root of your Cityscapes dataset, run:

```bash
# $CITYSCAPES_ROOT must be defined, e.g. export CITYSCAPES_ROOT=/path/to/cityscapes
python cityscapesscripts/preparation/createTrainIdLabelImgs.py
```

Finally, you can run the conversion script `create_cityscapes_tf_record.py` provided in this repository.

### Pascal VOC

#### Creating TFRecords for Pascal VOC

Once you have the dataset available, you can create TFRecords for Pascal VOC by running the following:

```bash
python create_pascal_tf_record.py --data_dir DATA_DIR \
                                  --image_data_dir IMAGE_DATA_DIR \
                                  --label_data_dir LABEL_DATA_DIR
```
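
To sanity-check the generated records (for either dataset), you can decode a few examples back with `tf.data`. This is an illustrative sketch; the filename and feature keys are assumptions and should match whatever the conversion scripts actually write.

```python
# Illustrative sketch: decode a few examples from a generated TFRecord.
# 'voc_train.record' and the feature keys are assumptions for this example.
import tensorflow as tf

def parse_example(serialized):
    features = tf.parse_single_example(serialized, {
        'image/encoded': tf.FixedLenFeature([], tf.string),
        'label/encoded': tf.FixedLenFeature([], tf.string),
    })
    image = tf.image.decode_image(features['image/encoded'], channels=3)
    label = tf.image.decode_image(features['label/encoded'], channels=1)
    return image, label

dataset = tf.data.TFRecordDataset('voc_train.record').map(parse_example)
image, label = dataset.make_one_shot_iterator().get_next()

with tf.Session() as sess:
    for _ in range(3):
        img, lbl = sess.run([image, label])
        print('image shape:', img.shape, 'label shape:', lbl.shape)
```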

## Training

For training, you need to download and extract the pre-trained ResNet-v2-101 model from slim, specifying its location with `--pre_trained_model`. You also need to convert the original data to the TensorFlow TFRecord format. Once you have followed all the steps in dataset preparation and created TFRecords for the training and validation data, you can start training the model as follows:

```bash
python train.py --model_dir MODEL_DIR --pre_trained_model PRE_TRAINED_MODEL
```

Here, `--pre_trained_model` points to the pre-trained ResNet model, whereas `--model_dir` contains the trained DeepLabv3+ checkpoints. If `--model_dir` contains valid checkpoints, training resumes from the existing checkpoint in `--model_dir`.
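
If you are unsure whether training will resume or start fresh, a quick check with `tf.train.latest_checkpoint` shows which checkpoint (if any) would be picked up from `--model_dir`; the path below is a placeholder.

```python
# Check which checkpoint (if any) training would resume from.
# 'models/deeplabv3plus' is an illustrative placeholder for MODEL_DIR.
import tensorflow as tf

MODEL_DIR = 'models/deeplabv3plus'
latest = tf.train.latest_checkpoint(MODEL_DIR)
if latest is None:
    print('No checkpoint found; training starts from the pre-trained ResNet weights.')
else:
    print('Training resumes from:', latest)
```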

You can see other options with the following command:

```bash
python train.py --help
```

For inference, the trained model with 77.31% mIoU on the Pascal VOC 2012 validation dataset is available here. Download and extract it to `--model_dir`.

The training process can be visualized with TensorBoard as follows:

```bash
tensorboard --logdir MODEL_DIR
```

## Evaluation

To evaluate how the model performs, one can use the following command:

```bash
python evaluate.py --help
```

The current best model built by this implementation achieves 77.31% mIoU on the Pascal VOC 2012 validation dataset.

| Network Backbone | train OS | eval OS | SC | mIOU paper | mIOU repo |
|:----------------:|:--------:|:-------:|:--:|:----------:|:---------:|
| Resnet101        | 16       | 16      |    | 78.85%     | 77.31%    |

Here, the above model was trained for about 9.5 hours (on a Tesla V100 with TensorFlow r1.6) with the following parameters:

```bash
python train.py --train_epochs 43 --batch_size 15 --weight_decay 2e-4 --model_dir models/ba=15,wd=2e-4,max_iter=30k --max_iter 30000
```
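
As a side note on the metric reported above, per-image predictions can be accumulated into a mean IoU with `tf.metrics.mean_iou`. The sketch below is illustrative only and is not the repository's `evaluate.py`; the placeholders and `NUM_CLASSES = 21` (Pascal VOC) are assumptions.

```python
# Illustrative sketch: accumulate mean IoU with tf.metrics.mean_iou.
# Not the repository's evaluate.py; placeholders and NUM_CLASSES are assumptions.
import numpy as np
import tensorflow as tf

NUM_CLASSES = 21  # 20 Pascal VOC classes + background

labels = tf.placeholder(tf.int32, shape=[None, None])
predictions = tf.placeholder(tf.int32, shape=[None, None])
miou, update_op = tf.metrics.mean_iou(labels, predictions, NUM_CLASSES)

with tf.Session() as sess:
    sess.run(tf.local_variables_initializer())  # mean_iou state lives in local vars
    # Feed one (ground truth, prediction) pair per validation image.
    dummy = np.zeros((4, 4), np.int32)
    for lbl, pred in [(dummy, dummy)]:
        sess.run(update_op, feed_dict={labels: lbl, predictions: pred})
    print('mIoU: %.4f' % sess.run(miou))
```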

## Inference

To apply semantic segmentation to your own images, one can use the following command:

```bash
python inference.py --data_dir DATA_DIR --infer_data_list INFER_DATA_LIST --model_dir MODEL_DIR
```

The trained model is available here. One can find a detailed explanation of the output masks, such as the meaning of the colors, in DrSleep's repo.
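
If you want to map a predicted class-index mask to the usual Pascal VOC colors yourself, the standard VOC color map can be generated with the bit-shuffling scheme below. This is a generic sketch of the well-known VOC palette, not code from this repository.

```python
# Generate the standard Pascal VOC color map (class index -> RGB).
# Generic sketch of the well-known VOC palette generation scheme.
import numpy as np

def voc_colormap(num_entries=256):
    colormap = np.zeros((num_entries, 3), dtype=np.uint8)
    for i in range(num_entries):
        r = g = b = 0
        c = i
        for j in range(8):
            r |= ((c >> 0) & 1) << (7 - j)
            g |= ((c >> 1) & 1) << (7 - j)
            b |= ((c >> 2) & 1) << (7 - j)
            c >>= 3
        colormap[i] = [r, g, b]
    return colormap

# Example: colorize an integer mask of shape (H, W) into an RGB image.
# mask = ...  # predicted class indices
# rgb = voc_colormap()[mask]
```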

## TODO

Pull requests are welcome.

- [x] Implement Decoder
- [x] Resnet as Network Backbone
- [x] Training on cityscapes
- [ ] Xception as Network Backbone
- [ ] Implement depthwise separable convolutions
- [ ] Make network more GPU memory efficient (i.e. support larger batch size)
- [ ] Multi-GPU support
- [ ] Channels first support (Apparently large performance boost on GPU)
- [ ] Model pretrained on MS-COCO
- [ ] Unit test

## Acknowledgment

This repo borrows code heavily from:

- DrSleep's DeepLab-ResNet (DeepLabv2)
- TensorFlow Official Models
- Tensorflow Object Detection API
- TensorFlow-Slim
- TensorFlow
