by fsx950223

yolov3 with mobilenetv2 and efficientnet

MIT License


TensorFlow implementation of mobilenetv2-yolov3 and efficientnet-yolov3, inspired by keras-yolo3.


Backend:
- [x] MobilenetV2
- [x] Efficientnet
- [x] Darknet53

Callback:
- [x] mAP
- [ ] Tensorboard extern callback

Loss:
- [x] MSE
- [x] GIOU
- [x] Adversarial loss

Train:
- [x] Cosine learning rate
- [x] Auto augment

Tensorflow:
- [x] Tensorflow2 Ready
- [x] pipeline
- [ ] Convert model to tensorflow lite model
- [x] Multi GPU training
- [ ] TPU support
- [x] TensorRT support

Serving:
- [x] Tensorflow Serving warm up request
- [x] Tensorflow Serving JAVA Client
- [x] Tensorflow Serving Python Client
- [x] Tensorflow Serving Service Control Client
- [x] Tensorflow Serving Server Build and Plugins develop
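For readers unfamiliar with the GIOU loss listed above, here is a minimal pure-Python sketch of generalized IoU for two boxes in the `xmin ymin xmax ymax` layout. It follows the definition in the GIoU paper and is not taken from this repository's code; the function names `giou` and `giou_loss` are illustrative only.

```python
def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (xmin, ymin, xmax, ymax)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = max(0.0, ax2 - ax1) * max(0.0, ay2 - ay1)
    area_b = max(0.0, bx2 - bx1) * max(0.0, by2 - by1)

    # Intersection rectangle (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter

    # Smallest axis-aligned box enclosing both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    if union <= 0.0 or enclose <= 0.0:
        return 0.0  # degenerate boxes: treated as no overlap in this sketch

    iou = inter / union
    return iou - (enclose - union) / enclose


def giou_loss(box_a, box_b):
    """Loss form: 0 for identical boxes, up to 2 for distant boxes."""
    return 1.0 - giou(box_a, box_b)
```

Unlike plain IoU, the value stays informative for non-overlapping boxes: it becomes negative and decreases as the boxes move apart.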



pip install -r requirements.txt

Get help info:

python --help


  1. Format file names like [name]_[number].[extension], for example:
    ```
    voc_train_3998.txt
    ```
  2. If you are using a txt dataset, format each record as [image_path] [,[xmin ymin xmax ymax class]]
    (for convenience, you can modify voc to parse your data into this format); otherwise you should modify, then run

to parse your data to tfrecords.

/image/path 179 66 272 290 14 172 38 317 349 14 276 2 426 252 14 1 32 498 365 13

3. Run:

python --mode=TRAIN --train_dataset_glob= --epochs=50
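To make the txt record format from step 2 concrete, the sketch below splits one record into an image path and a list of `(xmin, ymin, xmax, ymax, class)` tuples. It is an illustration, not the repository's own parser: `parse_record` is a hypothetical helper, and it assumes image paths contain no spaces.

```python
def parse_record(line):
    """Split one txt record into (image_path, [(xmin, ymin, xmax, ymax, class), ...])."""
    parts = line.split()
    if not parts:
        raise ValueError("empty record")
    image_path, values = parts[0], parts[1:]
    if len(values) % 5 != 0:
        raise ValueError("each box needs 5 values: xmin ymin xmax ymax class")
    # Group the flat list of numbers into one 5-tuple per box.
    boxes = [
        tuple(int(v) for v in values[i:i + 5])
        for i in range(0, len(values), 5)
    ]
    return image_path, boxes


record = "/image/path 179 66 272 290 14 172 38 317 349 14 276 2 426 252 14 1 32 498 365 13"
image_path, boxes = parse_record(record)
print(image_path, len(boxes))  # prints: /image/path 4
```

A record with no boxes (only an image path) parses to an empty list, which matches the optional box group in the format description.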


python --mode=IMAGE --model=


python --mode=MAP --model= --test_dataset_glob=

Export serving model:

python --mode=SERVING --model=
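Once the exported model is loaded by TensorFlow Serving, it can be queried through the standard REST predict endpoint. The sketch below only builds the URL and JSON body and does not send anything. The model name `yolov3`, the port `8501` (the REST port commonly used in TensorFlow Serving examples), and the input layout are all assumptions; check the exported model's signature for the real input shape.

```python
import json


def build_predict_request(host, model_name, instances, port=8501):
    """Return (url, json_body) for a TensorFlow Serving REST predict call."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    # The row format wraps a batch of inputs in an "instances" list.
    body = json.dumps({"instances": instances})
    return url, body


# Hypothetical 1x1 RGB "image"; a real request would carry a full image tensor.
dummy_image = [[[0.0, 0.0, 0.0]]]
url, body = build_predict_request("localhost", "yolov3", [dummy_image])
print(url)
```

The returned pair can be passed to any HTTP client as a POST request with a JSON content type.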

Use custom config file:

python --config=mobilenetv2.yaml

Set up the tensorflow.js model (a live demo is also available):

  1. Create a web server in the project folder
  2. Open a browser and enter [yoururl:yourport]/tfjs
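Any static file server works for step 1. As one possible option, the sketch below uses Python's standard library; the port `8000` and the helper names `make_server` and `serve` are arbitrary choices for illustration, not part of this project.

```python
import functools
import http.server


def make_server(directory=".", host="127.0.0.1", port=8000):
    """Create a static file server rooted at `directory` (not yet serving)."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.ThreadingHTTPServer((host, port), handler)


def serve(directory=".", port=8000):
    """Serve `directory` until interrupted with Ctrl+C."""
    with make_server(directory, port=port) as httpd:
        print(f"Serving {directory} on port {httpd.server_address[1]}")
        httpd.serve_forever()
```

Calling `serve()` from the project folder and opening `http://localhost:8000/tfjs` would then correspond to step 2.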


  • Download Pascal tfrecords from here.
  • Download pre-trained mobilenetv2-yolov3 model (VOC2007) here.
  • Download pre-trained efficientnet-yolov3 model (VOC2007) here.
  • Download pre-trained efficientnet-yolov3 model (VOC2007+2012) here.


Network: Mobilenetv2+Yolov3
Input size: 416*416
Train Dataset: VOC2007
Test Dataset: VOC2007

aeroplane ap:  0.6721874861775297
bicycle ap:  0.7844226664948993
bird ap:  0.6863393529648882
boat ap:  0.5102715372530052
bottle ap:  0.4098093697072679
bus ap:  0.7646277543282962
car ap:  0.8000339732789448
cat ap:  0.8681120849855787
chair ap:  0.4021823009684314
cow ap:  0.6768311030872428
diningtable ap:  0.626045232887253
dog ap:  0.8293983813984888
horse ap:  0.8315961581768014
motorbike ap:  0.771283337747543
person ap:  0.7298645793931624
pottedplant ap:  0.3081565644702266
sheep ap:  0.6510012751038824
sofa ap:  0.6442699680945367
train ap:  0.8025086962000969
tvmonitor ap:  0.6239227675451299
mAP:  0.6696432295131602

GPU inference time (GTX1080Ti): 19ms
CPU inference time (i7-8550U): 112ms
Model size: 37M
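The reported mAP is consistent with an unweighted mean of the 20 per-class APs. The sketch below recomputes it from the Mobilenetv2+Yolov3 numbers above; it assumes plain averaging and says nothing about how each per-class AP was measured.

```python
# Per-class AP values copied from the Mobilenetv2+Yolov3 results above.
per_class_ap = {
    "aeroplane": 0.6721874861775297,
    "bicycle": 0.7844226664948993,
    "bird": 0.6863393529648882,
    "boat": 0.5102715372530052,
    "bottle": 0.4098093697072679,
    "bus": 0.7646277543282962,
    "car": 0.8000339732789448,
    "cat": 0.8681120849855787,
    "chair": 0.4021823009684314,
    "cow": 0.6768311030872428,
    "diningtable": 0.626045232887253,
    "dog": 0.8293983813984888,
    "horse": 0.8315961581768014,
    "motorbike": 0.771283337747543,
    "person": 0.7298645793931624,
    "pottedplant": 0.3081565644702266,
    "sheep": 0.6510012751038824,
    "sofa": 0.6442699680945367,
    "train": 0.8025086962000969,
    "tvmonitor": 0.6239227675451299,
}

# mAP as the unweighted mean over classes.
mean_ap = sum(per_class_ap.values()) / len(per_class_ap)
print(f"mAP: {mean_ap:.4f}")  # prints: mAP: 0.6696
```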

Network: Efficientnet+Yolov3
Input size: 380*380
Train Dataset: VOC2007
Test Dataset: VOC2007

aeroplane ap:  0.7770436248733187
bicycle ap:  0.822183784348553
bird ap:  0.7346967323068865
boat ap:  0.6142903989882571
bottle ap:  0.4518063126765959
bus ap:  0.782237197681936
car ap:  0.8138978890046222
cat ap:  0.8800232369515162
chair ap:  0.4531520519719176
cow ap:  0.6992367978932157
diningtable ap:  0.6765065569475968
dog ap:  0.8612118810883834
horse ap:  0.8559580684256001
motorbike ap:  0.8027311717682002
person ap:  0.7280218883512792
pottedplant ap:  0.35520418960051925
sheep ap:  0.6833401035128458
sofa ap:  0.6753841073186044
train ap:  0.8107647793504738
tvmonitor ap:  0.6726791558585905
mAP:  0.7075184964459456

GPU inference time (GTX1080Ti): 23ms
CPU inference time (i7-8550U): 168ms
Model size: 77M

Network: Efficientnet+Yolov3
Input size: 380*380
Train Dataset: VOC2007+VOC2012
Test Dataset: VOC2007

aeroplane ap:  0.8572154850266848
bicycle ap:  0.8129962658687486
bird ap:  0.8325678324285539
boat ap:  0.7061501348114156
bottle ap:  0.5603823420846883
bus ap:  0.8536452418769342
car ap:  0.8395446870008888
cat ap:  0.9200504816535645
chair ap:  0.514644868267842
cow ap:  0.8202171886452714
diningtable ap:  0.7370149790284737
dog ap:  0.900374518831019
horse ap:  0.8632567146990895
motorbike ap:  0.8147344820261591
person ap:  0.7690434789031615
pottedplant ap:  0.4576271726152926
sheep ap:  0.8006580581981677
sofa ap:  0.7478146395952494
train ap:  0.8783508559769437
tvmonitor ap:  0.6923886096918628
mAP:  0.7689339018615006

GPU inference time (GTX1080Ti): 23ms
CPU inference time (i7-8550U): 168ms
Model size: 77M


- YOLOv3: An Incremental Improvement
- An Analysis of Scale Invariance in Object Detection - SNIP
- Scale-Aware Trident Networks for Object Detection
- Understanding the Effective Receptive Field in Deep Convolutional Neural Networks
- Bag of Freebies for Training Object Detection Neural Networks
- Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression
- MobileNetV2: Inverted Residuals and Linear Bottlenecks
