JDAI-CV

Description

Implementation of 'X-Linear Attention Networks for Image Captioning' [CVPR 2020]

Introduction

This repository is for X-Linear Attention Networks for Image Captioning (CVPR 2020). The original paper can be found here.

Please cite with the following BibTeX:

@inproceedings{xlinear2020cvpr,
  title={X-Linear Attention Networks for Image Captioning},
  author={Pan, Yingwei and Yao, Ting and Li, Yehao and Mei, Tao},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2020}
}

Requirements

Data preparation

  1. Download the bottom-up features and convert them to npz files:

    python2 tools/create_feats.py --infeats bottom_up_tsv --outfolder ./mscoco/feature/up_down_10_100
    
  2. Download the annotations into the mscoco folder. For more details on data preparation, refer to self-critical.pytorch.

  3. Download coco-caption and set _C.INFERENCE.COCOPATH in lib/config.py to its path.

  4. The pretrained models and results can be downloaded here.

  5. The pretrained SENet-154 model can be downloaded here.
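To illustrate what the conversion in step 1 involves, here is a minimal Python 3 sketch. It is not the repository's tools/create_feats.py (which is run with python2 above). It assumes the TSV uses the column layout of the public bottom-up-attention feature dumps (image_id, image_w, image_h, num_boxes, boxes, features, with features stored as base64-encoded float32), and that each output file keeps the features under a key named feat. The function name tsv_to_npz and both of those assumptions are illustrative, so check them against the actual script before relying on this.

```python
import base64
import csv
import os

import numpy as np

# Assumed column layout of the bottom-up-attention TSV dumps.
FIELDNAMES = ["image_id", "image_w", "image_h", "num_boxes", "boxes", "features"]


def tsv_to_npz(tsv_path, out_folder):
    """Write one <image_id>.npz per TSV row; return the number of rows converted."""
    # Feature columns are long base64 strings, so raise the csv field limit.
    csv.field_size_limit(2**31 - 1)
    os.makedirs(out_folder, exist_ok=True)
    count = 0
    with open(tsv_path, newline="") as f:
        reader = csv.DictReader(f, delimiter="\t", fieldnames=FIELDNAMES)
        for row in reader:
            num_boxes = int(row["num_boxes"])
            raw = base64.b64decode(row["features"])
            # One feature vector per detected region.
            feat = np.frombuffer(raw, dtype=np.float32).reshape(num_boxes, -1)
            # savez_compressed appends ".npz"; the "feat" key is an assumption.
            np.savez_compressed(os.path.join(out_folder, row["image_id"]), feat=feat)
            count += 1
    return count
```

A call such as tsv_to_npz("bottom_up_tsv/some_file.tsv", "./mscoco/feature/up_down_10_100") would mirror the command in step 1, where the TSV file name is a placeholder.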

Training

Train X-LAN model

bash experiments/xlan/train.sh

Train X-LAN model using self-critical training

Copy the pretrained model into experiments/xlanrl/snapshot and run the script:

bash experiments/xlanrl/train.sh
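The copy step can be scripted if you stage checkpoints often. The helper below is a small sketch, not part of the repository: the name stage_checkpoint is hypothetical, and it assumes the training script only needs the checkpoint file to be present in the snapshot folder under its original file name.

```python
import shutil
from pathlib import Path


def stage_checkpoint(checkpoint, experiment_dir):
    """Copy a pretrained checkpoint into <experiment_dir>/snapshot; return the new path."""
    src = Path(checkpoint)
    if not src.is_file():
        raise FileNotFoundError(f"checkpoint not found: {src}")
    snapshot_dir = Path(experiment_dir) / "snapshot"
    # Create the snapshot folder if the experiment has not been run yet.
    snapshot_dir.mkdir(parents=True, exist_ok=True)
    # copy2 keeps the original file name and timestamps.
    return Path(shutil.copy2(src, snapshot_dir))
```

For example, stage_checkpoint("path/to/pretrained_model.pth", "experiments/xlanrl") would place the file in experiments/xlanrl/snapshot; the checkpoint path here is a placeholder. The same helper would apply to experiments/xtransformerrl.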

Train X-LAN transformer model

bash experiments/xtransformer/train.sh

Train X-LAN transformer model using self-critical training

Copy the pretrained model into experiments/xtransformerrl/snapshot and run the script:

bash experiments/xtransformerrl/train.sh

Evaluation

CUDA_VISIBLE_DEVICES=0 python3 main_test.py --folder experiments/model_folder --resume model_epoch
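To evaluate several checkpoints in a row, the command above can be wrapped in a short driver. This is a sketch under assumptions: it uses only the --folder and --resume flags shown above, treats the value passed to --resume as an epoch identifier, and the function names build_eval_command and evaluate are hypothetical. It defaults to a dry run that only prints the commands.

```python
import os
import subprocess


def build_eval_command(folder, epoch):
    """Return the argument list for one evaluation run of main_test.py."""
    return ["python3", "main_test.py", "--folder", folder, "--resume", str(epoch)]


def evaluate(folder, epochs, gpu="0", dry_run=True):
    """Build (and optionally run) one evaluation command per epoch."""
    # Restrict the run to one GPU, as in the shell command above.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    commands = [build_eval_command(folder, epoch) for epoch in epochs]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first failing evaluation.
            subprocess.run(cmd, env=env, check=True)
    return commands
```

Calling evaluate("experiments/model_folder", [model_epoch], dry_run=False) from the repository root would run the evaluation, with model_folder and model_epoch replaced by your own values.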

Acknowledgements

Thanks to the contributions of self-critical.pytorch and the awesome PyTorch team.
