
About the developer

gupta-abhay
MIT License

Description

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale


Vision Transformers

Implementation of the Vision Transformer (ViT) in PyTorch, a model that achieves state-of-the-art results on image classification using transformer-style encoders. See the associated blog article.

ViT
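For readers new to ViT, the core idea is: split the image into fixed-size patches, linearly embed each patch, prepend a learnable class token, add position embeddings, and run the sequence through a transformer encoder. A minimal self-contained sketch using stock PyTorch modules (the names `MiniViT` and `PatchEmbed` are illustrative, not taken from this repository):

```python
import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    """Split an image into fixed-size patches and linearly embed each one."""
    def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=768):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        # A strided convolution is equivalent to slicing out patches and projecting them.
        self.proj = nn.Conv2d(in_chans, embed_dim,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x):
        x = self.proj(x)                     # (B, D, H/P, W/P)
        return x.flatten(2).transpose(1, 2)  # (B, N, D)

class MiniViT(nn.Module):
    def __init__(self, num_classes=1000, embed_dim=768, depth=2, num_heads=12):
        super().__init__()
        self.patch_embed = PatchEmbed(embed_dim=embed_dim)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.pos_embed = nn.Parameter(
            torch.zeros(1, self.patch_embed.num_patches + 1, embed_dim))
        layer = nn.TransformerEncoderLayer(embed_dim, num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x):
        x = self.patch_embed(x)
        cls = self.cls_token.expand(x.shape[0], -1, -1)
        x = torch.cat([cls, x], dim=1) + self.pos_embed
        x = self.encoder(x)
        return self.head(x[:, 0])  # classify from the class token

model = MiniViT(num_classes=10, embed_dim=64, depth=1, num_heads=4)
out = model(torch.randn(2, 3, 224, 224))  # out.shape == (2, 10)
```

This sketch omits details such as dropout, layer-norm placement, and weight initialization, which the paper and this repository handle more carefully.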

Features

  • [x] Vanilla ViT
  • [x] Hybrid ViT (with support for BiTResNets as backbone)
  • [x] Hybrid ViT (with support for AxialResNets as backbone)
  • [x] Training Scripts
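The hybrid variants replace raw-pixel patches with the feature map of a CNN backbone, which is then projected into token embeddings. A toy sketch of that idea, where the two-layer convolutional stem is a hypothetical stand-in for a BiT-ResNet or Axial-ResNet backbone (not this repository's code):

```python
import torch
import torch.nn as nn

# Hypothetical toy backbone standing in for a BiT-ResNet / Axial-ResNet stem.
backbone = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=7, stride=4, padding=3),
    nn.ReLU(),
    nn.Conv2d(64, 256, kernel_size=3, stride=4, padding=1),
)

embed_dim = 128
# With CNN features, each spatial position is a "patch", so a 1x1
# convolution suffices as the patch projection.
proj = nn.Conv2d(256, embed_dim, kernel_size=1)

x = torch.randn(2, 3, 224, 224)
feats = backbone(x)                              # (2, 256, 14, 14)
tokens = proj(feats).flatten(2).transpose(1, 2)  # (2, 196, 128)
```

The resulting token sequence is fed to the transformer encoder exactly as pixel-patch embeddings would be.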

To Do:

  • [ ] Training Script
    • [ ] Support for linear decay
    • [ ] Correct hyperparameters
  • [ ] Full Axial-ViT
  • [ ] Results for Imagenet-1K and Imagenet-21K
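The planned linear decay can be sketched with PyTorch's `LambdaLR`: linear warmup followed by a linear decay to zero. The step counts below are illustrative assumptions, not the paper's or this repository's hyperparameters:

```python
import torch
from torch.optim.lr_scheduler import LambdaLR

model = torch.nn.Linear(10, 10)
opt = torch.optim.SGD(model.parameters(), lr=1.0)

warmup_steps, total_steps = 100, 1000  # assumed values for illustration

def linear_schedule(step):
    """Multiplier on the base LR: ramp up linearly, then decay linearly to 0."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / (total_steps - warmup_steps))

sched = LambdaLR(opt, lr_lambda=linear_schedule)
for _ in range(5):
    opt.step()    # optimizer step first,
    sched.step()  # then advance the schedule
```

The multiplier is 0 at step 0, peaks at 1.0 when warmup ends, and reaches 0 again at `total_steps`.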

Installation

Create the environment:

conda env create -f environment.yml

Preparing the dataset:

mkdir data
cd data
ln -s path/to/dataset imagenet

Running the Scripts

For non-distributed training:

python train.py --model ViT --name vit_logs

For distributed training:

CUDA_VISIBLE_DEVICES=0,1,2,3 python dist_train.py --model ViT --name vit_dist_logs

For testing, add the --test flag:

python train.py --model ViT --name vit_logs --test
CUDA_VISIBLE_DEVICES=0,1,2,3 python dist_train.py --model ViT --name vit_dist_logs --test

References

  1. BiTResNet: https://github.com/google-research/big_transfer/tree/master/bit_pytorch
  2. AxialResNet: https://github.com/csrhddlam/axial-deeplab
  3. Training Scripts: https://github.com/csrhddlam/axial-deeplab

Citations

@inproceedings{
    anonymous2021an,
    title={An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
    author={Anonymous},
    booktitle={Submitted to International Conference on Learning Representations},
    year={2021},
    url={https://openreview.net/forum?id=YicbFdNTTy},
    note={under review}
}
