[NeurIPS'20] Self-supervised Co-Training for Video Representation Learning. Tengda Han, Weidi Xie, Andrew Zisserman.

CoCLR: Self-supervised Co-Training for Video Representation Learning

[architecture figure]

This repository contains the implementation of:

  • InfoNCE (MoCo on videos)
  • UberNCE (supervised contrastive learning on videos)
  • CoCLR
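
As a reference point for what these objectives optimize, the InfoNCE loss can be sketched in a few lines of NumPy. This is a toy illustration, not the repository's implementation: the actual code uses a MoCo-style momentum encoder with a negative queue of size `--moco-k`, and `infonce_loss` here is a hypothetical helper name.

```python
import numpy as np

def infonce_loss(q, pos, negatives, temperature=0.07):
    """Toy InfoNCE: pull query q toward its positive key, away from negatives.

    q: (d,) query embedding; pos: (d,) positive key;
    negatives: (K, d) negative keys (a MoCo-style memory bank).
    All vectors are assumed L2-normalized.
    """
    logits = np.concatenate(([q @ pos], negatives @ q)) / temperature  # (1+K,)
    logits -= logits.max()  # numerical stability
    return -(logits[0] - np.log(np.exp(logits).sum()))  # -log softmax at the positive

# Tiny example: positive aligned with the query, negatives orthogonal.
q = np.array([1.0, 0.0])
pos = np.array([1.0, 0.0])
negatives = np.array([[0.0, 1.0], [0.0, -1.0]])
loss = infonce_loss(q, pos, negatives)  # close to 0
```

UberNCE follows the same form but, having labels, treats all same-class clips as positives; CoCLR replaces the labels with positives mined from the other modality.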

Links:

[Project Page] [PDF] [arXiv]

News

  • [2021.01.29] Upload both RGB and optical-flow datasets for UCF101 (links below).
  • [2021.01.11] Update our paper to the NeurIPS 2020 final version: corrected the InfoNCE-RGB-linearProbe baseline result in Table 1 from 52.3% (pretrained for 800 epochs, unnecessary and unfair) to 46.8% (pretrained for 500 epochs, a fair comparison). Thanks @liuhualin333 for pointing this out.
  • [2020.12.08] Update instructions.
  • [2020.11.17] Upload pretrained weights for UCF101 experiments.
  • [2020.10.30] Update "draft" dataloader files, CoCLR code, and evaluation code as requested by some researchers. Detailed instructions will be checked and added later.

Pretrain Instruction

  • InfoNCE pretrain on UCF101-RGB

    CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch \
    --nproc_per_node=2 main_nce.py --net s3d --model infonce --moco-k 2048 \
    --dataset ucf101-2clip --seq_len 32 --ds 1 --batch_size 32 \
    --epochs 300 --schedule 250 280 -j 16
    
  • InfoNCE pretrain on UCF101-Flow

    CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch \
    --nproc_per_node=2 main_nce.py --net s3d --model infonce --moco-k 2048 \
    --dataset ucf101-f-2clip --seq_len 32 --ds 1 --batch_size 32 \
    --epochs 300 --schedule 250 280 -j 16
    
  • CoCLR pretrain on UCF101 for one cycle

    CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch \
    --nproc_per_node=2 main_coclr.py --net s3d --topk 5 --moco-k 2048 \
    --dataset ucf101-2stream-2clip --seq_len 32 --ds 1 --batch_size 32 \
    --epochs 100 --schedule 80 --name_prefix Cycle1-FlowMining_ -j 8 \
    --pretrain {rgb_infoNCE_checkpoint.pth.tar} {flow_infoNCE_checkpoint.pth.tar}
    
    CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch \
    --nproc_per_node=2 main_coclr.py --net s3d --topk 5 --moco-k 2048 --reverse \
    --dataset ucf101-2stream-2clip --seq_len 32 --ds 1 --batch_size 32 \
    --epochs 100 --schedule 80 --name_prefix Cycle1-RGBMining_ -j 8 \
    --pretrain {flow_infoNCE_checkpoint.pth.tar} {rgb_cycle1_checkpoint.pth.tar} 
    
  • InfoNCE pretrain on K400-RGB

    CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch \
    --nproc_per_node=4 main_nce.py --net s3d --model infonce --moco-k 16384 \
    --dataset k400-2clip --lr 1e-3 --seq_len 32 --ds 1 --batch_size 32 \
    --epochs 300 --schedule 250 280 -j 16
    
  • InfoNCE pretrain on K400-Flow

    CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch \
    --nproc_per_node=4 main_nce.py --net s3d --model infonce --moco-k 16384 \
    --dataset k400-f-2clip --lr 1e-3 --seq_len 32 --ds 1 --batch_size 32 \
    --epochs 300 --schedule 250 280 -j 16
    
  • CoCLR pretrain on K400 for one cycle

    CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch \
    --nproc_per_node=2 main_coclr.py --net s3d --topk 5 --moco-k 16384 \
    --dataset k400-2stream-2clip --seq_len 32 --ds 1 --batch_size 32 \
    --epochs 50 --schedule 40 --name_prefix Cycle1-FlowMining_ -j 8 \
    --pretrain {rgb_infoNCE_checkpoint.pth.tar} {flow_infoNCE_checkpoint.pth.tar}
    
    CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch \
    --nproc_per_node=2 main_coclr.py --net s3d --topk 5 --moco-k 16384 --reverse \
    --dataset k400-2stream-2clip --seq_len 32 --ds 1 --batch_size 32 \
    --epochs 50 --schedule 40 --name_prefix Cycle1-RGBMining_ -j 8 \
    --pretrain {flow_infoNCE_checkpoint.pth.tar} {rgb_cycle1_checkpoint.pth.tar} 
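
The core of each mining stage above: one modality's embeddings select the top-k most similar clips (the `--topk 5` flag) to serve as positives for training the other modality, and `--reverse` swaps the roles. A toy NumPy sketch of that selection step, assuming precomputed L2-normalized features; `mine_topk_positives` is an illustrative helper name, not the repository's code.

```python
import numpy as np

def mine_topk_positives(flow_feats, idx, k=5):
    """Return the k clips whose flow embeddings are most similar to clip `idx`.

    flow_feats: (N, d) L2-normalized flow embeddings.
    The mined neighbors are treated as extra positives when training the
    RGB network (roles swap in the --reverse / RGB-mining stage).
    """
    sims = flow_feats @ flow_feats[idx]  # cosine similarity to every clip
    sims[idx] = -np.inf                  # a clip cannot mine itself
    return np.argsort(-sims)[:k]         # indices of the top-k neighbors

# Tiny example: clips 0 and 1 share identical flow, clip 2 differs.
feats = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
positives_for_0 = mine_topk_positives(feats, 0, k=1)
```

Because flow is less sensitive to background nuisance than RGB, the flow-mined neighbors tend to be genuine same-action clips, which is what makes the co-training cycle effective.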
    

Dataset

  • RGB for UCF101: [download] (tar file, 29GB, packed with lmdb)
  • TVL1 optical flow for UCF101: [download] (tar file, 20.5GB, packed with lmdb)
  • Note: these lmdb files were created with msgpack==0.6.2; when loading them with msgpack>=1.0.0, pass raw=True:
    msgpack.loads(raw_data, raw=True)
    (issue#32)
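
The effect of raw=True can be checked with a small round trip — a sketch assuming the msgpack package is installed, with use_bin_type=False imitating the old 0.6.x packing defaults:

```python
import msgpack

# Pack the way msgpack==0.6.2 did by default (strings/bytes stored as "raw").
packed = msgpack.packb({"frame": b"\x00\x01"}, use_bin_type=False)

# With msgpack>=1.0.0, raw=True returns those raw fields as bytes
# instead of attempting to utf-8 decode them into str.
decoded = msgpack.loads(packed, raw=True)
# decoded == {b"frame": b"\x00\x01"}
```

Note that with raw=True the dictionary keys come back as bytes, so lookups into the decoded lmdb records must use bytes keys as well.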

Result

Finetune the entire network for action classification on UCF101: [results figure]

Pretrained Weights

Our models:

  • UCF101-RGB-CoCLR: [download] [Acc@1=51.8 on UCF101-RGB]
  • UCF101-Flow-CoCLR: [download] [Acc@1=48.4 on UCF101-Flow]

Baseline models:

  • UCF101-RGB-InfoNCE: [download] [Acc@1=33.1 on UCF101-RGB]
  • UCF101-Flow-InfoNCE: [download] [Acc@1=45.2 on UCF101-Flow]

Kinetics400-pretrained models coming soon.
