AAAI'20 - Learning 2D Temporal Adjacent Networks for Moment Localization with Natural Language
We are hiring talented interns: [email protected]
In this paper, we study the problem of moment localization with natural language, and propose a novel 2D Temporal Adjacent Networks (2D-TAN) method. The core idea is to retrieve a moment on a two-dimensional temporal map, which considers adjacent moment candidates as the temporal context. 2D-TAN is capable of encoding adjacent temporal relations, while learning discriminative features for matching video moments with referring expressions. Our model is simple in design and achieves competitive performance in comparison with the state-of-the-art methods on three benchmark datasets.
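As a rough illustration of the 2D temporal map idea (a sketch under our own naming, not the authors' exact implementation), the snippet below builds an N×N map of moment candidates by max-pooling clip-level features over each candidate span [i, j]:

```python
import torch

def build_2d_map(clip_feats):
    """Build a 2D temporal map of moment candidates.

    clip_feats: (N, D) tensor of clip-level features for N video clips.
    Returns an (N, N, D) map where cell (i, j) with i <= j holds the
    max-pooled feature of the moment spanning clips i..j; cells with
    i > j are invalid and left as zeros.
    """
    n, d = clip_feats.shape
    feat_map = clip_feats.new_zeros(n, n, d)
    for i in range(n):
        running = clip_feats[i]
        feat_map[i, i] = running
        for j in range(i + 1, n):
            running = torch.max(running, clip_feats[j])  # incremental max-pool
            feat_map[i, j] = running
    return feat_map
```

Because adjacent cells of the map share most of their clips, scoring the whole map at once lets the network see neighboring candidates as temporal context.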
For journal reviewers: please download and unzip `journal.zip`. The password is the manuscript ID of our submission.
Charades-STA:

| Method | [email protected] | [email protected] | [email protected] | [email protected] |
| ------ |:---------:|:---------:|:---------:|:---------:|
| Pool   | 40.94 | 22.85 | 83.84 | 50.35 |
| Conv   | 42.80 | 23.25 | 80.54 | 54.14 |
We fixed a bug in loading the Charades visual features; the updated performance is listed above. Please use these results when comparing with our AAAI paper.
ActivityNet Captions:

| Method | [email protected] | [email protected] | [email protected] | [email protected] | [email protected] | [email protected] |
| ------ |:---------:|:---------:|:---------:|:---------:|:---------:|:---------:|
| Pool   | 59.45 | 44.51 | 26.54 | 85.53 | 77.13 | 61.96 |
| Conv   | 58.75 | 44.05 | 27.38 | 85.65 | 76.65 | 62.26 |
TACoS:

| Method | [email protected] | [email protected] | [email protected] | [email protected] | [email protected] | [email protected] |
| ------ |:---------:|:---------:|:---------:|:---------:|:---------:|:---------:|
| Pool   | 47.59 | 37.29 | 25.32 | 70.31 | 57.81 | 45.04 |
| Conv   | 46.39 | 35.17 | 25.17 | 74.46 | 56.99 | 44.24 |
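For reference, [email protected] is the percentage of test queries for which at least one of the top-n retrieved moments has temporal IoU of at least m with the ground-truth span. A minimal sketch of this metric (function and variable names are ours, not from the codebase):

```python
def temporal_iou(pred, gt):
    """IoU of two temporal segments, each given as (start, end) in seconds."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = max(pred[1], gt[1]) - min(pred[0], gt[0])
    return inter / union if union > 0 else 0.0

def rank_n_at_iou(ranked_preds, gt, n, thresh):
    """1.0 if any of the top-n predictions reaches the IoU threshold, else 0.0.
    Averaging this over all test queries gives Rank n@thresh."""
    return float(any(temporal_iou(p, gt) >= thresh for p in ranked_preds[:n]))
```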
Please download the visual features from Google Drive and save them to the `data/` folder.
Use the following commands for training:
```
python moment_localization/train.py --cfg experiments/charades/2D-TAN-16x16-K5L8-pool.yaml --verbose
python moment_localization/train.py --cfg experiments/charades/2D-TAN-16x16-K5L8-conv.yaml --verbose
python moment_localization/train.py --cfg experiments/activitynet/2D-TAN-64x64-K9L4-pool.yaml --verbose
python moment_localization/train.py --cfg experiments/activitynet/2D-TAN-64x64-K9L4-conv.yaml --verbose
python moment_localization/train.py --cfg experiments/tacos/2D-TAN-128x128-K5L8-pool.yaml --verbose
python moment_localization/train.py --cfg experiments/tacos/2D-TAN-128x128-K5L8-conv.yaml --verbose
```
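Our reading of the config names (an assumption, not an official spec): `16x16` is the size N of the temporal candidate map, `K5` the convolution kernel size, `L8` the number of stacked layers, and `pool`/`conv` selects how moment features are aggregated from clip features. A hedged sketch of the temporal adjacent network these hyperparameters describe, applied to the moment map after fusion with the query representation (all names below are ours):

```python
import torch
import torch.nn as nn

class TemporalAdjacentNet(nn.Module):
    """Sketch: num_layers conv layers with k x k kernels over the fused
    (B, D, N, N) moment map, then a 1x1 conv producing one matching
    score per candidate cell."""

    def __init__(self, dim=512, k=5, num_layers=8):
        super().__init__()
        layers = []
        for _ in range(num_layers):
            layers += [nn.Conv2d(dim, dim, k, padding=k // 2), nn.ReLU()]
        self.convs = nn.Sequential(*layers)
        self.score = nn.Conv2d(dim, 1, 1)  # per-candidate score head

    def forward(self, fused_map):  # fused_map: (B, D, N, N)
        return torch.sigmoid(self.score(self.convs(fused_map)))  # (B, 1, N, N)
```

The k x k kernels are what let each candidate's score depend on its adjacent candidates on the map, which is the "temporal adjacent" part of the design.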
Our trained models are provided on Google Drive. Please download them to the `checkpoints/` folder.
Then, run the following commands for evaluation:
```
python moment_localization/test.py --cfg experiments/charades/2D-TAN-16x16-K5L8-pool.yaml --verbose --split test
python moment_localization/test.py --cfg experiments/charades/2D-TAN-16x16-K5L8-conv.yaml --verbose --split test
python moment_localization/test.py --cfg experiments/activitynet/2D-TAN-64x64-K9L4-pool.yaml --verbose --split test
python moment_localization/test.py --cfg experiments/activitynet/2D-TAN-64x64-K9L4-conv.yaml --verbose --split test
python moment_localization/test.py --cfg experiments/tacos/2D-TAN-128x128-K5L8-pool.yaml --verbose --split test
python moment_localization/test.py --cfg experiments/tacos/2D-TAN-128x128-K5L8-conv.yaml --verbose --split test
```
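At inference, the top-scoring cells of the N×N score map are mapped back to time spans. Assuming the video is split into N equal-length clips (our sketch; the helper name is hypothetical):

```python
def cell_to_span(i, j, num_clips, video_duration):
    """Map cell (i, j) of the score map to a (start, end) span in seconds,
    assuming the video is divided into num_clips equal-length clips."""
    clip_len = video_duration / num_clips
    return i * clip_len, (j + 1) * clip_len

# e.g. cell (3, 7) on a 16x16 map over a 32s video -> (6.0, 16.0)
print(cell_to_span(3, 7, 16, 32.0))
```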
If any part of our paper or code is helpful to your work, please cite:
```
@InProceedings{2DTAN_2020_AAAI,
  author    = {Zhang, Songyang and Peng, Houwen and Fu, Jianlong and Luo, Jiebo},
  title     = {Learning 2D Temporal Adjacent Networks for Moment Localization with Natural Language},
  booktitle = {AAAI},
  year      = {2020}
}
```