pwc

by zziz

Papers with code. Sorted by stars. Updated weekly.

14.8K Stars 2.4K Forks Last release: Not found 78 Commits 0 Releases

| 2018 | 2017 | 2016 | 2015 | 2014 | 2013 | 2012 | 2011 | 2010 | 2009 | 2008 | Tweet | Suggestions |
|:-------:|:-------:|:-------:|:-------:|:-------:|:-------:|:-------:|:-------:|:-------:|:-------:|:-------:|:-------:|:-------:|

This work is continuously in progress and being updated. We are adding new PWC every day! Tweet me @fvzaur

Use this thread to request that your favorite conference be added to our watchlist and to the PWC list.

#### Weekly update pushed!

## 2018

| Title | Conf | Code | Stars |
|:--------|:--------:|:--------:|:--------:|
| Video-to-Video Synthesis | NIPS | code | 5578 |
| Deep Image Prior | CVPR | code | 3736 |
| StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation | CVPR | code | 3405 |
| Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network | ECCV | code | 2434 |
| Learning to See in the Dark | CVPR | code | 2326 |
| Glow: Generative Flow with Invertible 1x1 Convolutions | NIPS | code | 2088 |
| Squeeze-and-Excitation Networks | CVPR | code | 1477 |
| Efficient Neural Architecture Search via Parameters Sharing | ICML | code | 1382 |
| Multimodal Unsupervised Image-to-image Translation | ECCV | code | 1296 |
| Non-Local Neural Networks | CVPR | code | 992 |
| Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet? | CVPR | code | 924 |
| Single-Shot Refinement Neural Network for Object Detection | CVPR | code | 875 |
| Image Generation From Scene Graphs | CVPR | code | 851 |
| GANimation: Anatomically-aware Facial Animation from a Single Image | ECCV | code | 772 |
| Simple Baselines for Human Pose Estimation and Tracking | ECCV | code | 752 |
| Visualizing the Loss Landscape of Neural Nets | NIPS | code | 724 |
| Detect-and-Track: Efficient Pose Estimation in Videos | CVPR | code | 650 |
| Relation Networks for Object Detection | CVPR | code | 635 |
| Generative Image Inpainting With Contextual Attention | CVPR | code | 609 |
| PointCNN | NIPS | code | 607 |
| Look at Boundary: A Boundary-Aware Face Alignment Algorithm | CVPR | code | 575 |
| Pelee: A Real-Time Object Detection System on Mobile Devices | NIPS | code | 548 |
| Distractor-aware Siamese Networks for Visual Object Tracking | ECCV | code | 545 |
| Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples | ICML | code | 535 |
| Which Training Methods for GANs do actually Converge? | ICML | code | 520 |
| End-to-End Recovery of Human Shape and Pose | CVPR | code | 502 |
| Taskonomy: Disentangling Task Transfer Learning | CVPR | code | 502 |
| Cascaded Pyramid Network for Multi-Person Pose Estimation | CVPR | code | 497 |
| Neural 3D Mesh Renderer | CVPR | code | 489 |
| Zero-Shot Recognition via Semantic Embeddings and Knowledge Graphs | CVPR | code | 489 |
| In-Place Activated BatchNorm for Memory-Optimized Training of DNNs | CVPR | code | 485 |
| The Unreasonable Effectiveness of Deep Features as a Perceptual Metric | CVPR | code | 447 |
| Frustum PointNets for 3D Object Detection From RGB-D Data | CVPR | code | 434 |
| The Lovász-Softmax Loss: A Tractable Surrogate for the Optimization of the Intersection-Over-Union Measure in Neural Networks | CVPR | code | 416 |
| ICNet for Real-Time Semantic Segmentation on High-Resolution Images | ECCV | code | 415 |
| PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume | CVPR | code | 398 |
| Efficient Interactive Annotation of Segmentation Datasets With Polygon-RNN++ | CVPR | code | 397 |
| Gibson Env: Real-World Perception for Embodied Agents | CVPR | code | 385 |
| Acquisition of Localization Confidence for Accurate Object Detection | ECCV | code | 384 |
| Noise2Noise: Learning Image Restoration without Clean Data | ICML | code | 370 |
| GeoNet: Geometric Neural Network for Joint Depth and Surface Normal Estimation | CVPR | code | 359 |
| GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose | CVPR | code | 359 |
| A Style-Aware Content Loss for Real-time HD Style Transfer | ECCV | code | 349 |
| Soccer on Your Tabletop | CVPR | code | 338 |
| Pyramid Stereo Matching Network | CVPR | code | 335 |
| Neural Baby Talk | CVPR | code | 332 |
| License Plate Detection and Recognition in Unconstrained Scenarios | ECCV | code | 326 |
| Supervision-by-Registration: An Unsupervised Approach to Improve the Precision of Facial Landmark Detectors | CVPR | code | 326 |
| Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images | ECCV | code | 323 |
| Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning | CVPR | code | 317 |
| Fast End-to-End Trainable Guided Filter | CVPR | code | 312 |
| Deep Clustering for Unsupervised Learning of Visual Features | ECCV | code | 302 |
| Deep Photo Enhancer: Unpaired Learning for Image Enhancement From Photographs With GANs | CVPR | code | 294 |
| Neural Relational Inference for Interacting Systems | ICML | code | 289 |
| Adversarially Regularized Autoencoders | ICML | code | 282 |
| Learning to Adapt Structured Output Space for Semantic Segmentation | CVPR | code | 280 |
| Convolutional Neural Networks With Alternately Updated Clique | CVPR | code | 272 |
| Learning to Segment Every Thing | CVPR | code | 269 |
| Supervising Unsupervised Learning | NIPS | code | 262 |
| LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation | CVPR | code | 261 |
| Bilinear Attention Networks | NIPS | code | 258 |
| ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation | ECCV | code | 254 |
| An intriguing failing of convolutional neural networks and the CoordConv solution | NIPS | code | 249 |
| End-to-End Learning of Motion Representation for Video Understanding | CVPR | code | 238 |
| Image Super-Resolution Using Very Deep Residual Channel Attention Networks | ECCV | code | 234 |
| Iterative Visual Reasoning Beyond Convolutions | CVPR | code | 228 |
| Semi-Parametric Image Synthesis | CVPR | code | 226 |
| Compressed Video Action Recognition | CVPR | code | 225 |
| Style Aggregated Network for Facial Landmark Detection | CVPR | code | 223 |
| Pose-Robust Face Recognition via Deep Residual Equivariant Mapping | CVPR | code | 220 |
| Multi-Content GAN for Few-Shot Font Style Transfer | CVPR | code | 218 |
| GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Models | ICML | code | 214 |
| Referring Relationships | CVPR | code | 210 |
| MoCoGAN: Decomposing Motion and Content for Video Generation | CVPR | code | 205 |
| Latent Alignment and Variational Attention | NIPS | code | 204 |
| LayoutNet: Reconstructing the 3D Room Layout From a Single RGB Image | CVPR | code | 202 |
| Large-Scale Point Cloud Semantic Segmentation With Superpoint Graphs | CVPR | code | 197 |
| An End-to-End TextSpotter With Explicit Alignment and Attention | CVPR | code | 195 |
| DeblurGAN: Blind Motion Deblurring Using Conditional Adversarial Networks | CVPR | code | 189 |
| SPLATNet: Sparse Lattice Networks for Point Cloud Processing | CVPR | code | 188 |
| Attentive Generative Adversarial Network for Raindrop Removal From a Single Image | CVPR | code | 186 |
| Single View Stereo Matching | CVPR | code | 182 |
| MegaDepth: Learning Single-View Depth Prediction From Internet Photos | CVPR | code | 181 |
| ECO: Efficient Convolutional Network for Online Video Understanding | ECCV | code | 180 |
| Unsupervised Feature Learning via Non-Parametric Instance Discrimination | CVPR | code | 180 |
| ST-GAN: Spatial Transformer Generative Adversarial Networks for Image Compositing | CVPR | code | 179 |
| Video Based Reconstruction of 3D People Models | CVPR | code | 179 |
| Social GAN: Socially Acceptable Trajectories With Generative Adversarial Networks | CVPR | code | 178 |
| Learning Category-Specific Mesh Reconstruction from Image Collections | ECCV | code | 176 |
| Realistic Evaluation of Deep Semi-Supervised Learning Algorithms | NIPS | code | 175 |
| BSN: Boundary Sensitive Network for Temporal Action Proposal Generation | ECCV | code | 175 |
| Group Normalization | ECCV | code | 175 |
| Real-Time Seamless Single Shot 6D Object Pose Prediction | CVPR | code | 174 |
| MVSNet: Depth Inference for Unstructured Multi-view Stereo | ECCV | code | 174 |
| Neural Motifs: Scene Graph Parsing With Global Context | CVPR | code | 171 |
| Learning a Single Convolutional Super-Resolution Network for Multiple Degradations | CVPR | code | 169 |
| Optimizing Video Object Detection via a Scale-Time Lattice | CVPR | code | 168 |
| MultiPoseNet: Fast Multi-Person Pose Estimation using Pose Residual Network | ECCV | code | 167 |
| Unsupervised Cross-Dataset Person Re-Identification by Transfer Learning of Spatial-Temporal Patterns | CVPR | code | 166 |
| Weakly Supervised Instance Segmentation Using Class Peak Response | CVPR | code | 166 |
| PlaneNet: Piece-Wise Planar Reconstruction From a Single RGB Image | CVPR | code | 164 |
| Residual Dense Network for Image Super-Resolution | CVPR | code | 163 |
| Embodied Question Answering | CVPR | code | 162 |
| Evolved Policy Gradients | NIPS | code | 160 |
| Camera Style Adaptation for Person Re-Identification | CVPR | code | 159 |
| Weakly and Semi Supervised Human Body Part Parsing via Pose-Guided Knowledge Transfer | CVPR | code | 159 |
| Scale-Recurrent Network for Deep Image Deblurring | CVPR | code | 159 |
| Unsupervised Learning of Monocular Depth Estimation and Visual Odometry With Deep Feature Reconstruction | CVPR | code | 158 |
| Relational recurrent neural networks | NIPS | code | 157 |
| Densely Connected Pyramid Dehazing Network | CVPR | code | 155 |
| Image Inpainting for Irregular Holes Using Partial Convolutions | ECCV | code | 153 |
| SO-Net: Self-Organizing Network for Point Cloud Analysis | CVPR | code | 152 |
| Pix3D: Dataset and Methods for Single-Image 3D Shape Modeling | CVPR | code | 152 |
| ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices | CVPR | code | 152 |
| DenseASPP for Semantic Segmentation in Street Scenes | CVPR | code | 151 |
| Facelet-Bank for Fast Portrait Manipulation | CVPR | code | 150 |
| Self-Imitation Learning | ICML | code | 145 |
| Graph R-CNN for Scene Graph Generation | ECCV | code | 144 |
| A Closer Look at Spatiotemporal Convolutions for Action Recognition | CVPR | code | 143 |
| Cross-Domain Weakly-Supervised Object Detection Through Progressive Domain Adaptation | CVPR | code | 143 |
| Quantized Densely Connected U-Nets for Efficient Landmark Localization | ECCV | code | 143 |
| Recurrent Squeeze-and-Excitation Context Aggregation Net for Single Image Deraining | ECCV | code | 142 |
| Two-Stream Convolutional Networks for Dynamic Texture Synthesis | CVPR | code | 141 |
| Integral Human Pose Regression | ECCV | code | 141 |
| Adaptive Affinity Fields for Semantic Segmentation | ECCV | code | 141 |
| LSTM Pose Machines | CVPR | code | 141 |
| Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships | CVPR | code | 140 |
| Recovering Realistic Texture in Image Super-Resolution by Deep Spatial Feature Transform | CVPR | code | 139 |
| Image-Image Domain Adaptation With Preserved Self-Similarity and Domain-Dissimilarity for Person Re-Identification | CVPR | code | 137 |
| Learning to Compare: Relation Network for Few-Shot Learning | CVPR | code | 135 |
| CosFace: Large Margin Cosine Loss for Deep Face Recognition | CVPR | code | 135 |
| Deep Depth Completion of a Single RGB-D Image | CVPR | code | 134 |
| Deep Back-Projection Networks for Super-Resolution | CVPR | code | 132 |
| Context Embedding Networks | CVPR | code | 131 |
| Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics | CVPR | code | 131 |
| Perturbative Neural Networks | CVPR | code | 130 |
| Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis | ICML | code | 129 |
| Fast and Accurate Online Video Object Segmentation via Tracking Parts | CVPR | code | 129 |
| Nonlinear 3D Face Morphable Model | CVPR | code | 128 |
| BodyNet: Volumetric Inference of 3D Human Body Shapes | ECCV | code | 126 |
| 3D-CODED: 3D Correspondences by Deep Deformation | ECCV | code | 125 |
| DeepMVS: Learning Multi-View Stereopsis | CVPR | code | 125 |
| Hierarchical Imitation and Reinforcement Learning | ICML | code | 124 |
| Domain Adaptive Faster R-CNN for Object Detection in the Wild | CVPR | code | 123 |
| L4: Practical loss-based stepsize adaptation for deep learning | NIPS | code | 123 |
| A Generative Adversarial Approach for Zero-Shot Learning From Noisy Texts | CVPR | code | 122 |
| Recurrent Relational Networks | NIPS | code | 121 |
| Gated Path Planning Networks | ICML | code | 121 |
| PSANet: Point-wise Spatial Attention Network for Scene Parsing | ECCV | code | 121 |
| Rethinking Feature Distribution for Loss Functions in Image Classification | CVPR | code | 120 |
| Density-Aware Single Image De-Raining Using a Multi-Stream Dense Network | CVPR | code | 118 |
| FOTS: Fast Oriented Text Spotting With a Unified Network | CVPR | code | 118 |
| ELEGANT: Exchanging Latent Encodings with GAN for Transferring Multiple Face Attributes | ECCV | code | 117 |
| PU-Net: Point Cloud Upsampling Network | CVPR | code | 117 |
| PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning | CVPR | code | 117 |
| Long-term Tracking in the Wild: a Benchmark | ECCV | code | 116 |
| Factoring Shape, Pose, and Layout From the 2D Image of a 3D Scene | CVPR | code | 114 |
| Repulsion Loss: Detecting Pedestrians in a Crowd | CVPR | code | 113 |
| Unsupervised Attention-guided Image-to-Image Translation | NIPS | code | 110 |
| Attention-based Deep Multiple Instance Learning | ICML | code | 109 |
| Learning Blind Video Temporal Consistency | ECCV | code | 109 |
| Noisy Natural Gradient as Variational Inference | ICML | code | 108 |
| End-to-End Weakly-Supervised Semantic Alignment | CVPR | code | 106 |
| Decoupled Networks | CVPR | code | 105 |
| LiDAR-Video Driving Dataset: Learning Driving Policies Effectively | CVPR | code | 104 |
| MAttNet: Modular Attention Network for Referring Expression Comprehension | CVPR | code | 104 |
| LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks | ECCV | code | 103 |
| FSRNet: End-to-End Learning Face Super-Resolution With Facial Priors | CVPR | code | 100 |
| Deep Mutual Learning | CVPR | code | 100 |
| Macro-Micro Adversarial Network for Human Parsing | ECCV | code | 98 |
| ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans | CVPR | code | 97 |
| Learning Depth From Monocular Videos Using Direct Methods | CVPR | code | 97 |
| VITON: An Image-Based Virtual Try-On Network | CVPR | code | 95 |
| Cascade R-CNN: Delving Into High Quality Object Detection | CVPR | code | 93 |
| Learning Human-Object Interactions by Graph Parsing Neural Networks | ECCV | code | 93 |
| Future Frame Prediction for Anomaly Detection – A New Baseline | CVPR | code | 92 |
| Multi-view to Novel view: Synthesizing novel views with Self-Learned Confidence | ECCV | code | 92 |
| Tell Me Where to Look: Guided Attention Inference Network | CVPR | code | 91 |
| Neural Kinematic Networks for Unsupervised Motion Retargetting | CVPR | code | 90 |
| Learning SO(3) Equivariant Representations with Spherical CNNs | ECCV | code | 89 |
| One-Shot Unsupervised Cross Domain Translation | NIPS | code | 89 |
| Synthesizing Images of Humans in Unseen Poses | CVPR | code | 88 |
| Depth-aware CNN for RGB-D Segmentation | ECCV | code | 88 |
| Piggyback: Adapting a Single Network to Multiple Tasks by Learning to Mask Weights | ECCV | code | 88 |
| Knowledge Aided Consistency for Weakly Supervised Phrase Grounding | CVPR | code | 87 |
| CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes | CVPR | code | 87 |
| Neural Arithmetic Logic Units | NIPS | code | 87 |
| A PID Controller Approach for Stochastic Optimization of Deep Networks | CVPR | code | 87 |
| VITAL: VIsual Tracking via Adversarial Learning | CVPR | code | 86 |
| Learning Spatial-Temporal Regularized Correlation Filters for Visual Tracking | CVPR | code | 86 |
| Recurrent Pixel Embedding for Instance Grouping | CVPR | code | 85 |
| SGPN: Similarity Group Proposal Network for 3D Point Cloud Instance Segmentation | CVPR | code | 84 |
| Multi-Scale Location-Aware Kernel Representation for Object Detection | CVPR | code | 84 |
| Repeatability Is Not Enough: Learning Affine Regions via Discriminability | ECCV | code | 84 |
| “Zero-Shot” Super-Resolution Using Deep Internal Learning | CVPR | code | 84 |
| DF-Net: Unsupervised Joint Learning of Depth and Flow using Cross-Task Consistency | ECCV | code | 82 |
| Multi-View Consistency as Supervisory Signal for Learning Shape and Pose Prediction | CVPR | code | 80 |
| Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation | ECCV | code | 78 |
| Generalizing A Person Retrieval Model Hetero- and Homogeneously | ECCV | code | 78 |
| Crafting a Toolchain for Image Restoration by Deep Reinforcement Learning | CVPR | code | 77 |
| Pairwise Confusion for Fine-Grained Visual Classification | ECCV | code | 77 |
| Learning to Reweight Examples for Robust Deep Learning | ICML | code | 76 |
| Improving Generalization via Scalable Neighborhood Component Analysis | ECCV | code | 76 |
| SparseMAP: Differentiable Sparse Structured Inference | ICML | code | 75 |
| PDE-Net: Learning PDEs from Data | ICML | code | 75 |
| Pose-Normalized Image Generation for Person Re-identification | ECCV | code | 75 |
| Disentangled Person Image Generation | CVPR | code | 75 |
| Learning to Navigate for Fine-grained Classification | ECCV | code | 74 |
| Superpixel Sampling Networks | ECCV | code | 74 |
| Shift-Net: Image Inpainting via Deep Feature Rearrangement | ECCV | code | 74 |
| 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation | ECCV | code | 74 |
| Ordinal Depth Supervision for 3D Human Pose Estimation | CVPR | code | 74 |
| Path-Level Network Transformation for Efficient Architecture Search | ICML | code | 73 |
| Diverse Image-to-Image Translation via Disentangled Representations | ECCV | code | 72 |
| Visual Feature Attribution Using Wasserstein GANs | CVPR | code | 72 |
| Real-World Anomaly Detection in Surveillance Videos | CVPR | code | 72 |
| Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval | CVPR | code | 72 |
| Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image | ECCV | code | 72 |
| Learning to Find Good Correspondences | CVPR | code | 72 |
| Learning Less Is More - 6D Camera Localization via 3D Surface Regression | CVPR | code | 72 |
| Object Level Visual Reasoning in Videos | ECCV | code | 71 |
| Weakly-Supervised Semantic Segmentation Network With Deep Seeded Region Growing | CVPR | code | 71 |
| Avatar-Net: Multi-Scale Zero-Shot Style Transfer by Feature Decoration | CVPR | code | 71 |
| Fast and Accurate Single Image Super-Resolution via Information Distillation Network | CVPR | code | 71 |
| Regularizing RNNs for Caption Generation by Reconstructing the Past With the Present | CVPR | code | 70 |
| Multi-Shot Pedestrian Re-Identification via Sequential Decision Making | CVPR | code | 70 |
| PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition | CVPR | code | 69 |
| Progressive Neural Architecture Search | ECCV | code | 68 |
| Generative Neural Machine Translation | NIPS | code | 68 |
| Learning Latent Super-Events to Detect Multiple Activities in Videos | CVPR | code | 67 |
| Generate to Adapt: Aligning Domains Using Generative Adversarial Networks | CVPR | code | 67 |
| Adversarial Feature Augmentation for Unsupervised Domain Adaptation | CVPR | code | 67 |
| Learning Attentions: Residual Attentional Siamese Network for High Performance Online Visual Tracking | CVPR | code | 67 |
| Pointwise Convolutional Neural Networks | CVPR | code | 67 |
| Optimizing the Latent Space of Generative Networks | ICML | code | 66 |
| Part-Aligned Bilinear Representations for Person Re-Identification | ECCV | code | 64 |
| Geometry-Aware Learning of Maps for Camera Localization | CVPR | code | 63 |
| Fighting Fake News: Image Splice Detection via Learned Self-Consistency | ECCV | code | 62 |
| Isolating Sources of Disentanglement in Variational Autoencoders | NIPS | code | 62 |
| Neural Program Synthesis from Diverse Demonstration Videos | ICML | code | 62 |
| Learning Rigidity in Dynamic Scenes with a Moving Camera for 3D Motion Field Estimation | ECCV | code | 61 |
| Rotation-Sensitive Regression for Oriented Scene Text Detection | CVPR | code | 61 |
| Human Semantic Parsing for Person Re-Identification | CVPR | code | 61 |
| Unsupervised Discovery of Object Landmarks as Structural Representations | CVPR | code | 61 |
| IQA: Visual Question Answering in Interactive Environments | CVPR | code | 60 |
| Hierarchical Long-term Video Prediction without Supervision | ICML | code | 60 |
| Unsupervised Domain Adaptation for 3D Keypoint Estimation via View Consistency | ECCV | code | 60 |
| Exploit the Unknown Gradually: One-Shot Video-Based Person Re-Identification by Stepwise Learning | CVPR | code | 59 |
| Neural Style Transfer via Meta Networks | CVPR | code | 59 |
| Frame-Recurrent Video Super-Resolution | CVPR | code | 58 |
| PlaneMatch: Patch Coplanarity Prediction for Robust RGB-D Reconstruction | ECCV | code | 57 |
| CBAM: Convolutional Block Attention Module | ECCV | code | 57 |
| Decorrelated Batch Normalization | CVPR | code | 57 |
| Learning Conditioned Graph Structures for Interpretable Visual Question Answering | NIPS | code | 57 |
| Hierarchical Bilinear Pooling for Fine-Grained Visual Recognition | ECCV | code | 57 |
| Leveraging Unlabeled Data for Crowd Counting by Learning to Rank | CVPR | code | 56 |
| Deep Marching Cubes: Learning Explicit Surface Representations | CVPR | code | 56 |
| Learning From Synthetic Data: Addressing Domain Shift for Semantic Segmentation | CVPR | code | 56 |
| LF-Net: Learning Local Features from Images | NIPS | code | 55 |
| Semi-supervised Adversarial Learning to Generate Photorealistic Face Images of New Identities from 3D Morphable Model | ECCV | code | 55 |
| Discriminability Objective for Training Descriptive Captions | CVPR | code | 54 |
| BlockDrop: Dynamic Inference Paths in Residual Networks | CVPR | code | 54 |
| Conditional Probability Models for Deep Image Compression | CVPR | code | 54 |
| Jointly Optimize Data Augmentation and Network Training: Adversarial Data Augmentation in Human Pose Estimation | CVPR | code | 54 |
| Learning towards Minimum Hyperspherical Energy | NIPS | code | 54 |
| DeepVS: A Deep Learning Based Video Saliency Prediction Approach | ECCV | code | 53 |
| Learning Efficient Single-stage Pedestrian Detectors by Asymptotic Localization Fitting | ECCV | code | 52 |
| Learning Pixel-Level Semantic Affinity With Image-Level Supervision for Weakly Supervised Semantic Segmentation | CVPR | code | 52 |
| Wasserstein Introspective Neural Networks | CVPR | code | 51 |
| SketchyGAN: Towards Diverse and Realistic Sketch to Image Synthesis | CVPR | code | 51 |
| Self-produced Guidance for Weakly-supervised Object Localization | ECCV | code | 51 |
| Measuring abstract reasoning in neural networks | ICML | code | 51 |
| A Unified Feature Disentangler for Multi-Domain Image Translation and Manipulation | NIPS | code | 51 |
| RayNet: Learning Volumetric 3D Reconstruction With Ray Potentials | CVPR | code | 51 |
| Coloring with Words: Guiding Image Colorization Through Text-based Palette Generation | ECCV | code | 50 |
| Efficient end-to-end learning for quantizable representations | ICML | code | 50 |
| Visual Question Generation as Dual Task of Visual Question Answering | CVPR | code | 50 |
| Fast and Scalable Bayesian Deep Learning by Weight-Perturbation in Adam | ICML | code | 49 |
| Surface Networks | CVPR | code | 48 |
| Deep k-Means: Re-Training and Parameter Sharing with Harder Cluster Assignments for Compressing Deep Convolutions | ICML | code | 48 |
| Stacked Cross Attention for Image-Text Matching | ECCV | code | 48 |
| Actor and Observer: Joint Modeling of First and Third-Person Videos | CVPR | code | 48 |
| Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation | CVPR | code | 47 |
| Learning-based Video Motion Magnification | ECCV | code | 47 |
| Pose Partition Networks for Multi-Person Pose Estimation | ECCV | code | 47 |
| Neural Autoregressive Flows | ICML | code | 47 |
| Weakly- and Semi-Supervised Panoptic Segmentation | ECCV | code | 46 |
| Video Re-localization | ECCV | code | 46 |
| Real-time 'Actor-Critic' Tracking | ECCV | code | 46 |
| Black-box Adversarial Attacks with Limited Queries and Information | ICML | code | 46 |
| Hyperbolic Entailment Cones for Learning Hierarchical Embeddings | ICML | code | 46 |
| Structured Attention Guided Convolutional Neural Fields for Monocular Depth Estimation | CVPR | code | 46 |
| Differentiable Compositional Kernel Learning for Gaussian Processes | ICML | code | 45 |
| Visualizing and Understanding Atari Agents | ICML | code | 45 |
| Image Manipulation with Perceptual Discriminators | ECCV | code | 45 |
| Learning Intrinsic Image Decomposition From Watching the World | CVPR | code | 45 |
| Overcoming Catastrophic Forgetting with Hard Attention to the Task | ICML | code | 44 |
| Learning Pose Specific Representations by Predicting Different Views | CVPR | code | 44 |
| Zero-Shot Object Detection | ECCV | code | 43 |
| Mean Field Multi-Agent Reinforcement Learning | ICML | code | 43 |
| Partial Adversarial Domain Adaptation | ECCV | code | 43 |
| Mutual Learning to Adapt for Joint Human Parsing and Pose Estimation | ECCV | code | 43 |
| Robust Classification With Convolutional Prototype Learning | CVPR | code | 43 |
| SimplE Embedding for Link Prediction in Knowledge Graphs | NIPS | code | 42 |
| PredRNN++: Towards A Resolution of the Deep-in-Time Dilemma in Spatiotemporal Predictive Learning | ICML | code | 42 |
| Learning to Blend Photos | ECCV | code | 42 |
| Mask-Guided Contrastive Attention Model for Person Re-Identification | CVPR | code | 41 |
| Link Prediction Based on Graph Neural Networks | NIPS | code | 41 |
| Generalisation in humans and deep neural networks | NIPS | code | 41 |
| Towards Binary-Valued Gates for Robust LSTM Training | ICML | code | 41 |
| Multi-scale Residual Network for Image Super-Resolution | ECCV | code | 41 |
| Fully Motion-Aware Network for Video Object Detection | ECCV | code | 41 |
| Interpretable Convolutional Neural Networks | CVPR | code | 40 |
| Generative Adversarial Perturbations | CVPR | code | 40 |
| The Sound of Pixels | ECCV | code | 40 |
| Towards Faster Training of Global Covariance Pooling Networks by Iterative Matrix Square Root Normalization | CVPR | code | 40 |
| Choose Your Neuron: Incorporating Domain Knowledge through Neuron-Importance | ECCV | code | 40 |
| Multi-View Silhouette and Depth Decomposition for High Resolution 3D Object Representation | NIPS | code | 40 |
| Learning Warped Guidance for Blind Face Restoration | ECCV | code | 39 |
| Adversarial Complementary Learning for Weakly Supervised Object Localization | CVPR | code | 39 |
| Learning Semantic Representations for Unsupervised Domain Adaptation | ICML | code | 39 |
| Neural Architecture Search with Bayesian Optimisation and Optimal Transport | NIPS | code | 39 |
| Mutual Information Neural Estimation | ICML | code | 39 |
| NetGAN: Generating Graphs via Random Walks | ICML | code | 39 |
| Learning to Evaluate Image Captioning | CVPR | code | 38 |
| Hyperbolic Neural Networks | NIPS | code | 37 |
| Unsupervised Geometry-Aware Representation for 3D Human Pose Estimation | ECCV | code | 37 |
| Adversarially Learned One-Class Classifier for Novelty Detection | CVPR | code | 37 |
| Disentangling by Factorising | ICML | code | 37 |
| Extracting Automata from Recurrent Neural Networks Using Queries and Counterexamples | ICML | code | 37 |
| Tangent Convolutions for Dense Prediction in 3D | CVPR | code | 37 |
| Few-Shot Image Recognition by Predicting Parameters From Activations | CVPR | code | 37 |
| Real-Time Monocular Depth Estimation Using Synthetic Data With Domain Adaptation via Image Style Transfer | CVPR | code | 37 |
| Generalizing to Unseen Domains via Adversarial Data Augmentation | NIPS | code | 36 |
| SeGAN: Segmenting and Generating the Invisible | CVPR | code | 36 |
| Graphical Generative Adversarial Networks | NIPS | code | 36 |
| PieAPP: Perceptual Image-Error Assessment Through Pairwise Preference | CVPR | code | 36 |
| Gated Fusion Network for Single Image Dehazing | CVPR | code | 35 |
| Neural Code Comprehension: A Learnable Representation of Code Semantics | NIPS | code | 35 |
| Eye In-Painting With Exemplar Generative Adversarial Networks | CVPR | code | 35 |
| Deep One-Class Classification | ICML | code | 34 |
| Deep Regression Tracking with Shrinkage Loss | ECCV | code | 34 |
| Deflecting Adversarial Attacks With Pixel Deflection | CVPR | code | 34 |
| Learning Visual Question Answering by Bootstrapping Hard Attention | ECCV | code | 33 |
| Human-Centric Indoor Scene Synthesis Using Stochastic Grammar | CVPR | code | 33 |
| Improved Fusion of Visual and Language Representations by Dense Symmetric Co-Attention for Visual Question Answering | CVPR | code | 33 |
| CleanNet: Transfer Learning for Scalable Image Classifier Training With Label Noise | CVPR | code | 33 |
| Speaker-Follower Models for Vision-and-Language Navigation | NIPS | code | 33 |
| Improving Shape Deformation in Unsupervised Image-to-Image Translation | ECCV | code | 33 |
| Learning Single-View 3D Reconstruction with Limited Pose Supervision | ECCV | code | 33 |
| 3D Steerable CNNs: Learning Rotationally Equivariant Features in Volumetric Data | NIPS | code | 33 |
| Adversarial Logit Pairing | NIPS | code | 32 |
| Attention in Convolutional LSTM for Gesture Recognition | NIPS | code | 32 |
| Graph-Cut RANSAC | CVPR | code | 32 |
| Neural Guided Constraint Logic Programming for Program Synthesis | NIPS | code | 32 |
| Learning Dynamic Memory Networks for Object Tracking | ECCV | code | 32 |
| GeoDesc: Learning Local Descriptors by Integrating Geometry Constraints | ECCV | code | 32 |
| A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks | NIPS | code | 32 |
| Flow-Grounded Spatial-Temporal Video Prediction from Still Images | ECCV | code | 32 |
| Bidirectional Feature Pyramid Network with Recurrent Attention Residual Modules for Shadow Detection | ECCV | code | 32 |
| On the Robustness of Semantic Segmentation Models to Adversarial Attacks | CVPR | code | 31 |
| Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning | CVPR | code | 31 |
| SketchyScene: Richly-Annotated Scene Sketches | ECCV | code | 31 |
| Deep Randomized Ensembles for Metric Learning | ECCV | code | 30 |
| Deep High Dynamic Range Imaging with Large Foreground Motions | ECCV | code | 30 |
| Revisiting Video Saliency: A Large-Scale Benchmark and a New Model | CVPR | code | 30 |
| Blazingly Fast Video Object Segmentation With Pixel-Wise Metric Learning | CVPR | code | 30 |
| Deep Model-Based 6D Pose Refinement in RGB | ECCV | code | 30 |
| TOM-Net: Learning Transparent Object Matting From a Single Image | CVPR | code | 30 |
| Quaternion Convolutional Neural Networks | ECCV | code | 30 |
| Densely Connected Attention Propagation for Reading Comprehension | NIPS | code | 30 |
| A Trilateral Weighted Sparse Coding Scheme for Real-World Image Denoising | ECCV | code | 30 |
| Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings | ICML | code | 29 |
| Video Rain Streak Removal by Multiscale Convolutional Sparse Coding | CVPR | code | 29 |
| Recurrent Scene Parsing With Perspective Understanding in the Loop | CVPR | code | 29 |
| Single Shot Scene Text Retrieval | ECCV | code | 29 |
| Toward Characteristic-Preserving Image-based Virtual Try-On Network | ECCV | code | 29 |
| Explainable Neural Computation via Stack Neural Module Networks | ECCV | code | 29 |
| Exploring Disentangled Feature Representation Beyond Face Identification | CVPR | code | 29 |
| Controllable Video Generation With Sparse Trajectories | CVPR | code | 28 |
| Layer-structured 3D Scene Inference via View Synthesis | ECCV | code | 28 |
| Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation | ECCV | code | 28 |
| PiCANet: Learning Pixel-Wise Contextual Attention for Saliency Detection | CVPR | code | 28 |
| Learning Rich Features for Image Manipulation Detection | CVPR | code | 27 |
| Fast Video Object Segmentation by Reference-Guided Mask Propagation | CVPR | code | 27 |
| 3DFeat-Net: Weakly Supervised Local 3D Features for Point Cloud Registration | ECCV | code | 27 |
| Who Let the Dogs Out? Modeling Dog Behavior From Visual Data | CVPR | code | 27 |
| EC-Net: an Edge-aware Point set Consolidation Network | ECCV | code | 27 |
| Interpretable Intuitive Physics Model | ECCV | code | 27 |
| Learning a Discriminative Feature Network for Semantic Segmentation | CVPR | code | 26 |
| Partial Transfer Learning With Selective Adversarial Networks | CVPR | code | 26 |
| Cross-Modal Deep Variational Hand Pose Estimation | CVPR | code | 26 |
| Between-Class Learning for Image Classification | CVPR | code | 26 |
| AON: Towards Arbitrarily-Oriented Text Recognition | CVPR | code | 26 |
| Conditional Image-to-Image Translation | CVPR | code | 25 |
| Learning Convolutional Networks for Content-Weighted Image Compression | CVPR | code | 25 |
| Diversity Regularized Spatiotemporal Attention for Video-Based Person Re-Identification | CVPR | code | 25 |
| Dynamic Multimodal Instance Segmentation Guided by Natural Language Queries | ECCV | code | 25 |
| CBMV: A Coalesced Bidirectional Matching Volume for Disparity Estimation | CVPR | code | 25 |
| Deep Texture Manifold for Ground Terrain Recognition | CVPR | code | 25 |
| Audio-Visual Event Localization in Unconstrained Videos | ECCV | code | 25 |
| First Order Generative Adversarial Networks | ICML | code | 25 |
| Visual Coreference Resolution in Visual Dialog using Neural Module Networks | ECCV | code | 25 |
| SYQ: Learning Symmetric Quantization for Efficient Deep Neural Networks | CVPR | code | 24 |
| Deep Reinforcement Learning of Marked Temporal Point Processes | NIPS | code | 24 |
| Explicit Inductive Bias for Transfer Learning with Convolutional Networks | ICML | code | 24 |
| LEGO: Learning Edge With Geometry All at Once by Watching Videos | CVPR | code | 24 |
| Verisimilar Image Synthesis for Accurate Detection and Recognition of Texts in Scenes | ECCV | code | 24 |
| Multi-Agent Diverse Generative Adversarial Networks | CVPR | code | 23 |
| Face Aging With Identity-Preserved Conditional Generative Adversarial Networks | CVPR | code | 23 |
| Learning to Separate Object Sounds by Watching Unlabeled Video | ECCV | code | 23 |
| Exploiting the Potential of Standard Convolutional Autoencoders for Image Restoration by Evolutionary Search | ICML | code | 23 |
| To Trust Or Not To Trust A Classifier | NIPS | code | 23 |
| Im2Flow: Motion Hallucination From Static Images for Action Recognition | CVPR | code | 22 |
| ISTA-Net: Interpretable Optimization-Inspired Deep Network for Image Compressive Sensing | CVPR | code | 22 |
| Hallucinated-IQA: No-Reference Image Quality Assessment via Adversarial Learning | CVPR | code | 22 |
| Anonymous Walk Embeddings | ICML | code | 22 |
| Learning to Multitask | NIPS | code | 22 |
| CondenseNet: An Efficient DenseNet Using Learned Group Convolutions | CVPR | code | 22 |
| HashGAN: Deep Learning to Hash With Pair Conditional Wasserstein GAN | CVPR | code | 22 |
| Hierarchical Relational Networks for Group Activity Recognition and Retrieval | ECCV | code | 22 |
| Collaborative and Adversarial Network for Unsupervised Domain Adaptation | CVPR | code | 22 |
| Geometry-Aware Scene Text Detection With Instance Transformation Network | CVPR | code | 22 |
| Learning to Promote Saliency Detectors | CVPR | code | 21 |
| CSGNet: Neural Shape Parser for Constructive
Solid Geometry | CVPR | code | 21 | | Local Spectral Graph Convolution for Point Set Feature Learning | ECCV | code | 21 | | HiDDeN: Hiding Data with Deep Networks | ECCV | code | 21 | | GraphBit: Bitwise Interaction Mining via Deep Reinforcement Learning | CVPR | code | 20 | | Stacked Conditional Generative Adversarial Networks for Jointly Learning Shadow Detection and Shadow Removal | CVPR | code | 20 | | Fully-Convolutional Point Networks for Large-Scale Point Clouds | ECCV | code | 20 | | Learning Superpixels With Segmentation-Aware Affinity Loss | CVPR | code | 20 | | Zero-Shot Visual Recognition Using Semantics-Preserving Adversarial Embedding Networks | CVPR | code | 20 | | Crowd Counting With Deep Negative Correlation Learning | CVPR | code | 20 | | Dimensionality-Driven Learning with Noisy Labels | ICML | code | 20 | | Objects that Sound | ECCV | code | 20 | | Deep Expander Networks: Efficient Deep Networks from Graph Theory | ECCV | code | 19 | | Low-Shot Learning With Large-Scale Diffusion | CVPR | code | 19 | | Low-Shot Learning With Imprinted Weights | CVPR | code | 19 | | Cross-Domain Self-Supervised Multi-Task Feature Learning Using Synthetic Imagery | CVPR | code | 19 | | Learning Descriptor Networks for 3D Shape Synthesis and Analysis | CVPR | code | 19 | | Disentangling Factors of Variation with Cycle-Consistent Variational Auto-Encoders | ECCV | code | 19 | | CTAP: Complementary Temporal Action Proposal Generation | ECCV | code | 18 | | DVAE#: Discrete Variational Autoencoders with Relaxed Boltzmann Priors | NIPS | code | 18 | | Conditional Image-Text Embedding Networks | ECCV | code | 18 | | EPINET: A Fully-Convolutional Neural Network Using Epipolar Geometry for Depth From Light Field Images | CVPR | code | 18 | | Glimpse Clouds: Human Activity Recognition From Unstructured Feature Points | CVPR | code | 18 | | Bayesian Optimization of Combinatorial Structures | ICML | code | 18 | | FeaStNet: Feature-Steered Graph Convolutions for 3D Shape 
Analysis | CVPR | code | 18 | | Learning Type-Aware Embeddings for Fashion Compatibility | ECCV | code | 17 | | Sliced Wasserstein Distance for Learning Gaussian Mixture Models | CVPR | code | 17 | | Revisiting Deep Intrinsic Image Decompositions | CVPR | code | 17 | | A Spectral Approach to Gradient Estimation for Implicit Distributions | ICML | code | 17 | | Hierarchical Novelty Detection for Visual Object Recognition | CVPR | code | 17 | | Total Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies | CVPR | code | 17 | | Learning Generative ConvNets via Multi-Grid Modeling and Sampling | CVPR | code | 17 | | Learning 3D Shape Completion From Laser Scan Data With Weak Supervision | CVPR | code | 17 | | Triplet Loss in Siamese Network for Object Tracking | ECCV | code | 17 | | Adversarial Attack on Graph Structured Data | ICML | code | 17 | | Arbitrary Style Transfer With Deep Feature Reshuffle | CVPR | code | 17 | | Visual Question Reasoning on General Dependency Tree | CVPR | code | 17 | | Predicting Gaze in Egocentric Video by Learning Task-dependent Attention Transition | ECCV | code | 16 | | Lipschitz-Margin Training: Scalable Certification of Perturbation Invariance for Deep Neural Networks | NIPS | code | 16 | | Coded Sparse Matrix Multiplication | ICML | code | 16 | | Weakly-Supervised Action Segmentation With Iterative Soft Boundary Assignment | CVPR | code | 16 | | Recovering 3D Planes from a Single Image via Convolutional Neural Networks | ECCV | code | 16 | | SegStereo: Exploiting Semantic Information for Disparity Estimation | ECCV | code | 16 | | Functional Gradient Boosting based on Residual Network Perception | ICML | code | 16 | | NAG: Network for Adversary Generation | CVPR | code | 16 | | Generative Probabilistic Novelty Detection with Adversarial Autoencoders | NIPS | code | 16 | | Hashing as Tie-Aware Learning to Rank | CVPR | code | 15 | | Pose Proposal Networks | ECCV | code | 15 | | Convolutional Sequence to Sequence Model 
for Human Dynamics | CVPR | code | 15 | | Joint Pose and Expression Modeling for Facial Expression Recognition | CVPR | code | 15 | | Grounding Referring Expressions in Images by Variational Context | CVPR | code | 15 | | Rethinking the Form of Latent States in Image Captioning | ECCV | code | 15 | | Open Set Domain Adaptation by Backpropagation | ECCV | code | 15 | | Neural Sign Language Translation | CVPR | code | 15 | | SpiderCNN: Deep Learning on Point Sets with Parameterized Convolutional Filters | ECCV | code | 15 | | Efficient Neural Audio Synthesis | ICML | code | 15 | | Deep Learning Under Privileged Information Using Heteroscedastic Dropout | CVPR | code | 14 | | Image Transformer | ICML | code | 14 | | Learning to Understand Image Blur | CVPR | code | 14 | | Learning and Using the Arrow of Time | CVPR | code | 14 | | Action Sets: Weakly Supervised Action Segmentation Without Ordering Constraints | CVPR | code | 14 | | Learning to Forecast and Refine Residual Motion for Image-to-Video Generation | ECCV | code | 14 | | Multi-Scale Weighted Nuclear Norm Image Restoration | CVPR | code | 14 | | Synthesizing Robust Adversarial Examples | ICML | code | 13 | | Fine-Grained Visual Categorization using Meta-Learning Optimization with Sample Selection of Auxiliary Data | ECCV | code | 13 | | Assessing Generative Models via Precision and Recall | NIPS | code | 13 | | Deep Diffeomorphic Transformer Networks | CVPR | code | 13 | | Learning by Asking Questions | CVPR | code | 13 | | Towards Human-Machine Cooperation: Self-Supervised Sample Mining for Object Detection | CVPR | code | 13 | | Variational Autoencoders for Deforming 3D Mesh Models | CVPR | code | 13 | | Min-Entropy Latent Model for Weakly Supervised Object Detection | CVPR | code | 13 | | Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering | CVPR | code | 13 | | Gradient-Based Meta-Learning with Learned Layerwise Metric and Subspace | ICML | code | 13 | | Learning a 
Discriminative Filter Bank Within a CNN for Fine-Grained Recognition | CVPR | code | 13 | | Finding Influential Training Samples for Gradient Boosted Decision Trees | ICML | code | 13 | | Gesture Recognition: Focus on the Hands | CVPR | code | 12 | | Cross-View Image Synthesis Using Conditional GANs | CVPR | code | 12 | | Joint Optimization Framework for Learning With Noisy Labels | CVPR | code | 12 | | Future Person Localization in First-Person Videos | CVPR | code | 12 | | AutoLoc: Weakly-supervised Temporal Action Localization in Untrimmed Videos | ECCV | code | 12 | | Learning Transferable Architectures for Scalable Image Recognition | CVPR | code | 12 | | Clipped Action Policy Gradient | ICML | code | 12 | | Mix and Match Networks: Encoder-Decoder Alignment for Zero-Pair Image Translation | CVPR | code | 12 | | Decouple Learning for Parameterized Image Operators | ECCV | code | 12 | | Generalized Earley Parser: Bridging Symbolic Grammars and Sequence Data for Future Prediction | ICML | code | 12 | | Adaptive Skip Intervals: Temporal Abstraction for Recurrent Dynamical Models | NIPS | code | 12 | | AMNet: Memorability Estimation With Attention | CVPR | code | 12 | | Adversarial Time-to-Event Modeling | ICML | code | 12 | | Reversible Recurrent Neural Networks | NIPS | code | 12 | | Human Pose Estimation With Parsing Induced Learner | CVPR | code | 11 | | ShapeStacks: Learning Vision-Based Physical Intuition for Generalised Object Stacking | ECCV | code | 11 | | A Joint Sequence Fusion Model for Video Question Answering and Retrieval | ECCV | code | 11 | | Learning Face Age Progression: A Pyramid Architecture of GANs | CVPR | code | 11 | | Robust Physical-World Attacks on Deep Learning Visual Classification | CVPR | code | 11 | | High-Quality Prediction Intervals for Deep Learning: A Distribution-Free, Ensembled Approach | ICML | code | 11 | | Meta-Learning by Adjusting Priors Based on Extended PAC-Bayes Theory | ICML | code | 11 | | Multimodal Explanations: 
Justifying Decisions and Pointing to the Evidence | CVPR | code | 11 | | Accelerating Natural Gradient with Higher-Order Invariance | ICML | code | 11 | | Hierarchical Multi-Label Classification Networks | ICML | code | 11 | | Convolutional Image Captioning | CVPR | code | 11 | | Boosting Domain Adaptation by Discovering Latent Domains | CVPR | code | 11 | | Logo Synthesis and Manipulation With Clustered Generative Adversarial Networks | CVPR | code | 10 | | PacGAN: The power of two samples in generative adversarial networks | NIPS | code | 10 | | Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification | CVPR | code | 10 | | End-to-End Incremental Learning | ECCV | code | 10 | | Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation | CVPR | code | 10 | | On GANs and GMMs | NIPS | code | 10 | | Salient Object Detection Driven by Fixation Prediction | CVPR | code | 9 | | Semantic Video Segmentation by Gated Recurrent Flow Propagation | CVPR | code | 9 | | Constraint-Aware Deep Neural Network Compression | ECCV | code | 9 | | Statistically-motivated Second-order Pooling | ECCV | code | 9 | | Excitation Backprop for RNNs | CVPR | code | 9 | | Analyzing Uncertainty in Neural Machine Translation | ICML | code | 9 | | Learning Dynamics of Linear Denoising Autoencoders | ICML | code | 9 | | Saliency Detection in 360° Videos | ECCV | code | 9 | | Density Adaptive Point Set Registration | CVPR | code | 9 | | Decoupled Parallel Backpropagation with Convergence Guarantee | ICML | code | 9 | | Classification from Pairwise Similarity and Unlabeled Data | ICML | code | 9 | | oi-VAE: Output Interpretable VAEs for Nonlinear Group Factor Analysis | ICML | code | 9 | | Modeling Sparse Deviations for Compressed Sensing using Generative Models | ICML | code | 9 | | Pixels, Voxels, and Views: A Study of Shape Representations for Single View 3D Object Shape Prediction | CVPR | code | 9 | | Towards Open-Set Identity 
Preserving Face Synthesis | CVPR | code | 9 | | Five-Point Fundamental Matrix Estimation for Uncalibrated Cameras | CVPR | code | 8 | | BourGAN: Generative Networks with Metric Embeddings | NIPS | code | 8 | | Fast Information-theoretic Bayesian Optimisation | ICML | code | 8 | | Deep Variational Reinforcement Learning for POMDPs | ICML | code | 8 | | Specular-to-Diffuse Translation for Multi-View Reconstruction | ECCV | code | 8 | | Dynamic Conditional Networks for Few-Shot Learning | ECCV | code | 8 | | Learning Facial Action Units From Web Images With Scalable Weakly Supervised Clustering | CVPR | code | 8 | | High-Resolution Image Synthesis and Semantic Manipulation With Conditional GANs | CVPR | code | 8 | | Deep Defense: Training DNNs with Improved Adversarial Robustness | NIPS | code | 8 | | Learning K-way D-dimensional Discrete Codes for Compact Embedding Representations | ICML | code | 8 | | Light Structure from Pin Motion: Simple and Accurate Point Light Calibration for Physics-based Modeling | ECCV | code | 7 | | Non-metric Similarity Graphs for Maximum Inner Product Search | NIPS | code | 7 | | Towards Realistic Predictors | ECCV | code | 7 | | Deep Non-Blind Deconvolution via Generalized Low-Rank Approximation | NIPS | code | 7 | | Don’t Just Assume Look and Answer: Overcoming Priors for Visual Question Answering | CVPR | code | 7 | | Learning Dual Convolutional Neural Networks for Low-Level Vision | CVPR | code | 7 | | The Mirage of Action-Dependent Baselines in Reinforcement Learning | ICML | code | 7 | | DVQA: Understanding Data Visualizations via Question Answering | CVPR | code | 7 | | A Two-Step Disentanglement Method | CVPR | code | 7 | | Detecting and Correcting for Label Shift with Black Box Predictors | ICML | code | 7 | | Conditional Prior Networks for Optical Flow | ECCV | code | 7 | | Generative Adversarial Learning Towards Fast Weakly Supervised Detection | CVPR | code | 7 | | Adversarial Learning with Local Coordinate Coding | ICML | 
code | 7 | | Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks | CVPR | code | 7 | | AttnGAN: Fine-Grained Text to Image Generation With Attentional Generative Adversarial Networks | CVPR | code | 7 | | Learning to Explain: An Information-Theoretic Perspective on Model Interpretation | ICML | code | 7 | | Banach Wasserstein GAN | NIPS | code | 7 | | Gradually Updated Neural Networks for Large-Scale Image Recognition | ICML | code | 7 | | Learning Steady-States of Iterative Algorithms over Graphs | ICML | code | 7 | | Progressive Attention Guided Recurrent Network for Salient Object Detection | CVPR | code | 7 | | Zoom and Learn: Generalizing Deep Stereo Matching to Novel Domains | CVPR | code | 6 | | Unsupervised holistic image generation from key local patches | ECCV | code | 6 | | Inner Space Preserving Generative Pose Machine | ECCV | code | 6 | | Bilevel Programming for Hyperparameter Optimization and Meta-Learning | ICML | code | 6 | | Optical Flow Guided Feature: A Fast and Robust Motion Representation for Video Action Recognition | CVPR | code | 6 | | Breaking the Activation Function Bottleneck through Adaptive Parameterization | NIPS | code | 6 | | Ultra Large-Scale Feature Selection using Count-Sketches | ICML | code | 6 | | Dynamic Scene Deblurring Using Spatially Variant Recurrent Neural Networks | CVPR | code | 6 | | Orthogonally Decoupled Variational Gaussian Processes | NIPS | code | 6 | | Batch Bayesian Optimization via Multi-objective Acquisition Ensemble for Automated Analog Circuit Design | ICML | code | 6 | | A Modulation Module for Multi-task Learning with Applications in Image Retrieval | ECCV | code | 6 | | A Memory Network Approach for Story-Based Temporal Summarization of 360° Videos | CVPR | code | 6 | | Towards Effective Low-Bitwidth Convolutional Neural Networks | CVPR | code | 5 | | Disentangling Factors of Variation by Mixing Them | CVPR | code | 5 | | Weakly-supervised Video 
Summarization using Variational Encoder-Decoder and Web Prior | ECCV | code | 5 | | Learning Longer-term Dependencies in RNNs with Auxiliary Losses | ICML | code | 5 | | Contour Knowledge Transfer for Salient Object Detection | ECCV | code | 5 | | HybridNet: Classification and Reconstruction Cooperation for Semi-Supervised Learning | ECCV | code | 5 | | Sidekick Policy Learning for Active Visual Exploration | ECCV | code | 5 | | Learning to Localize Sound Source in Visual Scenes | CVPR | code | 5 | | Neural Architecture Optimization | NIPS | code | 5 | | COLA: Decentralized Linear Learning | NIPS | code | 5 | | Diverse and Coherent Paragraph Generation from Images | ECCV | code | 5 | | DRACO: Byzantine-resilient Distributed Training via Redundant Gradients | ICML | code | 5 | | Inter and Intra Topic Structure Learning with Word Embeddings | ICML | code | 5 | | Estimating the Success of Unsupervised Image to Image Translation | ECCV | code | 5 | | Dynamic-Structured Semantic Propagation Network | CVPR | code | 5 | | The Description Length of Deep Learning models | NIPS | code | 5 | | Stereo Vision-based Semantic 3D Object and Ego-motion Tracking for Autonomous Driving | ECCV | code | 5 | | Blind Justice: Fairness with Encrypted Sensitive Attributes | ICML | code | 5 | | Transfer Learning via Learning to Transfer | ICML | code | 5 | | Deepcode: Feedback Codes via Deep Learning | NIPS | code | 4 | | Configurable Markov Decision Processes | ICML | code | 4 | | A Framework for Evaluating 6-DOF Object Trackers | ECCV | code | 4 | | Differentially Private Database Release via Kernel Mean Embeddings | ICML | code | 4 | | Recognizing Human Actions as the Evolution of Pose Estimation Maps | CVPR | code | 4 | | Connecting Pixels to Privacy and Utility: Automatic Redaction of Private Information in Images | CVPR | code | 4 | | DeLS-3D: Deep Localization and Segmentation With a 3D Semantic Map | CVPR | code | 4 | | Geolocation Estimation of Photos using a Hierarchical Model and 
Scene Classification | ECCV | code | 4 | | Tracking Emerges by Colorizing Videos | ECCV | code | 4 | | Diverse Conditional Image Generation by Stochastic Regression with Latent Drop-Out Codes | ECCV | code | 4 | | Inference Suboptimality in Variational Autoencoders | ICML | code | 4 | | Black Box FDR | ICML | code | 4 | | Feedback-Prop: Convolutional Neural Network Inference Under Partial Evidence | CVPR | code | 4 | | Quadrature-based features for kernel approximation | NIPS | code | 4 | | Joint Representation and Truncated Inference Learning for Correlation Filter based Tracking | ECCV | code | 4 | | Transferable Adversarial Perturbations | ECCV | code | 4 | | Single Image Water Hazard Detection using FCN with Reflection Attention Units | ECCV | code | 4 | | Multimodal Generative Models for Scalable Weakly-Supervised Learning | NIPS | code | 4 | | Importance Weighted Transfer of Samples in Reinforcement Learning | ICML | code | 3 | | Feature Generating Networks for Zero-Shot Learning | CVPR | code | 3 | | DICOD: Distributed Convolutional Coordinate Descent for Convolutional Sparse Coding | ICML | code | 3 | | CapProNet: Deep Feature Learning via Orthogonal Projections onto Capsule Subspaces | NIPS | code | 3 | | Bidirectional Retrieval Made Simple | CVPR | code | 3 | | Multilingual Anchoring: Interactive Topic Modeling and Alignment Across Languages | NIPS | code | 3 | | A Hybrid l1-l0 Layer Decomposition Model for Tone Mapping | CVPR | code | 3 | | Spatially-Adaptive Filter Units for Deep Neural Networks | CVPR | code | 3 | | Learning to Branch | ICML | code | 3 | | Explanations based on the Missing: Towards Contrastive Explanations with Pertinent Negatives | NIPS | code | 3 | | Lifelong Learning via Progressive Distillation and Retrospection | ECCV | code | 3 | | CLEAR: Cumulative LEARning for One-Shot One-Class Image Recognition | CVPR | code | 3 | | Not to Cry Wolf: Distantly Supervised Multitask Learning in Critical Care | ICML | code | 3 | | Learning Answer 
Embeddings for Visual Question Answering | CVPR | code | 3 | | Information Constraints on Auto-Encoding Variational Bayes | NIPS | code | 3 | | Parallel Bayesian Network Structure Learning | ICML | code | 3 | | Ring Loss: Convex Feature Normalization for Face Recognition | CVPR | code | 3 | | Teaching Categories to Human Learners With Visual Explanations | CVPR | code | 3 | | Stabilizing Gradients for Deep Neural Networks via Efficient SVD Parameterization | ICML | code | 3 | | Deep Burst Denoising | ECCV | code | 3 | | Convergent Tree Backup and Retrace with Function Approximation | ICML | code | 3 | | Gaze Prediction in Dynamic 360° Immersive Videos | CVPR | code | 3 | | Statistical Recurrent Models on Manifold valued Data | NIPS | code | 3 | | End-to-End Flow Correlation Tracking With Spatial-Temporal Attention | CVPR | code | 3 | ↥ back to top

## 2017

| Title | Conf | Code | Stars |
|:--------|:--------:|:--------:|:--------:|
| Bridging the Gap Between Value and Policy Based Reinforcement Learning | NIPS | code | 46593 |
| REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models | NIPS | code | 46593 |
| Focal Loss for Dense Object Detection | ICCV | code | 18356 |
| Mask R-CNN | ICCV | code | 9493 |
| Deep Photo Style Transfer | CVPR | code | 8655 |
| LightGBM: A Highly Efficient Gradient Boosting Decision Tree | NIPS | code | 7536 |
| Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation | NIPS | code | 6449 |
| Attention is All you Need | NIPS | code | 6288 |
| Large Pose 3D Face Reconstruction From a Single Image via Direct Volumetric CNN Regression | ICCV | code | 3354 |
| Densely Connected Convolutional Networks | CVPR | code | 3130 |
| A Unified Approach to Interpreting Model Predictions | NIPS | code | 3122 |
| Deformable Convolutional Networks | ICCV | code | 2165 |
| ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games | NIPS | code | 1823 |
| PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation | CVPR | code | 1523 |
| Improved Training of Wasserstein GANs | NIPS | code | 1405 |
| Fully Convolutional Instance-Aware Semantic Segmentation | CVPR | code | 1395 |
| Aggregated Residual Transformations for Deep Neural Networks | CVPR | code | 1361 |
| Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network | CVPR | code | 1301 |
| Unsupervised Image-to-Image Translation Networks | NIPS | code | 1205 |
| Photographic Image Synthesis With Cascaded Refinement Networks | ICCV | code | 1142 |
| High-Resolution Image Inpainting Using Multi-Scale Neural Patch Synthesis | CVPR | code | 1072 |
| SphereFace: Deep Hypersphere Embedding for Face Recognition | CVPR | code | 1048 |
| Deep Feature Flow for Video Recognition | CVPR | code | 966 |
| Bayesian GAN | NIPS | code | 942 |
| Pyramid Scene Parsing Network | CVPR | code | 934 |
| Efficient Modeling of Latent Information in Supervised Learning using Gaussian Processes | NIPS | code | 906 |
| Finding Tiny Faces | CVPR | code | 856 |
| Toward Multimodal Image-to-Image Translation | NIPS | code | 794 |
| Learning to Discover Cross-Domain Relations with Generative Adversarial Networks | ICML | code | 784 |
| YOLO9000: Better, Faster, Stronger | CVPR | code | 773 |
| PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space | NIPS | code | 772 |
| Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks | ICML | code | 729 |
| FlowNet 2.0: Evolution of Optical Flow Estimation With Deep Networks | CVPR | code | 720 |
| Channel Pruning for Accelerating Very Deep Neural Networks | ICCV | code | 649 |
| Dilated Residual Networks | CVPR | code | 640 |
| Inferring and Executing Programs for Visual Reasoning | ICCV | code | 636 |
| DSOD: Learning Deeply Supervised Object Detectors From Scratch | ICCV | code | 582 |
| Arbitrary Style Transfer in Real-Time With Adaptive Instance Normalization | ICCV | code | 572 |
| Accelerating Eulerian Fluid Simulation With Convolutional Networks | ICML | code | 570 |
| Learning Disentangled Representations with Semi-Supervised Deep Generative Models | NIPS | code | 556 |
| Inductive Representation Learning on Large Graphs | NIPS | code | 552 |
| Regressing Robust and Discriminative 3D Morphable Models With a Very Deep Neural Network | CVPR | code | 537 |
| How Far Are We From Solving the 2D & 3D Face Alignment Problem? (And a Dataset of 230,000 3D Facial Landmarks) | ICCV | code | 526 |
| SSH: Single Stage Headless Face Detector | ICCV | code | 515 |
| Learning From Simulated and Unsupervised Images Through Adversarial Training | CVPR | code | 492 |
| Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space | CVPR | code | 487 |
| Video Frame Interpolation via Adaptive Convolution | CVPR | code | 482 |
| Video Frame Interpolation via Adaptive Separable Convolution | ICCV | code | 482 |
| GMS: Grid-based Motion Statistics for Fast, Ultra-Robust Feature Correspondence | CVPR | code | 460 |
| Joint Detection and Identification Feature Learning for Person Search | CVPR | code | 459 |
| Dual Path Networks | NIPS | code | 451 |
| Flow-Guided Feature Aggregation for Video Object Detection | ICCV | code | 436 |
| Deep Image Matting | CVPR | code | 434 |
| Richer Convolutional Features for Edge Detection | CVPR | code | 399 |
| Annotating Object Instances With a Polygon-RNN | CVPR | code | 397 |
| Recurrent Highway Networks | ICML | code | 397 |
| Detect to Track and Track to Detect | ICCV | code | 387 |
| RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation | CVPR | code | 379 |
| Detecting Oriented Text in Natural Images by Linking Segments | CVPR | code | 364 |
| Deep Lattice Networks and Partial Monotonic Functions | NIPS | code | 349 |
| Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results | NIPS | code | 347 |
| RON: Reverse Connection With Objectness Prior Networks for Object Detection | CVPR | code | 345 |
| Universal Style Transfer via Feature Transforms | NIPS | code | 344 |
| Residual Attention Network for Image Classification | CVPR | code | 329 |
| One-Shot Video Object Segmentation | CVPR | code | 316 |
| Accurate Single Stage Detector Using Recurrent Rolling Convolution | CVPR | code | 314 |
| Feature Pyramid Networks for Object Detection | CVPR | code | 310 |
| Efficient softmax approximation for GPUs | ICML | code | 304 |
| OctNet: Learning Deep 3D Representations at High Resolutions | CVPR | code | 302 |
| Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution | CVPR | code | 301 |
| Pixel Recursive Super Resolution | ICCV | code | 301 |
| Self-Critical Sequence Training for Image Captioning | CVPR | code | 299 |
| Age Progression/Regression by Conditional Adversarial Autoencoder | CVPR | code | 297 |
| Style Transfer from Non-Parallel Text by Cross-Alignment | NIPS | code | 296 |
| Dilated Recurrent Neural Networks | NIPS | code | 285 |
| Lifting From the Deep: Convolutional 3D Pose Estimation From a Single Image | CVPR | code | 280 |
| DeepBach: a Steerable Model for Bach Chorales Generation | ICML | code | 276 |
| The Predictron: End-To-End Learning and Planning | ICML | code | 274 |
| Convolutional Sequence to Sequence Learning | ICML | code | 258 |
| OptNet: Differentiable Optimization as a Layer in Neural Networks | ICML | code | 245 |
| Prototypical Networks for Few-shot Learning | NIPS | code | 244 |
| Deep Voice: Real-time Neural Text-to-Speech | ICML | code | 242 |
| Reinforcement Learning with Deep Energy-Based Policies | ICML | code | 233 |
| Learning Deep CNN Denoiser Prior for Image Restoration | CVPR | code | 231 |
| GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium | NIPS | code | 229 |
| A Point Set Generation Network for 3D Object Reconstruction From a Single Image | CVPR | code | 228 |
| Deeply Supervised Salient Object Detection With Short Connections | CVPR | code | 228 |
| BlitzNet: A Real-Time Deep Network for Scene Understanding | ICCV | code | 227 |
| Language Modeling with Gated Convolutional Networks | ICML | code | 221 |
| Unlabeled Samples Generated by GAN Improve the Person Re-Identification Baseline in Vitro | ICCV | code | 215 |
| Stacked Generative Adversarial Networks | CVPR | code | 215 |
| RMPE: Regional Multi-Person Pose Estimation | ICCV | code | 215 |
| Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning | CVPR | code | 214 |
| Generative Face Completion | CVPR | code | 212 |
| VPGNet: Vanishing Point Guided Network for Lane and Road Marking Detection and Recognition | ICCV | code | 210 |
| The Reversible Residual Network: Backpropagation Without Storing Activations | NIPS | code | 210 |
| Recurrent Scale Approximation for Object Detection in CNN | ICCV | code | 209 |
| Learning From Synthetic Humans | CVPR | code | 207 |
| Spatially Adaptive Computation Time for Residual Networks | CVPR | code | 203 |
| Beyond Face Rotation: Global and Local Perception GAN for Photorealistic and Identity Preserving Frontal View Synthesis | ICCV | code | 202 |
| 3D Bounding Box Estimation Using Deep Learning and Geometry | CVPR | code | 200 |
| Multi-View 3D Object Detection Network for Autonomous Driving | CVPR | code | 199 |
| Visual Dialog | CVPR | code | 199 |
| Interpretable Explanations of Black Boxes by Meaningful Perturbation | ICCV | code | 192 |
| Inverse Compositional Spatial Transformer Networks | CVPR | code | 189 |
| FastMask: Segment Multi-Scale Object Candidates in One Shot | CVPR | code | 189 |
| OnACID: Online Analysis of Calcium Imaging Data in Real Time | NIPS | code | 189 |
| Semantic Scene Completion From a Single Depth Image | CVPR | code | 188 |
| Learning Efficient Convolutional Networks Through Network Slimming | ICCV | code | 186 |
| Learning Feature Pyramids for Human Pose Estimation | ICCV | code | 185 |
| Be Your Own Prada: Fashion Synthesis With Structural Coherence | ICCV | code | 183 |
| Scene Graph Generation by Iterative Message Passing | CVPR | code | 182 |
| Fast Image Processing With Fully-Convolutional Networks | ICCV | code | 180 |
| Learning Multiple Tasks with Multilinear Relationship Networks | NIPS | code | 178 |
| Learning to Reason: End-To-End Module Networks for Visual Question Answering | ICCV | code | 178 |
| Single Shot Text Detector With Regional Attention | ICCV | code | 176 |
| Binarized Convolutional Landmark Localizers for Human Pose Estimation and Face Alignment With Limited Resources | ICCV | code | 175 |
| Deep Feature Interpolation for Image Content Changes | CVPR | code | 170 |
| On Human Motion Prediction Using Recurrent Neural Networks | CVPR | code | 167 |
| Image Super-Resolution via Deep Recursive Residual Network | CVPR | code | 163 |
| Learning Cross-Modal Embeddings for Cooking Recipes and Food Images | CVPR | code | 160 |
| Input Convex Neural Networks | ICML | code | 159 |
| Simple Does It: Weakly Supervised Instance and Semantic Segmentation | CVPR | code | 159 |
| Low-Shot Visual Recognition by Shrinking and Hallucinating Features | ICCV | code | 158 |
| Oriented Response Networks | CVPR | code | 157 |
| Soft Proposal Networks for Weakly Supervised Object Localization | ICCV | code | 154 |
| Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks | ICML | code | 147 |
| Axiomatic Attribution for Deep Networks | ICML | code | 146 |
| Gradient Episodic Memory for Continual Learning | NIPS | code | 146 |
| DSAC - Differentiable RANSAC for Camera Localization | CVPR | code | 144 |
| Attend to You: Personalized Image Captioning With Context Sequence Memory Networks | CVPR | code | 143 |
| Conditional Similarity Networks | CVPR | code | 142 |
| Language Modeling with Recurrent Highway Hypernetworks | NIPS | code | 141 |
| Triple Generative Adversarial Nets | NIPS | code | 138 |
| Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning | NIPS | code | 138 |
| One-Sided Unsupervised Domain Mapping | NIPS | code | 137 |
| Detecting Visual Relationships With Deep Relational Networks | CVPR | code | 137 |
| Attentive Recurrent Comparators | ICML | code | 136 |
| Towards 3D Human Pose Estimation in the Wild: A Weakly-Supervised Approach | ICCV | code | 136 |
| Learning a Multi-View Stereo Machine | NIPS | code | 135 |
| Deep Learning for Precipitation Nowcasting: A Benchmark and A New Model | NIPS | code | 134 |
| Multi-Context Attention for Human Pose Estimation | CVPR | code | 131 |
| Controlling Perceptual Factors in Neural Style Transfer | CVPR | code | 130 |
| Bayesian Compression for Deep Learning | NIPS | code | 130 |
| Adversarial Discriminative Domain Adaptation | CVPR | code | 129 |
| Working hard to know your neighbor's margins: Local descriptor learning loss | NIPS | code | 128 |
| Concrete Dropout | NIPS | code | 127 |
| SegFlow: Joint Learning for Video Object Segmentation and Optical Flow | ICCV | code | 127 |
| Segmentation-Aware Convolutional Networks Using Local Attention Masks | ICCV | code | 126 |
| Detail-Revealing Deep Video Super-Resolution | ICCV | code | 126 |
| CREST: Convolutional Residual Learning for Visual Tracking | ICCV | code | 126 |
| Discriminative Correlation Filter With Channel and Spatial Reliability | CVPR | code | 124 |
| SVDNet for Pedestrian Retrieval | ICCV | code | 121 |
| Semantic Image Synthesis via Adversarial Learning | ICCV | code | 121 |
| Spatiotemporal Multiplier Networks for Video Action Recognition | CVPR | code | 121 |
| PoseTrack: Joint Multi-Person Pose Estimation and Tracking | CVPR | code | 121 |
| Hierarchical Attentive Recurrent Tracking | NIPS | code | 121 |
| Good Semi-supervised Learning That Requires a Bad GAN | NIPS | code | 120 |
| Deep Watershed Transform for Instance Segmentation | CVPR | code | 120 |
| Associative Domain Adaptation | ICCV | code | 119 |
| Learning by Association -- A Versatile Semi-Supervised Training Method for Neural Networks | CVPR | code | 119 |
| Value Prediction Network | NIPS | code | 119 |
| Unrestricted Facial Geometry Reconstruction Using Image-To-Image Translation | ICCV | code | 119 |
| MemNet: A Persistent Memory Network for Image Restoration | ICCV | code | 119 |
| Bayesian Optimization with Gradients |
NIPS | code | 117 | | TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning | NIPS | code | 117 | | Compressed Sensing using Generative Models | ICML | code | 116 | | Switching Convolutional Neural Network for Crowd Counting | CVPR | code | 116 | | WILDCAT: Weakly Supervised Learning of Deep ConvNets for Image Classification, Pointwise Localization and Segmentation | CVPR | code | 116 | | Show, Adapt and Tell: Adversarial Training of Cross-Domain Image Captioner | ICCV | code | 115 | | Video Frame Synthesis Using Deep Voxel Flow | ICCV | code | 114 | | Multiple Instance Detection Network With Online Instance Classifier Refinement | CVPR | code | 113 | | Deep Pyramidal Residual Networks | CVPR | code | 112 | | Train longer, generalize better: closing the generalization gap in large batch training of neural networks | NIPS | code | 112 | | Split-Brain Autoencoders: Unsupervised Learning by Cross-Channel Prediction | CVPR | code | 110 | | Unite the People: Closing the Loop Between 3D and 2D Human Representations | CVPR | code | 110 | | Learning Combinatorial Optimization Algorithms over Graphs | NIPS | code | 109 | | FeUdal Networks for Hierarchical Reinforcement Learning | ICML | code | 107 | | ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression | ICCV | code | 105 | | Learning a Deep Embedding Model for Zero-Shot Learning | CVPR | code | 104 | | ECO: Efficient Convolution Operators for Tracking | CVPR | code | 103 | | SCA-CNN: Spatial and Channel-Wise Attention in Convolutional Networks for Image Captioning | CVPR | code | 102 | | Multi-View Supervision for Single-View Reconstruction via Differentiable Ray Consistency | CVPR | code | 100 | | Task-based End-to-end Model Learning in Stochastic Optimization | NIPS | code | 100 | | Learning to Compose Domain-Specific Transformations for Data Augmentation | NIPS | code | 97 | | Genetic CNN | ICCV | code | 97 | | HashNet: Deep Learning to Hash by Continuation | ICCV | code 
| 97 | | Interleaved Group Convolutions | ICCV | code | 95 | | Deeply-Learned Part-Aligned Representations for Person Re-Identification | ICCV | code | 95 | | Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model | NIPS | code | 94 | | Multi-Scale Continuous CRFs as Sequential Deep Networks for Monocular Depth Estimation | CVPR | code | 93 | | Octree Generating Networks: Efficient Convolutional Architectures for High-Resolution 3D Outputs | ICCV | code | 92 | | Semantic Autoencoder for Zero-Shot Learning | CVPR | code | 92 | | Deep Hyperspherical Learning | NIPS | code | 92 | | Decoupled Neural Interfaces using Synthetic Gradients | ICML | code | 90 | | Geometric Matrix Completion with Recurrent Multi-Graph Neural Networks | NIPS | code | 90 | | Practical Bayesian Optimization for Model Fitting with Bayesian Adaptive Direct Search | NIPS | code | 90 | | Optical Flow Estimation Using a Spatial Pyramid Network | CVPR | code | 90 | | AMC: Attention guided Multi-modal Correlation Learning for Image Search | CVPR | code | 90 | | Deep Video Deblurring for Hand-Held Cameras | CVPR | code | 89 | | Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data | NIPS | code | 88 | | Causal Effect Inference with Deep Latent-Variable Models | NIPS | code | 87 | | GANs for Biological Image Synthesis | ICCV | code | 85 | | MMD GAN: Towards Deeper Understanding of Moment Matching Network | NIPS | code | 84 | | Representation Learning by Learning to Count | ICCV | code | 84 | | Optical Flow in Mostly Rigid Scenes | CVPR | code | 83 | | Fast-Slow Recurrent Neural Networks | NIPS | code | 82 | | Unsupervised Video Summarization With Adversarial LSTM Networks | CVPR | code | 82 | | Constrained Policy Optimization | ICML | code | 81 | | A-NICE-MC: Adversarial Training for MCMC | NIPS | code | 80 | | Coarse-To-Fine Volumetric Prediction for Single-Image 3D Human Pose | CVPR | code | 80 | | End-To-End 
Instance Segmentation With Recurrent Attention | CVPR | code | 78 | | DeLiGAN : Generative Adversarial Networks for Diverse and Limited Data | CVPR | code | 78 | | Learning Shape Abstractions by Assembling Volumetric Primitives | CVPR | code | 77 | | Local Binary Convolutional Neural Networks | CVPR | code | 77 | | Raster-To-Vector: Revisiting Floorplan Transformation | ICCV | code | 76 | | Positive-Unlabeled Learning with Non-Negative Risk Estimator | NIPS | code | 76 | | Hard-Aware Deeply Cascaded Embedding | ICCV | code | 75 | | Deep Image Harmonization | CVPR | code | 73 | | Shape Completion Using 3D-Encoder-Predictor CNNs and Shape Synthesis | CVPR | code | 73 | | Not All Pixels Are Equal: Difficulty-Aware Semantic Segmentation via Deep Layer Cascade | CVPR | code | 73 | | Improved Stereo Matching With Constant Highway Networks and Reflective Confidence Learning | CVPR | code | 72 | | Query-Guided Regression Network With Context Policy for Phrase Grounding | ICCV | code | 72 | | Top-Down Visual Saliency Guided by Captions | CVPR | code | 72 | | Feedback Networks | CVPR | code | 72 | | What Actions Are Needed for Understanding Human Actions in Videos? 
| ICCV | code | 71 | | Xception: Deep Learning With Depthwise Separable Convolutions | CVPR | code | 71 | | Action-Decision Networks for Visual Tracking With Deep Reinforcement Learning | CVPR | code | 71 | | Video Propagation Networks | CVPR | code | 70 | | Image-To-Image Translation With Conditional Adversarial Networks | CVPR | code | 70 | | Quality Aware Network for Set to Set Recognition | CVPR | code | 69 | | Self-Supervised Learning of Visual Features Through Embedding Images Into Text Topic Spaces | CVPR | code | 69 | | Deep Subspace Clustering Networks | NIPS | code | 68 | | Escape From Cells: Deep Kd-Networks for the Recognition of 3D Point Cloud Models | ICCV | code | 68 | | A Distributional Perspective on Reinforcement Learning | ICML | code | 68 | | Physically-Based Rendering for Indoor Scene Understanding Using Convolutional Neural Networks | CVPR | code | 67 | | Deep Transfer Learning with Joint Adaptation Networks | ICML | code | 67 | | Training Deep Networks without Learning Rates Through Coin Betting | NIPS | code | 66 | | Full Resolution Image Compression With Recurrent Neural Networks | CVPR | code | 66 | | SurfaceNet: An End-To-End 3D Neural Network for Multiview Stereopsis | ICCV | code | 66 | | Doubly Stochastic Variational Inference for Deep Gaussian Processes | NIPS | code | 66 | | TURN TAP: Temporal Unit Regression Network for Temporal Action Proposals | ICCV | code | 66 | | Jointly Attentive Spatial-Temporal Pooling Networks for Video-Based Person Re-Identification | ICCV | code | 65 | | Synthesizing 3D Shapes via Modeling Multi-View Depth Maps and Silhouettes With Deep Generative Networks | CVPR | code | 65 | | Dance Dance Convolution | ICML | code | 65 | | Borrowing Treasures From the Wealthy: Deep Transfer Learning Through Selective Joint Fine-Tuning | CVPR | code | 64 | | Curriculum Domain Adaptation for Semantic Segmentation of Urban Scenes | ICCV | code | 64 | | Toward Controlled Generation of Text | ICML | code | 63 | | Person 
Re-Identification in the Wild | CVPR | code | 63 | | ALICE: Towards Understanding Adversarial Learning for Joint Distribution Matching | NIPS | code | 63 | | Differentiable Learning of Logical Rules for Knowledge Base Reasoning | NIPS | code | 62 | | Person Search With Natural Language Description | CVPR | code | 61 | | Multi-Channel Weighted Nuclear Norm Minimization for Real Color Image Denoising | ICCV | code | 61 | | Playing for Benchmarks | ICCV | code | 61 | | Unsupervised Learning by Predicting Noise | ICML | code | 60 | | Localizing Moments in Video With Natural Language | ICCV | code | 60 | | End-To-End 3D Face Reconstruction With Deep Neural Networks | CVPR | code | 60 | | CoupleNet: Coupling Global Structure With Local Parts for Object Detection | ICCV | code | 59 | | AdaGAN: Boosting Generative Models | NIPS | code | 59 | | Convolutional Gaussian Processes | NIPS | code | 57 | | A Deep Regression Architecture With Two-Stage Re-Initialization for High Performance Facial Landmark Detection | CVPR | code | 57 | | Modeling Relationships in Referential Expressions With Compositional Modular Networks | CVPR | code | 57 | | Curiosity-driven Exploration by Self-supervised Prediction | ICML | code | 56 | | Wavelet-SRNet: A Wavelet-Based CNN for Multi-Scale Face Super Resolution | ICCV | code | 56 | | The Neural Hawkes Process: A Neurally Self-Modulating Multivariate Point Process | NIPS | code | 56 | | Online and Linear-Time Attention by Enforcing Monotonic Alignments | ICML | code | 56 | | Neural Expectation Maximization | NIPS | code | 56 | | Dense-Captioning Events in Videos | ICCV | code | 55 | | Factorized Bilinear Models for Image Recognition | ICCV | code | 55 | | Net-Trim: Convex Pruning of Deep Neural Networks with Performance Guarantee | NIPS | code | 54 | | On-the-fly Operation Batching in Dynamic Computation Graphs | NIPS | code | 54 | | Visual Translation Embedding Network for Visual Relation Detection | CVPR | code | 54 | | Learning Blind Motion 
Deblurring | ICCV | code | 54 | | A Disentangled Recognition and Nonlinear Dynamics Model for Unsupervised Learning | NIPS | code | 53 | | Towards Diverse and Natural Image Descriptions via a Conditional GAN | ICCV | code | 53 | | CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos | CVPR | code | 53 | | A Generic Deep Architecture for Single Image Reflection Removal and Image Smoothing | ICCV | code | 52 | | Deep IV: A Flexible Approach for Counterfactual Prediction | ICML | code | 52 | | Triangle Generative Adversarial Networks | NIPS | code | 51 | | EAST: An Efficient and Accurate Scene Text Detector | CVPR | code | 51 | | SST: Single-Stream Temporal Action Proposals | CVPR | code | 51 | | Predicting Deeper Into the Future of Semantic Segmentation | ICCV | code | 51 | | L2-Net: Deep Learning of Discriminative Patch Descriptor in Euclidean Space | CVPR | code | 51 | | TALL: Temporal Activity Localization via Language Query | ICCV | code | 50 | | Hybrid Reward Architecture for Reinforcement Learning | NIPS | code | 50 | | Fast Fourier Color Constancy | CVPR | code | 49 | | Modulating early visual processing by language | NIPS | code | 49 | | Adversarial Examples for Semantic Segmentation and Object Detection | ICCV | code | 49 | | Learning Discrete Representations via Information Maximizing Self-Augmented Training | ICML | code | 49 | | Efficient Diffusion on Region Manifolds: Recovering Small Objects With Compact CNN Representations | CVPR | code | 48 | | Real Time Image Saliency for Black Box Classifiers | NIPS | code | 48 | | FC4: Fully Convolutional Color Constancy With Confidence-Weighted Pooling | CVPR | code | 47 | | Multiple People Tracking by Lifted Multicut and Person Re-Identification | CVPR | code | 47 | | Learned D-AMP: Principled Neural Network based Compressive Image Recovery | NIPS | code | 47 | | GP CaKe: Effective brain connectivity with causal kernels | NIPS | code | 46 | | Predicting Organic 
Reaction Outcomes with Weisfeiler-Lehman Network | NIPS | code | 46 | | Semantic Video CNNs Through Representation Warping | ICCV | code | 46 | | Grammar Variational Autoencoder | ICML | code | 46 | | EnhanceNet: Single Image Super-Resolution Through Automated Texture Synthesis | ICCV | code | 46 | | Safe Model-based Reinforcement Learning with Stability Guarantees | NIPS | code | 45 | | Deep Spectral Clustering Learning | ICML | code | 45 | | Semantic Compositional Networks for Visual Captioning | CVPR | code | 45 | | On-Demand Learning for Deep Image Restoration | ICCV | code | 45 | | Video Pixel Networks | ICML | code | 45 | | Stabilizing Training of Generative Adversarial Networks through Regularization | NIPS | code | 45 | | Structured Bayesian Pruning via Log-Normal Multiplicative Noise | NIPS | code | 44 | | Deriving Neural Architectures from Sequence and Graph Kernels | ICML | code | 44 | | Masked Autoregressive Flow for Density Estimation | NIPS | code | 44 | | Unsupervised Adaptation for Deep Stereo | ICCV | code | 44 | | Learning Residual Images for Face Attribute Manipulation | CVPR | code | 43 | | Learning to Generate Long-term Future via Hierarchical Prediction | ICML | code | 43 | | Accurate Optical Flow via Direct Cost Volume Processing | CVPR | code | 42 | | Generalized Orderless Pooling Performs Implicit Salient Matching | ICCV | code | 42 | | Comparative Evaluation of Hand-Crafted and Learned Local Features | CVPR | code | 42 | | SchNet: A continuous-filter convolutional neural network for modeling quantum interactions | NIPS | code | 41 | | Temporal Generative Adversarial Nets With Singular Value Clipping | ICCV | code | 41 | | Multiplicative Normalizing Flows for Variational Bayesian Neural Networks | ICML | code | 41 | | Neural Scene De-Rendering | CVPR | code | 40 | | Semantic Image Inpainting With Deep Generative Models | CVPR | code | 40 | | A Linear-Time Kernel Goodness-of-Fit Test | NIPS | code | 40 | | Least Squares Generative 
Adversarial Networks | ICCV | code | 39 | | Diversified Texture Synthesis With Feed-Forward Networks | CVPR | code | 39 | | No Fuss Distance Metric Learning Using Proxies | ICCV | code | 38 | | Template Matching With Deformable Diversity Similarity | CVPR | code | 38 | | What's in a Question: Using Visual Questions as a Form of Supervision | CVPR | code | 38 | | Face Normals "In-The-Wild" Using Fully Convolutional Networks | CVPR | code | 38 | | Conditional Image Synthesis with Auxiliary Classifier GANs | ICML | code | 37 | | Neural Episodic Control | ICML | code | 37 | | 3D-PRNN: Generating Shape Primitives With Recurrent Neural Networks | ICCV | code | 37 | | Structured Embedding Models for Grouped Data | NIPS | code | 36 | | Learning Active Learning from Data | NIPS | code | 36 | | Unified Deep Supervised Domain Adaptation and Generalization | ICCV | code | 35 | | Transformation-Grounded Image Generation Network for Novel 3D View Synthesis | CVPR | code | 35 | | Structured Attentions for Visual Question Answering | ICCV | code | 34 | | Geometric Loss Functions for Camera Pose Regression With Deep Learning | CVPR | code | 34 | | VidLoc: A Deep Spatio-Temporal Model for 6-DoF Video-Clip Relocalization | CVPR | code | 34 | | QMDP-Net: Deep Learning for Planning under Partial Observability | NIPS | code | 34 | | Using Ranking-CNN for Age Estimation | CVPR | code | 33 | | Hierarchical Boundary-Aware Neural Encoder for Video Captioning | CVPR | code | 33 | | Unsupervised Learning of Disentangled Representations from Video | NIPS | code | 32 | | Deep Learning on Lie Groups for Skeleton-Based Action Recognition | CVPR | code | 32 | | Deep Variation-Structured Reinforcement Learning for Visual Relationship and Attribute Detection | CVPR | code | 32 | | 3D Point Cloud Registration for Localization Using a Deep Neural Network Auto-Encoder | CVPR | code | 32 | | StyleNet: Generating Attractive Visual Captions With Styles | CVPR | code | 32 | | Dynamic Word Embeddings | ICML 
| code | 32 | | Learning to Prune Deep Neural Networks via Layer-wise Optimal Brain Surgeon | NIPS | code | 31 | | Continual Learning Through Synaptic Intelligence | ICML | code | 31 | | Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes | CVPR | code | 31 | | Learning Detection With Diverse Proposals | CVPR | code | 31 | | LCNN: Lookup-Based Convolutional Neural Network | CVPR | code | 31 | | Towards Accurate Multi-Person Pose Estimation in the Wild | CVPR | code | 30 | | Real-Time Neural Style Transfer for Videos | CVPR | code | 30 | | Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training | ICCV | code | 30 | | Deep Co-Occurrence Feature Learning for Visual Object Recognition | CVPR | code | 29 | | Joint distribution optimal transportation for domain adaptation | NIPS | code | 29 | | Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields | CVPR | code | 29 | | SplitNet: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Parallelization | ICML | code | 29 | | The Statistical Recurrent Unit | ICML | code | 29 | | A Unified Approach of Multi-Scale Deep and Hand-Crafted Features for Defocus Estimation | CVPR | code | 28 | | Learning Spread-Out Local Feature Descriptors | ICCV | code | 28 | | Event-Based Visual Inertial Odometry | CVPR | code | 27 | | DropoutNet: Addressing Cold Start in Recommender Systems | NIPS | code | 27 | | Phrase Localization and Visual Relationship Detection With Comprehensive Image-Language Cues | ICCV | code | 27 | | Harvesting Multiple Views for Marker-Less 3D Human Pose Annotations | CVPR | code | 27 | | Deep 360 Pilot: Learning a Deep Agent for Piloting Through 360deg Sports Videos | CVPR | code | 27 | | Neural Message Passing for Quantum Chemistry | ICML | code | 27 | | State-Frequency Memory Recurrent Neural Networks | ICML | code | 27 | | DeepCD: Learning Deep Complementary Descriptors for Patch Representations | ICCV | code | 26 | | 
| Contrastive Learning for Image Captioning | NIPS | code | 26 |
| Stochastic Optimization with Variance Reduction for Infinite Datasets with Finite Sum Structure | NIPS | code | 26 |
| Learning High Dynamic Range From Outdoor Panoramas | ICCV | code | 26 |
| Speed/Accuracy Trade-Offs for Modern Convolutional Object Detectors | CVPR | code | 26 |
| Learning to Detect Salient Objects With Image-Level Supervision | CVPR | code | 26 |
| Improved Variational Autoencoders for Text Modeling using Dilated Convolutions | ICML | code | 26 |
| Interspecies Knowledge Transfer for Facial Keypoint Detection | CVPR | code | 25 |
| YASS: Yet Another Spike Sorter | NIPS | code | 25 |
| Open Set Domain Adaptation | ICCV | code | 25 |
| Domain-Adaptive Deep Network Compression | ICCV | code | 24 |
| Long Short-Term Memory Kalman Filters: Recurrent Neural Estimators for Pose Regularization | ICCV | code | 24 |
| Temporal Context Network for Activity Localization in Videos | ICCV | code | 24 |
| Incremental Learning of Object Detectors Without Catastrophic Forgetting | ICCV | code | 24 |
| Dense Captioning With Joint Inference and Visual Context | CVPR | code | 24 |
| Universal Adversarial Perturbations | CVPR | code | 24 |
| Asymmetric Tri-training for Unsupervised Domain Adaptation | ICML | code | 24 |
| Reducing Reparameterization Gradient Variance | NIPS | code | 24 |
| Exploiting Saliency for Object Segmentation From Image Level Labels | CVPR | code | 24 |
| A Dirichlet Mixture Model of Hawkes Processes for Event Sequence Clustering | NIPS | code | 24 |
| Shading Annotations in the Wild | CVPR | code | 24 |
| Straight to Shapes: Real-Time Detection of Encoded Shapes | CVPR | code | 23 |
| Dual Discriminator Generative Adversarial Nets | NIPS | code | 23 |
| Zero-Order Reverse Filtering | ICCV | code | 23 |
| Variational Walkback: Learning a Transition Operator as a Stochastic Recurrent Net | NIPS | code | 23 |
| Learning Spherical Convolution for Fast Features from 360° Imagery | NIPS | code | 22 |
| Learning to Detect Sepsis with a Multitask Gaussian Process RNN Classifier | ICML | code | 22 |
| Deep Cross-Modal Hashing | CVPR | code | 22 |
| When Unsupervised Domain Adaptation Meets Tensor Representations | ICCV | code | 22 |
| Image Super-Resolution Using Dense Skip Connections | ICCV | code | 22 |
| Multimodal Transfer: A Hierarchical Deep Convolutional Neural Network for Fast Artistic Style Transfer | CVPR | code | 22 |
| STD2P: RGBD Semantic Segmentation Using Spatio-Temporal Data-Driven Pooling | CVPR | code | 22 |
| Learning Continuous Semantic Representations of Symbolic Expressions | ICML | code | 22 |
| Deep Growing Learning | ICCV | code | 21 |
| Combined Group and Exclusive Sparsity for Deep Neural Networks | ICML | code | 21 |
| Hash Embeddings for Efficient Word Representations | NIPS | code | 21 |
| Accuracy First: Selecting a Differential Privacy Level for Accuracy Constrained ERM | NIPS | code | 21 |
| Disentangled Representation Learning GAN for Pose-Invariant Face Recognition | CVPR | code | 21 |
| Learning to Pivot with Adversarial Networks | NIPS | code | 21 |
| Learning Dynamic Siamese Network for Visual Object Tracking | ICCV | code | 21 |
| POSEidon: Face-From-Depth for Driver Pose Estimation | CVPR | code | 20 |
| Deep Metric Learning via Facility Location | CVPR | code | 20 |
| Automatic Spatially-Aware Fashion Concept Discovery | ICCV | code | 20 |
| The Numerics of GANs | NIPS | code | 20 |
| From Motion Blur to Motion Flow: A Deep Learning Solution for Removing Heterogeneous Motion Blur | CVPR | code | 20 |
| Unpaired Image-To-Image Translation Using Cycle-Consistent Adversarial Networks | ICCV | code | 20 |
| Zero-Inflated Exponential Family Embeddings | ICML | code | 20 |
| InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations | NIPS | code | 20 |
| Weakly-Supervised Learning of Visual Relations | ICCV | code | 20 |
| Multi-Label Image Recognition by Recurrently Discovering Attentional Regions | ICCV | code | 20 |
| Scene Parsing With Global Context Embedding | ICCV | code | 20 |
| Context Selection for Embedding Models | NIPS | code | 20 |
| Deep Mean-Shift Priors for Image Restoration | NIPS | code | 20 |
| Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition | CVPR | code | 20 |
| Fully-Adaptive Feature Sharing in Multi-Task Networks With Applications in Person Attribute Classification | CVPR | code | 19 |
| Learning Compact Geometric Features | ICCV | code | 19 |
| Structured Generative Adversarial Networks | NIPS | code | 19 |
| Joint Gap Detection and Inpainting of Line Drawings | CVPR | code | 19 |
| Chained Multi-Stream Networks Exploiting Pose, Motion, and Appearance for Action Classification and Detection | ICCV | code | 19 |
| Adversarial Feature Matching for Text Generation | ICML | code | 18 |
| BIER - Boosting Independent Embeddings Robustly | ICCV | code | 18 |
| Predictive-Corrective Networks for Action Detection | CVPR | code | 18 |
| Stochastic Generative Hashing | ICML | code | 18 |
| A Bayesian Data Augmentation Approach for Learning Deep Models | NIPS | code | 18 |
| Attentive Semantic Video Generation Using Captions | ICCV | code | 18 |
| MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network | CVPR | code | 18 |
| Deep Unsupervised Similarity Learning Using Partially Ordered Sets | CVPR | code | 17 |
| DualNet: Learn Complementary Features for Image Recognition | ICCV | code | 17 |
| Neural system identification for large populations separating “what” and “where” | NIPS | code | 17 |
| FALKON: An Optimal Large Scale Kernel Method | NIPS | code | 17 |
| Deep Future Gaze: Gaze Anticipation on Egocentric Videos Using Adversarial Networks | CVPR | code | 17 |
| Deep Learning with Topological Signatures | NIPS | code | 17 |
| Streaming Sparse Gaussian Process Approximations | NIPS | code | 17 |
| RPAN: An End-To-End Recurrent Pose-Attention Network for Action Recognition in Videos | ICCV | code | 17 |
| Awesome Typography: Statistics-Based Text Effects Transfer | CVPR | code | 17 |
| RoomNet: End-To-End Room Layout Estimation | ICCV | code | 17 |
| Deep Spatial-Semantic Attention for Fine-Grained Sketch-Based Image Retrieval | ICCV | code | 16 |
| Deep Supervised Discrete Hashing | NIPS | code | 16 |
| Few-Shot Learning Through an Information Retrieval Lens | NIPS | code | 16 |
| Estimating Accuracy from Unlabeled Data: A Probabilistic Logic Approach | NIPS | code | 16 |
| Learning to Push the Limits of Efficient FFT-Based Image Deconvolution | ICCV | code | 16 |
| Federated Multi-Task Learning | NIPS | code | 16 |
| Label Distribution Learning Forests | NIPS | code | 16 |
| Deep Multitask Architecture for Integrated 2D and 3D Human Sensing | CVPR | code | 16 |
| Estimating Mutual Information for Discrete-Continuous Mixtures | NIPS | code | 16 |
| Spatially-Varying Blur Detection Based on Multiscale Fused and Sorted Transform Coefficients of Gradient Magnitudes | CVPR | code | 16 |
| StyleBank: An Explicit Representation for Neural Image Style Transfer | CVPR | code | 16 |
| Surface Normals in the Wild | ICCV | code | 15 |
| Automatic Discovery of the Statistical Types of Variables in a Dataset | ICML | code | 15 |
| Learning Diverse Image Colorization | CVPR | code | 15 |
| Learning Proximal Operators: Using Denoising Networks for Regularizing Inverse Imaging Problems | ICCV | code | 15 |
| Non-Local Deep Features for Salient Object Detection | CVPR | code | 15 |
| Structure-Measure: A New Way to Evaluate Foreground Maps | ICCV | code | 15 |
| Shallow Updates for Deep Reinforcement Learning | NIPS | code | 15 |
| Wasserstein Generative Adversarial Networks | ICML | code | 15 |
| Recurrent 3D Pose Sequence Machines | CVPR | code | 15 |
| Variational Dropout Sparsifies Deep Neural Networks | ICML | code | 15 |
| Captioning Images With Diverse Objects | CVPR | code | 15 |
| Off-policy evaluation for slate recommendation | NIPS | code | 15 |
| Attributes2Classname: A Discriminative Model for Attribute-Based Unsupervised Zero-Shot Learning | ICCV | code | 14 |
| Benchmarking Denoising Algorithms With Real Photographs | CVPR | code | 14 |
| Neural Aggregation Network for Video Face Recognition | CVPR | code | 14 |
| Learned Contextual Feature Reweighting for Image Geo-Localization | CVPR | code | 14 |
| Streaming Weak Submodularity: Interpreting Neural Networks on the Fly | NIPS | code | 14 |
| CVAE-GAN: Fine-Grained Image Generation Through Asymmetric Training | ICCV | code | 14 |
| VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation | ICCV | code | 14 |
| Spherical convolutions and their application in molecular modelling | NIPS | code | 14 |
| Multi-Information Source Optimization | NIPS | code | 14 |
| Convolutional Neural Network Architecture for Geometric Matching | CVPR | code | 14 |
| Neural Face Editing With Intrinsic Image Disentangling | CVPR | code | 14 |
| Realistic Dynamic Facial Textures From a Single Image Using GANs | ICCV | code | 14 |
| Predictive State Recurrent Neural Networks | NIPS | code | 13 |
| Deep TextSpotter: An End-To-End Trainable Scene Text Localization and Recognition Framework | ICCV | code | 13 |
| ExtremeWeather: A large-scale climate dataset for semi-supervised detection, localization, and understanding of extreme weather events | NIPS | code | 13 |
| Hunt For The Unique, Stable, Sparse And Fast Feature Learning On Graphs | NIPS | code | 13 |
| Consensus Convolutional Sparse Coding | ICCV | code | 13 |
| Weakly Supervised Affordance Detection | CVPR | code | 13 |
| Joint Learning of Object and Action Detectors | ICCV | code | 13 |
| Light Field Blind Motion Deblurring | CVPR | code | 13 |
| Asynchronous Stochastic Gradient Descent with Delay Compensation | ICML | code | 13 |
| Unrolled Memory Inner-Products: An Abstract GPU Operator for Efficient Vision-Related Computations | ICCV | code | 12 |
| Maximizing Subset Accuracy with Recurrent Neural Networks in Multi-label Classification | NIPS | code | 12 |
| Self-Organized Text Detection With Minimal Post-Processing via Border Learning | ICCV | code | 12 |
| Coordinated Multi-Agent Imitation Learning | ICML | code | 12 |
| Gradient descent GAN optimization is locally stable | NIPS | code | 12 |
| Removing Rain From Single Images via a Deep Detail Network | CVPR | code | 12 |
| Convexified Convolutional Neural Networks | ICML | code | 12 |
| Multigrid Neural Architectures | CVPR | code | 12 |
| VegFru: A Domain-Specific Dataset for Fine-Grained Visual Categorization | ICCV | code | 12 |
| Attend and Predict: Understanding Gene Regulation by Selective Attention on Chromatin | NIPS | code | 12 |
| Differential Angular Imaging for Material Recognition | CVPR | code | 12 |
| A Multilayer-Based Framework for Online Background Subtraction With Freely Moving Cameras | ICCV | code | 11 |
| Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation | NIPS | code | 11 |
| Max-value Entropy Search for Efficient Bayesian Optimization | ICML | code | 11 |
| Higher-Order Integration of Hierarchical Convolutional Activations for Fine-Grained Visual Categorization | ICCV | code | 11 |
| Generalized Deep Image to Image Regression | CVPR | code | 11 |
| Adversarial Image Perturbation for Privacy Protection -- A Game Theory Perspective | ICCV | code | 11 |
| Predicting Human Activities Using Stochastic Grammar | ICCV | code | 11 |
| DESIRE: Distant Future Prediction in Dynamic Scenes With Interacting Agents | CVPR | code | 11 |
| Fisher GAN | NIPS | code | 11 |
| High-Order Attention Models for Visual Question Answering | NIPS | code | 11 |
| IM2CAD | CVPR | code | 11 |
| On Fairness and Calibration | NIPS | code | 11 |
| DeepPermNet: Visual Permutation Learning | CVPR | code | 10 |
| f-GANs in an Information Geometric Nutshell | NIPS | code | 10 |
| Revisiting IM2GPS in the Deep Learning Era | ICCV | code | 10 |
| Attentional Correlation Filter Network for Adaptive Visual Tracking | CVPR | code | 10 |
| Learning Cross-Modal Deep Representations for Robust Pedestrian Detection | CVPR | code | 10 |
| Confident Multiple Choice Learning | ICML | code | 10 |
| Curriculum Dropout | ICCV | code | 9 |
| Cognitive Mapping and Planning for Visual Navigation | CVPR | code | 9 |
| Optimized Pre-Processing for Discrimination Prevention | NIPS | code | 9 |
| Learning Motion Patterns in Videos | CVPR | code | 9 |
| Scalable Log Determinants for Gaussian Process Kernel Learning | NIPS | code | 9 |
| A Hierarchical Approach for Generating Descriptive Image Paragraphs | CVPR | code | 9 |
| Deep Crisp Boundaries | CVPR | code | 9 |
| Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization | NIPS | code | 9 |
| Practical Data-Dependent Metric Compression with Provable Guarantees | NIPS | code | 9 |
| Do Deep Neural Networks Suffer from Crowding? | NIPS | code | 9 |
| A Non-Convex Variational Approach to Photometric Stereo Under Inaccurate Lighting | CVPR | code | 9 |
| End-To-End Learning of Geometry and Context for Deep Stereo Regression | ICCV | code | 9 |
| From Bayesian Sparsity to Gated Recurrent Nets | NIPS | code | 8 |
| Regret Minimization in MDPs with Options without Prior Knowledge | NIPS | code | 8 |
| Following Gaze in Video | ICCV | code | 8 |
| Model-Powered Conditional Independence Test | NIPS | code | 8 |
| Cost efficient gradient boosting | NIPS | code | 8 |
| Reflectance Adaptive Filtering Improves Intrinsic Image Estimation | CVPR | code | 8 |
| DeepNav: Learning to Navigate Large Cities | CVPR | code | 8 |
| Look, Listen and Learn | ICCV | code | 8 |
| Attention-Aware Face Hallucination via Deep Reinforcement Learning | CVPR | code | 8 |
| Plan, Attend, Generate: Planning for Sequence-to-Sequence Models | NIPS | code | 8 |
| Introspective Neural Networks for Generative Modeling | ICCV | code | 8 |
| Affinity Clustering: Hierarchical Clustering at Scale | NIPS | code | 8 |
| Gaze Embeddings for Zero-Shot Image Classification | CVPR | code | 8 |
| Input Switched Affine Networks: An RNN Architecture Designed for Interpretability | ICML | code | 8 |
| Online multiclass boosting | NIPS | code | 8 |
| Towards a Visual Privacy Advisor: Understanding and Predicting Privacy Risks in Images | ICCV | code | 8 |
| SubUNets: End-To-End Hand Shape and Continuous Sign Language Recognition | ICCV | code | 7 |
| Learning Koopman Invariant Subspaces for Dynamic Mode Decomposition | NIPS | code | 7 |
| Unsupervised Monocular Depth Estimation With Left-Right Consistency | CVPR | code | 7 |
| Personalized Image Aesthetics | ICCV | code | 7 |
| Reasoning About Fine-Grained Attribute Phrases Using Reference Games | ICCV | code | 7 |
| Lost Relatives of the Gumbel Trick | ICML | code | 7 |
| Weakly Supervised Learning of Deep Metrics for Stereo Reconstruction | ICCV | code | 7 |
| Centered Weight Normalization in Accelerating Training of Deep Neural Networks | ICCV | code | 6 |
| Scalable Planning with Tensorflow for Hybrid Nonlinear Domains | NIPS | code | 6 |
| Convex Global 3D Registration With Lagrangian Duality | CVPR | code | 6 |
| Building a Regular Decision Boundary With Deep Networks | CVPR | code | 6 |
| Learning Spatial Regularization With Image-Level Supervisions for Multi-Label Image Classification | CVPR | code | 6 |
| Forecasting Human Dynamics From Static Images | CVPR | code | 6 |
| AOD-Net: All-In-One Dehazing Network | ICCV | code | 6 |
| K-Medoids For K-Means Seeding | NIPS | code | 6 |
| Diverse Image Annotation | CVPR | code | 6 |
| Practical Hash Functions for Similarity Estimation and Dimensionality Reduction | NIPS | code | 6 |
| Deep Adaptive Image Clustering | ICCV | code | 6 |
| Robust Adversarial Reinforcement Learning | ICML | code | 6 |
| Improving Training of Deep Neural Networks via Singular Value Bounding | CVPR | code | 6 |
| Analyzing Hidden Representations in 
End-to-End Automatic Speech Recognition Systems | NIPS | code | 6 | | Tensor Belief Propagation | ICML | code | 6 | | Sparse convolutional coding for neuronal assembly detection | NIPS | code | 6 | | Unsupervised Pixel-Level Domain Adaptation With Generative Adversarial Networks | CVPR | code | 6 | | Bayesian inference on random simple graphs with power law degree distributions | ICML | code | 6 | | Tensor Biclustering | NIPS | code | 6 | | Riemannian approach to batch normalization | NIPS | code | 6 | | Unsupervised Learning of Object Landmarks by Factorized Spatial Embeddings | ICCV | code | 6 | | Rolling-Shutter-Aware Differential SfM and Image Rectification | ICCV | code | 5 | | Active Decision Boundary Annotation With Deep Generative Models | ICCV | code | 5 | | Object Co-Skeletonization With Co-Segmentation | CVPR | code | 5 | | Discover and Learn New Objects From Documentaries | CVPR | code | 5 | | Understanding Black-box Predictions via Influence Functions | ICML | code | 5 | | Making Deep Neural Networks Robust to Label Noise: A Loss Correction Approach | CVPR | code | 5 | | Decoupling "when to update" from "how to update" | NIPS | code | 5 | | MarioQA: Answering Questions by Watching Gameplay Videos | ICCV | code | 5 | | Differentially private Bayesian learning on distributed data | NIPS | code | 5 | | Grad-CAM: Visual Explanations From Deep Networks via Gradient-Based Localization | ICCV | code | 5 | | Question Asking as Program Generation | NIPS | code | 5 | | Conic Scan-and-Cover algorithms for nonparametric topic modeling | NIPS | code | 5 | | Lip Reading Sentences in the Wild | CVPR | code | 5 | | ROAM: A Rich Object Appearance Model With Application to Rotoscoping | CVPR | code | 5 | | NeuralFDR: Learning Discovery Thresholds from Hypothesis Features | NIPS | code | 5 | | Viraliency: Pooling Local Virality | CVPR | code | 5 | | Learning Algorithms for Active Learning | ICML | code | 5 | | Point to Set Similarity Based Deep Feature Learning for 
Person Re-Identification | CVPR | code | 5 | | Click Here: Human-Localized Keypoints as Guidance for Viewpoint Estimation | ICCV | code | 5 | | The World of Fast Moving Objects | CVPR | code | 5 | | Cross-Modality Binary Code Learning via Fusion Similarity Hashing | CVPR | code | 5 | | Testing and Learning on Distributions with Symmetric Noise Invariance | NIPS | code | 5 | | Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference | NIPS | code | 5 | | Diving into the shallows: a computational perspective on large-scale shallow learning | NIPS | code | 5 | | Rotation Equivariant Vector Field Networks | ICCV | code | 5 | | Recursive Sampling for the Nystrom Method | NIPS | code | 5 | | Learning From Video and Text via Large-Scale Discriminative Clustering | ICCV | code | 5 | | Global optimization of Lipschitz functions | ICML | code | 5 | | Device Placement Optimization with Reinforcement Learning | ICML | code | 4 | | Alternating Direction Graph Matching | CVPR | code | 4 | | MEC: Memory-efficient Convolution for Deep Neural Network | ICML | code | 4 | | Expert Gate: Lifelong Learning With a Network of Experts | CVPR | code | 4 | | A Simple yet Effective Baseline for 3D Human Pose Estimation | ICCV | code | 4 | | On Structured Prediction Theory with Calibrated Convex Surrogate Losses | NIPS | code | 4 | | Sub-sampled Cubic Regularization for Non-convex Optimization | ICML | code | 4 | | Generalized Semantic Preserving Hashing for N-Label Cross-Modal Retrieval | CVPR | code | 4 | | Bottleneck Conditional Density Estimation | ICML | code | 4 | | Learning Cooperative Visual Dialog Agents With Deep Reinforcement Learning | ICCV | code | 4 | | Multi-way Interacting Regression via Factorization Machines | NIPS | code | 4 | | Joint Discovery of Object States and Manipulation Actions | ICCV | code | 4 | | Predicting Salient Face in Multiple-Face Videos | CVPR | code | 4 | | From Red Wine to Red Tomato: Composition With Context | CVPR | 
code | 4 | | Encoder Based Lifelong Learning | ICCV | code | 4 | | Deep Recurrent Neural Network-Based Identification of Precursor microRNAs | NIPS | code | 4 | | Guarantees for Greedy Maximization of Non-submodular Functions with Applications | ICML | code | 4 | | Pose-Aware Person Recognition | CVPR | code | 4 | | Zero-Shot Recognition Using Dual Visual-Semantic Mapping Paths | CVPR | code | 4 | | Asynchronous Distributed Variational Gaussian Processes for Regression | ICML | code | 3 | | Saliency Pattern Detection by Ranking Structured Trees | ICCV | code | 3 | | Toward Goal-Driven Neural Network Models for the Rodent Whisker-Trigeminal System | NIPS | code | 3 | | Learning Non-Maximum Suppression | CVPR | code | 3 | | Deep Latent Dirichlet Allocation with Topic-Layer-Adaptive Stochastic Gradient Riemannian MCMC | ICML | code | 3 | | Discriminative Bimodal Networks for Visual Localization and Detection With Natural Language Queries | CVPR | code | 3 | | AdaNet: Adaptive Structural Learning of Artificial Neural Networks | ICML | code | 3 | | Large Margin Object Tracking With Circulant Feature Maps | CVPR | code | 3 | | Compatible Reward Inverse Reinforcement Learning | NIPS | code | 3 | | Adversarial Surrogate Losses for Ordinal Regression | NIPS | code | 3 | | Non-monotone Continuous DR-submodular Maximization: Structure and Algorithms | NIPS | code | 3 | | Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning | NIPS | code | 3 | | A framework for Multi-A(rmed)/B(andit) Testing with Online FDR Control | NIPS | code | 3 | | Counting Everyday Objects in Everyday Scenes | CVPR | code | 3 | | Loss Max-Pooling for Semantic Image Segmentation | CVPR | code | 3 | | Aesthetic Critiques Generation for Photos | ICCV | code | 3 | | Expectation Propagation with Stochastic Kinetic Model in Complex Interaction Systems | NIPS | code | 3 | | Near-Optimal Edge Evaluation in Explicit Generalized Binomial Graphs | NIPS | code | 3 |

## 2016

| Title | Conf | Code | Stars |
|:--------|:--------:|:--------:|:--------:|
| R-FCN: Object Detection via Region-based Fully Convolutional Networks | NIPS | code | 18356 |
| Image Style Transfer Using Convolutional Neural Networks | CVPR | code | 16435 |
| Deep Residual Learning for Image Recognition | CVPR | code | 4468 |
| Convolutional Pose Machines | CVPR | code | 3260 |
| Synthetic Data for Text Localisation in Natural Images | CVPR | code | 787 |
| Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis | CVPR | code | 731 |
| Instance-Aware Semantic Segmentation via Multi-Task Network Cascades | CVPR | code | 433 |
| Learning Multi-Domain Convolutional Neural Networks for Visual Tracking | CVPR | code | 350 |
| Convolutional Two-Stream Network Fusion for Video Action Recognition | CVPR | code | 342 |
| Learning Deep Features for Discriminative Localization | CVPR | code | 323 |
| Deep Metric Learning via Lifted Structured Feature Embedding | CVPR | code | 251 |
| Learning Deep Representations of Fine-Grained Visual Descriptions | CVPR | code | 229 |
| Eye Tracking for Everyone | CVPR | code | 223 |
| NetVLAD: CNN Architecture for Weakly Supervised Place Recognition | CVPR | code | 204 |
| Staple: Complementary Learners for Real-Time Tracking | CVPR | code | 183 |
| Joint Unsupervised Learning of Deep Representations and Image Clusters | CVPR | code | 182 |
| Accurate Image Super-Resolution Using Very Deep Convolutional Networks | CVPR | code | 182 |
| Temporal Action Localization in Untrimmed Videos via Multi-Stage CNNs | CVPR | code | 167 |
| LocNet: Improving Localization Accuracy for Object Detection | CVPR | code | 155 |
| Shallow and Deep Convolutional Networks for Saliency Prediction | CVPR | code | 153 |
| Compact Bilinear Pooling | CVPR | code | 148 |
| Learning Compact Binary Descriptors With Unsupervised Deep Neural Networks | CVPR | code | 144 |
| Dynamic Image Networks for Action Recognition | CVPR | code | 133 |
| Rethinking the Inception Architecture for Computer Vision | CVPR | code | 130 |
| Deep Sliding Shapes for Amodal 3D Object Detection in RGB-D Images | CVPR | code | 126 |
| Context Encoders: Feature Learning by Inpainting | CVPR | code | 124 |
| TI-Pooling: Transformation-Invariant Pooling for Feature Learning in Convolutional Neural Networks | CVPR | code | 109 |
| Weakly Supervised Deep Detection Networks | CVPR | code | 103 |
| Natural Language Object Retrieval | CVPR | code | 100 |
| Deeply-Recursive Convolutional Network for Image Super-Resolution | CVPR | code | 96 |
| Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network | CVPR | code | 92 |
| Image Question Answering Using Convolutional Neural Network With Dynamic Parameter Prediction | CVPR | code | 88 |
| Recurrent Convolutional Network for Video-Based Person Re-Identification | CVPR | code | 82 |
| A Comparative Study for Single Image Blind Deblurring | CVPR | code | 82 |
| Neural Module Networks | CVPR | code | 81 |
| Stacked Attention Networks for Image Question Answering | CVPR | code | 78 |
| Progressive Prioritized Multi-View Stereo | CVPR | code | 73 |
| Marr Revisited: 2D-3D Alignment via Surface Normal Prediction | CVPR | code | 72 |
| A Hierarchical Deep Temporal Model for Group Activity Recognition | CVPR | code | 71 |
| Towards Open Set Deep Networks | CVPR | code | 71 |
| Robust 3D Hand Pose Estimation in Single Depth Images: From Single-View CNN to Multi-View CNNs | CVPR | code | 70 |
| Bilateral Space Video Segmentation | CVPR | code | 63 |
| Deep Compositional Captioning: Describing Novel Object Categories Without Paired Training Data | CVPR | code | 57 |
| Efficient 3D Room Shape Recovery From a Single Panorama | CVPR | code | 55 |
| Non-Local Image Dehazing | CVPR | code | 50 |
| Video Segmentation via Object Flow | CVPR | code | 50 |
| Deep Supervised Hashing for Fast Image Retrieval | CVPR | code | 50 |
| Deep Region and Multi-Label Learning for Facial Action Unit Detection | CVPR | code | 43 |
| CRAFT Objects From Images | CVPR | code | 41 |
| Slicing Convolutional Neural Network for Crowd Video Understanding | CVPR | code | 40 |
| Sketch Me That Shoe | CVPR | code | 39 |
| Image Captioning With Semantic Attention | CVPR | code | 35 |
| Deep Saliency With Encoded Low Level Distance Map and High Level Features | CVPR | code | 34 |
| A Benchmark Dataset and Evaluation Methodology for Video Object Segmentation | CVPR | code | 33 |
| A Dual-Source Approach for 3D Pose Estimation From a Single Image | CVPR | code | 32 |
| Learning Local Image Descriptors With Deep Siamese and Triplet Convolutional Networks by Minimising Global Loss Functions | CVPR | code | 30 |
| Ordinal Regression With Multiple Output CNN for Age Estimation | CVPR | code | 30 |
| Structured Feature Learning for Pose Estimation | CVPR | code | 29 |
| Unsupervised Learning of Edges | CVPR | code | 29 |
| PatchBatch: A Batch Augmented Loss for Optical Flow | CVPR | code | 27 |
| Dense Human Body Correspondences Using Convolutional Networks | CVPR | code | 27 |
| Actionness Estimation Using Hybrid Fully Convolutional Networks | CVPR | code | 26 |
| You Only Look Once: Unified, Real-Time Object Detection | CVPR | code | 26 |
| Fast Training of Triplet-Based Deep Binary Embedding Networks | CVPR | code | 25 |
| Recurrent Attention Models for Depth-Based Person Identification | CVPR | code | 24 |
| Detecting Vanishing Points Using Global Image Context in a Non-Manhattan World | CVPR | code | 22 |
| First Person Action Recognition Using Deep Learned Descriptors | CVPR | code | 21 |
| Proposal Flow | CVPR | code | 20 |
| Scale-Aware Alignment of Hierarchical Image Segmentation | CVPR | code | 20 |
| Quantized Convolutional Neural Networks for Mobile Devices | CVPR | code | 20 |
| Semantic Segmentation With Boundary Neural Fields | CVPR | code | 19 |
| Single-Image Crowd Counting via Multi-Column Convolutional Neural Network | CVPR | code | 19 |
| Accumulated Stability Voting: A Robust Descriptor From Descriptors of Multiple Scales | CVPR | code | 19 |
| Structure From Motion With Objects | CVPR | code | 17 |
| Bottom-Up and Top-Down Reasoning With Hierarchical Rectified Gaussians | CVPR | code | 16 |
| Semantic Filtering | CVPR | code | 16 |
| Online Detection and Classification of Dynamic Hand Gestures With Recurrent 3D Convolutional Neural Network | CVPR | code | 16 |
| ReconNet: Non-Iterative Reconstruction of Images From Compressively Sensed Measurements | CVPR | code | 15 |
| Interactive Segmentation on RGBD Images via Cue Selection | CVPR | code | 14 |
| Object Contour Detection With a Fully Convolutional Encoder-Decoder Network | CVPR | code | 14 |
| Automatic Content-Aware Color and Tone Stylization | CVPR | code | 12 |
| Similarity Learning With Spatial Constraints for Person Re-Identification | CVPR | code | 11 |
| Personalizing Human Video Pose Estimation | CVPR | code | 10 |
| Visually Indicated Sounds | CVPR | code | 9 |
| Patch-Based Convolutional Neural Network for Whole Slide Tissue Image Classification | CVPR | code | 9 |
| Region Ranking SVM for Image Classification | CVPR | code | 8 |
| Pairwise Matching Through Max-Weight Bipartite Belief Propagation | CVPR | code | 8 |
| Deep Hand: How to Train a CNN on 1 Million Hand Images When Your Data Is Continuous and Weakly Labelled | CVPR | code | 8 |
| Cross-Stitch Networks for Multi-Task Learning | CVPR | code | 8 |
| Learning a Discriminative Null Space for Person Re-Identification | CVPR | code | 8 |
| Efficient Deep Learning for Stereo Matching | CVPR | code | 7 |
| Globally Optimal Manhattan Frame Estimation in Real-Time | CVPR | code | 7 |
| Where to Look: Focus Regions for Visual Question Answering | CVPR | code | 7 |
| Detecting Migrating Birds at Night | CVPR | code | 7 |
| Unsupervised Learning From Narrated Instruction Videos | CVPR | code | 7 |
| Efficient and Robust Color Consistency for Community Photo Collections | CVPR | code | 7 |
| Recurrent Attentional Networks for Saliency Detection | CVPR | code | 7 |
| 3D Shape Attributes | CVPR | code | 6 |
| Beyond Local Search: Tracking Objects Everywhere With Instance-Specific Proposals | CVPR | code | 5 |
| Functional Faces: Groupwise Dense Correspondence Using Functional Maps | CVPR | code | 5 |
| Visual Tracking Using Attention-Modulated Disintegration and Integration | CVPR | code | 5 |
| Improving Human Action Recognition by Non-Action Classification | CVPR | code | 4 |
| Prior-Less Compressible Structure From Motion | CVPR | code | 4 |
| DenseCap: Fully Convolutional Localization Networks for Dense Captioning | CVPR | code | 4 |
| Tensor Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Tensors via Convex Optimization | CVPR | code | 4 |
| Force From Motion: Decoding Physical Sensation in a First Person Video | CVPR | code | 3 |
| Context-Aware Gaussian Fields for Non-Rigid Point Set Registration | CVPR | code | 3 |
| Using Spatial Order to Boost the Elimination of Incorrect Feature Matches | CVPR | code | 3 |
| Fast Algorithms for Convolutional Neural Networks | CVPR | code | 3 |

## 2015

| Title | Conf | Code | Stars |
|:--------|:--------:|:--------:|:--------:|
| Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks | NIPS | code | 18356 |
| Fast R-CNN | ICCV | code | 18356 |
| Conditional Random Fields as Recurrent Neural Networks | ICCV | code | 1189 |
| Fully Convolutional Networks for Semantic Segmentation | CVPR | code | 911 |
| Learning to Track: Online Multi-Object Tracking by Decision Making | ICCV | code | 308 |
| Learning to Compare Image Patches via Convolutional Neural Networks | CVPR | code | 300 |
| Learning Deconvolution Network for Semantic Segmentation | ICCV | code | 296 |
| Single Image Super-Resolution From Transformed Self-Exemplars | CVPR | code | 289 |
| Sequence to Sequence - Video to Text | ICCV | code | 239 |
| Deep Colorization | ICCV | code | 198 |
| Deep Neural Decision Forests | ICCV | code | 192 |
| Hierarchical Convolutional Features for Visual Tracking | ICCV | code | 179 |
| Render for CNN: Viewpoint Estimation in Images Using CNNs Trained With Rendered 3D Model Views | ICCV | code | 176 |
| Realtime Edge-Based Visual Odometry for a Monocular Camera | ICCV | code | 175 |
| Understanding Deep Image Representations by Inverting Them | CVPR | code | 154 |
| Context-Aware CNNs for Person Head Detection | ICCV | code | 153 |
| Show and Tell: A Neural Image Caption Generator | CVPR | code | 141 |
| Face Alignment by Coarse-to-Fine Shape Searching | CVPR | code | 140 |
| An Improved Deep Learning Architecture for Person Re-Identification | CVPR | code | 127 |
| FaceNet: A Unified Embedding for Face Recognition and Clustering | CVPR | code | 124 |
| Depth-Based Hand Pose Estimation: Data, Methods, and Challenges | ICCV | code | 121 |
| DynamicFusion: Reconstruction and Tracking of Non-Rigid Scenes in Real-Time | CVPR | code | 118 |
| Massively Parallel Multiview Stereopsis by Surface Normal Diffusion | ICCV | code | 105 |
| Learning to Propose Objects | CVPR | code | 91 |
| Learning Spatially Regularized Correlation Filters for Visual Tracking | ICCV | code | 86 |
| A Convolutional Neural Network Cascade for Face Detection | CVPR | code | 85 |
| Discriminative Learning of Deep Convolutional Feature Point Descriptors | ICCV | code | 77 |
| Unsupervised Visual Representation Learning by Context Prediction | ICCV | code | 73 |
| Deep Neural Networks Are Easily Fooled: High Confidence Predictions for Unrecognizable Images | CVPR | code | 71 |
| Deep Filter Banks for Texture Recognition and Segmentation | CVPR | code | 68 |
| Saliency Detection by Multi-Context Deep Learning | CVPR | code | 66 |
| Multi-Objective Convolutional Learning for Face Labeling | CVPR | code | 55 |
| Finding Action Tubes | CVPR | code | 51 |
| Category-Specific Object Reconstruction From a Single Image | CVPR | code | 48 |
| Convolutional Color Constancy | ICCV | code | 47 |
| Face Flow | ICCV | code | 45 |
| P-CNN: Pose-Based CNN Features for Action Recognition | ICCV | code | 45 |
| Learning From Massive Noisy Labeled Data for Image Classification | CVPR | code | 45 |
| Image Specificity | CVPR | code | 40 |
| Predicting Depth, Surface Normals and Semantic Labels With a Common Multi-Scale Convolutional Architecture | ICCV | code | 35 |
| Neural Activation Constellations: Unsupervised Part Model Discovery With Convolutional Networks | ICCV | code | 35 |
| VQA: Visual Question Answering | ICCV | code | 35 |
| Mid-Level Deep Pattern Mining | CVPR | code | 34 |
| PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization | ICCV | code | 34 |
| Parsimonious Labeling | ICCV | code | 33 |
| Car That Knows Before You Do: Anticipating Maneuvers via Learning Temporal Driving Models | ICCV | code | 33 |
| Recurrent Convolutional Neural Network for Object Recognition | CVPR | code | 32 |
| TILDE: A Temporally Invariant Learned DEtector | CVPR | code | 30 |
| In Defense of Color-Based Model-Free Tracking | CVPR | code | 30 |
| Fast Bilateral-Space Stereo for Synthetic Defocus | CVPR | code | 29 |
| Phase-Based Frame Interpolation for Video | CVPR | code | 28 |
| Understanding Tools: Task-Oriented Object Modeling, Learning and Recognition | CVPR | code | 27 |
| Deeply Learned Attributes for Crowded Scene Understanding | CVPR | code | 27 |
| Unconstrained 3D Face Reconstruction | CVPR | code | 26 |
| Viewpoints and Keypoints | CVPR | code | 25 |
| Holistically-Nested Edge Detection | ICCV | code | 25 |
| Going Deeper With Convolutions | CVPR | code | 25 |
| Reconstructing the World* in Six Days *(As Captured by the Yahoo 100 Million Image Dataset) | CVPR | code | 25 |
| Data-Driven 3D Voxel Patterns for Object Category Recognition | CVPR | code | 24 |
| L0TV: A New Method for Image Restoration in the Presence of Impulse Noise | CVPR | code | 22 |
| Beyond Frontal Faces: Improving Person Recognition Using Multiple Cues | CVPR | code | 21 |
| Understanding Deep Features With Computer-Generated Imagery | ICCV | code | 19 |
| HICO: A Benchmark for Recognizing Human-Object Interactions in Images | ICCV | code | 18 |
| Structured Feature Selection | ICCV | code | 17 |
| Learning Large-Scale Automatic Image Colorization | ICCV | code | 17 |
| Semantic Component Analysis | ICCV | code | 17 |
| Simultaneous Feature Learning and Hash Coding With Deep Neural Networks | CVPR | code | 16 |
| 3D Object Reconstruction From Hand-Object Interactions | ICCV | code | 15 |
| Learning Temporal Embeddings for Complex Video Analysis | ICCV | code | 14 |
| Learning to See by Moving | ICCV | code | 14 |
| Reflection Removal Using Ghosting Cues | CVPR | code | 14 |
| Where to Buy It: Matching Street Clothing Photos in Online Shops | ICCV | code | 14 |
| Oriented Edge Forests for Boundary Detection | CVPR | code | 13 |
| A Large-Scale Car Dataset for Fine-Grained Categorization and Verification | CVPR | code | 11 |
| Appearance-Based Gaze Estimation in the Wild | CVPR | code | 10 |
| Learning a Descriptor-Specific 3D Keypoint Detector | ICCV | code | 10 |
| Robust Image Filtering Using Joint Static and Dynamic Guidance | CVPR | code | 10 |
| Partial Person Re-Identification | ICCV | code | 9 |
| High Quality Structure From Small Motion for Rolling Shutter Cameras | ICCV | code | 9 |
| Boosting Object Proposals: From Pascal to COCO | ICCV | code | 8 |
| Convolutional Channel Features | ICCV | code | 8 |
| Live Repetition Counting | ICCV | code | 8 |
| Unsupervised Learning of Visual Representations Using Videos | ICCV | code | 8 |
| Supervised Discrete Hashing | CVPR | code | 7 |
| Multi-View Convolutional Neural Networks for 3D Shape Recognition | ICCV | code | 7 |
| Simpler Non-Parametric Methods Provide as Good or Better Results to Multiple-Instance Learning | ICCV | code | 7 |
| Finding Distractors In Images | CVPR | code | 7 |
| Piecewise Flat Embedding for Image Segmentation | ICCV | code | 7 |
| Long-Term Correlation Tracking | CVPR | code | 6 |
| Towards Open World Recognition | CVPR | code | 6 |
| Pooled Motion Features for First-Person Videos | CVPR | code | 6 |
| Simultaneous Deep Transfer Across Domains and Tasks | ICCV | code | 6 |
| What Makes an Object Memorable? | ICCV | code | 5 |
| Mining Semantic Affordances of Visual Object Categories | CVPR | code | 5 |
| Dense Semantic Correspondence Where Every Pixel is a Classifier | ICCV | code | 5 |
| Segment Graph Based Image Filtering: Fast Structure-Preserving Smoothing | ICCV | code | 5 |
| Fast Randomized Singular Value Thresholding for Nuclear Norm Minimization | CVPR | code | 5 |
| Unsupervised Generation of a Viewpoint Annotated Car Dataset From Videos | ICCV | code | 5 |
| Multi-Label Cross-Modal Retrieval | ICCV | code | 4 |
| Superdifferential Cuts for Binary Energies | CVPR | code | 4 |
| Pose Induction for Novel Object Categories | ICCV | code | 4 |
| Efficient Minimal-Surface Regularization of Perspective Depth Maps in Variational Stereo | CVPR | code | 4 |
| Low-Rank Matrix Factorization Under General Mixture Noise Distributions | ICCV | code | 4 |
| Robust Saliency Detection via Regularized Random Walks Ranking | CVPR | code | 3 |
| Simultaneous Video Defogging and Stereo Reconstruction | CVPR | code | 3 |
| Hyperspectral Super-Resolution by Coupled Spectral Unmixing | ICCV | code | 3 |
| Oriented Object Proposals | ICCV | code | 3 |
| kNN Hashing With Factorized Neighborhood Representation | ICCV | code | 3 |
| Minimum Barrier Salient Object Detection at 80 FPS | ICCV | code | 3 |

## 2014

| Title | Conf | Code | Stars |
|:--------|:--------:|:--------:|:--------:|
| Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation | CVPR | code | 1681 |
| Locally Optimized Product Quantization for Approximate Nearest Neighbor Search | CVPR | code | 437 |
| Clothing Co-Parsing by Joint Image Segmentation and Labeling | CVPR | code | 218 |
| Multiscale Combinatorial Grouping | CVPR | code | 185 |
| Face Alignment at 3000 FPS via Regressing Local Binary Features | CVPR | code | 164 |
| Cross-Scale Cost Aggregation for Stereo Matching | CVPR | code | 106 |
| Transfer Joint Matching for Unsupervised Domain Adaptation | CVPR | code | 67 |
| Deep Learning Face Representation from Predicting 10,000 Classes | CVPR | code | 62 |
| BING: Binarized Normed Gradients for Objectness Estimation at 300fps | CVPR | code | 44 |
| One Millisecond Face Alignment with an Ensemble of Regression Trees | CVPR | code | 43 |
| 3D Reconstruction from Accidental Motion | CVPR | code | 42 |
| Predicting Matchability | CVPR | code | 38 |
| Dense Semantic Image Segmentation with Objects and Attributes | CVPR | code | 28 |
| Scene-Independent Group Profiling in Crowd | CVPR | code | 28 |
| Shrinkage Fields for Effective Image Restoration | CVPR | code | 25 |
| Adaptive Color Attributes for Real-Time Visual Tracking | CVPR | code | 25 |
| Minimal Scene Descriptions from Structure from Motion Models | CVPR | code | 22 |
| Parallax-tolerant Image Stitching | CVPR | code | 20 |
| Learning Mid-level Filters for Person Re-identification | CVPR | code | 20 |
| Fast Edge-Preserving PatchMatch for Large Displacement Optical Flow | CVPR | code | 18 |
| Product Sparse Coding | CVPR | code | 16 |
| Convolutional Neural Networks for No-Reference Image Quality Assessment | CVPR | code | 16 |
| Seeing 3D Chairs: Exemplar Part-based 2D-3D Alignment using a Large Dataset of CAD Models | CVPR | code | 15 |
| StoryGraphs: Visualizing Character Interactions as a Timeline | CVPR | code | 14 |
| Nonparametric Part Transfer for Fine-grained Recognition | CVPR | code | 13 |
| Scalable Multitask Representation Learning for Scene Classification | CVPR | code | 11 |
| Investigating Haze-relevant Features in A Learning Framework for Image Dehazing | CVPR | code | 7 |
| Reconstructing PASCAL VOC | CVPR | code | 6 |
| Collaborative Hashing | CVPR | code | 6 |
| Tell Me What You See and I will Show You Where It Is | CVPR | code | 6 |
| Salient Region Detection via High-Dimensional Color Transform | CVPR | code | 6 |

## 2013

| Title | Conf | Code | Stars |
|:--------|:--------:|:--------:|:--------:|
| A generic decentralized trust management framework | SPE | code | 6 |
