EfficientDNNs

540 Stars 85 Forks MIT License 142 Commits 0 Opened issues



A collection of recent methods on DNN compression and acceleration. There are mainly five kinds of methods for efficient DNNs:

- neural architecture redesign or search
  - maintain accuracy at lower cost (e.g., #Params, #FLOPs): MobileNet, ShuffleNet, etc.
  - maintain cost with higher accuracy: Inception, ResNeXt, Xception, etc.
- pruning (structured and unstructured)
- quantization
- matrix decomposition
- knowledge distillation
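The difference between unstructured and structured pruning named above can be illustrated with a minimal sketch. This is not taken from any paper in the list: the function names `prune_unstructured` and `prune_structured`, the magnitude and L2-norm criteria, and the toy weight matrix are all assumptions chosen for illustration.

```python
import math


def prune_unstructured(weights, sparsity):
    """Zero the smallest-magnitude individual weights; the matrix shape is unchanged."""
    magnitudes = sorted(abs(w) for row in weights for w in row)
    k = int(len(magnitudes) * sparsity)  # number of weights to zero
    if k == 0:
        return [row[:] for row in weights]
    threshold = magnitudes[k - 1]  # k-th smallest magnitude
    # Ties at the threshold are also zeroed, so slightly more than k weights may go.
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]


def prune_structured(weights, keep):
    """Keep the `keep` rows (output units) with the largest L2 norm; drop the others."""
    norms = [math.sqrt(sum(w * w for w in row)) for row in weights]
    ranked = sorted(range(len(weights)), key=lambda i: norms[i], reverse=True)
    kept = sorted(ranked[:keep])  # preserve the original row order
    return [weights[i][:] for i in kept]


# Toy 3x3 layer (hypothetical values).
W = [[0.9, -0.05, 0.4],
     [0.01, 0.02, -0.03],
     [-0.7, 0.6, 0.1]]

sparse_W = prune_unstructured(W, sparsity=0.5)  # same shape, scattered zeros
small_W = prune_structured(W, keep=2)           # fewer rows, still dense
```

Unstructured pruning leaves a matrix of the same shape with scattered zeros, which needs sparse kernels or hardware support to run faster. Structured pruning returns a smaller dense matrix that runs faster on ordinary hardware.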

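About abbreviations: in the list below, a lowercase letter appended to the venue name marks the presentation type (e.g., ICLRo is an ICLR oral):

- o for oral,
- w for workshop,
- s for spotlight,
- b for best paper.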
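Entries in the list follow a "YEAR-VENUE-Title" pattern (or "YEAR.MONTH-Title" for dated preprints), with the legend's letters appended to the venue, as in "2018-ICLRo-..." or "2018-NIPSwb-...". The sketch below shows one way to decode such a label. The function `parse_entry`, the letter mapping (o, w, s, b), and the exception set `PLAIN_VENUES` are assumptions made for illustration and are not part of the repository.

```python
import re

# Assumed suffix letters, read off the legend and entries such as "ICLRo" or "NIPSwb".
FLAGS = {"o": "oral", "w": "workshop", "s": "spotlight", "b": "best paper"}

# Hypothetical exception list: venue tags whose own name ends in a flag letter.
PLAIN_VENUES = {"NNs"}

ENTRY = re.compile(r"^(\d{4})(?:\.(\d{2}))?-(.+)$")
VENUE = re.compile(r"([A-Za-z]*[A-Z])([owsb]*)")


def parse_entry(entry):
    """Split one list entry into year, optional month, venue, flags and title."""
    m = ENTRY.match(entry)
    if m is None:
        raise ValueError(f"unrecognized entry: {entry!r}")
    year, month, rest = int(m.group(1)), m.group(2), m.group(3)
    if month is not None:
        # "YYYY.MM-Title": a dated preprint, so there is no venue field.
        return {"year": year, "month": int(month), "venue": None,
                "flags": [], "title": rest}
    # Split on the first hyphen only, because titles may contain hyphens.
    venue, _, title = rest.partition("-")
    flags = []
    if venue not in PLAIN_VENUES:
        # Peel trailing flag letters off a venue tag that ends in a capital letter.
        vm = VENUE.fullmatch(venue)
        if vm:
            venue, flags = vm.group(1), [FLAGS[c] for c in vm.group(2)]
    return {"year": year, "month": None, "venue": venue,
            "flags": flags, "title": title}


example = parse_entry("2018-NIPSwb-Rethinking the Value of Network Pruning")
```

The heuristic only treats trailing lowercase letters as flags when the rest of the tag ends in a capital, so venues such as "BigComp" or "SysML" pass through unchanged. A tag like "NNs" still has to be listed explicitly.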



2011

- 2011-JMLR-Learning with Structured Sparsity
- 2011-NIPSw-Improving the speed of neural networks on CPUs

2013

- 2013-NIPS-Predicting Parameters in Deep Learning
- 2013.08-Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation

2014

- 2014-BMVC-Speeding up convolutional neural networks with low rank expansions
- 2014-INTERSPEECH-1-Bit Stochastic Gradient Descent and its Application to Data-Parallel Distributed Training of Speech DNNs
- 2014-NIPS-Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation
- 2014-NIPS-Do deep neural nets really need to be deep
- 2014.12-Memory bounded deep convolutional networks

2015

- 2015-ICLR-Speeding-up convolutional neural networks using fine-tuned cp-decomposition
- 2015-ICML-Compressing neural networks with the hashing trick
- 2015-INTERSPEECH-A Diversity-Penalizing Ensemble Training Method for Deep Learning
- 2015-BMVC-Data-free parameter pruning for deep neural networks
- 2015-BMVC-Learning the structure of deep architectures using l1 regularization
- 2015-NIPS-Learning both Weights and Connections for Efficient Neural Network
- 2015-NIPS-Binaryconnect: Training deep neural networks with binary weights during propagations
- 2015-NIPS-Structured Transforms for Small-Footprint Deep Learning
- 2015-NIPS-Tensorizing Neural Networks
- 2015-NIPSw-Distilling Intractable Generative Models
- 2015-NIPSw-Federated Optimization: Distributed Optimization Beyond the Datacenter
- 2015-CVPR-Efficient and Accurate Approximations of Nonlinear Convolutional Networks [2016 TPAMI version: Accelerating Very Deep Convolutional Networks for Classification and Detection]
- 2015-CVPR-Sparse Convolutional Neural Networks
- 2015-ICCV-An Exploration of Parameter Redundancy in Deep Networks with Circulant Projections
- 2015.12-Exploiting Local Structures with the Kronecker Layer in Convolutional Networks

2016

- 2016-ICLR-Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding (best paper!)
- 2016-ICLR-All you need is a good init [Code]
- 2016-ICLR-Convolutional neural networks with low-rank regularization [Code]
- 2016-ICLR-Diversity networks
- 2016-ICLR-Neural networks with few multiplications
- 2016-ICLR-Compression of deep convolutional neural networks for fast and low power mobile applications
- 2016-ICLRw-Randomout: Using a convolutional gradient norm to win the filter lottery
- 2016-CVPR-Fast algorithms for convolutional neural networks
- 2016-CVPR-Fast ConvNets Using Group-wise Brain Damage
- 2016-BMVC-Learning neural network architectures using backpropagation
- 2016-ECCV-Less is more: Towards compact cnns
- 2016-EMNLP-Sequence-Level Knowledge Distillation
- 2016-NIPS-Learning Structured Sparsity in Deep Neural Networks
- 2016-NIPS-Dynamic Network Surgery for Efficient DNNs [Code]
- 2016-NIPS-Learning the Number of Neurons in Deep Neural Networks
- 2016-NIPS-Memory-Efficient Backpropagation Through Time
- 2016-NIPS-PerforatedCNNs: Acceleration through Elimination of Redundant Convolutions
- 2016-NIPS-LightRNN: Memory and Computation-Efficient Recurrent Neural Networks
- 2016-NIPS-CNNpack: packing convolutional neural networks in the frequency domain
- 2016-ISCA-Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks
- 2016-ICASSP-Learning compact recurrent neural networks
- 2016-CoNLL-Compression of Neural Machine Translation Models via Pruning
- 2016.03-Adaptive Computation Time for Recurrent Neural Networks
- 2016.06-Structured Convolution Matrices for Energy-efficient Deep learning
- 2016.06-Deep neural networks are robust to weight binarization and other non-linear distortions
- 2016.06-Hypernetworks
- 2016.07-IHT-Training skinny deep neural networks with iterative hard thresholding methods
- 2016.08-Recurrent Neural Networks With Limited Numerical Precision
- 2016.10-Deep model compression: Distilling knowledge from noisy teachers
- 2016.10-Federated Optimization: Distributed Machine Learning for On-Device Intelligence
- 2016.11-Alternating Direction Method of Multipliers for Sparse Convolutional Neural Networks

2017

- 2017-ICLR-Pruning Convolutional Neural Networks for Resource Efficient Inference
- 2017-ICLR-Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights [Code]
- 2017-ICLR-Do Deep Convolutional Nets Really Need to be Deep and Convolutional?
- 2017-ICLR-DSD: Dense-Sparse-Dense Training for Deep Neural Networks (Closely related work: SFP and IHT)
- 2017-ICLR-Faster CNNs with Direct Sparse Convolutions and Guided Pruning
- 2017-ICLR-Towards the Limit of Network Quantization
- 2017-ICLR-Loss-aware Binarization of Deep Networks
- 2017-ICLR-Trained Ternary Quantization [Code]
- 2017-ICLR-Exploring Sparsity in Recurrent Neural Networks
- 2017-ICLR-Soft Weight-Sharing for Neural Network Compression [Reddit discussion] [Code]
- 2017-ICLR-Variable Computation in Recurrent Neural Networks
- 2017-ICLR-Training Compressed Fully-Connected Networks with a Density-Diversity Penalty
- 2017-ICML-Variational dropout sparsifies deep neural networks
- 2017-ICML-Theoretical Properties for Neural Networks with Weight Matrices of Low Displacement Rank
- 2017-ICML-Deep Tensor Convolution on Multicores
- 2017-ICML-Delta Networks for Optimized Recurrent Network Computation
- 2017-ICML-Beyond Filters: Compact Feature Map for Portable Deep Model
- 2017-ICML-Combined Group and Exclusive Sparsity for Deep Neural Networks
- 2017-ICML-MEC: Memory-efficient Convolution for Deep Neural Network
- 2017-ICML-Deciding How to Decide: Dynamic Routing in Artificial Neural Networks
- 2017-ICML-ZipML: Training Models with End-to-End Low Precision: The Cans, the Cannots, and a Little Bit of Deep Learning
- 2017-ICML-Analytical Guarantees on Numerical Precision of Deep Neural Networks
- 2017-ICML-Adaptive Neural Networks for Efficient Inference
- 2017-ICML-SplitNet: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Parallelization
- 2017-ICMLw-Bayesian Sparsification of Recurrent Neural Networks
- 2017-CVPR-Learning deep CNN denoiser prior for image restoration
- 2017-CVPR-Deep roots: Improving cnn efficiency with hierarchical filter groups
- 2017-CVPR-More is less: A more complicated network with less inference complexity
- 2017-CVPR-All You Need is Beyond a Good Init: Exploring Better Solution for Training Extremely Deep Convolutional Neural Networks with Orthonormality and Modulation
- 2017-CVPR-ResNeXt-Aggregated Residual Transformations for Deep Neural Networks
- 2017-CVPR-Xception: Deep learning with depthwise separable convolutions
- 2017-CVPR-Designing Energy-Efficient CNN using Energy-aware Pruning
- 2017-CVPR-Spatially Adaptive Computation Time for Residual Networks
- 2017-CVPR-Network Sketching: Exploiting Binary Structure in Deep CNNs
- 2017-CVPR-A Compact DNN: Approaching GoogLeNet-Level Accuracy of Classification and Domain Adaptation
- 2017-ICCV-Channel pruning for accelerating very deep neural networks [Code]
- 2017-ICCV-Learning efficient convolutional networks through network slimming [Code]
- 2017-ICCV-ThiNet: A filter level pruning method for deep neural network compression [Project] [Code] [2018 TPAMI version]
- 2017-ICCV-Interleaved group convolutions
- 2017-ICCV-Coordinating Filters for Faster Deep Neural Networks [Code]
- 2017-ICCV-Performance Guaranteed Network Acceleration via High-Order Residual Quantization
- 2017-NIPS-Net-trim: Convex pruning of deep neural networks with performance guarantee [Code]
- 2017-NIPS-Runtime neural pruning
- 2017-NIPS-Learning to Prune Deep Neural Networks via Layer-wise Optimal Brain Surgeon [Code]
- 2017-NIPS-Federated Multi-Task Learning
- 2017-NIPS-Bayesian Compression for Deep Learning [Code]
- 2017-NIPS-Structured Bayesian Pruning via Log-Normal Multiplicative Noise
- 2017-NIPS-Towards Accurate Binary Convolutional Neural Network
- 2017-NIPS-Soft-to-Hard Vector Quantization for End-to-End Learning Compressible Representations
- 2017-NIPS-TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning
- 2017-NIPS-Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks
- 2017-NIPS-Training Quantized Nets: A Deeper Understanding
- 2017-NIPS-The Reversible Residual Network: Backpropagation Without Storing Activations [Code]
- 2017-NIPS-Compression-aware Training of Deep Networks
- 2017-FPGA-ESE: efficient speech recognition engine with compressed LSTM on FPGA (best paper!)
- 2017-AISTATS-Communication-Efficient Learning of Deep Networks from Decentralized Data
- 2017-ICASSP-Accelerating Deep Convolutional Networks using low-precision and sparsity
- 2017-NNs-Nonredundant sparse feature extraction using autoencoders with receptive fields clustering
- 2017.02-The Power of Sparsity in Convolutional Neural Networks
- 2017.07-Stochastic, Distributed and Federated Optimization for Machine Learning
- 2017.05-Structural Compression of Convolutional Neural Networks Based on Greedy Filter Pruning
- 2017.07-Extremely Low Bit Neural Network: Squeeze the Last Bit Out with ADMM
- 2017.11-GPU Kernels for Block-Sparse Weights [Code]
- 2017.11-Block-sparse recurrent neural networks

2018

- 2018-AAAI-Auto-balanced Filter Pruning for Efficient Convolutional Neural Networks
- 2018-AAAI-Deep Neural Network Compression with Single and Multiple Level Quantization
- 2018-AAAI-Dynamic Deep Neural Networks: Optimizing Accuracy-Efficiency Trade-offs by Selective Execution
- 2018-ICML-On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization
- 2018-ICML-Weightless: Lossy Weight Encoding For Deep Neural Network Compression
- 2018-ICMLw-Assessing the Scalability of Biologically-Motivated Deep Learning Algorithms and Architectures
- 2018-ICML-Understanding and simplifying one-shot architecture search
- 2018-ICLRo-Training and Inference with Integers in Deep Neural Networks
- 2018-ICLR-Rethinking the Smaller-Norm-Less-Informative Assumption in Channel Pruning of Convolution Layers
- 2018-ICLR-N2N learning: Network to Network Compression via Policy Gradient Reinforcement Learning
- 2018-ICLR-Model compression via distillation and quantization
- 2018-ICLR-Towards Image Understanding from Deep Compression Without Decoding
- 2018-ICLR-Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training
- 2018-ICLR-Mixed Precision Training of Convolutional Neural Networks using Integer Operations
- 2018-ICLR-Mixed Precision Training
- 2018-ICLR-Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy
- 2018-ICLR-Loss-aware Weight Quantization of Deep Networks
- 2018-ICLR-Alternating Multi-bit Quantization for Recurrent Neural Networks
- 2018-ICLR-Adaptive Quantization of Neural Networks
- 2018-ICLR-Variational Network Quantization
- 2018-ICLR-Espresso: Efficient Forward Propagation for Binary Deep Neural Networks
- 2018-ICLR-Learning to share: Simultaneous parameter tying and sparsification in deep learning
- 2018-ICLR-Learning Sparse Neural Networks through L0 Regularization
- 2018-ICLR-WRPN: Wide Reduced-Precision Networks
- 2018-ICLR-Deep rewiring: Training very sparse deep networks
- 2018-ICLR-Efficient sparse-winograd convolutional neural networks [Code]
- 2018-ICLR-Learning Intrinsic Sparse Structures within Long Short-term Memory
- 2018-ICLR-Multi-scale dense networks for resource efficient image classification
- 2018-ICLR-Compressing Word Embedding via Deep Compositional Code Learning
- 2018-ICLR-Learning Discrete Weights Using the Local Reparameterization Trick
- 2018-ICLR-Training wide residual networks for deployment using a single bit for each weight
- 2018-ICLR-The High-Dimensional Geometry of Binary Neural Networks
- 2018-ICLRw-To Prune, or Not to Prune: Exploring the Efficacy of Pruning for Model Compression (Similar topic: 2018-NIPSw-nip in the bud, 2018-NIPSw-rethink)
- 2018-ICLRw-Systematic Weight Pruning of DNNs using Alternating Direction Method of Multipliers
- 2018-ICLRw-Weightless: Lossy weight encoding for deep neural network compression
- 2018-ICLRw-Variance-based Gradient Compression for Efficient Distributed Deep Learning
- 2018-ICLRw-Stacked Filters Stationary Flow For Hardware-Oriented Acceleration Of Deep Convolutional Neural Networks
- 2018-ICLRw-Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks
- 2018-ICLRw-Accelerating Neural Architecture Search using Performance Prediction
- 2018-ICLRw-Nonlinear Acceleration of CNNs
- 2018-ICLRw-Attention-Based Guided Structured Sparsity of Deep Neural Networks [Code]
- 2018-CVPR-Context-Aware Deep Feature Compression for High-Speed Visual Tracking
- 2018-CVPR-NISP: Pruning Networks using Neuron Importance Score Propagation
- 2018-CVPR-Condensenet: An efficient densenet using learned group convolutions [Code]
- 2018-CVPR-Shift: A zero flop, zero parameter alternative to spatial convolutions
- 2018-CVPR-Explicit Loss-Error-Aware Quantization for Low-Bit Deep Neural Networks
- 2018-CVPR-Interleaved structured sparse convolutional neural networks
- 2018-CVPR-Towards Effective Low-bitwidth Convolutional Neural Networks
- 2018-CVPR-CLIP-Q: Deep Network Compression Learning by In-Parallel Pruning-Quantization
- 2018-CVPR-Blockdrop: Dynamic inference paths in residual networks
- 2018-CVPR-Nestednet: Learning nested sparse structures in deep neural networks
- 2018-CVPR-Stochastic downsampling for cost-adjustable inference and improved regularization in convolutional networks
- 2018-CVPR-Wide Compression: Tensor Ring Nets
- 2018-CVPR-Learning Compact Recurrent Neural Networks With Block-Term Tensor Decomposition
- 2018-CVPR-Learning Time/Memory-Efficient Deep Architectures With Budgeted Super Networks
- 2018-CVPR-HydraNets: Specialized Dynamic Architectures for Efficient Inference
- 2018-CVPR-SYQ: Learning Symmetric Quantization for Efficient Deep Neural Networks
- 2018-CVPR-Towards Effective Low-Bitwidth Convolutional Neural Networks
- 2018-CVPR-Two-Step Quantization for Low-Bit Neural Networks
- 2018-CVPR-Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
- 2018-CVPR-"Learning-Compression" Algorithms for Neural Net Pruning
- 2018-CVPR-PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning [Code]
- 2018-CVPR-MorphNet: Fast & Simple Resource-Constrained Structure Learning of Deep Networks [Code]
- 2018-CVPR-ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
- 2018-CVPRw-Squeezenext: Hardware-aware neural network design
- 2018-ICML-Compressing Neural Networks using the Variational Information Bottleneck
- 2018-ICML-DCFNet: Deep Neural Network with Decomposed Convolutional Filters
- 2018-ICML-Deep k-Means Re-Training and Parameter Sharing with Harder Cluster Assignments for Compressing Deep Convolutions
- 2018-ICML-Error Compensated Quantized SGD and its Applications to Large-scale Distributed Optimization
- 2018-ICML-High Performance Zero-Memory Overhead Direct Convolutions
- 2018-ICML-Kronecker Recurrent Units
- 2018-ICML-Weightless: Lossy weight encoding for deep neural network compression
- 2018-ICML-StrassenNets: Deep learning with a multiplication budget
- 2018-ICML-Learning Compact Neural Networks with Regularization
- 2018-ICML-WSNet: Compact and Efficient Networks Through Weight Sampling
- 2018-ICML-Gradually Updated Neural Networks for Large-Scale Image Recognition [Code]
- 2018-IJCAI-Efficient DNN Neuron Pruning by Minimizing Layer-wise Nonlinear Reconstruction Error
- 2018-IJCAI-Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks [Code]
- 2018-IJCAI-Where to Prune: Using LSTM to Guide End-to-end Pruning
- 2018-IJCAI-Accelerating Convolutional Networks via Global & Dynamic Filter Pruning
- 2018-IJCAI-Optimization based Layer-wise Magnitude-based Pruning for DNN Compression
- 2018-IJCAI-Progressive Blockwise Knowledge Distillation for Neural Network Acceleration
- 2018-IJCAI-Complementary Binary Quantization for Joint Multiple Indexing
- 2018-ECCV-A Systematic DNN Weight Pruning Framework using Alternating Direction Method of Multipliers [Code]
- 2018-ECCV-Coreset-Based Neural Network Compression
- 2018-ECCV-Data-Driven Sparse Structure Selection for Deep Neural Networks [Code]
- 2018-ECCV-Training Binary Weight Networks via Semi-Binary Decomposition
- 2018-ECCV-Learning Compression from Limited Unlabeled Data
- 2018-ECCV-Constraint-Aware Deep Neural Network Compression
- 2018-ECCV-Sparsely Aggregated Convolutional Networks
- 2018-ECCV-Deep Expander Networks: Efficient Deep Networks from Graph Theory [Code]
- 2018-ECCV-SparseNet-Sparsely Aggregated Convolutional Networks [Code]
- 2018-ECCV-Ask, acquire, and attack: Data-free uap generation using class impressions
- 2018-ECCV-Netadapt: Platform-aware neural network adaptation for mobile applications
- 2018-ECCV-Clustering Convolutional Kernels to Compress Deep Neural Networks
- 2018-ECCV-Bi-Real Net: Enhancing the Performance of 1-bit CNNs With Improved Representational Capability and Advanced Training Algorithm
- 2018-ECCV-Extreme Network Compression via Filter Group Approximation
- 2018-ECCV-Convolutional Networks with Adaptive Inference Graphs
- 2018-ECCV-SkipNet: Learning Dynamic Routing in Convolutional Networks [Code]
- 2018-ECCV-Value-aware Quantization for Training and Inference of Neural Networks
- 2018-ECCV-LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks
- 2018-ECCV-AMC: AutoML for Model Compression and Acceleration on Mobile Devices
- 2018-ECCV-Piggyback: Adapting a single network to multiple tasks by learning to mask weights
- 2018-BMVCo-Structured Probabilistic Pruning for Convolutional Neural Network Acceleration
- 2018-BMVC-Efficient Progressive Neural Architecture Search
- 2018-BMVC-Igcv3: Interleaved lowrank group convolutions for efficient deep neural networks
- 2018-NIPS-Discrimination-aware Channel Pruning for Deep Neural Networks
- 2018-NIPS-Frequency-Domain Dynamic Pruning for Convolutional Neural Networks
- 2018-NIPS-ChannelNets: Compact and Efficient Convolutional Neural Networks via Channel-Wise Convolutions
- 2018-NIPS-DropBlock: A regularization method for convolutional networks
- 2018-NIPS-Constructing fast network through deconstruction of convolution
- 2018-NIPS-Learning Versatile Filters for Efficient Convolutional Neural Networks [Code]
- 2018-NIPS-Moonshine: Distilling with cheap convolutions
- 2018-NIPS-HitNet: hybrid ternary recurrent neural network
- 2018-NIPS-FastGRNN: A Fast, Accurate, Stable and Tiny Kilobyte Sized Gated Recurrent Neural Network
- 2018-NIPS-Training DNNs with Hybrid Block Floating Point
- 2018-NIPS-Reversible Recurrent Neural Networks
- 2018-NIPS-Synaptic Strength For Convolutional Neural Network
- 2018-NIPS-Learning sparse neural networks via sensitivity-driven regularization
- 2018-NIPS-Multi-Task Zipping via Layer-wise Neuron Sharing
- 2018-NIPS-A Linear Speedup Analysis of Distributed Deep Learning with Sparse and Quantized Communication
- 2018-NIPS-Gradient Sparsification for Communication-Efficient Distributed Optimization
- 2018-NIPS-GradiVeQ: Vector Quantization for Bandwidth-Efficient Gradient Aggregation in Distributed CNN Training
- 2018-NIPS-ATOMO: Communication-efficient Learning via Atomic Sparsification
- 2018-NIPS-Norm matters: efficient and accurate normalization schemes in deep networks
- 2018-NIPS-Sparsified SGD with memory
- 2018-NIPS-Pelee: A Real-Time Object Detection System on Mobile Devices
- 2018-NIPS-Scalable methods for 8-bit training of neural networks
- 2018-NIPS-TETRIS: TilE-matching the TRemendous Irregular Sparsity
- 2018-NIPS-Training deep neural networks with 8-bit floating point numbers
- 2018-NIPS-Multiple instance learning for efficient sequential data classification on resource-constrained devices
- 2018-NIPSw-Pruning neural networks: is it time to nip it in the bud?
- 2018-NIPSwb-Rethinking the Value of Network Pruning [2019 ICLR version]
- 2018-NIPSw-Structured Pruning for Efficient ConvNets via Incremental Regularization [2019 IJCNN version] [Code]
- 2018-NIPSw-Adaptive Mixture of Low-Rank Factorizations for Compact Neural Modeling
- 2018-NIPSw-Learning Sparse Networks Using Targeted Dropout [OpenReview] [Code]
- 2018-WACV-Recovering from Random Pruning: On the Plasticity of Deep Convolutional Neural Networks
- 2018.05-Compression of Deep Convolutional Neural Networks under Joint Sparsity Constraints
- 2018.05-AutoPruner: An End-to-End Trainable Filter Pruning Method for Efficient Deep Model Inference
- 2018.10-A Closer Look at Structured Pruning for Neural Network Compression [Code]
- 2018.11-Second-order Optimization Method for Large Mini-batch: Training ResNet-50 on ImageNet in 35 Epochs
- 2018.11-PydMobileNet: Improved Version of MobileNets with Pyramid Depthwise Separable Convolution

2019

- 2019-SysML-Towards Federated Learning at Scale: System Design
- 2019-SysML-To compress or not to compress: Understanding the Interactions between Adversarial Attacks and Neural Network Compression
- 2019-ICLRo-The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks (best paper!)
- 2019-ICLR-Slimmable Neural Networks [Code]
- 2019-ICLR-Defensive Quantization: When Efficiency Meets Robustness
- 2019-ICLR-Minimal Random Code Learning: Getting Bits Back from Compressed Model Parameters [Code]
- 2019-ICLR-ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware [Code]
- 2019-ICLR-SNIP: Single-shot Network Pruning based on Connection Sensitivity
- 2019-ICLR-Non-vacuous Generalization Bounds at the ImageNet Scale: a PAC-Bayesian Compression Approach
- 2019-ICLR-Dynamic Channel Pruning: Feature Boosting and Suppression
- 2019-ICLR-Energy-Constrained Compression for Deep Neural Networks via Weighted Sparse Projection and Layer Input Masking
- 2019-ICLR-RotDCF: Decomposition of Convolutional Filters for Rotation-Equivariant Deep Networks
- 2019-ICLR-Dynamic Sparse Graph for Efficient Deep Learning
- 2019-ICLR-Big-Little Net: An Efficient Multi-Scale Feature Representation for Visual and Speech Recognition
- 2019-ICLR-Data-Dependent Coresets for Compressing Neural Networks with Applications to Generalization Bounds
- 2019-ICLR-Learning Recurrent Binary/Ternary Weights
- 2019-ICLR-Double Viterbi: Weight Encoding for High Compression Ratio and Fast On-Chip Reconstruction for Deep Neural Network
- 2019-ICLR-Relaxed Quantization for Discretized Neural Networks
- 2019-ICLR-Integer Networks for Data Compression with Latent-Variable Models
- 2019-ICLR-Minimal Random Code Learning: Getting Bits Back from Compressed Model Parameters
- 2019-ICLR-Analysis of Quantized Models
- 2019-ICLR-DARTS: Differentiable Architecture Search [Code]
- 2019-ICLR-Graph HyperNetworks for Neural Architecture Search
- 2019-ICLR-Learnable Embedding Space for Efficient Neural Architecture Compression [Code]
- 2019-ICLR-Efficient Multi-Objective Neural Architecture Search via Lamarckian Evolution
- 2019-ICLR-SNAS: stochastic neural architecture search
- 2019-AAAIo-A layer decomposition-recomposition framework for neuron pruning towards accurate lightweight networks
- 2019-AAAI-Balanced Sparsity for Efficient DNN Inference on GPU [Code]
- 2019-AAAI-CircConv: A Structured Convolution with Low Complexity
- 2019-AAAI-Regularized Evolution for Image Classifier Architecture Search
- 2019-AAAI-Universal Approximation Property and Equivalence of Stochastic Computing-Based Neural Networks and Binary Neural Networks
- 2019-WACV-DAC: Data-free Automatic Acceleration of Convolutional Networks
- 2019-ASPLOS-Packing Sparse Convolutional Neural Networks for Efficient Systolic Array Implementations: Column Combining Under Joint Optimization
- 2019-CVPRo-HAQ: hardware-aware automated quantization
- 2019-CVPRo-Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration [Code]
- 2019-CVPR-All You Need is a Few Shifts: Designing Efficient Convolutional Neural Networks for Image Classification
- 2019-CVPR-Importance Estimation for Neural Network Pruning [Code]
- 2019-CVPR-HetConv: Heterogeneous Kernel-Based Convolutions for Deep CNNs
- 2019-CVPR-Fully Learnable Group Convolution for Acceleration of Deep Neural Networks
- 2019-CVPR-Towards Optimal Structured CNN Pruning via Generative Adversarial Learning
- 2019-CVPR-ChamNet: Towards Efficient Network Design through Platform-Aware Model Adaptation
- 2019-CVPR-Partial Order Pruning: for Best Speed/Accuracy Trade-off in Neural Architecture Search [Code]
- 2019-CVPR-Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation [Code]
- 2019-CVPR-MnasNet: Platform-Aware Neural Architecture Search for Mobile [Code]
- 2019-CVPR-MFAS: Multimodal Fusion Architecture Search
- 2019-CVPR-A Neurobiological Evaluation Metric for Neural Network Model Search
- 2019-CVPR-Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells
- 2019-CVPR-Efficient Neural Network Compression [Code]
- 2019-CVPR-T-Net: Parametrizing Fully Convolutional Nets with a Single High-Order Tensor
- 2019-CVPR-Centripetal SGD for Pruning Very Deep Convolutional Networks with Complicated Structure [Code]
- 2019-CVPR-DSC: Dense-Sparse Convolution for Vectorized Inference of Convolutional Neural Networks
- 2019-CVPR-DupNet: Towards Very Tiny Quantized CNN With Improved Accuracy for Face Detection
- 2019-CVPR-ECC: Platform-Independent Energy-Constrained Deep Neural Network Compression via a Bilinear Regression Model
- 2019-CVPR-Variational Convolutional Neural Network Pruning
- 2019-CVPR-Accelerating Convolutional Neural Networks via Activation Map Compression
- 2019-CVPR-Compressing Convolutional Neural Networks via Factorized Convolutional Filters
- 2019-CVPR-Deep Virtual Networks for Memory Efficient Inference of Multiple Tasks
- 2019-CVPR-Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression
- 2019-CVPR-MBS: Macroblock Scaling for CNN Model Reduction
- 2019-CVPR-On Implicit Filter Level Sparsity in Convolutional Neural Networks
- 2019-CVPR-Structured Pruning of Neural Networks With Budget-Aware Regularization
- 2019-CVPRo-Neural Rejuvenation: Improving Deep Network Training by Enhancing Computational Resource Utilization [Code]
- 2019-ICML-Approximated Oracle Filter Pruning for Destructive CNN Width Optimization [Code]
- 2019-ICML-EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis [Code]
- 2019-ICML-Zero-Shot Knowledge Distillation in Deep Networks [Code]
- 2019-ICML-LegoNet: Efficient Convolutional Neural Networks with Lego Filters [Code]
- 2019-ICML-EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks [Code]
- 2019-ICML-Collaborative Channel Pruning for Deep Networks
- 2019-ICML-Training CNNs with Selective Allocation of Channels
- 2019-ICML-NAS-Bench-101: Towards Reproducible Neural Architecture Search [Code]
- 2019-ICMLw-Towards Learning of Filter-Level Heterogeneous Compression of Convolutional Neural Networks [Code]
- 2019-IJCAI-Play and Prune: Adaptive Filter Pruning for Deep Model Compression
- 2019-BigComp-Towards Robust Compressed Convolutional Neural Networks
- 2019-ICCV-Rethinking ImageNet Pre-training
- 2019-ICCV-Universally Slimmable Networks and Improved Training Techniques
- 2019-ICCV-MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning [Code]
- 2019-ICCV-Progressive Differentiable Architecture Search: Bridging the Depth Gap between Search and Evaluation [Code]
- 2019-ICCV-ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks
- 2019-NIPS-Global Sparse Momentum SGD for Pruning Very Deep Neural Networks
- 2019-NIPS-Model Compression with Adversarial Robustness: A Unified Optimization Framework
- 2019-NIPS-AutoPrune: Automatic Network Pruning by Regularizing Auxiliary Parameters
- 2019-NIPS-Double Quantization for Communication-Efficient Distributed Optimization
- 2019-NIPS-Focused Quantization for Sparse CNNs
- 2019-NIPS-E2-Train: Training State-of-the-art CNNs with Over 80% Energy Savings
- 2019-NIPS-MetaQuant: Learning to Quantize by Learning to Penetrate Non-differentiable Quantization
- 2019-NIPS-Random Projections with Asymmetric Quantization
- 2019-NIPS-Network Pruning via Transformable Architecture Search [Code]
- 2019-NIPS-Point-Voxel CNN for Efficient 3D Deep Learning [Code]
- 2019-NIPS-Gate Decorator: Global Filter Pruning Method for Accelerating Deep Convolutional Neural Networks
- 2019-NIPS-A Mean Field Theory of Quantized Deep Networks: The Quantization-Depth Trade-Off
- 2019-NIPS-Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification and Local Computations
- 2019-NIPS-Post training 4-bit quantization of convolutional networks for rapid-deployment
- 2019-PR-Filter-in-Filter: Improve CNNs in a Low-cost Way by Sharing Parameters among the Sub-filters of a Filter
- 2019-PRL-BDNN: Binary Convolution Neural Networks for Fast Object Detection
- 2019-TNNLS-Towards Compact ConvNets via Structure-Sparsity Regularized Filter Pruning [Code]
- 2019.02-The State of Sparsity in Deep Neural Networks (Review)
- 2019.03-Network Slimming by Slimmable Networks: Towards One-Shot Architecture Search for Channel Numbers [Code]
- 2019.03-Single Path One-Shot Neural Architecture Search with Uniform Sampling
- 2019.04-Resource Efficient 3D Convolutional Neural Networks
- 2019.04-Meta Filter Pruning to Accelerate Deep Convolutional Neural Networks
- 2019.04-Knowledge Squeezed Adversarial Network Compression
- 2019.05-Dynamic Neural Network Channel Execution for Efficient Training
- 2019.06-AutoGrow: Automatic Layer Growing in Deep Convolutional Networks
- 2019.06-BasisConv: A method for compressed representation and learning in CNNs
- 2019.06-BlockSwap: Fisher-guided Block Substitution for Network Compression
- 2019.06-Data-Free Quantization through Weight Equalization and Bias Correction
- 2019.06-Separable Layers Enable Structured Efficient Linear Substitutions [Code]
- 2019.06-Butterfly Transform: An Efficient FFT Based Neural Architecture Design
- 2019.06-A Taxonomy of Channel Pruning Signals in CNNs
- 2019.08-Adversarial Neural Pruning with Latent Vulnerability Suppression
- 2019.09-Training convolutional neural networks with cheap convolutions and online distillation
- 2019.09-Pruning from Scratch
- 2019.10-Structured Pruning of Large Language Models
- 2019.11-Adversarial Interpolation Training: A Simple Approach for Improving Model Robustness
- 2019.11-A Programmable Approach to Model Compression [Code]

2020

- 2020-AAAI-Pconv: The missing but desirable sparsity in dnn weight pruning for real-time execution on mobile devices
- 2020-AAAI-Channel Pruning Guided by Classification Loss and Feature Importance
- 2020-AAAI-Pruning from Scratch
- 2020-AAAI-Harmonious Coexistence of Structured Weight Pruning and Ternarization for Deep Neural Networks
- 2020-AAAI-AutoCompress: An Automatic DNN Structured Pruning Framework for Ultra-High Compression Rates
- 2020-AAAI-DARB: A Density-Adaptive Regular-Block Pruning for Deep Neural Networks
- 2020-AAAI-Real-Time Object Tracking via Meta-Learning: Efficient Model Adaptation and One-Shot Channel Pruning
- 2020-AAAI-Dynamic Network Pruning with Interpretable Layerwise Channel Selection
- 2020-AAAI-Reborn Filters: Pruning Convolutional Neural Networks with Limited Data
- 2020-AAAI-Layerwise Sparse Coding for Pruned Deep Neural Networks with Extreme Compression Ratio
- 2020-AAAI-Sparsity-inducing Binarized Neural Networks
- 2020-AAAI-Structured Sparsification of Gated Recurrent Neural Networks
- 2020-AAAI-Hierarchical Knowledge Squeezed Adversarial Network Compression
- 2020-AAAI-Embedding Compression with Isotropic Iterative Quantization
- 2020-ICLR-Lookahead: A Far-sighted Alternative of Magnitude-based Pruning [Code]
- 2020-ICLR-Dynamic Model Pruning with Feedback
- 2020-ICLR-Provable Filter Pruning for Efficient Neural Networks
- 2020-ICLR-Data-Independent Neural Pruning via Coresets
- 2020-ICLR-FSNet: Compression of Deep Convolutional Neural Networks by Filter Summary
- 2020-ICLR-Probabilistic Connection Importance Inference and Lossless Compression of Deep Neural Networks
- 2020-ICLR-BlockSwap: Fisher-guided Block Substitution for Network Compression on a Budget
- 2020-ICLR-Neural Epitome Search for Architecture-Agnostic Network Compression
- 2020-ICLR-One-Shot Pruning of Recurrent Neural Networks by Jacobian Spectrum Evaluation
- 2020-ICLR-DeepHoyer: Learning Sparser Neural Network with Differentiable Scale-Invariant Sparsity Measures [Code]
- 2020-ICLR-Dynamic Sparse Training: Find Efficient Sparse Network From Scratch With Trainable Masked Layers
- 2020-ICLR-Scalable Model Compression by Entropy Penalized Reparameterization
- 2020-ICLR-GraSP: Picking Winning Tickets Before Training By Preserving Gradient Flow [Code]
- 2020-ICLR-A Signal Propagation Perspective for Pruning Neural Networks at Initialization
- 2020-CVPR-GhostNet: More Features from Cheap Operations [Code]
- 2020-CVPR-Learning Filter Pruning Criteria for Deep Convolutional Neural Networks Acceleration
- 2020-CVPR-Filter Grafting for Deep Neural Networks
- 2020-CVPR-Low-rank Compression of Neural Nets: Learning the Rank of Each Layer
- 2020-CVPR-Structured Compression by Weight Encryption for Unstructured Pruning and Quantization
- 2020-CVPR-Learning Filter Pruning Criteria for Deep Convolutional Neural Networks Acceleration
- 2020-CVPR-APQ: Joint Search for Network Architecture, Pruning and Quantization Policy
- 2020-CVPR-Group Sparsity: The Hinge Between Filter Pruning and Decomposition for Network Compression [Code]
- 2020-CVPR-Neural Network Pruning With Residual-Connections and Limited-Data
- 2020-CVPR-Multi-Dimensional Pruning: A Unified Framework for Model Compression
- 2020-CVPR-Discrete Model Compression With Resource Constraint for Deep Neural Networks
- 2020-CVPR-Automatic Neural Network Compression by Sparsity-Quantization Joint Learning: A Constrained Optimization-Based Approach
- 2020-CVPR-Low-Rank Compression of Neural Nets: Learning the Rank of Each Layer
- 2020-CVPR-The Knowledge Within: Methods for Data-Free Model Compression
- 2020-CVPR-GAN Compression: Efficient Architectures for Interactive Conditional GANs [Code]
- 2020-CVPR-Few Sample Knowledge Distillation for Efficient Network Compression
- 2020-CVPR-Structured Multi-Hashing for Model Compression
- 2020-CVPRo-AdderNet: Do We Really Need Multiplications in Deep Learning? [Code]
- 2020-CVPRo-Towards Efficient Model Compression via Learned Global Ranking [Code]
- 2020-CVPRo-HRank: Filter Pruning Using High-Rank Feature Map [Code]
- 2020-ICML-PENNI: Pruned Kernel Sharing for Efficient CNN Inference [Code]
- 2020-ICML-Operation-Aware Soft Channel Pruning using Differentiable Masks
- 2020-ICML-DropNet: Reducing Neural Network Complexity via Iterative Pruning
- 2020-ICML-Proving the Lottery Ticket Hypothesis: Pruning is All You Need
- 2020-ICML-Network Pruning by Greedy Subnetwork Selection
- 2020-ICML-AutoGAN-Distiller: Searching to Compress Generative Adversarial Networks

NAS (Neural Architecture Search)


Papers-Knowledge Distillation

People (in alphabetical order)

People in NAS (in alphabetical order)


Lightweight DNN Engines/APIs

Related Repos and Websites
