Reading list for the Advanced Machine Learning Course

**Instructor:** Sung Ju Hwang ([email protected])

**Office:** This is an online course. E3-1, Room 1427 (Instructor), Room 1435 (TAs).

Office hours: By appointment only.

**Absolute Grading**

- Paper Presentation: 20%
- Attendance and Participation: 20%
- Project: 60%

| Dates | Topic |
|---|:---|
|8/31| Course Introduction |
|9/2| Review of Deep Learning Basics (Video Lecture) |
|9/7| Bayesian Deep Learning - VAEs and BNNs (Lecture) |
|9/9| Bayesian Deep Learning - Bayesian Approximations, Modeling Uncertainty (Lecture) **Review Due September 12th**|
|9/14| Bayesian Deep Learning (Presentation) |
|9/16| Deep Generative Models - Generative Adversarial Networks (Lecture) |
|9/21| Deep Generative Models - Autoregressive Models (Lecture) **Review Due**|
|9/23| Deep Generative Models - Flow-Based Models (Lecture) |
|9/28| Deep Generative Models (Presentation) |
|10/5| Deep Reinforcement Learning - Value-based RL (Lecture) |
|10/7| Deep Reinforcement Learning - Policy and Model-based RL (Lecture) **Review Due October 7th. Project Proposal Due October 11th**|
|10/12| Deep Reinforcement Learning (Presentation) |
|10/14| Memory- and Computation-Efficient Deep Learning (Lecture) **Review Due**|
|10/26| Memory- and Computation-Efficient Deep Learning (Presentation), **Project Meetings** |
|10/28| Meta-Learning (Lecture) **Review Due**|
|11/2| Meta-Learning (Presentation) **Mid-term Presentation at 7pm**|
|11/4| Continual Learning (Lecture) **Review Due**|
|11/9| Continual Learning (Presentation) |
|11/11| Interpretable Deep Learning (Lecture) **Review Due**|
|11/16| Interpretable Deep Learning (Presentation) |
|11/18| Adversarially-Robust Deep Learning (Lecture), **Review Due, Project Meetings** |
|11/23| Adversarially-Robust Deep Learning (Presentation), **Project Meetings** |
|11/30| Graph Neural Networks (Lecture) **Review Due** |
|12/2| Graph Neural Networks (Presentation) |
|12/7| Self-Supervised Learning (Lecture) **Review Due December 2nd** |
|12/9| Self-Supervised Learning (Presentation) **Final Paper Due December 13th**|
|12/14| Federated Learning (Lecture) **Review Due** |
|12/16| Federated Learning (Presentation) |
|12/18| **(Online) Workshop** |

[Kingma and Welling 14] Auto-Encoding Variational Bayes, ICLR 2014.

[Kingma et al. 15] Variational Dropout and the Local Reparameterization Trick, NIPS 2015.

[Blundell et al. 15] Weight Uncertainty in Neural Networks, ICML 2015.

[Gal and Ghahramani 16] Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning, ICML 2016.

[Liu et al. 16] Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm, NIPS 2016.

[Mandt et al. 17] Stochastic Gradient Descent as Approximate Bayesian Inference, JMLR 2017.

[Kendall and Gal 17] What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision?, NIPS 2017.

[Gal et al. 17] Concrete Dropout, NIPS 2017.

[Gal et al. 17] Deep Bayesian Active Learning with Image Data, ICML 2017.

[Teye et al. 18] Bayesian Uncertainty Estimation for Batch Normalized Deep Networks, ICML 2018.

[Garnelo et al. 18] Conditional Neural Processes, ICML 2018.

[Kim et al. 19] Attentive Neural Processes, ICLR 2019.

[Sun et al. 19] Functional Variational Bayesian Neural Networks, ICLR 2019.

[Louizos et al. 19] The Functional Neural Process, NeurIPS 2019.

[Amersfoort et al. 20] Uncertainty Estimation Using a Single Deep Deterministic Neural Network, ICML 2020.

[Dusenberry et al. 20] Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors, ICML 2020.

[Wenzel et al. 20] How Good is the Bayes Posterior in Deep Neural Networks Really?, ICML 2020.

[Lee et al. 20] Bootstrapping Neural Processes, arXiv preprint, 2020.
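
As a quick illustration of the uncertainty-estimation theme running through this section, here is a minimal PyTorch sketch of test-time Monte Carlo dropout in the spirit of [Gal and Ghahramani 16]; the toy model, input sizes, and sample count are illustrative assumptions, not code from any of the papers above.

```python
import torch
import torch.nn as nn

# Toy classifier with dropout; any dropout-trained network would do.
model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(),
                      nn.Dropout(p=0.1), nn.Linear(64, 10))

def mc_dropout_predict(model, x, num_samples=30):
    model.train()  # keep dropout stochastic at test time
    with torch.no_grad():
        # num_samples stochastic forward passes -> [T, batch, classes]
        probs = torch.stack([model(x).softmax(dim=-1)
                             for _ in range(num_samples)])
    return probs.mean(0), probs.var(0)  # predictive mean and variance

mean, var = mc_dropout_predict(model, torch.randn(8, 16))
```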

[Rezende and Mohamed 15] Variational Inference with Normalizing Flows, ICML 2015.

[Germain et al. 15] MADE: Masked Autoencoder for Distribution Estimation, ICML 2015.

[Kingma et al. 16] Improved Variational Inference with Inverse Autoregressive Flow, NIPS 2016.

[Oord et al. 16] Pixel Recurrent Neural Networks, ICML 2016.

[Dinh et al. 17] Density Estimation Using Real NVP, ICLR 2017.

[Papamakarios et al. 17] Masked Autoregressive Flow for Density Estimation, NIPS 2017.

[Huang et al. 18] Neural Autoregressive Flows, ICML 2018.

[Kingma and Dhariwal 18] Glow: Generative Flow with Invertible 1x1 Convolutions, NeurIPS 2018.

[Ho et al. 19] Flow++: Improving Flow-Based Generative Models with Variational Dequantization and Architecture Design, ICML 2019.

[Chen et al. 19] Residual Flows for Invertible Generative Modeling, NeurIPS 2019.

[Tran et al. 19] Discrete Flows: Invertible Generative Models of Discrete Data, NeurIPS 2019.

[Ping et al. 20] WaveFlow: A Compact Flow-based Model for Raw Audio, ICML 2020.

[Vahdat and Kautz 20] NVAE: A Deep Hierarchical Variational Autoencoder, arXiv preprint, 2020.
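
The flow papers above all build on the change-of-variables formula; a toy affine coupling layer in the style of Real NVP [Dinh et al. 17] shows why the log-determinant stays cheap. The dimensions, hidden width, and tanh stabilizer are assumptions of this sketch.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """Transforms half the dimensions conditioned on the other half,
    giving a triangular Jacobian with an O(d) log-determinant."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim // 2, 64), nn.ReLU(),
                                 nn.Linear(64, dim))  # emits scale and shift

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=-1)
        s, t = self.net(x1).chunk(2, dim=-1)
        s = torch.tanh(s)                       # keep scales well-behaved
        y2 = x2 * torch.exp(s) + t
        log_det = s.sum(dim=-1)                 # log|det J| per example
        return torch.cat([x1, y2], dim=-1), log_det

y, log_det = AffineCoupling(dim=4)(torch.randn(8, 4))
```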

[Goodfellow et al. 14] Generative Adversarial Nets, NIPS 2014.

[Radford et al. 16] Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, ICLR 2016.

[Chen et al. 16] InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets, NIPS 2016.

[Arjovsky et al. 17] Wasserstein Generative Adversarial Networks, ICML 2017.

[Zhu et al. 17] Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, ICCV 2017.

[Zhang et al. 17] Adversarial Feature Matching for Text Generation, ICML 2017.

[Karras et al. 18] Progressive Growing of GANs for Improved Quality, Stability, and Variation, ICLR 2018.

[Brock et al. 19] Large Scale GAN Training for High-Fidelity Natural Image Synthesis, ICLR 2019.

[Karras et al. 19] A Style-Based Generator Architecture for Generative Adversarial Networks, CVPR 2019.

[Xu et al. 19] Modeling Tabular Data using Conditional GAN, NeurIPS 2019.

[Karras et al. 20] Analyzing and Improving the Image Quality of StyleGAN, CVPR 2020.

[Zhao et al. 20] Feature Quantization Improves GAN Training, ICML 2020.

[Sinha et al. 20] Small-GAN: Speeding up GAN Training using Core-Sets, ICML 2020.
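
For reference, the standard non-saturating objective from [Goodfellow et al. 14], written as two loss functions over discriminator logits; a sketch with the networks themselves left abstract.

```python
import torch
import torch.nn.functional as F

def discriminator_loss(d_real_logits, d_fake_logits):
    # Classify real samples as 1 and generated samples as 0.
    real = F.binary_cross_entropy_with_logits(
        d_real_logits, torch.ones_like(d_real_logits))
    fake = F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.zeros_like(d_fake_logits))
    return real + fake

def generator_loss(d_fake_logits):
    # Non-saturating heuristic: maximize log D(G(z))
    # rather than minimizing log(1 - D(G(z))).
    return F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.ones_like(d_fake_logits))
```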

[Mnih et al. 13] Playing Atari with Deep Reinforcement Learning, NIPS Deep Learning Workshop 2013.

[Silver et al. 14] Deterministic Policy Gradient Algorithms, ICML 2014.

[Schulman et al. 15] Trust Region Policy Optimization, ICML 2015.

[Lillicrap et al. 16] Continuous Control with Deep Reinforcement Learning, ICLR 2016.

[Schaul et al. 16] Prioritized Experience Replay, ICLR 2016.

[Wang et al. 16] Dueling Network Architectures for Deep Reinforcement Learning, ICML 2016.

[Mnih et al. 16] Asynchronous Methods for Deep Reinforcement Learning, ICML 2016.

[Schulman et al. 17] Proximal Policy Optimization Algorithms, arXiv preprint, 2017.

[Nachum et al. 18] Data-Efficient Hierarchical Reinforcement Learning, NeurIPS 2018.

[Ha et al. 18] Recurrent World Models Facilitate Policy Evolution, NeurIPS 2018.

[Burda et al. 19] Large-Scale Study of Curiosity-Driven Learning, ICLR 2019.

[Vinyals et al. 19] Grandmaster level in StarCraft II using multi-agent reinforcement learning, Nature, 2019.

[Bellemare et al. 19] A Geometric Perspective on Optimal Representations for Reinforcement Learning, NeurIPS 2019.

[Janner et al. 19] When to Trust Your Model: Model-Based Policy Optimization, NeurIPS 2019.

[Fellows et al. 19] VIREL: A Variational Inference Framework for Reinforcement Learning, NeurIPS 2019.

[Kaiser et al. 20] Model Based Reinforcement Learning for Atari, ICLR 2020.

[Agarwal et al. 20] An Optimistic Perspective on Offline Reinforcement Learning, ICML 2020.

[Fedus et al. 20] Revisiting Fundamentals of Experience Replay, ICML 2020.

[Lee et al. 20] Batch Reinforcement Learning with Hyperparameter Gradients, ICML 2020.

[Raileanu et al. 20] Automatic Data Augmentation for Generalization in Deep Reinforcement Learning, arXiv preprint, 2020.
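
The value-based papers in this section revolve around the one-step temporal-difference target popularized by DQN [Mnih et al. 13]; a minimal sketch, with the target network and batch tensors assumed given.

```python
import torch

def dqn_targets(target_net, rewards, next_states, dones, gamma=0.99):
    """One-step TD targets: r + gamma * max_a Q_target(s', a), with
    bootstrapping disabled on terminal transitions."""
    with torch.no_grad():
        next_q = target_net(next_states).max(dim=-1).values
    return rewards + gamma * (1.0 - dones.float()) * next_q
```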

[Han et al. 15] Learning both Weights and Connections for Efficient Neural Networks, NIPS 2015.

[Wen et al. 16] Learning Structured Sparsity in Deep Neural Networks, NIPS 2016.

[Han et al. 16] Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding, ICLR 2016.

[Molchanov et al. 17] Variational Dropout Sparsifies Deep Neural Networks, ICML 2017.

[Louizos et al. 17] Bayesian Compression for Deep Learning, NIPS 2017.

[Louizos et al. 18] Learning Sparse Neural Networks Through L0 Regularization, ICLR 2018.

[Howard et al. 17] MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, arXiv preprint, 2017.

[Frankle and Carbin 19] The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks, ICLR 2019.

[Lee et al. 19] SNIP: Single-Shot Network Pruning Based On Connection Sensitivity, ICLR 2019.

[Liu et al. 19] Rethinking the Value of Network Pruning, ICLR 2019.

[Jung et al. 19] Learning to Quantize Deep Networks by Optimizing Quantization Intervals with Task Loss, CVPR 2019.

[Morcos et al. 19] One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers, NeurIPS 2019.

[Renda et al. 20] Comparing Rewinding and Fine-tuning in Neural Network Pruning, ICLR 2020.

[Ye et al. 20] Good Subnetworks Provably Exist: Pruning via Greedy Forward Selection, ICML 2020.

[Frankle et al. 20] Linear Mode Connectivity and the Lottery Ticket Hypothesis, ICML 2020.

[Li et al. 20] Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers, ICML 2020.

[Nagel et al. 20] Up or Down? Adaptive Rounding for Post-Training Quantization, ICML 2020.

[Meng et al. 20] Training Binary Neural Networks using the Bayesian Learning Rule, ICML 2020.
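
A minimal sketch of global magnitude pruning in the spirit of [Han et al. 15]: zero the smallest-magnitude weights everywhere and keep the masks for later fine-tuning. The quantile-based threshold is an assumption of this sketch, not the papers' exact procedure.

```python
import torch

def magnitude_prune(model, sparsity=0.9):
    # Global threshold: the `sparsity` quantile of all |weights|.
    all_weights = torch.cat([p.abs().flatten() for p in model.parameters()])
    threshold = torch.quantile(all_weights, sparsity)
    masks = {}
    for name, p in model.named_parameters():
        masks[name] = (p.abs() > threshold).float()
        p.data.mul_(masks[name])  # zero pruned weights in place
    return masks  # reapply after each update to keep pruned weights at zero
```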

[Santoro et al. 16] Meta-Learning with Memory-Augmented Neural Networks, ICML 2016.

[Vinyals et al. 16] Matching Networks for One Shot Learning, NIPS 2016.

[Edwards and Storkey 17] Towards a Neural Statistician, ICLR 2017.

[Finn et al. 17] Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, ICML 2017.

[Snell et al. 17] Prototypical Networks for Few-shot Learning, NIPS 2017.

[Nichol et al. 18] On First-Order Meta-learning Algorithms, arXiv preprint, 2018.

[Lee and Choi 18] Gradient-Based Meta-Learning with Learned Layerwise Metric and Subspace, ICML 2018.

[Liu et al. 19] Learning to Propagate Labels: Transductive Propagation Network for Few-shot Learning, ICLR 2019.

[Gordon et al. 19] Meta-Learning Probabilistic Inference for Prediction, ICLR 2019.

[Ravi and Beatson 19] Amortized Bayesian Meta-Learning, ICLR 2019.

[Rakelly et al. 19] Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables, ICML 2019.

[Shu et al. 19] Meta-Weight-Net: Learning an Explicit Mapping For Sample Weighting, NeurIPS 2019.

[Finn et al. 19] Online Meta-Learning, ICML 2019.

[Lee et al. 20] Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distribution Tasks, ICLR 2020.

[Yin et al. 20] Meta-Learning without Memorization, ICLR 2020.

[Iakovleva et al. 20] Meta-Learning with Shared Amortized Variational Inference, ICML 2020.

[Bronskill et al. 20] TaskNorm: Rethinking Batch Normalization for Meta-Learning, ICML 2020.
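 
Most gradient-based entries above descend from MAML [Finn et al. 17]; here is a first-order sketch of its inner loop using `torch.func.functional_call` (available in recent PyTorch), with the loss function and support data assumed given.

```python
import torch

def maml_inner_step(model, loss_fn, support_x, support_y, inner_lr=0.01):
    """Adapt a functional copy of the parameters on a task's support set.
    Without create_graph=True this is the first-order variant, as in
    [Nichol et al. 18]."""
    params = dict(model.named_parameters())
    preds = torch.func.functional_call(model, params, (support_x,))
    loss = loss_fn(preds, support_y)
    grads = torch.autograd.grad(loss, list(params.values()))
    # Adapted parameters, to be evaluated on the task's query set.
    return {n: p - inner_lr * g for (n, p), g in zip(params.items(), grads)}
```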

[Rusu et al. 16] Progressive Neural Networks, arXiv preprint, 2016.

[Kirkpatrick et al. 17] Overcoming catastrophic forgetting in neural networks, PNAS 2017.

[Lee et al. 17] Overcoming Catastrophic Forgetting by Incremental Moment Matching, NIPS 2017.

[Shin et al. 17] Continual Learning with Deep Generative Replay, NIPS 2017.

[Lopez-Paz and Ranzato 17] Gradient Episodic Memory for Continual Learning, NIPS 2017.

[Yoon et al. 18] Lifelong Learning with Dynamically Expandable Networks, ICLR 2018.

[Nguyen et al. 18] Variational Continual Learning, ICLR 2018.

[Schwarz et al. 18] Progress & Compress: A Scalable Framework for Continual Learning, ICML 2018.

[Chaudhry et al. 19] Efficient Lifelong Learning with A-GEM, ICLR 2019.

[Rao et al. 19] Continual Unsupervised Representation Learning, NeurIPS 2019.

[Rolnick et al. 19] Experience Replay for Continual Learning, NeurIPS 2019.

[Jerfel et al. 19] Reconciling Meta-Learning and Continual Learning with Online Mixtures of Tasks, NeurIPS 2019.

[Yoon et al. 20] Scalable and Order-robust Continual Learning with Additive Parameter Decomposition, ICLR 2020.

[Knoblauch et al. 20] Optimal Continual Learning has Perfect Memory and is NP-hard, ICML 2020.

[Ramasesh et al. 20] Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics, Continual Learning Workshop, ICML 2020.
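
The regularization-based strand of this section starts from EWC [Kirkpatrick et al. 17], whose quadratic penalty anchors parameters to those learned on a previous task, weighted by a diagonal Fisher estimate; a sketch assuming `fisher` and `old_params` were precomputed after that task.

```python
def ewc_penalty(model, fisher, old_params, lam=100.0):
    """0.5 * lam * sum_i F_i * (theta_i - theta*_i)^2, added to the new
    task's loss to discourage forgetting."""
    penalty = 0.0
    for name, p in model.named_parameters():
        penalty = penalty + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return 0.5 * lam * penalty
```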

[Ribeiro et al. 16] "Why Should I Trust You?" Explaining the Predictions of Any Classifier, KDD 2016.

[Kim et al. 16] Examples are not Enough, Learn to Criticize! Criticism for Interpretability, NIPS 2016.

[Choi et al. 16] RETAIN: An Interpretable Predictive Model for Healthcare using Reverse Time Attention Mechanism, NIPS 2016.

[Koh et al. 17] Understanding Black-box Predictions via Influence Functions, ICML 2017.

[Bau et al. 17] Network Dissection: Quantifying Interpretability of Deep Visual Representations, CVPR 2017.

[Selvaraju et al. 17] Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization, ICCV 2017.

[Kim et al. 18] Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV), ICML 2018.

[Heo et al. 18] Uncertainty-Aware Attention for Reliable Interpretation and Prediction, NeurIPS 2018.

[Bau et al. 19] GAN Dissection: Visualizing and Understanding Generative Adversarial Networks, ICLR 2019.

[Ghorbani et al. 19] Towards Automatic Concept-based Explanations, NeurIPS 2019.

[Coenen et al. 19] Visualizing and Measuring the Geometry of BERT, NeurIPS 2019.

[Heo et al. 20] Cost-Effective Interactive Attention Learning with Neural Attention Processes, ICML 2020.

[Agarwal et al. 20] Neural Additive Models: Interpretable Machine Learning with Neural Nets, arXiv preprint, 2020.
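
As a concrete example of gradient-based attribution, a compact Grad-CAM sketch after [Selvaraju et al. 17]: activations of a chosen convolutional layer are weighted by spatially pooled gradients of the target class score. The hook mechanics and layer choice are assumptions of this sketch.

```python
import torch

def grad_cam(model, conv_layer, x, class_idx):
    acts, grads = {}, {}
    h1 = conv_layer.register_forward_hook(
        lambda m, inp, out: acts.update(a=out))
    h2 = conv_layer.register_full_backward_hook(
        lambda m, gin, gout: grads.update(g=gout[0]))
    model(x)[0, class_idx].backward()        # gradient of the class score
    h1.remove(); h2.remove()
    weights = grads["g"].mean(dim=(2, 3), keepdim=True)  # pooled gradients
    cam = torch.relu((weights * acts["a"]).sum(dim=1))   # [B, H, W] map
    return cam / (cam.max() + 1e-8)
```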

[Guo et al. 17] On Calibration of Modern Neural Networks, ICML 2017.

[Lakshminarayanan et al. 17] Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles, NIPS 2017.

[Liang et al. 18] Enhancing the Reliability of Out-of-distribution Image Detection in Neural Networks, ICLR 2018.

[Lee et al. 18] Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples, ICLR 2018.

[Kuleshov et al. 18] Accurate Uncertainties for Deep Learning Using Calibrated Regression, ICML 2018.

[Jiang et al. 18] To Trust Or Not To Trust A Classifier, NeurIPS 2018.

[Madras et al. 18] Predict Responsibly: Improving Fairness and Accuracy by Learning to Defer, NeurIPS 2018.

[Maddox et al. 19] A Simple Baseline for Bayesian Uncertainty in Deep Learning, NeurIPS 2019.

[Kull et al. 19] Beyond temperature scaling: Obtaining well-calibrated multiclass probabilities with Dirichlet calibration, NeurIPS 2019.

[Thulasidasan et al. 19] On Mixup Training: Improved Calibration and Predictive Uncertainty for Deep Neural Networks, NeurIPS 2019.

[Ovadia et al. 19] Can You Trust Your Model’s Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift, NeurIPS 2019.

[Hendrycks et al. 20] AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty, ICLR 2020.

[Filos et al. 20] Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts?, ICML 2020.
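
Several papers here measure or repair calibration; the simplest baseline is temperature scaling [Guo et al. 17], which fits one scalar T on held-out validation logits by minimizing NLL. [Guo et al. 17] use LBFGS; this sketch substitutes a few Adam steps.

```python
import torch
import torch.nn.functional as F

def fit_temperature(val_logits, val_labels, steps=200, lr=0.01):
    log_t = torch.zeros(1, requires_grad=True)  # optimize log T so T > 0
    opt = torch.optim.Adam([log_t], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(val_logits / log_t.exp(), val_labels)
        loss.backward()
        opt.step()
    return log_t.exp().item()  # divide test logits by T before softmax
```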

[Szegedy et al. 14] Intriguing Properties of Neural Networks, ICLR 2014.

[Goodfellow et al. 15] Explaining and Harnessing Adversarial Examples, ICLR 2015.

[Kurakin et al. 17] Adversarial Machine Learning at Scale, ICLR 2017.

[Madry et al. 18] Towards Deep Learning Models Resistant to Adversarial Attacks, ICLR 2018.

[Eykholt et al. 18] Robust Physical-World Attacks on Deep Learning Visual Classification, CVPR 2018.

[Athalye et al. 18] Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples, ICML 2018.

[Zhang et al. 19] Theoretically Principled Trade-off between Robustness and Accuracy, ICML 2019.

[Carmon et al. 19] Unlabeled Data Improves Adversarial Robustness, NeurIPS 2019.

[Ilyas et al. 19] Adversarial Examples are not Bugs, They Are Features, NeurIPS 2019.

[Li et al. 19] Certified Adversarial Robustness with Additive Noise, NeurIPS 2019.

[Tramèr and Boneh 19] Adversarial Training and Robustness for Multiple Perturbations, NeurIPS 2019.

[Shafahi et al. 19] Adversarial Training for Free!, NeurIPS 2019.

[Wong et al. 20] Fast is Better Than Free: Revisiting Adversarial Training, ICLR 2020.

[Madaan et al. 20] Adversarial Neural Pruning with Latent Vulnerability Suppression, ICML 2020.

[Maini et al. 20] Adversarial Robustness Against the Union of Multiple Perturbation Models, ICML 2020.
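
The attack underlying much of this section is l-infinity PGD [Madry et al. 18]: iterated FGSM steps [Goodfellow et al. 15] with projection back into the epsilon-ball. A sketch assuming image inputs in [0, 1].

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    x_adv = x + torch.empty_like(x).uniform_(-eps, eps)  # random start
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv + alpha * grad.sign()              # FGSM step
        x_adv = x + (x_adv - x).clamp(-eps, eps)         # project to eps-ball
        x_adv = x_adv.clamp(0.0, 1.0)                    # valid pixel range
    return x_adv.detach()
```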

[Li et al. 16] Gated Graph Sequence Neural Networks, ICLR 2016.

[Hamilton et al. 17] Inductive Representation Learning on Large Graphs, NIPS 2017.

[Kipf and Welling 17] Semi-Supervised Classification with Graph Convolutional Networks, ICLR 2017.

[Velickovic et al. 18] Graph Attention Networks, ICLR 2018.

[Ying et al. 18] Hierarchical Graph Representation Learning with Differentiable Pooling, NeurIPS 2018.

[Xu et al. 19] How Powerful are Graph Neural Networks?, ICLR 2019.

[Maron et al. 19] Provably Powerful Graph Networks, NeurIPS 2019.

[Yun et al. 19] Graph Transformer Networks, NeurIPS 2019.

[Loukas 20] What Graph Neural Networks Cannot Learn: Depth vs Width, ICLR 2020.

[Bianchi et al. 20] Spectral Clustering with Graph Neural Networks for Graph Pooling, ICML 2020.

[Xhonneux et al. 20] Continuous Graph Neural Networks, ICML 2020.

[Garg et al. 20] Generalization and Representational Limits of Graph Neural Networks, ICML 2020.

[Bécigneul et al. 20] Optimal Transport Graph Neural Networks, arXiv preprint, 2020.
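
The message-passing core shared by these papers is easiest to see in the GCN propagation rule of [Kipf and Welling 17], H' = act(D^-1/2 (A + I) D^-1/2 H W); a dense-matrix sketch for clarity (real implementations use sparse ops).

```python
import torch

def gcn_layer(H, A, W, act=torch.relu):
    """One propagation step on node features H with adjacency A."""
    A_hat = A + torch.eye(A.size(0))                 # add self-loops
    d_inv_sqrt = A_hat.sum(dim=1).pow(-0.5)          # D^-1/2 diagonal
    A_norm = d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]
    return act(A_norm @ H @ W)
```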

[Zoph and Le 17] Neural Architecture Search with Reinforcement Learning, ICLR 2017.

[Baker et al. 17] Designing Neural Network Architectures using Reinforcement Learning, ICLR 2017.

[Real et al. 17] Large-Scale Evolution of Image Classifiers, ICML 2017.

[Liu et al. 18] Hierarchical Representations for Efficient Architecture Search, ICLR 2018.

[Pham et al. 18] Efficient Neural Architecture Search via Parameters Sharing, ICML 2018.

[Luo et al. 18] Neural Architecture Optimization, NeurIPS 2018.

[Liu et al. 19] DARTS: Differentiable Architecture Search, ICLR 2019.

[Tan et al. 19] MnasNet: Platform-Aware Neural Architecture Search for Mobile, CVPR 2019.

[Cai et al. 19] ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware, ICLR 2019.

[Zhou et al. 19] BayesNAS: A Bayesian Approach for Neural Architecture Search, ICML 2019.

[Tan and Le 19] EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, ICML 2019.

[Guo et al. 19] NAT: Neural Architecture Transformer for Accurate and Compact Architectures, NeurIPS 2019.

[Chen et al. 19] DetNAS: Backbone Search for Object Detection, NeurIPS 2019.

[Dong and Yang 20] NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search, ICLR 2020.

[Zela et al. 20] Understanding and Robustifying Differentiable Architecture Search, ICLR 2020.

[Such et al. 20] Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data, ICML 2020.

[Li et al. 20] Neural Architecture Search in A Proxy Validation Loss Landscape, ICML 2020.
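
Differentiable NAS methods such as DARTS [Liu et al. 19] relax the discrete choice among candidate operations into a softmax over learnable architecture weights; a sketch of one mixed edge, with the candidate operation list left abstract.

```python
import torch
import torch.nn as nn

class MixedOp(nn.Module):
    """Softmax-weighted combination of candidate operations on one edge."""
    def __init__(self, ops):
        super().__init__()
        self.ops = nn.ModuleList(ops)
        self.alpha = nn.Parameter(torch.zeros(len(ops)))  # architecture weights

    def forward(self, x):
        w = self.alpha.softmax(dim=0)
        return sum(wi * op(x) for wi, op in zip(w, self.ops))

edge = MixedOp([nn.Identity(), nn.Linear(16, 16),
                nn.Sequential(nn.Linear(16, 16), nn.ReLU())])
out = edge(torch.randn(4, 16))
```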

[Konečný et al. 16] Federated Optimization: Distributed Machine Learning for On-Device Intelligence, arXiv preprint, 2016.

[Konečný et al. 16] Federated Learning: Strategies for Improving Communication Efficiency, NIPS Workshop on Private Multi-Party Machine Learning 2016.

[McMahan et al. 17] Communication-Efficient Learning of Deep Networks from Decentralized Data, AISTATS 2017.

[Smith et al. 17] Federated Multi-Task Learning, NIPS 2017.

[Li et al. 20] Federated Optimization in Heterogeneous Networks, MLSys 2020.

[Yurochkin et al. 19] Bayesian Nonparametric Federated Learning of Neural Networks, ICML 2019.

[Bonawitz et al. 19] Towards Federated Learning at Scale: System Design, MLSys 2019.

[Wang et al. 20] Federated Learning with Matched Averaging, ICLR 2020.

[Li et al. 20] On the Convergence of FedAvg on Non-IID data, ICLR 2020.

[Karimireddy et al. 20] SCAFFOLD: Stochastic Controlled Averaging for Federated Learning, ICML 2020.

[Yu et al. 20] Federated Learning with Only Positive Labels, ICML 2020.

[Hamer et al. 20] FedBoost: Communication-Efficient Algorithms for Federated Learning, ICML 2020.

[Rothchild et al. 20] FetchSGD: Communication-Efficient Federated Learning with Sketching, ICML 2020.

[Pathak and Wainwright 20] FedSplit: An Algorithmic Framework for Fast Federated Optimization, NeurIPS 2020.
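
The baseline for nearly everything above is FedAvg [McMahan et al. 17]: the server averages client parameters weighted by local dataset size. A sketch where each client reports a (state_dict, num_examples) pair of float tensors.

```python
def fedavg(client_updates):
    """Weighted average of client state_dicts for one communication round."""
    total = sum(n for _, n in client_updates)
    keys = client_updates[0][0].keys()
    return {k: sum(sd[k] * (n / total) for sd, n in client_updates)
            for k in keys}
```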

[Dosovitskiy et al. 14] Discriminative Unsupervised Feature Learning with Convolutional Neural Networks, NIPS 2014.

[Pathak et al. 16] Context Encoders: Feature Learning by Inpainting, CVPR 2016.

[Noroozi and Favaro 16] Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles, ECCV 2016.

[Gidaris et al. 18] Unsupervised Representation Learning by Predicting Image Rotations, ICLR 2018.

[He et al. 20] Momentum Contrast for Unsupervised Visual Representation Learning, CVPR 2020.

[Chen et al. 20] A Simple Framework for Contrastive Learning of Visual Representations, ICML 2020.

[Mikolov et al. 13] Efficient Estimation of Word Representations in Vector Space, ICLR 2013.

[Devlin et al. 19] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, NAACL 2019.

[Clark et al. 20] ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators, ICLR 2020.

[Hu et al. 20] Strategies for Pre-training Graph Neural Networks, ICLR 2020.

[Chen et al. 20] Generative Pretraining from Pixels, ICML 2020.

[Grill et al. 20] Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning, arXiv preprint, 2020.

[Chen et al. 20] Big Self-Supervised Models are Strong Semi-Supervised Learners, arXiv preprint, 2020.

[Laskin et al. 20] CURL: Contrastive Unsupervised Representations for Reinforcement Learning, ICML 2020.
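
The contrastive entries converge on variants of the NT-Xent loss from SimCLR [Chen et al. 20], which pulls two augmented views of the same image together against all other pairs in the batch; a compact, unoptimized sketch.

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    """z1, z2: [N, d] embeddings of two views of the same N examples."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)   # 2N unit vectors
    sim = z @ z.t() / tau                         # cosine similarities
    sim.fill_diagonal_(float("-inf"))             # exclude self-pairs
    n = z1.size(0)
    # Positive for row i is its other view: i+n for the first half, i-n after.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)
```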