A curated list of Backdoor Learning resources. For more details and our categorization criteria, please refer to our survey.
Backdoor learning is an emerging research area that studies the security of the training process of machine learning algorithms. It is critical for safely adopting third-party algorithms in practice. Although backdoor learning shares certain similarities with adversarial learning (which concentrates on the security of the inference process), the two have essential differences and can be easily distinguished.
Note: a 'backdoor' is also commonly called a 'Neural Trojan' or simply a 'Trojan'.
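To make the threat model concrete, here is a minimal, hypothetical sketch of the classic BadNets-style poisoning attack (a small patch trigger plus label flipping). The patch size, target label, and poisoning rate below are illustrative assumptions rather than values from any particular paper.

```python
# Minimal sketch of a BadNets-style poisoning attack (illustrative only).
# Assumptions: images are (N, H, W, C) uint8 arrays, labels are integer class ids;
# the 3x3 white patch, target label 0, and 5% poisoning rate are arbitrary choices.
import numpy as np

def poison_dataset(images, labels, target_label=0, poison_rate=0.05, patch_size=3, seed=0):
    """Stamp a small white patch in the bottom-right corner of a random subset
    of training images and relabel them to the attacker-chosen target class."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(len(images) * poison_rate)
    idx = rng.choice(len(images), size=n_poison, replace=False)
    images[idx, -patch_size:, -patch_size:, :] = 255  # the trigger pattern
    labels[idx] = target_label                         # dirty-label poisoning
    return images, labels, idx

# At test time, any input stamped with the same patch is (ideally, from the
# attacker's perspective) classified as `target_label`, while accuracy on
# clean inputs remains high, which is what makes the backdoor stealthy.
```

Clean-label attacks (e.g., Label-Consistent Backdoor Attacks below) achieve a similar effect without changing any labels, which makes the poisoned samples much harder to spot.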
Please help contribute to this list by contacting me or opening a pull request.
Markdown format:

```markdown
- Paper Name. [[pdf]](link) [[code]](link)
  - Author 1, Author 2, and Author 3. *Conference/Journal*, Year.
```
Backdoor Learning: A Survey. [pdf]
Backdoor Attacks and Countermeasures on Deep Learning: A Comprehensive Review. [pdf]
Data Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses. [pdf]
Deep Learning Backdoors. [pdf]
A Survey on Neural Trojans. [pdf]
A Master Key Backdoor for Universal Impersonation Attack against DNN-based Face Verification. [link]
WaNet - Imperceptible Warping-based Backdoor Attack. [pdf]
Deep Feature Space Trojan Attack of Neural Networks by Controlled Detoxification. [pdf] [code]
Backdoors Hidden in Facial Features: A Novel Invisible Backdoor Attack against Face Recognition Systems. [link]
One-to-N & N-to-One: Two Advanced Backdoor Attacks against Deep Learning Models. [pdf]
Invisible Backdoor Attacks on Deep Neural Networks via Steganography and Regularization. [pdf] [arXiv Version (2019)]
Composite Backdoor Attack for Deep Neural Network by Mixing Existing Benign Features. [pdf]
Input-Aware Dynamic Backdoor Attack. [pdf] [code]
Hidden Trigger Backdoor Attacks. [pdf] [code]
Bypassing Backdoor Detection Algorithms in Deep Learning. [pdf]
Backdoor Embedding in Convolutional Neural Network Models via Invisible Perturbation. [pdf]
Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks. [pdf] [code]
Can Adversarial Weight Perturbations Inject Neural Backdoors? [pdf]
Clean-Label Backdoor Attacks on Video Recognition Models. [pdf] [code]
Escaping Backdoor Attack Detection of Deep Learning. [link]
Live Trojan Attacks on Deep Neural Networks. [pdf] [code]
Backdooring and Poisoning Neural Networks with Image-Scaling Attacks. [pdf]
Backdoor Attack with Sample-Specific Triggers. [pdf]
Blind Backdoors in Deep Learning Models. [pdf]
HaS-Nets: A Heal and Select Mechanism to Defend DNNs Against Backdoor Attacks for Data Collection Scenarios. [pdf]
FaceHack: Triggering Backdoored Facial Recognition Systems Using Facial Characteristics. [pdf]
Light Can Hack Your Face! Black-box Backdoor Attack on Face Recognition Systems. [pdf]
Class-Oriented Poisoning Attack. [pdf]
Dynamic Backdoor Attacks Against Machine Learning Models. [pdf]
Latent Backdoor Attacks on Deep Neural Networks. [pdf]
A New Backdoor Attack in CNNs by Training Set Corruption Without Label Poisoning. [pdf]
Label-Consistent Backdoor Attacks. [pdf] [code]
BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain. [pdf] [journal]
Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning. [pdf] [code]
Trojaning Attack on Neural Networks. [pdf]
An Embarrassingly Simple Approach for Trojan Attack in Deep Neural Networks. [pdf] [code]
TBT: Targeted Neural Network Attack with Bit Trojan. [pdf] [code]
DeepPayload: Black-box Backdoor Attack on Deep Learning Models through Neural Payload Injection. [pdf]
TrojanNet: Embedding Hidden Trojan Horse Models in Neural Network. [pdf]
Don't Trigger Me! A Triggerless Backdoor Attack Against Deep Neural Networks. [pdf]
Backdooring Convolutional Neural Networks via Targeted Weight Perturbations. [pdf]
DeepSweep: An Evaluation Framework for Mitigating DNN Backdoor Attacks using Data Augmentation. [pdf]
Neural Trojans. [pdf]
Rethinking the Trigger of Backdoor Attack. [pdf]
ConFoc: Content-Focus Protection Against Trojan Attacks on Neural Networks. [pdf]
Februus: Input Purification Defense Against Trojan Attacks on Deep Neural Network Systems. [pdf]
Model Agnostic Defense against Backdoor Attacks in Machine Learning. [pdf]
Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks. [pdf] [code]
Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness. [pdf] [code]
Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks. [pdf] [code]
Neural Trojans. [pdf]
Disabling Backdoor and Identifying Poison Data by using Knowledge Distillation in Backdoor Attacks on Deep Neural Networks. [pdf]
Defending against Backdoor Attack on Deep Neural Networks. [pdf]
Neural Network Laundering: Removing Black-Box Backdoor Watermarks from Deep Neural Networks. [pdf]
HaS-Nets: A Heal and Select Mechanism to Defend DNNs Against Backdoor Attacks for Data Collection Scenarios. [pdf]
Detection of Backdoors in Trained Classifiers Without Access to the Training Set. [pdf]
Towards Inspecting and Eliminating Trojan Backdoors in Deep Neural Networks. [pdf] [previous version] [code]
GangSweep: Sweep out Neural Backdoors by GAN. [pdf]
Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks. [pdf] [code]
Defending Neural Backdoors via Generative Distribution Modeling. [pdf] [code]
DeepInspect: A Black-box Trojan Detection and Mitigation Framework for Deep Neural Networks. [pdf]
Revealing Perceptible Backdoors in DNNs Without the Training Set via the Maximum Achievable Misclassification Fraction Statistic. [pdf]
Backdoor Scanning for Deep Neural Networks through K-Arm Optimization. [pdf]
Scalable Backdoor Detection in Neural Networks. [pdf]
NNoculation: Broad Spectrum and Targeted Treatment of Backdoored DNNs. [pdf] [code]
Detecting AI Trojans Using Meta Neural Analysis. [pdf]
Universal Litmus Patterns: Revealing Backdoor Attacks in CNNs. [pdf] [code]
One-Pixel Signature: Characterizing CNN Models for Backdoor Detection. [pdf]
Practical Detection of Trojan Neural Networks: Data-Limited and Data-Free Cases. [pdf] [code]
Detecting Backdoor Attacks via Class Difference in Deep Neural Networks. [pdf]
Detecting Trojaned DNNs Using Counterfactual Attributions. [pdf]
Cassandra: Detecting Trojaned Networks from Adversarial Perturbations. [pdf]
Odyssey: Creation, Analysis and Detection of Trojan Models. [pdf] [dataset]
Noise-response Analysis for Rapid Detection of Backdoors in Deep Neural Networks. [pdf]
NeuronInspect: Detecting Backdoors in Neural Networks via Output Explanations. [pdf]
Robust Anomaly Detection and Backdoor Attack Detection via Differential Privacy. [pdf] [code]
Strong Data Augmentation Sanitizes Poisoning and Backdoor Attacks Without an Accuracy Trade-off. [pdf]
On the Effectiveness of Mitigating Data Poisoning Attacks with Gradient Shaping. [pdf] [code]
Removing Backdoor-Based Watermarks in Neural Networks with Limited Data. [pdf]
Demon in the Variant: Statistical Analysis of DNNs for Robust Backdoor Contamination Detection. [pdf]
CLEANN: Accelerated Trojan Shield for Embedded Neural Networks. [pdf]
Robust Anomaly Detection and Backdoor Attack Detection via Differential Privacy. [pdf] [code]
SentiNet: Detecting Localized Universal Attacks Against Deep Learning Systems. [pdf]
STRIP: A Defence Against Trojan Attacks on Deep Neural Networks. [pdf] [extension] [code]
Detecting Backdoor Attacks on Deep Neural Networks by Activation Clustering. [pdf]
Deep Probabilistic Models to Detect Data Poisoning Attacks. [pdf]
Spectral Signatures in Backdoor Attacks. [pdf] [code]
Exposing Backdoors in Robust Machine Learning Models. [pdf]
A Unified Framework for Analyzing and Detecting Malicious Examples of DNN Models. [pdf]
HaS-Nets: A Heal and Select Mechanism to Defend DNNs Against Backdoor Attacks for Data Collection Scenarios. [pdf]
Poison as a Cure: Detecting & Neutralizing Variable-Sized Backdoor Attacks in Deep Neural Networks. [pdf]
Certified Robustness to Label-Flipping Attacks via Randomized Smoothing. [pdf]
On Certifying Robustness against Backdoor Attacks via Randomized Smoothing. [pdf]
RAB: Provable Robustness Against Backdoor Attacks. [pdf] [code]
Weight Poisoning Attacks on Pre-trained Models. [pdf] [code]
A Backdoor Attack Against LSTM-based Text Classification Systems. [pdf]
Poison Attacks against Text Datasets with Conditional Adversarially Regularized Autoencoder. [pdf]
Detecting Universal Trigger’s Adversarial Attack with Honeypot. [pdf]
ONION: A Simple and Effective Defense Against Textual Backdoor Attacks. [pdf]
Mitigating Backdoor Attacks in LSTM-based Text Classification Systems by Backdoor Keyword Identification. [pdf]
Trojaning Language Models for Fun and Profit. [pdf]
BadNL: Backdoor Attacks Against NLP Models. [pdf]
Graph Backdoor. [pdf]
Backdoor Attacks to Graph Neural Networks. [pdf]
Stop-and-Go: Exploring Backdoor Attacks on Deep Reinforcement Learning-based Traffic Congestion Control Systems. [pdf]
TrojDRL: Trojan Attacks on Deep Reinforcement Learning Agents. [pdf]
Design of Intentional Backdoors in Sequential Models. [pdf]
How to Backdoor Federated Learning. [pdf]
Curse or Redemption? How Data Heterogeneity Affects the Robustness of Federated Learning. [pdf]
Attack of the Tails: Yes, You Really Can Backdoor Federated Learning. [pdf]
DBA: Distributed Backdoor Attacks against Federated Learning. [pdf]
Defending Against Backdoors in Federated Learning with Robust Learning Rate. [pdf]
The Limitations of Federated Learning in Sybil Settings. [pdf] [extension] [code]
Backdoor Attacks and Defenses in Feature-partitioned Collaborative Learning. [pdf]
Can You Really Backdoor Federated Learning? [pdf]
On Provable Backdoor Defense in Collaborative Learning. [pdf]
Robust Federated Learning with Attack-Adaptive Aggregation. [pdf] [code]
Meta Federated Learning. [pdf]
FLGUARD: Secure and Private Federated Learning. [pdf]
Toward Robustness and Privacy in Federated Learning: Experimenting with Local and Central Differential Privacy. [pdf]
Backdoor Attacks on Federated Meta-Learning. [pdf]
Dynamic Backdoor Attacks Against Federated Learning. [pdf]
Federated Learning in Adversarial Settings. [pdf]
BlockFLA: Accountable Federated Learning via Hybrid Blockchain Architecture. [pdf]
Mitigating Backdoor Attacks in Federated Learning. [pdf]
BaFFLe: Backdoor Detection via Feedback-based Federated Learning. [pdf]
Learning to Detect Malicious Clients for Robust Federated Learning. [pdf]
Attack-Resistant Federated Learning with Residual-based Reweighting. [pdf] [code]
Backdoor Attacks against Transfer Learning with Pre-trained Deep Learning Models. [pdf]
Weight Poisoning Attacks on Pre-trained Models. [pdf] [code]
Latent Backdoor Attacks on Deep Neural Networks. [pdf]
Red Alarm for Pre-trained Models: Universal Vulnerabilities by Neuron-Level Backdoor Attacks. [pdf] [code]
Backdoor Attack against Speaker Verification. [pdf] [code]
NeuroAttack: Undermining Spiking Neural Networks Security through Externally Triggered Bit-Flips. [pdf]
Explainability Matters: Backdoor Attacks on Medical Imaging. [pdf]
Trojan Attacks on Wireless Signal Classification with Adversarial Machine Learning. [pdf]
Backdoor Attacks on the DNN Interpretation System. [pdf]
Embedding and Synthesis of Knowledge in Tree Ensemble Classifiers. [pdf]
BAAAN: Backdoor Attacks Against Autoencoder and GAN-Based Machine Learning Models. [pdf]
Targeted Forgetting and False Memory Formation in Continual Learners through Adversarial Backdoor Attacks. [pdf]
Backdoors in Neural Models of Source Code. [pdf]
EEG-Based Brain-Computer Interfaces Are Vulnerable to Backdoor Attacks. [pdf]
Exploring Backdoor Poisoning Attacks Against Malware Classifiers. [pdf]
Bias Busters: Robustifying DL-based Lithographic Hotspot Detectors Against Backdooring Attacks. [pdf]
On the Trade-off between Adversarial and Backdoor Robustness. [pdf]
A Tale of Evil Twins: Adversarial Inputs versus Poisoned Models. [pdf] [code]
Systematic Evaluation of Backdoor Data Poisoning Attacks on Image Classifiers. [pdf]
On Evaluating Neural Network Backdoor Defenses. [pdf]
Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks. [pdf] [code]
Rethinking the Trigger of Backdoor Attack. [pdf]
TROJANZOO: Everything You Ever Wanted to Know about Neural Backdoors (But Were Afraid to Ask). [pdf] [code]
Poisoned Classifiers are Not Only Backdoored, They are Fundamentally Broken. [pdf] [code]
Effect of Backdoor Attacks over the Complexity of the Latent Space Distribution. [pdf] [code]
Trembling Triggers: Exploring the Sensitivity of Backdoors in DNN-based Face Recognition. [pdf]
Backdoor Attacks on Facial Recognition in the Physical World. [pdf] [Master Thesis]
Noise-response Analysis for Rapid Detection of Backdoors in Deep Neural Networks. [pdf]
Using Honeypots to Catch Adversarial Attacks on Neural Networks. [pdf]
Turning Your Weakness into a Strength: Watermarking Deep Neural Networks by Backdooring. [pdf] [code]
Open-sourced Dataset Protection via Backdoor Watermarking. [pdf]
What Do Deep Nets Learn? Class-wise Patterns Revealed in the Input Space. [pdf]
What Do You See? Evaluation of Explainable Artificial Intelligence (XAI) Interpretability through Neural Backdoors. [pdf]
Towards Probabilistic Verification of Machine Unlearning. [pdf] [code]