Adversarial Robustness for Machine Learning

1st Edition - August 20, 2022

  • Authors: Pin-Yu Chen, Cho-Jui Hsieh
  • Paperback ISBN: 9780128240205
  • eBook ISBN: 9780128242575

Description

While machine learning (ML) algorithms have achieved remarkable performance in many applications, recent studies have demonstrated their lack of robustness against adversarial perturbations. This lack of robustness raises security concerns when ML models are deployed in real-world applications such as self-driving cars, robotic control, and healthcare systems. Adversarial Robustness for Machine Learning summarizes recent progress on this topic and introduces popular algorithms for adversarial attack, defense, and verification. Sections cover adversarial attack, verification, and defense, focusing mainly on image classification, the standard benchmark considered in the adversarial robustness community. Other sections discuss adversarial examples beyond image classification, threat models beyond test-time attacks, and applications of adversarial robustness. For researchers, this book provides a thorough literature review of the latest progress in the area and a useful reference for conducting future research. The book can also serve as a textbook for graduate courses on adversarial robustness or trustworthy machine learning.
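
To give a concrete flavor of the attack formulations surveyed in Part 2, the following is a minimal sketch of the fast gradient sign method (FGSM) listed in Section 2.3. This is an illustration written for this page rather than code from the book; it assumes a PyTorch classifier that returns logits, and the model, inputs, and perturbation budget epsilon are placeholder assumptions.

    # Illustrative sketch of FGSM (see Section 2.3); not code from the book.
    import torch
    import torch.nn.functional as F

    def fgsm_attack(model, x, y, epsilon):
        """Return an adversarial copy of x inside an l-infinity ball of radius epsilon."""
        x_adv = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)  # the attacker maximizes this loss
        loss.backward()
        # One steepest-ascent step under the l-infinity constraint: step by the gradient's sign.
        x_adv = x_adv + epsilon * x_adv.grad.sign()
        return x_adv.clamp(0.0, 1.0).detach()  # keep inputs in the valid [0, 1] pixel range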

Key Features

  • Summarizes the whole field of adversarial robustness for machine learning models
  • Provides a clearly explained, self-contained reference
  • Introduces formulations, algorithms and intuitions
  • Includes applications based on adversarial robustness

Readership

Computer scientists and engineers at universities; R&D engineers in industry

Table of Contents

  • Cover image
  • Title page
  • Table of Contents
  • Copyright
  • Dedication
  • Biography
    • Dr. Pin-Yu Chen (1986–present)
    • Dr. Cho-Jui Hsieh (1985–present)
  • Preface
  • Part 1: Preliminaries
    • Chapter 1: Background and motivation
      • Abstract
      • 1.1. What is adversarial machine learning?
      • 1.2. Mathematical notations
      • 1.3. Machine learning basics
      • 1.4. Motivating examples
      • 1.5. Practical examples of AI vulnerabilities
      • 1.6. Open-source Python libraries for adversarial robustness
      • References
  • Part 2: Adversarial attack
    • Chapter 2: White-box adversarial attacks
      • Abstract
      • 2.1. Attack procedure and notations
      • 2.2. Formulating attack as constrained optimization
      • 2.3. Steepest descent, FGSM and PGD attack
      • 2.4. Transforming to an unconstrained optimization problem
      • 2.5. Another way to define attack objective
      • 2.6. Attacks with different ℓp norms
      • 2.7. Universal attack
      • 2.8. Adaptive white-box attack
      • 2.9. Empirical comparison
      • 2.10. Extended reading
      • References
    • Chapter 3: Black-box adversarial attacks
      • Abstract
      • 3.1. Evasion attack taxonomy
      • 3.2. Soft-label black-box attack
      • 3.3. Hard-label black-box attack
      • 3.4. Transfer attack
      • 3.5. Attack dimension reduction
      • 3.6. Empirical comparisons
      • 3.7. Proof of Theorem 1
      • 3.8. Extended reading
      • References
    • Chapter 4: Physical adversarial attacks
      • Abstract
      • 4.1. Physical adversarial attack formulation
      • 4.2. Examples of physical adversarial attacks
      • 4.3. Empirical comparison
      • 4.4. Extended reading
      • References
    • Chapter 5: Training-time adversarial attacks
      • Abstract
      • 5.1. Poisoning attack
      • 5.2. Backdoor attack
      • 5.3. Empirical comparison
      • 5.4. Case study: distributed backdoor attacks on federated learning
      • 5.5. Extended reading
      • References
    • Chapter 6: Adversarial attacks beyond image classification
      • Abstract
      • 6.1. Data modality and task objectives
      • 6.2. Audio adversarial example
      • 6.3. Feature identification
      • 6.4. Graph neural network
      • 6.5. Natural language processing
      • 6.6. Deep reinforcement learning
      • 6.7. Image captioning
      • 6.8. Weight perturbation
      • 6.9. Extended reading
      • References
  • Part 3: Robustness verification
    • Chapter 7: Overview of neural network verification
      • Abstract
      • 7.1. Robustness verification versus adversarial attack
      • 7.2. Formulations of robustness verification
      • 7.3. Applications of neural network verification
      • 7.4. Extended reading
      • References
    • Chapter 8: Incomplete neural network verification
      • Abstract
      • 8.1. A convex relaxation framework
      • 8.2. Linear bound propagation methods
      • 8.3. Convex relaxation in the dual space
      • 8.4. Recent progress in linear relaxation-based methods
      • 8.5. Extended reading
      • References
    • Chapter 9: Complete neural network verification
      • Abstract
      • 9.1. Mixed integer programming
      • 9.2. Branch and bound
      • 9.3. Branch-and-bound with linear bound propagation
      • 9.4. Empirical comparison
      • References
    • Chapter 10: Verification against semantic perturbations
      • Abstract
      • 10.1. Semantic adversarial example
      • 10.2. Semantic perturbation layer
      • 10.3. Input space refinement for Semantify-NN
      • 10.4. Empirical comparison
      • References
  • Part 4: Adversarial defense
    • Chapter 11: Overview of adversarial defense
      • Abstract
      • 11.1. Empirical defense versus certified defense
      • 11.2. Overview of empirical defenses
      • References
    • Chapter 12: Adversarial training
      • Abstract
      • 12.1. Formulating adversarial training as bilevel optimization
      • 12.2. Faster adversarial training
      • 12.3. Improvements on adversarial training
      • 12.4. Extended reading
      • References
    • Chapter 13: Randomization-based defense
      • Abstract
      • 13.1. Earlier attempts and the EoT attack
      • 13.2. Adding randomness to each layer
      • 13.3. Certified defense with randomized smoothing
      • 13.4. Extended reading
      • References
    • Chapter 14: Certified robustness training
      • Abstract
      • 14.1. A framework for certified robust training
      • 14.2. Existing algorithms and their performance
      • 14.3. Empirical comparison
      • 14.4. Extended reading
      • References
    • Chapter 15: Adversary detection
      • Abstract
      • 15.1. Detecting adversarial inputs
      • 15.2. Detecting adversarial audio inputs
      • 15.3. Detecting Trojan models
      • 15.4. Extended reading
      • References
    • Chapter 16: Adversarial robustness beyond neural network models
      • Abstract
      • 16.1. Evaluating the robustness of K-nearest-neighbor models
      • 16.2. Defenses with nearest-neighbor classifiers
      • 16.3. Evaluating the robustness of decision tree ensembles
      • References
    • Chapter 17: Adversarial robustness in meta-learning and contrastive learning
      • Abstract
      • 17.1. Fast adversarial robustness adaptation in model-agnostic meta-learning
      • 17.2. Adversarial robustness preservation for contrastive learning: from pretraining to finetuning
      • References
  • Part 5: Applications beyond attack and defense
    • Chapter 18: Model reprogramming
      • Abstract
      • 18.1. Reprogramming voice models for time series classification
      • 18.2. Reprogramming general image models for medical image classification
      • 18.3. Theoretical justification of model reprogramming
      • 18.4. Proofs
      • 18.5. Extended reading
      • References
    • Chapter 19: Contrastive explanations
      • Abstract
      • 19.1. Contrastive explanations method
      • 19.2. Contrastive explanations with monotonic attribute functions
      • 19.3. Empirical comparison
      • 19.4. Extended reading
      • References
    • Chapter 20: Model watermarking and fingerprinting
      • Abstract
      • 20.1. Model watermarking
      • 20.2. Model fingerprinting
      • 20.3. Empirical comparison
      • 20.4. Extended reading
      • References
    • Chapter 21: Data augmentation for unsupervised machine learning
      • Abstract
      • 21.1. Adversarial examples for unsupervised machine learning models
      • 21.2. Empirical comparison
      • References
  • Index

Product details

  • No. of pages: 298
  • Language: English
  • Copyright: © Academic Press 2022
  • Published: August 20, 2022
  • Imprint: Academic Press
  • Paperback ISBN: 9780128240205
  • eBook ISBN: 9780128242575

About the Authors

Pin-Yu Chen

Dr. Pin-Yu Chen is a research staff member at the IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA. He is also the chief scientist of the RPI-IBM AI Research Collaboration and PI of ongoing MIT-IBM Watson AI Lab projects. Dr. Chen received his Ph.D. in electrical engineering and computer science from the University of Michigan, Ann Arbor, USA, in 2016. His recent research focuses on adversarial machine learning and the robustness of neural networks, and his long-term research vision is building trustworthy machine learning systems. At IBM Research, he was named an IBM Master Inventor and received several research accomplishment awards, including an IBM Corporate Technical Award in 2021. His research contributes to IBM open-source libraries including the Adversarial Robustness Toolbox (ART 360) and AI Explainability 360 (AIX 360). He has published more than 40 papers on trustworthy machine learning at major AI and machine learning conferences, given tutorials at AAAI’22, IJCAI’21, CVPR (’20, ’21), ECCV’20, ICASSP’20, KDD’19, and Big Data’18, and organized several workshops on adversarial machine learning. He received a NeurIPS 2017 Best Reviewer Award and the IEEE GLOBECOM 2010 GOLD Best Paper Award.

Affiliations and Expertise

Research Staff Member, IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA

Cho-Jui Hsieh

Dr. Cho-Jui Hsieh is an Assistant Professor in the UCLA Computer Science Department. His research focuses on developing algorithms and optimization techniques for training large-scale and robust machine learning models. He publishes in top-tier machine learning conferences including ICML, NIPS, KDD, and ICLR; he has won best paper awards at KDD 2010, ICDM 2012, and ICPP 2018, and was a best paper finalist at AISec 2017 and a best student paper finalist at SC 2019. He is also the author of several widely used open-source machine learning software packages, including LIBLINEAR. His work has been cited more than 13,000 times on Google Scholar.

Affiliations and Expertise

Assistant Professor, UCLA Computer Science Department, USA
