Review of Artificial Intelligence Adversarial Attack and Defense Technologies
这篇文章是深度学习中对抗攻击和防御的一个综述性文章(2019)。文章首先介绍了攻击在训练阶段和测试阶段的实现方法。然后分别总结了对抗技术在 CV, NLP, 网络安全和在真实世界中的应用。最后还介绍了三类主要的对抗防御方法:修改数据、修改模型、使用辅助工具。另外还提出了一种用于生成对抗性文本样本的算法。
Abstract(摘要)
In recent years, artificial intelligence technologies have been widely used in computer vision, natural language processing, automatic driving, and other fields. However, artificial intelligence systems are vulnerable to adversarial attacks, which limit the applications of artificial intelligence (AI) technologies in key security fields. Therefore, improving the robustness of AI systems against adversarial attacks has played an increasingly important role in the further development of AI. This paper aims to comprehensively summarize the latest research progress on adversarial attack and defense technologies in deep learning. According to the target model’s different stages where the adversarial attack occurred, this paper expounds the adversarial attack methods in the training stage and testing stage respectively. Then, we sort out the applications of adversarial attack technologies in computer vision, natural language processing, cyberspace security, and the physical world. Finally, we describe the existing adversarial defense methods respectively in three main categories, i.e., modifying data, modifying models and using auxiliary tools. Review of Artificial Intelligence Adversarial Attack and Defense Technologies
Adversarial Samples and Adversarial Attack Strategies
对抗样本
对抗样本是论文 [^1][^1] 中首次提出的,指的是添加肉眼不可见的扰动,造成目标网络分错类(对于分类来说)。最典型的来说就是下面这张图: