Targeted adversarial attack method based on dual guidance
Yue SUN, Xinglan ZHANG*
School of Computer Science, Beijing University of Technology, Beijing 100124, China
Abstract: A generative adversarial attack method based on dual guidance from target-class impressions and regularized adversarial examples was proposed to enhance the transferability of targeted adversarial examples. Adversarial perturbations of shallow features were generated by leveraging the skip-connection mechanism of the UNet model to improve the attack effectiveness of the adversarial examples. To raise the targeted attack success rate, the generator was guided to produce adversarial perturbations containing target-class features by taking the class impression images and labels of the target classes as input. The Dropout technique was applied to the generated adversarial perturbations during training to reduce the generator's dependence on the surrogate model, thereby improving the generalization performance of the adversarial examples. Experimental results demonstrated that the adversarial examples generated by the proposed method exhibited strong targeted transferability on the MNIST, CIFAR10, and SVHN datasets when attacking classification models such as ResNet18 and DenseNet. The average black-box targeted attack success rate was improved by more than 1.6% over the benchmark attack method MIM, indicating that the adversarial examples generated by the proposed method can evaluate the robustness of deep models more effectively.
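The training step summarized above can be illustrated with a minimal PyTorch-style sketch. The generator interface (a UNet taking the clean image concatenated with the target-class impression image), the perturbation budget eps, and the Dropout probability are illustrative assumptions for exposition, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def dual_guidance_step(generator, surrogate, x, class_impression, target_labels,
                       eps=8 / 255, drop_p=0.3):
    """One illustrative training step of the dual-guidance attack (sketch only).

    generator: UNet-style network; here assumed to take the clean image
               concatenated with the target-class impression image and to
               return a raw perturbation map.
    surrogate: white-box classifier that guides the targeted attack.
    """
    # Guide the generator with the target-class impression image.
    gen_in = torch.cat([x, class_impression], dim=1)
    raw = eps * torch.tanh(generator(gen_in))        # bound the perturbation to [-eps, eps]

    # Dropout on the perturbation itself, applied only during training,
    # reduces the generator's reliance on the surrogate model.
    pert = F.dropout(raw, p=drop_p, training=True)
    pert = torch.clamp(pert, -eps, eps)              # re-clip after Dropout rescaling

    # Build the adversarial example and evaluate the targeted loss:
    # the surrogate's prediction is pushed toward the target labels.
    x_adv = torch.clamp(x + pert, 0.0, 1.0)
    loss = F.cross_entropy(surrogate(x_adv), target_labels)
    return loss, x_adv
```

In such a setup the loss would be minimized with respect to the generator's parameters only, while the surrogate classifier stays frozen; at inference time the Dropout on the perturbation is disabled.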
Received: 20 February 2025
Published: 15 December 2025
Fund: National Natural Science Foundation of China (62202017).
Corresponding author:
Xinglan ZHANG
E-mail: sunyues2022@emails.bjut.edu.cn; zhangxinglan@bjut.edu.cn
Keywords: deep learning; adversarial attack; adversarial example; black-box attack; targeted attack