Journal of ZheJiang University (Engineering Science)  2026, Vol. 60 Issue (1): 81-89    DOI: 10.3785/j.issn.1008-973X.2026.01.008
    
Targeted adversarial attack method based on dual guidance
Yue SUN, Xinglan ZHANG*
School of Computer Science, Beijing University of Technology, Beijing 100124, China

Abstract  

A generative adversarial attack method, dually guided by target-class impressions and regularized adversarial examples, was proposed to enhance the transferability of targeted adversarial examples. Adversarial perturbations of shallow features were generated by exploiting the skip-connection mechanism of the UNet model, improving the attack effectiveness of the adversarial examples. To raise the targeted attack success rate, the class-impression images and labels of the target classes were used as input, guiding the generator to produce adversarial perturbations containing target-class features. Dropout was applied to the generated perturbations during training to reduce the generator's dependence on the surrogate model, thereby improving the generalization of the adversarial examples. Experimental results demonstrated that, on the MNIST, CIFAR10, and SVHN datasets, the adversarial examples generated by the proposed method exhibited significant targeted transferability when attacking classification models such as ResNet18 and DenseNet. The average black-box targeted attack success rate exceeded that of the benchmark attack method MIM by more than 1.6 percentage points, indicating that the generated adversarial examples can evaluate the robustness of deep models more effectively.
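The training-phase regularization described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: `perturbation_dropout` mimics applying Dropout to a generator's output perturbation, and `apply_perturbation` shows the usual L∞ projection and pixel-range clipping; the function names, the dropout probability, and the budget `eps = 8/255` are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def perturbation_dropout(delta, p, training=True):
    """Randomly zero elements of a generated perturbation (training only).

    Inverted dropout: surviving entries are scaled by 1/(1-p) so the
    expected perturbation magnitude is unchanged at inference time.
    """
    if not training or p == 0.0:
        return delta
    mask = (rng.random(delta.shape) >= p).astype(delta.dtype)
    return delta * mask / (1.0 - p)

def apply_perturbation(x, delta, eps):
    """Project the perturbation onto the L-inf ball of radius eps and
    clip the adversarial example back to the valid pixel range [0, 1]."""
    delta = np.clip(delta, -eps, eps)
    return np.clip(x + delta, 0.0, 1.0)

# Toy example: a 4x4 "image" and a raw generator output.
x = rng.random((4, 4)).astype(np.float32)
raw_delta = rng.normal(scale=0.1, size=(4, 4)).astype(np.float32)

delta = perturbation_dropout(raw_delta, p=0.3)   # training-phase regularization
x_adv = apply_perturbation(x, delta, eps=8 / 255)
```

Because random elements of the perturbation are dropped each step, the generator cannot rely on any single surrogate-specific feature of the perturbation, which is the intuition behind the improved black-box generalization.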



Key words: deep learning; adversarial attack; adversarial example; black-box attack; targeted attack
Received: 20 February 2025      Published: 15 December 2025
CLC:  TP 391  
Fund: National Natural Science Foundation of China (62202017).
Corresponding Authors: Xinglan ZHANG     E-mail: sunyues2022@emails.bjut.edu.cn;zhangxinglan@bjut.edu.cn
Cite this article:

Yue SUN,Xinglan ZHANG. Targeted adversarial attack method based on dual guidance. Journal of ZheJiang University (Engineering Science), 2026, 60(1): 81-89.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2026.01.008     OR     https://www.zjujournals.com/eng/Y2026/V60/I1/81


Fig.1 Examples of class impression maps generated using VGG16 or LeNet models
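Class impressions of the kind shown in Fig.1 (following the data-free UAP approach of Mopuri et al. [27]) are typically synthesized by gradient ascent on the input: starting from noise, the input is updated to maximize the classifier's pre-softmax score for the target class. A minimal NumPy sketch, with a toy linear classifier standing in for VGG16/LeNet; all names and hyperparameters here are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for a trained classifier: logits = W @ x + b.
# (In the paper the model would be e.g. VGG16 or LeNet.)
W = rng.normal(size=(10, 64))
b = np.zeros(10)

def target_logit(x, t):
    return (W @ x + b)[t]

def class_impression(t, steps=100, lr=0.05):
    """Gradient-ascend a random input so the target-class logit grows.
    For this linear toy model, d(logit_t)/dx = W[t] exactly."""
    x = rng.normal(scale=0.1, size=64)
    for _ in range(steps):
        x = x + lr * W[t]          # ascend the target-class logit
        x = np.clip(x, -1.0, 1.0)  # keep the impression in a valid range
    return x

t = 3
x0 = rng.normal(scale=0.1, size=64)
imp = class_impression(t)
```

For a deep network the gradient would come from backpropagation rather than a closed form, but the loop is the same: the resulting input is an "impression" the model strongly associates with class `t`, which the proposed method feeds to the generator as guidance.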
Fig.2 Structure diagram of adversarial sample generator
Fig.3 Overall framework of proposed method
| Classification model (ACC/%) | MNIST | CIFAR10 | CIFAR10+AT | SVHN |
|------------------------------|-------|---------|------------|------|
| ResNet18   | 99.57 | 86.67 | 84.11 | 94.22 |
| VGG16      | —     | 87.87 | 79.51 | 93.71 |
| DenseNet   | —     | 90.09 | 83.94 | 94.93 |
| WideResNet | —     | 91.89 | 88.25 | 95.23 |
| Inv3       | —     | 90.30 | 84.50 | 94.43 |
| LeNet      | 99.20 | —     | —     | —    |
| AlexNet    | 99.28 | —     | —     | —    |
Tab.1 Classification accuracy of different classification models on original datasets
Fig.4 Targeted ASRs of various attack methods on MNIST dataset
| Attack method (tASR/%) | VGG16 | ResNet18 | Inv3 | DenseNet | WideResNet | Average |
|------------------------|-------|----------|------|----------|------------|---------|
| MIM             | 100.00 | 83.45 | 93.64 | 93.97 | 94.41 | 91.37 |
| Auto-PGD        | 100.00 | 73.68 | 86.68 | 87.33 | 87.74 | 83.86 |
| DIM             | 99.64  | 78.22 | 83.72 | 84.27 | 84.95 | 82.79 |
| TIM             | 83.66  | 29.71 | 26.82 | 26.20 | 29.08 | 27.95 |
| SIM             | 99.53  | 73.47 | 81.58 | 81.74 | 81.25 | 79.51 |
| VMI-FGSM        | 99.92  | 82.64 | 90.30 | 91.32 | 90.72 | 88.75 |
| DO-M-DI2        | 95.09  | 55.12 | 67.46 | 68.25 | 72.12 | 65.74 |
| DeCowA          | 98.93  | 68.76 | 71.40 | 74.86 | 77.08 | 73.03 |
| IDAA            | 99.04  | 86.77 | 90.75 | 89.78 | 91.74 | 89.76 |
| LOGIT           | 99.75  | 74.49 | 78.11 | 80.28 | 85.25 | 79.53 |
| POTRIP          | 93.22  | 51.32 | 53.58 | 54.17 | 55.64 | 53.68 |
| AdvGAN          | 97.08  | 39.68 | 61.08 | 70.28 | 65.34 | 59.10 |
| Proposed method | 98.31  | 93.52 | 96.01 | 97.09 | 97.66 | 96.07 |
Tab.2 Targeted ASRs of various attack methods on CIFAR10 dataset
| Attack method (tASR/%) | VGG16 | ResNet18 | Inv3 | DenseNet | WideResNet | Average |
|------------------------|-------|----------|------|----------|------------|---------|
| MIM             | 99.91  | 97.34 | 94.78 | 95.75 | 96.28 | 96.04 |
| Auto-PGD        | 100.00 | 91.12 | 83.72 | 86.93 | 87.54 | 87.33 |
| DIM             | 99.81  | 91.74 | 86.14 | 88.62 | 89.06 | 88.89 |
| TIM             | 99.18  | 83.64 | 76.82 | 79.74 | 80.32 | 80.13 |
| SIM             | 99.91  | 94.99 | 92.06 | 93.33 | 93.36 | 93.44 |
| VMI-FGSM        | 99.98  | 95.51 | 93.32 | 94.17 | 94.45 | 94.36 |
| DO-M-DI2        | 99.86  | 92.82 | 87.34 | 89.78 | 90.74 | 90.17 |
| DeCowA          | 99.16  | 89.77 | 83.86 | 86.13 | 86.68 | 86.61 |
| IDAA            | 87.97  | 83.52 | 80.78 | 80.39 | 81.05 | 81.44 |
| LOGIT           | 99.97  | 96.80 | 93.08 | 94.60 | 95.49 | 94.99 |
| POTRIP          | 99.69  | 92.19 | 88.56 | 88.47 | 89.52 | 89.69 |
| AdvGAN          | 97.59  | 97.00 | 94.75 | 95.30 | 95.91 | 95.74 |
| Proposed method | 97.97  | 97.79 | 97.07 | 97.78 | 97.69 | 97.58 |
Tab.3 Targeted ASRs of various attack methods on SVHN dataset
| Attack method (tASR/%) | VGG16 | ResNet18 | Inv3 | DenseNet | WideResNet |
|------------------------|-------|----------|------|----------|------------|
| MIM             | 16.93 | 19.66 | 31.03 | 22.91 | 22.06 |
| Auto-PGD        | 17.48 | 28.07 | 57.59 | 44.11 | 39.26 |
| DIM             | 35.17 | 57.97 | 69.99 | 64.16 | 67.42 |
| TIM             | 35.43 | 32.07 | 24.39 | 30.13 | 33.63 |
| SIM             | 25.09 | 40.36 | 61.15 | 52.05 | 52.96 |
| VMI-FGSM        | 25.83 | 36.55 | 58.91 | 47.09 | 46.63 |
| DO-M-DI2        | 11.02 | 26.69 | 37.04 | 30.87 | 41.12 |
| DeCowA          | 15.41 | 24.96 | 28.45 | 25.51 | 30.16 |
| IDAA            | 38.77 | 48.95 | 66.79 | 59.91 | 49.07 |
| LOGIT           | 39.55 | 50.10 | 59.29 | 51.38 | 55.14 |
| POTRIP          | 29.08 | 37.28 | 48.22 | 42.89 | 40.67 |
| AdvGAN          | 4.09  | 31.48 | 35.95 | 35.13 | 41.44 |
| Proposed method | 28.42 | 72.73 | 83.99 | 75.54 | 86.51 |
Tab.4 Targeted ASRs of various attack methods on CIFAR10 dataset when attacking adversarial training models
Fig.5 Targeted ASRs of various attack methods on CIFAR10 dataset when attacking defensive models
| Dataset | Attack method (tASR/%) | ResNet18 | Inv3 | DenseNet | WideResNet | Average |
|---------|------------------------|----------|------|----------|------------|---------|
| CIFAR10 | AdvGAN          | 39.68 | 61.08 | 70.28 | 65.34 | 59.10 |
|         | AdvGAN+U        | 93.00 | 95.80 | 96.19 | 97.03 | 95.51 |
|         | Proposed method | 93.52 | 96.01 | 97.09 | 97.66 | 96.07 |
| SVHN    | AdvGAN          | 97.00 | 94.75 | 95.30 | 95.91 | 95.74 |
|         | AdvGAN+U        | 97.11 | 96.18 | 97.04 | 97.15 | 96.87 |
|         | Proposed method | 97.79 | 97.07 | 97.78 | 97.69 | 97.58 |
Tab.5 Targeted ASRs of various generative attack methods on CIFAR10 and SVHN datasets
Fig.6 Targeted ASRs of proposed method with different dropout probabilities
Fig.7 Targeted ASRs of proposed method with different dropout probabilities when attacking adversarially trained models
[1]   HOU Xiaohu, JIA Xiaofen, ZHAO Baiting. Lightweight recognition algorithm for OCT images of fundus lesions [J]. Journal of Zhejiang University: Engineering Science, 2023, 57(12): 2448-2455.
[2]   ZHAO Y, LV W, XU S, et al. DETRs beat YOLOs on real-time object detection [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2024: 16965–16974.
[3]   HU Y, YANG J, CHEN L, et al. Planning-oriented autonomous driving [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023: 17853–17862.
[4]   ZHENG Cheng, CHEN Xueling. Aspect semantic enhanced fusion network for aspect-based sentiment analysis [J/OL]. Journal of Chinese Computer Systems, 2024: 1-12. (2024-09-24). https://kns.cnki.net/kcms/detail/detail.aspx?filename=XXWX2024092000Q&dbname=CJFD&dbcode=CJFQ.
[5]   QIN Zhen, ZHUANG Tianming, ZHU Guosong, et al. Survey of security attack and defense strategies for artificial intelligence model [J]. Journal of Computer Research and Development, 2024, 61(10): 2627-2648. doi: 10.7544/issn1000-1239.202440449.
[6]   SZEGEDY C, ZAREMBA W, SUTSKEVER I, et al. Intriguing properties of neural networks [C]// International Conference on Learning Representations. Banff: International Machine Learning Society, 2014: 1-10.
[7]   BAI Y, WANG Y, ZENG Y, et al. Query efficient black-box adversarial attack on deep neural networks [J]. Pattern Recognition, 2023, 133: 109037. doi: 10.1016/j.patcog.2022.109037.
[8]   REN Y, ZHU H, SUI X, et al. Crafting transferable adversarial examples via contaminating the salient feature variance [J]. Information Sciences, 2023, 644: 119273. doi: 10.1016/j.ins.2023.119273.
[9]   DONG Y, LIAO F, PANG T, et al. Boosting adversarial attacks with momentum [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 9185–9193.
[10]   CROCE F, HEIN M. Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks [C]// Proceedings of the International Conference on Machine Learning. [S.l.]: JMLR.org, 2020: 2206–2216.
[11]   WANG X, HE K. Enhancing the transferability of adversarial attacks through variance tuning [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 1924–1933.
[12]   PENG A, LIN Z, ZENG H, et al. Boosting transferability of adversarial example via an enhanced Euler’s method [C]// IEEE International Conference on Acoustics, Speech and Signal Processing. Rhodes Island: IEEE, 2023: 1–5.
[13]   WANG J, CHEN Z, JIANG K, et al. Boosting the transferability of adversarial attacks with global momentum initialization [J]. Expert Systems with Applications, 2024, 255: 124757. doi: 10.1016/j.eswa.2024.124757.
[14]   DONG Y, PANG T, SU H, et al. Evading defenses to transferable adversarial examples by translation-invariant attacks [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 4307–4316.
[15]   WU L, ZHAO L, PU B, et al. Boosting the transferability of adversarial examples via adaptive attention and gradient purification methods [C]// Proceedings of the International Joint Conference on Neural Networks. Yokohama: IEEE, 2024: 1–7.
[16]   ZHU Z, CHEN H, WANG X, et al. GE-AdvGAN: improving the transferability of adversarial samples by gradient editing-based adversarial generative model [C]// Proceedings of the 2024 SIAM International Conference on Data Mining. Houston: SIAM, 2024: 706–714.
[17]   QIAN Y, HE S, ZHAO C, et al. LEA2: a lightweight ensemble adversarial attack via non-overlapping vulnerable frequency regions [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Paris: IEEE, 2023: 4487–4498.
[18]   LI M, DENG C, LI T, et al. Towards transferable targeted attack [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 638−646.
[19]   CHEN B, YIN J, CHEN S, et al. An adaptive model ensemble adversarial attack for boosting adversarial transferability [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Paris: IEEE, 2023: 4466–4475.
[20]   LIN J, SONG C, HE K, et al. Nesterov accelerated gradient and scale invariance for improving transferability of adversarial examples [C]// International Conference on Learning Representations. [S.l.]: International Machine Learning Society, 2020: 1–12.
[21]   XIE C, ZHANG Z, ZHOU Y, et al. Improving transferability of adversarial examples with input diversity [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 2725–2734.
[22]   QIAN Y, CHEN K, WANG B, et al. Enhancing transferability of adversarial examples through mixed-frequency inputs [J]. IEEE Transactions on Information Forensics and Security, 2024, 19: 7633-7645. doi: 10.1109/TIFS.2024.3430508.
[23]   ZHAO Z, LIU Z, LARSON M. On success and simplicity: a second look at transferable targeted attacks [C]// Proceedings of the 35th International Conference on Neural Information Processing Systems. [S.l.]: NeurIPS Foundation, 2021: 6115–6128.
[24]   WU H, OU G, WU W, et al. Improving transferable targeted adversarial attacks with model self-enhancement [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2024: 24615–24624.
[25]   YAN Jingkun. Research of adversarial attack method in face recognition [D]. Wuhan: Huazhong University of Science and Technology, 2020.
[26]   XIAO C, LI B, ZHU J, et al. Generating adversarial examples with adversarial networks [C]// Proceedings of the 27th International Joint Conference on Artificial Intelligence. Stockholm: AAAI Press, 2018: 3905–3911.
[27]   MOPURI K R, UPPALA P K, BABU R V. Ask, acquire, and attack: data-free UAP generation using class impressions [C]// Proceedings of the European Conference on Computer Vision. Munich: Springer, 2018: 20–35.
[28]   SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition [C]// International Conference on Learning Representations. San Diego: ICLR, 2015: 1–14.
[29]   KRIZHEVSKY A, HINTON G. Learning multiple layers of features from tiny images [R]. Toronto: University of Toronto, 2009.
[30]   NETZER Y, WANG T, COATES A, et al. Reading digits in natural images with unsupervised feature learning [C]// Proceedings of the NIPS Workshop on Deep Learning and Unsupervised Feature Learning. Granada: NeurIPS Foundation, 2011: 1–9.
[31]   LECUN Y, CORTES C, BURGES C J C. The MNIST database of handwritten digits [J]. Neural Computation, 1998, 10(5): 1191-1232.
[32]   HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 770–778.
[33]   HUANG G, LIU Z, VAN DER MAATEN L, et al. Densely connected convolutional networks [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 2261–2269.
[34]   ZAGORUYKO S, KOMODAKIS N. Wide residual networks [C]// Proceedings of the British Machine Vision Conference. York: BMVA Press, 2016: 87.1–87.12.
[35]   SZEGEDY C, VANHOUCKE V, IOFFE S, et al. Rethinking the inception architecture for computer vision [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 2818–2826.
[36]   KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks [C]// Proceedings of the 26th International Conference on Neural Information Processing Systems. Lake Tahoe: NeurIPS Foundation, 2012: 1097–1105.
[37]   WANG X, HE K. Enhancing the transferability of adversarial attacks through variance tuning. [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 1924–1933.
[38]   LIN Q, LUO C, NIU Z, et al. Boosting adversarial transferability across model genus by deformation-constrained warping [C]// Proceedings of the AAAI Conference on Artificial Intelligence. Vancouver: AAAI Press, 2024: 3459–3467.