Journal of ZheJiang University (Engineering Science)  2025, Vol. 59 Issue (12): 2566-2575    DOI: 10.3785/j.issn.1008-973X.2025.12.011
    
Industrial image anomaly detection method based on adversarial learning of abnormal features
Tianfei WANG1, Wenjun ZHOU1,*, Sheng XIANG2, Yuhang HE1, Bo PENG1
1. School of Computer Science and Software Engineering, Southwest Petroleum University, Chengdu 610500, China
2. School of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China

Abstract  

To address the challenges of industrial image anomaly detection, namely the scarcity of anomalous samples, the complexity of annotation, and the high computational cost of deep models, an anomaly detection method named EDA (enhancing anomaly detection via adversarial anomaly learning) was proposed. The method consists of two stages. 1) Anomaly learning and embedding stage: a generative adversarial network (GAN) architecture was employed to learn anomalous features. The number of generator parameters was reduced to keep the network lightweight, and subpixel convolution was introduced to enhance anomalous information. Random regions were then selected from normal images and refined with the SAM (segment anything) model, and anomalous features were generated in the refined regions, providing prior anomalous features and the corresponding masks for the anomaly detection stage. 2) Anomaly detection stage: a Contrast U-Net was introduced and trained in a supervised manner to improve sensitivity to anomalous features and the accuracy of anomaly identification and localization. Experiments on the MVTec dataset showed that the proposed method achieved an image-level AUROC of 98.2%, a pixel-level AUROC of 97.8%, and a pixel-level AU-PR of 81.1%, the highest average values among the compared methods for industrial image anomaly detection and segmentation.
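The subpixel convolution mentioned in the abstract ends with a rearrangement step that turns r*r low-resolution channels into one channel that is r times larger in each direction. The following is a minimal sketch of that rearrangement only, in plain Python, following the channel ordering commonly used for this operation. The function name `pixel_shuffle` and the toy input are illustrative assumptions and are not taken from the EDA generator.

```python
from typing import List

Grid = List[List[float]]  # one H x W feature map


def pixel_shuffle(channels: List[Grid], r: int) -> List[Grid]:
    """Rearrange C*r*r maps of size H x W into C maps of size (H*r) x (W*r)."""
    if r < 1 or not channels or len(channels) % (r * r) != 0:
        raise ValueError("channel count must be a positive multiple of r*r")
    h, w = len(channels[0]), len(channels[0][0])
    out_c = len(channels) // (r * r)
    out = [[[0.0] * (w * r) for _ in range(h * r)] for _ in range(out_c)]
    for c in range(out_c):
        for y in range(h * r):
            for x in range(w * r):
                # The sub-pixel offset (y % r, x % r) selects the source channel;
                # the coarse position (y // r, x // r) selects the source pixel.
                src = c * r * r + (y % r) * r + (x % r)
                out[c][y][x] = channels[src][y // r][x // r]
    return out


# Hypothetical input: four 1 x 2 feature maps become one 2 x 4 map (r = 2).
low_res = [[[1.0, 5.0]], [[2.0, 6.0]], [[3.0, 7.0]], [[4.0, 8.0]]]
high_res = pixel_shuffle(low_res, 2)
```

In a full subpixel convolution layer, an ordinary convolution would first produce the C*r*r channels; only the upsampling rearrangement is shown here.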



Key words: anomaly detection, generative adversarial network, anomaly generation, contrast operator, deep learning
Received: 18 December 2024      Published: 25 November 2025
CLC:  TP 391  
Fund:  Natural Science Foundation of Sichuan Province (2023NSFSC0504).
Corresponding Authors: Wenjun ZHOU     E-mail: tianfeifeiwang@outlook.com;zhouwenjun@swpu.edu.cn
Cite this article:

Tianfei WANG,Wenjun ZHOU,Sheng XIANG,Yuhang HE,Bo PENG. Industrial image anomaly detection method based on adversarial learning of abnormal features. Journal of ZheJiang University (Engineering Science), 2025, 59(12): 2566-2575.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2025.12.011     OR     https://www.zjujournals.com/eng/Y2025/V59/I12/2566


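| Method category | Key technique | Advantages | Disadvantages |
| --- | --- | --- | --- |
| Reconstruction-based | An encoder compresses the input industrial image into a low-dimensional feature vector; a decoder then reconstructs the image from that vector | Requires no defect annotation; generalizes well | Insufficient precision and accuracy; prone to misjudgment |
| Embedding-based | Maps the image into a low-dimensional feature space and computes the distance between its feature embedding and the embeddings of normal images | Extracts deep image features effectively; distances in feature space quantify the degree of anomaly, which eases analysis and decision-making | Insufficient or biased training data impairs the model's generalization ability |
| Knowledge-distillation-based | Transfers the knowledge of a teacher model to a lightweight student model | Lowers computational cost; low memory requirements | Distillation quality depends on hyperparameters; lacks generalization ability |
| Generative-model-based | Through adversarial training of a generator and a discriminator, the generator gradually learns the distribution of normal images | Generates samples that resemble real images; usable for data augmentation | Training is unstable; prone to vanishing or exploding gradients |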
Tab.1 Comparison of commonly used methods for industrial anomaly detection
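As an illustration of the embedding-based family in Tab.1, the sketch below scores a test embedding by its Euclidean distance to the nearest embedding of a normal image. It is a generic example under stated assumptions, not the method of any specific paper compared here; the feature extractor is omitted, and `anomaly_score` and the two-dimensional toy embeddings are hypothetical.

```python
import math
from typing import Sequence


def anomaly_score(embedding: Sequence[float],
                  normal_bank: Sequence[Sequence[float]]) -> float:
    """Distance from a test embedding to the closest normal embedding."""
    if not normal_bank:
        raise ValueError("normal_bank must contain at least one embedding")
    return min(math.dist(embedding, ref) for ref in normal_bank)


# Hypothetical 2-D embeddings of normal images.
bank = [(0.0, 0.0), (1.0, 0.0)]
score_normal = anomaly_score((1.0, 0.0), bank)   # identical to a stored embedding
score_anomal = anomaly_score((4.0, 4.0), bank)   # far from every stored embedding
```

A larger score means the sample lies farther from everything seen during training, which is why biased or scarce normal data directly degrades this kind of detector.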
Fig.1 Overall structure of EDA
Fig.2 Illustration of anomaly learning and embedding
Fig.3 Structure of Bottleneck block
Fig.4 Schematic of mask region generation
Fig.5 Illustration of anomaly detection module
Fig.6 Network structure of anomaly detection module
Fig.7 Examples from MVTec dataset
| Category | US | AE-SSIM | RIAD | PaDim | CutPaste | CLGAN | MB-PFM | ATSNM | Proposed method |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bottle | 99.0/97.8 1) | 88.0/93.0 | 99.9/98.4 | 99.9/98.3 | 98.2/97.6 | 97.6/92.6 | 100.0/98.4 | 100.0/98.3 | 96.8/98.5 |
| capsule | 86.1/96.8 | 61.0/94.0 | 88.4/92.8 | 91.3/98.5 | 98.2/97.4 | 98.2/98.4 | 94.5/94.3 | 93.7/98.5 | 96.2/97.3 |
| grid | 81.0/89.9 | 69.0/94.0 | 99.6/98.8 | 96.7/97.3 | 100.0/97.5 | 99.3/98.7 | 98.0/98.8 | 95.2/98.7 | 100.0/99.6 |
| leather | 88.2/97.8 | 46.0/78.0 | 100.0/99.4 | 100.0/99.2 | 100.0/99.5 | 100.0/99.7 | 100.0/96.4 | 100.0/99.5 | 100.0/99.9 |
| pill | 87.9/96.5 | 60.0/91.0 | 83.8/95.7 | 93.3/95.7 | 94.9/95.7 | 98.1/97.3 | 96.5/95.2 | 93.7/96.5 | 98.2/96.5 |
| tile | 99.1/92.5 | 52.0/59.0 | 98.7/89.1 | 98.1/98.1 | 94.6/90.5 | 96.5/94.1 | 99.6/96.2 | 95.9/97.9 | 100.0/98.6 |
| transistor | 81.8/97.8 | 52.0/90.0 | 90.9/87.7 | 97.4/97.5 | 96.1/93.0 | 96.4/93.3 | 97.8/97.8 | 91.6/87.5 | 95.0/92.1 |
| zipper | 91.9/95.6 | 80.0/88.0 | 98.1/97.8 | 90.3/98.5 | 99.9/99.3 | 99.3/97.8 | 97.4/98.2 | 96.3/98.5 | 100.0/99.1 |
| cable | 86.2/91.9 | 61.0/82.0 | 81.9/84.2 | 92.7/96.7 | 81.2/90.0 | 98.3/95.6 | 98.8/96.7 | 91.3/96.8 | 93.2/96.7 |
| carpet | 91.6/93.5 | 67.0/87.0 | 84.2/96.3 | 99.8/99.1 | 93.9/98.3 | 98.2/97.8 | 100.0/99.2 | 97.8/98.3 | 96.5/99.2 |
| hazelnut | 93.1/98.2 | 54.0/97.0 | 83.3/96.1 | 92.0/98.2 | 98.3/97.3 | 99.0/98.1 | 100.0/99.1 | 99.8/98.4 | 100.0/98.8 |
| metalnut | 82.0/97.2 | 54.0/89.0 | 88.5/92.5 | 98.7/97.2 | 99.9/93.1 | 97.9/96.8 | 100.0/97.2 | 98.6/96.7 | 98.9/97.8 |
| screw | 54.9/97.4 | 51.0/92.0 | 84.5/98.8 | 85.8/98.5 | 88.7/96.7 | 95.2/94.9 | 91.8/97.7 | 92.1/98.9 | 96.6/98.9 |
| toothbrush | 95.3/97.9 | 74.0/96.0 | 100.0/98.9 | 96.1/98.8 | 99.4/98.1 | 98.2/96.6 | 88.6/98.6 | 91.4/98.9 | 100.0/97.9 |
| wood | 97.7/92.1 | 83.0/73.0 | 93.0/85.8 | 99.2/94.9 | 99.1/95.5 | 98.9/96.9 | 99.5/95.6 | 98.8/96.9 | 97.6/94.5 |
| Average | 89.7/95.3 | 63.4/88.0 | 91.3/94.2 | 95.3/97.5 | 96.1/96.0 | 97.5/95.1 | 97.5/97.3 | 95.7/97.4 | 98.2/97.8 |

1) Note: the values before and after the slash are the image-level and pixel-level AUROC results, respectively.
Tab.2 Image- and pixel-level AUROC comparison results for different models
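For readers unfamiliar with the metric in Tab.2, AUROC can be read as the probability that a randomly chosen anomalous item receives a higher anomaly score than a randomly chosen normal item. The sketch below computes it directly from that definition; it is an illustrative implementation with made-up scores, not the evaluation code behind the table. Applied to one score per image it gives the image-level value, and applied to one score per pixel (with the mask as labels) it gives the pixel-level value.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of (anomalous, normal) pairs ranked correctly; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one normal item")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical anomaly scores for four images (label 1 = anomalous).
image_scores = [0.1, 0.4, 0.35, 0.8]
image_labels = [0, 0, 1, 1]
image_auroc = auroc(image_scores, image_labels)
```

The pairwise loop is quadratic in the number of items, so a rank-based formulation would be preferable for the millions of pixels in a real test set.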
Category  US[9]  AE-SSIM  RIAD  PaDim  CutPaste  CLGAN  MB-PFM  ATSNM  Proposed method
bottle  74.2  76.4  73.0  77.9  79.6  76.7  78.7  86.6
capsule  25.9  38.2  33.4  32.3  69.6  46.2  52.7  72.8
grid  10.1  36.4  58.0  42.6  64.9  45.3  45.1  68.5
leather  40.9  49.1  45.2  54.6  75.4  46.8  57.4  76.3
pill  62.0  51.6  60.2  51.8  76.3  78.6  66.4  72.5
tile  65.3  52.6  51.7  67.2  87.6  80.3  89.1  95.6
transistor  27.1  39.2  71.3  70.8  67.7  56.8  70.3  77.5
zipper  36.1  63.4  16.6  68.5  68.7  55.6  72.6  86.3
cable  48.2  24.4  34.3  55.6  73.8  67.7  69.4  75.3
carpet  52.2  61.4  49.7  57.3  87.6  58.3  80.2  90.7
hazelnut  57.8  33.8  37.4  53.7  68.4  60.7  76.3  96.3
metalnut  83.5  64.3  39.4  62.5  67.7  78.1  77.6  75.6
screw  17.8  43.9  51.7  58.6  69.3  52.6  69.4  74.2
toothbrush  37.7  50.6  40.6  46.8  59.6  53.4  54.9  67.4
wood  53.3  38.2  42.3  79.3  77.3  46.7  78.7  76.8
Average  46.14  48.2  47.0  58.6  67.5  61.1  68.92  81.1
Tab.3 Pixel-level AU-PR comparison results for different models
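The page does not state how the area under the precision-recall curve in Tab.3 was integrated. One common choice is average precision, which sums the precision at each rank where a true anomalous pixel is recovered. The sketch below illustrates that choice under the assumption of distinct scores; `average_precision` and the toy data are hypothetical.

```python
from typing import Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Step-wise area under the precision-recall curve (assumes distinct scores)."""
    total_pos = sum(1 for y in labels if y == 1)
    if total_pos == 0:
        raise ValueError("need at least one anomalous item")
    # Visit items from the highest anomaly score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_pos = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            true_pos += 1
            precision_sum += true_pos / rank  # precision at this recall step
    return precision_sum / total_pos


# Hypothetical per-pixel anomaly scores and ground-truth mask values.
pixel_scores = [0.9, 0.8, 0.7, 0.6]
pixel_labels = [1, 0, 1, 0]
pixel_ap = average_precision(pixel_scores, pixel_labels)
```

Because anomalous pixels are a small minority in most test images, this metric penalizes false positives more visibly than AUROC does, which may explain the wider spread between methods in Tab.3 than in Tab.2.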
Fig.8 Qualitative comparison of different models
| Design | Stage 1 | Stage 2 |
| --- | --- | --- |
| Design 1 | | Contrast U-Net |
| Design 2 | | U-Net |
| Design 3 | | Contrast U-Net |
Tab.4 Ablation experiment design
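| Category | Design 1 | Design 2 | Design 3 |
| --- | --- | --- | --- |
| bottle | 98.2/69.7 1) | 96.8/97.2 | 97.2/98.5 |
| capsule | 77.5/63.7 | 94.9/91.0 | 97.0/97.3 |
| grid | 75.6/66.3 | 100.0/99.4 | 100.0/99.6 |
| leather | 83.1/57.7 | 100.0/97.4 | 100.0/99.9 |
| pill | 89.0/65.7 | 96.2/96.4 | 98.2/97.5 |
| tile | 98.5/78.7 | 99.8/99.3 | 100.0/98.6 |
| transistor | 91.4/59.6 | 92.7/88.9 | 96.3/92.1 |
| zipper | 94.3/64.2 | 100.0/98.4 | 100.0/99.1 |
| cable | 54.0/69.4 | 88.3/93.4 | 86.2/96.7 |
| carpet | 52.5/56.1 | 90.6/93.8 | 97.6/99.2 |
| hazelnut | 94.3/64.2 | 99.9/99.6 | 100.0/98.8 |
| metalnut | 89.5/86.5 | 99.0/99.1 | 98.9/97.8 |
| screw | 84.5/54.6 | 97.3/99.5 | 96.6/98.9 |
| toothbrush | 86.6/76.9 | 100.0/97.7 | 100.0/97.9 |
| wood | 91.0/67.9 | 99.6/94.9 | 97.6/94.5 |
| Average | 84.0/66.75 | 97.0/96.4 | 98.2/97.8 |

1) Note: the values before and after the slash are the image-level and pixel-level AUROC results, respectively.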
Tab.5 Image- and pixel-level AUROC results of ablation experiment
| Method | SSIM | PSNR/dB |
| --- | --- | --- |
| Original network | 0.76 | 18.54 |
| Improved network | 0.92 | 24.36 |
Tab.6 Ablation experiment results of GAN
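Of the two image-quality metrics in Tab.6, PSNR has the simpler closed form: ten times the base-10 logarithm of the squared peak value divided by the mean squared error. The sketch below shows the calculation for single-channel images stored as nested lists. The peak value of 255 assumes 8-bit images, which the page does not state, and the toy images are hypothetical.

```python
import math
from typing import List


def psnr(reference: List[List[float]], generated: List[List[float]],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    ref = [v for row in reference for v in row]
    gen = [v for row in generated for v in row]
    if not ref or len(ref) != len(gen):
        raise ValueError("images must be non-empty and have the same size")
    mse = sum((a - b) ** 2 for a, b in zip(ref, gen)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2 x 2 images that differ by 10 grey levels in one pixel.
real_patch = [[0.0, 255.0], [255.0, 0.0]]
fake_patch = [[0.0, 255.0], [255.0, 10.0]]
psnr_db = psnr(real_patch, fake_patch)
```

Higher values mean the generated image is numerically closer to the reference; SSIM complements this by comparing local structure rather than raw pixel error.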
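| Methods | Param/10^6 | MACs/10^9 |
| --- | --- | --- |
| US | 18.62 | 3.44 |
| AE-SSIM | 62.37 | 12.61 |
| RIAD | | |
| PaDim | 68.88 | 11.46 |
| CutPaste | 11.77 | 1.82 |
| CLGAN | | |
| MB-PFM | 22.86 | 3.62 |
| ATSNM | | |
| Proposed method | 0.95 | 2.64 |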
Tab.7 Comparison of time complexity
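The two columns of Tab.7 can be related to network structure through the standard counts for a single 2-D convolution layer: the parameter count depends only on kernel size and channel widths, whereas multiply-accumulate operations (MACs) also scale with the output resolution. The sketch below applies those formulas to a hypothetical layer; the layer sizes are illustrative and are not taken from any network in the table.

```python
from typing import Tuple


def conv2d_cost(c_in: int, c_out: int, kernel: int,
                h_out: int, w_out: int, bias: bool = True) -> Tuple[int, int]:
    """Return (parameters, MACs) of one standard 2-D convolution layer."""
    weights = kernel * kernel * c_in * c_out
    params = weights + (c_out if bias else 0)
    # Every output position of every output channel needs kernel*kernel*c_in MACs.
    macs = weights * h_out * w_out
    return params, macs


# Hypothetical layer: 3 x 3 convolution, 64 -> 64 channels, 32 x 32 output map.
layer_params, layer_macs = conv2d_cost(64, 64, 3, 32, 32)
```

Summing these counts over all layers gives totals on the scale reported in the table. The dependence on output resolution is one reason a model with few parameters can still have a comparatively high MAC count.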
Fig.9 Illustration of similarity of real and simulated abnormal image feature distribution
Fig.10 Qualitative result examples for texture image
[1]   LV Chengkan, SHEN Fei, ZHANG Zhengtao, et al. Review of image anomaly detection[J]. Acta Automatica Sinica, 2022, 48 (6): 1402–1428 (in Chinese)
[2]   LIU J, XIE G, WANG J, et al. Deep industrial image anomaly detection: a survey[J]. Machine Intelligence Research, 2024, 21 (1): 104–135
doi: 10.1007/s11633-023-1459-z
[3]   KIRILLOV A, MINTUN E, RAVI N, et al. Segment anything [C]// IEEE/CVF International Conference on Computer Vision. Paris: IEEE, 2023: 3992–4003.
[4]   HUANG G, LIU Z, VAN DER MAATEN L, et al. Densely connected convolutional networks [C]// IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 2261–2269.
[5]   SHI W, CABALLERO J, HUSZÁR F, et al. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network [C]// IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 1874–1883.
[6]   BERGMANN P, FAUSER M, SATTLEGGER D, et al. MVTec AD: a comprehensive real-world dataset for unsupervised anomaly detection [C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 9584–9592.
[7]   ZHOU W, WANG T, HE Y, et al. Contrast U-Net driven by sufficient texture extraction for carotid plaque detection[J]. Mathematical Biosciences and Engineering, 2023, 20 (9): 15623–15640
doi: 10.3934/mbe.2023697
[8]   HU J, SHEN L, SUN G. Squeeze-and-excitation networks [C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 7132–7141.
[9]   HE Y, XIANG S, ZHOU W, et al. A novel contrast operator for robust object searching [C]// 17th International Conference on Computational Intelligence and Security. Chengdu: IEEE, 2021: 309–313.
[10]   LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection [C]// IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 2999–3007.
[11]   BERMAN M, TRIKI A R, BLASCHKO M B. The lovasz-softmax loss: a tractable surrogate for the optimization of the intersection-over-union measure in neural networks [C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 4413–4421.
[12]   DEFARD T, SETKOV A, LOESCH A, et al. PaDiM: a patch distribution modeling framework for anomaly detection and localization [C]// International Conference on Pattern Recognition. Cham: Springer International Publishing, 2021: 475–489.
[13]   BERGMANN P, FAUSER M, SATTLEGGER D, et al. Uninformed students: student-teacher anomaly detection with discriminative latent embeddings [C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 4182–4191.
[14]   BERGMANN P, LÖWE S, FAUSER M, et al. Improving unsupervised defect segmentation by applying structural similarity to autoencoders.
[15]   ZAVRTANIK V, KRISTAN M, SKOČAJ D. Reconstruction by inpainting for visual anomaly detection[J]. Pattern Recognition, 2021, 112: 107706
doi: 10.1016/j.patcog.2020.107706
[16]   LI C L, SOHN K, YOON J, et al. CutPaste: self-supervised learning for anomaly detection and localization [C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 9659–9669.
[17]   ZHANG Yue, CHEN Xiwei, CHEN Mengdan, et al. Unsupervised surface anomaly detection of industrial products based on contrastive learning generative adversarial network[J]. Journal of Electronic Measurement and Instrumentation, 2023, 37 (10): 193–201 (in Chinese)
[18]   WAN Q, GAO L, LI X, et al. Unsupervised image anomaly detection and segmentation based on pretrained feature mapping[J]. IEEE Transactions on Industrial Informatics, 2023, 19 (3): 2330–2339
doi: 10.1109/TII.2022.3182385
[19]   KONG Senlin, ZHANG Hui, HUANG Zhennan, et al. Asymmetric teacher-student network model for industrial image anomaly detection[J]. Computer Science, 2024, 51 (Suppl.2): 331–337 (in Chinese)