Journal of Zhejiang University (Engineering Science)  2025, Vol. 59 Issue (7): 1403-1410    DOI: 10.3785/j.issn.1008-973X.2025.07.008
Computer Technology and Control Engineering
Shared feature learning algorithm for style diffusion
Jinchen SHEN3, Rui HUANG3, Che JIANG3, Meng QI2, Jia CUI1,3,*
1. State Key Laboratory of Subtropical Building and Urban Science, South China University of Technology, Guangzhou 510641, China
2. School of Information Science and Engineering, Shandong Normal University, Jinan 250358, China
3. School of Design, South China University of Technology, Guangzhou 510006, China
Abstract:

To address the problem of style diffusion, the dominant style features of an image were learned to improve the accuracy of style classification. Drawing on the transferable nature of style, an asymmetric diamond model was proposed, in which features that can be transferred among samples of the same class were defined as intra-class shared features and used to learn the dominant style of the data. The diamond model was built on the autoencoder structure and consisted of two generation processes. In the first process, data of the same style class were sampled to learn transferable features as the dominant style features (shared features), thereby reducing sub-style interference. In the second process, a reconstruction loss was applied to maintain the continuity of the image's dominant style. A multi-task learning framework optimized shared feature learning and the classification model simultaneously, achieving category feature learning based on the dominant style. Comparative experiments were conducted on five style datasets (two oil painting datasets, one Chinese painting dataset, one architectural dataset, and one fashion dataset). Compared with existing style classification models, the accuracy of the proposed model improved by 2 to 7 percentage points, which validated the effectiveness and advancement of the model.

Key words: style classification    shared feature learning    autoencoder    style features    style transfer
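The multi-task objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of mean squared error and softmax cross-entropy, and the loss weights are all assumptions made for illustration, and the paper's own loss terms are not defined in this excerpt. The sketch only shows how a same-class peer could be sampled, and how a shared-feature term, a self-reconstruction term, and a classification term could be summed into one objective.

```python
import math
import random


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample (numerically stable)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def sample_same_class(index, labels, rng):
    """Pick a different sample with the same style label (hypothetical helper).

    Falls back to the sample itself when the class has no other member.
    """
    peers = [i for i, y in enumerate(labels) if y == labels[index] and i != index]
    return rng.choice(peers) if peers else index


def multitask_loss(x, x_peer, shared_out, recon_out, logits, label,
                   w_shared=1.0, w_recon=1.0, w_cls=1.0):
    """Weighted sum of three objectives (weights are assumed, not from the paper).

    shared_out: output of the first generation process, compared with a
                same-class peer so that transferable features are rewarded.
    recon_out:  output of the second generation process, compared with x.
    logits:     classifier scores for x.
    """
    loss_shared = mse(shared_out, x_peer)
    loss_recon = mse(recon_out, x)
    loss_cls = cross_entropy(logits, label)
    return w_shared * loss_shared + w_recon * loss_recon + w_cls * loss_cls


# Usage: sample 0 has exactly one same-class peer, so the peer index is 1.
labels = [0, 0, 1, 1]
peer = sample_same_class(0, labels, random.Random(0))
```

In a real model, `shared_out`, `recon_out`, and `logits` would come from the two decoders and the classifier head; here they are plain lists so that the loss composition can be read in isolation.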
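Received: 2024-06-25    Published: 2025-07-25
CLC:  TP 391.4
Funding: Open Project of the State Key Laboratory of CAD&CG, Zhejiang University (A2416); Guangzhou Philosophy and Social Sciences Development "14th Five-Year Plan" Project (2024GZGJ17); Fundamental Research Funds for the Central Universities (2022ZYGXZR020); Joint Fund of the Natural Science Foundation of Shandong Province (ZR2021LZL011).
Corresponding author: Jia CUI     E-mail: 202221055779@mail.scut.edu.cn;cuijia1247@scut.edu.cn
About the author: Jinchen SHEN (born 1999), male, master's student, working on image style classification. orcid.org/0009-0007-4789-9585. E-mail: 202221055779@mail.scut.edu.cn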

Cite this article:

Jinchen SHEN, Rui HUANG, Che JIANG, Meng QI, Jia CUI. Shared feature learning algorithm for style diffusion[J]. Journal of Zhejiang University (Engineering Science), 2025, 59(7): 1403-1410.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2025.07.008
https://www.zjujournals.com/eng/CN/Y2025/V59/I7/1403

Fig. 1  Shared feature learning framework
Fig. 2  Network structure of the diamond model
Fig. 3  Example images from mainstream datasets
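| Dataset | Domain | Total images | Style labels |
| --- | --- | --- | --- |
| Painting91 (P91) | Oil painting | 2 338 | 13 |
| Pandora (Pan) | Oil painting | 7 726 | 12 |
| Chinese Painting (CP) | Chinese painting | 5 110 | 5 |
| Arch | Architecture | 10 113 | 25 |
| Fashion Style14 (FS) | Fashion | 13 126 | 14 |

Table 1  Parameters of mainstream style datasets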
Fig. 4  Accuracy comparison of distance metrics
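| Module | ACC/% (Painting91) | ACC/% (Pandora) |
| --- | --- | --- |
| E | 58.35 | 46.64 |
| E-D | 58.93 (+0.58) | 47.17 (+0.53) |
| ${T_{\text{f}}}+{T_{\text{f}}}$ | 63.07 (+4.14) | 51.39 (+4.75) |
| This study | 67.78 (+9.43) | 56.68 (+10.04) |

Table 2  Module ablation experiment of the diamond model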
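| ${L_{\text{f}}}$, ${L_{\text{b}}}$, ${L_{\text{s}}}$, ${L_{\text{t}}}$ | ACC/% (Painting91) | ACC/% (Pandora) |
| --- | --- | --- |
| × × × | 57.22 | 50.88 |
| × × × | 57.01 | 51.46 |
| × × × | 61.59 | 54.94 |
| × × × | 64.22 | 55.35 |

Table 3  Loss function ablation experiment of the diamond model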
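| Model | P91 ACC/% | Pan ACC/% | CP ACC/% | Arch ACC/% | FS ACC/% |
| --- | --- | --- | --- | --- | --- |
| VGG16 | 58.42 | 49.73 | 52.67 | 61.41 | 68.22 |
| VGG19 | 58.11 | 46.44 | 52.77 | 60.11 | 66.14 |
| ResNet50 | 64.93 | 51.65 | 57.03 | 65.12 | 71.13 |
| ResNet101 | 65.50 | 52.61 | 56.53 | 66.42 | 70.00 |
| InceptionV3 | 53.41 | 42.83 | 55.88 | 61.52 | 62.70 |
| DAE | 58.82 | 48.71 | 52.66 | 58.55 | 61.48 |
| SCAE | 63.65 | 48.64 | 55.16 | 59.61 | 74.33 |
| SSCAE | 64.07 | 49.38 | 55.68 | 60.48 | 75.02 |
| MCCFNet | 66.60 | 51.39 | 59.10 | 66.12 | 68.38 |
| STSACLF | 60.41 | 55.80 | 58.55 | 60.81 | 64.47 |
| This study 1 | 67.39 | 56.67 | 55.27 | 65.57 | 71.67 |
| This study 2 | 69.12 | 56.98 | 59.41 | 69.03 | 77.17 |

DDS (listed between SSCAE and MCCFNet) reports values for only three of the five datasets: 62.21, 52.35, 53.13.

Table 4  Performance comparison of different style classification models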