Shared feature learning algorithm for style diffusion

doi:10.3785/j.issn.1008-973X.2025.07.008

Journal of Zhejiang University (Engineering Science)

2025, Vol. 59, Issue 7: 1403-1410

Computer Technology and Control Engineering

Jinchen SHEN³, Rui HUANG³, Che JIANG³, Meng QI², Jia CUI¹,³,*

1. State Key Laboratory of Subtropical Building and Urban Science, South China University of Technology, Guangzhou 510641, China
2. School of Information Science and Engineering, Shandong Normal University, Jinan 250358, China
3. School of Design, South China University of Technology, Guangzhou 510006, China


Abstract:

To address the problem of style diffusion, the dominant style features of an image are learned to improve the accuracy of style classification. Drawing on the transferable nature of style, an asymmetric diamond model is proposed, in which features that can be transferred between samples of the same class are defined as intra-class shared features and used to learn the dominant style of the data. The diamond model is built on the autoencoder structure and consists of two generation processes. The first process samples data of the same style class to learn transferable features as the dominant style features (shared features), which reduces sub-style interference. The second process applies a reconstruction loss to maintain the continuity of the image's dominant style. A multi-task learning framework optimizes shared feature learning and the classification model simultaneously, achieving category feature learning based on the dominant style. Comparative experiments were conducted on five style datasets (two oil painting datasets, one Chinese painting dataset, one architectural dataset, and one fashion dataset). Compared with existing style classification models, the proposed model improves accuracy by 2 to 7 percentage points, validating its effectiveness and its advantage over prior methods.
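The multi-task objective described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: a single shared linear encoder, two linear decoder heads standing in for the two generation processes, mean squared error for both generation losses, softmax cross-entropy for classification, and the hypothetical names `multitask_loss`, `w_transfer`, and `w_recon`. The first head is asked to generate a different sample of the same style class (`peer`), so the encoded feature is pushed toward what the class shares; the second head reconstructs the input itself.

```python
import math
import random


def linear(weights, x):
    """Bias-free dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]


def init_matrix(rows, cols, rng):
    """Small random weights; a stand-in for a trained network layer."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def multitask_loss(params, x, peer, label, w_transfer=1.0, w_recon=1.0):
    """Combine the three terms; the loss weights are hypothetical."""
    shared = linear(params["encoder"], x)  # assumed shared feature
    # Generation process 1 (assumed form): generate a same-class peer.
    transfer = mse(linear(params["decoder_transfer"], shared), peer)
    # Generation process 2 (assumed form): reconstruct the input itself.
    recon = mse(linear(params["decoder_recon"], shared), x)
    # Classification on the shared feature.
    cls = cross_entropy(linear(params["classifier"], shared), label)
    total = cls + w_transfer * transfer + w_recon * recon
    return total, {"cls": cls, "transfer": transfer, "recon": recon}


# Toy sizes and random data, chosen only to make the sketch runnable.
rng = random.Random(0)
dim, feat_dim, n_classes = 8, 3, 4
params = {
    "encoder": init_matrix(feat_dim, dim, rng),
    "decoder_transfer": init_matrix(dim, feat_dim, rng),
    "decoder_recon": init_matrix(dim, feat_dim, rng),
    "classifier": init_matrix(n_classes, feat_dim, rng),
}
x = [rng.random() for _ in range(dim)]      # flattened input "image"
peer = [rng.random() for _ in range(dim)]   # another sample of the same class
total, parts = multitask_loss(params, x, peer, label=2)
print(f"total={total:.4f}", {k: round(v, 4) for k, v in parts.items()})
```

In a real training loop the three terms would be minimized jointly by gradient descent in a deep learning framework; the sketch only shows how the terms might be combined into one objective.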

Key words: style classification; shared feature learning; autoencoder; style features; style transfer

Received: 2024-06-25 Published: 2025-07-25

CLC: TP 391.4

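Funding: Open Project of the State Key Laboratory of CAD&CG, Zhejiang University (A2416); Guangzhou Philosophy and Social Science Development "14th Five-Year Plan" Project (2024GZGJ17); Fundamental Research Funds for the Central Universities (2022ZYGXZR020); Joint Fund of the Natural Science Foundation of Shandong Province (ZR2021LZL011).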

Corresponding author: Jia CUI E-mail: 202221055779@mail.scut.edu.cn; cuijia1247@scut.edu.cn

About the author: Jinchen SHEN (b. 1999), male, master's student, engaged in research on image style classification. orcid.org/0009-0007-4789-9585. E-mail: 202221055779@mail.scut.edu.cn


Cite this article:

Jinchen SHEN, Rui HUANG, Che JIANG, Meng QI, Jia CUI. Shared feature learning algorithm for style diffusion[J]. Journal of Zhejiang University (Engineering Science), 2025, 59(7): 1403-1410.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2025.07.008 or https://www.zjujournals.com/eng/CN/Y2025/V59/I7/1403

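Fig. 1 Shared feature learning framework

Fig. 2 Network structure of the diamond model

Fig. 3 Sample images from mainstream datasets

Tab. 1 Parameters of mainstream style datasets

Fig. 4 Accuracy comparison of distance metrics

Tab. 2 Module ablation experiments of the diamond model

Tab. 3 Loss function ablation experiments of the diamond model

Tab. 4 Performance comparison of different style classification models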

