Shared feature learning algorithm for style diffusion

Journal of ZheJiang University (Engineering Science)

2025, Vol. 59, Issue (7): 1403-1410

DOI: 10.3785/j.issn.1008-973X.2025.07.008

Jinchen SHEN3, Rui HUANG3, Che JIANG3, Meng QI2, Jia CUI1,3,*

1. State Key Laboratory of Subtropical Building and Urban Science, South China University of Technology, Guangzhou 510641, China
2. School of Information Science and Engineering, Shandong Normal University, Jinan 250358, China
3. School of Design, South China University of Technology, Guangzhou 510006, China

Abstract

To address the problem of style diffusion, the dominant style features of an image were learned to improve the accuracy of image style classification. Drawing on the transferability of style, an asymmetric diamond model was proposed in which the features that transfer among data of the same class were defined as intra-class shared features and used to learn the dominant style of the data. The diamond model was built on the autoencoder structure and consisted of two generation processes. In the first process, data of the same style class were sampled to learn transferable features as the dominant style features (shared features), thereby reducing sub-style interference. In the second process, a reconstruction loss was applied to maintain the continuity of the image's dominant style. Within a multi-task learning framework, shared feature learning and the classification model were optimized simultaneously to achieve category feature learning based on the dominant style. Comparative experiments were conducted on five style datasets (two oil painting datasets, one Chinese painting dataset, one architectural dataset, and one fashion dataset). Compared with existing style classification models, the accuracy of the proposed model improved by 2 to 7 percentage points, which validated the effectiveness and advanced performance of the model.
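The abstract names three objectives that are optimized together: a first generation process that learns features transferable within a style class, a second generation process constrained by a reconstruction loss, and a classifier. The sketch below shows one way such a combined loss could be assembled for a single training pair. It is an illustration only: the abstract does not give the layer design, the loss formulas, or the loss weights, so the linear maps, the mean squared error terms, the equal weights, and every name used here (`multitask_loss`, `decoder_1`, `decoder_2`, and so on) are assumptions, not the authors' implementation.

```python
import math
import random


def linear(weights, vec):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed stably."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum - logits[label]


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def multitask_loss(params, x, x_same_class, label, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three objectives (hypothetical formulation).

    x            : feature vector of the input image
    x_same_class : feature vector of another image from the same style class
    label        : integer style label of x
    """
    shared = linear(params["encoder"], x)              # shared-feature code
    transferred = linear(params["decoder_1"], shared)  # process 1: same-class target
    restored = linear(params["decoder_2"], shared)     # process 2: reconstruct input
    logits = linear(params["classifier"], shared)      # style classification head

    loss_transfer = mse(transferred, x_same_class)
    loss_recon = mse(restored, x)
    loss_cls = cross_entropy(logits, label)
    w_t, w_r, w_c = weights
    total = w_t * loss_transfer + w_r * loss_recon + w_c * loss_cls
    return total, (loss_transfer, loss_recon, loss_cls)


rng = random.Random(0)
dim, code, classes = 8, 3, 5  # assumed toy sizes
params = {
    "encoder": random_matrix(code, dim, rng),
    "decoder_1": random_matrix(dim, code, rng),
    "decoder_2": random_matrix(dim, code, rng),
    "classifier": random_matrix(classes, code, rng),
}
x = [rng.random() for _ in range(dim)]
x_same_class = [rng.random() for _ in range(dim)]
total, parts = multitask_loss(params, x, x_same_class, label=2)
print(total, parts)
```

In this sketch both decoders and the classifier read the same code, so gradients from all three terms would shape the shared representation at once. That is the general idea of multi-task optimization, not a claim about how the paper weights or schedules its losses.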

Key words: style classification, shared feature learning, autoencoder, style features, style transfer

Received: 25 June 2024 Published: 25 July 2025

CLC: TP 391.4

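Fund: Open Project of the State Key Laboratory of Computer-Aided Design and Graphics Systems, Zhejiang University (A2416); Guangzhou Philosophy and Social Science Development "14th Five-Year Plan" Project (2024GZGJ17); Fundamental Research Funds for the Central Universities (2022ZYGXZR020); Joint Fund of the Natural Science Foundation of Shandong Province (ZR2021LZL011).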

Corresponding Authors: Jia CUI E-mail: 202221055779@mail.scut.edu.cn;cuijia1247@scut.edu.cn

Cite this article:

Jinchen SHEN,Rui HUANG,Che JIANG,Meng QI,Jia CUI. Shared feature learning algorithm for style diffusion. Journal of ZheJiang University (Engineering Science), 2025, 59(7): 1403-1410.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2025.07.008 OR https://www.zjujournals.com/eng/Y2025/V59/I7/1403

面向风格扩散的共享特征学习算法

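To solve the style diffusion problem, the dominant style features of images were learned to raise the accuracy of style classification. Drawing on the transferable nature of style, a diamond model with an asymmetric structure was proposed; features that can be transferred among data of the same class were defined as intra-class shared style features and used to learn the dominant style features of the data. Based on the autoencoder structure, the diamond model was composed of two generation processes. The first generation process sampled data of the same style class to learn transferable features as the dominant style features (shared features), reducing sub-style interference. The second generation process used a reconstruction loss to keep the dominant style of the image continuous. A multi-task learning framework optimized shared feature learning and the classification model at the same time, achieving category feature learning based on the dominant style. Comparative experiments were carried out on five style datasets (two oil painting datasets, one Chinese painting dataset, one architectural dataset, and one fashion dataset). Compared with existing style classification models, the accuracy of the proposed model increased by 2 to 7 percentage points, which verified the effectiveness and advanced performance of the model.

Key words: style classification, shared feature learning, autoencoder, style features, style transfer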

Fig.1 Shared feature learning framework

Fig.2 Network structure of diamond model

Fig.3 Image samples from dominant style dataset

Tab.1 Parameters of dominant style dataset

Fig.4 Accuracy comparison of distance metrics

Tab.2 Modular ablation experiments for diamond model

Tab.3 Loss function ablation experiments for diamond model

Tab.4 Performance comparison of different style classification models


