Journal of ZheJiang University (Engineering Science)  2025, Vol. 59 Issue (7): 1403-1410    DOI: 10.3785/j.issn.1008-973X.2025.07.008
    
Shared feature learning algorithm for style diffusion
Jinchen SHEN (3), Rui HUANG (3), Che JIANG (3), Meng QI (2), Jia CUI (1, 3, *)
1. State Key Laboratory of Subtropical Building and Urban Science, South China University of Technology, Guangzhou 510641, China
2. School of Information Science and Engineering, Shandong Normal University, Jinan 250358, China
3. School of Design, South China University of Technology, Guangzhou 510006, China

Abstract  

To address the problem of style diffusion, the dominant style features of an image were learned to improve the accuracy of image style classification. Drawing on the transferable nature of style, an asymmetric diamond model was proposed in which the features that can be transferred among data of the same class are defined as intra-class shared features and used to learn the dominant style of the data. Built on the autoencoder structure, the diamond model consists of two generation processes. In the first process, data of the same style class were sampled to learn transferable features as the dominant style features (shared features), thereby reducing sub-style interference. In the second process, a reconstruction loss was applied to maintain the continuity of the image's dominant style. Within a multi-task learning framework, shared feature learning and the classification model were optimized simultaneously to achieve category feature learning based on the dominant style. Comparative experiments were conducted on five style datasets (two oil painting datasets, one Chinese painting dataset, one architectural dataset, and one fashion dataset). Compared with existing style classification models, the accuracy of the proposed model improved by 2 to 7 percentage points, which validated the effectiveness and advanced performance of the model.
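
The multi-task objective described above can be illustrated with a minimal sketch. This page does not give the loss definitions, so everything below is an assumption made for illustration: the function names, the use of mean squared error for both generation processes, and the weights `w_shared`, `w_rec`, and `w_cls` are hypothetical and are not the authors' formulation.

```python
import math

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample (numerically stabilised)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def multitask_loss(gen_first, x_same_class, gen_second, x, logits, label,
                   w_shared=1.0, w_rec=1.0, w_cls=1.0):
    """Weighted sum of three hypothetical terms (illustrative only).

    gen_first    : output of the first generation process for image x
    x_same_class : another image sampled from the same style class
    gen_second   : output of the second generation process for image x
    logits       : classifier scores for image x
    """
    shared = mse(gen_first, x_same_class)  # assumed proxy for shared-feature learning
    rec = mse(gen_second, x)               # assumed reconstruction term
    cls = cross_entropy(logits, label)     # classification term
    total = w_shared * shared + w_rec * rec + w_cls * cls
    return total, {"shared": shared, "rec": rec, "cls": cls}

# Toy vectors standing in for flattened images.
total, parts = multitask_loss(
    gen_first=[0.2, 0.8], x_same_class=[0.2, 0.6],
    gen_second=[1.0, 0.0, 0.5], x=[1.0, 0.0, 0.5],
    logits=[2.0, 0.0], label=0,
)
print(parts, total)
```

Summing the terms before a single optimisation step is what lets one gradient update adjust the shared-feature learner and the classifier together; in practice the weights would need tuning per dataset.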



Key words: style classification, shared feature learning, autoencoder, style features, style transfer
Received: 25 June 2024      Published: 25 July 2025
CLC:  TP 391.4  
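Fund: Open Project of the State Key Laboratory of CAD&CG, Zhejiang University (A2416); Guangzhou Philosophy and Social Science Development "14th Five-Year Plan" Project (2024GZGJ17); Fundamental Research Funds for the Central Universities (2022ZYGXZR020); Joint Fund of the Shandong Provincial Natural Science Foundation (ZR2021LZL011).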
Corresponding Authors: Jia CUI     E-mail: 202221055779@mail.scut.edu.cn;cuijia1247@scut.edu.cn
Cite this article:

Jinchen SHEN,Rui HUANG,Che JIANG,Meng QI,Jia CUI. Shared feature learning algorithm for style diffusion. Journal of ZheJiang University (Engineering Science), 2025, 59(7): 1403-1410.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2025.07.008     OR     https://www.zjujournals.com/eng/Y2025/V59/I7/1403


Fig.1 Shared feature learning framework
Fig.2 Network structure of diamond model
Fig.3 Image samples from dominant style dataset
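| Dataset | Domain | Number of images | Number of style labels |
| --- | --- | --- | --- |
| Painting91 (P91) | Oil painting | 2 338 | 13 |
| Pandora (Pan) | Oil painting | 7 726 | 12 |
| Chinese Painting (CP) | Chinese painting | 5 110 | 5 |
| Arch | Architecture | 10 113 | 25 |
| Fashion Style14 (FS) | Fashion | 13 126 | 14 |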
Tab.1 Parameters of dominant style dataset
Fig.4 Accuracy comparison of distance metrics
| Module | ACC/% (Painting91) | ACC/% (Pandora) |
| --- | --- | --- |
| E | 58.35 | 46.64 |
| E-D | 58.93 (+0.58) | 47.17 (+0.53) |
| ${T_{\text{f}}}+{T_{\text{f}}}$ | 63.07 (+4.14) | 51.39 (+4.75) |
| This study | 67.78 (+9.43) | 56.68 (+10.04) |
Tab.2 Modular ablation experiments for diamond model
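
The parenthesised values in Tab.2 read as percentage-point gains over the encoder-only row E. That baseline is an assumption, since this page does not state it. A short sketch for recomputing the gains from the reported accuracies follows; the dictionary name `TAB2` and the helper `gains_over` are hypothetical.

```python
# Accuracies (%) copied from Tab.2: (Painting91, Pandora).
TAB2 = {
    "E": (58.35, 46.64),
    "E-D": (58.93, 47.17),
    "T_f+T_f": (63.07, 51.39),
    "This study": (67.78, 56.68),
}

def gains_over(baseline, table):
    """Percentage-point difference of every other row from the baseline row."""
    base = table[baseline]
    return {
        name: tuple(round(v - b, 2) for v, b in zip(vals, base))
        for name, vals in table.items()
        if name != baseline
    }

gains = gains_over("E", TAB2)
for name, (p91, pan) in gains.items():
    print(f"{name}: Painting91 {p91:+.2f}, Pandora {pan:+.2f}")
```

A recomputed gain that differs from the printed one would suggest that the entry was measured against a different baseline row.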
| ${L_{\text{f}}}$, ${L_{\text{b}}}$, ${L_{\text{s}}}$, ${L_{\text{t}}}$ | ACC/% (Painting91) | ACC/% (Pandora) |
| --- | --- | --- |
| × × × | 57.22 | 50.88 |
| × × × | 57.01 | 51.46 |
| × × × | 61.59 | 54.94 |
| × × × | 64.22 | 55.35 |
Tab.3 Loss function ablation experiments for diamond model
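| Model | ACC/% (P91) | ACC/% (Pan) | ACC/% (CP) | ACC/% (Arch) | ACC/% (FS) |
| --- | --- | --- | --- | --- | --- |
| VGG16 | 58.42 | 49.73 | 52.67 | 61.41 | 68.22 |
| VGG19 | 58.11 | 46.44 | 52.77 | 60.11 | 66.14 |
| ResNet50 | 64.93 | 51.65 | 57.03 | 65.12 | 71.13 |
| ResNet101 | 65.50 | 52.61 | 56.53 | 66.42 | 70.00 |
| InceptionV3 | 53.41 | 42.83 | 55.88 | 61.52 | 62.70 |
| DAE | 58.82 | 48.71 | 52.66 | 58.55 | 61.48 |
| SCAE | 63.65 | 48.64 | 55.16 | 59.61 | 74.33 |
| SSCAE | 64.07 | 49.38 | 55.68 | 60.48 | 75.02 |
| MCCFNet | 66.60 | 51.39 | 59.10 | 66.12 | 68.38 |
| STSACLF | 60.41 | 55.80 | 58.55 | 60.81 | 64.47 |
| This study 1 | 67.39 | 56.67 | 55.27 | 65.57 | 71.67 |
| This study 2 | 69.12 | 56.98 | 59.41 | 69.03 | 77.17 |

DDS: 62.21, 52.35, 53.13 (only three values are listed; their dataset columns are not recoverable here)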
Tab.4 Performance comparison of different style classification models