Journal of ZheJiang University (Engineering Science)  2023, Vol. 57 Issue (7): 1326-1334    DOI: 10.3785/j.issn.1008-973X.2023.07.007
    
Calligraphy generation algorithm based on improved generative adversarial network
Yun-hong LI(),Jiao-jiao DUAN,Xue-ping SU,Lei-tao ZHANG,Hui-kang YU,Xing-rui LIU
School of Electronics and Information, Xi’an Polytechnic University, Xi’an 710048, China

Abstract  

An improved zi2zi generative adversarial network for generating calligraphic characters was proposed to solve the problems of missing strokes, disordered glyph structure, blurred images and poor quality in fonts generated by generative adversarial networks. A residual block with a 1×1 convolution kernel was introduced into the encoder to improve the generator's ability to extract detailed features of calligraphic fonts, and a context-aware attention structure was added to extract the stylistic features of calligraphic fonts. Spectral normalization was used in the discriminator to enhance the model's stability and avoid mode collapse caused by unstable training. The minimum absolute error (L1 norm) was used to constrain the font edge features, making the font outlines clearer. Two styles of calligraphic characters were generated. Test results on the target-style datasets of Yan Zhenqing regular script and Zhao Mengfu running script showed that both the subjective and objective evaluation results of the proposed algorithm were better than those of the comparison algorithms. Compared with zi2zi, the peak signal-to-noise ratio was increased by 1.58 and 1.76 dB respectively, the structural similarity was increased by 5.66% and 6.91% respectively, and the perceptual similarity was reduced by 4.21% and 6.20% respectively.
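To make the edge constraint concrete, the following is a minimal PyTorch sketch of an L1 edge loss. The abstract does not name the edge extractor, so the Sobel operator and all function names here are illustrative assumptions rather than the paper's implementation.

# Hedged sketch of the L1 (minimum absolute error) edge constraint described
# above. The edge extractor is NOT specified in the abstract; Sobel filtering
# is assumed here purely for illustration.
import torch
import torch.nn.functional as F

def sobel_edges(img: torch.Tensor) -> torch.Tensor:
    """Edge maps for a batch of single-channel font images of shape (B, 1, H, W)."""
    kx = torch.tensor([[-1., 0., 1.],
                       [-2., 0., 2.],
                       [-1., 0., 1.]], device=img.device).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)  # Sobel-y kernel is the transpose of Sobel-x
    gx = F.conv2d(img, kx, padding=1)
    gy = F.conv2d(img, ky, padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)  # gradient magnitude

def edge_l1_loss(generated: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """L1 distance between edge maps, encouraging sharper font outlines."""
    return F.l1_loss(sobel_edges(generated), sobel_edges(target))

# Spectral normalization on discriminator layers, as mentioned in the abstract,
# is available as a built-in PyTorch wrapper, e.g.:
#   sn_conv = torch.nn.utils.spectral_norm(torch.nn.Conv2d(1, 64, 4, 2, 1))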



Key words: calligraphy generation; deep learning; generative adversarial network; context-aware attention; edge loss
Received: 02 August 2022      Published: 17 July 2023
CLC:  TP 391  
Fund: National Natural Science Foundation of China (61902301); Key Program of Natural Science Basic Research of Shaanxi Provincial Department of Science and Technology (2022JZ-35)
Cite this article:

Yun-hong LI,Jiao-jiao DUAN,Xue-ping SU,Lei-tao ZHANG,Hui-kang YU,Xing-rui LIU. Calligraphy generation algorithm based on improved generative adversarial network. Journal of ZheJiang University (Engineering Science), 2023, 57(7): 1326-1334.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2023.07.007     OR     https://www.zjujournals.com/eng/Y2023/V57/I7/1326


Fig.1 zi2zi network structure
Fig.2 Overall framework of improved network
Layer type Network parameters
Convolutional layer 4×4 Conv, 64 filters, 2 strides
Convolutional unit layer 4×4 Conv-IN-LReLU, 128 filters, 2 strides
Convolutional unit layer 4×4 Conv-IN-LReLU, 256 filters, 2 strides
Convolutional unit layer 4×4 Conv-IN-LReLU, 512 filters, 2 strides
Convolutional unit layer 4×4 Conv-IN-LReLU, 512 filters, 2 strides
Convolutional unit layer 4×4 Conv-IN-LReLU, 512 filters, 2 strides
Residual unit layer ResNet block, 1×1 Conv, 512 filters, 1 stride
Convolutional unit layer 4×4 Conv-IN-LReLU, 512 filters, 2 strides
Residual unit layer ResNet block, 1×1 Conv, 512 filters, 1 stride
Convolutional layer 4×4 Conv-LReLU, 512 filters, 2 strides
Tab.1 Network parameters of content encoder
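A direct reading of Tab. 1 can be expressed as a short PyTorch sketch. The layer order, kernel sizes, strides and channel counts follow the table; the padding values and the internal layout of the 1×1 residual block (Fig. 3) are assumptions made for illustration.

# Minimal sketch of the content encoder in Tab. 1, assuming single-channel
# 256x256 font images as input and padding=1 so each 4x4/stride-2 conv halves
# the spatial resolution.
import torch
import torch.nn as nn

class ResBlock1x1(nn.Module):
    """Residual block with 1x1 convolutions (Fig. 3); stride 1, channel-preserving."""
    def __init__(self, channels: int = 512):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1, stride=1),
            nn.InstanceNorm2d(channels),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(channels, channels, kernel_size=1, stride=1),
            nn.InstanceNorm2d(channels),
        )
    def forward(self, x):
        return x + self.body(x)  # skip connection around the 1x1 conv body

def conv_unit(c_in: int, c_out: int) -> nn.Sequential:
    """4x4 Conv-IN-LReLU unit with stride 2 (downsampling by 2)."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=4, stride=2, padding=1),
        nn.InstanceNorm2d(c_out),
        nn.LeakyReLU(0.2, inplace=True),
    )

content_encoder = nn.Sequential(
    nn.Conv2d(1, 64, kernel_size=4, stride=2, padding=1),  # plain conv layer
    conv_unit(64, 128),
    conv_unit(128, 256),
    conv_unit(256, 512),
    conv_unit(512, 512),
    conv_unit(512, 512),
    ResBlock1x1(512),
    conv_unit(512, 512),
    ResBlock1x1(512),
    nn.Sequential(  # final 4x4 Conv-LReLU layer (no IN, per Tab. 1)
        nn.Conv2d(512, 512, kernel_size=4, stride=2, padding=1),
        nn.LeakyReLU(0.2, inplace=True),
    ),
)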
Fig.3 Residual block
Fig.4 Context-aware attention block
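The exact context-aware attention structure is defined by Fig. 4, which is not reproduced on this page. Purely as an orientation aid, the sketch below assumes a standard self-attention-style formulation over style-feature maps; the class and parameter names are hypothetical, not taken from the paper.

# Illustrative sketch only: a generic attention block over feature maps,
# standing in for the context-aware attention structure of Fig. 4.
import torch
import torch.nn as nn

class ContextAwareAttention(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)  # (B, HW, C/8)
        k = self.key(x).flatten(2)                    # (B, C/8, HW)
        attn = torch.softmax(q @ k, dim=-1)           # (B, HW, HW) attention map
        v = self.value(x).flatten(2)                  # (B, C, HW)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                   # residual combination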
Fig.5 Self-built samples of two datasets
Fig.6 Ablation experiments of different modules (target font: Yan Zhenqing regular script)
Fig.7 Ablation experiments of different modules (target font: Zhao Mengfu running script)
Network model SSIM PSNR/dB LPIPS
zi2zi 0.6975 9.3097 0.2383
zi2zi + residual 0.7115 9.8567 0.2452
zi2zi + residual + style encoding 0.7394 10.8484 0.1842
zi2zi + residual + style encoding + edge loss 0.7425 10.8228 0.1821
Tab.2 Evaluation index of ablation experiment (target font: Yan Zhenqing regular script)
Network model SSIM PSNR/dB LPIPS
zi2zi 0.6917 8.5425 0.2947
zi2zi + residual 0.6918 8.4810 0.2845
zi2zi + residual + style encoding 0.7607 10.8883 0.2177
zi2zi + residual + style encoding + edge loss 0.7665 10.9052 0.2152
Tab.3 Evaluation index of ablation experiment (target font: Zhao Mengfu running script)
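The three indices reported in Tabs. 2-5 are standard metrics: SSIM [18], PSNR, and LPIPS [19]. The sketch below shows how they might be computed for one generated/reference pair using scikit-image and the lpips package; the tooling choice is an assumption, not taken from the paper.

# Hedged sketch: computing SSIM, PSNR and LPIPS for one image pair.
import numpy as np
import torch
import lpips
from skimage.metrics import structural_similarity, peak_signal_noise_ratio

lpips_fn = lpips.LPIPS(net='alex')  # AlexNet-backed perceptual metric [19]

def evaluate_pair(gen: np.ndarray, ref: np.ndarray) -> dict:
    """gen, ref: grayscale uint8 images of identical shape (H, W)."""
    ssim = structural_similarity(gen, ref)    # higher is better
    psnr = peak_signal_noise_ratio(ref, gen)  # in dB, higher is better
    # LPIPS expects 3-channel float tensors scaled to [-1, 1]
    def to_tensor(a: np.ndarray) -> torch.Tensor:
        t = torch.from_numpy(a).float().div(127.5).sub(1.0)
        return t.unsqueeze(0).unsqueeze(0).repeat(1, 3, 1, 1)
    lp = lpips_fn(to_tensor(gen), to_tensor(ref)).item()  # lower is better
    return {'SSIM': ssim, 'PSNR/dB': psnr, 'LPIPS': lp}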
Fig.8 Results generated by different algorithms (target font: Yan Zhenqing regular script)
Fig.9 Results generated by different algorithms (target font: Zhao Mengfu running script)
Algorithm SSIM PSNR/dB LPIPS
CycleGAN 0.5492 7.2531 0.3300
DenseNet CycleGAN 0.4958 7.2698 0.3472
zi2zi 0.6728 9.0505 0.2289
EMD 0.5567 8.7037 0.3229
CalliGAN 0.6254 9.2908 0.2385
LFFont 0.6016 9.1077 0.2311
MXFont 0.6151 9.3029 0.2497
Proposed algorithm 0.7249 10.6253 0.1868
Tab.4 Evaluation indexes of different algorithms (target font: Yan Zhenqing regular script)
Algorithm SSIM PSNR/dB LPIPS
CycleGAN 0.5617 7.1628 0.3712
DenseNet CycleGAN 0.5317 6.9037 0.3608
zi2zi 0.6348 7.7615 0.3223
EMD 0.5614 8.4337 0.3595
CalliGAN 0.5737 7.8590 0.3271
LFFont 0.5941 8.1400 0.3025
MXFont 0.6078 7.8666 0.3087
Proposed algorithm 0.7039 9.5221 0.2603
Tab.5 Evaluation indexes of different algorithms (target font: Zhao Mengfu running script)
Fig.10 Results generated with simkai as the source font
Fig.11 Results generated with simsun as the source font
Source font Target font SSIM PSNR/dB LPIPS
SimKai Regular script 0.7044 10.1207 0.1965
SimKai Running script 0.6914 9.2937 0.2503
SimSun Regular script 0.7153 10.4740 0.2042
SimSun Running script 0.6971 9.5003 0.2495
Tab.6 Evaluation indexes of target font generated by different source fonts
Fig.12 Loss curves of training process
[1]   CHENG Ruo-ran, ZHAO Xiao-li, ZHOU Hao-jun, et al. Review of Chinese font style conversion based on deep learning [J]. Journal of Zhejiang University: Engineering Science, 2022, 56(3): 510-519.
[2]   ZENG Jin-shan, CHEN Qi, WANG Ming-wen. Self-supervised font generation of Chinese characters based on Tian Zige transformation [J]. Scientia Sinica Informationis, 2022, 52(1): 145-159.
[3]   HUANG Zi-jun, CHEN Qi, LUO Wen-bing. Chinese character generation method based on deep learning [J]. Computer Engineering and Applications, 2021, 57(17): 29-36.
doi: 10.3778/j.issn.1002-8331.2103-0297
[4]   TIAN Y. Rewrite [CP/OL]. [2022-08-02]. https://github.com/kaonashi-tyc/Rewrite.
[5]   ISOLA P, ZHU J Y, ZHOU T, et al. Image-to-image translation with conditional adversarial networks [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Los Alamitos: IEEE, 2017: 5967-5976.
[6]   TIAN Y. Zi2zi [CP/OL]. [2022-08-02]. https://github.com/kaonashi-tyc/zi2zi.
[7]   JIANG Y, LIAN Z, TANG Y, et al. DCFont: an end-to-end deep Chinese font generation system [C]// SIGGRAPH Asia 2017 Technical Briefs. New York: ACM, 2017: 1-4.
[8]   CHANG B, ZHANG Q, PAN S, et al. Generating handwritten Chinese characters using CycleGAN [C]// 2018 IEEE Winter Conference on Applications of Computer Vision. Lake Tahoe: IEEE, 2018: 199-207.
[9]   ZHU J Y, PARK T, ISOLA P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks [C]// 2017 IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 2242-2251.
[10]   REN C, LYU S, ZHAN H, et al. SAFont: automatic font synthesis using self-attention mechanisms [J]. Australian Journal of Intelligent Information Processing Systems, 2019, 16(2): 19-25.
[11]   WU S J, YANG C Y, HSU J Y. CalliGAN: style and structure-aware Chinese calligraphy character generator [EB/OL]. (2020-05-26) [2022-08-02]. https://arxiv.org/abs/2005.12500.
[12]   PARK S, CHUN S, CHA J, et al. Few-shot font generation with localized style representations and factorization [C]// Proceedings of the AAAI Conference on Artificial Intelligence. Vancouver: AAAI, 2021, 35(3): 2393-2402.
[13]   XIE Y, CHEN X, SUN L, et al. DGFont: deformable generative networks for unsupervised font generation [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 5130-5140.
[14]   PARK S, CHUN S, CHA J, et al. Multiple heads are better than one: few-shot font generation with multiple localized experts [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal: IEEE, 2021: 13900-13909.
[15]   KONG Y, LUO C, MA W, et al. Look closer to supervise better: one-shot font generation via component-based discriminator [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022: 13482-13491.
[16]   ODENA A, OLAH C, SHLENS J. Conditional image synthesis with auxiliary classifier GANs [C]// Proceedings of the 34th International Conference on Machine Learning. Sydney: ACM, 2017, 70: 2642–2651.
[17]   TAIGMAN Y, POLYAK A, WOLF L. Unsupervised cross-domain image generation [EB/OL]. (2016-10-07) [2022-08-02]. https://arxiv.org/abs/1611.02200.
[18]   WANG Z, BOVIK A C, SHEIKH H R, et al. Image quality assessment: from error visibility to structural similarity [J]. IEEE Transactions on Image Processing, 2004, 13(4): 600-612.
doi: 10.1109/TIP.2003.819861
[19]   ZHANG R, ISOLA P, EFROS A A, et al. The unreasonable effectiveness of deep features as a perceptual metric [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Los Alamitos: IEEE, 2018: 586-595.