1. School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450000, China
2. Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
3. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
An image cartoonization method that incorporates an attention mechanism and structural line extraction was proposed to address two shortcomings of existing image cartoonization: important feature information in the image is not highlighted, and edges are insufficiently processed. A generator network with a fused attention mechanism was constructed, which extracts more important and richer image information from different features by fusing the relationships between features across space and channels. A line extraction region processing module (LERM), running in parallel with the global branch, was designed to perform adversarial training on the edge regions of cartoon textures so that cartoon textures are learned better. The method generates cartoon images with high perceptual quality in important areas and details while avoiding the loss of content and color. Extensive experiments showed that the proposed method achieved better cartoonization results, validating its effectiveness.
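As a rough illustration of what fusing feature relationships across channels and space can look like, the following is a minimal NumPy sketch of a CBAM-style attention block (channel attention followed by spatial attention). It is an assumption made for illustration, not the generator described in the paper: the function names, the weight shapes `w1`, `w2`, `kernel`, and the ordering of the two steps are all hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Reweight channels of feat (C, H, W) using a shared two-layer MLP.

    w1 has shape (C // r, C) and w2 has shape (C, C // r); both are
    hypothetical learned weights supplied by the caller.
    """
    avg = feat.mean(axis=(1, 2))  # (C,) average-pooled descriptor
    mx = feat.max(axis=(1, 2))    # (C,) max-pooled descriptor

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # ReLU between the two layers

    weights = sigmoid(mlp(avg) + mlp(mx))    # (C,) values in (0, 1)
    return feat * weights[:, None, None]


def spatial_attention(feat, kernel):
    """Reweight spatial positions using a (2, k, k) kernel, k odd."""
    pooled = np.stack([feat.mean(axis=0), feat.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    h, w = feat.shape[1:]
    response = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            response[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return feat * sigmoid(response)[None, :, :]


def fused_attention(feat, w1, w2, kernel):
    """Apply channel attention, then spatial attention (assumed order)."""
    return spatial_attention(channel_attention(feat, w1, w2), kernel)
```

Because both attention maps lie in (0, 1), the block only rescales the input features; in a trained network the weights would be learned jointly with the rest of the generator.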
Tab.1 FID values for generated image and target image
Fig.8 Comparison chart of loss function ablation experiment
Fig.9 Comparison chart of component ablation experiment
Tab.2 FID values of ablation experiment
No.   Model                               FID
A     Local branch                        177.3
B     Global branch                       135.8
C     Without color reconstruction loss   135.0
D     Without attention module            134.5
E     Canny                               134.0
F     Full model                          133.7
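For readers unfamiliar with the metric, FID is the Fréchet distance between two Gaussians fitted to image features, and lower values indicate closer distributions. The sketch below computes that distance from two feature matrices with NumPy. It assumes the features have already been extracted (FID conventionally uses Inception-v3 activations, which are omitted here), and the function name `frechet_distance` is hypothetical rather than taken from the paper.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two (N, D) feature sets.

    Implements ||mu_a - mu_b||^2 + Tr(S_a) + Tr(S_b) - 2 Tr((S_a S_b)^(1/2)).
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # The trace of the matrix square root of cov_a @ cov_b equals the sum of
    # the square roots of its eigenvalues, which are real and non-negative
    # for covariance matrices; clipping guards against tiny negative values
    # caused by round-off.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b)
                 - 2.0 * trace_sqrt)
```

As a sanity check, two identical feature sets give a distance of approximately zero, and shifting every feature by a constant vector changes the distance by the squared length of that vector.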