Journal of ZheJiang University (Engineering Science)  2023, Vol. 57 Issue (5): 939-947    DOI: 10.3785/j.issn.1008-973X.2023.05.010
    
Sketch-based compatible clothing image generation
Xiao-lu CAO, Fu-nan LU, Xiang ZHU, Li-bo WENG, Shu-fang LU*, Fei GAO
College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China

Abstract  

A new sketch-based method for generating compatible clothing images was proposed to address the lack of diversity and compatibility in existing clothing image generation methods. Users were allowed to input both a sketch and a reference clothing image to generate diverse clothing images that were faithful to the sketch in content and compatible with the reference clothing in style. A novel network framework consisting of two encoding networks and one decoding network was designed: the encoding networks extracted the features of the reference clothing and the user-drawn sketch, and the decoding network generated the images. An authenticity discrimination network and a compatibility discrimination network were constructed, and a joint loss function composed of adversarial loss, reconstruction loss, perceptual loss, style loss and edge loss was designed to guide the network to generate realistic clothing images compatible with the style of the reference clothing. Quantitative experimental results showed that the proposed method improved the quality of the generated images, and its overall performance was better than that of the baseline methods. Qualitative experimental results showed that the generated images were consistent with the sketch description and that diverse results could be produced.
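The joint objective described above combines adversarial, reconstruction, perceptual, style, and edge terms. A minimal sketch of how such terms might be weighted and summed is shown below; the weights and the Gram-matrix form of the style term are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def gram_matrix(feat):
    """Gram matrix of a (C, H*W) feature map, the usual basis of a style loss."""
    c, n = feat.shape
    return feat @ feat.T / (c * n)

def style_loss(feat_gen, feat_ref):
    """Mean squared difference between Gram matrices of generated and reference features."""
    return float(np.mean((gram_matrix(feat_gen) - gram_matrix(feat_ref)) ** 2))

def joint_loss(l_adv, l_rec, l_perc, l_style, l_edge,
               w_rec=10.0, w_perc=1.0, w_style=1.0, w_edge=1.0):
    """Weighted sum of the five loss terms guiding the generator (weights hypothetical)."""
    return l_adv + w_rec * l_rec + w_perc * l_perc + w_style * l_style + w_edge * l_edge
```

With the assumed weights, identical style features contribute zero style loss, and the reconstruction term dominates, as is common in image-translation GANs such as Pix2Pix [3].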



Key words: clothing image generation; deep learning; generative adversarial network; clothing compatibility; image translation
Received: 06 May 2022      Published: 09 May 2023
CLC:  TP 391  
Fund: Zhejiang Provincial Natural Science Foundation of China (LQ22F020008); Zhejiang Provincial "Pioneer" and "Leading Goose" R&D Program (2022C01120)
Corresponding Authors: Shu-fang LU     E-mail: xiaolucao@outlook.com;sflu@zjut.edu.cn
Cite this article:

Xiao-lu CAO,Fu-nan LU,Xiang ZHU,Li-bo WENG,Shu-fang LU,Fei GAO. Sketch-based compatible clothing image generation. Journal of ZheJiang University (Engineering Science), 2023, 57(5): 939-947.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2023.05.010     OR     https://www.zjujournals.com/eng/Y2023/V57/I5/939


Fig.1 Framework of sketch-based compatible clothing image generation network
Fig.2 Network architecture of generator
Fig.3 Network architecture of discriminators
Method              SSIM    FID      IS
Pix2Pix[3]          0.7953  50.2372  4.7197
CycleGAN[5]         0.5462  72.3221  4.2285
DiscoGAN[12]        0.6525  51.0295  4.6999
CocosNet[18]        0.6541  38.0018  4.3841
CocosNet v2[19]     0.7659  23.4624  4.5026
Anime2Clothing[39]  0.6374  52.3293  4.8543
w/o CSC Block       0.7192  22.4064  4.8787
Ours                0.7940  20.2023  4.9307
Tab.1 Comparison of three indicators of images generated by different methods
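SSIM in Tab.1 measures structural similarity between a generated image and its ground truth (Wang et al. [37]). A simplified global-statistics version (omitting the sliding Gaussian window used in standard evaluations) can be sketched as:

```python
import numpy as np

def ssim_global(x, y, data_range=1.0):
    """Single-window SSIM over whole images with values in [0, data_range].
    Practical SSIM (ref. [37]) averages this over local sliding windows."""
    c1 = (0.01 * data_range) ** 2  # stabilizer for the luminance term
    c2 = (0.03 * data_range) ** 2  # stabilizer for the contrast term
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx**2 + my**2 + c1) * (vx + vy + c2))
```

An identical image pair yields an SSIM of 1, matching the metric's upper bound; higher SSIM (and IS) and lower FID indicate better generation quality in Tab.1.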
Method              Authenticity score  Compatibility score
Pix2Pix[3]          5.44                5.61
CycleGAN[5]         8.94                9.67
DiscoGAN[12]        11.19               12.81
CocosNet[18]        14.78               14.56
CocosNet v2[19]     17.56               17.25
Anime2Clothing[39]  20.00               19.11
Ours                22.08               21.00
Tab.2 User study on authenticity and compatibility
Fig.4 Comparison results of images generated by different methods under same input
Fig.5 Generated image comparison results when inputting same sketch with different reference garments
Fig.6 Generated image comparison results when inputting same reference garments with different sketches
[1]   GOODFELLOW I J, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial nets [J]. Advances in Neural Information Processing Systems, 2014, 27: 2672-2680.
[2]   KARRAS T, LAINE S, AILA T. A style-based generator architecture for generative adversarial networks [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 4401-4410.
[3]   ISOLA P, ZHU J Y, ZHOU T, et al. Image-to-image translation with conditional adversarial networks [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 5967-5976.
[4]   PARK T, LIU M Y, WANG T C, et al. Semantic image synthesis with spatially-adaptive normalization [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 2332-2341.
[5]   ZHU J Y, PARK T, ISOLA P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks [C]// Proceedings of the IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 2223-2232.
[6]   JOHNSON J, ALAHI A, FEI-FEI L. Perceptual losses for real-time style transfer and super-resolution [C]// Proceedings of the European Conference on Computer Vision. Cham: Springer, 2016: 694-711.
[7]   ZHAO Z, MA X. A compensation method of two-stage image generation for human-ai collaborated in-situ fashion design in augmented reality environment [C]// IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR). Taiwan: IEEE, 2018: 76-83.
[8]   YU C, HU Y, CHEN Y, et al. Personalized fashion design [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul: IEEE, 2019: 9045-9054.
[9]   HAN Y, YANG S, WANG W, et al. From design draft to real attire: unaligned fashion image translation [C]// Proceedings of the 28th ACM International Conference on Multimedia. Seattle: ACM, 2020: 1533-1541.
[10]   WANG T C, LIU M Y, ZHU J Y, et al. High-resolution image synthesis and semantic manipulation with conditional GANs [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 8798-8807.
[11]   CHANG Jia, WANG Yu-de, JI Yan-ni. Image translation algorithm based on modified U-Net generative adversarial network [J]. Communications Technology, 2020, 53(2): 327-334. doi: 10.3969/j.issn.1002-0802.2020.02.011
[12]   KIM T, CHA M, KIM H, et al. Learning to discover cross-domain relations with generative adversarial networks [C]// Proceedings of the International Conference on Machine Learning. Belgium: PMLR, 2017: 1857-1865.
[13]   CHOI Y, CHOI M, KIM M, et al. Stargan: unified generative adversarial networks for multi-domain image-to-image translation [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2018: 8789-8797.
[14]   XU Y, XIE S, WU W, et al. Maximum Spatial perturbation consistency for unpaired image-to-image translation [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022: 18311-18320.
[15]   LIU B, SONG K, ZHU Y, et al. Sketch-to-art: synthesizing stylized art images from sketches [C]// Proceedings of the Asian Conference on Computer Vision. Cham: Springer, 2020: 207-222.
[16]   CHOI Y, UH Y, YOO J, et al. StarGANv2: diverse image synthesis for multiple domains [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 8185-8194.
[17]   RICHARDSON E, ALALUF Y, PATASHNIK O, et al. Encoding in style: a stylegan encoder for image-to-image translation [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 2287-2296.
[18]   ZHANG P, ZHANG B, CHEN D, et al. Cross-domain correspondence learning for exemplar-based image translation [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 5142-5152.
[19]   ZHOU X, ZHANG B, ZHANG T, et al. Cocosnet v2: Full-resolution correspondence learning for image translation [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 11460-11470.
[20]   VASILEVA M I, PLUMMER B A, DUSAD K, et al. Learning type-aware embeddings for fashion compatibility [C]// Proceedings of the European Conference on Computer Vision (ECCV). Cham: Springer, 2018: 390-405.
[21]   HAN X, WU Z, JIANG Y G, et al. Learning fashion compatibility with bidirectional lstms [C]// Proceedings of the 25th ACM International Conference on Multimedia. Mountain View: ACM, 2017: 1078–1086.
[22]   TAUTKUTE I, TRZCIŃSKI T, SKORUPA A P, et al Deepstyle: multimodal search engine for fashion and interior design[J]. IEEE Access, 2019, 7: 84613- 84628
doi: 10.1109/ACCESS.2019.2923552
[23]   MCAULEY J, TARGETT C, SHI Q, et al. Image-based recommendations on styles and substitutes [C]// Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval. Santiago: ACM, 2015: 43-52.
[24]   VEIT A, KOVACS B, BELL S, et al. Learning visual clothing style with heterogeneous dyadic co-occurrences [C]// Proceedings of the IEEE International Conference on Computer Vision. Santiago: IEEE, 2015: 4642-4650.
[25]   LIN Y L, TRAN S, DAVIS L S. Fashion outfit complementary item retrieval [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 3308-3316.
[26]   LIU L, ZHANG H, ZHOU D Clothing generation by multi-modal embedding: a compatibility matrix-regularized GAN model[J]. Image and Vision Computing, 2021, 107: 104097
doi: 10.1016/j.imavis.2021.104097
[27]   DONG X, SONG X, ZHENG N, et al TryonCM2: try-on-enhanced fashion compatibility modeling framework[J]. IEEE Transactions on Neural Networks and Learning Systems, 2022, 1- 12
[28]   WANG Y, WEI Y, QIAN X, et al Sketch-guided scenery image outpainting[J]. IEEE Transactions on Image Processing, 2021, 30: 2643- 2655
doi: 10.1109/TIP.2021.3054477
[29]   NGUYEN T, LE T, VU H, et al. Dual discriminator generative adversarial nets [EB/OL]. [2017-09-12]. https://arxiv.org/pdf/1709.03831.pdf.
[30]   SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition [EB/OL]. [2015-04-10]. https://arxiv.org/pdf/1409.1556.pdf.
[31]   XIE S, TU Z. Holistically-nested edge detection [C]// Proceedings of the IEEE International Conference on Computer Vision. Santiago: IEEE, 2015: 1395-1403.
[32]   SONG X, FENG F, LIU J, et al. Neurostylist: Neural compatibility modeling for clothing matching [C]// Proceedings of the 25th ACM International Conference on Multimedia. Mountain View: ACM, 2017: 753-761.
[33]   CANNY J A computational approach to edge detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1986, 8 (6): 679- 698
[34]   KINGMA D P, BA J. Adam: a method for stochastic optimization [EB/OL]. (2017-01-30)[2023-04-18]. https://arxiv.org/pdf/1412.6980.pdf.
[35]   HEUSEL M, RAMSAUER H, UNTERTHINER T, et al. Gans trained by a two time-scale update rule converge to a local nash equilibrium [C]// Advances in Neural Information Processing Systems. Long Beach: CA, 2017: 6626-6637.
[36]   SALIMANS T, GOODFELLOW I, ZAREMBA W, et al. Improved techniques for training gans [C]// Advances in Neural Information Processing Systems. Barcelona: CA, 2016: 2226-2234.
[37]   WANG Z, BOVIK A C, SHEIKH H R, et al Image quality assessment: from error visibility to structural similarity[J]. IEEE Transactions on Image Processing, 2004, 13 (4): 600- 612
doi: 10.1109/TIP.2003.819861
[38]   SZEGEDY C, VANHOUCKE V, IOFFE S, et al. Rethinking the inception architecture for computer vision [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 2818-2826.
[1] Yu-ting SU,Rong-xuan LU,Wei ZHANG. Vehicle re-identification algorithm based on attention mechanism and adaptive weight[J]. Journal of ZheJiang University (Engineering Science), 2023, 57(4): 712-718.
[2] Qing-lu MA,Jia-ping LU,Xiao-yao TANG,Xue-feng DUAN. Improved YOLOv5s flame and smoke detection method in road tunnels[J]. Journal of ZheJiang University (Engineering Science), 2023, 57(4): 784-794.
[3] Yao ZENG,Fa-qin GAO. Surface defect detection algorithm of electronic components based on improved YOLOv5[J]. Journal of ZheJiang University (Engineering Science), 2023, 57(3): 455-465.
[4] Huan LAN,Jian-bo YU. Steel surface defect detection based on deep learning 3D reconstruction[J]. Journal of ZheJiang University (Engineering Science), 2023, 57(3): 466-476.
[5] Ju-xiang ZENG,Ping-hui WANG,Yi-dong DING,Lin LAN,Lin-xi CAI,Xiao-hong GUAN. Graph neural network based node embedding enhancement model for node classification[J]. Journal of ZheJiang University (Engineering Science), 2023, 57(2): 219-225.
[6] Jian-sha LU,Qin BAO,Hong-tao TANG,Yi-ping SHAO,Wen-bin ZHAO. Optimal tag selection method for device-free human tracking system[J]. Journal of ZheJiang University (Engineering Science), 2023, 57(2): 415-425.
[7] Jun-chi MA,Xiao-xin DI,Zong-tao DUAN,Lei TANG. Survey on program representation learning[J]. Journal of ZheJiang University (Engineering Science), 2023, 57(1): 155-169.
[8] Chen YE,Hong-fei ZHAN,Ying-jun LIN,Jun-he YU,Rui WANG,Wu-chang ZHONG. Design knowledge recommendation based on inference-context-aware activation model[J]. Journal of ZheJiang University (Engineering Science), 2023, 57(1): 32-46.
[9] Jin-zhen LIU,Fei CHEN,Hui XIONG. Open electrical impedance imaging algorithm based on multi-scale residual network model[J]. Journal of ZheJiang University (Engineering Science), 2022, 56(9): 1789-1795.
[10] Wan-liang WANG,Tie-jun WANG,Jia-cheng CHEN,Wen-bo YOU. Medical image segmentation method combining multi-scale and multi-head attention[J]. Journal of ZheJiang University (Engineering Science), 2022, 56(9): 1796-1805.
[11] Kun HAO,Kuo WANG,Bei-bei WANG. Lightweight underwater biological detection algorithm based on improved Mobilenet-YOLOv3[J]. Journal of ZheJiang University (Engineering Science), 2022, 56(8): 1622-1632.
[12] Yong-sheng ZHAO,Rui-xiang LI,Na-na NIU,Zhi-yong ZHAO. Shape control method of fuselage driven by digital twin[J]. Journal of ZheJiang University (Engineering Science), 2022, 56(7): 1457-1463.
[13] Wen-chao BAI,Xi-xian HAN,Jin-bao WANG. Efficient approximate query processing framework based on conditional generative model[J]. Journal of ZheJiang University (Engineering Science), 2022, 56(5): 995-1005.
[14] Li HE,Shan-min PANG. Face reconstruction from voice based on age-supervised learning and face prior information[J]. Journal of ZheJiang University (Engineering Science), 2022, 56(5): 1006-1016.
[15] Xue-qin ZHANG,Tian-ren LI. Breast cancer pathological image classification based on Cycle-GAN and improved DPN network[J]. Journal of ZheJiang University (Engineering Science), 2022, 56(4): 727-735.