Journal of Zhejiang University (Engineering Science)  2023, Vol. 57, Issue (5): 939-947    DOI: 10.3785/j.issn.1008-973X.2023.05.010
Computer Technology and Control Engineering
Sketch-based compatible clothing image generation
Xiao-lu CAO, Fu-nan LU, Xiang ZHU, Li-bo WENG, Shu-fang LU*, Fei GAO
College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China
Full text: PDF (2843 KB)   HTML
Abstract:

A new sketch-based method for generating compatible clothing images was proposed, aiming at the lack of diversity and compatibility in existing clothing image generation methods. Users were allowed to input both a sketch and a reference clothing image to generate diverse clothing images that were faithful to the sketch in content and compatible with the reference clothing in style. A novel network framework consisting of two encoding networks and one decoding network was designed, in which the encoding networks extracted the features of the reference clothing and the user-drawn sketch, and the decoding network generated the image. An authenticity discrimination network and a compatibility discrimination network were constructed, and a joint loss function combining adversarial loss, reconstruction loss, perceptual loss, style loss and edge loss was designed to guide the network to generate realistic clothing images compatible with the style of the reference clothing image. Quantitative experimental results showed that the proposed method improved the quality of the generated images and outperformed the baseline methods overall. Qualitative experimental results showed that the generated images conformed better to the sketch description and that diverse results could be generated.

Key words: clothing image generation; deep learning; generative adversarial network; clothing compatibility; image translation
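
To make the summary above concrete, the following is a minimal PyTorch-style sketch of the described framework: two encoding networks (one for the sketch, one for the reference clothing) feeding a single decoding network, trained with the five-term joint loss. Everything here, including the module names (Generator, Encoder, Decoder), the concatenation fusion, the Sobel stand-in for the HED/Canny edge maps cited in the paper, and all loss weights, is a hypothetical illustration under stated assumptions, not the authors' implementation.

```python
# Minimal sketch of the two-encoder / one-decoder framework and the
# five-term joint loss described in the abstract. All names, the fusion
# choice and the loss weights are placeholders, not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(cin, cout):
    return nn.Sequential(
        nn.Conv2d(cin, cout, 4, stride=2, padding=1),
        nn.InstanceNorm2d(cout),
        nn.LeakyReLU(0.2, inplace=True),
    )

class Encoder(nn.Module):
    """Feature extractor used for both the sketch and the reference image."""
    def __init__(self, cin):
        super().__init__()
        self.net = nn.Sequential(
            conv_block(cin, 64), conv_block(64, 128), conv_block(128, 256)
        )
    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    """Decodes the fused sketch/reference features into a clothing image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(512, 256, 4, 2, 1), nn.InstanceNorm2d(256), nn.ReLU(True),
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.InstanceNorm2d(128), nn.ReLU(True),
            nn.ConvTranspose2d(128, 3, 4, 2, 1), nn.Tanh(),
        )
    def forward(self, f):
        return self.net(f)

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc_sketch = Encoder(1)  # one-channel user-drawn sketch
        self.enc_ref = Encoder(3)     # three-channel reference clothing image
        self.dec = Decoder()
    def forward(self, sketch, ref):
        # Fuse content (sketch) and style (reference) features; concatenation
        # is a placeholder, the paper's actual fusion mechanism may differ.
        return self.dec(torch.cat([self.enc_sketch(sketch), self.enc_ref(ref)], dim=1))

def generator_loss(G, d_auth, d_comp, vgg_feats, sketch, ref, target,
                   w_adv=1.0, w_rec=10.0, w_per=1.0, w_sty=1.0, w_edge=1.0):
    """Five-term joint loss named in the abstract; weights are placeholders.
    d_auth / d_comp are the authenticity / compatibility discriminators;
    vgg_feats maps an image to a list of pretrained VGG feature maps."""
    fake = G(sketch, ref)
    # Adversarial terms from the two discriminators (non-saturating form).
    l_adv = -(d_auth(fake).mean() + d_comp(fake, ref).mean())
    # Pixel-level reconstruction against the ground-truth clothing image.
    l_rec = F.l1_loss(fake, target)
    # Perceptual and style (Gram-matrix) losses on VGG features.
    l_per = l_sty = 0.0
    for ff, rf in zip(vgg_feats(fake), vgg_feats(target)):
        l_per = l_per + F.l1_loss(ff, rf)
        b, c, h, w = ff.shape
        gram = lambda f: (f.flatten(2) @ f.flatten(2).transpose(1, 2)) / (c * h * w)
        l_sty = l_sty + F.l1_loss(gram(ff), gram(rf))
    # Edge loss: keep the generated image faithful to the sketch contours.
    # A differentiable Sobel magnitude stands in for the HED/Canny edge maps
    # cited by the paper; assumes a single-channel sketch edge map in [0, 1].
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]],
                      device=fake.device).view(1, 1, 3, 3)
    gray = fake.mean(dim=1, keepdim=True)
    gx = F.conv2d(gray, kx, padding=1)
    gy = F.conv2d(gray, kx.transpose(2, 3), padding=1)
    l_edge = F.l1_loss(torch.sqrt(gx ** 2 + gy ** 2 + 1e-6), sketch)
    return (w_adv * l_adv + w_rec * l_rec + w_per * l_per
            + w_sty * l_sty + w_edge * l_edge)
```

In such a setup the two discriminators would be trained in alternation with the generator, as is usual for GANs.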
Received: 2022-05-06    Published: 2023-05-09
CLC:  TP 391  
Supported by: Zhejiang Provincial Natural Science Foundation of China (LQ22F020008); "Pioneer" and "Leading Goose" R&D Program of Zhejiang Province (2022C01120)
Corresponding author: Shu-fang LU     E-mail: xiaolucao@outlook.com; sflu@zjut.edu.cn
About the author: Xiao-lu CAO (2000—), female, master's degree candidate, engaged in research on image processing. orcid.org/0000-0002-8494-2913. E-mail: xiaolucao@outlook.com
Cite this article:

Xiao-lu CAO, Fu-nan LU, Xiang ZHU, Li-bo WENG, Shu-fang LU, Fei GAO. Sketch-based compatible clothing image generation [J]. Journal of Zhejiang University (Engineering Science), 2023, 57(5): 939-947.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2023.05.010        https://www.zjujournals.com/eng/CN/Y2023/V57/I5/939

Fig. 1  Architecture of the sketch-based compatible clothing generation network
Fig. 2  Generator network architecture
Fig. 3  Discriminator network architecture
Method               SSIM     FID       IS
Pix2Pix[3]           0.7953   50.2372   4.7197
CycleGAN[5]          0.5462   72.3221   4.2285
DiscoGAN[12]         0.6525   51.0295   4.6999
CocosNet[18]         0.6541   38.0018   4.3841
CocosNet v2[19]      0.7659   23.4624   4.5026
Anime2Clothing[39]   0.6374   52.3293   4.8543
w/o CSC Block        0.7192   22.4064   4.8787
Ours                 0.7940   20.2023   4.9307
Tab. 1  Comparison of three metrics for images generated by different methods
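
The three columns of Tab. 1 are standard generative-image metrics: SSIM [37], FID [35] and IS [36, 38]. The paper does not publish its evaluation code, so the sketch below is for illustration only; it shows how these metrics are commonly computed, assuming the third-party torchmetrics package.

```python
# Illustrative computation of the Tab. 1 metrics with torchmetrics;
# a generic sketch, not the authors' evaluation pipeline.
import torch
from torchmetrics.image import StructuralSimilarityIndexMeasure
from torchmetrics.image.fid import FrechetInceptionDistance
from torchmetrics.image.inception import InceptionScore

fake = torch.rand(8, 3, 256, 256)  # generated images in [0, 1] (toy batch)
real = torch.rand(8, 3, 256, 256)  # ground-truth images in [0, 1]

# SSIM: structural similarity between generated and ground-truth images.
ssim = StructuralSimilarityIndexMeasure(data_range=1.0)
print("SSIM:", ssim(fake, real).item())

# FID and IS expect uint8 images in [0, 255] by default; in practice they
# are accumulated over the whole test set, not a tiny batch like this one.
fake_u8 = (fake * 255).to(torch.uint8)
real_u8 = (real * 255).to(torch.uint8)

fid = FrechetInceptionDistance(feature=2048)
fid.update(real_u8, real=True)
fid.update(fake_u8, real=False)
print("FID:", fid.compute().item())  # lower is better

inception = InceptionScore()
inception.update(fake_u8)
is_mean, is_std = inception.compute()
print("IS:", is_mean.item())  # higher is better
```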
Method               Authenticity score   Compatibility score
Pix2Pix[3]           5.44                 5.61
CycleGAN[5]          8.94                 9.67
DiscoGAN[12]         11.19                12.81
CocosNet[18]         14.78                14.56
CocosNet v2[19]      17.56                17.25
Anime2Clothing[39]   20.00                19.11
Ours                 22.08                21.00
Tab. 2  Results of subjective evaluation of image authenticity and compatibility
Fig. 4  Comparison of images generated by different methods for the same input
Fig. 5  Comparison of clothing images generated from the same sketch with different reference clothing images
Fig. 6  Comparison of clothing images generated from the same reference clothing image with different sketches
1 GOODFELLOW I J, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial nets [J]. Advances in Neural Information Processing Systems, 2014, 27: 2672-2680
2 KARRAS T, LAINE S, AILA T. A style-based generator architecture for generative adversarial networks [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 4401-4410.
3 ISOLA P, ZHU J Y, ZHOU T, et al. Image-to-image translation with conditional adversarial networks [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 5967-5976.
4 PARK T, LIU M Y, WANG T C, et al. Semantic image synthesis with spatially-adaptive normalization [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 2332-2341.
5 ZHU J Y, PARK T, ISOLA P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks [C]// Proceedings of the IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 2223-2232.
6 JOHNSON J, ALAHI A, FEI-FEI L. Perceptual losses for real-time style transfer and super-resolution [C]// Proceedings of the European Conference on Computer Vision. Cham: Springer, 2016: 694-711.
7 ZHAO Z, MA X. A compensation method of two-stage image generation for human-AI collaborated in-situ fashion design in augmented reality environment [C]// IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR). Taiwan: IEEE, 2018: 76-83.
8 YU C, HU Y, CHEN Y, et al. Personalized fashion design [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul: IEEE, 2019: 9045-9054.
9 HAN Y, YANG S, WANG W, et al. From design draft to real attire: unaligned fashion image translation [C]// Proceedings of the 28th ACM International Conference on Multimedia. Seattle: ACM, 2020: 1533-1541.
10 WANG T C, LIU M Y, ZHU J Y, et al. High-resolution image synthesis and semantic manipulation with conditional GANs [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 8798-8807.
11 CHANG Jia, WANG Yu-de, JI Yan-ni. Image translation algorithm based on improved U-Net generative adversarial network [J]. Communications Technology, 2020, 53(2): 327-334.
doi: 10.3969/j.issn.1002-0802.2020.02.011
12 KIM T, CHA M, KIM H, et al. Learning to discover cross-domain relations with generative adversarial networks [C]// Proceedings of the International Conference on Machine Learning. Sydney: PMLR, 2017: 1857-1865.
13 CHOI Y, CHOI M, KIM M, et al. StarGAN: unified generative adversarial networks for multi-domain image-to-image translation [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 8789-8797.
14 XU Y, XIE S, WU W, et al. Maximum spatial perturbation consistency for unpaired image-to-image translation [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022: 18311-18320.
15 LIU B, SONG K, ZHU Y, et al. Sketch-to-art: synthesizing stylized art images from sketches [C]// Proceedings of the Asian Conference on Computer Vision. Cham: Springer, 2020: 207-222.
16 CHOI Y, UH Y, YOO J, et al. StarGAN v2: diverse image synthesis for multiple domains [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 8185-8194.
17 RICHARDSON E, ALALUF Y, PATASHNIK O, et al. Encoding in style: a stylegan encoder for image-to-image translation [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 2287-2296.
18 ZHANG P, ZHANG B, CHEN D, et al. Cross-domain correspondence learning for exemplar-based image translation [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 5142-5152.
19 ZHOU X, ZHANG B, ZHANG T, et al. CoCosNet v2: full-resolution correspondence learning for image translation [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 11460-11470.
20 VASILEVA M I, PLUMMER B A, DUSAD K, et al. Learning type-aware embeddings for fashion compatibility [C]// Proceedings of the European Conference on Computer Vision (ECCV). Cham: Springer, 2018: 390-405.
21 HAN X, WU Z, JIANG Y G, et al. Learning fashion compatibility with bidirectional LSTMs [C]// Proceedings of the 25th ACM International Conference on Multimedia. Mountain View: ACM, 2017: 1078-1086.
22 TAUTKUTE I, TRZCIŃSKI T, SKORUPA A P, et al. DeepStyle: multimodal search engine for fashion and interior design [J]. IEEE Access, 2019, 7: 84613-84628
doi: 10.1109/ACCESS.2019.2923552
23 MCAULEY J, TARGETT C, SHI Q, et al. Image-based recommendations on styles and substitutes [C]// Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval. Santiago: ACM, 2015: 43-52.
24 VEIT A, KOVACS B, BELL S, et al. Learning visual clothing style with heterogeneous dyadic co-occurrences [C]// Proceedings of the IEEE International Conference on Computer Vision. Santiago: IEEE, 2015: 4642-4650.
25 LIN Y L, TRAN S, DAVIS L S. Fashion outfit complementary item retrieval [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 3308-3316.
26 LIU L, ZHANG H, ZHOU D. Clothing generation by multi-modal embedding: a compatibility matrix-regularized GAN model [J]. Image and Vision Computing, 2021, 107: 104097
doi: 10.1016/j.imavis.2021.104097
27 DONG X, SONG X, ZHENG N, et al. TryonCM2: try-on-enhanced fashion compatibility modeling framework [J]. IEEE Transactions on Neural Networks and Learning Systems, 2022: 1-12
28 WANG Y, WEI Y, QIAN X, et al. Sketch-guided scenery image outpainting [J]. IEEE Transactions on Image Processing, 2021, 30: 2643-2655
doi: 10.1109/TIP.2021.3054477
29 NGUYEN T, LE T, VU H, et al. Dual discriminator generative adversarial nets [EB/OL]. [2017-09-12]. https://arxiv.org/pdf/1709.03831.pdf.
30 SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition [EB/OL]. [2015-04-10]. https://arxiv.org/pdf/1409.1556.pdf.
31 XIE S, TU Z. Holistically-nested edge detection [C]// Proceedings of the IEEE International Conference on Computer Vision. Santiago: IEEE, 2015: 1395-1403.
32 SONG X, FENG F, LIU J, et al. NeuroStylist: neural compatibility modeling for clothing matching [C]// Proceedings of the 25th ACM International Conference on Multimedia. Mountain View: ACM, 2017: 753-761.
33 CANNY J. A computational approach to edge detection [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1986, 8(6): 679-698
34 KINGMA D P, BA J. Adam: a method for stochastic optimization [EB/OL]. (2017-01-30) [2023-04-18]. https://arxiv.org/pdf/1412.6980.pdf.
35 HEUSEL M, RAMSAUER H, UNTERTHINER T, et al. GANs trained by a two time-scale update rule converge to a local Nash equilibrium [C]// Advances in Neural Information Processing Systems. Long Beach, 2017: 6626-6637.
36 SALIMANS T, GOODFELLOW I, ZAREMBA W, et al. Improved techniques for training GANs [C]// Advances in Neural Information Processing Systems. Barcelona, 2016: 2226-2234.
37 WANG Z, BOVIK A C, SHEIKH H R, et al. Image quality assessment: from error visibility to structural similarity [J]. IEEE Transactions on Image Processing, 2004, 13(4): 600-612
doi: 10.1109/TIP.2003.819861
38 SZEGEDY C, VANHOUCKE V, IOFFE S, et al. Rethinking the inception architecture for computer vision [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 2818-2826.