114 |
HAO Y R, CHI Z W, DONG L, et al. Optimizing Prompts For Text-To-Image Generation[Z]. (2022-12-19). https://doi.org/10.48550/arXiv.2212.09611.
|
115 |
WITTEVEEN S, ANDREWS M. Investigating Prompt Engineering in Diffusion Models[Z]. (2022-11-21). https://doi.org/10.48550/arXiv.2211.15462.
|
116 |
WANG Z J, MONTOYA E, MUNECHIKA D, et al. DiffusionDB: A Large-scale Prompt Gallery Dataset for Text-to-Image Generative Models[Z]. (2022-10-26). https://doi.org/10.48550/arXiv.2210.14896.
|
117 |
SONG J M, MENG C L, ERMON S. Denoising Diffusion Implicit Models[Z]. (2020-10-06). https://doi.org/10.48550/arXiv.2010.02502.
|
118 |
LU C, ZHOU Y H, BAO F, et al. DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps[Z]. (2022-06-02). https://doi.org/10.48550/arXiv.2206.00927.
|
119 |
LU C, ZHOU Y H, BAO F, et al. DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models[Z]. (2022-11-02). https://doi.org/10.48550/arXiv.2211.01095.
|
120 |
ZHANG Q S, TAO M L, CHEN Y X. gDDIM: Generalized Denoising Diffusion Implicit Models[Z]. (2022-06-11). https://doi.org/10.48550/arXiv.2206.05564.
|
121 |
BAO F, LI C X, ZHU J, et al. Analytic-DPM: An Analytic Estimate of the Optimal Reverse Variance in Diffusion Probabilistic Models[Z]. (2022-01-17). https://doi.org/10.48550/arXiv.2201.06503.
|
122 |
LUHMAN E, LUHMAN T. Knowledge Distillation in Iterative Generative Models for Improved Sampling Speed[Z]. (2021-01-07). https://doi.org/10.48550/arXiv.2101.02388.
|
123 |
SALIMANS T, HO J. Progressive Distillation for Fast Sampling of Diffusion Models[Z]. (2022-02-01). https://doi.org/10.48550/arXiv.2202.00512.
|
124 |
MENG C L, ROMBACH R, GAO R Q, et al. On distillation of guided diffusion models[C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023: 14297-14306. DOI:10.1109/cvpr52729.2023.01374
|
125 |
BAO F, NIE S, XUE K W, et al. All are worth words: A ViT backbone for diffusion models[C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023: 22669-22679. DOI:10.1109/cvpr52729.2023.02171
|
126 |
PEEBLES W, XIE S. Scalable Diffusion Models with Transformers[Z]. (2022-12-19). https://doi.org/10.48550/arXiv.2212.09748.
|
1 |
WEI L Y, LEFEBVRE S, KWATRA V, et al. State of the Art in Example-Based Texture Synthesis[R]. Eindhoven: Eurographics Association, 2009: 93-117.
|
2 |
HAN C, RISSER E, RAMAMOORTHI R, et al. Multiscale texture synthesis[J]. ACM Transactions on Graphics, 2008, 27(3): 1-8. DOI:10.1145/1360612.1360650
|
3 |
MAKTHAL S, ROSS A. Synthesis of iris images using Markov random fields[C]// 2005 13th European Signal Processing Conference. Antalya: IEEE, 2005: 1-4.
|
4 |
OSINDERO S, HINTON G E. Modeling image patches with a directed hierarchy of Markov random fields[J]. Advances in Neural Information Processing Systems, 2008, 20: 1121-1128.
|
5 |
GOODFELLOW I, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial networks[J]. Communications of the ACM, 2020, 63(11): 139-144. DOI:10.1145/3422622
|
6 |
MIRZA M, OSINDERO S. Conditional Generative Adversarial Nets[Z]. (2014-11-06). https://doi.org/10.48550/arXiv.1411.1784.
|
7 |
VAN DEN OORD A, KALCHBRENNER N, VINYALS O, et al. Conditional Image Generation with PixelCNN Decoders[Z]. (2016-06-16). https://doi.org/10.48550/arXiv.1606.05328.
|
8 |
SALIMANS T, KARPATHY A, CHEN X, et al. PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications[Z]. (2017-01-19). https://doi.org/10.48550/arXiv.1701.05517.
|
9 |
KINGMA D P, WELLING M. Auto-Encoding Variational Bayes[Z]. (2013-12-20). https://doi.org/10.48550/arXiv.1312.6114.
|
10 |
DINH L, KRUEGER D, BENGIO Y. NICE: Non-Linear Independent Components Estimation[Z]. (2014-10-30). https://doi.org/10.48550/arXiv.1410.8516.
|
11 |
DINH L, SOHL-DICKSTEIN J, BENGIO S. Density Estimation Using Real NVP[Z]. (2016-05-27). https://doi.org/10.48550/arXiv.1605.08803.
|
12 |
LECUN Y, CHOPRA S, HADSELL R, et al. A tutorial on energy-based learning[C]// BAKIR G, HOFMANN T, SCHÖLKOPF B. Predicting Structured Data. Cambridge: MIT Press, 2006. DOI:10.7551/mitpress/7443.003.0014
|
13 |
NGIAM J, CHEN Z, KOH P W, et al. Learning deep energy models[C]// 28th International Conference on Machine Learning. Bellevue: Omnipress, 2011: 1105-1112.
|
14 |
HO J, JAIN A, ABBEEL P. Denoising Diffusion Probabilistic Models[Z]. (2020-06-19). https://doi.org/10.48550/arXiv.2006.11239.
|
15 |
SONG Y, ERMON S. Generative modeling by estimating gradients of the data distribution[C]// Thirty-third Conference on Neural Information Processing Systems (NeurIPS). Vancouver: NeurIPS, 2019.
|
16 |
ZHANG H, XU T, LI H S, et al. StackGAN: Text to photo-realistic image synthesis with stacked generative adversarial networks[C]// 2017 IEEE International Conference on Computer Vision (ICCV). Venice: IEEE, 2017: 5908-5916. DOI:10.1109/iccv.2017.629
|
17 |
XU T, ZHANG P C, HUANG Q Y, et al. AttnGAN: Fine-grained text to image generation with attentional generative adversarial networks[C]// 2018 IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 1316-1324. DOI:10.1109/cvpr.2018.00143
|
18 |
DHARIWAL P, NICHOL A. Diffusion models beat GANs on image synthesis[J]. Advances in Neural Information Processing Systems, 2021, 34: 8780-8794.
|
19 |
RAMESH A, PAVLOV M, GOH G, et al. Zero-shot text-to-image generation[C]// International Conference on Machine Learning. Online: PMLR, 2021: 8821-8831.
|
20 |
KARRAS T, LAINE S, AILA T. A style-based generator architecture for generative adversarial networks[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 4401-4410. DOI:10.1109/cvpr.2019.00453
|
21 |
WU H H, SEETHARAMAN P, KUMAR K, et al. Wav2CLIP: Learning robust audio representations from CLIP[C]// 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Online: IEEE, 2022: 4563-4567. DOI:10.1109/icassp43922.2022.9747669
|
22 |
SONG Y, SOHL-DICKSTEIN J, KINGMA D P, et al. Score-Based Generative Modeling Through Stochastic Differential Equations[Z]. (2020-11-26). https://doi.org/10.48550/arXiv.2011.13456.
|
23 |
DHARIWAL P, NICHOL A. Diffusion models beat GANs on image synthesis[J]. Advances in Neural Information Processing Systems, 2021, 34: 8780-8794.
|
24 |
NICHOL A, DHARIWAL P. Improved Denoising Diffusion Probabilistic Models[Z]. (2021-02-18). https://doi.org/10.48550/arXiv.2102.09672.
|
25 |
NICHOL A, DHARIWAL P, RAMESH A, et al. GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models[Z]. (2021-12-20). https://doi.org/10.48550/arXiv.2112.10741.
|
26 |
RONNEBERGER O, FISCHER P, BROX T. U-Net: Convolutional networks for biomedical image segmentation[C]// 18th International Conference on Medical Image Computing and Computer-Assisted Intervention. Munich: MICCAI, 2015: 234-241. DOI:10.1007/978-3-319-24574-4_28
|
27 |
SOHL-DICKSTEIN J, WEISS E, MAHESWARANATHAN N, et al. Deep unsupervised learning using nonequilibrium thermodynamics[C]// 32nd International Conference on Machine Learning. Lille: PMLR, 2015: 2256-2265.
|
28 |
SONG Y, ERMON S. Improved techniques for training score-based generative models[J]. Advances in Neural Information Processing Systems, 2020, 33: 12438-12448. DOI:10.48550/arXiv.2006.09011
|
29 |
SONG Y, DURKAN C, MURRAY I, et al. Maximum likelihood training of score-based diffusion models[J]. Advances in Neural Information Processing Systems, 2021, 34: 1415-1428.
|
30 |
BROCK A, DONAHUE J, SIMONYAN K. Large Scale GAN Training for High Fidelity Natural Image Synthesis[Z]. (2018-09-28). https://doi.org/10.48550/arXiv.1809.11096.
|
31 |
HO J, SALIMANS T. Classifier-Free Diffusion Guidance[Z]. (2022-07-26). https://doi.org/10.48550/arXiv.2207.12598.
|
32 |
ROMBACH R, BLATTMANN A, LORENZ D, et al. High-Resolution Image Synthesis with Latent Diffusion Models[Z]. (2021-12-20). https://doi.org/10.48550/arXiv.2112.10752.
|
33 |
RAMESH A, DHARIWAL P, NICHOL A, et al. Hierarchical Text-Conditional Image Generation with CLIP Latents[Z]. (2022-04-13). https://doi.org/10.48550/arXiv.2204.06125.
|
34 |
SAHARIA C, CHAN W, SAXENA S, et al. Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding[Z]. (2022-05-23). https://doi.org/10.48550/arXiv.2205.11487.
|
35 |
RADFORD A, KIM J W, HALLACY C, et al. Learning transferable visual models from natural language supervision[C]// International Conference on Machine Learning. Online: PMLR, 2021: 8748-8763.
|
36 |
LIU X, PARK D H, AZADI S, et al. More control for free image synthesis with semantic diffusion guidance[C]// Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. Waikoloa: IEEE, 2023: 289-299. DOI:10.1109/wacv56688.2023.00037
|
37 |
AVRAHAMI O, LISCHINSKI D, FRIED O. Blended diffusion for text-driven editing of natural images[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans: IEEE, 2022: 18187-18197. DOI:10.1109/CVPR52688.2022.01767
|
38 |
KWON M, JEONG J, UH Y. Diffusion Models Already Have a Semantic Latent Space[Z]. (2022-10-20). https://doi.org/10.48550/arXiv.2210.10960.
|
39 |
KIM G, KWON T, YE J C. DiffusionCLIP: Text-guided diffusion models for robust image manipulation[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022: 2426-2435. DOI:10.1109/cvpr52688.2022.00246
|
40 |
GU S Y, CHEN D, BAO J M, et al. Vector quantized diffusion model for text-to-image synthesis[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022: 10696-10706. DOI:10.1109/cvpr52688.2022.01043
|
41 |
HO J, SAHARIA C, CHAN W, et al. Cascaded diffusion models for high fidelity image generation[J]. The Journal of Machine Learning Research, 2022, 23(47): 1-33.
|
42 |
SAHARIA C, HO J, CHAN W, et al. Image super-resolution via iterative refinement[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 45(4): 4713-4726. DOI:10.1109/tpami.2022.3204461
|
43 |
YOUNG P, LAI A, HODOSH M, et al. From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions[J]. Transactions of the Association for Computational Linguistics, 2014, 2: 67-78. DOI:10.1162/tacl_a_00166
|
44 |
LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: Common objects in context[C]// 13th European Conference on Computer Vision (ECCV). Zurich: Springer, 2014: 740-755. DOI:10.1007/978-3-319-10602-1_48
|
45 |
CHANGPINYO S, SHARMA P, DING N, et al. Conceptual 12M: Pushing web-scale image-text pre-training to recognize long-tail visual concepts[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 3558-3568. DOI:10.1109/cvpr46437.2021.00356
|
46 |
SRINIVASAN K, RAMAN K, CHEN J, et al. WIT: Wikipedia-based image text dataset for multimodal multilingual machine learning[C]// 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. Online: ACM, 2021: 2443-2449. DOI:10.1145/3404835.3463257
|
47 |
GU J X, MENG X J, LU G S, et al. Wukong: 100 Million Large-Scale Chinese Cross-Modal Pre-Training Dataset and a Foundation Framework[Z]. (2022-02-14). https://doi.org/10.48550/arXiv.2202.06767.
|
48 |
SCHUHMANN C, VENCU R, BEAUMONT R, et al. LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs[Z]. (2021-11-03). https://doi.org/10.48550/arXiv.2111.02114.
|
49 |
BYEON M, PARK B, KIM H, et al. COYO-700M: Image-Text Pair Dataset[Z]. https://github.com/kakaobrain/coyo-dataset.
|
50 |
SCHUHMANN C, BEAUMONT R, VENCU R, et al. LAION-5B: An Open Large-Scale Dataset for Training Next Generation Image-Text Models[Z]. (2022-10-16). https://doi.org/10.48550/arXiv.2210.08402.
|
51 |
FENG Z, ZHANG Z, YU X, et al. ERNIE-ViLG 2.0: Improving Text-to-Image Diffusion Model with Knowledge-Enhanced Mixture-of-Denoising-Experts[Z]. (2022-10-27). https://doi.org/10.48550/arXiv.2210.15257.
|
52 |
BALAJI Y, NAH S, HUANG X, et al. eDiff-I: Text-To-Image Diffusion Models with an Ensemble of Expert Denoisers[Z]. (2022-11-02). https://doi.org/10.48550/arXiv.2211.01324.
|
53 |
HOOGEBOOM E, HEEK J, SALIMANS T. Simple Diffusion: End-to-End Diffusion for High Resolution Images[Z]. (2023-01-26). https://doi.org/10.48550/arXiv.2301.11093.
|
54 |
ESSER P, ROMBACH R, OMMER B. Taming transformers for high-resolution image synthesis[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 12868-12878. DOI:10.1109/cvpr46437.2021.01268
|
55 |
RAFFEL C, SHAZEER N, ROBERTS A, et al. Exploring the limits of transfer learning with a unified text-to-text transformer[J]. The Journal of Machine Learning Research, 2020, 21(1): 5485-5551.
|
56 |
BORJI A. Generated Faces in the Wild: Quantitative Comparison of Stable Diffusion, Midjourney and DALL-E 2[Z]. (2022-10-02). https://doi.org/10.48550/arXiv.2210.00586.
|
57 |
YE H, YANG X, TAKAC M, et al. Improving Text-to-Image Synthesis Using Contrastive Learning[Z]. (2021-07-06). https://doi.org/10.48550/arXiv.2107.02423.
|
58 |
ZHANG H, KOH J Y, BALDRIDGE J, et al. Cross-modal contrastive learning for text-to-image generation[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 833-842. DOI:10.1109/cvpr46437.2021.00089
|
59 |
ZHOU Y F, ZHANG R Y, CHEN C Y, et al. LAFITE: Towards Language-Free Training for Text-to-Image Generation[Z]. (2021-11-27). https://doi.org/10.48550/arXiv.2111.13792.
|
60 |
DING M, ZHENG W D, HONG W Y, et al. CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers[Z]. (2022-04-28). https://doi.org/10.48550/arXiv.2204.14217.
|
61 |
GAFNI O, POLYAK A, ASHUAL O, et al. Make-a-scene: Scene-based text-to-image generation with human priors[C]// 17th European Conference on Computer Vision. Tel Aviv: Springer, 2022: 89-106. DOI:10.1007/978-3-031-19784-0_6
|
62 |
YU J H, XU Y Z, KOH J Y, et al. Scaling Autoregressive Models for Content-Rich Text-to-Image Generation[Z]. (2022-06-22). https://doi.org/10.48550/arXiv.2206.10789.
|
63 |
LEE K M, LIU H, RYU M, et al. Aligning Text-To-Image Models Using Human Feedback[Z]. (2023-02-23). https://doi.org/10.48550/arXiv.2302.12192.
|
64 |
ZHANG Q S, SONG J M, HUANG X, et al. DiffCollage: Parallel Generation of Large Content with Diffusion Models[Z]. (2023-03-30). https://doi.org/10.48550/arXiv.2303.17076.
|
65 |
SCHRAMOWSKI P, BRACK M, DEISEROTH B, et al. Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models[Z]. (2022-11-09). https://doi.org/10.48550/arXiv.2211.05105.
|
66 |
FRIEDRICH F, SCHRAMOWSKI P, BRACK M, et al. Fair Diffusion: Instructing Text-to-Image Generation Models on Fairness[Z]. (2023-02-07). https://doi.org/10.48550/arXiv.2302.10893.
|
67 |
ZHU Y, WU Y, OLSZEWSKI K, et al. Discrete contrastive diffusion for cross-modal music and image generation[C]// The Eleventh International Conference on Learning Representations. Kigali: ICLR, 2023.
|
68 |
LIU N, LI S, DU Y L, et al. Compositional visual generation with composable diffusion models[C]// 17th European Conference on Computer Vision. Tel Aviv: Springer, 2022: 423-439. DOI:10.1007/978-3-031-19790-1_26
|
69 |
LIEW J H, YAN H, ZHOU D, et al. MagicMix: Semantic Mixing with Diffusion Models[Z]. (2022-10-28). https://doi.org/10.48550/arXiv.2210.16056.
|
70 |
MA W D K, LEWIS J P, KLEIJN W B, et al. Directed Diffusion: Direct Control of Object Placement through Attention Guidance[Z]. (2023-02-25). https://doi.org/10.48550/arXiv.2302.13153.
|
71 |
CHEFER H, ALALUF Y, VINKER Y, et al. Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models[Z]. (2023-01-31). https://doi.org/10.48550/arXiv.2301.13826.
|
72 |
GRAVE E, JOULIN A, USUNIER N. Improving Neural Language Models with a Continuous Cache[Z]. (2016-12-13). https://doi.org/10.48550/arXiv.1612.04426.
|
73 |
ROMBACH R, BLATTMANN A, OMMER B. Text-Guided Synthesis of Artistic Images with Retrieval-Augmented Diffusion Models[Z]. (2022-07-26). https://doi.org/10.48550/arXiv.2207.13038.
|
74 |
BLATTMANN A, ROMBACH R, OKTAY K, et al. Retrieval-augmented diffusion models[J]. Advances in Neural Information Processing Systems, 2022, 35: 15309-15324.
|
75 |
CHEN W H, HU H X, SAHARIA C, et al. Re-Imagen: Retrieval-Augmented Text-to-Image Generator[Z]. (2022-09-29). https://doi.org/10.48550/arXiv.2209.14491.
|
76 |
SHEYNIN S, ASHUAL O, POLYAK A, et al. KNN-Diffusion: Image Generation via Large-Scale Retrieval[Z]. (2022-04-06). https://doi.org/10.48550/arXiv.2204.02849.
|
77 |
GAL R, ALALUF Y, ATZMON Y, et al. An Image is Worth One Word: Personalizing Text-to-Image Generation Using Textual Inversion[Z]. (2022-08-02). https://doi.org/10.48550/arXiv.2208.01618.
|
78 |
RUIZ N, LI Y, JAMPANI V, et al. DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation[Z]. (2022-08-25). https://doi.org/10.48550/arXiv.2208.12242.
|
79 |
DONG Z Y, WEI P X, LIN L. DreamArtist: Towards Controllable One-Shot Text-to-Image Generation via Contrastive Prompt-Tuning[Z]. (2022-11-21). https://doi.org/10.48550/arXiv.2211.11337.
|
80 |
KUMARI N, ZHANG B, ZHANG R, et al. Multi-Concept Customization of Text-to-Image Diffusion[Z]. (2022-12-08). https://doi.org/10.48550/arXiv.2212.04488.
|
81 |
WEI Y, ZHANG Y, JI Z, et al. ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation[Z]. (2023-02-27). https://doi.org/10.48550/arXiv.2302.13848.
|
82 |
LIU Z H, FENG R L, ZHU K, et al. Cones: Concept Neurons in Diffusion Models for Customized Generation[Z]. (2023-03-09). https://doi.org/10.48550/arXiv.2303.05125.
|
83 |
HAN L G, LI Y X, ZHANG H, et al. SVDiff: Compact Parameter Space for Diffusion Fine-Tuning[Z]. (2023-03-20). https://doi.org/10.48550/arXiv.2303.11305.
|
84 |
PATASHNIK O, GARIBI D, AZURI I, et al. Localizing Object-Level Shape Variations with Text-to-Image Diffusion Models[Z]. (2023-03-20). https://doi.org/10.48550/arXiv.2303.11306.
|
85 |
HUANG Z Q, WU T X, JIANG Y M, et al. ReVersion: Diffusion-Based Relation Inversion from Images[Z]. (2023-03-23). https://doi.org/10.48550/arXiv.2303.13495.
|
86 |
WANG T F, ZHANG T, ZHANG B, et al. Pretraining is All You Need for Image-to-Image Translation[Z]. (2022-05-25). https://doi.org/10.48550/arXiv.2205.12952.
|
87 |
VOYNOV A, ABERMAN K, COHEN-OR D. Sketch-Guided Text-to-Image Diffusion Models[Z]. (2022-11-24). https://doi.org/10.48550/arXiv.2211.13752.
|
88 |
MAUNGMAUNG A, SHING M, MITSUI K, et al. Text-Guided Scene Sketch-to-Photo Synthesis[Z]. (2023-02-14). https://doi.org/10.48550/arXiv.2302.06883.
|
89 |
CHENG S I, CHEN Y J, CHIU W C, et al. Adaptively-realistic image generation from stroke and sketch with diffusion model[C]// 2023 IEEE/CVF Winter Conference on Applications of Computer Vision. Waikoloa: IEEE, 2023: 4043-4051. DOI:10.1109/wacv56688.2023.00404
|
90 |
PENG Y C, ZHAO C Q, XIE H R, et al. DiffFaceSketch: High-Fidelity Face Image Synthesis with Sketch-Guided Latent Diffusion Model[Z]. (2023-02-14). https://doi.org/10.48550/arXiv.2302.06908.
|
91 |
CHENG J X, LIANG X, SHI X J, et al. LayoutDiffuse: Adapting Foundational Diffusion Models for Layout-to-Image Generation[Z]. (2023-02-16). https://doi.org/10.48550/arXiv.2302.08908.
|
92 |
BAR-TAL O, YARIV L, LIPMAN Y, et al. MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation[Z]. (2023-02-16). https://doi.org/10.48550/arXiv.2302.08113.
|
93 |
AVRAHAMI O, HAYES T, GAFNI O, et al. SpaText: Spatio-textual representation for controllable image generation[C]// 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Vancouver: IEEE, 2023: 18370-18380. DOI:10.1109/CVPR52729.2023.01762
|
94 |
HAM C, HAYS J, LU J, et al. Modulating Pretrained Diffusion Models for Multimodal Image Synthesis[Z]. (2023-02-24). https://doi.org/10.48550/arXiv.2302.12764.
|
95 |
YANG L, HUANG Z L, SONG Y, et al. Diffusion-Based Scene Graph to Image Generation with Masked Contrastive Pre-Training[Z]. (2022-11-21). https://doi.org/10.48550/arXiv.2211.11138.
|
96 |
LI Y H, LIU H T, WU Q Y, et al. GLIGEN: Open-Set Grounded Text-to-Image Generation[Z]. (2023-01-17). https://doi.org/10.48550/arXiv.2301.07093.
|
97 |
SARUKKAI V, LI L, MA A, et al. Collage Diffusion[Z]. (2023-03-01). https://doi.org/10.48550/arXiv.2303.00262.
|
98 |
ZHANG L, AGRAWALA M. Adding Conditional Control to Text-to-Image Diffusion Models[Z]. (2023-02-10). https://doi.org/10.48550/arXiv.2302.05543.
|
99 |
HUANG L H, CHEN D, LIU Y, et al. Composer: Creative and Controllable Image Synthesis with Composable Conditions[Z]. (2023-02-20). https://doi.org/10.48550/arXiv.2302.09778.
|
100 |
YU J W, WANG Y H, ZHAO C, et al. FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model[Z]. (2023-03-17). https://doi.org/10.48550/arXiv.2303.09833.
|
101 |
LUGMAYR A, DANELLJAN M, ROMERO A, et al. RePaint: Inpainting using Denoising Diffusion Probabilistic Models[Z]. (2022-01-24). https://doi.org/10.48550/arXiv.2201.09865.
|
102 |
LI W B, YU X, ZHOU K, et al. SDM: Spatial Diffusion Model for Large Hole Image Inpainting[Z]. (2022-12-06). https://doi.org/10.48550/arXiv.2212.02963.
|
103 |
LI R, TAN R T, CHEONG L F. All in one bad weather removal using architectural search[C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 3172-3182. DOI:10.1109/cvpr42600.2020.00324
|
104 |
CHEN H T, WANG Y H, GUO T Y, et al. Pre-trained image processing transformer[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 12294-12305. DOI:10.1109/cvpr46437.2021.01212
|
105 |
ZHU Y R, WANG T Y, FU X Y, et al. Learning Weather-General and Weather-Specific Features for Image Restoration Under Multiple Adverse Weather Conditions[C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023: 21747-21758. DOI:10.1109/cvpr52729.2023.02083
|
106 |
KAWAR B, ELAD M, ERMON S, et al. Denoising Diffusion Restoration Models[Z]. (2022-01-27). https://doi.org/10.48550/arXiv.2201.11793.
|
107 |
WANG Y H, YU J W, ZHANG J. Zero-Shot Image Restoration Using Denoising Diffusion Null-Space Model[Z]. (2022-12-01). https://doi.org/10.48550/arXiv.2212.00490.
|
108 |
SAHARIA C, CHAN W, CHANG H, et al. Palette: Image-to-image diffusion models[C]// ACM SIGGRAPH 2022. Vancouver: ACM, 2022: 1-10. DOI:10.1145/3528233.3530757
|
109 |
PAN X C, QIN P D, LI Y H, et al. Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models[Z]. (2022-11-20). https://doi.org/10.48550/arXiv.2211.10950.
|
110 |
JEONG H, KWON G, YE J C. Zero-shot Generation of Coherent Storybook from Plain Text Story using Diffusion Models[Z]. (2023-02-08). https://doi.org/10.48550/arXiv.2302.03900.
|
111 |
NIKANKIN Y, HAIM N, IRANI M. SinFusion: Training Diffusion Models on a Single Image or Video[Z]. (2022-11-21). https://doi.org/10.48550/arXiv.2211.11743.
|
112 |
ZHAO Y Q, PANG T Y, DU C, et al. A Recipe for Watermarking Diffusion Models[Z]. (2023-03-17). https://doi.org/10.48550/arXiv.2303.10137.
|