Interactive Fashion Content Generation Using LLMs and Latent Diffusion Models
Fashionable image generation aims to synthesize images of diverse fashion prevalent around the globe, helping fashion designers with real-time visualization by giving them a basic customized structure of how a specific design preference would look in real life and what further improvements can be made...
Published in: | arXiv.org 2023-05 |
---|---|
Main Authors: | Krishna Sri Ipsit Mantri; Sasikumar, Nevasini |
Format: | Article |
Language: | English |
Subjects: | Audio data; Customer satisfaction; Fashion designers; Fashion models; Image enhancement; Image processing; Random noise; User satisfaction |
Online Access: | Get full text |
cited_by | |
---|---|
cites | |
container_end_page | |
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Krishna Sri Ipsit Mantri; Sasikumar, Nevasini |
description | Fashionable image generation aims to synthesize images of diverse fashion prevalent around the globe, helping fashion designers with real-time visualization by giving them a basic customized structure of how a specific design preference would look in real life and what further improvements can be made for enhanced customer satisfaction. Moreover, users themselves can interactively generate fashionable images by giving just a few simple prompts. Recently, diffusion models have gained popularity as generative models owing to their flexibility and their ability to generate realistic images from Gaussian noise. Latent diffusion models are generative models that use diffusion processes to model the generation of complex data such as images, audio, or text. They are called "latent" because they learn a hidden representation, or latent variable, of the data that captures its underlying structure. We propose a method that exploits the equivalence between diffusion models and energy-based models (EBMs) and suggests ways to compose multiple probability distributions. We describe a pipeline showing how our method can be used for new fashionable outfit generation and virtual try-on using LLM-guided text-to-image generation. Our results indicate that using an LLM to refine the prompts to the latent diffusion model helps generate globally creative and culturally diverse fashion styles while reducing bias. (Illustrative code sketches of the composition step and the LLM-guided pipeline appear after the record fields below.) |
format | article |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2023-05 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2824144632 |
source | Publicly Available Content Database (Proquest) (PQ_SDU_P3) |
subjects | Audio data; Customer satisfaction; Fashion designers; Fashion models; Image enhancement; Image processing; Random noise; User satisfaction |
title | Interactive Fashion Content Generation Using LLMs and Latent Diffusion Models |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T20%3A59%3A15IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Interactive%20Fashion%20Content%20Generation%20Using%20LLMs%20and%20Latent%20Diffusion%20Models&rft.jtitle=arXiv.org&rft.au=Krishna%20Sri%20Ipsit%20Mantri&rft.date=2023-05-15&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2824144632%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_28241446323%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2824144632&rft_id=info:pmid/&rfr_iscdi=true |
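The description above says the method exploits the equivalence between diffusion models and energy-based models to compose multiple probability distributions. The paper's own composition rule is not reproduced here; the following is a minimal sketch of one standard way such a composition is often realized, where the noise predictions for several concept prompts are combined around an unconditional prediction with per-concept guidance weights. The function name `compose_noise_predictions` and the toy tensors standing in for UNet outputs are illustrative assumptions, not the authors' code.

```python
# Minimal sketch (assumption): compose several concept-conditioned distributions by
# summing their guidance terms around an unconditional noise prediction, i.e.
#   eps = eps_uncond + sum_i w_i * (eps_cond_i - eps_uncond)
# which corresponds to a product of the concept distributions under the
# diffusion/score-based EBM view.
import torch


def compose_noise_predictions(
    eps_uncond: torch.Tensor,          # unconditional noise prediction eps(x_t)
    eps_conds: list[torch.Tensor],     # per-concept predictions eps(x_t, c_i)
    weights: list[float],              # guidance weight w_i for each concept
) -> torch.Tensor:
    """Combine per-concept noise predictions into one composed prediction."""
    composed = eps_uncond.clone()
    for eps_c, w in zip(eps_conds, weights):
        composed = composed + w * (eps_c - eps_uncond)
    return composed


if __name__ == "__main__":
    # Toy usage: random tensors stand in for UNet outputs at one denoising step.
    shape = (1, 4, 64, 64)  # typical latent-space shape for a latent diffusion model
    eps_u = torch.randn(shape)
    eps_saree = torch.randn(shape)  # e.g. concept "traditional saree silhouette"
    eps_denim = torch.randn(shape)  # e.g. concept "modern denim jacket"
    eps = compose_noise_predictions(eps_u, [eps_saree, eps_denim], [7.5, 5.0])
    print(eps.shape)
```

The composed prediction would replace the single guided prediction inside an ordinary denoising loop; the weights control how strongly each fashion concept shapes the final outfit.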
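The description also reports that an LLM refines the user's prompt before it reaches the latent diffusion model. The sketch below shows, under stated assumptions, what such a two-stage flow could look like with the Hugging Face diffusers library: the `refine_prompt` helper is a hypothetical stand-in for an actual LLM call, and the `runwayml/stable-diffusion-v1-5` checkpoint is an assumed choice, not necessarily the model used by the authors.

```python
# Minimal sketch (not the paper's implementation): an LLM rewrites a terse user
# request into a richer, culturally specific prompt, which is then passed to a
# latent diffusion text-to-image model. Assumes a CUDA GPU is available.
import torch
from diffusers import StableDiffusionPipeline


def refine_prompt(user_request: str) -> str:
    """Hypothetical stand-in for the LLM step: in practice this would call a chat
    model asking it to expand the request with fabric, silhouette, and cultural
    context while avoiding stereotyped defaults."""
    return (
        f"{user_request}, detailed fabric texture, full-body studio shot, "
        "drawing on a mix of global fashion traditions, photorealistic"
    )


def generate_outfit(user_request: str):
    refined = refine_prompt(user_request)
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",  # assumed checkpoint; any LDM checkpoint works
        torch_dtype=torch.float16,
    ).to("cuda")
    image = pipe(refined, guidance_scale=7.5, num_inference_steps=30).images[0]
    return refined, image


if __name__ == "__main__":
    prompt, img = generate_outfit("a summer saree-inspired dress in pastel colors")
    img.save("outfit.png")
```

Keeping the refinement step separate from the image model makes it easy to swap in a different LLM or to let the user iterate on the refined prompt interactively before an image is generated.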