AniSora: Exploring the Frontiers of Animation Video Generation in the Sora Era
Animation has gained significant interest in the recent film and TV industry. Despite the success of advanced video generation models like Sora, Kling, and CogVideoX in generating natural videos, they lack the same effectiveness in handling animation videos. Evaluating animation video generation is...
Published in: | arXiv.org 2024-12 |
---|---|
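Main Authors: | Jiang, Yudong ; Xu, Baohan ; Yang, Siqian ; Yin, Mingyu ; Liu, Jing ; Xu, Chao ; Wang, Siqi ; Wu, Yidi ; Zhu, Bingwen ; Zhang, Xinwen ; Zheng, Xingyu ; Xu, Jixuan ; Zhang, Yue ; Hou, Jinlong ; Sun, Huyang |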
Format: | Article |
Language: | English |
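Subjects: | Animation ; Benchmarks ; Controllability ; Data processing ; Datasets ; Human motion ; Image quality ; Spatiotemporal data ; Video data |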
Online Access: | Get full text |
cited_by | |
---|---|
cites | |
container_end_page | |
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Jiang, Yudong ; Xu, Baohan ; Yang, Siqian ; Yin, Mingyu ; Liu, Jing ; Xu, Chao ; Wang, Siqi ; Wu, Yidi ; Zhu, Bingwen ; Zhang, Xinwen ; Zheng, Xingyu ; Xu, Jixuan ; Zhang, Yue ; Hou, Jinlong ; Sun, Huyang |
description | Animation has attracted significant interest in the film and TV industry in recent years. Despite the success of advanced video generation models like Sora, Kling, and CogVideoX in generating natural videos, they are less effective at handling animation videos. Evaluating animation video generation is also a great challenge because of animation's unique artistic styles, violations of the laws of physics, and exaggerated motions. In this paper, we present a comprehensive system, AniSora, designed for animation video generation, which includes a data processing pipeline, a controllable generation model, and an evaluation dataset. Supported by the data processing pipeline with over 10M high-quality data, the generation model incorporates a spatiotemporal mask module to facilitate key animation production functions such as image-to-video generation, frame interpolation, and localized image-guided animation. We also collect an evaluation benchmark of 948 varied animation videos. Evaluation on VBench and a human double-blind test demonstrates consistency in character and motion, achieving state-of-the-art results in animation video generation. Our evaluation benchmark will be publicly available at https://github.com/bilibili/Index-anisora. |
format | article |
publisher | Ithaca: Cornell University Library, arXiv.org |
date | 2024-12-19 |
rights | 2024. This work is published under http://creativecommons.org/licenses/by-nc-sa/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
oa | free_for_read |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2024-12 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_3145272903 |
source | Publicly Available Content Database |
subjects | Animation ; Benchmarks ; Controllability ; Data processing ; Datasets ; Human motion ; Image quality ; Spatiotemporal data ; Video data |
title | AniSora: Exploring the Frontiers of Animation Video Generation in the Sora Era |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T18%3A58%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=AniSora:%20Exploring%20the%20Frontiers%20of%20Animation%20Video%20Generation%20in%20the%20Sora%20Era&rft.jtitle=arXiv.org&rft.au=Jiang,%20Yudong&rft.date=2024-12-19&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3145272903%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_31452729033%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3145272903&rft_id=info:pmid/&rfr_iscdi=true |