
AniSora: Exploring the Frontiers of Animation Video Generation in the Sora Era


Bibliographic Details
Published in: arXiv.org 2024-12
Main Authors: Jiang, Yudong; Xu, Baohan; Yang, Siqian; Yin, Mingyu; Liu, Jing; Xu, Chao; Wang, Siqi; Wu, Yidi; Zhu, Bingwen; Zhang, Xinwen; Zheng, Xingyu; Xu, Jixuan; Zhang, Yue; Hou, Jinlong; Sun, Huyang
Format: Article
Language: English
container_title arXiv.org
creator Jiang, Yudong
Xu, Baohan
Yang, Siqian
Yin, Mingyu
Liu, Jing
Xu, Chao
Wang, Siqi
Wu, Yidi
Zhu, Bingwen
Zhang, Xinwen
Zheng, Xingyu
Xu, Jixuan
Zhang, Yue
Hou, Jinlong
Sun, Huyang
description Animation has attracted significant interest in the film and TV industry in recent years. Despite the success of advanced video generation models such as Sora, Kling, and CogVideoX in generating natural videos, they are less effective at handling animation videos. Evaluating animation video generation is also a great challenge because of its unique artistic styles, violations of the laws of physics, and exaggerated motions. In this paper, we present a comprehensive system, AniSora, designed for animation video generation, which includes a data processing pipeline, a controllable generation model, and an evaluation dataset. Supported by the data processing pipeline with over 10M high-quality data, the generation model incorporates a spatiotemporal mask module to facilitate key animation production functions such as image-to-video generation, frame interpolation, and localized image-guided animation. We also collect an evaluation benchmark of 948 diverse animation videos. Evaluation on VBench and a human double-blind test demonstrates consistency in character and motion, achieving state-of-the-art results in animation video generation. Our evaluation benchmark will be publicly available at https://github.com/bilibili/Index-anisora.
format article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_3145272903</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3145272903</sourcerecordid><originalsourceid>FETCH-proquest_journals_31452729033</originalsourceid><addsrcrecordid>eNqNzM0KwjAQBOAgCBbtOyx4LqRJY9WbSKsnL4rXEjDVlJqtmxR8fOvPA3gamPmYEYuElGmyzISYsNj7hnMuFrlQSkbssHH2iKTXUDy7Fsm6K4SbgZLQBWvIA9YwmLsOFh2c7cUg7Iwz9C2s-_D3BRSkZ2xc69ab-JdTNi-L03afdISP3vhQNdiTG6ZKppkSuVhxKf9TL3JyPVQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3145272903</pqid></control><display><type>article</type><title>AniSora: Exploring the Frontiers of Animation Video Generation in the Sora Era</title><source>Publicly Available Content Database</source><creator>Jiang, Yudong ; Xu, Baohan ; Yang, Siqian ; Yin, Mingyu ; Liu, Jing ; Xu, Chao ; Wang, Siqi ; Wu, Yidi ; Zhu, Bingwen ; Zhang, Xinwen ; Zheng, Xingyu ; Xu, Jixuan ; Zhang, Yue ; Hou, Jinlong ; Sun, Huyang</creator><creatorcontrib>Jiang, Yudong ; Xu, Baohan ; Yang, Siqian ; Yin, Mingyu ; Liu, Jing ; Xu, Chao ; Wang, Siqi ; Wu, Yidi ; Zhu, Bingwen ; Zhang, Xinwen ; Zheng, Xingyu ; Xu, Jixuan ; Zhang, Yue ; Hou, Jinlong ; Sun, Huyang</creatorcontrib><description>Animation has gained significant interest in the recent film and TV industry. Despite the success of advanced video generation models like Sora, Kling, and CogVideoX in generating natural videos, they lack the same effectiveness in handling animation videos. Evaluating animation video generation is also a great challenge due to its unique artist styles, violating the laws of physics and exaggerated motions. In this paper, we present a comprehensive system, AniSora, designed for animation video generation, which includes a data processing pipeline, a controllable generation model, and an evaluation dataset. 
Supported by the data processing pipeline with over 10M high-quality data, the generation model incorporates a spatiotemporal mask module to facilitate key animation production functions such as image-to-video generation, frame interpolation, and localized image-guided animation. We also collect an evaluation benchmark of 948 various animation videos, the evaluation on VBench and human double-blind test demonstrates consistency in character and motion, achieving state-of-the-art results in animation video generation. Our evaluation benchmark will be publicly available at https://github.com/bilibili/Index-anisora.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Animation ; Benchmarks ; Controllability ; Data processing ; Datasets ; Human motion ; Image quality ; Spatiotemporal data ; Video data</subject><ispartof>arXiv.org, 2024-12</ispartof><rights>2024. This work is published under http://creativecommons.org/licenses/by-nc-sa/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.proquest.com/docview/3145272903?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>780,784,25753,37012,44590</link.rule.ids></links><search><creatorcontrib>Jiang, Yudong</creatorcontrib><creatorcontrib>Xu, Baohan</creatorcontrib><creatorcontrib>Yang, Siqian</creatorcontrib><creatorcontrib>Yin, Mingyu</creatorcontrib><creatorcontrib>Liu, Jing</creatorcontrib><creatorcontrib>Xu, Chao</creatorcontrib><creatorcontrib>Wang, Siqi</creatorcontrib><creatorcontrib>Wu, Yidi</creatorcontrib><creatorcontrib>Zhu, Bingwen</creatorcontrib><creatorcontrib>Zhang, Xinwen</creatorcontrib><creatorcontrib>Zheng, Xingyu</creatorcontrib><creatorcontrib>Xu, Jixuan</creatorcontrib><creatorcontrib>Zhang, Yue</creatorcontrib><creatorcontrib>Hou, Jinlong</creatorcontrib><creatorcontrib>Sun, Huyang</creatorcontrib><title>AniSora: Exploring the Frontiers of Animation Video Generation in the Sora Era</title><title>arXiv.org</title><description>Animation has gained significant interest in the recent film and TV industry. Despite the success of advanced video generation models like Sora, Kling, and CogVideoX in generating natural videos, they lack the same effectiveness in handling animation videos. Evaluating animation video generation is also a great challenge due to its unique artist styles, violating the laws of physics and exaggerated motions. In this paper, we present a comprehensive system, AniSora, designed for animation video generation, which includes a data processing pipeline, a controllable generation model, and an evaluation dataset. 
Supported by the data processing pipeline with over 10M high-quality data, the generation model incorporates a spatiotemporal mask module to facilitate key animation production functions such as image-to-video generation, frame interpolation, and localized image-guided animation. We also collect an evaluation benchmark of 948 various animation videos, the evaluation on VBench and human double-blind test demonstrates consistency in character and motion, achieving state-of-the-art results in animation video generation. Our evaluation benchmark will be publicly available at https://github.com/bilibili/Index-anisora.</description><subject>Animation</subject><subject>Benchmarks</subject><subject>Controllability</subject><subject>Data processing</subject><subject>Datasets</subject><subject>Human motion</subject><subject>Image quality</subject><subject>Spatiotemporal data</subject><subject>Video data</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><recordid>eNqNzM0KwjAQBOAgCBbtOyx4LqRJY9WbSKsnL4rXEjDVlJqtmxR8fOvPA3gamPmYEYuElGmyzISYsNj7hnMuFrlQSkbssHH2iKTXUDy7Fsm6K4SbgZLQBWvIA9YwmLsOFh2c7cUg7Iwz9C2s-_D3BRSkZ2xc69ab-JdTNi-L03afdISP3vhQNdiTG6ZKppkSuVhxKf9TL3JyPVQ</recordid><startdate>20241219</startdate><enddate>20241219</enddate><creator>Jiang, Yudong</creator><creator>Xu, Baohan</creator><creator>Yang, Siqian</creator><creator>Yin, Mingyu</creator><creator>Liu, Jing</creator><creator>Xu, Chao</creator><creator>Wang, Siqi</creator><creator>Wu, Yidi</creator><creator>Zhu, Bingwen</creator><creator>Zhang, Xinwen</creator><creator>Zheng, Xingyu</creator><creator>Xu, Jixuan</creator><creator>Zhang, Yue</creator><creator>Hou, Jinlong</creator><creator>Sun, Huyang</creator><general>Cornell University Library, 
arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20241219</creationdate><title>AniSora: Exploring the Frontiers of Animation Video Generation in the Sora Era</title><author>Jiang, Yudong ; Xu, Baohan ; Yang, Siqian ; Yin, Mingyu ; Liu, Jing ; Xu, Chao ; Wang, Siqi ; Wu, Yidi ; Zhu, Bingwen ; Zhang, Xinwen ; Zheng, Xingyu ; Xu, Jixuan ; Zhang, Yue ; Hou, Jinlong ; Sun, Huyang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_31452729033</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Animation</topic><topic>Benchmarks</topic><topic>Controllability</topic><topic>Data processing</topic><topic>Datasets</topic><topic>Human motion</topic><topic>Image quality</topic><topic>Spatiotemporal data</topic><topic>Video data</topic><toplevel>online_resources</toplevel><creatorcontrib>Jiang, Yudong</creatorcontrib><creatorcontrib>Xu, Baohan</creatorcontrib><creatorcontrib>Yang, Siqian</creatorcontrib><creatorcontrib>Yin, Mingyu</creatorcontrib><creatorcontrib>Liu, Jing</creatorcontrib><creatorcontrib>Xu, Chao</creatorcontrib><creatorcontrib>Wang, Siqi</creatorcontrib><creatorcontrib>Wu, Yidi</creatorcontrib><creatorcontrib>Zhu, Bingwen</creatorcontrib><creatorcontrib>Zhang, Xinwen</creatorcontrib><creatorcontrib>Zheng, Xingyu</creatorcontrib><creatorcontrib>Xu, Jixuan</creatorcontrib><creatorcontrib>Zhang, Yue</creatorcontrib><creatorcontrib>Hou, Jinlong</creatorcontrib><creatorcontrib>Sun, Huyang</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest 
Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>ProQuest Central Essentials</collection><collection>AUTh Library subscriptions: ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>SciTech Premium Collection (Proquest) (PQ_SDU_P3)</collection><collection>ProQuest Engineering Collection</collection><collection>ProQuest Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Jiang, Yudong</au><au>Xu, Baohan</au><au>Yang, Siqian</au><au>Yin, Mingyu</au><au>Liu, Jing</au><au>Xu, Chao</au><au>Wang, Siqi</au><au>Wu, Yidi</au><au>Zhu, Bingwen</au><au>Zhang, Xinwen</au><au>Zheng, Xingyu</au><au>Xu, Jixuan</au><au>Zhang, Yue</au><au>Hou, Jinlong</au><au>Sun, Huyang</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>AniSora: Exploring the Frontiers of Animation Video Generation in the Sora Era</atitle><jtitle>arXiv.org</jtitle><date>2024-12-19</date><risdate>2024</risdate><eissn>2331-8422</eissn><abstract>Animation has gained significant interest in the recent film and TV industry. Despite the success of advanced video generation models like Sora, Kling, and CogVideoX in generating natural videos, they lack the same effectiveness in handling animation videos. 
Evaluating animation video generation is also a great challenge due to its unique artist styles, violating the laws of physics and exaggerated motions. In this paper, we present a comprehensive system, AniSora, designed for animation video generation, which includes a data processing pipeline, a controllable generation model, and an evaluation dataset. Supported by the data processing pipeline with over 10M high-quality data, the generation model incorporates a spatiotemporal mask module to facilitate key animation production functions such as image-to-video generation, frame interpolation, and localized image-guided animation. We also collect an evaluation benchmark of 948 various animation videos, the evaluation on VBench and human double-blind test demonstrates consistency in character and motion, achieving state-of-the-art results in animation video generation. Our evaluation benchmark will be publicly available at https://github.com/bilibili/Index-anisora.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2024-12
issn 2331-8422
language eng
recordid cdi_proquest_journals_3145272903
source Publicly Available Content Database
subjects Animation
Benchmarks
Controllability
Data processing
Datasets
Human motion
Image quality
Spatiotemporal data
Video data
title AniSora: Exploring the Frontiers of Animation Video Generation in the Sora Era
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T18%3A58%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=AniSora:%20Exploring%20the%20Frontiers%20of%20Animation%20Video%20Generation%20in%20the%20Sora%20Era&rft.jtitle=arXiv.org&rft.au=Jiang,%20Yudong&rft.date=2024-12-19&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3145272903%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_31452729033%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3145272903&rft_id=info:pmid/&rfr_iscdi=true