Innovative Deep Learning-based Video Editing Tool
Main Authors: | , , , , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
Summary: | Deep learning has been a main driver of emerging technologies. Recently, it has been used more frequently to speed up the video editing process. Normally it would take a human editor hours, or even days or weeks, to edit video footage, while artificial intelligence (AI) could perform the task in comparatively less time. To take full advantage of these evolving technologies, this work combines state-of-the-art deep learning models to make editing videos as easy as editing text. The video is transcribed into editable text; deleting any undesired part of the text automatically removes the corresponding part of the video. Similarly, adding text to the script generates the corresponding part of the video, matching the speaker's voice and lip movement. Achieving these functionalities requires three main deep learning models: (i) a speech recognition model to generate the video script, (ii) a text-to-speech model with a speaker encoder to generate the added parts in the speaker's voice, and (iii) a lip movement generator model that takes the speaker's image and syncs the lips with the added script. This work focuses on presenting the first two models. |
---|---|
ISSN: | 2475-2320 |
DOI: | 10.1109/ICENCO49852.2021.9698919 |
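
The summary describes deleting text from the transcript so that the matching part of the video is removed. The following is a minimal sketch of that mapping only, not the authors' implementation. It assumes the speech recognition step yields word-level timestamps sorted by start time; the `Word` type, the `keep_segments` function, and the sample timings are all hypothetical names and values introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One transcribed word with assumed (hypothetical) timestamps in seconds."""
    text: str
    start: float
    end: float


def keep_segments(words, deleted, duration):
    """Return the (start, end) time ranges of the video to keep.

    words    -- list of Word, assumed sorted by start time
    deleted  -- set of indices into words that the user removed from the text
    duration -- total video length in seconds
    """
    # Collect the time spans of deleted words, merging spans that touch.
    cuts = []
    for i, w in enumerate(words):
        if i not in deleted:
            continue
        if cuts and w.start <= cuts[-1][1]:
            cuts[-1] = (cuts[-1][0], max(cuts[-1][1], w.end))
        else:
            cuts.append((w.start, w.end))

    # Everything between the cuts is kept.
    segments = []
    cursor = 0.0
    for start, end in cuts:
        if start > cursor:
            segments.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < duration:
        segments.append((cursor, duration))
    return segments


if __name__ == "__main__":
    transcript = [
        Word("hello", 0.0, 0.5),
        Word("um", 0.5, 0.9),
        Word("world", 1.0, 1.5),
    ]
    # Deleting the filler word "um" from the text removes 0.5 s to 0.9 s.
    print(keep_segments(transcript, {1}, duration=2.0))
```

The returned ranges could then be handed to any video cutting tool to trim and concatenate the kept parts. Generating new footage for added text, as the summary outlines, would need the text-to-speech and lip movement models and is outside this sketch.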