Blind consumer video quality assessment with spatial-temporal perception and fusion

Bibliographic Details
Published in: Multimedia Tools and Applications, 2024-02, Vol. 83 (7), p. 18969-18986
Main Authors: Niu, Yuzhen, Zheng, Yuming, Wang, Zhenlong, Zhong, Mengzhen, Zhao, Tiesong
Format: Article
Language:English
Description
Summary:Blind quality assessment for user-generated content (UGC) or consumer videos is a challenging problem in computer vision. Two open issues are yet to be addressed: how to effectively extract high-dimensional spatial-temporal features of consumer videos, and how to appropriately model the relationship between these features and user perceptions within a unified blind video quality assessment (BVQA) framework. To tackle these issues, we propose a novel BVQA model with spatial-temporal perception and fusion. First, we develop two perception modules that extract perceptual-distortion-related features separately from the spatial and temporal domains. In particular, the temporal-domain features are obtained with a combination of a 3D ConvNet and residual frames, owing to their high efficiency in capturing motion-specific temporal features. Second, we propose a feature fusion module that adaptively combines the spatial and temporal features. Finally, we map the fused features onto perceptual quality. Experimental results demonstrate that our model outperforms other state-of-the-art methods in subjective video quality prediction.
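The two ingredients the abstract names for the temporal branch and the fusion step can be illustrated with a minimal sketch. Note this is not the authors' implementation: `residual_frames` simply takes frame-to-frame differences (the usual meaning of "residual frames" as a motion proxy fed to a 3D ConvNet), and `adaptive_fusion` is a hypothetical stand-in that weights the spatial and temporal feature vectors by a softmax over their mean magnitudes, whereas the paper's fusion module is learned.

```python
import numpy as np

def residual_frames(video: np.ndarray) -> np.ndarray:
    """Differences between consecutive frames, a common motion proxy.

    video: array of shape (T, H, W, C). Returns shape (T-1, H, W, C).
    Cast to float first so integer pixel values do not wrap around.
    """
    video = video.astype(np.float32)
    return video[1:] - video[:-1]

def adaptive_fusion(spatial_feat: np.ndarray, temporal_feat: np.ndarray) -> np.ndarray:
    """Hypothetical adaptive fusion: softmax weights from mean feature magnitude.

    The paper's module learns its weights; here the weighting is hand-crafted
    purely to show the adaptive-combination idea.
    """
    energies = np.array([np.abs(spatial_feat).mean(), np.abs(temporal_feat).mean()])
    weights = np.exp(energies) / np.exp(energies).sum()  # softmax over the two branches
    return weights[0] * spatial_feat + weights[1] * temporal_feat
```

In a full pipeline, `residual_frames(video)` would be stacked and passed through a 3D convolutional backbone to produce `temporal_feat`, while a 2D backbone over sampled frames produces `spatial_feat`; the fused vector is then regressed onto a quality score.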
ISSN:1380-7501
EISSN:1573-7721
DOI:10.1007/s11042-023-16242-8