SSFE-M: A Self-Supervised Feature Extraction Model for Enhanced Camera Calibration

Bibliographic Details
Published in: IEEE Signal Processing Letters, 2024, Vol. 31, pp. 1179-1183
Main Authors: Zhang, Neng; Izquierdo, Ebroul
Format: Article
Language: English
Description
Summary: Predicting the corresponding homographies from broadcast frames is an important task in camera calibration. However, the prediction accuracy of many previous methods is unsatisfactory due to low-quality semantic segmentation and feature extraction. In this letter, an effective camera calibration model is proposed to address these issues. The proposed model consists of four processing stages. First, a pix2pix conditional Generative Adversarial Network (cGAN) is employed to transform the broadcast frame into the segmented image. This network can perform robustly even with limited training data. Second, a self-supervised feature extraction network is proposed to represent the segmented image as a 128-dimensional vector, which distinctively captures critical features of the segmented image. Third, the extracted feature vector is sent to a homography database to compute the best-matching homography using the K-D tree algorithm. Finally, the predicted homography is refined by running an Enhanced Correlation Coefficient (ECC) technique. The proposed model is evaluated on the 2014 World Cup and National Basketball datasets. The achieved results are compared to the state-of-the-art approaches, as well as several variants based on the U-net, VGG-16, and ResNet-50. Moreover, the difference between area-based and line-based segmentation is compared and analyzed. The experimental results demonstrate that the robustness and effectiveness of the proposed model are very competitive in diverse sports environments.
ISSN: 1070-9908, 1558-2361
DOI: 10.1109/LSP.2024.3389830
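
Illustration: the third stage in the summary, looking up the best-matching homography for a 128-dimensional feature vector with a K-D tree, can be sketched roughly as below. This is a minimal, standard-library-only sketch and not the authors' implementation. The function names (`build_kdtree`, `nearest`), the random stand-in feature vectors, and the integer "homography ids" used as payloads are all assumptions made for illustration; a real database would store feature vectors produced by the feature extraction network alongside 3x3 homography matrices.

```python
import random


def build_kdtree(points, depth=0):
    """Recursively build a K-D tree over (vector, payload) pairs."""
    if not points:
        return None
    axis = depth % len(points[0][0])  # cycle through the dimensions
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2  # median entry becomes this node
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def nearest(node, query, best=None):
    """Return (squared_distance, (vector, payload)) for the closest entry."""
    if node is None:
        return best
    vec, _ = node["point"]
    d2 = sum((a - b) ** 2 for a, b in zip(vec, query))
    if best is None or d2 < best[0]:
        best = (d2, node["point"])
    diff = query[node["axis"]] - vec[node["axis"]]
    near, far = (
        (node["left"], node["right"]) if diff < 0
        else (node["right"], node["left"])
    )
    best = nearest(near, query, best)
    # Search the far side only if the splitting plane is closer than the
    # best match found so far.
    if diff ** 2 < best[0]:
        best = nearest(far, query, best)
    return best


random.seed(0)
DIM = 128  # feature dimensionality stated in the summary

# Hypothetical database: random vectors stand in for extracted features,
# and each payload is a made-up homography id.
database = [([random.random() for _ in range(DIM)], i) for i in range(200)]
tree = build_kdtree(database)

# A query close to entry 42, imitating the feature of a new segmented frame.
query = [x + random.gauss(0, 0.01) for x in database[42][0]]
match = nearest(tree, query)
print("best-matching homography id:", match[1][1])
```

In 128 dimensions a K-D tree prunes few branches, so for a small database the lookup cost is close to that of a linear scan; the returned match is the exact nearest neighbor either way. The retrieved homography would then be passed to the ECC refinement stage that the summary describes.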