Determining the trustworthiness of DNNs in classification tasks using generalized feature-based confidence metric

Bibliographic Details
Published in: Pattern Recognition, 2023-10, Vol. 142, Article 109683
Main Authors: Haghpanah, Mohammad Amin; Tale Masouleh, Mehdi; Kalhor, Ahmad
Format: Article
Language: English
Summary:
•The Generalized Feature-Based Confidence Metric is proposed to evaluate the quality of feature vectors in DNNs and estimate the models' confidence in their predictions.
•Based on this metric, a method is proposed that determines the trustworthiness of models, distinguishes between them, and selects the superior one.
•The practicality of the proposed method and metric is demonstrated empirically through four diverse case studies.

Determining the confidence of Deep Neural Networks (DNNs) in their predictions is crucial for building reliable and robust systems, yet it has received comparatively little attention among Deep Learning research areas. The confidence of DNNs in predictions is highly correlated with their feature-extraction ability; consequently, a more robust feature extractor yields a more confident and trustworthy model. In this study, a method is designed to determine the trustworthiness of DNNs based on the quality of their feature-extraction components, where feature quality is defined in terms of the models' confidence in predictions. When two DNNs have approximately the same accuracy, the superior model is the one with more confidence in its predictions: it is less affected by overfitting and is therefore more robust and reliable in unseen and noisy environments. Identifying such a model is not always possible with the standard accuracy metric. Accordingly, a novel metric named the Generalized Feature-Based Confidence Metric is proposed, which evaluates the models' confidence in predictions in depth by analyzing, layer by layer, the quality of the feature vectors the models generate. Together, these capabilities facilitate assessing and comparing models of varying widths and depths, improving them, and selecting the best one. The practicality of the proposed method and metric is investigated, and empirically demonstrated, through four significantly diverse case studies. Three of them are reputable benchmarking datasets, namely CIFAR-10, CIFAR-100, and Fashion-MNIST; in addition, a new high-quality dataset for the Hand Rubbing problem (created by the authors) is used to analyze the proposed method's performance in a real-world application. Overall, the proposed metric distinguishes between models by about 1% to 8% in terms of confidence in predictions even when the models have nearly the same accuracy (a difference of 0.5% or less).
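
The abstract describes the metric only at a high level (layer-by-layer analysis of feature-vector quality) and does not give the GFCM formula, so it is not reproduced here. The following is a minimal sketch of the general idea, assuming a Fisher-style between-class/within-class separability ratio as a stand-in for per-layer feature quality; the function names, the choice of score, and the hook-based feature collection are illustrative assumptions, not the authors' implementation.

# Hypothetical sketch: score each layer's features with a Fisher-style
# separability ratio (an assumption; NOT the paper's exact GFCM formula).
import torch
import torch.nn as nn

def separability_score(features: torch.Tensor, labels: torch.Tensor) -> float:
    """Ratio of between-class to within-class scatter of flattened features.

    A higher ratio means the classes are more cleanly separated, which this
    sketch uses as a proxy for the quality a layer's features provide.
    """
    feats = features.flatten(start_dim=1)                 # (N, D)
    global_mean = feats.mean(dim=0)
    between = torch.zeros(())
    within = torch.zeros(())
    for c in labels.unique():
        class_feats = feats[labels == c]
        class_mean = class_feats.mean(dim=0)
        between = between + len(class_feats) * (class_mean - global_mean).pow(2).sum()
        within = within + (class_feats - class_mean).pow(2).sum()
    return (between / within.clamp(min=1e-12)).item()

def layerwise_scores(model: nn.Module, x: torch.Tensor, y: torch.Tensor) -> dict:
    """Forward one labeled batch and score the output of each top-level block."""
    scores, handles = {}, []

    def make_hook(name: str):
        def hook(_module, _inputs, output):
            scores[name] = separability_score(output.detach(), y)
        return hook

    for name, module in model.named_children():           # coarse granularity
        handles.append(module.register_forward_hook(make_hook(name)))
    with torch.no_grad():
        model(x)
    for h in handles:
        h.remove()
    return scores

# Usage on a toy classifier with random data standing in for Fashion-MNIST:
model = nn.Sequential(
    nn.Flatten(), nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
x = torch.randn(256, 1, 28, 28)
y = torch.randint(0, 10, (256,))
print(layerwise_scores(model, x, y))                      # one score per block

In the spirit of the selection procedure the abstract describes, two models with near-identical accuracy could then be compared by such per-layer scores, with higher scores in later layers suggesting the more confident feature extractor.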
ISSN: 0031-3203, 1873-5142
DOI: 10.1016/j.patcog.2023.109683