
Effective high-to-low-level feature aggregation network for endoscopic image classification

Bibliographic Details
Published in: International journal for computer assisted radiology and surgery, 2022-07, Vol. 17 (7), p. 1225-1233
Main Authors: Li, Sheng; Yao, Jiafeng; Cao, Jing; Kong, Xueting; Zhu, Jinhui
Format: Article
Language: English
Description
Summary:
Purpose: Improving the accuracy of endoscopic image classification helps endoscopists diagnose patients and choose suitable treatment. Existing CNN-based methods for endoscopic image classification tend to use only the deepest abstract features and ignore the contribution of low-level features, although the latter are of great significance in the actual diagnosis of intestinal diseases.

Methods: To make full use of both high-level and low-level features, we propose a novel two-stream network for endoscopic image classification. The backbone stream extracts high-level features. In the fusion stream, low-level features are generated by a bottom-up multi-scale gradual integration (BMGI) method, whose input is refined by top-down attention learning modules. In addition, a novel correction loss is proposed to clarify the relationship between high-level and low-level features.

Results: Experiments on the KVASIR dataset show that the proposed framework achieves an overall classification accuracy of 97.33% with a Kappa coefficient of 95.25%. Compared with existing models, these two evaluation indicators improve by at least 2% and 2.25%, respectively.

Conclusion: In this study, we proposed a two-stream network that fuses high-level and low-level features for endoscopic image classification. The experimental results show that the high-to-low-level features represent endoscopic images better and enable our model to outperform several state-of-the-art classification approaches. In addition, the proposed correction loss can regularize the consistency between the backbone stream and the fusion stream, so the fused feature can reduce intra-class distances and support accurate label prediction.
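The two evaluation indicators reported in the summary, overall classification accuracy and the Kappa coefficient, can both be derived from a confusion matrix. The following is a minimal sketch of the standard Cohen's kappa definition, not the authors' evaluation code; the function name `accuracy_and_kappa` and the toy 2x2 matrix are hypothetical and are not taken from the paper or from the KVASIR dataset.

```python
def accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples whose true class is i and
    whose predicted class is j. Hypothetical helper for illustration only.
    """
    n = len(confusion)
    if any(len(row) != n for row in confusion):
        raise ValueError("confusion matrix must be square")

    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total

    # Chance agreement: product of the true-class and predicted-class
    # marginals, summed over classes.
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / (total * total)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


if __name__ == "__main__":
    # Toy numbers, made up for illustration (not results from the paper).
    toy = [[45, 5],
           [10, 40]]
    acc, kappa = accuracy_and_kappa(toy)
    print(f"accuracy={acc:.4f} kappa={kappa:.4f}")
```

Kappa discounts the agreement expected by chance, which is why it is lower than the raw accuracy computed from the same matrix.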
ISSN: 1861-6429
1861-6410
DOI: 10.1007/s11548-022-02591-6