
Anomaly Detection via a Bottleneck Structure Robust to Noise in In-Distribution Samples

Bibliographic Details
Published in: IEEE Access 2024, Vol. 12, p. 130264-130270
Main Authors: Cho, Yeongkyu, Lee, Donghwan, Hyun, Youngjoo, Kim, Wooju
Format: Article
Language: English
Description
Summary: Anomaly detection aims to identify data that do not belong to the in-distribution (ID) samples. In manufacturing, it detects unknown defect images, treated as out-of-distribution (OOD) samples, using only existing defect images. Recently, networks pretrained on large datasets, such as ImageNet, have been used to extract ID features. These methods use the distance between the extracted features of the test sample and all the training samples as an anomaly score. This approach identifies anomalous samples simply and effectively without requiring additional training. However, the extracted features are high dimensional and thus contain considerable unnecessary information, which makes the score sensitive to noise in some ID samples. In addition, during the nearest-neighbor search, false positives occur when an ID sample has a low probability of occurrence. This paper presents a feature extraction methodology that is robust to noise by adopting the bottleneck structure used in reverse knowledge distillation to address the noise sensitivity problem. Next, the false-positive problem is addressed using K-means clustering-based augmentation of samples far from each cluster center. Finally, we propose a weighted scoring function that increases the anomaly score when the ID samples nearest to the test sample are rare. Applying the proposed methodology to an electric vehicle battery dataset and CIFAR10 showed improved performance compared with the baseline model.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3400122
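
The distance-based scoring described in the summary can be illustrated with a minimal sketch. It assumes features have already been extracted by a pretrained network and are given as NumPy arrays. The function names, the `k` parameter, and in particular the rarity weight (isolation of the matched training sample relative to the average) are hypothetical choices for illustration and are not the weighting function from the paper.

```python
import numpy as np


def knn_anomaly_scores(train_feats, test_feats, k=1):
    """Baseline score: mean Euclidean distance to the k nearest training features."""
    # Pairwise distances between test and training features, shape (n_test, n_train).
    diffs = test_feats[:, None, :] - train_feats[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    nearest = np.sort(dists, axis=1)[:, :k]
    return nearest.mean(axis=1)


def rarity_weighted_scores(train_feats, test_feats, k=2):
    """Hypothetical weighting: scale the nearest-neighbor distance by how
    isolated (rare) the matched training sample is among the training set."""
    diffs = test_feats[:, None, :] - train_feats[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    nn_idx = dists.argmin(axis=1)   # index of the closest training sample
    nn_dist = dists.min(axis=1)     # distance to that sample

    # Isolation of each training sample: mean distance to its k nearest
    # other training samples (self-distance is excluded via infinity).
    tt = np.linalg.norm(train_feats[:, None, :] - train_feats[None, :, :], axis=2)
    np.fill_diagonal(tt, np.inf)
    isolation = np.sort(tt, axis=1)[:, :k].mean(axis=1)

    # Assumed weight: greater than 1 for rarer-than-average training samples.
    weights = isolation / isolation.mean()
    return nn_dist * weights[nn_idx]


if __name__ == "__main__":
    train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
    test = np.array([[0.0, 0.0], [10.0, 10.0]])
    print(knn_anomaly_scores(train, test))
    print(rarity_weighted_scores(train, test))
```

In this sketch a test sample that coincides with a training sample scores zero, while a distant sample scores higher, and the weighted variant raises the score further when the matched training sample lies in a sparse region.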