ABFL: An autoencoder based practical approach for software fault localization

Bibliographic Details
Published in:	Information sciences 2020-02, Vol.510, p.108-121
Main Authors:	Peng, Zhendong; Xiao, Xi; Hu, Guangwu; Kumar Sangaiah, Arun; Atiquzzaman, Mohammed; Xia, Shutao
Format:	Article
Language:	English
Subjects:	Autoencoder; Debugging; Fault localization; SBFL

Description
Summary:	• We introduce a novel way to extract features from source code automatically. To the best of our knowledge, we are the first to apply an autoencoder to fault localization.
• We present ABFL, a fault localization approach that combines SBFL techniques with the latent representation of source code to improve the accuracy of fault localization.
• We evaluate ABFL on 357 bugs from 5 software projects in the Defects4J benchmark. ABFL substantially outperforms 14 SBFL techniques, ranking 14%, 26%, and 34% of faults within the top 1, 3, and 5 positions of the ranked list, respectively.

Fault localization is essential to software debugging. Although existing techniques based on mutation analysis, development history, and bug reports have contributed greatly to fault localization, they are often infeasible in practice because mutation analysis is expensive and development history and bug reports are frequently unavailable. To improve both the accuracy and the feasibility of locating faulty code, we propose ABFL, an Autoencoder Based practical approach for Fault Localization. ABFL first uses an autoencoder to extract 32 features from static source code. It then applies Spectrum Based Fault Localization (SBFL) techniques to compute 14 types of scores, which serve as a second group of features drawn from program execution. Finally, a ranking model integrates the two groups of features to locate faulty statements precisely. Extensive experiments on the Defects4J repository show that our approach outperforms state-of-the-art SBFL techniques, ranking the faulty statement within the top 1, 3, and 5 positions for 49, 94, and 123 faults, respectively.
ISSN:	0020-0255, 1872-6291
DOI:	10.1016/j.ins.2019.08.077
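
The SBFL step mentioned in the summary can be illustrated with a minimal sketch. The summary does not name the 14 SBFL formulas ABFL uses, so the two shown here (Ochiai and Tarantula, both standard in the SBFL literature) are an assumption, and the function names and toy coverage data are hypothetical. Each statement gets one score per formula, and the resulting list is the kind of per-statement feature vector that could be passed to a ranking model.

```python
from math import sqrt

# ef/ep: failing/passing tests that execute the statement
# nf/np_: failing/passing tests that do not execute it

def ochiai(ef, ep, nf, np_):
    denom = sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

def tarantula(ef, ep, nf, np_):
    fail_ratio = ef / (ef + nf) if ef + nf else 0.0
    pass_ratio = ep / (ep + np_) if ep + np_ else 0.0
    total = fail_ratio + pass_ratio
    return fail_ratio / total if total else 0.0

def sbfl_features(coverage, passed, formulas):
    """Return {statement: [score per formula]}.

    coverage: one set of covered statement ids per test (hypothetical format)
    passed:   one bool per test, True if the test passed
    """
    statements = sorted(set().union(*coverage))
    total_failed = sum(not p for p in passed)
    total_passed = sum(passed)
    features = {}
    for s in statements:
        ef = sum(1 for cov, p in zip(coverage, passed) if s in cov and not p)
        ep = sum(1 for cov, p in zip(coverage, passed) if s in cov and p)
        nf = total_failed - ef
        np_ = total_passed - ep
        features[s] = [f(ef, ep, nf, np_) for f in formulas]
    return features

# Toy spectrum: three tests over statements 1..3; the second test fails.
coverage = [{1, 2, 3}, {1, 3}, {1, 2}]
passed = [True, False, True]
features = sbfl_features(coverage, passed, [ochiai, tarantula])

# Rank statements by their Ochiai score, most suspicious first.
ranked = sorted(features, key=lambda s: features[s][0], reverse=True)
print(ranked)
```

In this toy spectrum, statement 3 is executed by the failing test and by only one passing test, so both formulas score it highest. A full pipeline along the lines of the summary would concatenate such scores with the 32 autoencoder features before ranking; that part is not sketched here.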