Loading…

Boosting crash-inducing change localization with rank-performance-based feature subset selection

Given a bucket of crash reports, it would be helpful for developers to find and fix the corresponding defects if the crash-inducing software changes can be automatically located. Recently, an approach called ChangeLocator was proposed, which used ten change-level features to train a supervised model...

Full description

Saved in:
Bibliographic Details
Published in:Empirical software engineering : an international journal 2020-05, Vol.25 (3), p.1905-1950
Main Authors: Guo, Zhaoqiang, Li, Yanhui, Ma, Wanwangying, Zhou, Yuming, Lu, Hongmin, Chen, Lin, Xu, Baowen
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Given a bucket of crash reports, it would be helpful for developers to find and fix the corresponding defects if the crash-inducing software changes can be automatically located. Recently, an approach called ChangeLocator was proposed, which used ten change-level features to train a supervised model based on the data from the historical fixed crashes. It was reported that ChangeLocator achieved a good performance in terms of Recall@1, MAP, and MRR, when all the ten features were combined together. However, in ChangeLocator, the redundancy between features are neglected, which may degrade the localization effectiveness. In this paper, we propose an improved approach ChangeRanker with a rank-performance-based feature selection technology (Rfs) to boost the effectiveness of crash-inducing change localization. Our experimental results on NetBeans show that ChangeRanker can achieve an improvement of 35.9%, 17.4%, and 15.3% over ChangeLocator in terms of Recall@1, MRR, and MAP, respectively. Furthermore, compared with three popular feature selection approaches, Rfs is able to select more informative features to boost localization effectiveness. In order to assess the real generalization capability of the proposed extension, we adapt ChangeRanker and ChangeLocator to locate bug-inducing changes on three additional data sets. Again, we observe that, on average, ChangeRanker achieves an improvement of 115.3%, 37.6%, and 41.2% in terms of Recall@1, MRR, and MAP, respectively. This indicates that our proposed rank-performance-based feature selection method has a good generalization capability. In summary, our work provides an easy-to-use approach to boosting the performance of the state-of-the-art crash-inducing change localization approach.
ISSN:1382-3256
1573-7616
DOI:10.1007/s10664-020-09802-1