Loading…

Using a balanced scorecard to identify opportunities to improve code review effectiveness: an industrial experience report

Peer code review is a widely adopted software engineering practice to ensure code quality and ensure software reliability in both the commercial and open-source software projects. Due to the large effort overhead associated with practicing code reviews, project managers often wonder, if their code r...

Full description

Saved in:
Bibliographic Details
Published in:Empirical software engineering : an international journal 2021-11, Vol.26 (6), Article 129
Main Authors: Hasan, Masum, Iqbal, Anindya, Islam, Mohammad Rafid Ul, Rahman, A.J.M. Imtiajur, Bosu, Amiangshu
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Peer code review is a widely adopted software engineering practice to ensure code quality and ensure software reliability in both the commercial and open-source software projects. Due to the large effort overhead associated with practicing code reviews, project managers often wonder, if their code reviews are effective and if there are improvement opportunities in that respect. Since project managers at Samsung Research Bangladesh (SRBD) were also intrigued by these questions, this research developed, deployed, and evaluated a production-ready solution using the Balanced SCorecard (BSC) strategy that SRBD managers can use in their day-to-day management to monitor individual developer’s, a particular project’s or the entire organization’s code review effectiveness. Following the four-step framework of the BSC strategy, we– 1) defined the operation goals of this research, 2) defined a set of metrics to measure the effectiveness of code reviews, 3) developed an automated mechanism to measure those metrics, and 4) developed and evaluated a monitoring application to inform the key stakeholders. Our automated model to identify useful code reviews achieves 7.88% and 14.39% improvement in terms of accuracy and minority class F 1 score respectively over the models proposed in prior studies. It also outperforms human evaluators from SRBD, that the model replaces, by a margin of 25.32% and 23.84% respectively in terms of accuracy and minority class F 1 score. In our post-deployment survey, SRBD developers and managers indicated that they found our solution as useful and it provided them with important insights to help their decision makings.
ISSN:1382-3256
1573-7616
DOI:10.1007/s10664-021-10038-w