Dev2vec: Representing domain expertise of developers in an embedding space

Bibliographic Details
Published in: Information and Software Technology, 2023-07, Vol. 159, p. 107218, Article 107218
Main Authors: Dakhel, Arghavan Moradi; Desmarais, Michel C.; Khomh, Foutse
Format: Article
Language: English
Description
Summary: Accurate assessment of developers’ domain expertise is essential for assigning the right candidate to contribute to a project or to fill a job role. Because potential candidates may come from a large pool, automating this assessment is desirable. While previous methods have had some success within a single software project, assessing a developer’s domain expertise from contributions across multiple projects is more challenging. In this paper, we employ doc2vec to represent the domain expertise of developers across multiple projects as embedding vectors, and assess expertise level from authored code fragments. For this purpose, we derive embedding vectors from different sources that contain evidence of developers’ expertise, such as the descriptions of repositories they contributed to, their issue-resolving history, and API calls in their commits. We name the approach dev2vec and demonstrate its effectiveness in representing and assessing the technical specialization of developers. Our results indicate that encoding the expertise of developers in an embedding vector outperforms state-of-the-art methods and improves the F1-score by up to 21%. Moreover, our findings suggest that developers’ “issue resolving history” is the most informative source for representing their domain expertise in embedding spaces. Our proposed approach sheds light on the effectiveness of representing the technical expertise of developers as embedding vectors, and it can act as an initial filter for recruiters and project managers.
ISSN: 0950-5849, 1873-6025
DOI: 10.1016/j.infsof.2023.107218
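
Illustration: the summary describes representing each developer as a vector built from text evidence of their work, then comparing those vectors. The sketch below is not the paper's dev2vec method. It swaps doc2vec for plain bag-of-words count vectors with cosine similarity so that it runs with the Python standard library alone. The developer names, the example issue texts, and the helper functions `embed`, `cosine`, and `most_similar` are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy data: text evidence per developer, e.g. titles of issues
# they resolved. Real evidence would be far larger and noisier.
DEV_EVIDENCE = {
    "dev_a": ["fix css layout in react component", "update html template rendering"],
    "dev_b": ["tune neural network training loop", "fix tensor shape in model training"],
    "dev_c": ["fix react component state bug", "css styling for html form"],
}


def embed(texts):
    """Build a sparse count vector (token -> frequency) from a developer's texts.

    Assumption: a bag-of-words stand-in for a learned doc2vec embedding.
    """
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[token] * v[token] for token in u if token in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


def most_similar(name, vectors):
    """Return (other_developer, similarity) for the closest other developer."""
    target = vectors[name]
    scored = [(other, cosine(target, vec))
              for other, vec in vectors.items() if other != name]
    return max(scored, key=lambda pair: pair[1])


vectors = {name: embed(texts) for name, texts in DEV_EVIDENCE.items()}

if __name__ == "__main__":
    print(most_similar("dev_a", vectors))
```

In this toy data the two developers with web front-end evidence end up closest to each other. A learned embedding such as doc2vec would additionally capture similarity between texts that share no exact tokens, which count vectors cannot.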