
Application of Source Code Plagiarism Detection and Grouping Techniques for Short Programs


Bibliographic Details
Main Authors: Ryman, Dylan, Imbrie, P.K., Kastner, Jeff
Format: Conference Proceeding
Language: English
Description
Summary: Academic misconduct in programming assignments, such as excessive collaboration with peers or use of online resources, is of growing concern to the integrity of engineering education. Software development skills have become increasingly relevant to many fields, and consequently the number of students submitting software assignments has also increased dramatically. The online learning environment caused by current global conditions broadens opportunities for students to engage in unethical academic behavior. Current techniques for identifying academic misconduct in software submissions suffer from multiple issues, such as producing an excessive amount of difficult-to-interpret data and relying on algorithms with limited effectiveness on short program submissions, such as those containing fewer than fifty lines. This paper presents how several existing techniques for identifying software similarity become more effective when combined and fine-tuned to increase sensitivity and decrease noise for short source code submissions. It shows how similarity results can be applied in a robust framework for determining which submissions are similar enough to warrant investigation. Finally, it introduces a new technique for grouping similar submissions, which helps identify collections of student submissions that all contain matching features. This increases efficiency by reducing the number of submission pairs that require human analysis. Together, these techniques enable comprehensive analysis of similarity in short software submissions, reduce noise through robust methods for determining which submissions were likely involved in academic misconduct, and improve human review efficiency by grouping sets of similar submissions and reducing the total amount of data for review.
ISSN: 2377-634X
DOI: 10.1109/FIE49875.2021.9637268
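
Illustrative sketch: The summary describes scoring pairwise similarity, flagging pairs above a cutoff, and grouping similar submissions so reviewers inspect groups instead of every pair. The record does not specify the paper's actual algorithms, so the Python below is only a rough illustration under stated assumptions: Jaccard similarity over token 3-grams stands in for the similarity measure, a fixed threshold of 0.6 stands in for the decision framework, and connected components (via union-find) stand in for the grouping technique. The names `kgram_set`, `jaccard`, and `group_similar` are hypothetical.

```python
from itertools import combinations


def kgram_set(source: str, k: int = 3) -> set:
    """Return the set of k-grams over whitespace-separated tokens (assumed tokenizer)."""
    tokens = source.split()
    if len(tokens) < k:
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity; defined here as 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def group_similar(submissions: dict, threshold: float = 0.6, k: int = 3) -> list:
    """Group submissions linked by any chain of pairs scoring >= threshold.

    Assumption: grouping is modeled as connected components of the
    'similar enough' graph; the paper's technique may differ.
    """
    parent = {name: name for name in submissions}

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    grams = {name: kgram_set(src, k) for name, src in submissions.items()}
    for a, b in combinations(submissions, 2):
        if jaccard(grams[a], grams[b]) >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for name in submissions:
        groups.setdefault(find(name), []).append(name)
    # Only groups with at least two members need human review.
    return [sorted(members) for members in groups.values() if len(members) > 1]


submissions = {
    "alice": "x = 1 \n y = 2 \n print ( x + y )",
    "bob": "x = 1 \n y = 2 \n print ( x + y )",
    "carol": "for i in range ( 10 ) : print ( i )",
}
print(group_similar(submissions))
```

With grouping of this kind, a cluster of n mutually similar submissions is reviewed once as a set instead of as n(n-1)/2 separate pairs, which is the efficiency gain the summary points to.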