Measuring Plagiarism in Introductory Programming Course Assignments


Bibliographic Details
Published in: arXiv.org 2022-05
Main Authors: Humayoun, Muhammad; Hashmi, Muhammad Adnan; Khan, Ali Hanzala
Format: Article
Language: English
Description
Summary: Measuring plagiarism in programming assignments is essential to the educational process. This paper discusses the methods of plagiarism and its detection in introductory programming course assignments written in C++. A small corpus of assignments is made publicly available. A general framework to compute the similarity between a solution pair is developed; it uses three token-based similarity methods as features and predicts whether the solution is plagiarized. The importance of each feature is also measured, which in turn ranks the effectiveness of each method in use. Finally, the artificially generated dataset improves the results compared to the original data. We achieved F1 scores of 0.955 and 0.971 on the original and synthetic datasets, respectively.
ISSN:2331-8422