
Application of winnowing algorithm in development of lecturer research performance information system

Bibliographic Details
Main Authors: Pratama, Ramadhani Noor; Najwaini, Effan; Rozaq, Abdul
Format: Conference Proceeding
Language: English
Description
Summary: This research builds a system that checks similarity using the winnowing algorithm. The algorithm detects similarity between research titles, which helps prevent duplicate research and supports the search for references from similar studies. The user checks a title by entering it in the form provided; the system then compares that title against all titles stored in the database. The winnowing algorithm requires three parameter values: N-Gram, Window, and Prime Number. All three affect the similarity results. This study focuses on title similarity, and a title has far fewer words than a full article. Because the compared texts are so short, choosing the three parameter values is very important. The study ran trials to find the optimal values. It concluded that, for the same window, the higher the N-Gram value, the smaller the similarity percentage. For the same N-Gram, a higher window value does not change the similarity percentage significantly but shows a decreasing trend. Based on the test results, the optimal prime number is 23, the optimal N-Gram is 5, and the optimal window is 2.
ISSN: 0094-243X, 1551-7616
DOI:10.1063/5.0118709
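
The title check described in the summary can be illustrated with a minimal sketch of winnowing, using the reported optimal values (N-Gram 5, window 2, prime number 23) as defaults. This is not the authors' implementation. The record does not say how the prime number is used, how text is normalised, or how the percentage is computed, so the sketch assumes the prime is the base of a polynomial n-gram hash, that titles are lowercased and stripped to letters and digits, and that similarity is the Jaccard coefficient of the two fingerprint sets. All function names are hypothetical.

```python
import re


def preprocess(text):
    """Assumed normalisation: lowercase, keep only letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def ngram_hashes(text, n, base):
    """Hash every n-gram as a polynomial in `base` (assumed role of the prime)."""
    hashes = []
    for start in range(len(text) - n + 1):
        h = 0
        for ch in text[start:start + n]:
            h = h * base + ord(ch)
        hashes.append(h)
    return hashes


def fingerprints(hashes, window):
    """Keep the minimum hash of each sliding window as a fingerprint."""
    if not hashes:
        return set()
    if len(hashes) < window:
        # Text too short for a full window: fall back to a single minimum.
        return {min(hashes)}
    return {
        min(hashes[start:start + window])
        for start in range(len(hashes) - window + 1)
    }


def similarity(title_a, title_b, n=5, window=2, base=23):
    """Percentage similarity as Jaccard overlap of fingerprints (an assumption)."""
    fp_a = fingerprints(ngram_hashes(preprocess(title_a), n, base), window)
    fp_b = fingerprints(ngram_hashes(preprocess(title_b), n, base), window)
    if not fp_a and not fp_b:
        return 0.0
    return 100.0 * len(fp_a & fp_b) / len(fp_a | fp_b)


def check_title(new_title, stored_titles, **params):
    """Compare a new title with every stored title, most similar first."""
    scored = [(similarity(new_title, t, **params), t) for t in stored_titles]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    stored = ["Winnowing algorithm", "Winnowing method", "Rolling hash"]
    for score, title in check_title("Winnowing algorithm", stored):
        print(f"{score:6.2f}  {title}")
```

With these assumptions an identical title scores 100, and titles that share only some 5-grams score somewhere between 0 and 100. Raising `n` makes each fingerprint depend on a longer run of characters, which fits the reported trend that a higher N-Gram value lowers the similarity percentage.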