New Descriptors of Textual Records: Getting Help from Frequent Itemsets

Bibliographic Details
Published in: Vietnam Journal of Computer Science, 2020-11, Vol. 7 (4), p. 355-372
Main Authors: Bokhabrine, Ayoub; Biskri, Ismaïl; Ghazzali, Nadia
Format: Article
Language: English
Description
Summary: The analysis of numerical data, whether structured, semi-structured, or raw, is of paramount importance in many sectors of economic, scientific, or simply social activity. The process of extracting association rules depends on the lexical quality of the text and on the minimum support set by the user. In this paper, we implemented a platform named “IDETEX” capable of extracting itemsets from textual data and using them to experiment with different types of clustering methods, such as K-Medoids and hierarchical clustering. The experiments conducted demonstrate the potential of the proposed approach for defining similarity between segments.
ISSN: 2196-8888, 2196-8896
DOI: 10.1142/S2196888820500207
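
To make the idea in the summary concrete, the following is a minimal sketch of how frequent itemsets might serve as descriptors of text segments and feed a similarity measure. It is not the IDETEX implementation: the function names (`frequent_itemsets`, `describe`, `jaccard`), the whitespace tokenization, the brute-force enumeration of word combinations (rather than an Apriori-style search), and the choice of Jaccard similarity are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(segments, min_support, max_size=2):
    """Return {itemset: support} for word sets found in at least
    a `min_support` fraction of the segments.

    Brute-force enumeration, suitable only for small illustrative inputs.
    """
    # Each segment becomes a "transaction": the set of its lowercase words.
    transactions = [set(s.lower().split()) for s in segments]
    n = len(transactions)
    counts = Counter()
    for words in transactions:
        for k in range(1, max_size + 1):
            for combo in combinations(sorted(words), k):
                counts[frozenset(combo)] += 1
    return {items: c / n for items, c in counts.items() if c / n >= min_support}


def describe(segment, itemsets):
    """Describe a segment by the frequent itemsets it fully contains."""
    words = set(segment.lower().split())
    return {items for items in itemsets if items <= words}


def jaccard(a, b):
    """Jaccard similarity between two descriptor sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical toy segments, not data from the paper.
segments = ["data mining text", "text mining rules", "cluster analysis"]
itemsets = frequent_itemsets(segments, min_support=0.5)
descriptors = [describe(s, itemsets) for s in segments]
similarity_0_1 = jaccard(descriptors[0], descriptors[1])
similarity_0_2 = jaccard(descriptors[0], descriptors[2])
```

A pairwise similarity (or the corresponding distance, one minus the similarity) computed this way could then be passed to a clustering method such as K-Medoids or hierarchical clustering, which accept precomputed dissimilarities. Raising `min_support` shrinks the set of descriptors, which is one way the user-chosen threshold influences the result.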