Could LSA become a “Bifactor” model? Towards a model with general and group factors

Bibliographic Details
Published in:	Expert systems with applications 2019-10, Vol.131, p.71-80
Main Authors:	Jorge-Botana, Guillermo; Olmos, Ricardo; Luzón, José María
Format:	Article
Language:	English
Subjects:	Bifactor model; Distributional semantics; Expert systems; Factor analysis; Inbuilt-Rubric method; Latent semantic analysis; Rotation; Semantic analysis; Semantics; Text assessment

Description
Summary:
• Some limitations of a method for semantically interpreting LSA dimensions are tackled.
• The method produces an orthogonal non-latent space from the original LSA latent space.
• One limitation is that the non-latent space does not represent the common variance.
• A method inspired by Bifactor Models introduces an additional common-variance dimension.
• The proposed corrections outperform the current Inbuilt-Rubric version.

One insufficiently grounded criticism of Latent Semantic Analysis is that its dimensions cannot be interpreted semantically. This is not true: several studies have transformed the latent semantic space in order to interpret its dimensions. One such method is the Inbuilt-Rubric method. Rather than grouping concepts around dimensions, as rotation methods based on Exploratory Factor Analysis do, the Inbuilt-Rubric method imposes concepts a priori onto the latent semantic space; that is, it uses a confirmatory strategy. This study proposes solutions for two limitations of the current Inbuilt-Rubric methodology: one is inspired by Bifactor Models and manages the common variance of the concepts involved; the other is based on randomizing the sequence in which the process is performed. Both methods outperform the current Inbuilt-Rubric version in relevant content detection. The reported improvements can be incorporated into expert systems that use Latent Semantic Analysis and the Inbuilt-Rubric method for relevant content detection or text classification tasks.
ISSN:	0957-4174 1873-6793
DOI:	10.1016/j.eswa.2019.04.055
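
The summary notes that the Inbuilt-Rubric method builds an orthogonal space from chosen concepts and that the sequence in which the process runs matters. As a rough illustration only, the Python sketch below assumes the orthogonalization is a Gram-Schmidt-style sequential procedure (this record does not say so) and that results from randomized sequences are combined by simple averaging (also an assumption). The names `gram_schmidt`, `rubric_scores`, `averaged_scores`, and `n_orders`, as well as the toy vectors, are hypothetical and not taken from the article.

```python
import numpy as np


def gram_schmidt(vectors):
    """Orthonormalize the rows of `vectors`, processing them in the given order."""
    basis = []
    for v in vectors:
        w = v.astype(float).copy()
        for b in basis:
            # Remove the component already explained by earlier axes.
            w -= np.dot(w, b) * b
        norm = np.linalg.norm(w)
        if norm < 1e-12:
            raise ValueError("concept vectors are linearly dependent")
        basis.append(w / norm)
    return np.array(basis)


def rubric_scores(text_vec, concepts, order):
    """Project a text vector onto axes built from the concepts in `order`.

    The returned scores are indexed by the original concept position.
    """
    order = list(order)
    basis = gram_schmidt(concepts[order])
    scores = np.empty(len(order))
    scores[order] = basis @ text_vec
    return scores


def averaged_scores(text_vec, concepts, n_orders=200, seed=0):
    """Average the scores over randomly drawn concept orders (assumed strategy)."""
    rng = np.random.default_rng(seed)
    k = len(concepts)
    runs = [rubric_scores(text_vec, concepts, rng.permutation(k))
            for _ in range(n_orders)]
    return np.mean(runs, axis=0)


# Toy 3-dimensional "latent space" with two overlapping concept vectors.
concepts = np.array([[1.0, 0.0, 0.0],
                     [1.0, 1.0, 0.0]])
text_vec = np.array([1.0, 0.0, 0.0])

scores_01 = rubric_scores(text_vec, concepts, (0, 1))  # concept 0 goes first
scores_10 = rubric_scores(text_vec, concepts, (1, 0))  # concept 1 goes first
mean_scores = averaged_scores(text_vec, concepts)

print(scores_01, scores_10, mean_scores)
```

In this toy case the first concept in the sequence absorbs all the variance it shares with the second, so the two orders give different scores for the same text. Averaging over random orders is one simple way to reduce that dependence on the sequence.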