El análisis estilométrico aplicado a la literatura española: las novelas policiacas e históricas
Published in: | Caracteres (Salamanca) 2016, Vol.5 (2), p.196-245 |
---|---|
Main Author: | |
Format: | Article |
Language: | Spanish |
Subjects: | |
Summary: | This paper demonstrates that a computer can determine the authorship of a text.
To this end we created a corpus of 122 contemporary novels written in Spanish
(69 historical novels, 50 crime novels, and 3 westerns). The corpus was then
studied using stylo, a stylometric analysis package written in the programming
language R. We chose to apply the simplest of the multiple types of analysis
offered by this package: cluster analysis. The results are notable: using
just the 100 most frequent words (MFW), the
computer grouped the works of each author together and assigned those
published under a pseudonym to their true author without a single error.
En este artículo se trata de mostrar si un ordenador es capaz de determinar la
autoría de un texto. Para ello se ha creado un corpus de 122 novelas
contemporáneas (69 de tema histórico, 50 policiacas y 3 del oeste) y se han
analizado con el paquete de análisis estilométrico stylo. De todos los análisis
que ofrece este paquete, escrito en R, se ha utilizado el más sencillo: el análisis
de grupos. Los resultados han sido muy interesantes ya que con un mínimo de
100 palabras (las más frecuentes) el ordenador ha sido capaz de agrupar, sin
error alguno, las distintas obras de cada autor y ha sabido asignar al autor real
aquellas que se publicaron bajo seudónimo. |
---|---|
ISSN: | 2254-4496 |