
Exploiting linguistic information from Nepali transcripts for early detection of Alzheimer's disease using natural language processing and machine learning techniques

Bibliographic Details
Published in: International journal of human-computer studies, 2022-04, Vol. 160, p. 102761, Article 102761
Main Authors: Adhikari, Surabhi; Thapa, Surendrabikram; Naseem, Usman; Singh, Priyanka; Huo, Huan; Bharathy, Gnana; Prasad, Mukesh
Format: Article
Language: English
Description
Summary:
• A novel, manually annotated Alzheimer's disease dataset for a low-resource language, Nepali, is presented, consisting of 168 Alzheimer's disease patients and 98 normal control subjects. The dataset is publicly available to the research community.
• An NLP-based framework is presented for the early detection of AD using Nepali transcripts, along with a visualization of the content present in the textual data.
• A word cloud of the most common words is presented to give a qualitative analysis.
• The performance of different state-of-the-art machine-learning-based text classification methods is presented, and baseline results for each are reported.

Alzheimer's disease (AD) is a progressive brain disease whose progression can be slowed through early detection of its symptoms and proper treatment. Language change serves as an early sign that a patient's cognitive functions have been affected, and can therefore support early detection. The effects of language change have been studied thoroughly in English, where Natural Language Processing (NLP) is used to analyze the linguistic patterns of AD patients. However, they have been little explored in local and low-resource languages such as Nepali. In this paper, we create a novel dataset in a low-resource language, Nepali, consisting of transcripts of AD patients and normal control subjects. We also present baselines obtained by applying various machine learning (ML) and deep learning (DL) algorithms to this dataset for the early detection of AD. The proposed work uses the speech decline of AD patients to classify subjects as control subjects or AD patients. The study concludes that the difficulty AD patients have in processing information is reflected in their speech narratives when describing a picture. The dataset is made publicly available.
ISSN: 1071-5819, 1095-9300
DOI: 10.1016/j.ijhcs.2021.102761