Dataset Information

Detection of dementia on voice recordings using deep learning: a Framingham Heart Study.

ABSTRACT:

Background

Identification of reliable, affordable, and easy-to-use strategies for detection of dementia is sorely needed. Digital technologies, such as individual voice recordings, offer an attractive modality to assess cognition but methods that could automatically analyze such data are not readily available.

Methods and findings

We used 1264 voice recordings of neuropsychological examinations administered to participants from the Framingham Heart Study (FHS), a community-based longitudinal observational study. The recordings were 73 min in duration, on average, and contained at least two speakers (participant and examiner). Of the total voice recordings, 483 were of participants with normal cognition (NC), 451 were of participants with mild cognitive impairment (MCI), and 330 were of participants with dementia (DE).

We developed two deep learning models, a two-level long short-term memory (LSTM) network and a convolutional neural network (CNN), which used the audio recordings for two tasks: classifying whether a recording came from a participant with NC or a participant with DE, and differentiating recordings of participants with DE from recordings of participants without DE (NDE, i.e., NC + MCI).

Based on 5-fold cross-validation, the LSTM model achieved a mean (±std) area under the receiver operating characteristic curve (AUC) of 0.740 ± 0.017, mean balanced accuracy of 0.647 ± 0.027, and mean weighted F1 score of 0.596 ± 0.047 in classifying cases with DE from those with NC. The CNN model achieved a mean AUC of 0.805 ± 0.027, mean balanced accuracy of 0.743 ± 0.015, and mean weighted F1 score of 0.742 ± 0.033 on the same task.

For the classification of participants with DE from those who were NDE, the LSTM model achieved a mean AUC of 0.734 ± 0.014, mean balanced accuracy of 0.675 ± 0.013, and mean weighted F1 score of 0.671 ± 0.015. The CNN model achieved a mean AUC of 0.746 ± 0.021, mean balanced accuracy of 0.652 ± 0.020, and mean weighted F1 score of 0.635 ± 0.031.
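The two classification tasks and the three reported metrics can be illustrated with a minimal Python sketch. It is not the authors' code: the helper names (`to_binary`, `roc_auc`, `balanced_accuracy`, `weighted_f1`, `summarize`), the task labels, and the toy scores are assumptions made for illustration, and the sample standard deviation is assumed for the "± std" figure because the abstract does not say which variant was used.

```python
from statistics import mean, stdev

# Recording counts taken from the abstract.
COUNTS = {"NC": 483, "MCI": 451, "DE": 330}

def to_binary(label, task):
    """Map a diagnosis to a binary target (hypothetical task names)."""
    if task == "NC_vs_DE":          # MCI recordings are left out of this task
        return {"NC": 0, "DE": 1}.get(label)
    if task == "NDE_vs_DE":         # NDE pools NC and MCI
        return 1 if label == "DE" else 0
    raise ValueError(f"unknown task: {task}")

def roc_auc(y_true, scores):
    """AUC as the chance a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return mean(recalls)

def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to class support."""
    total = 0.0
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        total += f1 * (tp + fn) / len(y_true)
    return total

def summarize(fold_values):
    """Mean and sample standard deviation across cross-validation folds."""
    return mean(fold_values), stdev(fold_values)

# Toy example: five recordings, model scores for the DE class.
y_true = [0, 0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.3]
y_pred = [int(s >= 0.5) for s in scores]

n_nde = COUNTS["NC"] + COUNTS["MCI"]   # size of the pooled NDE class
print(roc_auc(y_true, scores), balanced_accuracy(y_true, y_pred),
      weighted_f1(y_true, y_pred), n_nde)
```

In the DE-versus-NDE task the classes are imbalanced (330 versus 934 recordings), which is one reason balanced accuracy and a support-weighted F1 score are more informative there than plain accuracy.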

Conclusion

This proof-of-concept study demonstrates that automated deep learning-driven processing of audio recordings of neuropsychological testing performed on individuals recruited within a community cohort setting can facilitate dementia screening.

SUBMITTER: Xue C

PROVIDER: S-EPMC8409004 | biostudies-literature | 2021 Aug

REPOSITORIES: biostudies-literature

Publications

Detection of dementia on voice recordings using deep learning: a Framingham Heart Study.

Xue Chonghua, Karjadi Cody, Paschalidis Ioannis Ch, Au Rhoda, Kolachalama Vijaya B

Alzheimer's Research & Therapy, 2021-08-31, issue 1

PMID: 34465384

Similar Datasets

Project description:

Background

With the aging global population and the rising burden of Alzheimer disease and related dementias (ADRDs), there is a growing focus on identifying mild cognitive impairment (MCI) to enable timely interventions that could potentially slow down the onset of clinical dementia. The production of speech by an individual is a cognitively complex task that engages various cognitive domains. The ease of audio data collection highlights the potential cost-effectiveness and noninvasive nature of using human speech as a tool for cognitive assessment.

Objective

This study aimed to construct a machine learning pipeline that incorporates speaker diarization, feature extraction, feature selection, and classification to identify a set of acoustic features derived from voice recordings that exhibit strong MCI detection capability.

Methods

The study included 100 MCI cases and 100 cognitively normal controls matched for age, sex, and education from the Framingham Heart Study. Participants' spoken responses on neuropsychological tests were recorded, and the recorded audio was processed to identify segments of each participant's voice from recordings that included voices of both testers and participants. A comprehensive set of 6385 acoustic features was then extracted from these voice segments using OpenSMILE and Praat software. Subsequently, a random forest model was constructed to classify cognitive status using the features that exhibited significant differences between the MCI and cognitively normal groups. The MCI detection performance of various audio lengths was further examined.

Results

An optimal subset of 29 features was identified that resulted in an area under the receiver operating characteristic curve of 0.87, with a 95% CI of 0.81-0.94. The most important acoustic feature for MCI classification was the number of filled pauses (importance score=0.09, P=3.10E-08). There was no substantial difference in the performance of the model trained on the acoustic features derived from different lengths of voice recordings.

Conclusions

This study showcases the potential of monitoring changes to nonsemantic and acoustic features of speech as a way of early ADRD detection and motivates future opportunities for using human speech as a measure of brain health.

Project description:

Background

The prevalence of dementia is expected to soar as the average life expectancy increases, but recent estimates suggest that the age-specific incidence of dementia is declining in high-income countries. Temporal trends are best derived through continuous monitoring of a population over a long period with the use of consistent diagnostic criteria. We describe temporal trends in the incidence of dementia over three decades among participants in the Framingham Heart Study.

Methods

Participants in the Framingham Heart Study have been under surveillance for incident dementia since 1975. In this analysis, which included 5205 persons 60 years of age or older, we used Cox proportional-hazards models adjusted for age and sex to determine the 5-year incidence of dementia during each of four epochs. We also explored the interactions between epoch and age, sex, apolipoprotein E ε4 status, and educational level, and we examined the effects of these interactions, as well as the effects of vascular risk factors and cardiovascular disease, on temporal trends.

Results

The 5-year age- and sex-adjusted cumulative hazard rates for dementia were 3.6 per 100 persons during the first epoch (late 1970s and early 1980s), 2.8 per 100 persons during the second epoch (late 1980s and early 1990s), 2.2 per 100 persons during the third epoch (late 1990s and early 2000s), and 2.0 per 100 persons during the fourth epoch (late 2000s and early 2010s). Relative to the incidence during the first epoch, the incidence declined by 22%, 38%, and 44% during the second, third, and fourth epochs, respectively. This risk reduction was observed only among persons who had at least a high school diploma (hazard ratio, 0.77; 95% confidence interval, 0.67 to 0.88). The prevalence of most vascular risk factors (except obesity and diabetes) and the risk of dementia associated with stroke, atrial fibrillation, or heart failure have decreased over time, but none of these trends completely explain the decrease in the incidence of dementia.

Conclusions

Among participants in the Framingham Heart Study, the incidence of dementia has declined over the course of three decades. The factors contributing to this decline have not been completely identified. (Funded by the National Institutes of Health.)
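As a rough check on the epoch-to-epoch declines quoted in the Results, the relative change can be recomputed from the four rounded hazard rates. The sketch below is only an approximation under that assumption: the variable names are illustrative, and the published percentages presumably come from the adjusted models rather than from this crude ratio, so small differences are expected.

```python
# 5-year cumulative hazard rates per 100 persons, as quoted in the Results.
rates = {
    "epoch 1": 3.6,
    "epoch 2": 2.8,
    "epoch 3": 2.2,
    "epoch 4": 2.0,
}

baseline = rates["epoch 1"]

# Crude relative decline versus the first epoch, in percent.
decline = {
    epoch: (baseline - rate) / baseline * 100
    for epoch, rate in rates.items()
    if epoch != "epoch 1"
}

for epoch, pct in decline.items():
    print(f"{epoch}: {pct:.1f}% lower than epoch 1")
```

The crude ratios come out near 22%, 39%, and 44%, close to the reported 22%, 38%, and 44%.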
