A Comprehensive Machine Learning Framework for the Exact Prediction of the Age of Onset in Familial and Sporadic Alzheimer’s Disease

Diagnostics ◽  
2021 ◽  
Vol 11 (5) ◽  
pp. 887
Author(s):  
Jorge I. Vélez ◽  
Luiggi A. Samper ◽  
Mauricio Arcos-Holzinger ◽  
Lady G. Espinosa ◽  
Mario A. Isaza-Ruget ◽  
...  

Machine learning (ML) algorithms are widely used to develop predictive frameworks. Accurate prediction of Alzheimer’s disease (AD) age of onset (ADAOO) is crucial to investigate potential treatments, follow-up, and therapeutic interventions. Although genetic and non-genetic factors affecting ADAOO were elucidated by other research groups and ours, the comprehensive and sequential application of ML to provide an exact estimate of the actual ADAOO, instead of a wide confidence interval within which the ADAOO may fall, remains to be explored. Here, we assessed the performance of ML algorithms for predicting ADAOO using two AD cohorts, one with early-onset familial AD and one with late-onset sporadic AD, combining genetic and demographic variables. Performance of ML algorithms was assessed using the root mean squared error (RMSE), the R-squared (R2), and the mean absolute error (MAE) with a 10-fold cross-validation procedure. For predicting ADAOO in familial AD, boosting-based ML algorithms performed the best. In the sporadic cohort, boosting-based ML algorithms performed best in the training data set, while regularization methods performed best for unseen data. ML algorithms represent a feasible alternative to accurately predict ADAOO with little human intervention. Future studies may include predicting the speed of cognitive decline in our cohorts using ML.
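The evaluation procedure named in the abstract above (RMSE, R2, and MAE under 10-fold cross-validation) can be illustrated with a minimal sketch. It is not the authors' pipeline: the ages are made-up illustrative values, the "model" is a hypothetical baseline that predicts the mean of the training fold, and the folds are contiguous rather than shuffled. All function names are assumptions introduced for illustration.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def k_fold_indices(n, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous folds over range(n)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

def cross_validate_mean_baseline(y, k=10):
    """Average RMSE and MAE of a predict-the-training-mean baseline over k folds."""
    scores = []
    for train, test in k_fold_indices(len(y), k):
        mean_train = sum(y[i] for i in train) / len(train)
        y_test = [y[i] for i in test]
        y_hat = [mean_train] * len(test)
        scores.append((rmse(y_test, y_hat), mae(y_test, y_hat)))
    avg_rmse = sum(s[0] for s in scores) / len(scores)
    avg_mae = sum(s[1] for s in scores) / len(scores)
    return avg_rmse, avg_mae

# Hypothetical ages of onset, for illustration only (not data from the study).
ages = [44, 47, 49, 51, 46, 48, 52, 45, 50, 47,
        68, 72, 75, 70, 66, 74, 71, 69, 73, 77]
avg_rmse, avg_mae = cross_validate_mean_baseline(ages, k=10)
print(f"baseline RMSE={avg_rmse:.2f}, MAE={avg_mae:.2f}")
```

In practice the folds would be shuffled and the baseline replaced by the learner under evaluation; the metric definitions stay the same.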

GeroScience ◽  
2021 ◽  
Author(s):  
Caitlin S. Latimer ◽  
Nicole F. Liachko

Abstract Alzheimer’s disease (AD) is traditionally defined by the presence of two types of protein aggregates in the brain: amyloid plaques comprised of the protein amyloid-β (Aβ) and neurofibrillary tangles containing the protein tau. However, a large proportion (up to 57%) of AD patients also have TDP-43 aggregates present as an additional comorbid pathology. The presence of TDP-43 aggregates in AD correlates with hippocampal sclerosis, worse brain atrophy, more severe cognitive impairment, and more rapid cognitive decline. In patients with mixed Aβ, tau, and TDP-43 pathology, TDP-43 may interact with neurodegenerative processes in AD, worsening outcomes. While considerable progress has been made to characterize TDP-43 pathology in AD and late-onset dementia, there remains a critical need for mechanistic studies to understand underlying disease biology and develop therapeutic interventions. This perspectives article reviews the current understanding of these processes from autopsy cohort studies and model organism-based research, and proposes targeting neurotoxic synergies between tau and TDP-43 as a new therapeutic strategy for AD with comorbid TDP-43 pathology.


2021 ◽  
Author(s):  
Louise Bloch ◽  
Christoph M. Friedrich

Abstract Background: Predicting whether subjects with Mild Cognitive Impairment (MCI) will go on to develop Alzheimer's Disease (AD) is important for the recruitment and monitoring of subjects for therapy studies. Machine Learning (ML) is suitable to improve early AD prediction. The etiology of AD is heterogeneous, which leads to noisy data sets. Additional noise is introduced by multicentric study designs and varying acquisition protocols. This article examines whether an automatic and fair data valuation method based on Shapley values can identify subjects with noisy data. Methods: An ML workflow was developed and trained on a subset of the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort. Validation was performed on an independent ADNI test data set and on the Australian Imaging, Biomarker and Lifestyle Flagship Study of Ageing (AIBL) cohort. The workflow included volumetric Magnetic Resonance Imaging (MRI) feature extraction, subject sample selection using data Shapley, Random Forest (RF) and eXtreme Gradient Boosting (XGBoost) for model training, and Kernel SHapley Additive exPlanations (SHAP) values for model interpretation. This model interpretation enables clinically relevant explanation of individual predictions. Results: On the independent ADNI test data set, the models trained on the entire training data set reached a mean classification accuracy of 58.54 %. The XGBoost models that excluded 116 of the 467 subjects from the training data set based on their Logistic Regression (LR) data Shapley values outperformed these by 14.13 % (8.27 percentage points). On the AIBL data set, the XGBoost models trained on the entire training data set reached a mean accuracy of 60.35 %. An improvement of 24.86 % (15.00 percentage points) was reached for the XGBoost models when the 72 subjects with the smallest RF data Shapley values were excluded from the training data set.
Conclusion: The data Shapley method was able to improve the classification accuracies for the test data sets. Noisy data was associated with the number of ApoE ε4 alleles and volumetric MRI measurements. Kernel SHAP showed that the black-box models learned biologically plausible associations.
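The Results above report each gain twice, as a relative improvement and as a difference in percentage points. The short sketch below shows how the two figures relate, using only the numbers quoted in the abstract; the function name is a hypothetical helper, not something from the paper.

```python
def relative_improvement(baseline_pct, gain_points):
    """Relative improvement (in %) for an absolute gain given in percentage points."""
    return 100.0 * gain_points / baseline_pct

# ADNI test set: baseline accuracy 58.54 %, gain of 8.27 percentage points.
adni_relative = relative_improvement(58.54, 8.27)
adni_improved = 58.54 + 8.27  # accuracy implied for the reduced training set

# AIBL data set: baseline accuracy 60.35 %, gain of 15.00 percentage points.
aibl_relative = relative_improvement(60.35, 15.00)
aibl_improved = 60.35 + 15.00

print(f"ADNI: +{adni_relative:.2f} % relative, {adni_improved:.2f} % accuracy")
print(f"AIBL: +{aibl_relative:.2f} % relative, {aibl_improved:.2f} % accuracy")
```

Both relative figures round to the values stated in the abstract (14.13 % and 24.86 %), which confirms that the percentages are taken relative to the baseline accuracy.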


2016 ◽  
Vol 8 (332) ◽  
pp. 332ra44-332ra44 ◽  
Author(s):  
Chia-Chen Liu ◽  
Na Zhao ◽  
Yu Yamaguchi ◽  
John R. Cirrito ◽  
Takahisa Kanekiyo ◽  
...  

Accumulation of amyloid-β (Aβ) peptide in the brain is the first critical step in the pathogenesis of Alzheimer’s disease (AD). Studies in humans suggest that Aβ clearance from the brain is frequently impaired in late-onset AD. Aβ accumulation leads to the formation of Aβ aggregates, which injure synapses and contribute to eventual neurodegeneration. Cell surface heparan sulfates (HSs), expressed on all cell types including neurons, have been implicated in several features of the pathogenesis of AD, including their colocalization with amyloid plaques and their modulatory role in Aβ aggregation. We show that removal of neuronal HS by conditional deletion of the Ext1 gene, which encodes an essential glycosyltransferase for HS biosynthesis, in postnatal neurons of amyloid model APP/PS1 mice led to a reduction in both Aβ oligomerization and the deposition of amyloid plaques. In vivo microdialysis experiments also detected an accelerated rate of Aβ clearance in the brain interstitial fluid, suggesting that neuronal HS either inhibited or represented an inefficient pathway for Aβ clearance. We found that the amounts of various HS proteoglycans (HSPGs) were increased in postmortem human brain tissues from AD patients, suggesting that this pathway may contribute directly to amyloid pathogenesis. Our findings have implications for AD pathogenesis and provide insight into therapeutic interventions targeting Aβ-HSPG interactions.


2019 ◽  
Vol 3 (2) ◽  
Author(s):  
Holly C. Hunsberger ◽  
Priyanka D. Pinky ◽  
Warren Smith ◽  
Vishnu Suppiramaniam ◽  
Miranda N. Reed

Abstract Alzheimer’s disease (AD) is the leading cause of dementia, affecting almost 50 million people worldwide. The ε4 allele of Apolipoprotein E (APOE) is the strongest known genetic risk factor for late-onset AD cases, with homozygous APOE4 carriers being approximately 15 times more likely to develop the disease. With 25% of the population being APOE4 carriers, understanding the role of this allele in AD pathogenesis and pathophysiology is crucial. Though the exact mechanism by which the ε4 allele increases the risk for AD is unknown, the processes mediated by APOE, including cholesterol transport, synapse formation, modulation of neurite outgrowth, synaptic plasticity, destabilization of microtubules, and β-amyloid clearance, suggest potential therapeutic targets. This review will summarize the impact of APOE on neurons and neuronal signaling, the interactions between APOE and AD pathology, and the association with memory decline. We will then describe current treatments targeting APOE4, complications associated with the current therapies, and suggestions for future areas of research and treatment.


Author(s):  
Ratnavalli Ellajosyula

The term ‘early onset Alzheimer’s disease’ (EOAD) is used when symptoms of Alzheimer’s disease (AD) occur in patients younger than 65 years. EOAD is an uncommon condition and data on its epidemiology are limited. Prevalence rates range from 15 to 200 and incidence rates from 2.4 to 22.6 per 100,000 population. Prevalence rates increase with age, similar to those for late onset AD. The prevalence of autosomal dominant EOAD is 5.2 per 100,000. Half of these patients have an underlying mutation in the amyloid precursor protein, presenilin 1 or presenilin 2 genes. Apolipoprotein E genotype is a risk factor for EOAD and homozygotes have an earlier age of onset. Methodological issues and geographical location make comparisons across epidemiological studies difficult. Further cross-national and cross-cultural studies with standardized methodology are necessary to understand the role of risk and protective factors, as well as to estimate the burden of the disease.


1996 ◽  
Vol 2 (1) ◽  
pp. 3-6 ◽  
Author(s):  
Fuki M. Hisama ◽  
Gerard D. Schellenberg

Recent intensive work has highlighted the genetic basis of several forms of Alzheimer's disease (AD). Mutations in the amyloid precursor protein gene on chromosome 21 can cause either an early-onset autosomal dominant AD or hereditary cerebral hemorrhage with amyloidosis. On chromosome 14, a second gene associated with 70 to 90% of early-onset familial AD (FAD) was identified by positional cloning in 1995. Still other kindreds show no linkage to either chromosome 21 or chromosome 14; the third locus (on chromosome 1) was recently identified in affected descendants of a group of families known as the Volga Germans. In late-onset (age >65 years) AD, the ε4 allele of the apolipoprotein E gene on chromosome 19 has clearly been shown to be a risk factor for the development of AD and appears to modify the age of onset of the disease. The emerging picture is that AD is a genetically complex, heterogeneous disorder. Precisely how these genetic factors interact with each other and with other yet-to-be-identified genetic and nongenetic (environmental) factors to produce the clinical and pathologic findings in AD remains to be elucidated. The Neuroscientist 2:3–6, 1996


2021 ◽  
Author(s):  
Bojan Bogdanovic ◽  
Tome Eftimov ◽  
Monika Simjanoska

Abstract Background: Alzheimer's disease is still a field of research with many open questions. The complexity of the disease prevents early diagnosis before visible symptoms affecting the individual's cognitive capabilities occur. This research presents an in-depth analysis of a large data set encompassing medical, cognitive and lifestyle measurements from more than 12,000 individuals. Several hypotheses were established, whose validity has been questioned in light of the obtained results. Methods: The importance of appropriate experimental design is strongly stressed in the research. Thus, a sequence of methods for handling missing data, redundancy, data imbalance, and correlation analysis has been applied to preprocess the data set appropriately, and Random Forest and XGBoost models have then been trained and evaluated with special attention to hyperparameter tuning. Both models were explained using the Shapley values produced by the SHAP method. Results: XGBoost produced the best F1-score of 0.84 and as such is considered to be highly competitive among those published in the literature. This achievement, however, was not the main contribution of this paper. This research's goal was to perform global and local interpretability of both intelligent models and derive valuable conclusions about the established hypotheses. Those methods led to a single scheme which presents either the positive or the negative influence of the values of each of the features whose importance has been confirmed by means of Shapley values. This scheme might be considered an additional source of knowledge for physicians and other experts concerned with the exact diagnosis of early-stage Alzheimer's disease. Conclusion: The conclusions derived from the interpretability of the intelligent models rejected all the established hypotheses.
This research clearly showed the importance of a machine learning explainability approach that opens the black box and clearly unveils the relationships among the features and the diagnoses.


2021 ◽  
Author(s):  
Magdalena Arnal Segura ◽  
Dietmar Fernandez ◽  
Claudia Giambartolomei ◽  
Giorgio Bini ◽  
Eleftherios Samaras ◽  
...  

INTRODUCTION Genome-wide association studies (GWAS) in late onset Alzheimer's disease (LOAD) provide lists of individual genetic determinants. However, GWAS are not good at capturing the synergistic effects among multiple genetic variants and lack good specificity. METHODS We applied tree-based machine learning algorithms (MLs) to discriminate LOAD (>700 individuals) from age-matched unaffected subjects using single nucleotide variants (SNVs) from AD studies, obtaining specific genomic profiles with the prioritized SNVs. RESULTS The MLs prioritized a set of SNVs located in the closely neighboring genes PVRL2, TOMM40, APOE and APOC1. The captured genomic profiles in this region showed a clear interaction between rs405509 and rs1160985. Additionally, rs405509, located in the APOE promoter, interacts with rs429358 among others, seemingly neutralizing their predisposing effect. Interactions are characterized by their association with specific comorbidities and the presence of eQTLs and sQTLs. DISCUSSION Our approach efficiently discriminates LOAD from controls, capturing genomic profiles defined by interactions among SNVs in a hot-spot region.

