A new approach to identifying patients with elevated risk for Fabry disease using a machine learning algorithm

2021 ◽  
Vol 16 (1) ◽  
Author(s):  
John L. Jefferies ◽  
Alison K. Spencer ◽  
Heather A. Lau ◽  
Matthew W. Nelson ◽  
Joseph D. Giuliano ◽  
...  

Abstract Background Fabry disease (FD) is a rare genetic disorder characterized by glycosphingolipid accumulation and progressive damage across multiple organ systems. Due to its heterogeneous presentation, the condition is likely significantly underdiagnosed. Several approaches, including provider education efforts and newborn screening, have attempted to address underdiagnosis of FD across the age spectrum, with limited success. Artificial intelligence (AI) methods present another option for improving diagnosis. These methods isolate common health history patterns among patients using longitudinal real-world data, and can be particularly useful when patients experience nonspecific, heterogeneous symptoms over time. In this study, the performance of an AI tool in identifying patients with FD was analyzed. The tool was calibrated using de-identified health record data from a large cohort of nearly 5000 FD patients, and extracted phenotypic patterns from these records. The tool then used this FD pattern information to make individual-level estimates of FD in a testing dataset. Patterns were reviewed and confirmed with medical experts. Results The AI tool demonstrated strong analytic performance in identifying FD patients. In out-of-sample testing, it achieved an area under the receiver operating characteristic curve (AUROC) of 0.82. Strong performance was maintained when testing on male-only and female-only cohorts, with AUROCs of 0.83 and 0.82 respectively. The tool identified small segments of the population with greatly increased prevalence of FD: in the 1% of the population identified by the tool as at highest risk, FD was 23.9 times more prevalent than in the population overall. The AI algorithm used hundreds of phenotypic signals to make predictions and included both familiar symptoms associated with FD (e.g. renal manifestations) as well as less well-studied characteristics. 
Conclusions The AI tool analyzed in this study performed very well in identifying Fabry disease patients using structured medical history data. Performance was maintained in all-male and all-female cohorts, and the phenotypic manifestations of FD highlighted by the tool were reviewed and confirmed by clinical experts in the condition. The platform’s analytic performance, transparency, and ability to generate predictions based on existing real-world health data may allow it to contribute to reducing persistent underdiagnosis of Fabry disease.
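
The two headline metrics above, AUROC and the prevalence ratio in the highest-risk segment, can be illustrated with a minimal sketch. The function names and the toy labels and scores below are hypothetical and are not taken from the study; the code only shows how such numbers are typically computed from per-patient risk scores.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outranks a random
    negative, with ties counted as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def prevalence_lift(labels: Sequence[int], scores: Sequence[float],
                    top_fraction: float) -> float:
    """Prevalence among the highest-scored fraction divided by the overall
    prevalence (the kind of ratio reported as '23.9 times more prevalent')."""
    n = len(labels)
    k = max(1, int(n * top_fraction))
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    top_prevalence = sum(y for _, y in ranked[:k]) / k
    overall_prevalence = sum(labels) / n
    if overall_prevalence == 0:
        raise ValueError("no positive cases in the population")
    return top_prevalence / overall_prevalence


# Hypothetical toy data: 10 patients, 2 with the condition.
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

print("AUROC:", auroc(toy_labels, toy_scores))
print("Lift in top 20%:", prevalence_lift(toy_labels, toy_scores, 0.2))
```

The pairwise AUROC loop is quadratic in the number of patients; a rank-based formulation would be preferable on a real cohort.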

2018 ◽  
Vol 26 (1) ◽  
pp. 43-66 ◽  
Author(s):  
Uday Kamath ◽  
Carlotta Domeniconi ◽  
Kenneth De Jong

Many real-world problems involve massive amounts of data. Under these circumstances learning algorithms often become prohibitively expensive, making scalability a pressing issue to be addressed. A common approach is to perform sampling to reduce the size of the dataset and enable efficient learning. Alternatively, one customizes learning algorithms to achieve scalability. In either case, the key challenge is to obtain algorithmic efficiency without compromising the quality of the results. In this article we discuss a meta-learning algorithm (PSBML) that combines concepts from spatially structured evolutionary algorithms (SSEAs) with concepts from ensemble and boosting methodologies to achieve the desired scalability property. We present both theoretical and empirical analyses which show that PSBML preserves a critical property of boosting, specifically, convergence to a distribution centered around the margin. We then present additional empirical analyses showing that this meta-level algorithm provides a general and effective framework that can be used in combination with a variety of learning classifiers. We perform extensive experiments to investigate the trade-off achieved between scalability and accuracy, and robustness to noise, on both synthetic and real-world data. These empirical results corroborate our theoretical analysis, and demonstrate the potential of PSBML in achieving scalability without sacrificing accuracy.


Author(s):  
Shuji Hao ◽  
Peilin Zhao ◽  
Yong Liu ◽  
Steven C. H. Hoi ◽  
Chunyan Miao

Relative similarity learning (RSL) aims to learn similarity functions from data with relative constraints. Most previous algorithms developed for RSL are batch-based learning approaches, which scale poorly when dealing with real-world data arriving sequentially. These methods are often designed to learn a single similarity function for a specific task, so they may be sub-optimal for multi-task learning problems. To overcome these limitations, we propose a scalable RSL framework named OMTRSL (Online Multi-Task Relative Similarity Learning). Specifically, we first develop a simple yet effective online learning algorithm for multi-task relative similarity learning. We then propose an active learning algorithm to reduce the labeling cost. The proposed algorithms not only enjoy theoretical guarantees, but also show high efficacy and efficiency in extensive experiments on real-world datasets.
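
To make the idea of online learning from relative constraints concrete, here is a minimal single-task sketch. It is not OMTRSL: it assumes a bilinear similarity S(a, b) = aᵀWb and a passive-aggressive update on triplets (x, x_pos, x_neg), in the style of the OASIS algorithm. All function names and the toy vectors are hypothetical.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def bilinear(W: Matrix, a: Vector, b: Vector) -> float:
    """Bilinear similarity S(a, b) = a^T W b."""
    return sum(a[i] * W[i][j] * b[j]
               for i in range(len(a)) for j in range(len(b)))


def triplet_loss(W: Matrix, x: Vector, x_pos: Vector, x_neg: Vector) -> float:
    """Hinge loss for the constraint 'x is more similar to x_pos than to
    x_neg' with a margin of 1."""
    return max(0.0, 1.0 - bilinear(W, x, x_pos) + bilinear(W, x, x_neg))


def pa_update(W: Matrix, x: Vector, x_pos: Vector, x_neg: Vector,
              C: float = 1.0) -> Matrix:
    """One passive-aggressive step: leave W unchanged if the constraint is
    already satisfied, otherwise move it along x (x_pos - x_neg)^T with a
    step size capped by the aggressiveness parameter C."""
    loss = triplet_loss(W, x, x_pos, x_neg)
    if loss == 0.0:
        return W
    diff = [p - n for p, n in zip(x_pos, x_neg)]
    # Squared Frobenius norm of the rank-one matrix x diff^T.
    norm_sq = sum(v * v for v in x) * sum(d * d for d in diff)
    if norm_sq == 0.0:
        return W
    tau = min(C, loss / norm_sq)
    return [[W[i][j] + tau * x[i] * diff[j] for j in range(len(diff))]
            for i in range(len(x))]


# Hypothetical triplet processed as it "arrives": start from the identity.
W0 = [[1.0, 0.0], [0.0, 1.0]]
x, x_pos, x_neg = [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]
W1 = pa_update(W0, x, x_pos, x_neg, C=10.0)
print("loss before:", triplet_loss(W0, x, x_pos, x_neg))
print("loss after: ", triplet_loss(W1, x, x_pos, x_neg))
```

Because each triplet is handled once and then discarded, memory use does not grow with the length of the stream, which is the scalability property the abstract contrasts with batch methods.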


2021 ◽  
Author(s):  
Aayushi Rathore ◽  
Anu Saini ◽  
Navjot Kaur ◽  
Aparna Singh ◽  
Ojasvi Dutta ◽  
...  

ABSTRACT Sepsis is a severe infectious disease with high mortality. It occurs when chemicals released into the bloodstream to fight an infection trigger inflammation throughout the body, which can cause a cascade of changes that damage multiple organ systems, leading them to fail and even resulting in death. To reduce the possibility of sepsis or infection, antiseptics are used, a process known as antisepsis. Antiseptic peptides (ASPs) show properties similar to anti-Gram-negative peptides, anti-Gram-positive peptides and many more. Machine learning algorithms are useful in the screening and identification of therapeutic peptides and thus provide initial filters or build confidence before time-consuming and laborious experimental approaches are used. In this study, various machine learning algorithms such as Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbour (KNN) and Logistic Regression (LR) were evaluated for the prediction of ASPs. Moreover, the characteristic physicochemical features of ASPs were explored for use in machine learning. Both manual and automatic feature selection methodologies were employed to achieve the best performance of the machine learning algorithms. 5-fold cross-validation and independent data set validation showed RF to be the best model for the prediction of ASPs. Our RF model showed an accuracy of 97% and a Matthews correlation coefficient (MCC) of 0.93, which indicate a robust and good model. To our knowledge, this is the first attempt to build a machine learning classifier for the prediction of ASPs.
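
Accuracy and MCC are both functions of the four confusion-matrix counts. The sketch below shows the standard formulas; the counts used in the example are hypothetical and are not the study's results.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient, ranging from -1 to 1.
    Returns 0.0 when any marginal total is zero (a common convention)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for a balanced test set of 100 peptides.
counts = dict(tp=48, tn=49, fp=1, fn=2)
print("accuracy:", accuracy(**counts))
print("MCC:", round(mcc(**counts), 4))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the positive and negative classes are imbalanced.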


2019 ◽  
Vol 106 (5) ◽  
pp. 925-926 ◽  
Author(s):  
Simon Körver ◽  
Ulla Feldt‐Rasmussen ◽  
Einar Svarstad ◽  
Ilkka Kantola ◽  
Mirjam Langeveld

Author(s):  
JAN RENDEK ◽  
LAURENT WENDLING

We present an approach to automatically extract a pertinent subset of soft output classifiers, and to aggregate them into a global decision rule using the Choquet integral. This approach relies on two key points. The first is a learning algorithm that uses a measure of the confusion between the categories to be recognized. The second is a selection scheme that discards weak or redundant decision rules, keeping only the most relevant subset. An experimental study, based on real world data, is then described. It analyzes the improvements achieved by these points first when used independently, then when combined together.
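
For readers unfamiliar with the aggregation step, the discrete Choquet integral can be sketched as follows. This is a generic illustration, not the authors' implementation: the classifier names, the soft outputs and the fuzzy measure `mu` are hypothetical, and the measure is supplied by hand rather than learned from confusion between categories.

```python
from typing import Dict, FrozenSet


def choquet_integral(values: Dict[str, float],
                     mu: Dict[FrozenSet[str], float]) -> float:
    """Discrete Choquet integral of `values` with respect to the fuzzy
    measure `mu`.

    Sources are sorted by increasing value; each increment between
    consecutive values is weighted by the measure of the coalition of
    sources whose value is at least that large.
    """
    ordered = sorted(values, key=lambda name: values[name])
    total = 0.0
    previous = 0.0
    for i, name in enumerate(ordered):
        coalition = frozenset(ordered[i:])  # sources with value >= current
        total += (values[name] - previous) * mu[coalition]
        previous = values[name]
    return total


# Hypothetical soft outputs of two classifiers for one category.
outputs = {"clf_a": 0.8, "clf_b": 0.4}

# Hypothetical fuzzy measure: mu(all sources) = 1 and mu is monotone.
mu = {
    frozenset({"clf_a"}): 0.6,
    frozenset({"clf_b"}): 0.5,
    frozenset({"clf_a", "clf_b"}): 1.0,
}

print("aggregated score:", choquet_integral(outputs, mu))
```

When the measure is additive, the integral reduces to a weighted mean; a non-additive measure lets the aggregation reward or penalize particular coalitions of classifiers.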


2020 ◽  
Vol 13 (6) ◽  
pp. 913-925
Author(s):  
Dawn A Laney ◽  
Dominique P Germain ◽  
João Paulo Oliveira ◽  
Alessandro P Burlina ◽  
Gustavo Horacio Cabrera ◽  
...  

Abstract The rapid spread of coronavirus disease 2019 (COVID-19) caused by severe acute respiratory syndrome coronavirus 2 has raised questions about Fabry disease (FD) as an independent risk factor for severe COVID-19 symptoms. Available real-world data on 22 patients from an international group of healthcare providers reveal that most patients with FD experience mild-to-moderate COVID-19 symptoms with an additional complication of Fabry pain crises and transient worsening of kidney function in some cases; however, two patients over the age of 55 years with renal or cardiac disease experienced critical COVID-19 complications. These outcomes support the theory that pre-existing tissue injury and inflammation may predispose patients with more advanced FD to a more severe course of COVID-19, while less advanced FD patients do not appear to be more susceptible than the general population. Given these observed risk factors, it is best to reinforce all recommended safety precautions for individuals with advanced FD. Diagnosis of FD should not preclude providing full therapeutic and organ support as needed for patients with FD and severe or critical COVID-19, although an FD-specific safety profile review should always be conducted prior to initiating COVID-19-specific therapies. Continued specific FD therapy with enzyme replacement therapy, chaperone therapy, dialysis, renin–angiotensin blockers or participation in clinical trials during the pandemic is recommended, as FD progression will only increase susceptibility to infection. In order to compile outcome data and inform best practices, an international registry for patients affected by Fabry disease and infected by COVID-19 should be established.


2020 ◽  
Author(s):  
Wen-Cheng Liu ◽  
Chin-Sheng Lin ◽  
Chien-Sung Tsai ◽  
Tien-Ping Tsao ◽  
Cheng-Chung Cheng ◽  
...  

Abstract Background The initial detection and diagnosis of ST-segment or non-ST-segment elevation myocardial infarction (STEMI or NSTEMI) rely on a 12-lead electrocardiogram (ECG). Delay or misdiagnosis due to subjective interpretation is not unusual. Our aim is to develop a deep learning model (DLM) as a diagnostic support tool to detect MI based on a 12-lead ECG and to evaluate the performance of this model. Methods This study included 1,051 ECGs from 737 coronary angiography (CAG)-validated STEMI patients, 697 ECGs from 287 CAG-validated NSTEMI patients, and 140,336 not-MI ECGs from 76,775 patients at emergency departments. The DLM was trained and validated using 80% and 20% of the ECGs, respectively. A human-machine competition was conducted. The area under the receiver operating characteristic curve (AUC), sensitivity, and specificity were used to evaluate the performance of the DLM and experts. STEMI versus not-STEMI, and MI versus not-MI were evaluated by the DLM. Results The AUCs of the DLM for identifying STEMI and MI were 0.976 and 0.944 in the human-machine competition, respectively, which were significantly better than those of our best clinicians. In the real-world setting, the DLM achieved AUCs of 0.995/0.916 with corresponding sensitivities of 96.9%/77.0% and specificities of 96.2%/92.9% in the identification of STEMI and MI, respectively. Furthermore, the DLM demonstrated sufficient diagnostic capacity for STEMI without the aid of troponin I (TnI) (AUC = 0.996), with corresponding sensitivity and specificity of 98.4% and 96.9%. The AUC of the combined DLM and first recorded TnI for the detection of NSTEMI increased to 0.978, with corresponding sensitivity and specificity of 91.6% and 96.7%, which was better than that of the DLM (0.877) or TnI (0.949) alone.
Conclusions The DLM may serve as a diagnostic decision tool to assist intensive or emergency medical system-based networks and frontline physicians in identifying STEMI and NSTEMI in a timely and precise manner to prevent delay or misdiagnosis, and thereby to facilitate subsequent reperfusion therapy.
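
Sensitivity and specificity, as reported above, depend on the threshold applied to the model's output score. A minimal sketch follows, assuming binary labels and a hypothetical threshold of 0.5; the labels and scores are made up for illustration and do not come from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(labels: Sequence[int], scores: Sequence[float],
                            threshold: float) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when scores >= threshold are
    called positive. Requires at least one positive and one negative."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical ECG-level labels (1 = MI) and model scores.
ecg_labels = [1, 1, 1, 0, 0, 0, 0, 0]
ecg_scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1, 0.4, 0.05]

sens, spec = sensitivity_specificity(ecg_labels, ecg_scores, threshold=0.5)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Lowering the threshold trades specificity for sensitivity; the AUC summarizes this trade-off over all possible thresholds.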


2018 ◽  
Vol 23 (4) ◽  
pp. 422-432 ◽  
Author(s):  
Barbara Nuñez‐Valdovinos ◽  
Alberto Carmona‐Bayonas ◽  
Paula Jimenez‐Fonseca ◽  
Jaume Capdevila ◽  
Ángel Castaño‐Pascual ◽  
...  
