An Application of Machine Learning to Etiological Diagnosis of Secondary Hypertension: Retrospective Study Using Electronic Medical Records (Preprint)

2020 ◽  
Author(s):  
Xiaolin Diao ◽  
Yanni Huo ◽  
Zhanzheng Yan ◽  
Haibin Wang ◽  
Jing Yuan ◽  
...  

BACKGROUND Secondary hypertension is hypertension with an identifiable etiology and may be curable. Patients with suspected secondary hypertension benefit from timely detection and treatment; without it, they face a higher risk of morbidity and mortality than patients with primary hypertension. OBJECTIVE The aim of this study was to develop and validate machine learning (ML) prediction models of common etiologies in patients with suspected secondary hypertension. METHODS The analyzed dataset was retrospectively extracted from electronic medical records (EMRs) of patients discharged from Fuwai Hospital between January 1, 2016 and June 30, 2019. A total of 7532 unique patients were included and divided into two datasets by time: 6302 patients from 2016-2018 as the training dataset for model building and 1230 patients from 2019 as the validation dataset for further evaluation. Extreme Gradient Boosting (XGBoost) was used to develop five prediction models: one for each of four etiologies of secondary hypertension (renovascular hypertension [RVH], primary aldosteronism [PA], thyroid dysfunction, and aortic stenosis) and one for the composite outcome of any of the four. Both univariate logistic analysis and the Gini impurity method were used for feature selection, while grid search and 10-fold cross-validation were used to select the optimal hyperparameters for each model. RESULTS The composite outcome prediction model performed well, with an area under the receiver-operating characteristic curve (AUC) of 0.924 in the validation dataset, while the four prediction models of RVH, PA, thyroid dysfunction, and aortic stenosis achieved AUCs of 0.938, 0.965, 0.959, and 0.946, respectively, in the validation dataset. In total, 79 clinical indicators were identified and used in the prediction models. Subgroup analysis of the composite outcome prediction model demonstrated high discrimination, with AUCs higher than 0.890 in all adult age groups. CONCLUSIONS The ML prediction models in this study showed good performance in detecting four etiologies in patients with suspected secondary hypertension and may therefore facilitate clinical diagnostic decision-making for secondary hypertension.
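As a minimal sketch of the evaluation design described in this abstract, the Python below splits records by discharge year and computes the AUC on the held-out year. The record fields (`discharge_year`, `label`, `score`), the toy values, and both function names are assumptions made for illustration. The scores stand in for the output of an already fitted model and are not data from the study.

```python
from typing import Dict, List, Sequence, Tuple

Record = Dict[str, float]


def temporal_split(records: Sequence[Record], cutoff_year: int) -> Tuple[List[Record], List[Record]]:
    """Return (training, validation): records before the cutoff year vs. from it onward."""
    train = [r for r in records if r["discharge_year"] < cutoff_year]
    valid = [r for r in records if r["discharge_year"] >= cutoff_year]
    return train, valid


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy records: label 1 = etiology present, score = model output.
records = [
    {"discharge_year": 2016, "label": 1, "score": 0.9},
    {"discharge_year": 2017, "label": 0, "score": 0.2},
    {"discharge_year": 2018, "label": 0, "score": 0.4},
    {"discharge_year": 2019, "label": 1, "score": 0.8},
    {"discharge_year": 2019, "label": 0, "score": 0.3},
    {"discharge_year": 2019, "label": 1, "score": 0.3},
    {"discharge_year": 2019, "label": 0, "score": 0.1},
]

train, valid = temporal_split(records, cutoff_year=2019)
auc = roc_auc([int(r["label"]) for r in valid], [r["score"] for r in valid])
print(len(train), len(valid), auc)
```

Splitting by time rather than at random keeps every validation patient later than every training patient, which is closer to how such a model would be used prospectively. The pairwise AUC is quadratic in the number of cases, so a rank-based implementation would be preferable for datasets of realistic size.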

2019 ◽  
Author(s):  
Rayees Rahman ◽  
Arad Kodesh ◽  
Stephen Z Levine ◽  
Sven Sandin ◽  
Abraham Reichenberg ◽  
...  

Abstract Importance. Current approaches for early identification of individuals at high risk for autism spectrum disorder (ASD) in the general population are limited, and most ASD patients are not identified until after the age of 4. This is despite substantial evidence suggesting that early diagnosis and intervention improve developmental course and outcome. Objective. To develop a machine learning (ML) method for predicting the diagnosis of ASD in offspring in a general population sample, using parental electronic medical records (EMRs) available before childbirth. Design. Prognostic study of EMR data within a single Israeli health maintenance organization, for the parents of 1,397 ASD children (ICD-9/10) and 94,741 non-ASD children born between January 1st, 1997 and December 31st, 2008. The complete EMR of the parents was used to develop various ML models to predict the risk of having a child with ASD. Main outcomes and measures. Routinely available parental sociodemographic information, medical histories, and prescribed medications data until the offspring's birth were used to generate features to train various ML algorithms, including multivariate logistic regression, artificial neural networks, and random forest. Prediction performance was evaluated with 10-fold cross-validation by computing the C statistic, sensitivity, specificity, accuracy, false positive rate, and precision (positive predictive value, PPV). Results. All ML models tested had similar performance, achieving an average C statistic of 0.70, sensitivity of 28.63%, specificity of 98.62%, accuracy of 96.05%, false positive rate of 1.37%, and PPV of 45.85% for predicting ASD in this dataset. Conclusion and relevance. ML algorithms combined with EMR capture early-life ASD risk. Such approaches may be able to enhance the ability for accurate and efficient early detection of ASD in large populations of children. Key points. Question: Can autism risk in children be predicted using the pre-birth electronic medical record (EMR) of the parents? Findings: In this population-based study that included 1,397 children with autism spectrum disorder (ASD) and 94,741 non-ASD children, we developed a machine learning classifier for predicting the likelihood of childhood diagnosis of ASD with an average C statistic of 0.70, sensitivity of 28.63%, specificity of 98.62%, accuracy of 96.05%, false positive rate of 1.37%, and positive predictive value of 45.85%. Meaning: The results presented serve as a proof of principle of the potential utility of EMR for the identification of a large proportion of future children at high risk of ASD.
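The performance measures named in this abstract all derive from the four cells of a confusion matrix. The sketch below shows those definitions in Python. The function name and the counts are hypothetical, chosen only to illustrate how a rare outcome can combine high accuracy with low sensitivity; they are not reconstructed from the study.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary-classification measures from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("each denominator needs at least one case")
    return {
        "sensitivity": tp / (tp + fn),            # true positive rate (recall)
        "specificity": tn / (tn + fp),            # true negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "false_positive_rate": fp / (fp + tn),    # equals 1 - specificity
        "ppv": tp / (tp + fp),                    # precision
    }


# Hypothetical counts: 100 cases among 2,000 individuals (5% prevalence).
metrics = classification_metrics(tp=30, fp=20, tn=1880, fn=70)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these toy counts the classifier misses 70 of 100 cases yet is still 95.5% accurate, because the negatives dominate the total. This is why accuracy alone says little when the outcome is rare.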


2020 ◽  
Vol 63 (1) ◽  
Author(s):  
Rayees Rahman ◽  
Arad Kodesh ◽  
Stephen Z. Levine ◽  
Sven Sandin ◽  
Abraham Reichenberg ◽  
...  

Abstract Background. Current approaches for early identification of individuals at high risk for autism spectrum disorder (ASD) in the general population are limited, and most ASD patients are not identified until after the age of 4. This is despite substantial evidence suggesting that early diagnosis and intervention improve developmental course and outcome. The aim of the current study was to test the ability of machine learning (ML) models applied to electronic medical records (EMRs) to predict ASD early in life, in a general population sample. Methods. We used EMR data from a single Israeli Health Maintenance Organization, including EMR information for parents of 1,397 ASD children (ICD-9/10) and 94,741 non-ASD children born between January 1st, 1997 and December 31st, 2008. Routinely available parental sociodemographic information, parental medical histories, and prescribed medications data were used to generate features to train various ML algorithms, including multivariate logistic regression, artificial neural networks, and random forest. Prediction performance was evaluated with 10-fold cross-validation by computing the area under the receiver operating characteristic curve (AUC; C-statistic), sensitivity, specificity, accuracy, false positive rate, and precision (positive predictive value [PPV]). Results. All ML models tested had similar performance. Averaged across all models, performance was a C-statistic of 0.709, sensitivity of 29.93%, specificity of 98.18%, accuracy of 95.62%, false positive rate of 1.81%, and PPV of 43.35% for predicting ASD in this dataset. Conclusions. We conclude that ML algorithms combined with EMR capture early-life ASD risk and reveal previously unknown features associated with ASD risk. Such approaches may be able to enhance the ability for accurate and efficient early detection of ASD in large populations of children.
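The 10-fold cross-validation mentioned in the Methods can be sketched as follows. The abstract does not say whether folds were stratified by outcome; stratification is an assumption here, made because it keeps the rare positive class represented in every fold. The function names and the toy labels are likewise hypothetical.

```python
import random
from typing import Dict, Iterator, List, Sequence, Tuple


def stratified_folds(labels: Sequence[int], k: int, seed: int = 0) -> List[List[int]]:
    """Assign sample indices to k folds so each fold has a similar class balance."""
    if k < 2:
        raise ValueError("k must be at least 2")
    rng = random.Random(seed)
    by_class: Dict[int, List[int]] = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    folds: List[List[int]] = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class out round-robin so it is spread evenly over the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def cross_validation_splits(folds: List[List[int]]) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices), holding out one fold at a time."""
    for held_out, test_idx in enumerate(folds):
        train_idx = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        yield train_idx, test_idx


# Hypothetical toy labels: 20 positives among 200 samples.
labels = [1] * 20 + [0] * 180
folds = stratified_folds(labels, k=10)
splits = list(cross_validation_splits(folds))
print([sum(labels[i] for i in fold) for fold in folds])
```

Each sample is held out exactly once, so a metric computed per fold and then averaged uses every observation for testing without ever testing on data the model was trained on in that round.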


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 150489-150513 ◽  
Author(s):  
Jahanzaib Latif ◽  
Chuangbai Xiao ◽  
Shanshan Tu ◽  
Sadaqat Ur Rehman ◽  
Azhar Imran ◽  
...  
