An early prediction model to identify neurological complications of childhood influenza: a random forest model

Abstract. Aims: We aimed to construct a random forest model to predict atrial fibrillation (AF) in the Chinese population. Methods and results: This study comprised 682 237 subjects with or without AF. Each subject had 19 features that included the subjects’ age, gender, underlying diseases, CHA2DS2-VASc score, and follow-up period. The data were split into train and test sets at an approximate 9:1 ratio: 614 013 data points were placed into the train set and 68 224 data points were placed into the test set. Weighted average F1, precision, and recall values were used to measure prediction model performance, and were calculated across the train set, the test set, and all data. The area under the receiver operating characteristic (ROC) curve was also used to evaluate the performance of the prediction model. The prediction model achieved a k-fold cross-validation accuracy of 0.979 (k = 10). In the test set, the prediction model achieved an F1 value of 0.968, a precision value of 0.958, and a recall value of 0.979. The area under the ROC curve of the model was 0.948 (95% confidence interval 0.947–0.949). This model was validated with a separate dataset. Conclusions: This study showed a novel AF risk prediction scheme for Chinese individuals using random forest model methodology.
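The weighted-average precision, recall, and F1 named in the abstract can be illustrated with a small standard-library sketch. It assumes the usual definition, in which each class's score is weighted by its share of the true labels; the abstract does not spell out the exact computation, and the toy labels below are invented for illustration rather than taken from the study.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return support-weighted average (precision, recall, F1) over all classes."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        p_c = tp / n_pred if n_pred else 0.0
        r_c = tp / n_true
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = n_true / total  # class share of the true labels
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return precision, recall, f1


# Toy labels (1 = AF, 0 = no AF); hypothetical, not data from the study.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]
precision, recall, f1 = weighted_scores(y_true, y_pred)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

With support weighting, the weighted recall equals plain accuracy, which is one reason heavily imbalanced data can yield high weighted scores even when the minority class is predicted less well.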

Artificial Intelligence Prediction Model for the Cost and Mortality of Renal Replacement Therapy in Aged and Super-Aged Populations in Taiwan

Journal of Clinical Medicine ◽ 10.3390/jcm8070995 ◽ 2019 ◽ Vol 8 (7) ◽ pp. 995 ◽ Cited By ~ 2

Author(s): Shih-Yi Lin ◽ Meng-Hsuen Hsieh ◽ Cheng-Li Lin ◽ Meng-Ju Hsieh ◽ Wu-Huei Hsu ◽ ...

Keyword(s): Neural Network ◽ Artificial Intelligence ◽ Artificial Neural Network ◽ Random Forest ◽ Prediction Model ◽ Random Forest Model ◽ Forest Model ◽ Artificial Neural ◽ One Year ◽ The Cost

Background: The prognosis of aged patients requiring maintenance dialysis is reportedly poor. We aimed to develop prediction models for one-year cost and one-year mortality in aged individuals requiring dialysis, to support decisions on whether aged people should receive dialysis. Methods: We used data from the National Health Insurance Research Database (NHIRD). We identified patients first enrolled in the NHIRD from 2000–2011 for end-stage renal disease (ESRD) who underwent regular dialysis. A total of 48,153 patients with ESRD aged ≥65 years with complete age and sex information were included in the ESRD cohort. The total medical cost per patient (measured in US dollars) within one year after ESRD diagnosis was the main outcome variable; mortality was a second outcome. We compared the performance of a random forest prediction model and an artificial neural network prediction model for predicting patient cost and mortality. Results: In the cost regression model, the random forest model outperformed the artificial neural network according to the mean squared error and mean absolute error. In the mortality classification model, the areas under the receiver operating characteristic (ROC) curves of both models were significantly better than the null hypothesis area of 0.5, and the random forest model outperformed the artificial neural network. The models achieved similar performance in the test set and across all data. Conclusions: Applying artificial intelligence modeling could help to provide reliable information about one-year outcomes following dialysis in the aged and super-aged populations; those with cancer, alcohol-related disease, stroke, chronic obstructive pulmonary disease (COPD), previous hip fracture, osteoporosis, dementia, and previous respiratory failure had higher medical costs and a high mortality rate.
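The two regression metrics used to compare the cost models, mean squared error and mean absolute error, can behave differently on the same predictions. The sketch below uses invented cost figures and two hypothetical models (`model_a`, `model_b`) to show that two sets of predictions with the same mean absolute error can differ in mean squared error when one of them contains a single large miss.

```python
def mean_squared_error(actual, predicted):
    """Average of squared differences; penalises large errors more heavily."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_error(actual, predicted):
    """Average of absolute differences; every unit of error counts equally."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical one-year costs in US dollars (not values from the study).
actual_cost = [10000.0, 20000.0, 30000.0, 40000.0]
model_a = [11000.0, 19000.0, 31000.0, 39000.0]  # off by 1000 on every patient
model_b = [10000.0, 20000.0, 30000.0, 36000.0]  # exact except one miss of 4000

for name, predicted in (("model_a", model_a), ("model_b", model_b)):
    print(name,
          mean_absolute_error(actual_cost, predicted),
          mean_squared_error(actual_cost, predicted))
```

Reporting both metrics, as the abstract does, guards against a comparison that depends only on how occasional large errors are penalised.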

Real-time AI prediction for major adverse cardiac events in emergency department patients with chest pain

Scandinavian Journal of Trauma Resuscitation and Emergency Medicine ◽ 10.1186/s13049-020-00786-x ◽ 2020 ◽ Vol 28 (1)

Author(s): Pei-I Zhang ◽ Chien-Chin Hsu ◽ Yuan Kao ◽ Chia-Jung Chen ◽ Ya-Wei Kuo ◽ ...

Keyword(s): Emergency Department ◽ Chest Pain ◽ Random Forest ◽ Prediction Model ◽ Real Time ◽ Random Forest Model ◽ Cardiac Events ◽ Forest Model ◽ Adverse Cardiac Events ◽ All Cause Mortality

Abstract. Background: A big-data-driven artificial intelligence (AI) approach with machine learning (ML) has never been integrated with the hospital information system (HIS) for predicting major adverse cardiac events (MACE) in patients with chest pain in the emergency department (ED). We conducted the present study to address this gap. Methods: In total, 85,254 ED patients with chest pain in three hospitals between 2009 and 2018 were identified. We randomized the patients into a 70%/30% split for ML model training and testing. We used 14 clinical variables from their electronic health records to construct a random forest model with the synthetic minority oversampling technique preprocessing algorithm to predict acute myocardial infarction (AMI) < 1 month and all-cause mortality < 1 month. Comparisons of the predictive accuracies among random forest, logistic regression, support-vector clustering (SVC), and K-nearest neighbor (KNN) models were also performed. Results: Predicting MACE using the random forest model produced areas under the curve (AUCs) of 0.915 for AMI < 1 month and 0.999 for all-cause mortality < 1 month. The random forest model had better predictive accuracy than logistic regression, SVC, and KNN. We further integrated the AI prediction model with the HIS to assist physicians with decision-making in real time. Validation of the AI prediction model with new patients showed AUCs of 0.907 for AMI < 1 month and 0.888 for all-cause mortality < 1 month. Conclusions: An AI real-time prediction model is a promising method for assisting physicians in predicting MACE in ED patients with chest pain. Further studies to evaluate the impact on clinical practice are warranted.
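The core idea of the synthetic minority oversampling technique mentioned in the Methods is to create new minority-class samples by interpolating between an existing minority sample and one of its nearest minority neighbours. The following is a simplified standard-library sketch of that idea only; the function name, the toy points, and the parameter defaults are assumptions made for illustration, and it is not the implementation used in the study.

```python
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each on the segment between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [m for m in minority if m is not base]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical two-feature minority-class samples.
minority = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (3.0, 3.0)]
new_points = smote_like_oversample(minority, n_new=6)
for point in new_points:
    print(point)
```

Oversampling of this kind is normally applied to the training split only, so that the test split keeps the original class balance; the abstract does not state how the split and the oversampling were ordered.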

A random forest model based on core genome allelic profiles of MRSA for penicillin plus potassium clavulanate susceptibility prediction

Microbial Genomics ◽ 10.1099/mgen.0.000610 ◽ 2021 ◽ Vol 7 (9)

Author(s): Hemu Zhuang ◽ Feiteng Zhu ◽ Peng Lan ◽ Shujuan Ji ◽ Lu Sun ◽ ...

Keyword(s): Random Forest ◽ Prediction Model ◽ Core Genome ◽ Drug Susceptibility ◽ Random Forest Model ◽ Training Set ◽ Content Type ◽ Forest Model ◽ Scope Of Application ◽ Potassium Clavulanate

Treatment failure of methicillin-resistant Staphylococcus aureus (MRSA) infections remains problematic in clinical practice because therapeutic options are limited. The penicillin plus potassium clavulanate combination (PENC) was shown to have potential for treating some MRSA infections. We investigated the PENC susceptibility of MRSA isolates and constructed a prediction model for the PENC susceptibility phenotype. We determined the minimum inhibitory concentration of PENC for MRSA (n=284) in a teaching hospital (SRRSH-MRSA). PENC susceptibility genotypes were analysed using a published genotyping scheme based on the mecA sequence. mecA expression in MRSA isolates was analysed by qPCR. We established a random forest model for predicting PENC-susceptible phenotypes using core genome allelic profiles from cgMLST analysis. We identified S2-R isolates with susceptible mecA genotypes but PENC-resistant phenotypes; these isolates expressed mecA at higher levels than did S2 MRSA (2.61 vs 0.98, P<0.05), indicating the limitation of using a single factor for predicting drug susceptibility. Using the data of selected UK-sourced MRSA (n=74) and MRSA collected in a previous national survey (NA-MRSA, n=471) as a training set, we built a model with accuracies of 0.94 and 0.93 for SRRSH-MRSA and UK-sourced MRSA (n=287, NAM-MRSA) validation sets. The AUROC of this model for SRRSH-MRSA and NAM-MRSA was 0.96 and 0.97, respectively. Although the source of the training set data affects the scope of application of the prediction model, our data demonstrated the power of the machine learning approach in predicting susceptibility from cgMLST results.

A Prediction Model for High Risk of Positive RT-PCR Tests in Discharged Patients with COVID-19 Based on Random Forest Model

SSRN Electronic Journal ◽ 10.2139/ssrn.3745156 ◽ 2020

Author(s): Yawei Qian ◽ Guang Zeng ◽ Yue Pan ◽ Yang Liu ◽ Yufeng Yuan ◽ ...

Keyword(s): High Risk ◽ Random Forest ◽ Prediction Model ◽ Random Forest Model ◽ RT-PCR ◽ Forest Model ◽ Discharged Patients

Spatial modeling of gully head erosion on the Loess Plateau using a certainty factor and random forest model

The Science of The Total Environment ◽ 10.1016/j.scitotenv.2021.147040 ◽ 2021 ◽ Vol 783 ◽ pp. 147040

Author(s): Chengcheng Jiang ◽ Wen Fan ◽ Ningyu Yu ◽ Enlong Liu

Keyword(s): Random Forest ◽ Loess Plateau ◽ Spatial Modeling ◽ Random Forest Model ◽ Certainty Factor ◽ The Loess Plateau ◽ Forest Model ◽ Gully Head

Clinical trial registries as Scientometric data: A novel solution for linking and deduplicating clinical trials from multiple registries

Scientometrics ◽ 10.1007/s11192-021-04111-w ◽ 2021

Author(s): Christian Thiele ◽ Gerrit Hirschfeld ◽ Ruth von Brachel

Keyword(s): Clinical Trials ◽ Random Forest ◽ Random Forest Model ◽ Scientometric Analysis ◽ Data Set ◽ The Public ◽ Forest Model ◽ Clinical Trial Registries ◽ Multiple Primary ◽ Clinical Trials Registry

Abstract: Registries of clinical trials are a potential source for scientometric analysis of medical research and serve important functions for the research community and the public at large. Clinical trials that recruit patients in Germany are usually registered in the German Clinical Trials Register (DRKS) or in international registries such as ClinicalTrials.gov. Furthermore, the International Clinical Trials Registry Platform (ICTRP) aggregates trials from multiple primary registries. We queried the DRKS, ClinicalTrials.gov, and the ICTRP for trials with a recruiting location in Germany. Trials that were registered in multiple registries were linked using the primary and secondary identifiers and a Random Forest model based on various similarity metrics. We identified 35,912 trials that were conducted in Germany. The majority of the trials were registered in multiple databases. 32,106 trials were linked using primary IDs, 26 were linked using a Random Forest model, and 10,537 internal duplicates on ICTRP were identified using the Random Forest model after finding pairs with matching primary or secondary IDs. In cross-validation, the Random Forest increased the F1-score from 96.4% to 97.1% compared to a linkage based solely on secondary IDs on a manually labelled data set. 28% of all trials were registered in the German DRKS. 54% of the trials on ClinicalTrials.gov, 43% of the trials on the DRKS and 56% of the trials on the ICTRP were pre-registered. The ratio of pre-registered studies and the ratio of studies that are registered in the DRKS increased over time.
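A linkage model of the kind described takes, for each candidate pair of registry records, a vector of similarity scores as input. The abstract does not list the similarity metrics used, so the sketch below is only an assumed illustration: the field names (`title`, `enrollment`), the three features, and the example records are all hypothetical.

```python
from difflib import SequenceMatcher


def similarity_features(record_a, record_b):
    """Return similarity scores for one candidate pair of trial records."""
    title_a = record_a["title"].lower()
    title_b = record_b["title"].lower()
    tokens_a, tokens_b = set(title_a.split()), set(title_b.split())
    union = tokens_a | tokens_b
    return {
        # Character-level similarity of the two titles, between 0 and 1.
        "title_ratio": SequenceMatcher(None, title_a, title_b).ratio(),
        # Jaccard overlap of the title word sets.
        "title_jaccard": len(tokens_a & tokens_b) / len(union) if union else 0.0,
        # 1.0 when the planned enrollment numbers agree exactly.
        "same_enrollment": float(record_a["enrollment"] == record_b["enrollment"]),
    }


# Hypothetical registry entries, invented for illustration.
trial_1 = {"title": "Exercise Therapy for Chronic Back Pain", "enrollment": 120}
trial_2 = {"title": "Exercise therapy for chronic back pain", "enrollment": 120}
trial_3 = {"title": "Vitamin D in Early Multiple Sclerosis", "enrollment": 48}

same = similarity_features(trial_1, trial_2)
different = similarity_features(trial_1, trial_3)
print(same)
print(different)
```

A classifier trained on manually labelled pairs can then learn which combinations of scores indicate that two records describe the same trial, which is useful when identifiers are missing or inconsistent.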

Discrimination of the geographic origins and varieties of wine grapes using high-throughput sequencing assisted by a random forest model

LWT ◽ 10.1016/j.lwt.2021.111333 ◽ 2021 ◽ pp. 111333

Author(s): Feifei Gao ◽ Guihua Zeng ◽ Bin Wang ◽ Jing Xiao ◽ Liang Zhang ◽ ...

Keyword(s): Random Forest ◽ High Throughput ◽ High Throughput Sequencing ◽ Random Forest Model ◽ Wine Grapes ◽ Forest Model ◽ Geographic Origins

Multi-Scenario Prediction of Intra-Urban Land Use Change Using a Cellular Automata-Random Forest Model

ISPRS International Journal of Geo-Information ◽ 10.3390/ijgi10080503 ◽ 2021 ◽ Vol 10 (8) ◽ pp. 503

Author(s): Hang Liu ◽ Riken Homma ◽ Qiang Liu ◽ Congying Fang

Keyword(s): Land Use ◽ Random Forest ◽ Cellular Automata ◽ Land Use Change ◽ Urban Land ◽ Urban Land Use ◽ Random Forest Model ◽ Growth Trend ◽ Related Factors ◽ Forest Model

The simulation of future land use can provide decision support for urban planners and decision makers, which is important for sustainable urban development. Using a cellular automata-random forest model, we considered two scenarios to predict intra-urban land use changes in Kumamoto City from 2018 to 2030: an unconstrained development scenario, and a planning-constrained development scenario that considers disaster-related factors. The random forest was used to calculate the transition probabilities and the importance of driving factors, and cellular automata were used for future land use prediction. The results show that disaster-related factors greatly influence land vacancy, while urban planning factors are more important for medium high-rise residential, commercial, and public facilities. Under the unconstrained development scenario, urban land use tends towards spatially disordered growth while the total amount grows steadily, with the largest increase in low-rise residential areas. Under the planning-constrained development scenario that considers disaster-related factors, the urban land area will continue to grow, albeit slowly and with a compact growth trend. This study provides planners with information on the relevant trends in different scenarios of land use change in Kumamoto City. Furthermore, it provides a reference for Kumamoto City’s future post-disaster recovery and reconstruction planning.
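The division of labour described in the abstract, in which a model supplies per-cell transition probabilities and a cellular automaton applies them together with a neighbourhood effect, can be sketched on a tiny grid. Everything below is an assumption made for illustration: the two-state grid, the `suitability` values standing in for random forest probabilities, the multiplication rule, and the threshold are not taken from the study.

```python
def ca_step(grid, suitability, threshold=0.3):
    """One synchronous update: a vacant cell (0) becomes urban (1) when
    suitability times the share of urban neighbours exceeds the threshold."""
    rows, cols = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]  # copy, so all cells read the old state
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1:
                continue  # urban cells stay urban in this sketch
            # Moore neighbourhood (up to 8 cells), clipped at the grid edge.
            neighbours = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            urban_share = sum(neighbours) / len(neighbours)
            if suitability[r][c] * urban_share > threshold:
                new_grid[r][c] = 1
    return new_grid


# Hypothetical 3x3 land-use grid: 1 = urban, 0 = vacant.
grid = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
# Hypothetical per-cell transition probabilities (stand-in for model output).
suitability = [
    [0.9, 0.9, 0.5],
    [0.9, 0.9, 0.2],
    [0.6, 0.2, 0.1],
]
next_grid = ca_step(grid, suitability)
for row in next_grid:
    print(row)
```

In this toy run only the centre cell converts, because it combines a high suitability with three urban neighbours out of eight. A planning-constrained scenario could be represented by lowering the suitability of cells in restricted or hazard-prone zones.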