random forest model
Recently Published Documents

2022 ◽ Vol 11 ◽ Huangqi Zhang ◽ Binhao Zhang ◽ Wenting Pan ◽ Xue Dong ◽ Xin Li

Purpose: This study aimed to develop a repeatable MRI-based machine learning model to differentiate between low-grade gliomas (LGGs) and glioblastoma (GBM) and to provide more clinical information to improve treatment decision-making. Methods: Preoperative MRIs of gliomas from The Cancer Imaging Archive (TCIA)–GBM/LGG database were selected. The tumor on contrast-enhanced MRI was segmented, and quantitative image features were extracted from the segmentations. A random forest classification algorithm was used to establish a model in the training set, and the model was then tested on an external test set. Three radiologists reviewed the images for the external test set. The area under the receiver operating characteristic curve (AUC) was calculated, and the AUCs of the radiomics model and the radiologists were compared. Results: The random forest model was fitted using a training set of 142 patients [mean age, 52 years ± 16 (standard deviation); 78 men], comprising 88 cases of GBM. The external test set included 25 patients (14 with GBM). Random forest analysis yielded an AUC of 1.00 [95% confidence interval (CI): 0.86–1.00]. The AUCs for the three readers were 0.92 (95% CI 0.74–0.99), 0.70 (95% CI 0.49–0.87), and 0.59 (95% CI 0.38–0.78). Only the comparison with Reader 1 showed no statistically significant difference (1.00 vs. 0.92; p = 0.16). Conclusion: An MRI radiomics-based random forest model proved useful in differentiating GBM from LGG and showed better diagnostic performance than two inexperienced radiologists.
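The AUC used above to compare the model with the readers can be illustrated with a minimal sketch. It assumes the standard rank-based (Mann-Whitney) definition of AUC; the function name `roc_auc` and all scores are hypothetical and are not data or code from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the Mann-Whitney U statistic.

    labels: 1 for the positive class (e.g. GBM), 0 for the negative class (e.g. LGG).
    scores: predicted probability or reader confidence for the positive class.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    # Count positive/negative pairs that are ranked correctly; ties count as half.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for six cases (illustration only, not study data).
example_labels = [1, 1, 1, 0, 0, 0]
model_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]   # classes perfectly separated
reader_scores = [0.9, 0.5, 0.3, 0.6, 0.3, 0.2]  # some overlap between classes

print(roc_auc(example_labels, model_scores))
print(roc_auc(example_labels, reader_scores))
```

An AUC of 1.00 means every positive case was scored above every negative case. With only 25 test patients, the confidence interval around such an estimate remains wide, as the reported 0.86–1.00 shows.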

2022 ◽ Yuto Sunaga ◽ Atsushi Watanabe ◽ Nobuyuki Katsumata ◽ Takako Toda ◽ Masashi Yoshizawa

Abstract: In Kawasaki disease (KD), accurate prediction of intravenous immunoglobulin (IVIG) resistance is crucial to reduce the risk of developing coronary artery lesions. To establish a simple and accurate scoring model for predicting IVIG resistance, we conducted a retrospective cohort study of 996 KD patients who were diagnosed at 11 facilities over 10 years, in which 108 cases (23.5%) were resistant to initial IVIG treatment. We performed machine learning with a random forest model using 30 clinical variables at diagnosis, with 796 cases in the training dataset and 200 cases in the test dataset. The random forest model predicted IVIG resistance with an AUC of 0.75 (sensitivity, 0.54; specificity, 0.80). Next, using the top five influential features in the random forest model (days of illness at initial therapy and serum levels of C-reactive protein, sodium, total bilirubin, and total cholesterol), we designed a simple scoring system. In spite of its simplicity, the scoring system predicted IVIG resistance (AUC, 0.73; sensitivity, 0.55; specificity, 0.83) as accurately as the random forest model itself. Moreover, the accuracy of our scoring system with five clinical features was almost identical to that of the Gunma score with seven clinical features (AUC, 0.73; sensitivity, 0.53; specificity, 0.83), a well-known logistic regression scoring model, and superior to that of two widely used scores (Kurume score: AUC, 0.67; sensitivity, 0.46; specificity, 0.76; Osaka score: AUC, 0.69; sensitivity, 0.33; specificity, 0.84). Conclusions: Our simple scoring system based on the machine learning findings, as well as machine learning itself, appears useful for accurately predicting IVIG resistance in KD patients.
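The sensitivity and specificity figures quoted for each scoring system follow from thresholding a point score. A minimal sketch under stated assumptions: the function name, the cutoff of 3 points, and all patient scores are hypothetical, and the abstract does not give the actual point assignments or cutoff.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[int], cutoff: int
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when score >= cutoff is called positive.

    labels: 1 for IVIG-resistant, 0 for IVIG-responsive (hypothetical coding).
    scores: integer risk score per patient.
    """
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= cutoff
        if y == 1:
            tp += predicted_positive
            fn += not predicted_positive
        else:
            fp += predicted_positive
            tn += not predicted_positive
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical scores for ten patients (illustration only, not study data).
example_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
example_scores = [4, 3, 1, 2, 0, 1, 3, 2, 1, 0]

sens, spec = sensitivity_specificity(example_labels, example_scores, cutoff=3)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Raising the cutoff trades sensitivity for specificity, which is one reason the abstract reports AUC alongside both figures when comparing scoring systems.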

Peipei Xu ◽ Wei Fang ◽ Tao Zhou ◽ Hu Li ◽ Xiang Zhao

Abstract: The frequency and intensity of drought events are increasing with the warming climate, which has resulted in worldwide forest mortality. Previous studies have reached a general consensus on the size-dependency of forest resistance to drought, but understanding at the local scale remains ambiguous, with conflicting evidence. In this study, we assessed the impact of canopy height on forest drought resistance in the broadleaf deciduous forest of southwestern China for the 2010 extreme drought event using linear regression and a random forest model. Drought condition was quantified with the SPEI (standardized precipitation evapotranspiration index), and drought resistance was measured as the ratio of NDVI (normalized difference vegetation index) during the drought (2010) to NDVI before the drought (2009). At the regional scale we found that 1) the drought resistance of taller canopies (30 m and up) declined drastically more than that of lower canopies under extreme drought (SPEI < -2), and 2) the random forest model showed that the importance of canopy height increased from 17.08% to 20.05% as drought intensity increased from no drought to extreme drought. Our results suggest that canopy structure plays a significant role in forest resistance to extreme drought, which has a broad range of implications for forest modeling and resource management.
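The resistance measure described above is a simple ratio, and the comparison between tall and lower canopies can be sketched as follows. This is an illustration under stated assumptions: the `Pixel` structure, the function names, and all values are hypothetical; only the NDVI ratio, the 30 m height class, and the SPEI < -2 threshold come from the abstract.

```python
from statistics import mean
from typing import Dict, List, NamedTuple


class Pixel(NamedTuple):
    canopy_height_m: float
    spei: float        # drought index; lower means drier
    ndvi_2009: float   # NDVI before the drought
    ndvi_2010: float   # NDVI during the drought


def drought_resistance(pixel: Pixel) -> float:
    """Ratio of NDVI during the drought to NDVI before it."""
    if pixel.ndvi_2009 <= 0:
        raise ValueError("pre-drought NDVI must be positive")
    return pixel.ndvi_2010 / pixel.ndvi_2009


def mean_resistance_by_height(
    pixels: List[Pixel],
    height_threshold_m: float = 30.0,
    spei_threshold: float = -2.0,
) -> Dict[str, float]:
    """Mean resistance of tall vs. lower canopies under extreme drought.

    Raises statistics.StatisticsError if either height class is empty.
    """
    extreme = [p for p in pixels if p.spei < spei_threshold]
    tall = [drought_resistance(p) for p in extreme
            if p.canopy_height_m >= height_threshold_m]
    lower = [drought_resistance(p) for p in extreme
             if p.canopy_height_m < height_threshold_m]
    return {"tall": mean(tall), "lower": mean(lower)}


# Hypothetical pixels (illustration only, not study data).
example_pixels = [
    Pixel(32.0, -2.3, 0.80, 0.60),
    Pixel(35.0, -2.1, 0.80, 0.64),
    Pixel(18.0, -2.4, 0.70, 0.63),
    Pixel(22.0, -2.2, 0.60, 0.57),
    Pixel(31.0, -0.5, 0.80, 0.80),  # not under extreme drought, so excluded
]

print(mean_resistance_by_height(example_pixels))
```

A ratio below 1 indicates that greenness dropped during the drought, so a lower mean ratio for the tall class corresponds to lower resistance.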

2022 ◽ Carsten Lange ◽ Jian Lange

The paper identifies and quantifies the impact of race, poverty, politics, and age on COVID-19 vaccination rates in counties across the continental US. Both traditional Ordinary Least Squares (OLS) regression analysis and Random Forest machine learning algorithms are applied to quantify contributing factors for county-level vaccination hesitancy. With the machine learning model, joint effects of multiple variables (race/ethnicity, partisanship, age, etc.) are considered simultaneously to capture the particular combination of factors that affects the vaccination rate. By implementing a state-of-the-art Artificial Intelligence Explanations (AIX) algorithm, it is possible to address the black box problem of machine learning models and to answer the "how much" question for each measured impact factor in every county. For most counties, a higher percentage vote for Republicans, a greater African American population share, and a higher poverty rate lower the vaccination rate, while a higher Asian population share increases the predicted vaccination rate. The impact of the Hispanic population proportion on the vaccination rate is positive in the OLS model, but positive only for counties with a very high Hispanic population share (65% and more) in the Random Forest model. Both the proportion of seniors and the proportion of young people in a county have a significant impact in the OLS model (positive and negative, respectively). In contrast, these impacts are ambiguous in the Random Forest model. Because results vary between geographies and the AIX algorithm is able to quantify vaccine impacts individually for each county, this research can be tailored to local communities. It is therefore a helpful tool for local health officials and other policymakers seeking to improve vaccination rates. An interactive online mapping dashboard that identifies impact factors for individual U.S. counties is available at https://www.cpp.edu/~clange/vacmap.html.
It is apparent that the influence of impact factors is not universally the same across different geographies.

2022 ◽ Vol 14 (1) ◽ pp. 235 ◽ Julián Tijerín-Triviño ◽ Daniel Moreno-Fernández ◽ Miguel A. Zavala ◽ Julen Astigarraga ◽ Mariano García

Forest structure is a key driver of forest functional processes. The characterization of forest structure across spatiotemporal scales is essential for forest monitoring and management. LiDAR data have proven particularly useful for cost-effectively estimating forest structural attributes. This paper evaluates the ability of combined forest inventory data and low-density discrete return airborne LiDAR data to discriminate the main forest structural types in the Mediterranean-temperate transition ecotone. First, we used six structural variables from the Spanish National Forest Inventory (SNFI) and an aridity index in a k-medoids algorithm to define the forest structural types. These variables were calculated for 2770 SNFI plots. We identified the main species for each structural type using the SNFI. Second, we developed a Random Forest model to predict the spatial distribution of structural types and create wall-to-wall maps from LiDAR data. The k-medoids clustering algorithm identified four clusters of forest structures. Six of forty-one candidate LiDAR metrics were retained in the final Random Forest model after their importance was evaluated. The selected metrics were, in decreasing order of importance, the percentage of all returns above 2 m, the mean height of the canopy profile, the difference between the 90th and 50th height percentiles, the area under the canopy curve, and the 5th and 95th percentiles of the return heights. The model yielded an overall accuracy of 64.18%. The producer’s accuracy ranged between 36.11% and 88.93%. Our results confirm the potential of this approach for the continuous monitoring of forest structures, which is key to guiding forest management in this region.
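Overall accuracy and producer's accuracy, the two figures reported above, are both read off a confusion matrix. A minimal sketch, assuming the convention that rows are reference (true) classes and columns are predicted classes; the function names and the four-class matrix are hypothetical and are not the study's results.

```python
from typing import List

Matrix = List[List[int]]


def overall_accuracy(confusion: Matrix) -> float:
    """Share of all samples that fall on the diagonal."""
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def producers_accuracy(confusion: Matrix) -> List[float]:
    """Per-class share of reference samples that were classified correctly.

    Assumes rows are reference classes, so each row sum is a reference total.
    """
    accuracies = []
    for i, row in enumerate(confusion):
        reference_total = sum(row)
        if reference_total == 0:
            raise ValueError(f"class {i} has no reference samples")
        accuracies.append(row[i] / reference_total)
    return accuracies


# Hypothetical matrix for four structural types (illustration only).
example_confusion = [
    [18, 2, 0, 0],
    [4, 12, 3, 1],
    [1, 5, 9, 5],
    [0, 1, 4, 15],
]

print(overall_accuracy(example_confusion))
print(producers_accuracy(example_confusion))
```

A wide spread in producer's accuracy, as in the reported 36.11% to 88.93%, indicates that some structural types are much easier to separate than others even when the overall accuracy looks moderate.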

Hokuto Nakata ◽ Akifumi Eguchi ◽ Shouta M. M. Nakayama ◽ John Yabe ◽ Kaampwe Muzandu

Lead poisoning is often considered a traditional disease; however, the specific mechanism of toxicity remains unclear. The study of Pb-induced alterations in cellular metabolic pathways is important for understanding the biological response and disorders associated with environmental exposure to lead. Metabolomics studies have recently received considerable attention as a way to understand in detail the biological response to lead exposure and the associated toxicity mechanisms. In the present study, wild rodents collected from an area contaminated with lead (N = 18) and a control area (N = 10) were investigated. This was the first experimental metabolomic study of wildlife exposed to lead in the field. The levels of plasma phenylalanine and isoleucine were significantly higher in the lead-contaminated area than in the control area, and hydroxybutyric acid was marginally significantly higher in the contaminated area, suggesting a possible enhancement of lipid metabolism. In the interregional least-absolute shrinkage and selection operator (lasso) regression model analysis, phenylalanine and isoleucine were identified as possible biomarkers, which is in agreement with the random forest model. The random forest model additionally selected glutaric acid, glutamine, and hydroxybutyric acid. In agreement with previous studies, enrichment analysis showed alterations in the urea cycle and ATP-binding cassette transporter pathways. Although regional rodent species bias was observed in this study, and the relatively small sample size should be taken into account, the present results are to some extent consistent with those of previous studies on humans and laboratory animals.
