A Random Forest Regression Model Predicting the Winners of Summer Olympic Events

Author(s):  
Mengjie Jia ◽  
Yue Zhao ◽  
Furong Chang ◽  
Bofeng Zhang ◽  
Kenji Yoshigoe
2020 ◽  
pp. 1-13
Author(s):  
Tianye Gao ◽  
Jian Liu

The comprehensive physical fitness indicators of young athletes, and the explanatory variables describing their modes of transportation, work and leisure activities, do not follow a normal distribution. Moreover, the explanatory variables are highly correlated, so traditional regression models violate their own assumptions, suffer from multicollinearity, and do not give good results. The random forest regression model performs well under these conditions. A random forest regression model is therefore constructed to evaluate the impact of various factors on the physical fitness of young people. This paper studies the impact of various factors on young people's physical health and, based on the source data and research goals, establishes a comprehensive evaluation index system and an influencing-factor indicator system. In addition, the paper uses the analytic hierarchy process (AHP) to carry out a comprehensive evaluation, obtains the comprehensive physical quality of young people, and gives corresponding suggestions according to the actual situation.
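The abstract does not say how the AHP weights were derived, so the following is only a minimal sketch of one common variant: weights from the row geometric means of a pairwise comparison matrix, plus Saaty's consistency ratio. The function name `ahp_weights`, the three-indicator matrix, and its comparison values are hypothetical and are not taken from the paper.

```python
import math

# Saaty's random consistency index for matrix orders 1 to 5.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(matrix):
    """Return (weights, consistency_ratio) for a pairwise comparison matrix.

    matrix[i][j] states how much more important indicator i is than
    indicator j; the matrix is assumed to be reciprocal and of order <= 5.
    """
    n = len(matrix)
    # Row geometric means, normalised so the weights sum to 1.
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    weights = [g / total for g in geo_means]

    # Approximate the principal eigenvalue from the product A * w.
    aw = [sum(a * w for a, w in zip(row, weights)) for row in matrix]
    lambda_max = sum(x / w for x, w in zip(aw, weights)) / n

    # Consistency index and ratio; a ratio below 0.1 is usually accepted.
    ci = (lambda_max - n) / (n - 1) if n > 1 else 0.0
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri else 0.0
    return weights, cr


# Hypothetical comparison of three fitness indicators (illustrative values).
comparisons = [
    [1.0, 2.0, 4.0],
    [1 / 2, 1.0, 2.0],
    [1 / 4, 1 / 2, 1.0],
]
weights, cr = ahp_weights(comparisons)
print([round(w, 3) for w in weights], round(cr, 3))
```

A composite fitness score would then be the weighted sum of the standardised indicator values, which is one plausible reading of the "comprehensive evaluation" mentioned above.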


2020 ◽  
Author(s):  
Peijia Liu ◽  
Dong Yang ◽  
Shaomin Li ◽  
Yutian Chong ◽  
Wentao Hu ◽  
...  

Abstract. Background: GFR-estimating equations are critical in the clinical management of kidney disease. However, the performance of the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation has not improved substantially in the past eight years. Here we hypothesized that the random forest regression (RF) method could outperform the revised linear regression that is used to build the CKD-EPI equation. Methods: A total of 1732 participants were enrolled in this study (1333 in the development data set from Tianhe District and 399 in the external data set from Luogang District). Recursive feature elimination (RFE) was applied to the development data to select important variables and build random forest models. The same variables were then used to develop an estimated GFR equation with linear regression as a comparison. The performance of these equations was measured by bias, 30% accuracy, precision and root mean square error (RMSE). Results: Of all the variables, creatinine, cystatin C, weight, body mass index (BMI), age, uric acid (UA), blood urea nitrogen (BUN), hematocrit (HCT) and apolipoprotein B (APOB) were selected by the RFE method. The overall performance of the random forest regression models exceeded that of the revised regression models based on the same variables. In the 9-variable model, the RF model was better than revised linear regression in terms of bias, precision, 30% accuracy and RMSE (0.78 vs 2.98, 16.90 vs 23.62, 0.84 vs 0.80, 16.88 vs 18.70, all P<0.01). In the 4-variable model, the random forest regression model showed an improvement in precision and RMSE compared with the revised regression model (20.82 vs 25.25, P<0.01; 19.08 vs 20.60, P<0.001). Bias and 30% accuracy were also better, but the differences were not statistically significant (0.34 vs 2.07, P=0.10; 0.8 vs 0.78, P=0.19, respectively). Conclusions: Random forest regression models perform better than revised linear regression models for GFR estimation.
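The abstract names four evaluation metrics without defining them. The sketch below assumes definitions that are common in GFR-estimation studies: bias as the median of measured minus estimated GFR, precision as the interquartile range of those differences, 30% accuracy as the share of estimates within 30% of the measured value, and RMSE over the same differences. These definitions, the function name `gfr_metrics`, and the sample values are assumptions, not details from the paper.

```python
import math
import statistics


def gfr_metrics(measured, estimated):
    """Return bias, precision, P30 and RMSE for paired GFR values."""
    if len(measured) != len(estimated) or len(measured) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")

    diffs = [m - e for m, e in zip(measured, estimated)]

    # Bias: median difference, measured minus estimated (assumed definition).
    bias = statistics.median(diffs)

    # Precision: interquartile range of the differences (assumed definition).
    q1, _, q3 = statistics.quantiles(diffs, n=4, method="inclusive")
    precision = q3 - q1

    # P30: fraction of estimates within 30% of the measured value.
    within = sum(abs(m - e) <= 0.30 * m for m, e in zip(measured, estimated))
    p30 = within / len(measured)

    # RMSE: root mean square of the differences.
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))

    return {"bias": bias, "precision": precision, "p30": p30, "rmse": rmse}


# Made-up values for illustration only (mL/min/1.73 m^2).
measured = [100.0, 80.0, 60.0, 40.0]
estimated = [90.0, 85.0, 60.0, 60.0]
print(gfr_metrics(measured, estimated))
```

Computing the same dictionary for the random forest and the linear-regression estimates on the external data set would give the kind of side-by-side comparison reported in the results.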


Author(s):  
Fengxiang Qiao ◽  
Mahreen Nabi ◽  
Qing Li ◽  
Lei Yu

Pavement roughness affects vehicle movement and may therefore influence fuel consumption and vehicle emissions; however, the numerical relationships and the analytical steps needed to derive them are not yet well studied. The major objective of this paper is to quantify vehicular emission factors (hydrocarbons [HC], carbon monoxide [CO], oxides of nitrogen [NOx], and carbon dioxide [CO2]) and fuel consumption as a function of pavement roughness (the International Roughness Index [IRI]) and other factors. Within each operating mode identification (OMID) bin of vehicle operational status, a random forest regression model (RFRM) was built to estimate emission factors and fuel consumption. The field test data, covering 1,067.41 mi of driving and 323,075 data pairs from one test vehicle, were used to train and validate the models. A portable emissions measurement system (PEMS) and a smartphone application for IRI were employed for the tests on roadways in Texas, U.S. Results show that the optimum roughness conditions for lower emissions and fuel consumption are in categories B and C, with moderate roughness. The root-mean-square errors (RMSE) of the RFRM during training, testing, and validation are within 6.4%, implying a good fit of the resulting models. IRI is the top-ranked predictor in the largest number of OMID bins, followed by vehicle specific power (VSP) and speed. By modeling each OMID separately, the impacts of IRI are successfully captured. Conducting more field measurements with more vehicle types is recommended. This would help with the possible incorporation of vehicle emissions, fuel consumption, and other environmental factors into the pavement design, maintenance, and retrofitting process.

