ESTIMATING BUILDING AGE WITH 3D GIS

Building datasets (e.g. footprints in OpenStreetMap and 3D city models) are becoming increasingly available worldwide. However, the thematic (attribute) aspect does not always receive attention, and many such datasets have incomplete attributes. A prominent building attribute is the year of construction, which is useful for some applications but is often unavailable. This paper explores the potential of estimating the year of construction (or age) of buildings from other attributes using random forest regression. The developed method has a twofold benefit: enriching datasets and quality control (verification of existing attributes). Experiments are carried out on a semantically rich LOD1 dataset of Rotterdam in the Netherlands using 9 attributes. The results are mixed: the accuracy of the building age estimate depends on the information available to the regression model. In the best scenario we achieved predictions with an RMSE of 11 years, but in more realistic situations with limited knowledge about buildings the error is much larger (RMSE = 26 years). Hence the main conclusion of the paper is that inferring building age with 3D city models is possible to a certain extent, because it reveals the approximate period of construction, but precise estimation remains a difficult task.
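
As a rough illustration of the approach described in this abstract, the Python sketch below trains a small random forest regressor on synthetic data and reports the RMSE in years. It is not the authors' implementation. The two attributes (height and footprint area), the rule linking height to year of construction, and all function names are assumptions made for illustration only.

```python
import random


def mean(values):
    return sum(values) / len(values)


def sse(values):
    """Sum of squared deviations from the mean (used to score a split)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def grow_tree(X, y, rng, depth=0, max_depth=5, min_leaf=2):
    """Grow one regression tree; every split tries a random feature subset."""
    if depth >= max_depth or len(y) < 2 * min_leaf or len(set(y)) == 1:
        return mean(y)  # leaf: predict the mean year of construction
    n_feat = len(X[0])
    best = None  # (error, feature index, threshold)
    for f in rng.sample(range(n_feat), max(1, n_feat // 2)):
        for thr in sorted({row[f] for row in X})[:-1]:
            left = [t for row, t in zip(X, y) if row[f] <= thr]
            right = [t for row, t in zip(X, y) if row[f] > thr]
            if len(left) >= min_leaf and len(right) >= min_leaf:
                err = sse(left) + sse(right)
                if best is None or err < best[0]:
                    best = (err, f, thr)
    if best is None:
        return mean(y)
    _, f, thr = best
    lo = [i for i, row in enumerate(X) if row[f] <= thr]
    hi = [i for i, row in enumerate(X) if row[f] > thr]
    return (f, thr,
            grow_tree([X[i] for i in lo], [y[i] for i in lo], rng,
                      depth + 1, max_depth, min_leaf),
            grow_tree([X[i] for i in hi], [y[i] for i in hi], rng,
                      depth + 1, max_depth, min_leaf))


def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are tuples."""
    while isinstance(node, tuple):
        f, thr, left, right = node
        node = left if row[f] <= thr else right
    return node


def fit_forest(X, y, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample of the buildings."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx], rng))
    return forest


def predict_forest(forest, row):
    """The forest prediction is the average of the tree predictions."""
    return mean([predict_tree(tree, row) for tree in forest])


def rmse(actual, predicted):
    return mean([(a - p) ** 2 for a, p in zip(actual, predicted)]) ** 0.5


# Synthetic buildings: [height in m, footprint area in m^2].
# ASSUMPTION (illustration only): taller buildings are newer, plus noise.
data_rng = random.Random(42)
X = [[data_rng.uniform(3, 30), data_rng.uniform(50, 500)] for _ in range(60)]
y = [1900 + 3 * h + data_rng.gauss(0, 5) for h, _ in X]

X_train, y_train, X_test, y_test = X[:45], y[:45], X[45:], y[45:]
forest = fit_forest(X_train, y_train)
predictions = [predict_forest(forest, row) for row in X_test]
print(f"Hold-out RMSE: {rmse(y_test, predictions):.1f} years")
```

An RMSE expressed in years can be compared directly with the 11-year and 26-year figures quoted above. In practice an established implementation such as scikit-learn's `RandomForestRegressor` would be used instead of a hand-written forest.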

A Random Forest Regression Model Predicting the Winners of Summer Olympic Events

Proceedings of the 2020 2nd International Conference on Big Data Engineering, 2020. DOI: 10.1145/3404512.3404513

Author(s): Mengjie Jia, Yue Zhao, Furong Chang, Bofeng Zhang, Kenji Yoshigoe

Keyword(s): Random Forest, Regression Model, Random Forest Regression

Random Forest Regression Model For Estimation of Neonatal Levels In Nigeria

International Journal of Computer Science and Engineering, 2020, Vol 7 (7), pp. 41-44. DOI: 10.14445/23488387/ijcse-v7i7p107

Author(s): Managwu C, Matthias D, Nwaibu N

Keyword(s): Random Forest, Regression Model, Random Forest Regression

Application of improved random forest algorithm and fuzzy mathematics in physical fitness of athletes

Journal of Intelligent & Fuzzy Systems, 2020, pp. 1-13. DOI: 10.3233/jifs-189206

Author(s): Tianye Gao, Jian Liu

Keyword(s): Random Forest, Young People, Physical Fitness, Regression Model, Comprehensive Evaluation, Indicator System, Influential Factor, Random Forest Regression, Explanatory Variables, The Impact

The comprehensive indicators of the physical fitness of young athletes, and the explanatory variables (specific modes of transportation, work, and leisure activities), do not follow a normal distribution. Moreover, the explanatory variables are highly correlated, so fitting a traditional regression model violates its assumptions, introduces multicollinearity, and does not give good results. The random forest regression model performs well in overcoming these difficulties. Therefore, a random forest regression model is constructed to evaluate the impact of various factors on the physical fitness of young people. This paper studies the impact of various factors on young people's physical health and combines the source data and research goals to establish a comprehensive evaluation index system and an influential factor indicator system. In addition, this paper uses the analytic hierarchy process (AHP) to conduct a comprehensive evaluation, obtains the comprehensive physical quality of young people, and gives corresponding suggestions according to the actual situation.

Mapping sub-pixel corn distribution using MODIS time-series data and a random forest regression model

2017 6th International Conference on Agro-Geoinformatics, 2017. DOI: 10.1109/agro-geoinformatics.2017.8047051

Author(s): Qiong Hu, Wenbin Wu, Mark A. Friedl

Keyword(s): Time Series, Random Forest, Regression Model, Time Series Data, Series Data, Random Forest Regression

An empirical study on balance of trade in China based on random forest regression model

2010 5th International Conference on Computer Science & Education, 2010. DOI: 10.1109/iccse.2010.5593550

Author(s): Linkai Luo, Weihang Lv, Renshuo Zhang

Keyword(s): Random Forest, Empirical Study, Regression Model, Random Forest Regression, Balance Of Trade

A Random Forest Regression Model for Predicting Residual Stresses and Cutting Forces Introduced by Turning IN718 Alloy

2019 IEEE International Conference on Computation, Communication and Engineering (ICCCE), 2019. DOI: 10.1109/iccce48422.2019.9010767

Author(s): Penghao Dong, Huachen Peng, Xianqiang Cheng, Yan Xing, Xin Zhou, ...

Keyword(s): Random Forest, Residual Stresses, Regression Model, Cutting Forces, Random Forest Regression

RFRCDB-siRNA: Improved design of siRNAs by random forest regression model coupled with database searching

Computer Methods and Programs in Biomedicine, 2007, Vol 87 (3), pp. 230-238. DOI: 10.1016/j.cmpb.2007.06.001. Cited By ~ 17

Author(s): Peng Jiang, Haonan Wu, Yao Da, Fei Sang, Jiawei Wei, ...

Keyword(s): Random Forest, Regression Model, Database Searching, Random Forest Regression, Improved Design

Applying Random Forest Model Algorithm to GFR Estimation

2020. DOI: 10.21203/rs.3.rs-74843/v1

Author(s): Peijia Liu, Dong Yang, Shaomin Li, Yutian Chong, Wentao Hu, ...

Keyword(s): Random Forest, Kidney Disease, Linear Regression, Regression Model, Regression Models, Random Forest Regression, Variable Model, Data Set, Development Data, Better Than

Background: The use of GFR-estimating equations is critical for managing kidney disease in the clinic. However, the performance of the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation has not improved substantially in the past eight years. Here we hypothesized that the random forest regression (RF) method could outperform the revised linear regression that is used to build the CKD-EPI equation.

Methods: A total of 1732 participants were enrolled in this study (1333 in the development data set from Tianhe District and 399 in the external data set from Luogang District). Recursive feature elimination (RFE) was applied to the development data to select important variables and build random forest models. The same variables were then used to develop an estimated GFR equation with linear regression as a comparison. The performance of these equations was measured by bias, 30% accuracy, precision, and root mean square error (RMSE).

Results: Of all the variables, creatinine, cystatin C, weight, body mass index (BMI), age, uric acid (UA), blood urea nitrogen (BUN), hematocrit (HCT), and apolipoprotein B (APOB) were selected by the RFE method. The overall performance of the random forest regression models surpassed that of the revised regression models based on the same variables. In the 9-variable model, the RF model was better than revised linear regression in terms of bias, precision, 30% accuracy, and RMSE (0.78 vs 2.98, 16.90 vs 23.62, 0.84 vs 0.80, 16.88 vs 18.70, all P<0.01). In the 4-variable model, the random forest regression model showed an improvement in precision and RMSE compared with the revised regression model (20.82 vs 25.25, P<0.01; 19.08 vs 20.60, P<0.001). Bias and 30% accuracy were also better, but the differences were not statistically significant (0.34 vs 2.07, P=0.10; 0.8 vs 0.78, P=0.19, respectively).

Conclusions: Random forest regression models perform better than revised linear regression models for GFR estimation.
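
The four performance measures named in this abstract can be sketched in a few lines of Python. The abstract does not define them, so the definitions below are assumptions based on common usage in GFR studies: bias as the median of (estimated minus measured), precision as the interquartile range of those differences, and 30% accuracy as the share of estimates within 30% of the measured value. The function name and the sample numbers are hypothetical.

```python
from statistics import median, quantiles


def gfr_metrics(measured, estimated):
    """Compare estimated GFR against measured GFR (same units, same order)."""
    diffs = [e - m for m, e in zip(measured, estimated)]
    q1, _, q3 = quantiles(diffs, n=4)  # quartiles of the differences
    within_30 = [abs(e - m) <= 0.3 * m for m, e in zip(measured, estimated)]
    return {
        "bias": median(diffs),                 # ASSUMED: median difference
        "precision": q3 - q1,                  # ASSUMED: IQR of differences
        "p30": sum(within_30) / len(within_30),
        "rmse": (sum(d * d for d in diffs) / len(diffs)) ** 0.5,
    }


# Hypothetical values, for illustration only.
measured = [100, 80, 60, 40]
estimated = [90, 85, 80, 41]
result = gfr_metrics(measured, estimated)
print(result)
```

Under these definitions, lower bias, precision (IQR), and RMSE indicate a better equation, while a higher 30% accuracy is better, which matches the direction of the comparisons reported above.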

Quality control-based signal drift correction and interpretations of metabolomics/proteomics data using random forest regression

2018. DOI: 10.1101/253583

Author(s): Hemi Luan, Fenfen Ji, Yu Chen, Zongwei Cai

Keyword(s): Quality Control, Random Forest, Large Scale, Proteomics Data, Drift Correction, Random Forest Regression, Signal Drift, Signal Correction, Term Analysis

Large-scale mass spectrometry-based metabolomics and proteomics studies require the long-term analysis of multiple batches of biological samples, which is often accompanied by significant signal drift and various inter- and intra-batch variations. These unwanted variations can lead to poor inter- and intra-day reproducibility, which hinders the discovery of real significance. We developed a novel quality control-based random forest signal correction algorithm, an ensemble learning approach, to remove unwanted inter- and intra-batch variations at the feature level. Our evaluation on real samples showed that the developed algorithm improved data precision and statistical accuracy for metabolomics and proteomics, and was superior to other common correction methods. The algorithm improves the interpretation of large-scale metabolomics and proteomics data and increases data precision, helping to uncover real biological differences.
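
The general idea of quality control-based drift correction can be sketched as follows: the intensity of one feature in repeatedly injected QC samples is modeled as a function of injection order, and every sample is rescaled by the modeled drift. This sketch deliberately swaps the authors' random forest for plain linear interpolation between QC injections, so it shows only the correction step and not the published algorithm. All names and numbers are hypothetical.

```python
from bisect import bisect_left
from statistics import median


def interpolate(x, xs, ys):
    """Linear interpolation over sorted xs, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    j = bisect_left(xs, x)  # xs[j - 1] < x <= xs[j]
    weight = (x - xs[j - 1]) / (xs[j] - xs[j - 1])
    return ys[j - 1] + weight * (ys[j] - ys[j - 1])


def correct_drift(orders, intensities, qc_orders, qc_values):
    """Rescale one feature so the modeled QC signal is flat at its median."""
    reference = median(qc_values)
    return [
        value * reference / interpolate(order, qc_orders, qc_values)
        for order, value in zip(orders, intensities)
    ]


# Hypothetical run: QC injections at positions 1, 5, 9 show a falling signal.
qc_orders = [1, 5, 9]
qc_values = [100.0, 90.0, 80.0]

# Two study samples with the same true abundance, measured at positions 3 and 7.
sample_orders = [3, 7]
sample_values = [47.5, 42.5]

corrected = correct_drift(sample_orders, sample_values, qc_orders, qc_values)
print(corrected)
```

After correction the two samples have equal intensities, because the difference between them was entirely due to the modeled drift. A regression model such as a random forest would replace `interpolate` when the drift is not well described by straight lines between QC injections.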

Estimating Light-Duty Vehicle Emission Factors using Random Forest Regression Model with Pavement Roughness

Transportation Research Record: Journal of the Transportation Research Board, 2020, Vol 2674 (8), pp. 37-52. DOI: 10.1177/0361198120922997

Author(s): Fengxiang Qiao, Mahreen Nabi, Qing Li, Lei Yu

Keyword(s): Random Forest, Regression Model, Fuel Consumption, Vehicle Emissions, Emission Factors, Specific Power, Oxides Of Nitrogen, Random Forest Regression, Pavement Roughness, Light Duty Vehicle

Pavement roughness affects vehicle movement and may therefore influence fuel consumption and vehicle emissions, but the numerical relationships and analytical steps involved are not yet well studied. The major objective of this paper is to quantify vehicular emission factors (hydrocarbons (HC), carbon monoxide (CO), oxides of nitrogen (NOx), and carbon dioxide (CO2)) and fuel consumption as a function of pavement roughness (the International Roughness Index [IRI]) and other factors. Within each operating mode identification (OMID) bin of vehicle operational status, a random forest regression model (RFRM) was fitted to estimate emission factors and fuel consumption. Field test data covering 1,067.41 mi of driving and 323,075 data pairs from one test vehicle were used to train and validate the models. A portable emissions measurement system (PEMS) and a smartphone application for IRI were employed for the tests on roadways in Texas, U.S. Results show that the optimum roughness conditions for lower emissions and fuel consumption are in categories B and C, with moderate roughness. The root-mean-square error (RMSE) of the RFRM during training, testing, and validation is within 6.4%, implying a good fit of the resulting models. IRI is the top-ranked predictor in the largest number of OMID bins, followed by vehicle specific power (VSP) and speed. Modeling each OMID bin separately captures the impacts of IRI. Conducting more field measurements with more vehicle types is recommended. This would help with the possible incorporation of vehicle emissions, fuel consumption, and other environmental factors into the pavement design, maintenance, and retrofitting process.
