Assessing the impact of drought conditions on groundwater potential in Godavari Middle Sub-Basin, India using analytical hierarchy process and random forest machine learning algorithm

2021 ◽  
Vol 13 ◽  
pp. 100554
Author(s):  
Md Masroor ◽  
Sufia Rehman ◽  
Haroon Sajjad ◽  
Md Hibjur Rahaman ◽  
Mehebub Sahana ◽  
...  
2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Hanlin Liu ◽  
Linqiang Yang ◽  
Linchao Li

A variety of climate factors influence the precision of long-term Global Navigation Satellite System (GNSS) monitoring data. To precisely analyze the effect of different climate factors on long-term GNSS monitoring records, this study combines the extended seven-parameter Helmert transformation with a machine learning algorithm, Extreme Gradient Boosting (XGBoost), to establish a hybrid model. We established a local-scale reference frame, the stable Puerto Rico and Virgin Islands reference frame of 2019 (PRVI19), using ten continuously operating long-term GNSS sites located in the rigid portion of the Puerto Rico and Virgin Islands (PRVI) microplate. The stability of PRVI19 is approximately 0.4 mm/year and 0.5 mm/year in the horizontal and vertical directions, respectively. The stable reference frame PRVI19 avoids the risk of bias due to long-term plate motions when studying localized ground deformation. Furthermore, we applied the XGBoost algorithm to the postprocessed long-term GNSS records and daily climate data to train the model, and we quantitatively evaluated the importance of various daily climate factors to the GNSS time series. The results show that wind is the most influential factor, with a unit-less importance index of 0.013. Notably, we used the model with climate and GNSS records to predict GNSS-derived displacements. The predicted displacements have a slightly lower root mean square error than the results fitted with the spline method (prediction: 0.22 versus fitted: 0.31), indicating that the proposed model, by taking climate records into account, yields suitable predictions for long-term GNSS monitoring.
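A minimal sketch of the importance-ranking step described above, not the authors' code: an XGBoost regressor is fitted to daily climate features and a GNSS displacement series, and its gain-based feature importances are read out. The file name and all column names are illustrative assumptions.

```python
# Sketch: rank daily climate factors by importance for a GNSS displacement series.
# "gnss_climate_daily.csv" and the column names are hypothetical.
import pandas as pd
from xgboost import XGBRegressor

df = pd.read_csv("gnss_climate_daily.csv")          # merged GNSS + climate records
features = ["wind_speed", "temperature", "rainfall", "humidity"]
X, y = df[features], df["vertical_displacement_mm"]

model = XGBRegressor(n_estimators=500, learning_rate=0.05, max_depth=4)
model.fit(X, y)

# Unit-less importance indices, comparable in kind to the ~0.013 reported for wind
for name, score in zip(features, model.feature_importances_):
    print(f"{name}: {score:.4f}")
```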


2018 ◽  
pp. 1587-1599
Author(s):  
Hiroaki Koma ◽  
Taku Harada ◽  
Akira Yoshizawa ◽  
Hirotoshi Iwasaki

Detecting distracted states can be applied to various problems, such as danger prevention when driving a car. A cognitive distracted state is one example of a distracted state, and eye movements are known to express cognitive distraction. Eye movements can be classified into several types. In this paper, the authors detect cognitive distraction from classified eye-movement types using the Random Forest machine learning algorithm, an ensemble of decision trees. They show the effectiveness of considering eye-movement types for detecting cognitive distraction with Random Forest, using visual experiments with still images for the detection.
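A minimal sketch of this classification setup, assuming the eye-movement recordings have already been segmented and labelled by movement type (fixations, saccades, and so on). The feature names and CSV layout are illustrative, not from the paper.

```python
# Sketch: classify cognitive distraction from eye-movement-type features
# with a Random Forest. "eye_movements.csv" and its columns are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

df = pd.read_csv("eye_movements.csv")
features = ["fixation_rate", "saccade_amplitude", "saccade_rate", "pursuit_ratio"]
X, y = df[features], df["distracted"]               # 1 = cognitively distracted

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```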


2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Peter Appiahene ◽  
Yaw Marfo Missah ◽  
Ussiph Najim

The financial crisis that hit Ghana from 2015 to 2018 raised various issues with respect to the efficiency of banks and the safety of depositors' funds in the banking industry. As part of measures to improve the banking sector and restore customers' confidence, efficiency and performance analysis in the banking industry has become a pressing issue, because stakeholders need to detect the underlying causes of inefficiencies within the industry. Nonparametric methods such as Data Envelopment Analysis (DEA) have been suggested in the literature as a good measure of banks' efficiency and performance, and machine learning algorithms are regarded as good tools for estimating various nonparametric and nonlinear problems. This paper combines DEA with three machine learning approaches to evaluate bank efficiency and performance, using 444 Ghanaian bank branches as Decision Making Units (DMUs). The results were compared with the corresponding efficiency ratings obtained from the DEA, and the prediction accuracies of the three machine learning models were compared. The results suggest that the decision tree (DT) with its C5.0 algorithm provided the best predictive model: it achieved 100% accuracy on the 134-branch holdout dataset (30% of branches) with a P value of 0.00. The DT was followed closely by the random forest algorithm, with a predictive accuracy of 98.5% and a P value of 0.00, and finally by the neural network (86.6% accuracy) with a P value of 0.66. The study concludes that banks in Ghana can use these results to predict their respective efficiencies. All experiments were performed within a simulation environment in RStudio using R code.
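A minimal sketch of the two-stage DEA-plus-classifier idea, in Python rather than the authors' R/C5.0 code: DEA efficiency scores are discretised into classes, then a decision tree is trained on branch-level variables to predict the class. The column names and the 0.9 efficiency cut-off are illustrative assumptions.

```python
# Sketch: predict DEA-derived efficiency classes with a decision tree.
# "dmu_branches.csv", its columns, and the 0.9 threshold are hypothetical.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

df = pd.read_csv("dmu_branches.csv")                # one row per DMU (branch)
df["efficient"] = (df["dea_score"] >= 0.9).astype(int)

X = df[["staff", "operating_cost", "deposits", "loans"]]
y = df["efficient"]

# 70/30 split mirrors the paper's 30% holdout of branches
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)
tree = DecisionTreeClassifier(max_depth=5, random_state=1).fit(X_tr, y_tr)
print("holdout accuracy:", accuracy_score(y_te, tree.predict(X_te)))
```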


2020 ◽  
Vol 17 (9) ◽  
pp. 4197-4201
Author(s):  
Heena Gupta ◽  
V. Asha

The prediction problem in any domain is important for assessing prices and preferences among people. The problem varies with the kind of data: it may be nominal or ordinal, and it may involve many categories or few. Before a machine learning algorithm can operate on a categorical variable, the variable must be encoded. Various encoding schemes are available, such as label encoding, count encoding, and one-hot encoding. This paper aims to understand the impact of various encoding schemes on prediction accuracy for high-cardinality categorical data. The paper also proposes an encoding scheme based on curated strings. The domain chosen for this purpose is predicting doctors' fees in various cities, for doctors with different profiles and qualifications.
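A minimal sketch of the three standard encodings named in the abstract, applied to a small categorical column; the data values are invented for illustration.

```python
# Sketch: label, count, and one-hot encoding of a categorical "city" column.
import pandas as pd

df = pd.DataFrame({"city": ["Delhi", "Mumbai", "Delhi", "Pune", "Mumbai", "Delhi"]})

# Label encoding: each category mapped to an arbitrary integer code
df["city_label"] = df["city"].astype("category").cat.codes

# Count encoding: each category replaced by its frequency in the data
df["city_count"] = df["city"].map(df["city"].value_counts())

# One-hot encoding: one indicator column per category
df = pd.concat([df, pd.get_dummies(df["city"], prefix="city")], axis=1)
print(df)
```

For high-cardinality columns, one-hot encoding multiplies the feature count by the number of categories, which is one reason the paper compares it against the more compact label and count encodings.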


2021 ◽  
Author(s):  
Catherine Ollagnier ◽  
Claudia Kasper ◽  
Anna Wallenbeck ◽  
Linda Keeling ◽  
Siavash A Bigdeli

Tail biting is a detrimental behaviour that impacts the welfare and health of pigs. Early detection of the precursor signs of tail biting allows preventive measures to be taken, thus avoiding the tail biting event itself. This study aimed to build a machine learning algorithm for real-time detection of upcoming tail biting outbreaks, using feeding behaviour data recorded by an electronic feeder. The prediction capacities of seven machine learning algorithms (e.g., random forest, neural networks) were evaluated on daily feeding data collected from 65 pens, originating from two herds of grower-finisher pigs (25–100 kg), in which 27 tail biting events occurred. Data were divided into training and testing sets either by randomly splitting the data into 75% (training set) and 25% (testing set), or by randomly selecting whole pens to constitute the testing set. The random forest algorithm predicted 70% of the upcoming events with an accuracy of 94% when predicting events in pens for which it had previous data. Detection of events in unknown pens was less sensitive; there, the neural network model detected 14% of the upcoming events with an accuracy of 63%. A machine learning algorithm based on ongoing data collection should be considered for implementation into automatic feeder systems for real-time prediction of tail biting events.
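A minimal sketch of the two evaluation schemes the abstract contrasts: a random 75/25 row split, where a pen can appear in both sets, versus holding out whole pens so the model is tested on unseen pens. The feature names and feeder-data layout are illustrative assumptions.

```python
# Sketch: random row split vs. group (pen-level) split for outbreak prediction.
# "feeder_daily.csv" and its columns are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split, GroupShuffleSplit

df = pd.read_csv("feeder_daily.csv")                # one row per pen-day
X = df[["feed_intake", "visits", "meal_duration"]]
y = df["outbreak_within_next_days"]                 # upcoming tail biting event

# Scheme 1: random 75/25 split; pens can appear in both sets
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=300, random_state=0)
clf.fit(X_tr, y_tr)
print("random-split accuracy:", clf.score(X_te, y_te))

# Scheme 2: whole pens held out, so test pens are unseen during training
gss = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, test_idx = next(gss.split(X, y, groups=df["pen_id"]))
clf.fit(X.iloc[train_idx], y.iloc[train_idx])
print("unseen-pen accuracy:", clf.score(X.iloc[test_idx], y.iloc[test_idx]))
```

The gap between the two printed accuracies is the kind of effect the study reports: performance drops when the model must generalize to pens it has never seen.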


2021 ◽  
Vol 8 ◽  
Author(s):  
Xueyuan Huang ◽  
Yongjun Wang ◽  
Bingyu Chen ◽  
Yuanshuai Huang ◽  
Xinhua Wang ◽  
...  

Background: Predicting the perioperative requirement for red blood cell (RBC) transfusion in patients with pelvic fracture can be challenging. In this study, we constructed a perioperative RBC transfusion predictive model (ternary classification) based on a machine learning algorithm.

Materials and Methods: This study included perioperative adult patients with pelvic trauma hospitalized across six Chinese centers between September 2012 and June 2019. An extreme gradient boosting (XGBoost) algorithm was used to predict the need for perioperative RBC transfusion, with data being split into a training set (80%), which was subjected to 5-fold cross-validation, and a test set (20%). The ability of the predictive transfusion model was compared with blood preparation based on surgeons' experience and with other predictive models, including random forest, gradient boosting decision tree, K-nearest neighbor, logistic regression, and Gaussian naïve Bayes classifier models. Data from 33 patients at one of the hospitals were prospectively collected for model validation.

Results: Among 510 patients, 192 (37.65%) received no perioperative RBC transfusion, 127 (24.90%) received less transfusion (RBCs < 4U), and 191 (37.45%) received more transfusion (RBCs ≥ 4U). The machine learning-based transfusion predictive model produced the best performance, with an accuracy of 83.34% and a Kappa coefficient of 0.7967, compared with the other methods (blood preparation based on surgeons' experience: accuracy 65.94%, Kappa 0.5704; random forest: accuracy 82.35%, Kappa 0.7858; gradient boosting decision tree: accuracy 79.41%, Kappa 0.7742; K-nearest neighbor: accuracy 53.92%, Kappa 0.3341). On the prospective dataset, the model also performed well, with an accuracy of 81.82%.

Conclusion: This multicenter retrospective cohort study described the construction of an accurate model that could predict perioperative RBC transfusion in patients with pelvic fractures.
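A minimal sketch of the evaluation pipeline the abstract describes, not the study's code: a three-class XGBoost classifier on an 80/20 split, with 5-fold cross-validation on the training set and accuracy plus Cohen's kappa on the test set. The file name and predictor columns are assumptions.

```python
# Sketch: ternary transfusion classifier (0 = none, 1 = <4U, 2 = >=4U).
# "pelvic_trauma.csv" and the predictor columns are hypothetical.
import pandas as pd
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.metrics import accuracy_score, cohen_kappa_score
from xgboost import XGBClassifier

df = pd.read_csv("pelvic_trauma.csv")
X = df[["age", "hemoglobin", "systolic_bp", "injury_severity_score"]]
y = df["transfusion_class"]

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = XGBClassifier(objective="multi:softmax", eval_metric="mlogloss")

# 5-fold cross-validation on the 80% training set, as in the abstract
print("5-fold CV accuracy:", cross_val_score(model, X_tr, y_tr, cv=5).mean())

model.fit(X_tr, y_tr)
pred = model.predict(X_te)
print("test accuracy:", accuracy_score(y_te, pred))
print("kappa:", cohen_kappa_score(y_te, pred))
```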

