Comparation Analysis of Ensemble Technique With Boosting(Xgboost) and Bagging (Randomforest) For Classify Splice Junction DNA Sequence Category

2019 ◽  
Vol 9 (1) ◽  
pp. 27-36
Author(s):  
Iswaya Maalik Syahrani

Bioinformatics research is currently supported by rapid growth in computing technology and algorithms. Ensemble decision trees are a common method for classifying large and complex datasets such as DNA sequences. Implementing two ensemble classification methods, XGBoost and Random Forest, might improve accuracy in classifying the splice junction type of DNA sequences. With an accuracy of 96.24% for XGBoost and 95.11% for Random Forest, we conclude that both methods, given the right parameter settings, are highly effective tools for classifying a small example dataset. Analyzing both methods and their characteristics gives an overview of how they work to meet the needs in DNA splicing.
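As a minimal sketch of how DNA sequences might be turned into features for a tree ensemble, and how an accuracy figure like those above is computed: the abstract does not describe its preprocessing, so the one-hot encoding, the function names, and the toy labels below are assumptions for illustration only.

```python
# Illustrative only: the encoding scheme, names, and labels are assumptions,
# not taken from the paper.
NUCLEOTIDES = "ACGT"


def one_hot(seq):
    """Flatten a DNA string into 0/1 features, four slots per base.

    Characters outside A/C/G/T (e.g. ambiguity codes) become all zeros.
    """
    features = []
    for base in seq.upper():
        features.extend(1 if base == n else 0 for n in NUCLEOTIDES)
    return features


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    print(one_hot("AC"))
    print(accuracy(["a", "b", "a", "c"], ["a", "b", "c", "c"]))
```

The resulting vectors could then be passed to any tree-ensemble implementation; the choice of library is left open here.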


Author(s):  
Triando Hamonangan Saragih ◽  
Diny Melsye Nurul Fajri ◽  
Alfita Rakhmandasari

Jatropha Curcas is a very useful plant that can be used as a biofuel for diesel engines, replacing coal. In Indonesia, only a few plantations grow Jatropha Curcas. Moreover, few farmers understand Jatropha Curcas diseases in detail, which can cause large losses at harvest when a disease occurs and no action is taken. An expert system can help farmers identify the plant diseases of Jatropha Curcas. The objective of this research is to compare several identification and classification methods, such as Decision Tree, K-Nearest Neighbor, and its modification. The comparison is based on accuracy. The Modified K-Nearest Neighbor method gave the best accuracy, 67.74%.
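For readers unfamiliar with the baseline method, the following is a minimal sketch of plain K-Nearest Neighbor classification by majority vote. It is not the modified variant evaluated in the paper, whose details the abstract does not give; the feature vectors and the labels "healthy" and "diseased" are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    Plain k-NN with Euclidean distance; an illustrative baseline only.
    """
    if not 1 <= k <= len(train_X):
        raise ValueError("k must be between 1 and the number of training points")
    # Pair each training point's distance to the query with its label.
    by_distance = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical two-feature observations and labels.
    X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
    y = ["healthy"] * 3 + ["diseased"] * 3
    print(knn_predict(X, y, (0.2, 0.3)))
```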


2021 ◽  
Vol 13 (11) ◽  
pp. 2039
Author(s):  
Joon Jin Song ◽  
Melissa Innerst ◽  
Kyuhee Shin ◽  
Bo-Young Ye ◽  
Minho Kim ◽  
...  

Estimating precipitation area is important for weather forecasting as well as real-time applications. This paper aims to develop an analytical framework for efficient precipitation area estimation using S-band dual-polarization radar measurements. Several factors, such as types of sensors, thresholds, and models, are considered and compared to form a data set. After building the appropriate data set, this paper presents a rigorous comparison of statistical classification methods (logistic regression and linear discriminant analysis) and machine learning classification methods (decision tree, support vector machine, and random forest). To achieve better performance, spatial classification, which incorporates the latitude and longitude of the observation location, is considered and compared with non-spatial classification. The data used in this study were collected by a rain detector and a present weather sensor in a network of automated weather systems (AWS), and by an S-band dual-polarimetric weather radar, during ten different rainfall events of varying lengths. The mean squared prediction error (MSPE) from leave-one-out cross validation (LOOCV) is computed to assess the performance of the methods. Of the methods, the decision tree and random forest result in the lowest MSPE, and spatial classification outperforms non-spatial classification. In particular, machine-learning-based spatial classification methods accurately estimate the precipitation area in the northern areas of the study region.
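The evaluation criterion named above, MSPE under LOOCV, can be sketched as follows. This is a generic illustration, not the paper's code: the function names, the two toy predictors, and the coordinate data are assumptions, with labels coded 0 (no rain) and 1 (rain).

```python
import math


def loocv_mspe(X, y, predict):
    """Mean squared prediction error under leave-one-out cross validation.

    `predict(train_X, train_y, x)` must return a numeric prediction for x.
    """
    if len(X) != len(y) or len(X) < 2:
        raise ValueError("need at least two paired observations")
    total = 0.0
    for i in range(len(X)):
        # Hold out observation i and predict it from all the others.
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        total += (y[i] - predict(train_X, train_y, X[i])) ** 2
    return total / len(X)


def mean_predictor(train_X, train_y, x):
    """Non-spatial baseline: ignore location, predict the training mean."""
    return sum(train_y) / len(train_y)


def nearest_neighbor_predictor(train_X, train_y, x):
    """Spatial toy model: copy the label of the closest training location."""
    nearest = min(range(len(train_X)), key=lambda j: math.dist(train_X[j], x))
    return train_y[nearest]


if __name__ == "__main__":
    # Hypothetical (latitude, longitude) pairs and 0/1 rain labels.
    X = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    y = [0, 0, 1, 1]
    print(loocv_mspe(X, y, mean_predictor))
    print(loocv_mspe(X, y, nearest_neighbor_predictor))
```

On this toy data the location-aware predictor has the lower MSPE, which mirrors the direction of the reported result but says nothing about its magnitude.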


2020 ◽  
Vol 4 (Supplement_1) ◽  
pp. 268-269
Author(s):  
Jaime Speiser ◽  
Kathryn Callahan ◽  
Jason Fanning ◽  
Thomas Gill ◽  
Anne Newman ◽  
...  

Advances in computational algorithms and the availability of large datasets with clinically relevant characteristics provide an opportunity to develop machine learning prediction models to aid in diagnosis, prognosis, and treatment of older adults. Some studies have employed machine learning methods for prediction modeling, but skepticism of these methods remains due to lack of reproducibility and difficulty understanding the complex algorithms behind models. We aim to provide an overview of two common machine learning methods: decision tree and random forest. We focus on these methods because they provide a high degree of interpretability. We discuss the underlying algorithms of decision tree and random forest methods and present a tutorial for developing prediction models for serious fall injury using data from the Lifestyle Interventions and Independence for Elders (LIFE) study. Decision tree is a machine learning method that produces a model resembling a flow chart. Random forest consists of a collection of many decision trees whose results are aggregated. In the tutorial example, we discuss evaluation metrics and interpretation for these models. Illustrated in data from the LIFE study, the performance of prediction models for serious fall injury was moderate at best (area under the receiver operating characteristic curve of 0.54 for decision tree and 0.66 for random forest). Machine learning methods may offer improved performance compared to traditional models for modeling outcomes in aging, but their use should be justified and output should be carefully described. Models should be assessed by clinical experts to ensure compatibility with clinical practice.
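The two ideas described above, a tree as a flow chart and a forest as an aggregate of trees, together with the area under the ROC curve used to score them, can be sketched as follows. The hand-written trees, the feature names `x1` and `x2`, and all thresholds are hypothetical; real trees are learned from data, and nothing here comes from the LIFE study.

```python
def tree_1(record):
    """A one-question flow chart."""
    return 1 if record["x1"] > 5 else 0


def tree_2(record):
    return 1 if record["x2"] > 2 else 0


def tree_3(record):
    """A two-level flow chart: the second question depends on the first."""
    if record["x1"] > 3:
        return 1 if record["x2"] > 1 else 0
    return 0


TREES = [tree_1, tree_2, tree_3]


def forest_score(record, trees=TREES):
    """Aggregate the trees: fraction of trees that vote for class 1."""
    return sum(tree(record) for tree in trees) / len(trees)


def auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking.

    Probability that a random positive outscores a random negative,
    with ties counted as one half.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    print(forest_score({"x1": 6, "x2": 3}))
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

An AUC of 0.5 corresponds to ranking positives and negatives no better than chance, which is why the reported values of 0.54 and 0.66 are described as moderate at best.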


2021 ◽  
Vol 11 (4) ◽  
pp. 1378
Author(s):  
Seung Hyun Lee ◽  
Jaeho Son

It has been pointed out that carrying a heavy object that exceeds a certain weight is a major factor that puts a physical burden on the musculoskeletal system of workers at construction sites. However, because a construction site has a large number of workers working simultaneously in an irregular space, it is difficult to determine in real time the weight of the object a worker carries, or to keep track of workers who carry excess weight. This paper proposes a prototype system to track the weight of heavy objects carried by construction workers by developing smart safety shoes with FSR (Force Sensitive Resistor) sensors. The system consists of smart safety shoes with sensors attached, a mobile device for collecting initial sensing data, and a web-based server computer for storing, preprocessing, and analyzing such data. The effectiveness and accuracy of the weight tracking system were verified through experiments in which each experimenter lifted a weight from +0 kg to +20 kg in 5 kg increments. The results of the experiment were analyzed by a newly developed machine-learning-based model, which adopts effective classification algorithms such as decision tree, random forest, gradient boosting algorithm (GBM), and light GBM. The algorithms achieved similar, high average accuracy in classifying the weight, in the following order: random forest (90.9%), light GBM (90.5%), decision tree (90.3%), and GBM (89%). Overall, the proposed weight tracking system has a high average accuracy of 90.2% in classifying how much weight each experimenter carries.
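As a quick arithmetic check, the overall 90.2% figure is consistent with an unweighted mean of the four per-algorithm accuracies. That the authors averaged this way is an assumption; the abstract does not say how the overall figure was obtained.

```python
from statistics import mean

# Per-algorithm accuracies (%) as reported in the abstract.
ACCURACIES = {
    "random forest": 90.9,
    "light GBM": 90.5,
    "decision tree": 90.3,
    "GBM": 89.0,
}

# Unweighted mean across algorithms (assumed aggregation).
overall = mean(ACCURACIES.values())

if __name__ == "__main__":
    print(round(overall, 1))
```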


2021 ◽  
Vol 15 (1) ◽  
Author(s):  
Moaz Hiba ◽  
Ahmed Farid Ibrahim ◽  
Salaheldin Elkatatny ◽  
Abdulwahab Ali
