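Improving the Classification Results of the Random Forest Algorithm for Detecting Diabetes Patients Using Normalization Methods [Peningkatan Hasil Klasifikasi pada Algoritma Random Forest untuk Deteksi Pasien Penderita Diabetes Menggunakan Metode Normalisasi]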

2021 ◽  
Vol 5 (1) ◽  
pp. 114-122
Author(s):  
Gde Agung Brahmana Suryanegara ◽  
Adiwijaya ◽  
Mahendra Dwifebri Purbolaksono

Diabetes is a disease caused by blood sugar levels that exceed normal limits. The number of diabetics in Indonesia has risen significantly: Basic Health Research reports that prevalence increased from 6.9% to 8.5% between 2013 and 2018, with an estimated total of more than 16 million sufferers. A technology that detects diabetes accurately is therefore needed so that the disease can be treated early, reducing the number of sufferers, disabilities, and deaths. The attributes in Gula Karya Medika's data are on different scales, which can complicate classification. For this reason, the researchers compare two data normalization methods, min-max normalization and z-score normalization, against a baseline without normalization, using Random Forest (RF) as the classifier in each case. RF has been tested in several previous studies and has produced high accuracy. The best accuracy was obtained by model 1 (min-max normalization with RF) at 95.45%, followed by model 2 (z-score normalization with RF) at 95% and model 3 (RF without normalization) at 92%. The authors conclude that model 1 outperforms the other two models and raises the accuracy of the Random Forest classifier to 95.45%.
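
The two normalization methods compared in this abstract can be sketched as follows. This is a minimal illustration using the standard textbook definitions, not the authors' code; the function names and the sample glucose readings are assumptions made for the example, and the z-score here uses the population standard deviation.

```python
from statistics import mean, pstdev

def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly into [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map every value to new_min.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]

def z_score_normalize(values):
    """Centre values on their mean and divide by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical blood-glucose readings (mg/dL), not taken from the study.
glucose = [90, 110, 150, 200]
print(min_max_normalize(glucose))
print(z_score_normalize(glucose))
```

Min-max output is bounded by the chosen range, while z-score output is unbounded but has zero mean and unit variance. Tree-based classifiers such as Random Forest split on thresholds, so any effect of scaling on their accuracy is an empirical question of the kind this study reports.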

2017 ◽  
Vol 5 (4) ◽  
pp. 319 ◽  
Author(s):  
Adel S. Eesa ◽  
Wahab Kh. Arabo

Neural Networks (NN) have been used by many researchers to solve problems in several domains, including classification and pattern recognition, and Backpropagation (BP) is one of the most well-known artificial neural network models. Constructing effective NN applications relies on characteristics such as the network topology, the learning parameters, and the normalization approach for the input and output vectors. The input and output vectors for BP need to be normalized properly to achieve the best network performance. This paper applies several normalization methods to several UCI datasets and compares them to find the method that works best with BP. Norm, Decimal scaling, Mean-Mad, Median-Mad, Min-Max, and Z-score normalization are considered. The comparative study shows that Mean-Mad and Median-Mad perform better than all the remaining methods, while the Norm method produces the worst result.
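
Three of the less familiar methods named above can be sketched as follows. The abstract does not give formulas, so these follow common textbook definitions and are assumptions: decimal scaling divides by a power of ten, Median-Mad uses the median and the median absolute deviation, and Mean-Mad uses the mean and the mean absolute deviation. The function names are hypothetical.

```python
from statistics import mean, median

def decimal_scaling(values):
    """Divide by the smallest power of ten that brings every |value| below 1."""
    largest = max(abs(v) for v in values)
    j = 0
    while largest / (10 ** j) >= 1:
        j += 1
    return [v / (10 ** j) for v in values]

def median_mad(values):
    """Centre on the median and scale by the median absolute deviation."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return [0.0 for _ in values]
    return [(v - med) / mad for v in values]

def mean_mad(values):
    """Centre on the mean and scale by the mean absolute deviation."""
    mu = mean(values)
    mad = mean([abs(v - mu) for v in values])
    if mad == 0:
        return [0.0 for _ in values]
    return [(v - mu) / mad for v in values]

# Hypothetical feature column containing one outlier.
column = [1, 2, 3, 4, 100]
print(decimal_scaling(column))
print(median_mad(column))
print(mean_mad(column))
```

Because the median and the median absolute deviation are barely moved by a single extreme value, Median-Mad keeps the bulk of the data on a sensible scale even when an outlier is present.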


2019 ◽  
Vol 7 (7) ◽  
pp. 10
Author(s):  
Feyzullah Koca

The aim of this study was to compare the anthropometric characteristics and motor performance test results of boy and girl ski athletes by age. A total of 41 girls and 47 boys participated voluntarily. One-way ANOVA and LSD tests were used. Height and body weight differed significantly (p < 0.001). The 12-year-old girl ski athletes were taller, and the body weight of girls was higher than that of boys. The Sit and Reach Test values of girls and boys at 11 years of age were significantly higher than those of boys and girls at 12 years of age (p < 0.001). There was no difference between boys and girls (p > 0.05). The flamingo test values of 11- and 12-year-old boys and girls differed significantly by gender and age (p < 0.01). The plate tapping test values of 11- and 12-year-old boys and girls did not differ significantly by gender and age (p > 0.05). Girls' sit-up and Standing Long Jump values were significantly better than boys' (p < 0.001). Boys' Bent Arm Hang and mini Cooper test values were significantly better than girls' (p < 0.001). Conclusion: Anthropometric characteristics and motor performance test results were within normal limits for the ages of the boy and girl ski athletes. Physical characteristics and motor performance parameters can vary with age and sex in 11- and 12-year-old children. Age and gender should therefore be taken into consideration when planning ski training and education for children.


2020 ◽  
Vol 8 (1) ◽  
Author(s):  
Danielle D Crain ◽  
Amanda Thomas ◽  
Farzaneh Mansouri ◽  
Charles W Potter ◽  
Sascha Usenko ◽  
...  

Abstract Marine animals experience additional stressors as humans continue to industrialize the oceans and as the climate continues to rapidly change. To examine how the environment or humans impact animal stress, many researchers analyse hormones from biological matrices. Scientists have begun to examine hormones in continuously growing biological matrices, such as baleen whale earwax plugs, baleen and pinniped vibrissae. Few of these studies have determined if the hormones in these tissues across the body of the organism are interchangeable. Here, hormone values in the right and left earplugs from the same individual were compared for two reasons: (i) to determine whether right and left earplug hormone values can be used interchangeably and (ii) to assess methods of standardizing hormones in right and left earplugs to control for individuals’ naturally varying hormone expressions. We analysed how absolute, baseline-corrected and Z-score normalized hormones performed in reaching these goals. Absolute hormones in the right and left earplugs displayed a positive relationship, while using Z-score normalization was necessary to standardize the variance in hormone expression. After Z-score normalization, it was possible to show that the 95% confidence intervals of the differences in corresponding laminae of the right and left earplugs include zero for both cortisol and progesterone. This indicates that the differences between hormone values in corresponding laminae of the right and left earplugs are not distinguishable from zero. The results of this study reveal that both right and left earplugs from the same baleen whale can be used in hormone analyses after Z-score normalization. This study also shows the importance of Z-score normalization to interpretation of results and methodologies associated with analysing long-term trends using whale earplugs.
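
One way to read "control for individuals' naturally varying hormone expressions" is to standardize each individual's series against its own mean and standard deviation. The sketch below illustrates that idea only; it is not the authors' procedure, and the function name, whale labels, and cortisol values are hypothetical.

```python
from statistics import mean, pstdev

def z_score_within(series_by_individual):
    """Z-score each individual's series against that individual's own mean and SD."""
    normalized = {}
    for individual, values in series_by_individual.items():
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            normalized[individual] = [0.0 for _ in values]
        else:
            normalized[individual] = [(v - mu) / sigma for v in values]
    return normalized

# Hypothetical cortisol values per lamina; not data from the study.
cortisol = {
    "whale_A": [1.0, 2.0, 4.0, 3.0],
    "whale_B": [10.0, 20.0, 40.0, 30.0],  # same pattern at ten times the level
}
for individual, z in z_score_within(cortisol).items():
    print(individual, [round(v, 3) for v in z])
```

In this toy case the two individuals differ tenfold in absolute level, yet their standardized profiles coincide, which is what makes the series comparable.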


Author(s):  
Dwianti Westari ◽  

A diabetes classification system is very useful in the health sector. This paper discusses a classification system for diabetes that uses the K-Means algorithm. The Pima Indian Diabetes (PID) dataset is used to train and evaluate the algorithm. The unbalanced value ranges of the attributes affect the quality of the classification result, so the data are preprocessed with the aim of improving classification accuracy on the PID dataset. Two preprocessing methods are used, min-max normalization and z-score normalization, and the resulting classification accuracies are compared. Before classification, the data are divided into training data and test data. The classification test using the K-Means algorithm shows that the best accuracy, 79%, is obtained on the PID dataset normalized with the min-max method, which outperforms z-score normalization.
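
The effect of unbalanced value ranges on a distance-based algorithm such as K-Means can be shown with a small sketch. The three records below are hypothetical (loosely modelled on the glucose and pedigree attributes of the PID dataset), and the helper name `min_max_columns` is an assumption for illustration.

```python
from math import dist

def min_max_columns(rows):
    """Min-max normalize each column of a list of equal-length rows."""
    scaled_columns = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = hi - lo
        scaled_columns.append([(v - lo) / span if span else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_columns)]

# Hypothetical records: (plasma glucose in mg/dL, diabetes pedigree function).
patients = [
    (100.0, 0.2),
    (101.0, 1.8),  # glucose close to patient 0, pedigree very different
    (140.0, 0.2),  # pedigree equal to patient 0, glucose different
]

# On the raw data, glucose dominates the Euclidean distance.
raw_near = dist(patients[0], patients[1])
raw_far = dist(patients[0], patients[2])

# After min-max scaling, both attributes contribute on the same [0, 1] scale.
scaled = min_max_columns(patients)
scaled_near = dist(scaled[0], scaled[1])
scaled_far = dist(scaled[0], scaled[2])

print(raw_near, raw_far)
print(scaled_near, scaled_far)
```

Before scaling, patient 1 looks far closer to patient 0 than patient 2 does, purely because glucose is measured in larger numbers. After scaling, the two distances are nearly equal, so cluster assignment no longer depends on the choice of units.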


Author(s):  
Siti Rafidah M-Dawam ◽  
Ku Ruhana Ku-Mahamud

Many non-parametric techniques such as Neural Network (NN) are used to forecast current reservoir water level (RWL<sub>t</sub>). However, modelling using these techniques can be established without knowledge of the mathematical relationship between the inputs and the corresponding outputs. Another important issue related to forecasting is the preprocessing stage, where most non-parametric techniques normalize data into discretized data. Data normalization can influence the results of forecasting. This paper presents reservoir water level (RWL) forecasting using normalization and multiple regression. In this study, continuous data of rainfall (RF) and changes of reservoir water level (WC) are normalized using two different normalization methods, the Min-Max and Z-Score techniques. The comparative studies and the forecasting process are carried out using multiple regression. Three input scenarios for multiple regression were designed, comprising temporal patterns of WC and RF to which the sliding window technique was applied. The experimental results showed that the best input scenario for forecasting the RWL<sub>t</sub> employs both the RF and the WC, in which the best predictors are three days’ delay of WC and two days’ delay of RF. The findings also suggested that the performance of the RWL forecasting model using multiple regression was dependent on the normalization methods.
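
The sliding window step described above can be sketched as building one row of lagged predictors per forecast day. This is an illustration under assumptions: the abstract does not say whether the window includes day t itself, so the sketch uses only earlier days, and the function name and sample series are hypothetical.

```python
def lagged_rows(wc, rf, wc_delay=3, rf_delay=2):
    """Pair each day t with the previous wc_delay WC values and rf_delay RF values."""
    start = max(wc_delay, rf_delay)
    rows = []
    for t in range(start, len(wc)):
        wc_lags = [wc[t - k] for k in range(1, wc_delay + 1)]
        rf_lags = [rf[t - k] for k in range(1, rf_delay + 1)]
        rows.append((t, wc_lags + rf_lags))
    return rows

# Hypothetical daily series, not data from the study.
wc = [0.1, 0.0, -0.2, 0.3, 0.1, 0.0]   # change in reservoir water level
rf = [5.0, 0.0, 12.0, 3.0, 0.0, 7.0]   # rainfall

for t, features in lagged_rows(wc, rf):
    print(t, features)
```

Each feature row would then be normalized and passed to the regression model as the predictors for the water level on day t.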


2009 ◽  
Vol 6 (6) ◽  
pp. 10447-10477 ◽  
Author(s):  
L. Zhang ◽  
M. Xu ◽  
M. Huang ◽  
G. Yu

Abstract. Modeling the ecosystem carbon cycle on regional and global scales is crucial to predicting future global atmospheric CO2 concentration and thus global temperature, a prediction that features large uncertainties due mainly to the limitations in our knowledge and in the climate and ecosystem models. There is a growing body of research on parameter estimation against available carbon measurements to reduce model prediction uncertainty at regional and global scales. However, systematic errors in the observation data have rarely been investigated in the optimization procedures of previous studies. In this study, we examined the feasibility of reducing the impact of systematic errors on parameter estimation using normalization methods, and evaluated the effectiveness of three normalization methods (i.e. maximum normalization, min-max normalization, and z-score normalization) in inverting key parameters, for example the maximum carboxylation rate (Vcmax,25) at a reference temperature of 25°C, in a process-based ecosystem model for deciduous needle-leaf forests in northern China constrained by leaf area index (LAI) data. The LAI data used for parameter estimation were composed of the model output LAI (truth) and various designated systematic errors and random errors. We found that the estimation of Vcmax,25 could be severely biased with the composite LAI if no normalization was applied. Compared with the maximum normalization and the min-max normalization methods, the z-score normalization method was the most robust in reducing the impact of systematic errors on parameter estimation. The most probable values of estimated Vcmax,25 inverted from the z-score normalized LAI data were consistent with the true parameter values used as model inputs, though the estimation uncertainty increased with the magnitude of the random errors in the observations. We concluded that the z-score normalization method should be applied to the observed or measured data to improve model parameter estimation, especially when the potential errors in the constraining (observation) datasets are unknown.
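
A short derivation shows why z-score normalization can cancel a systematic error in an idealized case. The abstract does not state the form of the designated systematic errors, so the affine form below (a gain a > 0 and an offset b applied to the true values x_i) is an assumption made for illustration.

```latex
\begin{align*}
y_i &= a\,x_i + b, \qquad a > 0 \\
\bar{y} &= a\,\bar{x} + b, \qquad s_y = a\,s_x \\
z(y_i) &= \frac{y_i - \bar{y}}{s_y}
        = \frac{a\,(x_i - \bar{x})}{a\,s_x}
        = z(x_i) \\
\frac{y_i}{\max_j y_j} &= \frac{a\,x_i + b}{a\,\max_j x_j + b}
        \neq \frac{x_i}{\max_j x_j}
        \quad \text{in general, unless } b = 0
\end{align*}
```

Under this assumption the z-scores of the biased observations equal those of the true values, whereas maximum normalization removes only the gain and not the offset. Min-max normalization is also unchanged by a purely affine error, so this idealized case does not by itself explain the ranking of z-score over min-max; that ranking is an empirical result of the experiments reported in the abstract.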


2020 ◽  
Vol 4 (5) ◽  
pp. 805-812
Author(s):  
Riska Chairunisa ◽  
Adiwijaya ◽  
Widi Astuti

Cancer is one of the deadliest diseases in the world, with a mortality rate of 57.3% in Asia in 2018. Early diagnosis is therefore needed to avoid an increase in cancer mortality. As machine learning develops, cancer gene data obtained from microarrays can be processed for early detection of cancer. However, microarray data have so many attributes that dimension reduction is necessary. To overcome this problem, this study used the Discrete Wavelet Transform (DWT) for dimension reduction, with Classification and Regression Tree (CART) and Random Forest (RF) as classification methods. The purpose of using two classification methods is to find out which one performs best when combined with DWT dimension reduction. This research uses five microarray datasets, namely Colon Tumors, Breast Cancer, Lung Cancer, Prostate Tumors and Ovarian Cancer, from the Kent-Ridge Biomedical Dataset. The best accuracies obtained in this study were 76.92% for breast cancer with CART-DWT, 90.1% for colon tumors with RF-DWT, 100% for lung cancer with RF-DWT, 95.49% for prostate tumors with RF-DWT, and 100% for ovarian cancer with RF-DWT. From these results it can be concluded that RF-DWT is better than CART-DWT.
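
Dimension reduction with the DWT can be sketched by keeping only the approximation coefficients, which halves the feature count at each decomposition level. The abstract does not name the wavelet family or the number of levels, so the Haar wavelet, the padding rule, and the function names below are assumptions chosen for simplicity.

```python
from math import sqrt

def haar_dwt_level(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    values = list(signal)
    if len(values) % 2:
        # Assumption: pad odd-length input by repeating the last value.
        values.append(values[-1])
    pairs = range(0, len(values), 2)
    approx = [(values[i] + values[i + 1]) / sqrt(2) for i in pairs]
    detail = [(values[i] - values[i + 1]) / sqrt(2) for i in pairs]
    return approx, detail

def reduce_features(features, levels):
    """Keep only the approximation coefficients after `levels` decompositions."""
    reduced = list(features)
    for _ in range(levels):
        reduced, _detail = haar_dwt_level(reduced)
    return reduced

# Hypothetical expression levels for eight genes of one sample.
sample = [2.0, 4.0, 6.0, 8.0, 1.0, 3.0, 5.0, 7.0]
print(reduce_features(sample, levels=2))  # eight features reduced to two
```

The reduced vectors would then be used as classifier input in place of the original attributes.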


2020 ◽  
Vol 27 (3) ◽  
pp. 178-186 ◽  
Author(s):  
Ganesan Pugalenthi ◽  
Varadharaju Nithya ◽  
Kuo-Chen Chou ◽  
Govindaraju Archunan

Background: N-glycosylation is one of the most important post-translational mechanisms in eukaryotes. N-glycosylation predominantly occurs in the N-X-[S/T] sequon, where X is any amino acid other than proline. However, not all N-X-[S/T] sequons in proteins are glycosylated. Therefore, accurate prediction of N-glycosylation sites is essential to understanding the N-glycosylation mechanism. Objective: In this article, our motivation is to develop a computational method to predict N-glycosylation sites in eukaryotic protein sequences. Methods: We report a random forest method, Nglyc, to predict N-glycosylation sites from protein sequence using 315 sequence features. The method was trained on a dataset of 600 N-glycosylation sites and 600 non-glycosylation sites and tested on a dataset containing 295 N-glycosylation sites and 253 non-glycosylation sites. Nglyc prediction was compared with the NetNGlyc, EnsembleGly and GPP methods. Further, the performance of Nglyc was evaluated using human and mouse N-glycosylation sites. Results: The Nglyc method achieved an overall training accuracy of 0.8033 with all 315 features. Performance comparison with the NetNGlyc, EnsembleGly and GPP methods shows that Nglyc performs better than the other methods, with high sensitivity and specificity. Conclusion: Our method achieved an overall accuracy of 0.8248 with 0.8305 sensitivity and 0.8182 specificity. The comparison study shows that our method performs better than the other methods. The applicability and success of our method were further evaluated using human and mouse N-glycosylation sites. The Nglyc method is freely available at https://github.com/bioinformaticsML/Ngly.
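
The reported sensitivity, specificity and accuracy can be related through the confusion-matrix counts. In the sketch below only the test-set sizes (295 positive and 253 negative sites) come from the abstract; the split into true and false predictions is back-calculated from the reported rates and is an inference, not a figure stated by the authors.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of true sites that were detected
    specificity = tn / (tn + fp)   # share of non-sites that were rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# 295 positives and 253 negatives are from the abstract.
# The TP/FN and TN/FP splits are inferred from the reported rates (assumption).
sens, spec, acc = classification_metrics(tp=245, fn=50, tn=207, fp=46)
print(round(sens, 4), round(spec, 4), round(acc, 4))  # prints 0.8305 0.8182 0.8248
```

With these inferred counts the three values agree with the figures given in the Conclusion, which suggests they refer to the independent test set rather than the training set.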


Author(s):  
V. Purushothaman ◽  
K. Vinoth Kumar ◽  
Sabari Girish Ambat ◽  
R. Venkataswami

Abstract. Background: Total brachial plexus palsy (TBPP) accounts for nearly 50% of all brachial plexus injuries. Because achieving a good functional hand was almost impossible, the aim was limited to obtaining good shoulder and elbow function. It was Gu who popularized the concept of using the contralateral C7 (CC7) with a vascularized ulnar nerve graft (VUNG) to obtain some hand function. We have modified it to suit our patients by conducting it as a single-stage procedure, thereby trying to obtain a functional upper limb. Methods: From 2009 to 2014, we had 20 TBPP patients. We feel nerve reconstruction is always better than any other salvage procedure, including free muscle transfer. We modified Gu's concept and present our concept of total nerve reconstruction as “ALL IN ONE OR (W)HOLE IN ONE REPAIR.” Results: All patients were able to move their reconstructed limbs independently or with the help of the contralateral limb. Three patients developed a hook grip and one patient was able to use the limb in bimanual tasks. One important observation is that all the reconstructed limbs regained bulk and, to a certain extent, a normal attitude and appearance, as patients no longer hide the limb or hang it in a sling. Conclusion: Adult brachial plexus injury is a devastating injury that affects young males. With this procedure, the affected limb is not dissociated from the rest of the body, and rehabilitation can be aimed at obtaining a supportive limb.

