Decision Trees for Prediction and Data Mining

2005 ◽  
Author(s):  
Yin-Loh Wei
2018 ◽  
Vol 2 (2) ◽  
pp. 167
Author(s):  
Marko Ferdian Salim ◽  
Sugeng Sugeng

Background: Diabetes mellitus is a chronic disease that imposes a broad economic and social burden. Patient data are recorded in medical records stored in the hospital information system database, but the recorded data have not been analyzed effectively to produce valuable information. Data mining techniques can be used to produce such information. Objective: To identify the characteristics of diabetes mellitus patients, and the trends and types of diabetes mellitus, by applying data mining techniques at RSUP Dr. Sardjito Yogyakarta. Methods: This was a descriptive observational study with a cross-sectional design. Data were collected retrospectively through observation and a documentation study of electronic medical records at RSUP Dr. Sardjito Yogyakarta. The collected data were then analyzed using the Weka application. Results: There were 1,554 diabetes mellitus patients at RSUP Dr. Sardjito in 2011-2016, with a decreasing trend. The largest group of patients was aged 56-63 years (27.86%). Cases were dominated by type 2 diabetes mellitus, and the most frequent complications were hypertension, nephropathy, and neuropathy. Applying the J48 decision tree algorithm (accuracy 88.42%) to the patients' medical records produced several rules. Conclusion: Data mining classification (accuracy 88.42%) with decision trees successfully identified patient characteristics and found several rules that the hospital can use in decision-making about diabetes mellitus.
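J48 is Weka's implementation of the C4.5 algorithm, which selects split attributes by gain ratio. The following is a minimal sketch of that criterion for categorical attributes only, not the study's actual pipeline. The records and the attribute names `age_group`, `hypertension` and `dm_type` are invented toy values for illustration and are not data from the study.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def gain_ratio(rows, attribute, target):
    """Gain ratio of splitting `rows` (a list of dicts) on a categorical attribute."""
    # Group the target labels by the value of the candidate attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    total = len(rows)
    # Expected entropy of the target after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy([row[target] for row in rows]) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0

# Hypothetical toy records (assumption: NOT taken from the hospital data).
toy_rows = [
    {"age_group": "under_40", "hypertension": "no", "dm_type": "type1"},
    {"age_group": "under_40", "hypertension": "no", "dm_type": "type1"},
    {"age_group": "40_plus", "hypertension": "yes", "dm_type": "type2"},
    {"age_group": "40_plus", "hypertension": "no", "dm_type": "type2"},
]

ratio_age = gain_ratio(toy_rows, "age_group", "dm_type")
ratio_hypertension = gain_ratio(toy_rows, "hypertension", "dm_type")
print(ratio_age, ratio_hypertension)
```

In this toy set `age_group` separates the two classes perfectly, so it scores higher than `hypertension` and would be chosen as the first split.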


Author(s):  
Malcolm J. Beynon

The seminal work of Zadeh (1965), namely fuzzy set theory (FST), has developed into a methodology fundamental to analysis that incorporates vagueness and ambiguity. Data mining endeavours to find potentially meaningful patterns in data (Hu & Tzeng, 2003). This includes the construction of if-then decision rule systems, which aim for a level of inherent interpretability in the antecedents and consequents identified for object classification (see Breiman, 2001). Within a fuzzy environment this is extended to allow a linguistic facet to the possible interpretation, with examples including mining time series data (Chiang, Chow, & Wang, 2000) and multi-objective optimisation (Ishibuchi & Yamamoto, 2004). One approach to if-then rule construction has been through the use of decision trees (Quinlan, 1986), where the path down a branch of a decision tree (through a series of nodes) is associated with a single if-then rule. A key characteristic of traditional decision tree analysis is that the antecedents described in the nodes are crisp; this restriction is mitigated when operating in a fuzzy environment (Crockett, Bandar, Mclean, & O’Shea, 2006). This chapter investigates the use of fuzzy decision trees as an effective tool for data mining. Pertinent to data mining and decision making, Mitra, Konwar and Pal (2002) succinctly describe a most important feature of decision trees, crisp and fuzzy: their capability to break down a complex decision-making process into a collection of simpler decisions, thereby providing an easily interpretable solution.
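The correspondence between a root-to-leaf path and a single if-then rule, and the difference between a crisp and a fuzzy antecedent, can be sketched as follows. This is an illustrative sketch only, not the chapter's method: the nested-dict tree layout, the attribute names `income` and `debt`, the threshold values and the linear ramp membership function are all assumptions made for the example.

```python
def extract_rules(node, conditions=()):
    """Return one (antecedents, consequent) pair per root-to-leaf path."""
    if "label" in node:  # leaf node: the path so far forms a complete rule
        return [(list(conditions), node["label"])]
    rules = []
    for test, child in node["branches"].items():
        antecedent = f"{node['attribute']} {test}"
        rules += extract_rules(child, conditions + (antecedent,))
    return rules

def crisp_high(x, threshold=50.0):
    """Crisp antecedent: a value is either 'high' (1.0) or not (0.0)."""
    return 1.0 if x > threshold else 0.0

def fuzzy_high(x, low=40.0, high=60.0):
    """Fuzzy antecedent: membership in 'high' rises linearly from low to high."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)

# Hypothetical tree (assumption: attribute names and labels are invented).
tree = {
    "attribute": "income",
    "branches": {
        "is high": {"label": "approve"},
        "is low": {
            "attribute": "debt",
            "branches": {
                "is high": {"label": "reject"},
                "is low": {"label": "approve"},
            },
        },
    },
}

rules = extract_rules(tree)
for antecedents, consequent in rules:
    print("IF " + " AND ".join(antecedents) + " THEN " + consequent)
```

With the crisp test a value of 50 is simply not "high", whereas the fuzzy membership assigns it a degree of 0.5, which is what gives fuzzy rules their linguistic, graded interpretation.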


2013 ◽  
Vol 2013 ◽  
pp. 1-6 ◽  
Author(s):  
Gábor Szűcs

The paper deals with classification in privacy-preserving data mining. An algorithm, the Random Response Forest, is introduced that constructs many binary decision trees, as an extension of Random Forest to privacy-preserving problems. Random Response Forest uses the Random Response idea from among the anonymization methods, which keeps the original data instead of generalizing them, but mixes them. An anonymity metric is defined for the indistinguishability of two mixed sets of data. This metric, the binary anonymity, is investigated and taken into account for the optimal coding of the binary variables. The accuracy of Random Response Forest is presented at the end of the paper.
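As background for the Random Response idea, the sketch below shows the classic randomized response scheme for a single binary variable: each value is reported truthfully with probability `p_truth` and flipped otherwise, and the true proportion is recovered from the mixed data. This is the textbook scheme, offered under the assumption that it is a useful analogy; it is not the Random Response Forest algorithm or the binary anonymity metric from the paper, and the function names and parameters are hypothetical.

```python
import random

def randomize(bits, p_truth, rng):
    """Report each bit truthfully with probability p_truth, otherwise flip it."""
    return [b if rng.random() < p_truth else 1 - b for b in bits]

def estimate_proportion(reported, p_truth):
    """Unbiased estimate of the true share of 1s from randomized reports.

    If pi is the true share, the expected observed share is
    pi * p_truth + (1 - pi) * (1 - p_truth); solving for pi gives the estimator.
    """
    if p_truth == 0.5:
        raise ValueError("p_truth = 0.5 destroys all information")
    observed = sum(reported) / len(reported)
    return (observed + p_truth - 1) / (2 * p_truth - 1)

# Toy data (assumption: synthetic, 30% of the true values are 1).
rng = random.Random(0)
true_bits = [1] * 6000 + [0] * 14000
reported_bits = randomize(true_bits, p_truth=0.8, rng=rng)
estimate = estimate_proportion(reported_bits, p_truth=0.8)
print(round(estimate, 3))
```

No individual reported value can be trusted, yet aggregate statistics such as class proportions remain estimable, which is the property a tree learner on mixed data would rely on.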


Crystals ◽  
2021 ◽  
Vol 11 (10) ◽  
pp. 1218
Author(s):  
Natasha Dropka ◽  
Klaus Böttcher ◽  
Martin Holena

The aim of this study was to assess the ability of various data mining and supervised machine learning techniques (correlation analysis, k-means clustering, principal component analysis, and decision trees for regression and classification) to derive, optimize and understand the factors influencing VGF-GaAs growth. Training data were generated by Computational Fluid Dynamics (CFD) simulations and consisted of 130 datasets with 6 inputs (growth rate and power of 5 heaters) and 5 outputs (interface position and deflection, and temperatures at various positions in GaAs). Data mining results confirmed a good dispersion of the training data, with no feasible dimensionality reduction. Data clustering was observed in relation to the position of the crystallization front relative to the side heaters. Based on the statistical performance criteria and training results, decision trees identified the most decisive inputs and their ranges for a favorable interface shape and to keep GaAs temperature beyond limits for heavy arsenic evaporation. Decision trees are a recommendable machine learning technique, with short training times and acceptable predictive accuracy based on a small volume of CFD training data, capable of providing guidelines for understanding the crystal growth process, which is a prerequisite for the growth of low-cost, high-quality bulk crystals.


10.1142/9097 ◽  
2013 ◽  
Author(s):  
Lior Rokach ◽  
Oded Maimon

Author(s):  
Mário Saldanha ◽  
Marcelo Porto ◽  
César Marcon ◽  
Luciano Agostini

This dissertation presents a fast depth map coding solution for 3D-High Efficiency Video Coding (3D-HEVC) based on static Coding Unit (CU) splitting decision trees. The proposed solution builds on our previous work and avoids the costly Rate-Distortion Optimization (RDO) process for depth map coding, which evaluates several block partitioning possibilities and encoding modes in order to choose the best one. This coding approach uses data mining and machine learning to extract the correlation among the encoder context attributes and to build the static decision trees. Each decision tree determines whether a depth map CU must be split into smaller blocks, taking the encoding context into account by evaluating the CU features and encoder attributes. The results demonstrate that this approach can halve the 3D-HEVC encoder processing time with negligible coding efficiency loss. Moreover, the results surpass all related works in processing time and coding efficiency. The results reported in this dissertation were published in three journals and at two events, and also resulted in a patent filing. The master's student is the first author of these outputs.


Author(s):  
Ricardo Timarán Pereira ◽  
Andrés Calderón Romero ◽  
Javier Jiménez Toledo

Abstract: This article presents the first results of a research project that aims to detect patterns of student dropout from the socioeconomic, academic, disciplinary and institutional data of students in the undergraduate programs of the Universidad de Nariño and the Institución Universitaria IUCESMAG, two higher education institutions in the city of Pasto (Colombia), using data mining techniques. The results reported here correspond to the Universidad de Nariño. Socioeconomic and academic profiles of students who drop out were discovered using a classification technique based on decision trees. The knowledge generated will support effective decision-making by university management aimed at formulating policies and strategies related to the student retention programs currently in place. Keywords: Extraction of Profiles, Student Dropout, Data Mining, Classification, Decision Trees

