A Machine Learning View for Health Data Mining Emphasizes on the Decision Trees

Data mining attracts researchers from several areas, including databases, machine learning, and agent-oriented software engineering. As data volumes grow, there is an increasing need to extract knowledge from large datasets that are difficult to handle and process with traditional methods. Software agents can make data mining processes more efficient: they can perform selection, extraction, preprocessing, and integration of data, as well as parallel, distributed, or multisource mining. This paper proposes a framework based on multiagent systems for applying data mining techniques to health datasets. The usage scenarios are datasets for hypothyroidism and diabetes, with two different mining processes run in parallel on each database.

Development and Optimization of VGF-GaAs Crystal Growth Process Using Data Mining and Machine Learning Techniques

Crystals ◽ 10.3390/cryst11101218 ◽ 2021 ◽ Vol 11 (10) ◽ pp. 1218

Author(s): Natasha Dropka ◽ Klaus Böttcher ◽ Martin Holena

Keyword(s): Machine Learning ◽ Data Mining ◽ Crystal Growth ◽ Decision Trees ◽ Growth Process ◽ Training Data ◽ Machine Learning Techniques ◽ Interface Position ◽ Crystal Growth Process ◽ Learning Techniques

The aim of this study was to assess the ability of various data mining and supervised machine learning techniques (correlation analysis, k-means clustering, principal component analysis, and decision trees for regression and classification) to derive, optimize, and understand the factors influencing VGF-GaAs growth. Training data were generated by Computational Fluid Dynamics (CFD) simulations and consisted of 130 datasets with 6 inputs (growth rate and the power of 5 heaters) and 5 outputs (interface position and deflection, and temperatures at various positions in GaAs). Data mining results confirmed a good dispersion of the training data, with no feasible dimensionality reduction. Data clustering was observed in relation to the position of the crystallization front relative to the side heaters. Based on the statistical performance criteria and training results, decision trees identified the most decisive inputs and their ranges for a favorable interface shape and for keeping the GaAs temperature beyond limits for heavy arsenic evaporation. Decision trees are a recommendable machine learning technique, with short training times and acceptable predictive accuracy on a small volume of CFD training data, and they can provide guidelines for understanding the crystal growth process, which is a prerequisite for growing low-cost, high-quality bulk crystals.
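As a rough illustration of how a regression tree can single out a decisive input and its range, the sketch below searches for the single split that most reduces the squared error of the target. It is a minimal, assumption-laden example and not the study's code: the function names (`sse`, `best_split`), the two input columns, and the toy numbers are all made up for illustration.

```python
from statistics import fmean


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, targets):
    """Return (feature_index, threshold, sse_reduction) of the best single split.

    rows: list of equal-length feature lists; targets: list of floats.
    Candidate thresholds are midpoints between consecutive distinct values.
    """
    base = sse(targets)
    best = (None, None, 0.0)
    for j in range(len(rows[0])):
        values = sorted({r[j] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, targets) if r[j] <= t]
            right = [y for r, y in zip(rows, targets) if r[j] > t]
            gain = base - sse(left) - sse(right)
            if gain > best[2]:
                best = (j, t, gain)
    return best


# Made-up toy data (NOT from the study): columns are a hypothetical
# [growth_rate, heater_power]; the target depends only on heater_power.
rows = [[1.0, 10], [2.0, 10], [3.0, 10], [1.0, 20], [2.0, 20], [3.0, 20]]
targets = [0.5, 0.5, 0.5, 2.0, 2.0, 2.0]
feature, threshold, gain = best_split(rows, targets)
print(feature, threshold, gain)
```

A full tree would apply `best_split` recursively to each side of the chosen threshold; the first split alone already indicates which input matters most in this toy data.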

Making Use of Functional Dependencies Based on Data to Find Better Classification Trees

International Journal of Circuits, Systems and Signal Processing ◽ 10.46300/9106.2021.15.160 ◽ 2021 ◽ Vol 15 ◽ pp. 1475-1485

Author(s): Hyontai Sug

Keyword(s): Machine Learning ◽ Data Mining ◽ Decision Trees ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Functional Dependencies ◽ Chi Square ◽ Chi Square Test ◽ Novel Method ◽ Categorical Attributes

For classification tasks, independence between conditional attributes is a precondition for successful data mining. Decision trees are among the most widely used machine learning algorithms because they are easy to understand. Because dependency between conditional attributes can produce more complex trees, it is important to supply conditional attributes that are independent of each other: for decision trees, as for other machine learning algorithms, conditional attributes should be independent of each other and dependent only on the decisional attributes. The statistical method for checking independence between attributes is the Chi-square test, but the test is effective for categorical attributes only. Its applicability is therefore limited, because most datasets for data mining mix categorical and numerical attributes. To overcome this problem, a novel method for testing dependency between conditional attributes is suggested; it is based on functional dependency derived from the data and can be applied to any dataset irrespective of the data types of its attributes. After removing highly dependent conditional attributes, better decision trees can be generated. Experiments were performed to show that the method is effective, and they showed very good results.
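The two kinds of dependency check mentioned above can be sketched as follows. `chi_square_statistic` computes the standard Pearson chi-square statistic for two categorical columns. `fd_strength` is only a hypothetical stand-in for a data-based functional dependency measure (the abstract does not specify the paper's exact definition): it reports the fraction of rows consistent with X determining Y, and it works for any hashable values, numerical or categorical.

```python
from collections import Counter, defaultdict


def chi_square_statistic(xs, ys):
    """Pearson chi-square statistic for independence of two categorical columns."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_counts = Counter(xs)
    y_counts = Counter(ys)
    stat = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n          # count expected under independence
            observed = joint.get((x, y), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def fd_strength(xs, ys):
    """Fraction of rows consistent with the dependency X -> Y (illustrative only).

    For each X value, rows carrying its most common Y value count as
    consistent. 1.0 means Y is fully determined by X in this data.
    """
    groups = defaultdict(Counter)
    for x, y in zip(xs, ys):
        groups[x][y] += 1
    consistent = sum(c.most_common(1)[0][1] for c in groups.values())
    return consistent / len(xs)


# Made-up toy columns, not taken from the paper's experiments.
color = ["r", "r", "g", "g"]
size = ["s", "l", "s", "l"]      # unrelated to color
label = ["x", "x", "y", "y"]     # fully determined by color

print(chi_square_statistic(color, size), fd_strength(color, size))
print(chi_square_statistic(color, label), fd_strength(color, label))
print(fd_strength([1.5, 1.5, 2.5], [10, 10, 20]))  # numerical attributes
```

One caveat with this simple measure: an attribute whose values are all distinct trivially "determines" every other attribute, so a practical version would need to correct for the number of distinct values.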

Implementation of Real-Time Medical and Health Data Mining System Based on Machine Learning

Journal of Healthcare Engineering ◽ 10.1155/2021/7011205 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-5

Author(s): Pengyuan Wang ◽ Jie Li

Keyword(s): Machine Learning ◽ Data Mining ◽ Health Management ◽ Learning Algorithms ◽ Health Data ◽ Machine Learning Algorithms ◽ Application Process ◽ Mining System ◽ Sensing Technology ◽ Data Mining System

This article analyzes the application process of data mining technology in the medical and health management system and uses machine learning algorithms to design a medical and health data mining system. The system collects patients' physical health data using wireless sensing technology and analyzes the data with machine learning algorithms. The collected health data are uploaded to the system for cluster analysis. Finally, the method is applied to mining patients' diagnosis data, demonstrating through examples the effectiveness of the classification method in the medical field.

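TEKNIK DATA MINING UNTUK MENGKLASIFIKASIKAN DATA ULASAN DESTINASI WISATA MENGGUNAKAN REDUKSI DATA PRINCIPAL COMPONENT ANALYSIS (PCA) [Data Mining Techniques for Classifying Tourist Destination Review Data Using Principal Component Analysis (PCA) Data Reduction]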

Edutic - Scientific Journal of Informatics Education ◽ 10.21107/edutic.v7i2.9247 ◽ 2021 ◽ Vol 7 (2)

Author(s): Alven Safik Ritonga ◽ Isnaini Muhandhis

Keyword(s): Machine Learning ◽ Data Mining ◽ Principal Component Analysis ◽ Support Vector Machine ◽ Decision Trees ◽ Principal Component ◽ Component Analysis ◽ Support Vector

Growth in tourist visits to a destination is influenced by tourist satisfaction during the visit. To determine whether a tourism destination meets tourists' expectations, tourist satisfaction needs to be evaluated. The aim of this study is to obtain a classification model with high accuracy for classifying destination satisfaction reviews and to produce a decision-support tool for tourism destination development. The data used in this study has fairly high dimensionality, which lengthens the computation time for classification and can make the analysis impractical or infeasible, so dimensionality reduction is applied to obtain a much smaller data dimension while preserving the integrity of the original data. The method used to classify destination satisfaction reviews combines Principal Component Analysis (PCA) as the dimensionality reduction method with three data mining methods: Support Vector Machine (SVM), Artificial Neural Network (ANN), and Decision Trees. The study uses secondary data taken from the UCI Machine Learning Repository. The results of combining PCA with the three methods show that classification accuracy improves for some of the methods. Of the three methods, SVM-PCA has the best accuracy at 91.50%, followed by ANN-PCA at 89.46% and Decision-PCA at 88.78%.
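The PCA reduction step that precedes classification in this abstract can be sketched in plain Python. The example below estimates only the first principal component, using power iteration on the sample covariance matrix, and projects the rows onto it. The function names and the toy data are assumptions for illustration; the study's own implementation and the number of components it retained are not stated here.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (column means, unit-length first principal component)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly multiply by the covariance matrix and
    # renormalize. Assumes the start vector is not orthogonal to the
    # dominant eigenvector and that the data are not all identical.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return means, v


def project(rows, means, component):
    """Project each centered row onto the component, one score per row."""
    return [sum((x - m) * c for x, m, c in zip(r, means, component))
            for r in rows]


# Made-up toy data lying exactly on the line y = 2x.
data = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
means, pc1 = first_principal_component(data)
scores = project(data, means, pc1)
print(pc1, scores)
```

The one-dimensional `scores` would then be passed to a classifier such as an SVM or a decision tree in place of the original columns.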

Data Mining and Machine Learning

10.1017/9781108564175 ◽ 2020 ◽ Cited By ~ 2

Author(s): Mohammed J. Zaki ◽ Wagner Meira, Jr

Keyword(s): Machine Learning ◽ Data Mining

Instant medical care and drug suggestion service using data mining and machine learning based intelligent self-diagnosis medical system

International Journal of Advanced Life Sciences ◽ 10.26627/ijals/2017/10.03.0022 ◽ 2017 ◽ Vol 10 (03) ◽ pp. 318-325

Author(s): Sudha M

Keyword(s): Machine Learning ◽ Data Mining ◽ Medical Care ◽ Medical System ◽ Using Data

Machine Learning and Data Mining Activity Results when using Projectiles in Different Sports

International Journal of Advanced Trends in Computer Science and Engineering ◽ 10.30534/ijatcse/2020/103932020 ◽ 2020 ◽ Vol 9 (3) ◽ pp. 3157-3160

Author(s): Burov Alexey Gennadievich

Keyword(s): Machine Learning ◽ Data Mining ◽ Mining Activity

An Introduction to Machine Learning for Panel Data: Decision Trees, Random Forests, and Other Dendrological Methods

SSRN Electronic Journal ◽ 10.2139/ssrn.3717879 ◽ 2020

Author(s): James Ming Chen

Keyword(s): Machine Learning ◽ Panel Data ◽ Decision Trees ◽ Random Forests

Classification of Operational and Financial Variables Affecting the Bullwhip Effect in Indian Sectors: A Machine Learning Approach

Recent Patents on Computer Science ◽ 10.2174/2213275911666181012121059 ◽ 2019 ◽ Vol 12 (3) ◽ pp. 171-179 ◽ Cited By ~ 6

Author(s): Sachin Gupta ◽ Anurag Saxena

Keyword(s): Machine Learning ◽ Data Mining ◽ Supply Chain ◽ Supply Chain Management ◽ Product Life Cycle ◽ Consumer Preference ◽ Bullwhip Effect ◽ Machine Learning Techniques ◽ Chain Management ◽ Financial Variables

Background: The bullwhip effect is the increase in variability of production or procurement relative to a smaller increase in variability of demand or sales. It is considered an encumbrance to supply chain optimization because it causes inadequacy in the supply chain. Operations and supply chain management consultants, managers, and researchers have studied the causes behind the dynamic nature of supply chain management and have listed shorter product life cycles, changes in technology, changes in consumer preference, and globalization, to name a few. Most of the literature exploring the bullwhip effect is based on simulations and mathematical models. Exploring the bullwhip effect using machine learning is the novel approach of the present study. Methods: The present study explores the operational and financial variables affecting the bullwhip effect on the basis of secondary data. Data mining and machine learning techniques are used to explore the variables affecting the bullwhip effect in Indian sectors. The Rapid Miner tool was used for data mining, and 10-fold cross validation was performed. A Weka Alternating Decision Tree (w-ADT) was built after the classification to help decision makers mitigate the bullwhip effect. Results: Of the 19 selected variables affecting the bullwhip effect, 7 were selected that have the highest accuracy level with minimum deviation. Conclusion: Classification using machine learning provides an effective tool and techniques for exploring the bullwhip effect in supply chain management.
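One common way to quantify the bullwhip effect as defined in the Background is the ratio of order variance to demand variance, where a value above 1 suggests that variability is amplified upstream. The sketch below assumes that measure; the abstract does not state which measure the study used, and the function name and series are made up for illustration.

```python
from statistics import pvariance


def bullwhip_ratio(orders, demand):
    """Variance of orders divided by variance of demand.

    A ratio above 1 suggests that order variability exceeds demand
    variability. Raises ZeroDivisionError if demand is constant.
    """
    return pvariance(orders) / pvariance(demand)


# Made-up toy series (NOT from the study): orders swing five times
# as far around the mean as the demand that triggers them.
demand = [100.0, 102.0, 98.0, 101.0, 99.0]
orders = [100.0, 110.0, 90.0, 105.0, 95.0]
ratio = bullwhip_ratio(orders, demand)
print(ratio)
```

In a classification setting like the one described in Methods, such a ratio could be thresholded to label each firm or sector as showing the effect or not, with the operational and financial variables as inputs.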

An Approach to Generate Software Agents for Health Data Mining