An In-Depth Analysis of The Data Mining Algorithms and Their Effective Applicability in Health and Biomedical Informatics

Author(s):  
Deeya Tangri

The health care industry is one of the fastest-growing industries today. Health care has been researched widely, producing large volumes of medical data that are not easy to mine. Data mining is an approach that helps discover essential information in massive data collections. Medical science therefore needs tools that help analyze the data, extract significant results from massive data sets, and make efficient use of the information. Generally, three kinds of records are mandatory for every patient: patient details, diagnosis, and medications. Converting these data into basic patterns for predicting a patient's disease helps in early diagnosis. This research mainly focuses on data mining approaches that are widely considered in the medical field.

2002 ◽  
Vol 124 (4) ◽  
pp. 923-926 ◽  
Author(s):  
Andrew Kusiak

Data mining offers methodologies and tools for data analysis, discovery of new knowledge, and autonomous process control. This paper introduces basic data mining algorithms. An approach based on rough set theory is used to derive associations among control parameters and the product quality in the form of decision rules. The model presented in the paper produces control signatures leading to good quality products of a metal forming process. The computational results reported in the paper indicate that data mining opens a new avenue for decision-making in material forming industry.
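The rough set idea mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the model itself, so the attribute names (`temp`, `speed`, `quality`) and the toy records below are hypothetical. The sketch groups objects that are indiscernible on the condition attributes and computes the lower and upper approximations of a decision class; objects in the lower approximation are the ones that support certain decision rules.

```python
from collections import defaultdict

def approximations(objects, condition_attrs, decision_attr, decision_value):
    """Return (lower, upper) approximations of a decision class as sets of object indices."""
    # Indiscernibility blocks: objects with identical condition-attribute values.
    blocks = defaultdict(set)
    for i, obj in enumerate(objects):
        blocks[tuple(obj[a] for a in condition_attrs)].add(i)

    target = {i for i, obj in enumerate(objects) if obj[decision_attr] == decision_value}
    lower, upper = set(), set()
    for members in blocks.values():
        if members <= target:   # block lies entirely inside the class: certain
            lower |= members
        if members & target:    # block overlaps the class: possible
            upper |= members
    return lower, upper

# Hypothetical process records (not taken from the paper).
records = [
    {"temp": "high", "speed": "low",  "quality": "good"},
    {"temp": "high", "speed": "low",  "quality": "good"},
    {"temp": "low",  "speed": "low",  "quality": "bad"},
    {"temp": "low",  "speed": "high", "quality": "good"},
    {"temp": "low",  "speed": "high", "quality": "bad"},
]
lower, upper = approximations(records, ["temp", "speed"], "quality", "good")
print(sorted(lower), sorted(upper))
```

In this toy data, the setting temp=high, speed=low always yields good quality, so it supports a certain rule, while temp=low, speed=high is ambiguous and appears only in the upper approximation.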


2012 ◽  
Vol 134 (2) ◽  
Author(s):  
Anoop Verma ◽  
Andrew Kusiak

Components of wind turbines are subjected to asymmetric loads caused by variable wind conditions. Carbon brushes are critical components of the wind turbine generator. Adequately maintaining and detecting abnormalities in the carbon brushes early is essential for proper turbine performance. In this paper, data-mining algorithms are applied for early prediction of carbon brush faults. Predicting generator brush faults early enables timely maintenance or replacement of brushes. The results discussed in this paper are based on analyzing generator brush faults that occurred on 27 wind turbines. The datasets used to analyze faults were collected from the supervisory control and data acquisition (SCADA) systems installed at the wind turbines. Twenty-four data-mining models are constructed to predict faults up to 12 h before the actual fault occurs. To increase the prediction accuracy of the models discussed, a data balancing approach is used. Four data-mining algorithms were studied to evaluate the quality of the models for predicting generator brush faults. Among the selected data-mining algorithms, the boosting tree algorithm provided the best prediction results. Research limitations attributed to the available datasets are discussed.
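The abstract does not say which data balancing approach was used. As one common possibility, the sketch below shows random oversampling, which duplicates minority-class rows until every class is as large as the largest one. The function name and the toy readings are assumptions made for illustration only.

```python
import random
from collections import Counter

def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels

# Hypothetical toy data: faults are rare compared with normal operation.
rows = [[0.1], [0.2], [0.3], [0.2], [0.1], [0.3], [0.9], [0.8]]
labels = ["normal"] * 6 + ["fault"] * 2
balanced_rows, balanced_labels = oversample_minority(rows, labels)
print(Counter(balanced_labels))
```

Balancing should be applied to the training split only, so that duplicated rows do not leak into the evaluation data.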


Author(s):  
Seyed Mohammad Ayyoubzadeh ◽  
Seyed Mehdi Ayyoubzadeh ◽  
Hoda Zahedi ◽  
Mahnaz Ahmadi ◽  
Sharareh R Niakan Kalhori

BACKGROUND The recent global outbreak of coronavirus disease (COVID-19) is affecting many countries worldwide. Iran is one of the top 10 most affected countries. Search engines provide useful data from populations, and these data might be useful to analyze epidemics. Utilizing data mining methods on electronic resources’ data might provide a better insight into the COVID-19 outbreak to manage the health crisis in each country and worldwide. OBJECTIVE This study aimed to predict the incidence of COVID-19 in Iran. METHODS Data were obtained from the Google Trends website. Linear regression and long short-term memory (LSTM) models were used to estimate the number of positive COVID-19 cases. All models were evaluated using 10-fold cross-validation, and root mean square error (RMSE) was used as the performance metric. RESULTS The linear regression model predicted the incidence with an RMSE of 7.562 (SD 6.492). The most effective factors besides previous day incidence included the search frequency of handwashing, hand sanitizer, and antiseptic topics. The RMSE of the LSTM model was 27.187 (SD 20.705). CONCLUSIONS Data mining algorithms can be employed to predict trends of outbreaks. This prediction might support policymakers and health care managers to plan and allocate health care resources accordingly.
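The evaluation procedure described in METHODS, 10-fold cross-validation scored by RMSE, can be sketched as follows. This is a simplified illustration with a single predictor and synthetic data, not the study's model; the interleaved fold assignment and all names are assumptions.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def kfold_rmse(xs, ys, k=10):
    """Mean and sample standard deviation of RMSE over k interleaved folds."""
    n = len(xs)
    scores = []
    for fold in range(k):
        test_idx = list(range(fold, n, k))
        held_out = set(test_idx)
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        slope, intercept = fit_line(train_x, train_y)
        preds = [slope * xs[i] + intercept for i in test_idx]
        scores.append(rmse([ys[i] for i in test_idx], preds))
    mean = sum(scores) / k
    sd = math.sqrt(sum((s - mean) ** 2 for s in scores) / (k - 1))
    return mean, sd

# Synthetic data: a linear trend with small alternating noise.
xs = list(range(30))
ys = [2 * x + 1 + (0.5 if x % 2 else -0.5) for x in xs]
mean_rmse, sd_rmse = kfold_rmse(xs, ys, k=10)
print(f"RMSE {mean_rmse:.3f} (SD {sd_rmse:.3f})")
```

For time-ordered data such as daily case counts, shuffled or interleaved folds let the model see future observations during training, so a forward-chaining split may be worth considering.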


2021 ◽  
Vol 23 (2) ◽  
pp. 242-248
Author(s):  
BABY AKULA ◽  
R.S.PARMAR ◽  
M. P. RAJ ◽  
K. INDUDHAR REDDY

To explore the possibility of crop estimation, a data mining approach, which is multidisciplinary, was followed. The district of Ranga Reddy, Telangana State, India, was chosen for the study, using its year-wise average rice yield data and daily weather data over a period of 31 years, i.e. from 1988-2019 (30th to 47th Standard Meteorological Weeks). The data mining tool WEKA (V3.8.1) was used. The Min-Max Normalization technique, followed by the feature selection algorithm ‘cfsSubsetEval’, was also adopted to improve the quality and accuracy of the data mining algorithms. After cleaning and sorting the data, five classifiers, viz. Logistic, MLP (Multi Layer Perceptron), J48, LMT (Logistic Model Trees) and PART, were employed over the trained data. The results indicated that the function-based and tree-based models performed better than the rule-based model. Of the two function-based models examined, Logistic and MLP, the latter performed better. Of the two tree-based models, LMT performed better than J48. The MLP classifier was thus found to be the best-fit model for predicting rice yields, recording an accuracy of 74.19%, a sensitivity of 0.742 and a precision of 0.743. The MLP also achieved the highest F1 score (0.742) and MCC (0.581).
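Two ingredients of this abstract, min-max normalization and the reported evaluation metrics, can be written out in a short sketch. It assumes a binary confusion matrix for simplicity; the study's figures may be averaged across several yield classes, and the counts used below are made up for illustration.

```python
import math

def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]

def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, sensitivity (recall), F1 and MCC from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom
    return {"accuracy": accuracy, "precision": precision,
            "sensitivity": sensitivity, "f1": f1, "mcc": mcc}

print(min_max_normalize([10, 20, 30]))
# Hypothetical confusion counts, not the paper's results.
print(binary_metrics(tp=8, fp=2, fn=2, tn=8))
```

MCC uses all four cells of the confusion matrix, which is why it can be noticeably lower than accuracy or F1 on the same predictions.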


2014 ◽  
Vol 667 ◽  
pp. 218-225 ◽  
Author(s):  
Yan Wang ◽  
Kun Yang ◽  
Xiang Jing ◽  
Huang Long Jin

The KDD Cup 99 dataset is not only the most widely used dataset in intrusion detection, but also the de facto benchmark for evaluating the performance of intrusion detection systems. Nevertheless, the dataset has many issues that cannot be ignored. To build good data mining models for intrusion detection and to find appropriate features for each type of network intrusion attack, researchers need a thorough understanding of this dataset. In this paper, we first make an in-depth analysis of the problems that exist in the dataset and give related solutions. Second, we carry out extensive data preprocessing on the 10% subset of the KDD Cup 99 training set, which gives better results in the subsequent processing. Finally, by comparing 10 common data mining algorithms in our experiment, we conclude that data preprocessing plays a vital role in the performance of data mining algorithms.


Author(s):  
Syed Zahid Hassan ◽  
Brijesh Verma

This chapter focuses on hybrid data mining algorithms and their use in medical applications. It reviews existing data mining algorithms and presents a novel hybrid data mining approach, which takes advantage of intelligent and statistical modeling of data mining algorithms to extract meaningful patterns from medical data repositories. Various hybrid combinations of data mining algorithms are formulated and tested on a benchmark medical database. The chapter includes the experimental results with existing and new hybrid approaches to demonstrate the superiority of hybrid data mining algorithms over standard algorithms.


Entropy ◽  
2019 ◽  
Vol 21 (4) ◽  
pp. 426 ◽  
Author(s):  
Bartosz Kowalik ◽  
Marcin Szpyrka

Modern cars are equipped with many electronic devices called Electronic Control Units (ECUs). ECUs collect diagnostic data from a car’s components, such as the engine and brakes. These data are then processed, and the appropriate information is communicated to the driver. From the point of view of the safety of the driver and the passengers, information about car faults is vital. Despite the development of on-board computers, only a small amount of information is passed on to the driver. With the data mining approach, it is possible to obtain much more information from the data than is provided by standard car equipment. This paper describes the environment built by the authors for data collection from ECUs. The collected data have been processed using parameterized entropies and data mining algorithms. Finally, we built a classifier able to detect a malfunctioning thermostat even if the car equipment does not indicate it.
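The abstract does not name the parameterized entropies it uses. The Rényi entropy is one well-known parameterized family and is assumed here purely for illustration; the discretized sensor readings are hypothetical. The parameter `alpha` controls how strongly frequent symbols dominate the result, and `alpha = 1` is treated as the Shannon limit.

```python
import math
from collections import Counter

def shannon_entropy(symbols):
    """Shannon entropy in bits of the empirical symbol distribution."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())

def renyi_entropy(symbols, alpha):
    """Renyi entropy of order alpha (alpha > 0) in bits."""
    if alpha == 1:
        return shannon_entropy(symbols)  # limiting case of the Renyi family
    n = len(symbols)
    power_sum = sum((c / n) ** alpha for c in Counter(symbols).values())
    return math.log2(power_sum) / (1 - alpha)

# Hypothetical discretized coolant-temperature readings.
steady = ["warm"] * 8
varied = ["cold", "cool", "warm", "hot"] * 2
for name, window in (("steady", steady), ("varied", varied)):
    print(name, round(renyi_entropy(window, 2), 3))
```

Entropy values computed over sliding windows of a signal could then serve as input features for a classifier.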


2016 ◽  
Vol 15 (6) ◽  
pp. 6806-6813 ◽  
Author(s):  
Sethunya R Joseph ◽  
Hlomani Hlomani ◽  
Keletso Letsholo

Research on data mining has yielded numerous tools, algorithms, methods and approaches for handling large amounts of data for various purposes and for problem solving. Data mining has become an integral part of many application domains such as data warehousing, predictive analytics, business intelligence, bioinformatics and decision support systems. The prime objective of data mining is to effectively handle large-scale data, extract actionable patterns, and gain insightful knowledge. Data mining is part and parcel of the knowledge discovery in databases (KDD) process. Success and improved decision making normally depend on how quickly one can discover insights from data. These insights can be used to drive better actions in operational processes and even to predict future behaviour. This paper presents an overview of various algorithms necessary for handling large data sets. These algorithms define various structures and methods implemented to handle big data. The review also discusses the general strengths and limitations of these algorithms. This paper can serve as a quick guide or an eye opener for data mining researchers on which algorithm(s) to select and apply in solving the problems they will be investigating.


2011 ◽  
Vol 133 (1) ◽  
Author(s):  
Andrew Kusiak ◽  
Anoop Verma

This paper presents the application of data-mining techniques for identification and prediction of status patterns in wind turbines. Early prediction of status patterns benefits turbine maintenance by indicating the deterioration of components. An association rule mining algorithm is used to identify frequent status patterns of turbine components and systems that are in turn predicted using historical wind turbine data. The status patterns are predicted at six time periods spaced at 10 min intervals. The prediction models are generated by five data-mining algorithms. The random forest algorithm has produced the best prediction results. The prediction results are used to develop a component performance monitoring scheme.
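The abstract mentions an association rule mining algorithm without detailing it. The two basic quantities such algorithms rely on, support and confidence, can be sketched as below. The status codes `S1` to `S3` and the transactions are hypothetical, and this is not the paper's algorithm.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated probability of the consequent given the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical status codes observed together in four time windows.
windows = [
    {"S1", "S2"},
    {"S1", "S2", "S3"},
    {"S1"},
    {"S2", "S3"},
]
print(support(windows, {"S1", "S2"}))
print(round(confidence(windows, {"S1"}, {"S2"}), 3))
```

A frequent-pattern miner keeps only the itemsets whose support exceeds a chosen threshold and then ranks candidate rules by confidence.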


2013 ◽  
Vol 664 ◽  
pp. 1066-1071
Author(s):  
Chen Wang ◽  
Shu Xiang Li

This paper first discusses the limitations and shortcomings of data mining under massive-data conditions. Drawing on the advantages of cloud computing, it designs a data mining architecture for cloud computing and, on this basis, discusses how data mining algorithms can be improved for cloud computing. As a theoretical exploration, the paper proposes useful suggestions for optimizing data mining in the context of cloud computing.

