Witten IH, Frank E: Data Mining: Practical Machine Learning Tools and Techniques 2nd edition

Drensky and Lakatos (Lecture Notes in Computer Science, 357 (Springer, Berlin, 1989), pp. 181–188) have established a convenient property of certain ideals in polynomial quotient rings, which can now be used to determine error-correcting capabilities of combined multiple classifiers following a standard approach explained in the well-known monograph by Witten and Frank (Data Mining: Practical Machine Learning Tools and Techniques (Elsevier, Amsterdam, 2005)). We strengthen and generalise the result of Drensky and Lakatos by demonstrating that the corresponding nice property remains valid in a much larger variety of constructions and applies to more general types of ideals. Examples show that our theorems do not extend to larger classes of ring constructions and cannot be simplified or generalised.
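As a rough illustration of what "error-correcting capabilities of combined multiple classifiers" can mean in practice, the sketch below treats each class as a binary codeword and each bit as the vote of one binary classifier. It is not the ring-theoretic construction of the abstract. The codewords, the labels `a` to `d`, and all function names are assumptions chosen for illustration. The only fact relied on is the standard coding-theory bound that a code with minimum Hamming distance d can correct up to ⌊(d − 1)/2⌋ errors.

```python
from itertools import combinations

# Hypothetical 7-bit codewords for four classes (illustrative only).
CODEWORDS = {
    "a": "1111111",
    "b": "0000111",
    "c": "0011001",
    "d": "0101010",
}


def hamming(u: str, v: str) -> int:
    """Number of positions in which two equal-length bit strings differ."""
    if len(u) != len(v):
        raise ValueError("codewords must have the same length")
    return sum(x != y for x, y in zip(u, v))


def min_distance(codewords) -> int:
    """Smallest pairwise Hamming distance among the given codewords."""
    return min(hamming(u, v) for u, v in combinations(codewords, 2))


def correctable_errors(codewords) -> int:
    """Number of wrong classifier votes the code is guaranteed to tolerate."""
    return (min_distance(codewords) - 1) // 2


def decode(votes: str, table: dict) -> str:
    """Return the class whose codeword is nearest to the observed votes."""
    return min(table, key=lambda label: hamming(votes, table[label]))


if __name__ == "__main__":
    words = list(CODEWORDS.values())
    print("minimum distance:", min_distance(words))
    print("correctable errors:", correctable_errors(words))
    # Class "a" with the second classifier voting wrongly.
    print("decoded class:", decode("1011111", CODEWORDS))
```

Under these assumed codewords, one wrong vote among the seven classifiers still decodes to the intended class, which is the sense in which a combined classifier can be said to correct errors.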


Data Mining: Practical Machine Learning Tools and Techniques

10.1016/c2009-0-19715-5 ◽

2011 ◽

Cited By ~ 16

Keyword(s):

Machine Learning ◽

Data Mining ◽

Learning Tools ◽

Tools And Techniques


Paradigms of Machine Learning and Data Analytics

Handling Priority Inversion in Time-Constrained Distributed Databases - Advances in Data Mining and Database Management ◽

10.4018/978-1-7998-2491-6.ch009 ◽

2020 ◽

pp. 156-174

Author(s):

Pawan Kumar Chaurasia

Keyword(s):

Machine Learning ◽

Heart Disease ◽

Deep Learning ◽

Critical Review ◽

Data Analytics ◽

Disease Patient ◽

Learning Tools ◽

Heart Disease Patient ◽

Tools And Techniques ◽

New Hypothesis

This chapter conducts a critical review of ML and deep learning tools and techniques in the field of heart disease, with respect to heart disease complexity, prediction, and diagnosis. Only specific papers were selected for the study in order to extract useful information, which stimulated a new hypothesis to guide further investigation of heart disease patients.


Investigating the Physics of Tokamak Global Stability with Interpretable Machine Learning Tools

Applied Sciences ◽

10.3390/app10196683 ◽

2020 ◽

Vol 10 (19) ◽

pp. 6683

Author(s):

Andrea Murari ◽

Emmanuele Peluso ◽

Michele Lungaroni ◽

Riccardo Rossi ◽

Michela Gelfusa ◽

...

Keyword(s):

Machine Learning ◽

Data Mining ◽

Independent Learning ◽

Support Vector ◽

Learning Tools ◽

Feedback Systems ◽

Theoretical Understanding ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

Mining Tools

The inadequacies of basic physics models for disruption prediction have induced the community to increasingly rely on data mining tools. In the last decade, it has been shown how machine learning predictors can achieve a much better performance than those obtained with manually identified thresholds or empirical descriptions of the plasma stability limits. The main criticisms of these techniques focus therefore on two different but interrelated issues: poor “physics fidelity” and limited interpretability. Insufficient “physics fidelity” refers to the fact that the mathematical models of most data mining tools do not reflect the physics of the underlying phenomena. Moreover, they implement a black box approach to learning, which results in very poor interpretability of their outputs. To overcome or at least mitigate these limitations, a general methodology has been devised and tested, with the objective of combining the predictive capability of machine learning tools with the expression of the operational boundary in terms of traditional equations more suited to understanding the underlying physics. The proposed approach relies on the application of machine learning classifiers (such as Support Vector Machines or Classification Trees) and Symbolic Regression via Genetic Programming directly to experimental databases. The results are very encouraging. The obtained equations of the boundary between the safe and disruptive regions of the operational space present almost the same performance as the machine learning classifiers, based on completely independent learning techniques. Moreover, these models possess significantly better predictive power than traditional representations, such as the Hugill or the beta limit. More importantly, they are realistic and intuitive mathematical formulas, which are well suited to supporting theoretical understanding and to benchmarking empirical models. They can also be deployed easily and efficiently in real-time feedback systems.


Predictive analytics in health care using machine learning tools and techniques

2017 International Conference on Intelligent Computing and Control Systems (ICICCS) ◽

10.1109/iccons.2017.8250771 ◽

2017 ◽

Cited By ~ 21

Author(s):

B. Nithya ◽

V. Ilango

Keyword(s):

Machine Learning ◽

Health Care ◽

Predictive Analytics ◽

Learning Tools ◽

Tools And Techniques


Latest Tools for Data Mining and Machine Learning

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.i1003.0789s19 ◽

2019 ◽

Vol 8 (9S) ◽

pp. 18-23 ◽

Cited By ~ 2

Keyword(s):

Machine Learning ◽

Data Mining ◽

Decision Making ◽

Feature Selection ◽

Open Source ◽

Predictive Analysis ◽

Learning Tools ◽

Pros And Cons ◽

Selection For ◽

Extract Information

Nowadays, Data Mining is used everywhere to extract information from data and, in turn, to acquire knowledge for decision making. Data Mining analyzes patterns that are used to extract information and knowledge for making decisions. Many open source and licensed tools, such as Weka, RapidMiner, KNIME, and Orange, are available for Data Mining and predictive analysis. This paper discusses the different tools available for Data Mining and Machine Learning, followed by a description and the pros and cons of these tools. The article provides details of the algorithms, such as classification, regression, characterization, discretization, clustering, visualization and feature selection, for Data Mining and Machine Learning tools. It will help people make efficient decisions and suggests which tool is suitable for their requirements.


Data mining tools - a case study for network intrusion detection

Multimedia Tools and Applications ◽

10.1007/s11042-020-09916-0 ◽

2020 ◽

Author(s):

Soodeh Hosseini ◽

Saman Rafiee Sardo

Keyword(s):

Machine Learning ◽

Data Mining ◽

Intrusion Detection ◽

Network Intrusion Detection ◽

Learning Approaches ◽

Learning Tools ◽

Network Intrusion ◽

The Face ◽

Level Of Knowledge ◽

Mining Tools

With the growth of data mining and machine learning approaches in recent years, many efforts have been made to generalize these sciences so that researchers from any field can easily use them. One of the most important of these efforts is the development of data mining tools that try to hide the complexities from researchers so that they can achieve a professional output with any level of knowledge. This paper focuses on reviewing and comparing data mining and machine learning tools, including WEKA, KNIME, Keel, Orange, Azure, IBM SPSS Modeler, R and Scikit-Learn, to show what approach each of these tools has taken in the face of the complexities and problems of different scenarios of generalization of data mining and machine learning. In addition, for a more detailed review, this paper examines the challenge of network intrusion detection in two tools: KNIME, with its graphical interface, and Scikit-Learn, with its coding environment.


Machine Learning and data mining tools applied for databases of low number of records

Advanced Engineering Research ◽

10.23947/2687-1653-2021-21-4-346-363 ◽

2022 ◽

Vol 21 (4) ◽

pp. 346-363

Author(s):

Hubert Anysz

Keyword(s):

Machine Learning ◽

Data Mining ◽

Computational Methods ◽

Large Datasets ◽

Learning Tools ◽

Data Preparation ◽

Preparation Methods ◽

Use Of Data ◽

Small Set ◽

Mining Tools

The use of data mining and machine learning tools is becoming increasingly common. Their usefulness is most apparent with large datasets, where the information sought or new relationships must be extracted from information noise. As these tools develop, datasets with far fewer records, usually associated with specific phenomena, are also being explored. This specificity most often makes it impossible to increase the number of cases, which would otherwise facilitate the search for dependences in the phenomena under study. The paper discusses the features of applying the selected tools to a small set of data. It attempts to present methods of data preparation and methods for calculating the performance of tools, taking into account the specifics of databases with a small number of records. The author proposes selected techniques that helped to break the deadlock in calculations, i.e., obtaining results much worse than expected. The small amount of analysed data made it necessary to apply methods that improve the accuracy of forecasts and of classification. This paper is not a review of popular methods of machine learning and data mining; nevertheless, the collected and presented material will help the reader to shorten the path to obtaining satisfactory results when using the described computational methods.


Data Mining and Machine Learning Tools for Combinatorial Material Science of All-Oxide Photovoltaic Cells

Molecular Informatics ◽

10.1002/minf.201400174 ◽

2015 ◽

Vol 34 (6-7) ◽

pp. 367-379 ◽

Cited By ~ 27

Author(s):

Abraham Yosipof ◽

Oren E. Nahum ◽

Assaf Y. Anderson ◽

Hannah-Noa Barad ◽

Arie Zaban ◽

...

Keyword(s):

Machine Learning ◽

Data Mining ◽

Material Science ◽

Photovoltaic Cells ◽

Learning Tools ◽

Combinatorial Material Science


Machine Learning in Nutritional Follow-up Research

Open Computer Science ◽

10.1515/comp-2017-0008 ◽

2017 ◽

Vol 7 (1) ◽

pp. 41-45 ◽

Cited By ~ 5

Author(s):

Rita Reis ◽

Hugo Peixoto ◽

José Machado ◽

António Abelha

Keyword(s):

Machine Learning ◽

Data Mining ◽

Decision Making ◽

Daily Basis ◽

Decision Makers ◽

Learning Tools ◽

Healthcare Organizations ◽

Data Mining Techniques ◽

Take The Best

Healthcare is one of the world’s fastest growing industries, with large volumes of data collected on a daily basis. It is generally perceived as being ‘information rich’ yet ‘knowledge poor’. Hidden relationships and valuable knowledge can be discovered in the collected data by applying data mining techniques. These techniques are increasingly being implemented in healthcare organizations to respond to the needs of doctors in their daily decision-making activities. To help decision-makers take the best decision, it is fundamental to develop a solution able to predict events before they occur. The aim of this project was to predict whether a patient would need to be followed by a nutrition specialist, by combining a nutritional dataset with data mining classification techniques, using WEKA machine learning tools. The results proved very promising, with accuracy around 91%, specificity around 97% and precision about 95%.
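For readers unfamiliar with the three measures quoted above, the following sketch shows how accuracy, specificity and precision are conventionally derived from the four cells of a binary confusion matrix. The function name and the counts are hypothetical and are not taken from the study, which used WEKA; the abstract does not report its confusion matrix.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, specificity and precision from confusion-matrix counts.

    tp/fp: positive predictions that were right/wrong
    tn/fn: negative predictions that were right/wrong
    """
    total = tp + fp + tn + fn
    if total == 0 or (tn + fp) == 0 or (tp + fp) == 0:
        raise ValueError("counts leave at least one metric undefined")
    return {
        # Share of all cases that were classified correctly.
        "accuracy": (tp + tn) / total,
        # Share of actual negatives that were recognised as negative.
        "specificity": tn / (tn + fp),
        # Share of positive predictions that were actually positive.
        "precision": tp / (tp + fp),
    }


if __name__ == "__main__":
    # Made-up counts for illustration only.
    metrics = classification_metrics(tp=40, fp=2, tn=50, fn=8)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

Specificity and precision both penalise false positives, which is why a classifier can score highly on both while still missing some true positives (the `fn` count affects only accuracy here).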
