Bayesian Data Mining and Knowledge Discovery

Data Mining ◽

10.4018/978-1-59140-051-6.ch011 ◽

2011 ◽

pp. 260-277

Author(s):

Eitel J.M. Lauria ◽

Giri Kumar Tayi

Keyword(s):

Data Mining ◽

Bayesian Methods ◽

Practical Method ◽

Bayesian Belief Networks ◽

Bayes Theorem ◽

Theoretical Concept ◽

Machine Learning Techniques ◽

Probability Models ◽

Simulation Techniques ◽

Basic Probability

One of the major problems faced by data-mining technologies is how to deal with uncertainty. The prime characteristic of Bayesian methods is their explicit use of probability for quantifying uncertainty. Bayesian methods provide a practical method to make inferences from data using probability models for values we observe and about which we want to draw some hypotheses. Bayes’ Theorem provides the means of calculating the probability of a hypothesis (posterior probability) based on its prior probability, the probability of the observations, and the likelihood that the observational data fits the hypothesis. The purpose of this chapter is twofold: to provide an overview of the theoretical framework of Bayesian methods and its application to data mining, with special emphasis on statistical modeling and machine-learning techniques; and to illustrate each theoretical concept covered with practical examples. We will cover basic probability concepts, Bayes’ Theorem and its implications, Bayesian classification, Bayesian belief networks, and an introduction to simulation techniques.
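The posterior calculation described in this abstract can be illustrated with a minimal Python sketch over a discrete set of hypotheses. The function name `posterior` and all of the numbers (a 1% prior, a 95% true-positive rate, a 10% false-positive rate) are illustrative assumptions and are not taken from the chapter.

```python
def posterior(priors, likelihoods):
    """Return P(h | data) for each hypothesis h.

    priors:      dict mapping hypothesis -> P(h)
    likelihoods: dict mapping hypothesis -> P(data | h)
    """
    # Probability of the observations: P(data) = sum over h of P(data | h) * P(h)
    evidence = sum(likelihoods[h] * priors[h] for h in priors)
    if evidence == 0:
        raise ValueError("the observations have zero probability under every hypothesis")
    # Bayes' Theorem: posterior = likelihood * prior / evidence
    return {h: likelihoods[h] * priors[h] / evidence for h in priors}


# Hypothetical numbers, chosen only for illustration.
priors = {"condition": 0.01, "no_condition": 0.99}
likelihoods = {"condition": 0.95, "no_condition": 0.10}  # P(positive test | h)

post = posterior(priors, likelihoods)
print(post)
```

With these assumed inputs the posterior probability of the rarer hypothesis stays below 10% even after a positive observation, which shows how strongly the prior can weigh on the result.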


Bayesian Machine Learning

Encyclopedia of Information Science and Technology, First Edition ◽

10.4018/978-1-59140-553-5.ch043 ◽

2005 ◽

pp. 229-235

Author(s):

Eitel J.M. Lauria

Keyword(s):

Machine Learning ◽

Observational Data ◽

Bayesian Methods ◽

Posterior Probability ◽

Prior Probability ◽

Probabilistic Approach ◽

Bayes Theorem ◽

Bayesian Framework ◽

Probability Models ◽

Bayesian Machine Learning

Bayesian methods provide a probabilistic approach to machine learning. The Bayesian framework allows us to make inferences from data using probability models for values we observe and about which we want to draw some hypotheses. Bayes' theorem provides the means of calculating the probability of a hypothesis (posterior probability) based on its prior probability, the probability of the observations, and the likelihood that the observational data fit the hypothesis.
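In symbols, the relationship stated in this abstract can be written as follows. The notation is an assumption made here for illustration ($h$ for a hypothesis, $D$ for the observational data) and is not taken from the entry itself.

```latex
\underbrace{P(h \mid D)}_{\text{posterior}}
  = \frac{\overbrace{P(D \mid h)}^{\text{likelihood}} \;
          \overbrace{P(h)}^{\text{prior}}}
         {\underbrace{P(D)}_{\text{probability of the observations}}},
\qquad
P(D) = \sum_{h'} P(D \mid h')\, P(h')
```

The sum in the denominator runs over all competing hypotheses, so the posterior probabilities add up to one.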


Bayesian Applications in Auditory Research

Journal of Speech Language and Hearing Research ◽

10.1044/2018_jslhr-h-astm-18-0228 ◽

2019 ◽

Vol 62 (3) ◽

pp. 577-586 ◽

Cited By ~ 9

Author(s):

Garnett P. McMillan ◽

John B. Cannon

Keyword(s):

Monte Carlo ◽

Markov Chain ◽

Bayesian Methods ◽

Prior Distribution ◽

Interim Analysis ◽

Iterative Process ◽

Bayes Theorem ◽

Sound Therapy ◽

The Many ◽

Auditory Research

Purpose: This article presents a basic exploration of Bayesian inference to inform researchers unfamiliar with this type of analysis of the many advantages this readily available approach provides. Method: First, we demonstrate the development of Bayes' theorem, the cornerstone of Bayesian statistics, into an iterative process of updating priors. Working with a few assumptions, including normality and conjugacy of the prior distribution, we express how one would calculate the posterior distribution using the prior distribution and the likelihood of the parameter. Next, we move to an example in auditory research by considering the effect of sound therapy for reducing the perceived loudness of tinnitus. In this case, as in most real-world settings, we turn to Markov chain simulations because the assumptions allowing for easy calculations no longer hold. Using Markov chain Monte Carlo methods, we can illustrate several analysis solutions given by a straightforward Bayesian approach. Conclusion: Bayesian methods are widely applicable and can help scientists overcome analysis problems, including how to include existing information, run interim analyses, achieve consensus through measurement, and, most importantly, interpret results correctly. Supplemental Material: https://doi.org/10.23641/asha.7822592
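The iterative updating of priors under a normal, conjugate prior can be sketched in a few lines of Python. This is only an illustration of the easy case mentioned in the abstract (a normal mean with known noise variance), not the article's analysis, and it does not cover the Markov chain Monte Carlo step. The function name `update_normal`, the prior values, and the data are all hypothetical.

```python
def update_normal(prior_mean, prior_var, observations, noise_var):
    """Posterior mean and variance of a normal mean with known noise variance."""
    n = len(observations)
    if n == 0:
        return prior_mean, prior_var
    # Under the normal-normal conjugate model, precisions (inverse variances) add.
    post_precision = 1.0 / prior_var + n / noise_var
    post_var = 1.0 / post_precision
    # The posterior mean is a precision-weighted blend of the prior mean and the data.
    post_mean = post_var * (prior_mean / prior_var + sum(observations) / noise_var)
    return post_mean, post_var


# Hypothetical change scores and an assumed known noise variance.
data = [-4.0, -1.5, -3.0, -2.5]
noise_var = 4.0

# Batch update: use all observations at once.
batch = update_normal(0.0, 100.0, data, noise_var)

# Sequential update: yesterday's posterior becomes today's prior.
mean, var = 0.0, 100.0
for x in data:
    mean, var = update_normal(mean, var, [x], noise_var)
sequential = (mean, var)

print(batch, sequential)
```

Processing the observations one at a time gives the same posterior as processing them together (up to floating-point rounding), which is what makes interim analyses natural in this setting.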


Classification of Operational and Financial Variables Affecting the Bullwhip Effect in Indian Sectors: A Machine Learning Approach

Recent Patents on Computer Science ◽

10.2174/2213275911666181012121059 ◽

2019 ◽

Vol 12 (3) ◽

pp. 171-179 ◽

Cited By ~ 6

Author(s):

Sachin Gupta ◽

Anurag Saxena

Keyword(s):

Machine Learning ◽

Data Mining ◽

Supply Chain ◽

Supply Chain Management ◽

Product Life Cycle ◽

Consumer Preference ◽

Bullwhip Effect ◽

Machine Learning Techniques ◽

Chain Management ◽

Financial Variables

Background: The bullwhip effect is the amplification of variability in production or procurement relative to a smaller increase in variability in demand or sales. It is an encumbrance to supply chain optimization because it causes inadequacies in the supply chain. Operations and supply chain management consultants, managers, and researchers have studied the causes behind the dynamic nature of supply chain management and have listed shorter product life cycles, changes in technology, changes in consumer preference, and globalization, to name a few. Most of the literature exploring the bullwhip effect is based on simulations and mathematical models. Exploring the bullwhip effect using machine learning is the novel approach of the present study. Methods: The present study explores the operational and financial variables affecting the bullwhip effect on the basis of secondary data. Data mining and machine learning techniques are used to explore the variables affecting the bullwhip effect in Indian sectors. The RapidMiner tool was used for data mining, and 10-fold cross-validation was performed. A Weka Alternating Decision Tree (w-ADT) was built to help decision makers mitigate the bullwhip effect after the classification. Results: Of the 19 variables considered, 7 were selected as having the highest accuracy with minimum deviation. Conclusion: Classification using machine learning provides an effective tool for exploring the bullwhip effect in supply chain management.
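For readers unfamiliar with 10-fold cross-validation, the splitting step can be sketched in plain Python. This is a generic illustration and not the RapidMiner or Weka procedure used in the study. The function name `k_fold_indices`, the sample count, and the seed are assumptions.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Striding produces k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        # Every fold except the i-th is used for training.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical dataset of 25 records.
splits = list(k_fold_indices(25, k=10))
for train, test in splits:
    print(len(train), len(test))
```

Each record is held out exactly once, so the accuracy averaged over the ten folds is computed only on data the model did not see during training.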


Probability

10.1093/oso/9780198785699.003.0008 ◽

2017 ◽

Author(s):

Andrew Gelman ◽

Deborah Nolan

Keyword(s):

Conditional Probability ◽

Bayes Rule ◽

Probability Models ◽

Classroom Activities ◽

Basic Probability

This chapter contains many classroom activities and demonstrations to help students understand basic probability calculations, including conditional probability and Bayes rule. Many of the activities alert students to misconceptions about randomness. They create dramatic settings where the instructor discerns real coin flips from fake ones, students modify dice and coins in order to load them, and students “accused” of lying based on the outcome of an inaccurate simulated lie detector face their classmates. Additionally, probability models of real outcomes offer good value: first we can do the probability calculations, and then we can go back and discuss the potential flaws of the model.


Probability logic: A model-theoretic perspective

Journal of Logic and Computation ◽

10.1093/logcom/exaa066 ◽

2020 ◽

Author(s):

M Pourmahdian ◽

R Zoghifard

Keyword(s):

Characterization Theorem ◽

Expressive Power ◽

Probability Logic ◽

Theoretic Analysis ◽

Compactness Property ◽

Probability Models ◽

Finitely Additive Probability ◽

Additive Probability ◽

Basic Probability ◽

Abstract Logics

Abstract: This paper provides some model-theoretic analysis for probability (modal) logic ($PL$). It is known that this logic does not enjoy the compactness property. However, by passing to a sublogic of $PL$, namely basic probability logic ($BPL$), it is shown that this sublogic satisfies the compactness property. Furthermore, by drawing special attention to some essential model-theoretic properties of $PL$, a version of the Lindström characterization theorem is investigated. In fact, it is verified that probability logic has the maximal expressive power among those abstract logics extending $PL$ and satisfying both the filtration and disjoint unions properties. Finally, by altering the semantics to the finitely additive probability models ($\mathcal{F}\mathcal{P}\mathcal{M}$) and introducing a positive sublogic of $PL$ that includes $BPL$, it is proved that this sublogic possesses the compactness property with respect to $\mathcal{F}\mathcal{P}\mathcal{M}$.


A data mining approach based on machine learning techniques to classify biological sequences

Knowledge-Based Systems ◽

10.1016/s0950-7051(01)00143-5 ◽

2002 ◽

Vol 15 (4) ◽

pp. 217-223 ◽

Cited By ~ 15

Author(s):

M. Maddouri ◽

M. Elloumi

Keyword(s):

Machine Learning ◽

Data Mining ◽

Machine Learning Techniques ◽

Biological Sequences ◽

Data Mining Approach ◽

Learning Techniques


Data Mining and Machine Learning Techniques for Bank Customers Segmentation: A Systematic Mapping Study

Advances in Intelligent Systems and Computing - Intelligent Systems and Applications ◽

10.1007/978-3-030-55187-2_48 ◽

2020 ◽

pp. 666-684

Author(s):

Maricel Monge ◽

Christian Quesada-López ◽

Alexandra Martínez ◽

Marcelo Jenkins

Keyword(s):

Machine Learning ◽

Data Mining ◽

Machine Learning Techniques ◽

Systematic Mapping Study ◽

Mapping Study ◽

Systematic Mapping ◽

Learning Techniques ◽

Customers Segmentation


Review On Application of Data Mining in Life Insurance

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i4.5.20035 ◽

2018 ◽

Vol 7 (4.5) ◽

pp. 159

Author(s):

Vaibhav A. Hiwase ◽

Dr. Avinash J Agrawa

Keyword(s):

Risk Factors ◽

Data Mining ◽

Life Insurance ◽

Insurance Industry ◽

Feature Space ◽

Machine Learning Techniques ◽

High Dimensional ◽

Insurance Company ◽

Financial Loss ◽

Life Insurance Company

The growth of life insurance has depended mainly on the risk of insured people. These risks are unevenly distributed among people and can be captured from different characteristics and lifestyles. This unknown distribution needs to be analyzed from historical data and used for underwriting and policy-making in the life insurance industry. Traditionally, risk is calculated from selected features known as risk factors, but today it has become important to identify these risk factors in a high-dimensional feature space. Clustering in a high-dimensional feature space is a challenging task, mainly because of the curse of dimensionality and noisy features. Hence, data mining and machine learning techniques should be tried in order to reveal interesting patterns and behaviour. This will help protect both the life insurance company and the insured person from financial loss. This paper focuses on analyzing hidden correlations among features and using them to calculate the risk of an individual customer.


Systematic Methods on Machine Learning Techniques for Clinical Predictive Modelling

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.e2138.039520 ◽

2020 ◽

Vol 9 (5) ◽

pp. 288-297

Keyword(s):

Machine Learning ◽

Data Mining ◽

High Performance ◽

Predictive Modelling ◽

Medical Application ◽

Medical Data ◽

Machine Learning Techniques ◽

Screening Process ◽

Huge Data ◽

System Data

Predictive modelling is a mathematical technique that uses statistics for prediction. Due to the rapid growth of data in cloud systems, data mining plays a significant role. Here, data mining refers to extracting knowledge from huge data sources, and it is attracting increasing attention in medical applications, specifically for analysing and extracting knowledge from both known and unknown patterns for effective medical diagnosis, treatment, management, prognosis, monitoring, and screening. However, historical medical data may be noisy, missing, inconsistent, imbalanced, and high dimensional. These data problems lead to severe bias in predictive modelling and degrade the performance of data mining approaches. Various pre-processing and machine learning methods and models, such as supervised learning, unsupervised learning, and reinforcement learning, have been proposed in recent literature. Hence, the present research reviews and analyses the various models, algorithms, and machine learning techniques for clinical predictive modelling, with the aim of obtaining high-performance results from the numerous medical data relating to patients with multiple diseases.


Predicting Student Failure in University Examination using Machine Learning Algorithms

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.e2643.039520 ◽

2020 ◽

Vol 9 (5) ◽

pp. 956-959

Keyword(s):

Machine Learning ◽

Data Mining ◽

Performance Management ◽

Student Performance ◽

Learning Algorithms ◽

Educational Data Mining ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Social Characteristics ◽

Student Failure

Student performance management is one of the key pillars of higher education institutions, since it directly impacts students' career prospects and college rankings. This paper follows the path of learning analytics and educational data mining by applying machine learning techniques to student data to identify students who are more likely to fail university examinations, and thus to provide the interventions needed to improve student performance. The paper uses a data mining approach with 10-fold cross-validation to classify students based on predictors that are demographic and social characteristics of the students. It compares five popular machine learning algorithms (REPTree, JRip, Random Forest, Random Tree, and Naive Bayes) based on overall classifier accuracy as well as class-specific indicators, i.e., precision, recall, and F-measure. The results showed that the REPTree algorithm outperformed the other algorithms in classifying students who are more likely to fail the examinations.
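The class-specific indicators named in this abstract (precision, recall, and F-measure) can be computed as in the following minimal Python sketch. The function name `precision_recall_f1` and the labels are hypothetical and do not come from the paper's data. The F-measure here is assumed to be the balanced F1 score.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up labels for six students; "fail" is the class of interest.
y_true = ["fail", "pass", "fail", "pass", "fail", "pass"]
y_pred = ["fail", "fail", "pass", "pass", "fail", "pass"]

precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive="fail")
print(precision, recall, f1)
```

Reporting these per class matters when failing students are a minority, because overall accuracy can look high even when few at-risk students are identified.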
