Financial Statement Audit Utilising Naive Bayes Networks, Decision Trees, Linear Discriminant Analysis and Logistic Regression

Author(s):  
Aram Khalaf Nawaiseh ◽  
Maysam F. Abbod
2017 ◽  
Vol 6 (3) ◽  
pp. 57-60
Author(s):  
Денис Кривогуз ◽  
Denis Krivoguz

This paper considers modern approaches to regional landslide susceptibility assessment. It describes the most widely used techniques: logistic regression, indicator validity, linear discriminant analysis, and artificial neural networks. The advantages and disadvantages of these techniques are discussed, and the most suitable techniques for various analysis conditions are identified. The paper concludes that, when a large amount of input data is available for the studied region, logistic regression and the indicator validity method are the most suitable and achieve the most accurate results. When information is lacking, it is more expedient to use linear discriminant analysis and artificial neural networks, which minimize potential inaccuracies in the analysis.


2021 ◽  
Vol 6 (2) ◽  
pp. 96-104
Author(s):  
Yulia Resti ◽  
Endang Sri Kresnawati ◽  
Novi Rustiana Dewi ◽  
Des Alwine Zayanti ◽  
Ning Eliyati

Diabetes is a chronic disease that can cause serious illness. Women are four times more likely to develop heart problems caused by diabetes. Women are also more prone to complications of diabetes, such as kidney problems, depression, and decreased vision quality. Nearly 200 million women worldwide are affected by diabetes, and two out of five of them are of reproductive age. This paper aims to predict diabetes in women aged at least 21 years from eight diagnostic measurements using three statistical learning methods: Multinomial Naïve Bayes, Fisher Discriminant Analysis, and Logistic Regression. The models are validated by dividing the data into training and test sets using 5-fold cross-validation. The validation performance shows that the Gaussian Naïve Bayes is the best method for predicting diabetes diagnosis. This paper's contribution is that all performance measures of the Multinomial Naïve Bayes method have a value greater than 93%. These results are useful for predicting diabetes status from the same explanatory variables.
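The 5-fold cross-validation split mentioned in this abstract can be illustrated with a minimal sketch. The function name `k_fold_indices`, the shuffling step, and the seed are assumptions made for illustration; the abstract does not say how the folds were formed.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Hypothetical helper: indices are shuffled once, dealt into k folds,
    and each fold serves as the test set exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    for fold_number, (train, test) in enumerate(k_fold_indices(20), start=1):
        print(fold_number, len(train), len(test))
```

Each classifier would be fitted on the `train` indices and scored on the `test` indices, with the performance measures averaged over the five folds.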


2020 ◽  
Vol 43 (2) ◽  
pp. 233-249
Author(s):  
Adolphus Wagala ◽  
Graciela González-Farías ◽  
Rogelio Ramos ◽  
Oscar Dalmau

This study implements extensions of partial least squares generalized linear regression (PLSGLR) by combining it with logistic regression and linear discriminant analysis, yielding a partial least squares generalized linear regression-logistic regression model (PLSGLR-log) and a partial least squares generalized linear regression-linear discriminant analysis model (PLSGLRDA). The resulting classifiers are then compared with classical methodologies: k-nearest neighbours (KNN), linear discriminant analysis (LDA), partial least squares discriminant analysis (PLSDA), ridge partial least squares (RPLS), and support vector machines (SVM). Furthermore, a new methodology known as the kernel multilogit algorithm (KMA) is implemented and its performance compared with that of the other classifiers. The KMA emerged as the best classifier, with the lowest classification error rates, on both types of data considered: unpreprocessed and preprocessed.


2020 ◽  
Vol 8 (9) ◽  
pp. 358-367
Author(s):  
O. Akangoziri ◽  
C. N. Okoli

This study compared multiple logistic regression, linear discriminant analysis and quadratic discriminant analysis in estimating infant birth outcomes and their misclassification error rates, using factors of infant mortality in Anambra State, Nigeria. The birth outcomes of interest were Neonatal death, Still birth and Alive. Secondary data were obtained from the records department of General Hospital Onitsha for 2007-2016. The data comprise Status of infant birth, Mother's parity, Age of mother, Weight of baby, Mother's Education Status, Number of Bookings before gestation and Gestation Age. The analysis was performed using R. The multiple logistic regression showed that Mother's Education Status (MES) and Booking contributed significantly to the logistic model, while Parity, Sex, Age of Mother (AOM), Year, GA and Birth Weight (BW) were not significant for birth outcomes. The misclassification error rate for birth outcome under this approach was 0.5992 (59.92%). The prior probabilities of the groups for the linear and quadratic discriminant analyses were 0.228503, 0.40168 and 0.36981 for Alive, Neonatal death and Still birth respectively. Mother's Education Status and Bookings Status had the greatest impact on the first and second linear discriminant functions respectively. The misclassification error rate using linear discriminant analysis was 0.5931 (59.31%), and that based on quadratic discriminant analysis was 0.5956 (59.56%). Based on these findings, linear discriminant analysis is the best alternative for estimating the misclassification error rate of infant birth outcome, followed by quadratic discriminant analysis, with multiple logistic regression performing worst.
The findings confirmed that linear discriminant analysis performs best, with a misclassification error rate of 59.31%.
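The two quantities reported in this abstract, group prior probabilities and the misclassification error rate, can be computed as in the sketch below. The function names and the toy labels are hypothetical and are not taken from the study's data; the sketch assumes priors are estimated as observed class proportions.

```python
from collections import Counter


def prior_probabilities(labels):
    """Estimate each group's prior as its share of the observed labels."""
    counts = Counter(labels)
    total = len(labels)
    return {group: count / total for group, count in counts.items()}


def misclassification_rate(y_true, y_pred):
    """Return the fraction of cases whose predicted group differs from the true group."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return wrong / len(y_true)


if __name__ == "__main__":
    # Toy labels for illustration only (not the hospital records).
    observed = ["Alive", "Still birth", "Neonatal death", "Alive", "Still birth"]
    predicted = ["Alive", "Alive", "Neonatal death", "Still birth", "Still birth"]
    print(prior_probabilities(observed))
    print(misclassification_rate(observed, predicted))
```

With three outcome groups, a rule that ignores the predictors and always predicts the most frequent group has an error rate of one minus the largest prior, which gives a baseline for judging reported error rates.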


2004 ◽  
Vol 1 (1) ◽  
pp. 143-161
Author(s):  
Maja Pohar ◽  
Mateja Blas ◽  
Sandra Turk

Two of the most widely used statistical methods for analyzing categorical outcome variables are linear discriminant analysis and logistic regression. While both are appropriate for the development of linear classification models, linear discriminant analysis makes more assumptions about the underlying data. Hence, it is assumed that logistic regression is the more flexible and more robust method in case of violations of these assumptions. In this paper we consider the problem of choosing between the two methods, and set some guidelines for proper choice. The comparison between the methods is based on several measures of predictive accuracy. The performance of the methods is studied by simulations. We start with an example where all the assumptions of the linear discriminant analysis are satisfied and observe the impact of changes regarding the sample size, covariance matrix, Mahalanobis distance and direction of distance between group means. Next, we compare the robustness of the methods towards categorisation and non-normality of explanatory variables in a closely controlled way. We show that the results of LDA and LR are close whenever the normality assumptions are not too badly violated, and set some guidelines for recognizing these situations. We discuss the inappropriateness of LDA in all other cases.
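The simulation setting in which all LDA assumptions hold can be sketched in one dimension. The sketch assumes two normal groups with equal priors and unit variance, so the Mahalanobis distance between the group means is simply `delta`; in that case the best achievable error rate is Φ(-delta/2). All names and parameter values are illustrative and are not taken from the paper.

```python
import random
from statistics import NormalDist, mean


def simulate_lda_error(delta, n_train=2000, n_test=20000, seed=0):
    """Estimate the test error of a one-dimensional LDA rule by simulation.

    Assumed setup: group 0 ~ N(0, 1), group 1 ~ N(delta, 1), delta > 0,
    equal priors. With a shared variance and equal priors, the LDA rule
    reduces to thresholding at the midpoint of the estimated group means.
    """
    rng = random.Random(seed)

    def draw(n):
        xs, ys = [], []
        for _ in range(n):
            y = rng.randint(0, 1)
            xs.append(rng.gauss(delta * y, 1.0))
            ys.append(y)
        return xs, ys

    x_train, y_train = draw(n_train)
    mean0 = mean(x for x, y in zip(x_train, y_train) if y == 0)
    mean1 = mean(x for x, y in zip(x_train, y_train) if y == 1)
    threshold = (mean0 + mean1) / 2

    x_test, y_test = draw(n_test)
    errors = sum((x > threshold) != (y == 1) for x, y in zip(x_test, y_test))
    return errors / n_test


def theoretical_error(delta):
    """Optimal error rate for equal priors and shared unit variance."""
    return NormalDist().cdf(-delta / 2)


if __name__ == "__main__":
    for delta in (0.5, 1.0, 2.0, 3.0):
        print(delta, simulate_lda_error(delta), theoretical_error(delta))
```

Increasing `delta` separates the groups and lowers both error rates, which is one way to see the influence of the Mahalanobis distance that the abstract mentions.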

