Improving the Interpretability of Classification Rules Discovered by an Ant Colony Algorithm: Extended Results

2016 ◽  
Vol 24 (3) ◽  
pp. 385-409 ◽  
Author(s):  
Fernando E. B. Otero ◽  
Alex A. Freitas

Most ant colony optimization (ACO) algorithms for inducing classification rules use an ACO-based procedure to create rules one at a time. An improved search strategy has been proposed in the cAnt-Miner[Formula: see text] algorithm, where an ACO-based procedure is used to create a complete list of rules (ordered rules), i.e., the ACO search is guided by the quality of a list of rules instead of an individual rule. In this paper we propose an extension of the cAnt-Miner[Formula: see text] algorithm to discover a set of rules (unordered rules). The main motivations for this work are to improve the interpretation of individual rules by discovering a set of rules and to evaluate the impact on the predictive accuracy of the algorithm. We also propose a new measure to evaluate the interpretability of the discovered rules, to mitigate the fact that the commonly used model size measure ignores how the rules are used to make a class prediction. Comparisons with state-of-the-art rule induction algorithms, support vector machines, and the cAnt-Miner[Formula: see text] producing ordered rules are also presented.
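The difference between an ordered rule list and an unordered rule set can be made concrete with a small sketch. The rules, attribute names, and the simple majority vote below are invented for illustration; they are assumptions and not the conflict resolution strategy of cAnt-Miner. In the ordered case, each rule implicitly carries the negation of all rules before it, which is why an individual rule cannot be read in isolation.

```python
from collections import Counter

# Each rule is (conditions, predicted_class); conditions maps attribute -> required value.
# These rules are hypothetical and exist only to show the two prediction schemes.
RULES = [
    ({"outlook": "sunny"}, "yes"),
    ({"humidity": "high"}, "no"),
    ({"windy": "true"}, "no"),
]
DEFAULT_CLASS = "yes"


def covers(conditions, example):
    """Return True when every condition of the rule holds for the example."""
    return all(example.get(attr) == value for attr, value in conditions.items())


def predict_ordered(rules, example, default=DEFAULT_CLASS):
    """Decision list: the first rule that covers the example decides."""
    for conditions, label in rules:
        if covers(conditions, example):
            return label
    return default


def predict_unordered(rules, example, default=DEFAULT_CLASS):
    """Rule set: every covering rule casts one vote (an assumed, simplified scheme)."""
    votes = Counter(label for conditions, label in rules if covers(conditions, example))
    if not votes:
        return default
    return votes.most_common(1)[0][0]


example = {"outlook": "sunny", "humidity": "high", "windy": "true"}
ordered_prediction = predict_ordered(RULES, example)      # decided by rule 1 alone
unordered_prediction = predict_unordered(RULES, example)  # decided by all three rules
```

For this example the two schemes disagree: the list stops at the first rule, while the set lets the two "no" rules outvote it.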


2021 ◽  
Author(s):  
Hanna Klimczak ◽  
Wojciech Kotłowski ◽  
Dagmara Oszkiewicz ◽  
Francesca DeMeo ◽  
Agnieszka Kryszczyńska ◽  
...  

The aim of the project is to classify asteroids according to the most commonly used asteroid taxonomy (Bus-DeMeo et al. 2009) using various machine learning methods: Logistic Regression, Naive Bayes, Support Vector Machines, Gradient Boosting, and Multilayer Perceptrons. Different parameter sets are used for classification in order to compare the quality of prediction with a limited amount of data, namely the difference in performance between using the 0.45 μm to 2.45 μm spectral range and using multiple spectral features, as well as applying Principal Component Analysis to reduce the dimensionality of the spectral data.

This work has been supported by grant No. 2017/25/B/ST9/00740 from the National Science Centre, Poland.



2012 ◽  
Vol 433-440 ◽  
pp. 3577-3583
Author(s):  
Yan Zhang ◽  
Hao Wang ◽  
Yong Hua Zhang ◽  
Yun Chen ◽  
Xu Li

To overcome the slow convergence of the classical ant colony algorithm and its tendency to become trapped in local optima, the authors propose the Parallel Ant Colony Optimization Algorithm Based on Multiplicate Pheromon Declining for solving the Traveling Salesman Problem. The algorithm draws on the multi-group behavior of natural ant colonies and the pheromone updating features of the ant colony algorithm, and is parallelized with OpenMP. It combines three different pheromone updating methods into a new declining pheromone updating method, which reduces the influence that pheromone on non-optimal paths in one tour has on subsequent ants and so improves the quality of their tours. It makes full use of the computing power of multi-core CPUs and improves efficiency significantly. Experiments comparing the new algorithm with ACO show that it converges faster and has a better global optimization ability than ACO.
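For orientation, the pheromone update that such variants modify can be sketched as follows. This is the classical Ant System rule (evaporation followed by a deposit inversely proportional to tour length), not the combined declining update proposed in the paper, whose details the abstract does not give. It is sequential rather than OpenMP-parallel, and the parameter names `rho` and `q` are assumptions.

```python
def tour_length(tour, dist):
    """Length of a closed tour that returns to its starting city."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))


def update_pheromone(pheromone, tours, dist, rho=0.5, q=1.0):
    """Evaporate all trails, then let each ant deposit q / tour_length on its edges."""
    n = len(pheromone)
    # Evaporation: every trail decays by the factor (1 - rho).
    updated = [[(1.0 - rho) * pheromone[i][j] for j in range(n)] for i in range(n)]
    for tour in tours:
        deposit = q / tour_length(tour, dist)
        for i in range(len(tour)):
            a, b = tour[i], tour[(i + 1) % len(tour)]
            # Symmetric TSP: reinforce both directions of the edge.
            updated[a][b] += deposit
            updated[b][a] += deposit
    return updated


# Hypothetical 4-city instance with one ant that walked the tour 0 -> 1 -> 2 -> 3 -> 0.
dist = [[0, 1, 2, 1], [1, 0, 1, 2], [2, 1, 0, 1], [1, 2, 1, 0]]
pheromone = [[1.0] * 4 for _ in range(4)]
new_pheromone = update_pheromone(pheromone, [[0, 1, 2, 3]], dist)
```

Edges on the tour end up with more pheromone than unused edges, which only evaporate; shorter tours deposit more, so later ants are biased toward them.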



2017 ◽  
Vol 10 (1) ◽  
pp. 43 ◽  
Author(s):  
Nursuci Putri Husain ◽  
Nursanti Novi Arisa ◽  
Putri Nur Rahayu ◽  
Agus Zainal Arifin ◽  
Darlis Herumurti

Many classification methods can be used to diagnose patients with hepatitis. One of them is Least Squares Support Vector Machines (LSSVM). Two parameters strongly influence the classification accuracy of LSSVM: the kernel parameter and the regularization parameter. Optimal values for these parameters must be determined to obtain high classification accuracy. This paper proposes an optimization method based on the Improved Ant Colony Algorithm (IACA) for determining the optimal parameters of LSSVM for diagnosing hepatitis. IACA creates a solution store that keeps the whole route of the ants; the stored solutions are the values of the LSSVM parameters. The study has three main stages. First, the dimensionality of the hepatitis dataset is reduced by Local Fisher Discriminant Analysis (LFDA). Second, the optimal LSSVM parameters are searched for with IACA using the training data. Finally, the test data are classified using the optimal LSSVM parameters. Experimental results demonstrate that the proposed method achieves high accuracy (93.7%) for the 80-20% training-testing partition.
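The general idea of using pheromone to steer a search over two classifier parameters can be illustrated with a minimal sketch. It is a generic ACO-style search over discrete candidate values, not the IACA of the paper. The names `gamma` (regularization parameter) and `sigma` (kernel parameter), the candidate grids, and `toy_objective`, which stands in for a validation accuracy, are all assumptions.

```python
import random


def aco_parameter_search(objective, gammas, sigmas, n_ants=10, n_iters=30, rho=0.2, seed=0):
    """Pheromone-guided search over two discrete parameter lists.

    objective(gamma, sigma) must return a positive score to maximise,
    for example a validation accuracy in (0, 1].
    """
    rng = random.Random(seed)
    tau_g = [1.0] * len(gammas)  # pheromone per candidate regularization value
    tau_s = [1.0] * len(sigmas)  # pheromone per candidate kernel value
    best = None                  # (score, gamma, sigma)
    for _ in range(n_iters):
        iteration_best = None
        for _ in range(n_ants):
            # Each ant picks one candidate per parameter, proportionally to pheromone.
            gi = rng.choices(range(len(gammas)), weights=tau_g)[0]
            si = rng.choices(range(len(sigmas)), weights=tau_s)[0]
            score = objective(gammas[gi], sigmas[si])
            if iteration_best is None or score > iteration_best[0]:
                iteration_best = (score, gi, si)
        # Evaporate, then reinforce the choices of the iteration-best ant.
        tau_g = [(1.0 - rho) * t for t in tau_g]
        tau_s = [(1.0 - rho) * t for t in tau_s]
        score, gi, si = iteration_best
        tau_g[gi] += score
        tau_s[si] += score
        if best is None or score > best[0]:
            best = (score, gammas[gi], sigmas[si])
    return best


def toy_objective(gamma, sigma):
    """Hypothetical stand-in for a validation accuracy; peaks at gamma=10, sigma=1."""
    return 1.0 / (1.0 + abs(gamma - 10.0) + abs(sigma - 1.0))


GAMMAS = [0.1, 1.0, 10.0, 100.0]
SIGMAS = [0.1, 1.0, 10.0]
best_score, best_gamma, best_sigma = aco_parameter_search(toy_objective, GAMMAS, SIGMAS)
```

In a real setting the objective would train the classifier on the training partition and score it on held-out data, so each ant evaluation is expensive and the ant and iteration counts need to be chosen accordingly.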



Author(s):  
X L Zhang ◽  
X F Chen ◽  
Z J He

Since support vector machines (SVM) exhibit good generalization performance in small-sample cases, they are widely applied in machinery fault diagnosis. However, a problem arises in setting optimal SVM parameters so as to obtain an optimal diagnosis result. This article presents a fault diagnosis method based on SVM with parameter optimization by an ant colony algorithm, which is applied to locomotive roller bearings to validate its feasibility and efficiency. The experiment finds that the proposed algorithm of ant colony optimization with SVM (ACO-SVM) yields a good fault diagnosis result, which confirms the advantage of the proposed ACO-SVM approach.



Data Mining ◽  
2011 ◽  
pp. 191-208 ◽  
Author(s):  
Rafael S. Parpinelli ◽  
Heitor S. Lopes ◽  
Alex A. Freitas

This work proposes an algorithm for rule discovery called Ant-Miner (Ant Colony-Based Data Miner). The goal of Ant-Miner is to extract classification rules from data. The algorithm is based on recent research on the behavior of real ant colonies as well as on some data mining concepts. We compare the performance of Ant-Miner with the performance of the well-known C4.5 algorithm on six public domain data sets. The results provide evidence that: (a) Ant-Miner is competitive with C4.5 with respect to predictive accuracy; and (b) the rule sets discovered by Ant-Miner are simpler (smaller) than the rule sets discovered by C4.5.



2010 ◽  
Vol 08 (06) ◽  
pp. 945-965 ◽  
Author(s):  
MIZANUR R. KHONDOKER ◽  
TILL T. BACHMANN ◽  
MURIEL MEWISSEN ◽  
PAUL DICKINSON ◽  
BARTOSZ DOBRZELECKI ◽  
...  

Machine learning and statistical-model-based classifiers have increasingly been used with more complex and high-dimensional biological data obtained from high-throughput technologies. Understanding the impact of various factors associated with large and complex microarray datasets on the predictive performance of classifiers is computationally intensive, under-investigated, yet vital in determining the optimal number of biomarkers for various classification purposes aimed towards improved detection, diagnosis, and therapeutic monitoring of diseases. We investigate the impact of microarray-based data characteristics on the predictive performance of various classification rules using simulation studies. Our investigation using Random Forest, Support Vector Machines, Linear Discriminant Analysis and k-Nearest Neighbour shows that the predictive performance of classifiers is strongly influenced by training set size, biological and technical variability, replication, fold change and correlation between biomarkers. The optimal number of biomarkers for a classification problem should therefore be estimated taking account of the impact of all these factors. A database of average generalization errors is built for various combinations of these factors. The database of generalization errors can be used for estimating the optimal number of biomarkers for given levels of predictive accuracy as a function of these factors. Examples show that curves from actual biological data resemble those of simulated data with corresponding levels of data characteristics. An R package optBiomarker implementing the method is freely available for academic use from the Comprehensive R Archive Network.



Author(s):  
Chunyu Liu ◽  
Fengrui Mu ◽  
Weilong Zhang

Background: In the current era of technology, the traditional Ant Colony Algorithm (ACO) is insufficient for solving the problems of network congestion, load balancing, and network utilization. Methods: This paper proposes an improved ant colony algorithm that considers a price factor based on the theory of elasticity of demand. The price factor is expressed in terms of its impact on the network load, which provides indirect control of network load and congestion and helps address the idle resources and reduced profits caused by low network utilization. Results: Experimental results show that the improved algorithm can balance the overall network load, extend the life of the path by nearly 3 hours, greatly reduce the risk of network paralysis, and increase the profit of the manufacturer by 300 million Yuan. Conclusion: The results show that the improved method has great application value in improving network efficiency, balancing network load, prolonging network life, and increasing network operating profit.



Foods ◽  
2021 ◽  
Vol 10 (11) ◽  
pp. 2723
Author(s):  
Evgenia D. Spyrelli ◽  
Christina Papachristou ◽  
George-John E. Nychas ◽  
Efstathios Z. Panagou

Fourier transform infrared spectroscopy (FT-IR) and multispectral imaging (MSI) were evaluated for the prediction of the microbiological quality of poultry meat via regression and classification models. Chicken thigh fillets (n = 402) were subjected to spoilage experiments at eight isothermal and two dynamic temperature profiles. Samples were analyzed microbiologically (total viable counts (TVCs) and Pseudomonas spp.), while MSI and FT-IR spectra were acquired simultaneously. The organoleptic quality of the samples was also evaluated by a sensory panel, establishing a TVC spoilage threshold at 6.99 log CFU/cm². Partial least squares regression (PLS-R) models were employed in the assessment of TVCs and Pseudomonas spp. counts on the chicken surface. Furthermore, classification models (linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), support vector machines (SVMs), and quadratic support vector machines (QSVMs)) were developed to discriminate the samples into two quality classes (fresh vs. spoiled). PLS-R models developed on MSI data predicted TVCs and Pseudomonas spp. counts satisfactorily, with root mean squared error (RMSE) values of 0.987 and 1.215 log CFU/cm², respectively. The SVM model coupled with MSI data exhibited the highest performance, with an overall accuracy of 94.4%, while in the case of FT-IR, improved classification was obtained with the QDA model (overall accuracy 71.4%). These results confirm the efficacy of MSI and FT-IR as rapid methods to assess the quality of poultry products.
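The two figures of merit quoted above, RMSE for the regression models and overall accuracy for the classifiers, can be sketched as below. The sample values are invented for illustration and are not data from the study. Treating a count at or above the 6.99 log CFU/cm² threshold as spoiled is an assumption about how the boundary case is handled.

```python
import math

SPOILAGE_THRESHOLD = 6.99  # TVC threshold in log CFU/cm^2, as reported in the abstract


def rmse(y_true, y_pred):
    """Root mean squared error between measured and predicted counts."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def quality_class(tvc, threshold=SPOILAGE_THRESHOLD):
    """Label a sample from its TVC; 'at or above threshold' is an assumed convention."""
    return "spoiled" if tvc >= threshold else "fresh"


def overall_accuracy(labels_true, labels_pred):
    """Fraction of samples whose predicted class matches the reference class."""
    correct = sum(1 for t, p in zip(labels_true, labels_pred) if t == p)
    return correct / len(labels_true)


# Hypothetical measured and predicted TVCs (log CFU/cm^2) for four samples.
measured = [5.2, 6.5, 7.4, 8.1]
predicted = [5.6, 7.1, 7.0, 8.3]
error = rmse(measured, predicted)
accuracy = overall_accuracy(
    [quality_class(v) for v in measured],
    [quality_class(v) for v in predicted],
)
```

Here the class labels are derived from counts purely to keep the sketch short; in the study the classifiers were trained on the spectral data.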



2021 ◽  
Author(s):  
Syeda Nadia Firdaus

This thesis explores machine learning models based on various feature sets to solve the protein structural class prediction problem which is a significant classification problem in bioinformatics. Knowledge of protein structural classes contributes to an understanding of protein folding patterns, and this has made structural class prediction research a major topic of interest. In this thesis, features are extracted from predicted secondary structure and hydropathy sequence using new strategies to classify proteins into one of the four major structural classes: all-α, all-β, α/β, and α+β. The prediction accuracy using these features compares favourably with some existing successful methods. We use Support Vector Machines (SVM), since this learning method has well-known efficiency in solving this classification problem. On a standard dataset (25PDB), the proposed system has an overall accuracy of 89% with as few as 22 features, whereas the previous best performing method had an accuracy of 88% using 2510 features.


