A Novel and Simple Mathematical Transform Improves the Perfomance of Lernmatrix in Pattern Classification

The Lernmatrix is a classic associative memory model. It can perform pattern classification, but its performance is not competitive with state-of-the-art classifiers. The main contribution of this paper is a simple mathematical transform whose application eliminates the subtractive alterations between patterns. As a consequence, the Lernmatrix performance is significantly improved. To perform the experiments, we selected 20 datasets that are challenging for any classifier, as they exhibit class imbalance. The effectiveness of our proposal was compared against seven supervised classifiers drawn from the most important approaches (Bayes, nearest neighbors, decision trees, logistic function, support vector machines, and neural networks). With balanced accuracy as the performance measure, our proposal obtained the best results in 10 datasets. The elimination of subtractive alterations makes the new model competitive against the best classifiers, and it sometimes beats them. After applying the Friedman test and the Holm post hoc test, we can conclude, with 95% confidence, that our proposal competes successfully with the most effective state-of-the-art classifiers.
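
The abstract relies on balanced accuracy because the datasets are imbalanced. As a minimal sketch of why that matters (assuming the usual definition, the mean of per-class recall; the function name and toy data below are illustrative and not taken from the paper):

```python
from collections import defaultdict

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally regardless of size."""
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)

# Toy imbalanced data (hypothetical): 9 negatives, 1 positive,
# and a classifier that always predicts the majority class.
y_true = [0] * 9 + [1]
y_pred = [0] * 10

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                     # 0.9
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

Plain accuracy rewards the trivial majority-class predictor, while balanced accuracy exposes that the minority class is never recognized.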

Supervised Classification of Diseases Based on an Improved Associative Algorithm

Mathematics ◽

10.3390/math9131458 ◽

2021 ◽

Vol 9 (13) ◽

pp. 1458

Author(s):

Raúl Jiménez-Cruz ◽

José-Luis Velázquez-Rodríguez ◽

Itzamá López-Yáñez ◽

Yenny Villuendas-Rey ◽

Cornelio Yáñez-Márquez

Keyword(s):

Performance Measure ◽

Support Vector ◽

Mathematical Tool ◽

Classifier Ensembles ◽

Orthogonal Projections ◽

Vector Machines ◽

Supervised Classifiers ◽

Classification Of Diseases ◽

Low Performance ◽

Value Decomposition

The linear associator is a classic associative memory model. However, due to its low performance, very few linear associator applications have been published. The reason is that the model requires the vectors representing the patterns to be orthonormal, which is a severe restriction. Some researchers have tried to feed the linear associator with orthogonal projections of the vectors, but this solution has serious drawbacks. This paper presents a proposal that effectively improves the performance of the linear associator when acting as a pattern classifier. The proposal transforms the dataset using a powerful mathematical tool: the singular value decomposition. To perform the experiments, we selected fourteen two-class medical datasets. All datasets are balanced, so accuracy can be used as the performance measure. The effectiveness of our proposal was compared against nine supervised classifiers drawn from the most important approaches (Bayes, nearest neighbors, decision trees, support vector machines, and neural networks), including three classifier ensembles. The Friedman and Holm tests show that our proposal performed significantly better than four of the nine classifiers. Furthermore, there are no significant differences against the other five, although three of them are ensembles.
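
The orthonormality restriction mentioned in the abstract can be illustrated with a small sketch. It assumes the textbook linear associator (a weight matrix built as a sum of outer products of one-hot labels and input patterns); the function names and toy patterns are hypothetical, and the paper's SVD-based transform is not reproduced here.

```python
def train_linear_associator(patterns, labels, n_classes):
    """Build W as the sum of outer products y_k x_k^T, with y_k one-hot."""
    dim = len(patterns[0])
    W = [[0.0] * dim for _ in range(n_classes)]
    for x, label in zip(patterns, labels):
        # The outer product with a one-hot label only touches row `label`.
        for j in range(dim):
            W[label][j] += x[j]
    return W

def recall(W, x):
    """Return W x; the index of the largest entry is the predicted class."""
    return [sum(w_j * x_j for w_j, x_j in zip(row, x)) for row in W]

# Orthonormal patterns: recall reproduces the one-hot label exactly.
ortho = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
W = train_linear_associator(ortho, [0, 1], n_classes=2)
out_ortho = recall(W, ortho[0])
print(out_ortho)   # [1.0, 0.0]

# Unit-length but non-orthogonal patterns: the second class leaks into the output.
skewed = [[1.0, 0.0, 0.0], [0.8, 0.6, 0.0]]
W2 = train_linear_associator(skewed, [0, 1], n_classes=2)
out_skewed = recall(W2, skewed[0])
print(out_skewed)  # [1.0, 0.8]
```

The second output shows the cross-talk term: the inner product between the two stored patterns (0.8) appears in the entry for the wrong class, which is the kind of interference a transform toward orthogonal patterns is meant to reduce.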

Contact Lens Classification by Using Segmented Lens Boundary Features

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v11.i3.pp1129-1135 ◽

2018 ◽

Vol 11 (3) ◽

pp. 1129

Author(s):

Nur Ariffin Mohd Zin ◽

Hishammuddin Asmuni ◽

Haza Nuzly Abdul Hamed ◽

Razib M. Othman ◽

Shahreen Kasim ◽

...

Keyword(s):

Support Vector Machines ◽

Contact Lens ◽

State Of The Art ◽

Classification Method ◽

Support Vector ◽

Local Descriptors ◽

Iris Image ◽

Vector Machines ◽

False Reject Rate ◽

Better Than

Recent studies have shown that wearing soft contact lenses may lead to performance degradation through an increased false reject rate. However, detecting the presence of a soft lens is a non-trivial task because its texture is almost indiscernible. In this work, we propose a classification method to identify the presence of a soft lens in an iris image. The method starts by segmenting the lens boundary on top of the sclera region. Features are then extracted from the segmented boundary using local descriptors, and Support Vector Machines are trained on these features to perform the classification. The method was tested on the Notre Dame Cosmetic Contact Lens 2013 database. Experiments showed that the proposed method performed better than state-of-the-art methods.

State of the Art Survey of Deep Learning and Machine Learning Models for Smart Cities and Urban Sustainability

10.31219/osf.io/gmuzk ◽

2020 ◽

Author(s):

Saeed Nosratabadi ◽

Amir Mosavi ◽

Ramin Keivani ◽

Sina Faizollahzadeh Ardabili ◽

Farshid Aram

Keyword(s):

Machine Learning ◽

Deep Learning ◽

State Of The Art ◽

Smart Cities ◽

Model Development ◽

Urban Sustainability ◽

Urban Transport ◽

Support Vector ◽

Neuro Fuzzy ◽

Vector Machines

Deep learning (DL) and machine learning (ML) methods have recently contributed to the advancement of models for prediction, planning, and uncertainty analysis of smart cities and urban development. This paper presents the state of the art of DL and ML methods used in this realm. Through a novel taxonomy, the advances in model development and new application domains in urban sustainability and smart cities are presented. Findings reveal that five DL and ML methods have been most applied to address the different aspects of smart cities. These are artificial neural networks; support vector machines; decision trees; ensembles, Bayesians, hybrids, and neuro-fuzzy; and deep learning. The findings also show that energy, health, and urban transport are the main smart-city domains in which DL and ML methods have been applied.

Fuzzy Possibilistic Support Vector Machines for Class Imbalance Learning

Journal of Convergence Information Technology ◽

10.4156/jcit.vol8.issue3.82 ◽

2013 ◽

Vol 8 (3) ◽

pp. 692-701 ◽

Cited By ~ 2

Author(s):

Meng Fanrong ◽

Gao Chunxiao ◽

Liu Bing

Keyword(s):

Support Vector Machines ◽

Class Imbalance ◽

Support Vector ◽

Vector Machines ◽

Imbalance Learning ◽

Class Imbalance Learning

Linear Support Vector Machines for Prediction of Student Performance in School-Based Education

Mathematical Problems in Engineering ◽

10.1155/2020/4761468 ◽

2020 ◽

Vol 2020 ◽

pp. 1-7

Author(s):

Nalindren Naicker ◽

Timothy Adeliyi ◽

Jeanette Wing

Keyword(s):

Machine Learning ◽

Support Vector Machines ◽

Student Performance ◽

State Of The Art ◽

Learning Algorithms ◽

The State ◽

Machine Learning Algorithms ◽

Superior Performance ◽

Support Vector ◽

Vector Machines

Educational Data Mining (EDM) is a rich research field in computer science. Tools and techniques in EDM are useful for predicting student performance, which gives practitioners insights for developing intervention strategies that improve pass rates and increase retention. The performance of state-of-the-art machine learning classifiers depends heavily on the task at hand. Support vector machines have been used extensively in classification problems; however, the extant literature shows a gap in the application of linear support vector machines as a predictor of student performance. The aim of this study was to compare the performance of linear support vector machines with that of state-of-the-art classical machine learning algorithms in order to determine which algorithm would improve prediction of student performance. In this quantitative study, an experimental research design was used. Experiments were set up using feature selection on a publicly available dataset of 1000 alpha-numeric student records. Linear support vector machines, benchmarked against ten categorical machine learning algorithms, showed superior performance in predicting student performance. The results showed that features such as race, gender, and lunch influence performance in mathematics, whilst access to lunch was the primary factor influencing reading and writing performance.

Fuzzy Twin Support Vector Machines for Pattern Classification

Mathematical Programming and Game Theory for Decision Making - Statistical Science and Interdisciplinary Research ◽

10.1142/9789812813220_0009 ◽

2008 ◽

pp. 131-142 ◽

Cited By ~ 5

Author(s):

Reshma Khemchandani ◽

Jayadeva ◽

Suresh Chandra

Keyword(s):

Support Vector Machines ◽

Pattern Classification ◽

Support Vector ◽

Twin Support Vector Machines ◽

Vector Machines

Efficient sparse least squares support vector machines for pattern classification

2012 9th International Conference on Fuzzy Systems and Knowledge Discovery ◽

10.1109/fskd.2012.6234016 ◽

2012 ◽

Author(s):

Yingjie Tian ◽

Xuchan Ju ◽

Zhiquan Qi ◽

Yong Shi

Keyword(s):

Support Vector Machines ◽

Least Squares ◽

Pattern Classification ◽

Support Vector ◽

Vector Machines

K-mer based classifiers extract functionally relevant features to support accurate Peroxiredoxin subgroup distinction

10.1101/387787 ◽

2018 ◽

Author(s):

Jiajie Xiao ◽

William H. Turkett

Keyword(s):

Active Site ◽

Antioxidant Defense ◽

State Of The Art ◽

Support Vector ◽

Site Analysis ◽

Representative Sequence ◽

Current State ◽

Vector Machines ◽

Sequence Representation ◽

Functional Relevance

Background: The Peroxiredoxins (Prx) are a family of proteins that play a major role in antioxidant defense and peroxide-regulated signaling. Six distinct Prx subgroups have been defined based on analysis of structure and sequence regions in proximity to the Prx active site. Analysis of other sequence regions of these annotated proteins may improve the ability to distinguish subgroups and uncover additional representative sequence regions beyond the active site. Results: The space of Prx subgroup classifiers is surveyed to highlight similarities and differences in the available approaches. Exploiting the recent growth in annotated Prx proteins, a whole sequence-based classifier is presented that employs support vector machines and a k-mer (k=3) sequence representation. Distinguishing k-mers are extracted and located relative to published active site regions. Conclusions: This work demonstrates that the 3-mer based classifier can attain high accuracy in subgroup annotation, at rates similar to the current state-of-the-art. Analysis of the classifier’s automatically derived models shows that the classification decision is based on a combination of conserved features, including a significant number of residue regions that have not been previously suggested as informative by other classifiers but for which there is evidence of functional relevance.
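
The k-mer (k=3) sequence representation mentioned in the abstract can be sketched as follows. This is a generic overlapping k-mer count, assumed here for illustration; the function names, the toy sequence, and the fixed vocabulary are hypothetical, and the paper's exact feature construction and SVM setup are not reproduced.

```python
from collections import Counter

def kmer_counts(sequence, k=3):
    """Count every overlapping k-mer (substring of length k) in a sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

def to_vector(counts, vocabulary):
    """Turn k-mer counts into a fixed-length feature vector over a given vocabulary."""
    return [counts[kmer] for kmer in vocabulary]

toy_sequence = "AVAVAV"             # hypothetical fragment, not a real Prx sequence
counts = kmer_counts(toy_sequence)  # overlapping 3-mers: AVA, VAV, AVA, VAV
vocabulary = ["AVA", "VAV", "CPG"]  # hypothetical vocabulary of 3-mers
features = to_vector(counts, vocabulary)

print(dict(counts))  # {'AVA': 2, 'VAV': 2}
print(features)      # [2, 2, 0]
```

A vector of this kind, computed over a shared vocabulary for every protein, is the sort of input a support vector machine can be trained on.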

Filtering and Classifying Relevant Short Text with a Few Seed Words

Data and Information Management ◽

10.2478/dim-2019-0011 ◽

2019 ◽

Vol 3 (3) ◽

pp. 165-186 ◽

Cited By ~ 1

Author(s):

Chenliang Li ◽

Shiqian Chen ◽

Yan Qi

Keyword(s):

Latent Dirichlet Allocation ◽

Topic Model ◽

State Of The Art ◽

Superior Performance ◽

Support Vector ◽

Short Text ◽

Text Filtering ◽

Supervised Classifiers ◽

Real World Datasets ◽

Weakly Supervised

Filtering out irrelevant documents and classifying the relevant ones into topical categories is a de facto task in many applications. However, supervised learning solutions require extensive human effort for document labeling. In this paper, we propose a novel seed-guided topic model for dataless short text classification and filtering, named SSCF. Without using any labeled documents, SSCF takes a few “seed words” for each category of interest, and conducts short text filtering and classification in a weakly supervised manner. To overcome the issues of data sparsity and imbalance, the short text collection is mapped to a collection of pseudo-documents, one for each word. SSCF infers two kinds of topics on pseudo-documents: category-topics and general-topics. Each category-topic is associated with one category of interest, covering the meaning of the latter. In SSCF, we devise a novel word relevance estimation process based on the seed words, for hidden topic inference. The dominating topic of a short text is identified through post inference and then used for filtering and classification. On two real-world datasets in two languages, experimental results show that our proposed SSCF consistently achieves better classification accuracy than state-of-the-art baselines. We also observe that SSCF can even outperform the supervised classifiers supervised latent Dirichlet allocation (sLDA) and support vector machine (SVM) on some testing tasks.

Boosting Granular Support Vector Machines for the Accurate Prediction of Protein-Nucleotide Binding Sites

Combinatorial Chemistry & High Throughput Screening ◽

10.2174/1386207322666190925125524 ◽

2019 ◽

Vol 22 (7) ◽

pp. 455-469

Author(s):

Yi-Heng Zhu ◽

Jun Hu ◽

Yong Qi ◽

Xiao-Ning Song ◽

Dong-Jun Yu

Keyword(s):

Machine Learning ◽

Support Vector Machines ◽

Ligand Binding ◽

Binding Sites ◽

Negative Impact ◽

Class Imbalance ◽

Nucleotide Binding ◽

Support Vector ◽

Vector Machines ◽

Ligand Binding Sites

Aim and Objective: The accurate identification of protein-ligand binding sites helps elucidate protein function and facilitates the design of new drugs. Machine-learning-based methods have been widely used for the prediction of protein-ligand binding sites. Nevertheless, the severe class imbalance phenomenon, where the number of nonbinding (majority) residues is far greater than that of binding (minority) residues, has a negative impact on the performance of such machine-learning-based predictors. Materials and Methods: In this study, we aim to relieve the negative impact of class imbalance by Boosting Multiple Granular Support Vector Machines (BGSVM). In BGSVM, each base SVM is trained on a granular training subset consisting of all minority samples and some reasonably selected majority samples. The efficacy of BGSVM for dealing with class imbalance was validated by benchmarking it against several typical imbalance learning algorithms. We further implemented a protein-nucleotide binding site predictor, called BGSVM-NUC, with the BGSVM algorithm. Results: Rigorous cross-validation and independent validation tests for five types of protein-nucleotide interactions demonstrated that the proposed BGSVM-NUC achieves promising prediction performance and outperforms several popular sequence-based protein-nucleotide binding site predictors. The BGSVM-NUC web server is freely available at http://csbio.njust.edu.cn/bioinf/BGSVM-NUC/ for academic use.
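
The idea of a granular training subset (all minority samples plus a selection of majority samples) can be sketched as below. This is only an illustration under a stated assumption: majority samples are drawn uniformly at random, which stands in for the paper's "reasonably selected" samples. The function name, parameters, and toy data are hypothetical, and neither the boosting step nor the SVM training is shown.

```python
import random

def granular_subsets(samples, labels, minority_label, n_subsets, seed=0):
    """Build n_subsets training subsets, each holding every minority sample
    plus an equally sized random draw of majority samples (an assumption)."""
    rng = random.Random(seed)
    minority = [s for s, y in zip(samples, labels) if y == minority_label]
    majority = [s for s, y in zip(samples, labels) if y != minority_label]
    subsets = []
    for _ in range(n_subsets):
        # Random under-sampling without replacement within one subset.
        picked = rng.sample(majority, k=min(len(minority), len(majority)))
        subsets.append((list(minority), picked))
    return subsets

# Toy data (hypothetical): 2 binding (label 1) and 10 nonbinding (label 0) residues.
samples = list(range(12))
labels = [1, 1] + [0] * 10
subsets = granular_subsets(samples, labels, minority_label=1, n_subsets=3)

for minority_part, majority_part in subsets:
    print(minority_part, majority_part)
```

Each subset is balanced, so a base classifier trained on it is not dominated by the majority class; combining several such classifiers lets different parts of the majority class be covered.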
