Machine learning methods for classification problems

Metrics are available for predicting fault-prone classes, and they may help software organizations plan and perform testing activities by allocating resources to the fault-prone parts of the design and code. The usefulness of such metrics is therefore clear, but validating them empirically remains a great challenge. The Random Forest (RF) algorithm has been applied successfully to regression and classification problems in many applications. In this work, the authors predict faulty classes/modules using object-oriented metrics and static code metrics. This chapter evaluates the capability of the RF algorithm and compares its performance with nine statistical and machine learning methods in predicting fault-prone software classes. The authors applied RF to six case studies based on open source software, commercial software, and NASA data sets. The results indicate that the prediction performance of RF is generally better than that of the statistical and machine learning models it was compared with. Further, the RF method classifies faulty classes/modules better than the other methods in most of the data sets.

Feature Extraction Approaches for Biological Sequences: A Comparative Study of Mathematical Models

DOI: 10.1101/2020.06.08.140368, 2020

Author(s): Robson Parmezan Bonidia, Lucas Dias Hiera Sampaio, Douglas Silva Domingues, Alexandre Rossi Paschoal, Fabrício Martins Lopes, ...

Keyword(s): Machine Learning, Feature Extraction, Mathematical Models, Relevant Information, Biological Sequences, Biological Sequence, Classification Problems, Learning Methods, Rna Sequences, Machine Learning Methods

Abstract: The number of available biological sequences has increased significantly in recent years due to various genomic sequencing projects, creating a huge volume of data. Consequently, new computational methods are needed to analyze and extract information from these sequences. Machine learning methods have shown broad applicability in computational biology and bioinformatics, and have helped to extract relevant information from various biological datasets. However, several obstacles still motivate new algorithms and pipeline proposals, mainly in feature extraction, where extracting significant discriminatory information from a biological dataset is challenging. Considering this, our work studies and analyzes a feature extraction pipeline based on mathematical models (Numerical Mapping, Fourier, Entropy, and Complex Networks). As a case study, we analyze long non-coding RNA (lncRNA) sequences. We divided this work into two studies: (I) we assessed our proposal on the most addressed problem in our review, lncRNA vs. mRNA; (II) we tested its generalization on different classification problems, e.g., circRNA vs. lncRNA. The experimental results demonstrated three main contributions: (1) an in-depth study of several mathematical models; (2) a new feature extraction pipeline; and (3) its generalization and robustness for distinct biological sequence classification.
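As a rough illustration of what a feature extraction step built on numerical mapping, Fourier analysis, and entropy might look like, the sketch below turns one nucleotide sequence into a small feature dictionary. It is not the authors' pipeline: the integer mapping `INTEGER_MAP`, the function names, the choice of k-mer size, and the two Fourier summary statistics are all assumptions made for this example, and the complex-network features are left out.

```python
import cmath
import math
from collections import Counter

# Assumed integer mapping for illustration only; the paper compares
# several numerical mappings and this table is not taken from it.
INTEGER_MAP = {"A": 2, "C": 1, "G": 3, "T": 0, "U": 0}


def numerical_mapping(seq):
    """Convert a nucleotide string into a list of integers, skipping unknown symbols."""
    return [INTEGER_MAP[base] for base in seq.upper() if base in INTEGER_MAP]


def power_spectrum(signal):
    """Power spectrum |X[k]|^2 from a direct O(n^2) discrete Fourier transform."""
    n = len(signal)
    spectrum = []
    for k in range(n):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def shannon_entropy(seq, k=2):
    """Shannon entropy (in bits) of the k-mer frequency distribution of seq."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    total = len(kmers)
    counts = Counter(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def extract_features(seq, k=2):
    """Build a small feature dictionary; assumes the sequence has at least two valid bases."""
    spectrum = power_spectrum(numerical_mapping(seq))
    ac_part = spectrum[1:]  # drop the k = 0 (constant) component
    return {
        "fourier_mean_power": sum(ac_part) / len(ac_part),
        "fourier_peak_power": max(ac_part),
        f"entropy_k{k}": shannon_entropy(seq.upper(), k),
    }


features = extract_features("ACGUACGGAUCCGAUA")
```

Feature dictionaries of this kind, computed per sequence, could then be stacked into a table and passed to any classifier, for example to separate lncRNA from mRNA. The direct transform is quadratic in sequence length, so a real pipeline would likely use an FFT routine instead.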

SUPPORT VECTOR MACHINES UNTUK MENYELESAIKAN MASALAH KLASIFIKASI PADA PENGENALAN POLA (Support Vector Machines for Solving Classification Problems in Pattern Recognition)

Jurnal Poli-Teknologi, DOI: 10.32722/pt.v18i2.1432, 2019, Vol 18 (2)

Author(s): Abdul Azis Abdillah

Keyword(s): Machine Learning, Pattern Recognition, Support Vector Machines, Support Vector, Classification Problems, Learning Methods, Learning Pattern, Machine Learning Methods, Vector Machines, Learning Machine

Abstract: Support Vector Machines (SVM) are known as a state-of-the-art machine learning method for solving classification problems in pattern recognition. This paper discusses the use of SVM in solving classification problems in pattern recognition. The example problems given in this paper cover the classification of data from an arbitrary linearly separable dataset, an arbitrary dataset with noise, and real datasets.

Key words: classification, machine learning, pattern recognition, SVM
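To make the linearly separable case mentioned in the abstract concrete, here is a minimal sketch of a linear soft-margin SVM trained by subgradient descent on the regularized hinge loss. It is not taken from the paper: the toy points, the function names, and the hyperparameters `lr`, `lam`, and `epochs` are assumptions chosen for illustration, and practical work would normally rely on an established solver.

```python
def train_linear_svm(points, labels, lr=0.1, lam=0.01, epochs=20):
    """Fit weights w and bias b by per-sample subgradient steps on
    lam/2 * ||w||^2 + max(0, 1 - y * (w . x + b)). Labels must be +1 or -1."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Point is inside the margin or misclassified: hinge term is active.
                w = [wi - lr * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Only the regularizer contributes: shrink the weights slightly.
                w = [wi - lr * lam * wi for wi in w]
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical, linearly separable toy data (two clusters around the diagonal).
POINTS = [(2, 2), (3, 3), (2, 3), (3, 2), (-2, -2), (-3, -3), (-2, -3), (-3, -2)]
LABELS = [1, 1, 1, 1, -1, -1, -1, -1]

w, b = train_linear_svm(POINTS, LABELS)
predictions = [predict(w, b, x) for x in POINTS]
```

Because the hinge loss tolerates some margin violations in exchange for a penalty, the same training loop also runs on a dataset with noise; the regularization strength `lam` then controls how much weight those violations receive relative to a wide margin.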
