Cost-Sensitive Learning of Fuzzy Rules for Imbalanced Classification Problems Using FURIA

This paper is intended to verify that cost-sensitive learning is a competitive approach for learning fuzzy rules in certain imbalanced classification problems. It will be shown that there exist cost matrices whose use in combination with a suitable classifier improves on the results of some popular data-level techniques. The well-known FURIA algorithm is extended to take advantage of this definition. A numerical study is carried out to compare the proposed cost-sensitive FURIA to other state-of-the-art classification algorithms, based on fuzzy rules and on other classical machine learning methods, on 64 different imbalanced datasets.
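As a rough illustration of how a cost matrix can change a classifier's decisions on imbalanced data, the sketch below applies the standard minimum-expected-cost rule to class probabilities. It is not the cost-sensitive FURIA extension described in the abstract; the function name `min_expected_cost_predict`, the cost values, and the probabilities are all assumptions chosen for the example.

```python
import numpy as np

def min_expected_cost_predict(proba, cost):
    """Pick, for each sample, the class with the lowest expected cost.

    proba: (n_samples, n_classes) estimated class probabilities.
    cost:  (n_classes, n_classes) matrix, cost[i, j] is the cost of
           predicting class j when the true class is i (hypothetical values).
    """
    proba = np.asarray(proba, dtype=float)
    cost = np.asarray(cost, dtype=float)
    expected = proba @ cost          # expected cost of each possible prediction
    return expected.argmin(axis=1)

# Assumed costs: missing the minority class (1) is five times worse
# than a false alarm on the majority class (0).
cost = np.array([[0.0, 1.0],
                 [5.0, 0.0]])
proba = np.array([[0.9, 0.1],
                  [0.8, 0.2],
                  [0.3, 0.7]])

pred = min_expected_cost_predict(proba, cost)
plain = proba.argmax(axis=1)  # cost-blind baseline for comparison
```

With these assumed costs, the second sample is assigned to the minority class even though its probability is only 0.2, whereas the cost-blind `argmax` would assign it to the majority class.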

RTHN: A RNN-Transformer Hierarchical Network for Emotion Cause Extraction

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/734 ◽

2019 ◽

Cited By ~ 3

Author(s):

Rui Xia ◽

Mengran Zhang ◽

Zixiang Ding

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Relative Position ◽

Deep Neural Networks ◽

State Of The Art ◽

Hierarchical Network ◽

Classification Problems ◽

Rule Based ◽

Machine Learning Methods ◽

Word Level

The emotion cause extraction (ECE) task aims at discovering the potential causes behind a certain emotion expression in a document. Techniques including rule-based methods, traditional machine learning methods and deep neural networks have been proposed to solve this task. However, most of the previous work considered ECE as a set of independent clause classification problems and ignored the relations between multiple clauses in a document. In this work, we propose a joint emotion cause extraction framework, named RNN-Transformer Hierarchical Network (RTHN), to encode and classify multiple clauses synchronously. RTHN is composed of a lower word-level encoder based on RNNs to encode multiple words in each clause, and an upper clause-level encoder based on Transformer to learn the correlation between multiple clauses in a document. We furthermore propose ways to encode the relative position and global predication information into Transformer that can capture the causality between clauses and make RTHN more efficient. We finally achieve the best performance among 12 compared systems and improve the F1 score of the state-of-the-art from 72.69% to 76.77%.

Domain Adaptation Using a Three-Way Decision Improves the Identification of Autism Patients from Multisite fMRI Data

Brain Sciences ◽

10.3390/brainsci11050603 ◽

2021 ◽

Vol 11 (5) ◽

pp. 603

Author(s):

Chunlei Shi ◽

Xianwei Xin ◽

Jiacai Zhang

Keyword(s):

Machine Learning ◽

Domain Adaptation ◽

Recognition Accuracy ◽

State Of The Art ◽

Autism Spectrum ◽

Fmri Data ◽

Target Domain ◽

Sample Distribution ◽

Machine Learning Methods ◽

First Time

Machine learning methods are widely used in autism spectrum disorder (ASD) diagnosis. Due to the lack of labelled ASD data, multisite data are often pooled together to expand the sample size. However, the heterogeneity that exists among different sites degrades the performance of machine learning models. Herein, three-way decision theory is introduced into unsupervised domain adaptation for the first time and applied to optimize the pseudolabels of the target domain/site from functional magnetic resonance imaging (fMRI) features related to ASD patients. The experimental results using multisite fMRI data show that our method not only narrows the gap in sample distribution among domains but is also superior to state-of-the-art domain adaptation methods in ASD recognition. Specifically, the ASD recognition accuracy of the proposed method is improved on all six tasks, by 70.80%, 75.41%, 69.91%, 72.13%, 71.01% and 68.85%, respectively, compared with the existing methods.
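The general idea of a three-way decision applied to pseudolabels can be sketched as follows: confident predictions are accepted or rejected, and uncertain ones are deferred. The abstract does not describe the authors' actual procedure, so the function `three_way_pseudolabels`, the thresholds `alpha` and `beta`, and the use of `-1` for deferred samples are assumptions made purely for illustration.

```python
import numpy as np

def three_way_pseudolabels(p_pos, alpha=0.7, beta=0.3):
    """Assign pseudolabels with a three-way rule (hypothetical thresholds).

    p_pos: estimated probability that each target sample is positive.
    Returns 1 (accept), 0 (reject), or -1 (boundary region, decision deferred).
    """
    p = np.asarray(p_pos, dtype=float)
    labels = np.full(p.shape, -1, dtype=int)  # default: deferred
    labels[p >= alpha] = 1                    # positive region
    labels[p <= beta] = 0                     # negative region
    return labels

labels = three_way_pseudolabels([0.95, 0.5, 0.1, 0.7])
```

Only samples labelled 0 or 1 would then be used as pseudolabelled training data; samples in the boundary region would wait until the model becomes more certain about them.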

The Analysis of EEG Signal and Comparison of Classification Algorithms Using Machine Learning Methods

Software Engineering Perspectives in Intelligent Systems - Advances in Intelligent Systems and Computing ◽

10.1007/978-3-030-63322-6_47 ◽

2020 ◽

pp. 582-590

Author(s):

Andrea Nemethova ◽

Dmitrii Borkin ◽

Martin Nemeth

Keyword(s):

Machine Learning ◽

Classification Algorithms ◽

Eeg Signal ◽

Learning Methods ◽

Machine Learning Methods

Machine learning methods for classification problems

Śląski Przegląd Statystyczny ◽

10.15611/sps.2020.18.14 ◽

2020 ◽

Vol 18 (24) ◽

pp. 241-248

Author(s):

Heiko Groenitz

Keyword(s):

Machine Learning ◽

Classification Problems ◽

Learning Methods ◽

Machine Learning Methods

Constraint relaxation, cost-sensitive learning and bagging for imbalanced classification problems with outliers

Optimization Letters ◽

10.1007/s11590-015-0934-z ◽

2015 ◽

Vol 11 (5) ◽

pp. 915-928 ◽

Cited By ~ 5

Author(s):

Talayeh Razzaghi ◽

Petros Xanthopoulos ◽

Onur Şeref

Keyword(s):

Classification Problems ◽

Cost Sensitive Learning ◽

Imbalanced Classification ◽

Constraint Relaxation

Machine learning-based analysis of multi-omics data on the cloud for investigating gene regulations

Briefings in Bioinformatics ◽

10.1093/bib/bbaa032 ◽

2020 ◽

Cited By ~ 2

Author(s):

Minsik Oh ◽

Sungjoon Park ◽

Sun Kim ◽

Heejoon Chae

Keyword(s):

Machine Learning ◽

Gene Regulation ◽

State Of The Art ◽

Patient Specific ◽

Specific Gene ◽

Omics Data ◽

Gene Expressions ◽

Learning Methods ◽

Machine Learning Methods ◽

Disease Subtype

Gene expressions are subtly regulated by quantifiable measures of genetic molecules such as interaction with other genes, methylation, mutations, transcription factor and histone modifications. Integrative analysis of multi-omics data can help scientists understand the condition or patient-specific gene regulation mechanisms. However, analysis of multi-omics data is challenging since it requires not only the analysis of multiple omics data sets but also mining complex relations among different genetic molecules by using state-of-the-art machine learning methods. In addition, analysis of multi-omics data needs quite large computing infrastructure. Moreover, interpretation of the analysis results requires collaboration among many scientists, often requiring reperforming analysis from different perspectives. Many of the aforementioned technical issues can be nicely handled when machine learning tools are deployed on the cloud. In this survey article, we first survey machine learning methods that can be used for gene regulation study, and we categorize them according to five different goals: gene regulatory subnetwork discovery, disease subtype analysis, survival analysis, clinical prediction and visualization. We also summarize the methods in terms of multi-omics input types. Then, we explain why the cloud is potentially a good solution for the analysis of multi-omics data, followed by a survey of two state-of-the-art cloud systems, Galaxy and BioVLAB. Finally, we discuss important issues when the cloud is used for the analysis of multi-omics data for the gene regulation study.

Machine learning methods in the computational biology of cancer

Proceedings of The Royal Society A Mathematical Physical and Engineering Sciences ◽

10.1098/rspa.2014.0081 ◽

2014 ◽

Vol 470 (2167) ◽

pp. 20140081 ◽

Cited By ~ 15

Author(s):

M. Vidyasagar

Keyword(s):

Machine Learning ◽

Ovarian Cancer ◽

Feature Selection ◽

Classification Problems ◽

Open Problems ◽

Machine Learning Methods ◽

Selection For ◽

Personalized Cancer Therapy ◽

Personalized Cancer ◽

Sparse Feature Selection

The objectives of this Perspective paper are to review some recent advances in sparse feature selection for regression and classification, as well as compressed sensing, and to discuss how these might be used to develop tools to advance personalized cancer therapy. As an illustration of the possibilities, a new algorithm for sparse regression is presented and is applied to predict the time to tumour recurrence in ovarian cancer. A new algorithm for sparse feature selection in classification problems is presented, and its validation in endometrial cancer is briefly discussed. Some open problems are also presented.

Identity Recognition Using Biological Electroencephalogram Sensors

Journal of Sensors ◽

10.1155/2016/1831742 ◽

2016 ◽

Vol 2016 ◽

pp. 1-9 ◽

Cited By ~ 1

Author(s):

Wei Liang ◽

Liang Cheng ◽

Mingdong Tang

Keyword(s):

Machine Learning ◽

Signal Processing ◽

Human Brain ◽

State Of The Art ◽

Brain Wave ◽

Identity Recognition ◽

Wave Signal ◽

Machine Learning Methods ◽

Characteristic Extraction ◽

Critical Issues

The brain wave signal is a bioelectric phenomenon reflecting activity in the human brain. In this paper, we first introduce brain wave-based identity recognition techniques and the state-of-the-art work. We then analyze important features of brain waves and present the challenges confronting their applications. Further, we evaluate the security and practicality of using brain waves in identity recognition and anticounterfeiting authentication and describe use cases of several machine learning methods in brain wave signal processing. Afterwards, we survey the critical issues of characteristic extraction, classification, and selection involved in brain wave signal processing. Finally, we propose several brain wave-based identity recognition techniques for further study and conclude the paper.

A Survey of the Application of Artifical Intellegence on COVID-19 Diagnosis and Prediction

Engineering, Technology & Applied Science Research ◽

10.48084/etasr.4503 ◽

2021 ◽

Vol 11 (6) ◽

pp. 7824-7835

Author(s):

H. Alalawi ◽

M. Alsuwat ◽

H. Alhakami

Keyword(s):

Machine Learning ◽

State Of The Art ◽

Future Research ◽

Classification Algorithms ◽

Classification Methods ◽

Research Directions ◽

The World ◽

Modern Methods ◽

Future Research Directions ◽

Class Labels

The importance of classification algorithms has increased in recent years. Classification is a branch of supervised learning whose goal is to predict the categorical class labels of new cases. With the spread of the coronavirus disease (COVID-19) since 2019, the world still faces a great challenge in defeating COVID-19, even with modern methods and technologies. This paper gives an overview of classification algorithms to provide readers with an understanding of state-of-the-art classification algorithms and their applications in COVID-19 diagnosis and detection. It also describes some of the research published on classification algorithms, the existing gaps in the research, and future research directions. This article encourages both academics and machine learning learners to further strengthen the basis of classification methods.

Application of an Interpretable Classification Model on Early Folding Residues during Protein Folding

10.1101/381483 ◽

2018 ◽

Author(s):

Sebastian Bittrich ◽

Marika Kaden ◽

Christoph Leberecht ◽

Florian Kaiser ◽

Thomas Villmann ◽

...

Keyword(s):

Machine Learning ◽

Protein Folding ◽

Learning Strategies ◽

Life Sciences ◽

Classification Model ◽

Classification Problems ◽

Hydrophobic Residues ◽

Imbalanced Classification ◽

Fine Grained ◽

Generalized Matrix

Background: Machine learning strategies are prominent tools for data analysis. Especially in life sciences, they have become increasingly important to handle the growing datasets collected by the scientific community. Meanwhile, algorithms improve in performance, but also gain complexity, and tend to neglect interpretability and comprehensiveness of the resulting models. Results: Generalized Matrix Learning Vector Quantization (GMLVQ) is a supervised, prototype-based machine learning method and provides comprehensive visualization capabilities not present in other classifiers which allow for a fine-grained interpretation of the data. In contrast to commonly used machine learning strategies, GMLVQ is well-suited for imbalanced classification problems which are frequent in life sciences. We present a Weka plug-in implementing GMLVQ. The feasibility of GMLVQ is demonstrated on a dataset of Early Folding Residues (EFR) that have been shown to initiate and guide the protein folding process. Using 27 features, an area under the receiver operating characteristic of 76.6% was achieved which is comparable to other state-of-the-art classifiers. Conclusions: The application on EFR prediction demonstrates how an easy interpretation of classification models can promote the comprehension of biological mechanisms. The results shed light on the special features of EFR which were reported as most influential for the classification: EFR are embedded in ordered secondary structure elements and they participate in networks of hydrophobic residues. Visualization capabilities of GMLVQ are presented as we demonstrate how to interpret the results.
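For readers unfamiliar with the metric quoted above, the area under the receiver operating characteristic curve (AUC) can be read as the probability that a randomly chosen positive sample is scored higher than a randomly chosen negative one. The sketch below computes it directly from that definition; the helper name `roc_auc` and the toy labels and scores are assumptions for illustration and have no connection to the EFR dataset or the GMLVQ plug-in.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data: two negatives, two positives, one misordered pair.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Because the measure depends only on the ranking of scores and not on class proportions, it is a common choice for imbalanced problems such as the one described in the abstract.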
