scholarly journals Cost-Sensitive Learning of Fuzzy Rules for Imbalanced Classification Problems Using FURIA

Author(s):  
Ana Palacios ◽  
Krzysztof Trawiński ◽  
Oscar Cordón ◽  
Luciano Sánchez

This paper is intended to verify that cost-sensitive learning is a competitive approach for learning fuzzy rules in certain imbalanced classification problems. It will be shown that there exist cost matrices whose use in combination with a suitable classifier allows for improving the results of some popular data-level techniques. The well known FURIA algorithm is extended to take advantage of this definition. A numerical study is carried out to compare the proposed cost-sensitive FURIA to other state-of-the-art classification algorithms, based on fuzzy rules and on other classical machine learning methods, on 64 different imbalanced datasets.

Author(s):  
Rui Xia ◽  
Mengran Zhang ◽  
Zixiang Ding

The emotion cause extraction (ECE) task aims at discovering the potential causes behind a certain emotion expression in a document. Techniques including rule-based methods, traditional machine learning methods and deep neural networks have been proposed to solve this task. However, most of the previous work considered ECE as a set of independent clause classification problems and ignored the relations between multiple clauses in a document. In this work, we propose a joint emotion cause extraction framework, named RNN-Transformer Hierarchical Network (RTHN), to encode and classify multiple clauses synchronously. RTHN is composed of a lower word-level encoder based on RNNs to encode multiple words in each clause, and an upper clause-level encoder based on Transformer to learn the correlation between multiple clauses in a document. We furthermore propose ways to encode the relative position and global predication information into Transformer that can capture the causality between clauses and make RTHN more efficient. We finally achieve the best performance among 12 compared systems and improve the F1 score of the state-of-the-art from 72.69% to 76.77%.


2021 ◽  
Vol 11 (5) ◽  
pp. 603
Author(s):  
Chunlei Shi ◽  
Xianwei Xin ◽  
Jiacai Zhang

Machine learning methods are widely used in autism spectrum disorder (ASD) diagnosis. Due to the lack of labelled ASD data, multisite data are often pooled together to expand the sample size. However, the heterogeneity that exists among different sites leads to the degeneration of machine learning models. Herein, the three-way decision theory was introduced into unsupervised domain adaptation in the first time, and applied to optimize the pseudolabel of the target domain/site from functional magnetic resonance imaging (fMRI) features related to ASD patients. The experimental results using multisite fMRI data show that our method not only narrows the gap of the sample distribution among domains but is also superior to the state-of-the-art domain adaptation methods in ASD recognition. Specifically, the ASD recognition accuracy of the proposed method is improved on all the six tasks, by 70.80%, 75.41%, 69.91%, 72.13%, 71.01% and 68.85%, respectively, compared with the existing methods.


Author(s):  
Minsik Oh ◽  
Sungjoon Park ◽  
Sun Kim ◽  
Heejoon Chae

Abstract Gene expressions are subtly regulated by quantifiable measures of genetic molecules such as interaction with other genes, methylation, mutations, transcription factor and histone modifications. Integrative analysis of multi-omics data can help scientists understand the condition or patient-specific gene regulation mechanisms. However, analysis of multi-omics data is challenging since it requires not only the analysis of multiple omics data sets but also mining complex relations among different genetic molecules by using state-of-the-art machine learning methods. In addition, analysis of multi-omics data needs quite large computing infrastructure. Moreover, interpretation of the analysis results requires collaboration among many scientists, often requiring reperforming analysis from different perspectives. Many of the aforementioned technical issues can be nicely handled when machine learning tools are deployed on the cloud. In this survey article, we first survey machine learning methods that can be used for gene regulation study, and we categorize them according to five different goals: gene regulatory subnetwork discovery, disease subtype analysis, survival analysis, clinical prediction and visualization. We also summarize the methods in terms of multi-omics input types. Then, we explain why the cloud is potentially a good solution for the analysis of multi-omics data, followed by a survey of two state-of-the-art cloud systems, Galaxy and BioVLAB. Finally, we discuss important issues when the cloud is used for the analysis of multi-omics data for the gene regulation study.


Author(s):  
M. Vidyasagar

The objectives of this Perspective paper are to review some recent advances in sparse feature selection for regression and classification, as well as compressed sensing, and to discuss how these might be used to develop tools to advance personalized cancer therapy. As an illustration of the possibilities, a new algorithm for sparse regression is presented and is applied to predict the time to tumour recurrence in ovarian cancer. A new algorithm for sparse feature selection in classification problems is presented, and its validation in endometrial cancer is briefly discussed. Some open problems are also presented.


2016 ◽  
Vol 2016 ◽  
pp. 1-9 ◽  
Author(s):  
Wei Liang ◽  
Liang Cheng ◽  
Mingdong Tang

Brain wave signal is a bioelectric phenomenon reflecting activities in human brain. In this paper, we firstly introduce brain wave-based identity recognition techniques and the state-of-the-art work. We then analyze important features of brain wave and present challenges confronted by its applications. Further, we evaluate the security and practicality of using brain wave in identity recognition and anticounterfeiting authentication and describe use cases of several machine learning methods in brain wave signal processing. Afterwards, we survey the critical issues of characteristic extraction, classification, and selection involved in brain wave signal processing. Finally, we propose several brain wave-based identity recognition techniques for further studies and conclude this paper.


2021 ◽  
Vol 11 (6) ◽  
pp. 7824-7835
Author(s):  
H. Alalawi ◽  
M. Alsuwat ◽  
H. Alhakami

The importance of classification algorithms has increased in recent years. Classification is a branch of supervised learning with the goal of predicting class labels categorical of new cases. Additionally, with Coronavirus (COVID-19) propagation since 2019, the world still faces a great challenge in defeating COVID-19 even with modern methods and technologies. This paper gives an overview of classification algorithms to provide the readers with an understanding of the concept of the state-of-the-art classification algorithms and their applications used in the COVID-19 diagnosis and detection. It also describes some of the research published on classification algorithms, the existing gaps in the research, and future research directions. This article encourages both academics and machine learning learners to further strengthen the basis of classification methods.


2018 ◽  
Author(s):  
Sebastian Bittrich ◽  
Marika Kaden ◽  
Christoph Leberecht ◽  
Florian Kaiser ◽  
Thomas Villmann ◽  
...  

AbstractBackgroundMachine learning strategies are prominent tools for data analysis. Especially in life sciences, they have become increasingly important to handle the growing datasets collected by the scientific community. Meanwhile, algorithms improve in performance, but also gain complexity, and tend to neglect interpretability and comprehensiveness of the resulting models.ResultsGeneralized Matrix Learning Vector Quantization (GMLVQ) is a supervised, prototype-based machine learning method and provides comprehensive visualization capabilities not present in other classifiers which allow for a fine-grained interpretation of the data. In contrast to commonly used machine learning strategies, GMLVQ is well-suited for imbalanced classification problems which are frequent in life sciences. We present a Weka plug-in implementing GMLVQ. The feasibility of GMLVQ is demonstrated on a dataset of Early Folding Residues (EFR) that have been shown to initiate and guide the protein folding process. Using 27 features, an area under the receiver operating characteristic of 76.6% was achieved which is comparable to other state-of-the-art classifiers.ConclusionsThe application on EFR prediction demonstrates how an easy interpretation of classification models can promote the comprehension of biological mechanisms. The results shed light on the special features of EFR which were reported as most influential for the classification: EFR are embedded in ordered secondary structure elements and they participate in networks of hydrophobic residues. Visualization capabilities of GMLVQ are presented as we demonstrate how to interpret the results.


Sign in / Sign up

Export Citation Format

Share Document