Feature Genes Selection Using Fuzzy Rough Uncertainty Metric for Tumor Diagnosis

2019 ◽  
Vol 2019 ◽  
pp. 1-9
Author(s):  
Jiucheng Xu ◽  
Yun Wang ◽  
Keqiang Xu ◽  
Tianli Zhang

To select more effective feature genes, many existing algorithms concentrate on the evaluation methods used for feature gene selection and ignore the accurate mapping of the original information during data processing. To address this problem, this paper proposes a new model, the rough uncertainty metric model. First, the fuzzy neighborhood granule of a sample is constructed by combining the fuzzy similarity relation with the neighborhood radius in the rough set, and the rough decision is defined using the fuzzy similarity relation and the decision equivalence class. Then, the fuzzy neighborhood granule and the rough decision are introduced into the conditional entropy to obtain the rough uncertainty metric model; a measure of the significance of feature genes is defined, and some related theorems are proved. To make the model tolerant of noise in the data, the paper introduces a variable precision model and discusses the selection of its parameters. Finally, a feature gene selection algorithm is designed on the basis of the rough uncertainty metric model and compared with some existing similar algorithms. The experimental results show that the proposed algorithm selects smaller feature gene subsets with higher classification accuracy, which supports the effectiveness of the proposed model.
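
As a rough illustration of the first step described in the abstract, the sketch below builds a fuzzy neighborhood granule by cutting a fuzzy similarity at a neighborhood radius. The similarity function, the thresholding rule, and the names `fuzzy_similarity` and `fuzzy_neighborhood_granule` are assumptions made for illustration; the abstract does not give the authors' exact definitions.

```python
def fuzzy_similarity(x, y):
    """Hypothetical similarity for features scaled to [0, 1]:
    1 minus the largest per-feature gap (an assumed choice)."""
    return 1.0 - max(abs(a - b) for a, b in zip(x, y))


def fuzzy_neighborhood_granule(samples, i, radius):
    """Membership of every sample in the granule of sample i.

    Assumed rule: similarities below 1 - radius are cut to 0,
    the remaining ones are kept as fuzzy memberships.
    """
    threshold = 1.0 - radius
    granule = []
    for y in samples:
        s = fuzzy_similarity(samples[i], y)
        granule.append(s if s >= threshold else 0.0)
    return granule


# Three toy samples with two normalized gene-expression values each.
samples = [(0.10, 0.20), (0.15, 0.25), (0.80, 0.90)]
granule = fuzzy_neighborhood_granule(samples, 0, radius=0.2)
```

With these toy values, the second sample stays in the granule of the first with a membership close to 0.95, while the distant third sample is cut to 0.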

Author(s):  
ROLLY INTAN ◽  
MASAO MUKAIDONO

In 1982, Pawlak proposed the concept of rough sets with the practical purpose of representing the indiscernibility of elements or objects in information systems. Although it is easy to analyze, rough set theory built on a partition induced by an equivalence relation may not provide a realistic view of the relationships between elements in real-world applications. Coverings of, or nonequivalence relations on, the universe can be considered a more realistic model than a partition, a setting in which a generalized model of rough sets has been proposed. In this paper, a weak fuzzy similarity relation is first introduced as a more realistic relation for representing the relationship between two elements of data in real-world applications. The fuzzy conditional probability relation is considered as a concrete example of the weak fuzzy similarity relation. Coverings of the universe are provided by fuzzy conditional probability relations. Generalized concepts of rough approximations and rough membership functions are proposed and defined based on coverings of the universe. Such a generalization is considered a kind of fuzzy rough set. A more generalized fuzzy rough set approximation of a given fuzzy set is proposed and discussed as an alternative way to provide interval-valued fuzzy sets. Their properties are examined.
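
To make the idea of a non-symmetric similarity concrete, here is a minimal sketch of a conditional-probability-style relation between two fuzzy sets. It assumes the common form R(x, y) = |x ∩ y| / |y|, with `min` as the intersection and the sum of memberships as the cardinality; the abstract itself does not spell out the formula, and the function name `fcpr` is hypothetical.

```python
def fcpr(x, y):
    """Assumed form of a fuzzy conditional probability relation:
    R(x, y) = |x ∩ y| / |y|, using min for intersection and the
    sum of membership degrees for cardinality."""
    common = sum(min(a, b) for a, b in zip(x, y))
    size_y = sum(y)
    if size_y == 0:
        raise ValueError("y must have non-zero cardinality")
    return common / size_y


# Two fuzzy sets given as membership vectors over the same three elements.
x = [1.0, 0.5, 0.0]
y = [1.0, 0.5, 0.5]

r_xy = fcpr(x, y)  # degree to which y is covered by x
r_yx = fcpr(y, x)  # degree to which x is covered by y
r_xx = fcpr(x, x)  # reflexivity check
```

Here `r_xy` and `r_yx` differ, which is the kind of asymmetry that a weak fuzzy similarity relation is meant to allow, while `r_xx` is 1.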


Author(s):  
Rolly Intan ◽  
Masao Mukaidono

The fuzzy relational database was proposed for dealing with imprecise data or fuzzy information in a relational database. To provide a more realistic relation for representing the similarity between two imprecise data items, the fuzzy similarity relation needs to be weakened to a weak fuzzy similarity relation, of which the fuzzy conditional probability relation (FCPR, for short) is regarded as a concrete example. In this paper, approximate data querying induced by the FCPR is discussed in the setting of the fuzzy relational database. Approximate data querying that produces a fuzzy query relation is presented in two frameworks, namely dependent inputs and independent inputs. Finally, in relation to the join operator, an approximate join of two or more fuzzy query relations is given for the purpose of extending the query system.


2014 ◽  
Vol 8 ◽  
pp. 2035-2040
Author(s):  
Rogi Jacob ◽  
Sunny Kuriakose A

Algorithms ◽  
2019 ◽  
Vol 12 (2) ◽  
pp. 29 ◽  
Author(s):  
Soufiane Maguerra ◽  
Azedine Boulmakoul ◽  
Lamia Karim ◽  
Hassan Badir

The proliferation of indoor and outdoor tracking devices has led to a vast amount of spatial data. Each object can be described by several trajectories that, once analysed, can yield significant knowledge. In particular, pattern analysis by clustering generic trajectories can give insight into objects sharing the same patterns. Still, sequential clustering approaches fail to handle large volumes of data, hence the need for distributed systems that can infer knowledge in a short time interval. In this paper, we detail an efficient, scalable and distributed execution pipeline for clustering raw trajectories. The clustering is achieved via a fuzzy similarity relation obtained by the transitive closure of a proximity relation. Moreover, the pipeline is integrated in Spark, implemented in Scala, and leverages the Core and GraphX libraries, making use of Resilient Distributed Datasets (RDD) and graph processing. Furthermore, a new simple, but very efficient, partitioning logic has been deployed in Spark and integrated into the execution process. The objective behind this logic is to distribute the load equally among all executors by considering the complexity of the data. In particular, resolving the load balancing issue has significantly reduced the conventional execution time. The performance of the whole distributed process has been evaluated on the Geolife project’s GPS trajectory dataset.
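
The clustering step mentioned in the abstract can be sketched on a single machine as follows. This is a small sequential illustration in Python, not the authors' distributed Spark/Scala pipeline; it assumes the max-min form of transitive closure and an alpha-cut to read clusters off the resulting similarity relation, and all function names are hypothetical.

```python
def max_min_compose(r, s):
    """Max-min composition of two square fuzzy relations."""
    n = len(r)
    return [[max(min(r[i][k], s[k][j]) for k in range(n))
             for j in range(n)] for i in range(n)]


def transitive_closure(r):
    """Repeat R <- max(R, R o R) until nothing changes."""
    closure = [row[:] for row in r]
    while True:
        composed = max_min_compose(closure, closure)
        updated = [[max(a, b) for a, b in zip(row_c, row_n)]
                   for row_c, row_n in zip(closure, composed)]
        if updated == closure:
            return closure
        closure = updated


def alpha_cut_clusters(sim, alpha):
    """Group items whose similarity is at least alpha.
    Valid as a partition because sim is max-min transitive."""
    clusters, assigned = [], set()
    for i in range(len(sim)):
        if i in assigned:
            continue
        members = [j for j in range(len(sim)) if sim[i][j] >= alpha]
        assigned.update(members)
        clusters.append(members)
    return clusters


# Toy proximity relation (reflexive, symmetric) between three trajectories.
proximity = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.6],
    [0.2, 0.6, 1.0],
]
similarity = transitive_closure(proximity)
clusters_strict = alpha_cut_clusters(similarity, 0.7)
clusters_loose = alpha_cut_clusters(similarity, 0.6)
```

In this toy case the closure raises the similarity between trajectories 0 and 2 from 0.2 to 0.6, so a cut at 0.7 gives two clusters and a cut at 0.6 merges all three.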


2007 ◽  
Vol 2007 ◽  
pp. 1-13 ◽  
Author(s):  
E. A. Rady ◽  
M. M. E. Abd El-Monsef ◽  
W. A. Abd El-Latif

The key point of the tolerance relation or similarity relation presented in the literature is to assign a “null” value to all missing attribute values. In other words, a “null” value may be equal to any value in the domain of the attribute. This may seriously affect data analysis and decision analysis, because the missing values are merely “missed”: they do exist and have an influence on the decision. In this paper, we introduce a modified similarity relation, denoted by MSIM, that depends on the number of missing values relative to the total number of defined attributes for each object. According to the definition of MSIM, many problems concerning the generalized decisions are solved. This approach may be widely applicable to scaling in statistics. Also, a new definition of the discernibility matrix, the deduction of decision rules, and reducts in the presence of missing values are obtained.
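
The problem the abstract describes can be seen in a small sketch of the classical tolerance relation, in which a missing value matches anything. The code below illustrates only that baseline and the per-object proportion of missing values; how MSIM actually uses this proportion is not stated in the abstract, so it is not implemented here. The names `MISSING`, `tolerant`, and `missing_ratio` are hypothetical.

```python
MISSING = None  # assumed marker for a missing ("null") attribute value


def tolerant(x, y):
    """Classical tolerance relation: two objects are similar if every
    attribute either agrees or is missing in at least one of them."""
    return all(a is MISSING or b is MISSING or a == b
               for a, b in zip(x, y))


def missing_ratio(x):
    """Share of attributes of an object that are missing."""
    return sum(v is MISSING for v in x) / len(x)


o1 = ("high", MISSING, "yes")
o2 = ("high", "low", "yes")
o3 = ("low", "low", "no")
o4 = (MISSING, MISSING, MISSING)

similar_12 = tolerant(o1, o2)
similar_13 = tolerant(o1, o3)
# An object with no known values is "similar" to everything.
similar_42 = tolerant(o4, o2)
similar_43 = tolerant(o4, o3)
```

The fully unknown object `o4` is tolerant with both `o2` and `o3`, even though those two disagree on every attribute, which is the kind of effect a relation that accounts for the number of missing values is meant to limit.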

