A Feature Selection Approach in the Study of Azorean Proverbs

Author(s):  
Luís Cavique ◽  
Armando B. Mendes ◽  
Matthias Funk ◽  
Jorge M. A. Santos

A paremiologic (study of proverbs) case is presented as part of a wider project based on data collected among the Azorean population. Given the considerable distance between the Azores islands, the authors hypothesize that the proverbs known on each island differ significantly, so that an interviewee's native island can be identified from his or her knowledge of proverbs. In this chapter, a feature selection algorithm that combines Rough Sets and the Logical Analysis of Data (LAD) is presented. The algorithm, named LAID (Logical Analysis of Inconsistent Data), deals with noisy data, and the authors believe that it establishes an important link between two different schools with similar approaches. The algorithm was applied to a real-world dataset collected through thousands of interviews of Azoreans, involving an initial set of twenty-two thousand Portuguese proverbs.
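The abstract does not spell out LAID's steps, so the following is only a minimal sketch of the set-covering idea that LAD-style feature selection is commonly built on: keep adding features until every pair of observations with different classes differs on at least one selected feature. The function name, the toy data, and the choice to simply skip inconsistent pairs (identical observations with different classes) are assumptions for illustration, not the authors' method.

```python
from itertools import combinations

def greedy_discriminating_features(rows, labels):
    """Greedily pick feature indices until every separable pair of rows
    with different labels differs on at least one picked feature."""
    n_features = len(rows[0])
    # For each pair of rows with different labels, record the features that tell them apart.
    pairs = []
    for i, j in combinations(range(len(rows)), 2):
        if labels[i] != labels[j]:
            diff = {f for f in range(n_features) if rows[i][f] != rows[j][f]}
            if diff:  # an empty set is an inconsistent pair; skipped here (assumption)
                pairs.append(diff)
    selected = []
    while pairs:
        # Pick the feature that separates the most still-unseparated pairs.
        best = max(range(n_features), key=lambda f: sum(f in d for d in pairs))
        selected.append(best)
        pairs = [d for d in pairs if best not in d]
    return selected

# Hypothetical toy data: each column is "knows proverb k" (1/0), labels are islands.
rows = [(1, 0, 1), (1, 1, 1), (0, 0, 1), (0, 1, 1)]
labels = ["A", "A", "B", "B"]
print(greedy_discriminating_features(rows, labels))
```

In this toy case the first proverb alone separates the two islands, so the other two columns are never selected.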

Data mining is an important research area with wide future scope; it is used to uncover hidden information in data. In clustering, the main part is feature selection, which identifies a subset of features from the full set and is therefore regarded as a necessary process. The selected features also yield approximate results in accordance with the original set of features used in this kind of approach. The main idea of this paper is to present the outcome of clustering features. The paper describes clusters and the clustering process. For processing large datasets, several further concepts are helpful and important in the clustering process. The feature selection algorithm that affects the entire clustering process is based on the map-reduce concept. The time needed to find effective features matters here, and quality feature subsets are what provide effectiveness. The paper discusses a map-reduce feature selection approach, its algorithm, and its implementation framework.


Author(s):  
Kechika. S ◽  
Sapthika. B ◽  
Keerthana. B ◽  
Abinaya. S ◽  
Abdulfaiz. A

We study the problem of clustering data objects and have implemented a new algorithm for clustering data using a map-reduce approach. In clustering, the main part is feature selection, which identifies a subset of features; feature selection is therefore considered an important process. The selected features also yield approximate results in accordance with the original set of features used in this type of approach. The main idea of this paper is to present the outcome of clustering features. The paper also describes clusters and the clustering process. For processing large datasets, several further concepts are helpful and important in the clustering process. The feature selection algorithm that affects the entire clustering process is based on the map-reduce concept, since feature selection or extraction is also used in the map-reduce approach. Time complexity is the key component, since efficiency is judged on this criterion: the time required to find effective features, where the quality of the feature subsets measures effectiveness. A map-reduce feature selection approach based on these criteria is proposed and evaluated in this paper.
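The abstract does not state which selection criterion the map-reduce algorithm uses. As a rough sketch of the general pattern only, the code below assumes a simple variance ranking: the map step computes partial statistics per data partition, the reduce step adds them up, and the features with the highest variance are kept. The function names, the variance criterion, and the sample partitions are all assumptions for illustration.

```python
from functools import reduce

def map_partition(rows):
    """Map step: per-feature (count, sum, sum of squares) for one partition."""
    n_features = len(rows[0])
    return [
        (len(rows), sum(r[f] for r in rows), sum(r[f] ** 2 for r in rows))
        for f in range(n_features)
    ]

def merge(a, b):
    """Reduce step: add two partial statistics lists feature by feature."""
    return [tuple(x + y for x, y in zip(sa, sb)) for sa, sb in zip(a, b)]

def top_k_by_variance(partitions, k):
    """Rank features by population variance (assumed criterion) and keep k."""
    stats = reduce(merge, map(map_partition, partitions))
    variances = [s2 / n - (s / n) ** 2 for n, s, s2 in stats]
    ranked = sorted(range(len(variances)), key=lambda f: variances[f], reverse=True)
    return ranked[:k]

# Hypothetical data split into two partitions, as separate workers might hold it.
partitions = [
    [(1.0, 5.0, 0.0), (1.0, 7.0, 10.0)],
    [(1.0, 6.0, 0.0), (1.0, 6.0, 10.0)],
]
print(top_k_by_variance(partitions, 2))
```

Because the partial statistics are plain sums, the reduce step is associative and each partition can be processed independently, which is what makes this kind of scoring fit the map-reduce model.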


2010 ◽  
Vol 4 (8) ◽  
Author(s):  
Vandar Kuzhali Jagannathan ◽  
Rajendran Govind ◽  
Srinivasan V ◽  
Siva Kumar Ganapathi

2015 ◽  
Vol 2015 ◽  
pp. 1-10 ◽  
Author(s):  
Zilin Zeng ◽  
Hongjun Zhang ◽  
Rui Zhang ◽  
Youliang Zhang

Feature interaction has gained considerable attention recently. However, many feature selection methods that consider interaction are designed only for categorical features. This paper proposes a mixed feature selection algorithm based on neighborhood rough sets that can be used to search for interacting features. Feature relevance, feature redundancy, and feature interaction are defined in the framework of neighborhood rough sets; a neighborhood interaction weight factor reflecting whether a feature is redundant or interactive is proposed; and a neighborhood interaction weight based feature selection algorithm (NIWFS) is put forward. To evaluate the performance of the proposed algorithm, we compare NIWFS with three other feature selection algorithms, INTERACT, NRS, and NMI, in terms of classification accuracy and the number of selected features with C4.5 and IB1. The results from ten real-world datasets indicate that NIWFS not only deals with mixed datasets directly, but also reduces the dimensionality of the feature space with the highest average accuracies.
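The abstract does not define the neighborhood interaction weight factor, so NIWFS itself is not reproduced here. The sketch below only illustrates the dependency measure usually associated with neighborhood rough sets: the fraction of samples whose delta-neighborhood, measured on a chosen feature subset, contains a single class. It assumes numeric features and Euclidean distance; the function name, `delta`, and the toy data are illustrative assumptions.

```python
import math

def neighborhood_dependency(samples, labels, features, delta):
    """Fraction of samples whose delta-neighborhood on `features` is label-pure."""
    def dist(a, b):
        # Euclidean distance restricted to the chosen feature indices.
        return math.sqrt(sum((a[f] - b[f]) ** 2 for f in features))

    consistent = 0
    for i, x in enumerate(samples):
        neighbors = [j for j, y in enumerate(samples) if dist(x, y) <= delta]
        if all(labels[j] == labels[i] for j in neighbors):
            consistent += 1
    return consistent / len(samples)

# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
samples = [(0.1, 0.9), (0.2, 0.1), (0.8, 0.8), (0.9, 0.2)]
labels = [0, 0, 1, 1]
print(neighborhood_dependency(samples, labels, [0], 0.3))
print(neighborhood_dependency(samples, labels, [1], 0.3))
```

A feature subset with dependency close to 1 keeps the classes apart within every neighborhood, which is the kind of signal a relevance definition in this framework can build on.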

