A new approach for textual feature selection based on N-composite isolated labels

2019 ◽  
Vol 26 (2) ◽  
pp. 221-243
Author(s):  
Samir Elloumi

Abstract: Textual Feature Selection (TFS) aims to extract the parts or segments of a text that are most relevant to the information it expresses. The selected features are useful for automatic indexing, summarization, document categorization, knowledge discovery, and so on. Given the huge amount of electronic textual data published daily, many challenges related to semantics and to processing efficiency must be addressed. In this paper, we propose a new approach for TFS based on Formal Concept Analysis. Mainly, we propose to extract textual features by exploring the regularities in a formal context where isolated points exist. We introduce the notion of N-composite isolated points as a set of N words to be considered as a unique textual feature. We show that a small value of N (between 1 and 3) allows extracting significant textual features compared with existing approaches, even without completely covering the initial formal context.

Information ◽  
2018 ◽  
Vol 9 (11) ◽  
pp. 266
Author(s):  
Phillip Santos ◽  
Pedro Ruas ◽  
Julio Neves ◽  
Paula Silva ◽  
Sérgio Dias ◽  
...  

Formal concept analysis (FCA) is widely applied in different areas. However, in some FCA applications the volume of information to be processed can make the analysis infeasible. Thus, the demand for new approaches and algorithms that can process large amounts of information is increasing substantially. This article presents a new algorithm for extracting proper implications from high-dimensional contexts. The proposed algorithm, called ImplicPBDD, is based on the PropIm algorithm and uses a data structure called a binary decision diagram (BDD) to simplify the representation of the formal context and enhance the extraction of proper implications. To analyze the performance of ImplicPBDD, we performed tests using synthetic contexts, varying the number of objects, the number of attributes, and the context density. The experiments show that ImplicPBDD performs better than the original algorithm (up to 80% faster), regardless of the number of attributes, objects, and densities.
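For readers unfamiliar with implications in FCA, the following is a minimal sketch of what it means for an implication to hold in a formal context: every object that has all premise attributes also has all conclusion attributes. It is not ImplicPBDD or PropIm and uses no BDD; the toy context and the helper name `holds` are assumptions made purely for illustration.

```python
# Hypothetical toy context (an assumption): object -> set of attributes.
context = {
    "o1": {"a", "b"},
    "o2": {"a", "b", "c"},
    "o3": {"c"},
}

def holds(premise, conclusion):
    """True if every object having all premise attributes
    also has all conclusion attributes."""
    return all(
        conclusion <= attrs
        for attrs in context.values()
        if premise <= attrs
    )

# Check two candidate implications against the toy context.
a_implies_b = holds({"a"}, {"b"})
c_implies_a = holds({"c"}, {"a"})
print(a_implies_b, c_implies_a)
```

Checking one implication this way scans every object, which is part of why compact context representations matter for high-dimensional data.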


2021 ◽  
Vol 179 (3) ◽  
pp. 295-319
Author(s):  
Longchun Wang ◽  
Lankun Guo ◽  
Qingguo Li

Formal Concept Analysis (FCA) has been proven to be an effective method of restructuring complete lattices and various algebraic domains. In this paper, the notion of contractive mappings over formal contexts is proposed, which can be viewed as a generalization of interior operators on sets into the framework of FCA. Then, by considering subset-selections consistent with contractive mappings, the notions of attribute continuous formal contexts and continuous concepts are introduced. It is shown that the set of continuous concepts of an attribute continuous formal context forms a continuous domain, and every continuous domain can be restructured in this way. Moreover, the notion of F-morphisms is identified to produce a category equivalent to that of continuous domains with Scott continuous functions. The paper also investigates the representations of various subclasses of continuous domains including algebraic domains and stably continuous semilattices.


Information ◽  
2018 ◽  
Vol 9 (9) ◽  
pp. 228 ◽  
Author(s):  
Zuping Zhang ◽  
Jing Zhao ◽  
Xiping Yan

Web page clustering is an important technology for sorting network resources. By extracting features and clustering Web pages according to their similarity, the large amount of information on Web pages can be organized effectively. In this paper, after describing the extraction of Web feature words, methods for calculating feature-word weights are studied in depth. Taking Web pages as objects and Web feature words as attributes, a formal context is constructed for use with formal concept analysis. An algorithm for constructing a concept lattice based on cross data links is proposed and successfully applied. This method can be used to cluster Web pages using the concept lattice hierarchy. Experimental results indicate that the proposed algorithm outperforms previous algorithms with regard to time consumption and clustering quality.
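The construction described above, with Web pages as objects and feature words as attributes, can be illustrated with a minimal sketch. It builds a toy formal context and enumerates its formal concepts by naively closing every attribute subset. This is not the cross-data-link algorithm from the paper; the page names, feature words, and function names are assumptions, and the enumeration is exponential, so it suits only toy-sized contexts.

```python
from itertools import combinations

# Hypothetical toy context (an assumption): Web page -> feature words.
context = {
    "page1": {"lattice", "concept"},
    "page2": {"lattice", "cluster"},
    "page3": {"lattice", "concept", "cluster"},
}
attributes = set().union(*context.values())

def extent(attrs):
    """Pages that contain every feature word in attrs."""
    return {page for page, words in context.items() if attrs <= words}

def intent(pages):
    """Feature words shared by every page in pages."""
    if not pages:
        return set(attributes)
    return set.intersection(*(context[p] for p in pages))

def concepts():
    """Enumerate (extent, intent) pairs by closing each attribute subset."""
    found = set()
    for size in range(len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            ext = extent(set(combo))
            found.add((frozenset(ext), frozenset(intent(ext))))
    return found

all_concepts = concepts()
for ext, itt in sorted(all_concepts, key=lambda c: -len(c[0])):
    print(sorted(ext), sorted(itt))
```

Each concept's extent is a candidate cluster of pages, and the extents ordered by inclusion form the hierarchy that a lattice-based clustering method would traverse.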


2020 ◽  
Vol 39 (3) ◽  
pp. 2783-2790
Author(s):  
Qian Hu ◽  
Ke-Yun Qin

The construction of concept lattices is an important research topic in formal concept analysis. Inspired by multi-granularity rough sets, multi-granularity formal concept analysis has become an active research area. This paper mainly studies methods for constructing concept lattices in a multi-granularity formal context. The relationships between concept-forming operators under different granularities are discussed, and methods for transforming formal concepts between granularities are presented. In addition, approaches for obtaining a coarse-granularity concept lattice from a fine-granularity one, and a fine-granularity concept lattice from a coarse-granularity one, are examined. The related algorithms for generating concept lattices are proposed. The practicability of the method is illustrated by an example.


Information ◽  
2019 ◽  
Vol 10 (2) ◽  
pp. 78 ◽  
Author(s):  
Jingpu Zhang ◽  
Ronghui Liu ◽  
Ligeng Zou ◽  
Licheng Zeng

Formal concept analysis has proven to be a very effective method for data analysis and rule extraction, but how to build formal concept lattices is a difficult and hot topic. In this paper, an efficient and rapid incremental concept lattice construction algorithm is proposed. The algorithm, named FastAddExtent, is seen as a modification of AddIntent in which we improve two fundamental procedures, including fixing the covering relation and searching the canonical generator. The proposed algorithm can locate the desired concept quickly by adding data fields to every concept. The algorithm is depicted in detail, using a formal context to show how the new algorithm works and discussing time and space complexity issues. We also present an experimental evaluation of its performance and comparison with AddExtent. Experimental results show that the FastAddExtent algorithm can improve efficiency compared with the primitive AddExtent algorithm.

