Categorization of Multiple Documents Using Fuzzy Overlapping Clustering Based on Formal Concept Analysis

Author(s):  
Yi-Hui Chen ◽  
Eric Jui-Lin Lu ◽  
Ya-Wen Cheng

Most clustering algorithms build disjoint clusters. However, clusters may overlap because documents in the real world can belong to two or more categories. For example, a paper discussing the Apple Watch may be categorized under 3C, Fashion, or even Clothing and Shoes. Overlapping clustering algorithms have therefore been studied so that a resource can be assigned to one or more clusters. Formal Concept Analysis (FCA), which has many practical applications in information science, has been used in disjoint clustering but has not been studied in overlapping clustering. To make overlapping clustering possible with FCA, we propose an approach that includes two types of transformation. The experimental results show that the proposed fuzzy overlapping clustering performed more efficiently than existing overlapping clustering methods. These positive results confirm the feasibility of the proposed scheme for overlapping clustering. It can also be used in applications such as recommendation systems.

2021 ◽  
Author(s):  
Shaoxia Zhang ◽  
Deyu Li ◽  
Yanhui Zhai

Decision implication is an elementary representation of decision knowledge in formal concept analysis. The decision implication canonical basis (DICB), a complete and nonredundant set of decision implications, is the most compact representation of decision implications. The method based on true premises (MBTP) is currently the most efficient method for DICB generation. In practical applications, however, data changes dynamically, and MBTP has to regenerate the entire DICB, which is inefficient. This paper proposes an incremental algorithm for DICB generation that obtains a new DICB just by modifying and updating the existing one. Experimental results verify that when the samples far outnumber the condition attributes, which is the general case in practical applications, the incremental algorithm is significantly superior to MBTP. Furthermore, we conclude that even for data with fewer samples than condition attributes, when new samples are continually added, the incremental algorithm must also be more efficient than MBTP, because it only needs to modify the existing DICB, which is only a part of the work MBTP performs.
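To make the notion of a decision implication concrete, here is a minimal Python sketch of the standard validity condition: an implication from a set of condition attributes to a set of decision attributes holds when every sample containing the premise also contains the conclusion. This is not the paper's incremental algorithm or MBTP. The function names (`holds`, `still_valid_after_insert`) and the toy data are hypothetical, and the sketch only illustrates why adding a sample requires checking existing implications against that one new sample.

```python
def holds(rows, premise, conclusion):
    """True if every sample containing the premise also contains the conclusion."""
    return all(
        conclusion <= decisions
        for conditions, decisions in rows
        if premise <= conditions
    )


def still_valid_after_insert(implications, new_row):
    """Keep only the implications that the single new sample does not violate."""
    conditions, decisions = new_row
    return [
        (premise, conclusion)
        for premise, conclusion in implications
        if not (premise <= conditions) or conclusion <= decisions
    ]


# Hypothetical decision context: (condition attributes, decision attributes).
rows = [
    ({"a", "b"}, {"x"}),
    ({"a"}, {"x", "y"}),
    ({"b", "c"}, {"y"}),
]

# Two implications that hold on the rows above.
implications = [({"a"}, {"x"}), ({"c"}, {"y"})]

# A new sample that contradicts the first implication but not the second.
new_row = ({"a", "c"}, {"y"})
surviving = still_valid_after_insert(implications, new_row)
```

An implication that survives this filter is still valid on the enlarged data, but the surviving set is not necessarily a canonical basis; restoring completeness and nonredundancy is the part the paper's algorithm addresses.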


Author(s):  
Adriana M. Guimaraes de Farias ◽  
Marcos E. Cintra ◽  
Angelica C. Felix ◽  
Danniel L. Cavalcante

Public security has always been an important research topic. Machine learning algorithms have been used to extract knowledge from criminal databases, which usually maintain records in order to generate statistics. The automatic extraction of knowledge from such databases supports the planning and improvement of strategies to prevent and combat crime. Accordingly, this work presents different models related to public security. The models are based on clustering algorithms, on formal concept analysis techniques, and on the analysis of crime record data collected in the city of Mossoro, Brazil. Two types of models are generated: (i) concept lattices with crime patterns; (ii) criminal hot spot maps. We also produced a ranking of dangerousness for the neighbourhoods of Mossoro. The Fuzzy K-Means clustering algorithm was used to obtain criminal hot spots, which indicate locations with high crime incidence. Formal concept analysis was used to extract visual models describing patterns that characterize criminal activities. These models take the form of concept lattices, whose graphical displays can be used to define strategies to combat and prevent crime. The models were first evaluated empirically and then analysed by public security experts, who provided positive feedback on their practical use. The automatically generated models presented in this paper have many advantages: they take little time to produce, a variety of models can be generated for specific regions and for periods of days, months, or years, their graphical form allows fast analysis, and they draw on large amounts of data, all of which would be infeasible for human experts to achieve manually.
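As an illustration of how fuzzy clustering assigns a location to several hot spots at once, the following Python sketch computes the standard fuzzy c-means membership degrees of one point with respect to fixed cluster centres. It is a sketch under assumptions, not the authors' implementation: the function name `memberships`, the fuzzifier default `m=2.0`, and the coordinates are hypothetical, and the abstract does not state which parameters were used.

```python
import math


def memberships(point, centers, m=2.0):
    """Fuzzy c-means membership of `point` in each cluster centre.

    Uses u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), where d_i is the
    Euclidean distance from the point to centre i and m > 1 is the fuzzifier.
    """
    dists = [math.dist(point, c) for c in centers]

    # A point lying exactly on one or more centres belongs fully to them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]

    power = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** power for d_k in dists)
        for d_i in dists
    ]


# Hypothetical incident location and two hypothetical hot spot centres.
even = memberships((0.0, 0.0), [(-1.0, 0.0), (1.0, 0.0)])
skewed = memberships((0.0, 0.0), [(1.0, 0.0), (3.0, 0.0)])
```

The memberships of a point always sum to one, so a location midway between two centres is shared equally, while a location closer to one centre is weighted toward it.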


Author(s):  
Cherukuri Kumar

Knowledge discovery in data using formal concept analysis and random projections

In this paper, our objective is to propose a random-projection-based formal concept analysis for knowledge discovery in data. We demonstrate the implementation of the proposed method on two real-world healthcare datasets. Formal Concept Analysis (FCA) is a mathematical framework that offers a conceptual knowledge representation through hierarchical conceptual structures called concept lattices. However, complexity plays a major role in the design of a concept lattice.
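The complexity concern can be seen in a brute-force enumeration of formal concepts. The Python sketch below applies the two FCA derivation operators to every subset of objects, so its cost grows exponentially with the number of objects. It does not include the random projection step, which the abstract does not detail. The function names and the toy context are hypothetical.

```python
from itertools import combinations


def intent(context, objects, attributes):
    """Attributes shared by every object in `objects`."""
    shared = set(attributes)
    for obj in objects:
        shared &= context[obj]
    return shared


def extent(context, attrs):
    """Objects that have every attribute in `attrs`."""
    return {obj for obj, has in context.items() if attrs <= has}


def concepts(context):
    """Enumerate all formal concepts (extent, intent) by closing object subsets."""
    attributes = set().union(*context.values())
    objects = list(context)
    found = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            b = intent(context, subset, attributes)
            a = extent(context, b)
            found.add((frozenset(a), frozenset(b)))
    return found


# Hypothetical formal context: object -> set of attributes it has.
context = {
    "p1": {"fever", "cough"},
    "p2": {"fever"},
    "p3": {"cough", "fatigue"},
}
lattice = concepts(context)
```

For this three-object context the enumeration visits eight subsets and yields six concepts; practical algorithms avoid visiting every subset, and attribute reduction shrinks the context before the lattice is built.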


2021 ◽  
Vol 179 (3) ◽  
pp. 295-319
Author(s):  
Longchun Wang ◽  
Lankun Guo ◽  
Qingguo Li

Formal Concept Analysis (FCA) has been proven to be an effective method of restructuring complete lattices and various algebraic domains. In this paper, the notion of contractive mappings over formal contexts is proposed, which can be viewed as a generalization of interior operators on sets into the framework of FCA. Then, by considering subset-selections consistent with contractive mappings, the notions of attribute continuous formal contexts and continuous concepts are introduced. It is shown that the set of continuous concepts of an attribute continuous formal context forms a continuous domain, and every continuous domain can be restructured in this way. Moreover, the notion of F-morphisms is identified to produce a category equivalent to that of continuous domains with Scott continuous functions. The paper also investigates the representations of various subclasses of continuous domains including algebraic domains and stably continuous semilattices.

