Recent Developments in Boolean Matrix Factorization

Author(s):  
Pauli Miettinen ◽  
Stefan Neumann

The goal of Boolean Matrix Factorization (BMF) is to approximate a given binary matrix as the product of two low-rank binary factor matrices, where the product of the factor matrices is computed under the Boolean algebra. While the problem is computationally hard, it is also attractive because the binary nature of the factor matrices makes them highly interpretable. In the last decade, BMF has received a considerable amount of attention in the data mining and formal concept analysis communities and, more recently, the machine learning and theory communities have also started studying BMF. In this survey, we give a concise summary of the efforts of all of these communities and raise some open questions which, in our opinion, require further investigation.
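To make the Boolean product concrete, here is a minimal sketch in plain Python. The function names `boolean_product` and `reconstruction_error` and the toy matrices are illustrative assumptions, not taken from the survey. It shows that under Boolean algebra 1 + 1 = 1, so overlapping factors do not produce entries larger than 1.

```python
def boolean_product(B, C):
    """Boolean matrix product: entry (i, j) is 1 iff there is some l
    with B[i][l] == 1 and C[l][j] == 1 (logical OR of ANDs)."""
    k = len(C)
    return [[int(any(B[i][l] and C[l][j] for l in range(k)))
             for j in range(len(C[0]))]
            for i in range(len(B))]


def reconstruction_error(A, B, C):
    """Number of entries in which A differs from the Boolean product of B and C."""
    P = boolean_product(B, C)
    return sum(a != p
               for row_a, row_p in zip(A, P)
               for a, p in zip(row_a, row_p))


# Toy 3x3 binary matrix and a hypothetical rank-2 Boolean factorization.
A = [[1, 1, 0],
     [1, 1, 1],
     [0, 1, 1]]
B = [[1, 0],
     [1, 1],
     [0, 1]]
C = [[1, 1, 0],
     [0, 1, 1]]

print(boolean_product(B, C))
print(reconstruction_error(A, B, C))
```

In this example the middle entry is covered by both factors; the ordinary matrix product would give 2 there, while the Boolean product gives 1, so the factorization reproduces `A` exactly.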

2015 ◽  
Vol 713-715 ◽  
pp. 1970-1973
Author(s):  
Chun Liu ◽  
Dong Xing Wang ◽  
Kun Tan

A concept lattice in essence describes the links between objects and attributes and demonstrates the generalization and specialization of concepts. The corresponding Hasse diagrams realize the visualization of the data. At present, formal concept analysis has been extensively studied and applied to many areas, such as information retrieval, machine learning and software engineering. For these reasons, it is necessary to study concept lattice methods for data mining. This paper is divided into three parts: the first part introduces the basic concepts of data mining, the second part introduces the basic theory of concept lattices, and the last part focuses on the application of concept lattices in data mining.
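As a small illustration of how a concept lattice links objects and attributes, the sketch below enumerates the formal concepts of a toy context by brute force. The context, the function names and the enumeration strategy are assumptions made for illustration; they are not the method of the paper, and the brute-force approach is only practical for very small contexts.

```python
from itertools import combinations


def common_attributes(objects, context):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    shared = set().union(*context.values())
    for obj in objects:
        shared &= context[obj]
    return shared


def common_objects(attributes, context):
    """Objects that have every attribute in `attributes`."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def formal_concepts(context):
    """All (extent, intent) pairs that are closed under the two derivation operators."""
    concepts = set()
    objects = list(context)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset, context)
            extent = common_objects(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


# Hypothetical toy context: object -> set of attributes.
context = {
    "duck": {"flies", "swims"},
    "eagle": {"flies"},
    "trout": {"swims"},
}

concepts = formal_concepts(context)
for extent, intent in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

Ordering the resulting concepts by inclusion of their extents gives the generalization and specialization relation that a Hasse diagram draws.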


2020 ◽  
Vol 34 (04) ◽  
pp. 6086-6093 ◽  
Author(s):  
Changlin Wan ◽  
Wennan Chang ◽  
Tong Zhao ◽  
Mengya Li ◽  
Sha Cao ◽  
...  

Boolean matrices have been used to represent digital information in many fields, including bank transactions, crime records, natural language processing, and protein-protein interaction. Boolean matrix factorization (BMF) aims to approximate a binary matrix as the Boolean product of two low-rank Boolean matrices, which can reveal a great deal about the patterns of relationships between features and samples. Inspired by binary matrix permutation theories and geometric segmentation, we developed a fast and efficient BMF approach called MEBF (Median Expansion for Boolean Factorization). MEBF adopts a heuristic approach to locate binary patterns that appear as submatrices dense in 1's. At each iteration, MEBF permutes the rows and columns such that the permuted matrix is approximately Upper Triangular-Like (UTL) with the so-called Simultaneous Consecutive-ones Property (SC1P). The largest submatrix dense in 1's then lies in the upper triangular area of the permuted matrix, and its location is determined by a geometric segmentation of a triangle. We compared MEBF with other state-of-the-art approaches on data scenarios with different density and noise levels. MEBF achieved lower reconstruction error, higher computational efficiency, and more accurate density patterns than popular methods such as ASSO, PANDA and Message Passing. We demonstrated the application of MEBF on both binary and non-binary data sets, and revealed its further potential in knowledge retrieval and data denoising.


2013 ◽  
Vol 9 (1) ◽  
pp. 36-53
Author(s):  
Evis Trandafili ◽  
Marenglen Biba

Social networks have outstanding marketing value, and developing data mining methods for viral marketing is a hot topic in the research community. However, most social networks still cannot be fully analyzed and understood, owing to their prohibitive sizes and the inability of traditional machine learning and data mining approaches to deal with the new dimension of the learning process introduced by the large-scale environment in which the data are produced. On the one hand, the birth and evolution of such networks have posed outstanding challenges for the learning and mining community; on the other hand, they have opened the possibility of very powerful business applications. However, little understanding exists regarding these business applications and the potential of social network mining to boost marketing. This paper presents a review of the most important state-of-the-art approaches in the machine learning and data mining community regarding the analysis of social networks and their business applications. The authors review the problems related to social networks and describe recent developments in the area, discussing important achievements in the analysis of social networks and outlining future work. The focus of the review is not only on the technical aspects of the learning and mining approaches applied to social networks but also on the business potential of such methods.


Author(s):  
Nida Meddouri ◽  
Mondher Maddouri

Knowledge discovery in databases (KDD) aims to exploit the large amounts of data collected every day in various fields of computing application. The idea is to extract hidden knowledge from a set of data. It gathers several tasks that together constitute a process, such as data selection, pre-processing, transformation, data mining, and visualization. Data mining techniques include supervised classification and unsupervised classification. Classification consists of predicting the class of new instances with a classifier built on learning data of labeled instances. Several approaches have been proposed, such as the induction of decision trees, Bayes, nearest neighbor search, neural networks, support vector machines, and formal concept analysis. Learning formal concepts always refers to the mathematical structure of the concept lattice. This article presents the state of the art in formal concept analysis classifiers. The authors present different ways to calculate the closure operators from nominal data and also present a new approach that builds only a part of the lattice, including the best concepts. This approach is based on Dagging (an ensemble method), which generates an ensemble of classifiers, each representing a formal concept, and combines them by a voting rule. Experimental results are given to demonstrate the efficiency of the proposed method.

