Dist Frequent Next Neighbours: A Distributed Galois Lattice Algorithm for Frequent Closed Itemsets Extraction

2021 ◽  
Author(s):  
Naomie Sandra Noumi Sandji ◽  
Djamal Abdoul Nasser Seck

The general purpose of this paper is to propose a distributed version of frequent closed itemset extraction in the context of big data. The goal is to extract frequent closed itemsets efficiently, since frequent closed itemsets form a basis from which frequent itemsets can be derived. To achieve this goal, we have extended the Galois lattice (or concept lattice) technique to this context. Galois lattices are an efficient alternative for extracting closed itemsets, which are in turn a useful starting point for generating frequent itemsets. We therefore propose Dist Frequent Next Neighbour, a distributed version of the Frequent Next Neighbour concept lattice construction algorithm, which considerably reduces extraction time by parallelizing the computation of frequent concepts (closed itemsets).
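
To make the notion of a frequent closed itemset concrete, here is a minimal single-machine sketch. It is not the Dist Frequent Next Neighbour algorithm (the abstract does not describe its steps); it is a brute-force enumeration over a hypothetical toy database, and the names `extent`, `closure` and `frequent_closed_itemsets` are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy transaction database, for illustration only.
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "b", "d"},
    {"c", "d"},
]


def extent(itemset, transactions):
    """Indices of the transactions that contain every item of `itemset`."""
    return {i for i, t in enumerate(transactions) if itemset <= t}


def closure(itemset, transactions):
    """Largest itemset shared by all transactions that contain `itemset`."""
    rows = extent(itemset, transactions)
    if not rows:
        # No supporting transaction: by convention the closure is every item.
        return frozenset().union(*transactions)
    return frozenset(set.intersection(*(transactions[i] for i in rows)))


def frequent_closed_itemsets(transactions, min_support):
    """Brute force: keep frequent itemsets that equal their own closure."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(len(items) + 1):
        for candidate in combinations(items, size):
            candidate = frozenset(candidate)
            support = len(extent(candidate, transactions))
            if support >= min_support and closure(candidate, transactions) == candidate:
                result[candidate] = support
    return result
```

With `min_support=2` on this toy data, the closed itemsets are the empty set (support 4), `{a, b}` (support 3), `{c}` and `{d}` (support 2 each). The itemset `{a}` is frequent but not closed, because every transaction containing `a` also contains `b`. The enumeration is exponential in the number of items, which is the cost that lattice-based and distributed algorithms aim to avoid.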

2021 ◽  
Vol 82 (2) ◽  
Author(s):  
Sándor Radeleczki

Abstract: G. Czédli proved that the blocks of any compatible tolerance T of a lattice L can be ordered in such a way that they form a lattice L/T, called the factor lattice of L modulo T. Here we show that the Dedekind–MacNeille completion of the lattice L/T is isomorphic to the concept lattice of the context (L, L, R), where R stands for the reflexive weak ordered relation $$\mathord{\le} \circ T$$. Weak ordered relations constitute the generalization of the ordered relations introduced by S. Valentini. Reflexive weak ordered relations can be characterized as compatible reflexive relations $$R \subseteq L^{2}$$ satisfying $$R = \mathord{\le} \circ R \circ \mathord{\le}$$.
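
For readers unfamiliar with the term, the concept lattice of a context can be recalled as follows. This is the standard definition from formal concept analysis, written for a generic context $$(G, M, I)$$; the symbols $$G$$, $$M$$, $$I$$, $$A$$ and $$B$$ are notation introduced here, not taken from the abstract. In the abstract's setting one takes $$G = M = L$$ and $$I = R$$.

```latex
% Derivation operators of a formal context (G, M, I), with I \subseteq G \times M
A' = \{\, m \in M : g \mathrel{I} m \ \text{for all } g \in A \,\}, \qquad A \subseteq G,
\\
B' = \{\, g \in G : g \mathrel{I} m \ \text{for all } m \in B \,\}, \qquad B \subseteq M.

% A formal concept is a pair (A, B) with A' = B and B' = A.
% Concepts are ordered by inclusion of their first components:
(A_1, B_1) \le (A_2, B_2) \iff A_1 \subseteq A_2 \iff B_2 \subseteq B_1 .
```

The set of all formal concepts under this order is a complete lattice, which is why it is a natural candidate for representing a completion.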


2009 ◽  
Vol 12 (11) ◽  
pp. 49-56
Author(s):  
Bac Hoai Le ◽  
Bay Dinh Vo

In traditional association rule mining, finding all association rules in a database that satisfy minSup and minConf becomes problematic when the number of frequent itemsets is large. It is therefore necessary to have a method that mines fewer rules while still covering all the rules of the traditional mining method. One such approach is the mining of essential rules: it keeps only the rules whose left-hand side is minimal and whose right-hand side is maximal (with respect to the parent-child relationship). In this paper, we propose a new algorithm for mining essential rules from the lattice of frequent closed itemsets in order to reduce the rule mining time. We use the parent-child relationships already present in the lattice to reduce the cost of checking these relationships, which in turn reduces the rule mining time.
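
The "minimal left-hand side, maximal right-hand side" criterion can be illustrated with a naive pairwise filter. This is only a sketch under stated assumptions: it does not use the frequent closed itemsets lattice that the paper's algorithm relies on, the function name `essential_rules` is hypothetical, and every input rule is assumed to already satisfy minSup and minConf.

```python
def essential_rules(rules):
    """Keep the rules that no other rule dominates.

    A rule (lhs -> rhs) is dominated when another rule has a left-hand side
    that is a subset of lhs and a right-hand side that is a superset of rhs.
    `rules` is an iterable of (lhs, rhs) pairs of item collections.
    """
    rules = [(frozenset(lhs), frozenset(rhs)) for lhs, rhs in rules]
    kept = []
    for lhs, rhs in rules:
        dominated = any(
            (other_lhs, other_rhs) != (lhs, rhs)
            and other_lhs <= lhs
            and other_rhs >= rhs
            for other_lhs, other_rhs in rules
        )
        if not dominated:
            kept.append((lhs, rhs))
    return kept


# Hypothetical rules, assumed to have passed the minSup/minConf thresholds.
RULES = [
    (["a"], ["b"]),
    (["a"], ["b", "c"]),
    (["a", "b"], ["c"]),
    (["a"], ["c"]),
]
```

On `RULES`, only `a -> b, c` survives: the three other rules have a left-hand side that contains `a` and a right-hand side contained in `{b, c}`. The filter compares every pair of rules, so its cost is quadratic in the number of rules; avoiding that cost is the stated motivation for working on the lattice instead.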


2019 ◽  
Vol 8 (2) ◽  
pp. 3885-3889

Closed itemsets are frequent itemsets that uniquely determine the exact frequency of all frequent itemsets. Closed itemsets reduce the massive output to a smaller magnitude without redundancy. In this paper, we present PSS-MCI, an efficient candidate-generation-based approach for mining all closed itemsets. It enumerates closed itemsets using a hash tree, candidate generation, and superset and subset checking. It uses a partition-based strategy to avoid unnecessary computation for itemsets that are not useful. Using an efficient algorithm, it determines all closed itemsets in a single scan over the database. However, several unnecessary itemsets are hashed into the buckets. To overcome this limitation, heuristics are incorporated into the PSS-MCI algorithm. Empirical evaluation shows that PSS-MCI outperforms candidate-generation-based and other approaches. Further, PSS-MCI explores all closed itemsets.
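
The first sentence states that closed itemsets determine the frequency of every frequent itemset. A minimal sketch of why: the support of an itemset equals the largest support among the closed itemsets that contain it. The function below is not part of PSS-MCI; its name and the toy supports are assumptions for illustration.

```python
def support_from_closed(itemset, closed_supports):
    """Recover the support of `itemset` from closed itemsets alone.

    `closed_supports` maps each frequent closed itemset (a frozenset) to its
    support. Returns 0 when no frequent closed itemset contains `itemset`,
    meaning the itemset is below the support threshold.
    """
    itemset = frozenset(itemset)
    covering = [s for closed, s in closed_supports.items() if itemset <= closed]
    return max(covering, default=0)


# Hypothetical frequent closed itemsets (minimum support 2) of the toy
# database {a,b,c}, {a,b}, {a,b,d}, {c,d}.
CLOSED = {
    frozenset(): 4,
    frozenset({"a", "b"}): 3,
    frozenset({"c"}): 2,
    frozenset({"d"}): 2,
}
```

For example, `support_from_closed({"a"}, CLOSED)` returns 3, because the only closed itemset containing `a` is `{a, b}`, while `support_from_closed({"a", "c"}, CLOSED)` returns 0, since `{a, c}` occurs in a single transaction and is therefore infrequent.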


2012 ◽  
Vol 6-7 ◽  
pp. 625-630 ◽  
Author(s):  
Hong Sheng Xu

The concept lattice, which is derived from a formal context through the partial order relation between concepts, is the core data structure of formal concept analysis. The association rule mining process includes two phases: first, find all the frequent itemsets in the data collection; second, use these frequent itemsets to generate association rules. This paper analyzes association rule mining algorithms such as Apriori and FP-Growth. The paper presents the construction of a search engine based on formal concept analysis and association rule mining. Experimental results show that the proposed algorithm has high efficiency.
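
The two phases described above can be sketched in a few lines. This is a simplified Apriori-style level-wise search followed by rule generation, written for illustration only; it is not the algorithm proposed in the paper, and the function names and thresholds are assumptions.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Phase 1: level-wise search for itemsets with support >= min_support."""
    items = sorted(set().union(*transactions))
    level = [frozenset([item]) for item in items]
    frequent = {}
    while level:
        counts = {c: sum(1 for t in transactions if c <= t) for c in level}
        survivors = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(survivors)
        # Join step: combine survivors that differ by exactly one item.
        level = list(
            {a | b for a in survivors for b in survivors if len(a | b) == len(a) + 1}
        )
    return frequent


def association_rules(frequent, min_confidence):
    """Phase 2: split each frequent itemset into lhs -> rhs and keep confident rules."""
    rules = []
    for itemset, support in frequent.items():
        for size in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), size):
                lhs = frozenset(lhs)
                # Every subset of a frequent itemset is frequent, so lhs is a key.
                confidence = support / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, confidence))
    return rules


# Hypothetical toy data.
TRANSACTIONS = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}, {"c", "d"}]
FREQUENT = frequent_itemsets(TRANSACTIONS, min_support=2)
RULES = association_rules(FREQUENT, min_confidence=0.8)
```

On this data the only frequent itemset with more than one item is `{a, b}` (support 3), which yields the rules `a -> b` and `b -> a`, each with confidence 1.0.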


2020 ◽  
Vol 54 (4) ◽  
pp. 409-435
Author(s):  
Paolo Manghi ◽  
Claudio Atzori ◽  
Michele De Bonis ◽  
Alessia Bardi

Purpose: Several online services offer functionalities to access information from “big research graphs” (e.g. Google Scholar, OpenAIRE, Microsoft Academic Graph), which correlate scholarly/scientific communication entities such as publications, authors, datasets, organizations, projects, funders, etc. Depending on the target users, access can vary from searching and browsing content to consuming statistics for monitoring and providing feedback. Such graphs are populated over time as aggregations of multiple sources and therefore suffer from major entity-duplication problems. Although deduplication of graphs is a known and current problem, existing solutions are dedicated to specific scenarios, operate on flat collections, address local topology-driven challenges and therefore cannot be reused in other contexts.
Design/methodology/approach: This work presents GDup, an integrated, scalable, general-purpose system that can be customized to address deduplication over arbitrary large information graphs. The paper presents its high-level architecture and its implementation as a service used within the OpenAIRE infrastructure system, and reports figures from real-case experiments.
Findings: GDup provides the functionalities required to deliver a fully-fledged entity deduplication workflow over a generic input graph. The system offers out-of-the-box Ground Truth management, acquisition of feedback from data curators and algorithms for identifying and merging duplicates, to obtain an output disambiguated graph.
Originality/value: To our knowledge, GDup is the only system in the literature that offers an integrated and general-purpose solution for the deduplication of graphs while targeting big data scalability issues. GDup is today one of the key modules of the OpenAIRE infrastructure production system, which monitors Open Science trends on behalf of the European Commission, national funders and institutions.
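
The "identifying and merging duplicates" step can be pictured with a small sketch. This is not GDup's implementation or API, which the abstract does not detail. It assumes, purely for illustration, that records are compared by normalized title similarity using Python's standard `difflib`, and that matching pairs are merged into groups with a union-find structure. All names and the threshold are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(title):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(title.lower().split())


def find_duplicate_groups(records, threshold=0.9):
    """Group record ids whose normalized titles are similar enough.

    `records` maps a record id to its title. Returns a list of sets of ids,
    one set per group that contains at least two records.
    """
    parent = {rid: rid for rid in records}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(sorted(records), 2):
        ratio = SequenceMatcher(None, normalize(records[a]), normalize(records[b])).ratio()
        if ratio >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for rid in records:
        groups.setdefault(find(rid), set()).add(rid)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical records.
RECORDS = {
    "p1": "Galois Lattices",
    "p2": "galois   lattices",
    "p3": "Graph deduplication",
}
```

Here `find_duplicate_groups(RECORDS)` returns a single group containing `p1` and `p2`. Comparing all pairs is quadratic in the number of records, so a system aimed at big data scalability would need some way to limit the candidate pairs; how GDup does this is not stated in the abstract.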


2019 ◽  
Vol 892 ◽  
pp. 157-167
Author(s):  
Fatimah Audah Md Zaki ◽  
Nurul Fariza Zulkurnain

Mining closed frequent itemsets requires the algorithm to mine the frequent itemsets and then determine their closures. The efficiency of closure computation is very important, as it determines the total mining time and the required memory. Over the years, many closure computation methods have been proposed to achieve these goals. However, to the best of our knowledge, there is no suitable method that can be adapted for algorithms that enumerate the rowset lattice, an approach that is effective for biological datasets. Therefore, this paper proposes a closure computation method and compares it with the method used in the BVBUC algorithm. Finally, BVBUC_I is proposed, and the performance of these algorithms is evaluated using two synthetic datasets and three real datasets. The results of these tests demonstrate the efficiency of the proposed method.
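
As a rough picture of closure computation when enumerating rowsets rather than itemsets, the sketch below intersects the rows of a rowset to obtain a closed itemset, then collects every row containing that itemset to obtain the closed rowset. It is neither the BVBUC method nor the proposed BVBUC_I method. Encoding each row as an integer bitmask is an assumption made here for compactness, and the function name is hypothetical.

```python
def rowset_closure(rowset, rows):
    """Closure of a set of rows.

    `rows` is a list of integer bitmasks (bit k set means item k is present).
    `rowset` is a non-empty iterable of row indices.
    Returns (closed_itemset_mask, closed_rowset).
    """
    rowset = list(rowset)
    if not rowset:
        raise ValueError("rowset must be non-empty")
    # Items shared by every row of the rowset: bitwise AND of their masks.
    common = rows[rowset[0]]
    for r in rowset[1:]:
        common &= rows[r]
    # Every row containing all shared items belongs to the closed rowset.
    closed_rows = frozenset(
        i for i, mask in enumerate(rows) if (mask & common) == common
    )
    return common, closed_rows


# Hypothetical data with items a=1, b=2, c=4, d=8.
ROWS = [
    0b0111,  # row 0: a, b, c
    0b0011,  # row 1: a, b
    0b1011,  # row 2: a, b, d
    0b1100,  # row 3: c, d
]
```

For instance, `rowset_closure({0, 2}, ROWS)` returns `(0b0011, frozenset({0, 1, 2}))`: rows 0 and 2 share items `a` and `b`, and row 1 also contains both. Row enumeration tends to suit data with few rows and many columns, which matches the abstract's remark about biological datasets.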


Information ◽  
2019 ◽  
Vol 10 (2) ◽  
pp. 78 ◽  
Author(s):  
Jingpu Zhang ◽  
Ronghui Liu ◽  
Ligeng Zou ◽  
Licheng Zeng

Formal concept analysis has proven to be a very effective method for data analysis and rule extraction, but how to build formal concept lattices is a difficult and hot topic. In this paper, an efficient and rapid incremental concept lattice construction algorithm is proposed. The algorithm, named FastAddExtent, is seen as a modification of AddIntent in which we improve two fundamental procedures, including fixing the covering relation and searching the canonical generator. The proposed algorithm can locate the desired concept quickly by adding data fields to every concept. The algorithm is depicted in detail, using a formal context to show how the new algorithm works and discussing time and space complexity issues. We also present an experimental evaluation of its performance and comparison with AddExtent. Experimental results show that the FastAddExtent algorithm can improve efficiency compared with the primitive AddExtent algorithm.
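
To show what "incremental" means here, the sketch below uses the basic intersection-based scheme: when an object is added, the new set of concept intents is the old set plus the intersection of each old intent with the new object's attributes. It is not AddIntent, AddExtent or FastAddExtent. It tracks intents only, without the covering relation or canonical generators that those algorithms maintain, and all names are hypothetical.

```python
def add_object(intents, attributes):
    """Return the set of intents after inserting one object.

    `intents` must be closed under intersection, which holds when it was
    built by repeated calls starting from the full attribute set.
    """
    attributes = frozenset(attributes)
    return intents | {intent & attributes for intent in intents}


def build_intents(context, all_attributes):
    """Insert the objects of `context` one at a time."""
    intents = {frozenset(all_attributes)}  # intent of the bottom concept
    for attributes in context.values():
        intents = add_object(intents, attributes)
    return intents


# Hypothetical formal context: object -> attributes.
CONTEXT = {
    "o1": {"a", "b", "c"},
    "o2": {"a", "b"},
    "o3": {"a", "b", "d"},
    "o4": {"c", "d"},
}
INTENTS = build_intents(CONTEXT, {"a", "b", "c", "d"})
```

On this context the result has eight intents: the empty set, `{c}`, `{d}`, `{a, b}`, `{c, d}`, `{a, b, c}`, `{a, b, d}` and `{a, b, c, d}`. Each insertion scans every existing intent, so this naive scheme does the full search that faster concept location is meant to avoid.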

