Dist Frequent Next Neighbours: A Distributed Galois Lattice Algorithm for Frequent Closed Itemsets Extraction

Mapping Intimacies ◽

10.3233/faia210299 ◽

2021 ◽

Author(s):

Naomie Sandra Noumi Sandji ◽

Djamal Abdoul Nasser Seck

Keyword(s):

Big Data ◽

Concept Lattice ◽

General Purpose ◽

Frequent Itemsets ◽

Galois Lattice ◽

Galois Lattices ◽

Closed Itemsets ◽

Efficient Alternative ◽

Lattice Algorithm ◽

Lattice Construction

This paper proposes a distributed approach to frequent closed itemset extraction in the context of big data. The goal is to extract frequent closed itemsets efficiently, since they form a basis from which frequent itemsets can be generated. To this end, we extend the Galois lattice (or concept lattice) technique to this context. Galois lattices are an efficient alternative for extracting closed itemsets, which are in turn a useful basis for generating frequent itemsets. We propose Dist Frequent Next Neighbour, a distributed version of the Frequent Next Neighbour concept lattice construction algorithm, which considerably reduces extraction time by parallelizing the computation of frequent concepts (closed itemsets).
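
To make the link between Galois lattices and closed itemsets concrete, the following is a minimal single-machine sketch of the closure operator and a brute-force enumeration of frequent closed itemsets. It is not the Frequent Next Neighbour algorithm or its distributed version; the toy data and the function names (`extent`, `intent`, `closure`, `frequent_closed_itemsets`) are assumptions made for illustration.

```python
from itertools import combinations

# Toy transaction database (hypothetical data, not taken from the paper).
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "b", "d"},
    {"c", "d"},
]


def extent(itemset, transactions):
    """Indices of the transactions that contain every item of `itemset`."""
    return {i for i, t in enumerate(transactions) if itemset <= t}


def intent(tids, transactions):
    """Items shared by all transactions in `tids` (all items if `tids` is empty)."""
    if not tids:
        return set().union(*transactions)
    return set.intersection(*(transactions[i] for i in tids))


def closure(itemset, transactions):
    """Galois closure: the items common to every transaction containing `itemset`."""
    return intent(extent(itemset, transactions), transactions)


def frequent_closed_itemsets(transactions, min_support):
    """Brute force: test every itemset; exponential, for illustration only."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            candidate = set(combo)
            support = len(extent(candidate, transactions))
            # An itemset is closed when it equals its own closure.
            if support >= min_support and closure(candidate, transactions) == candidate:
                result[frozenset(candidate)] = support
    return result
```

On this toy data, `{"a"}` is not closed because every transaction containing `a` also contains `b`, so its closure is `{"a", "b"}`. Each pair (extent, closed itemset) is a formal concept of the lattice; a practical algorithm would traverse the lattice instead of testing every subset.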

G. Czédli’s tolerance factor lattice construction and weak ordered relations

Algebra Universalis ◽

10.1007/s00012-021-00712-x ◽

2021 ◽

Vol 82 (2) ◽

Author(s):

Sándor Radeleczki

Keyword(s):

Concept Lattice ◽

Tolerance Factor ◽

Macneille Completion ◽

Lattice Construction

G. Czédli proved that the blocks of any compatible tolerance T of a lattice L can be ordered in such a way that they form a lattice L/T called the factor lattice of L modulo T. Here we show that the Dedekind–MacNeille completion of the lattice L/T is isomorphic to the concept lattice of the context (L, L, R), where R stands for the reflexive weak ordered relation $\mathord{\le} \circ T$. Weak ordered relations constitute the generalization of the ordered relations introduced by S. Valentini. Reflexive weak ordered relations can be characterized as compatible reflexive relations $R \subseteq L^{2}$ satisfying $R = \mathord{\le} \circ R \circ \mathord{\le}$.

PIM-WEAVER: A High Energy-efficient, General-purpose Acceleration Architecture for String Operations in Big Data Processing

Sustainable Computing Informatics and Systems ◽

10.1016/j.suscom.2019.01.006 ◽

2019 ◽

Vol 21 ◽

pp. 129-142

Author(s):

Wenming Li ◽

Xiaochun Ye ◽

Da Wang ◽

Hao Zhang ◽

Zhimin Tang ◽

...

Keyword(s):

Big Data ◽

Data Processing ◽

Energy Efficient ◽

High Energy ◽

General Purpose ◽

Big Data Processing

MINING ESSENTIAL RULES USING FREQUENT CLOSED ITEMSETS LATTICE

Science and Technology Development Journal ◽

10.32508/stdj.v12i11.2311 ◽

2009 ◽

Vol 12 (11) ◽

pp. 49-56

Author(s):

Bac Hoai Le ◽

Bay Dinh Vo

Keyword(s):

Association Rules ◽

Frequent Itemsets ◽

Suitable Method ◽

Mining Method ◽

Parent Child Relationship ◽

Left Hand ◽

Child Relationship ◽

Closed Itemsets ◽

The Cost ◽

Parent Child

In traditional association rule mining, finding all rules in a database that satisfy minSup and minConf becomes problematic when the number of frequent itemsets is large. It is therefore necessary to have a method that mines fewer rules while still covering all the rules of the traditional mining method. One such approach is the mining of essential rules: it keeps only the rules whose left-hand side is minimal and whose right-hand side is maximal (with respect to the parent-child relationship). In this paper, we propose a new algorithm for mining essential rules from the frequent closed itemsets lattice. It uses the parent-child relationships in the lattice to reduce the cost of checking those relationships, and thus the time needed to mine the rules.
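
The minimal-left, maximal-right criterion can be illustrated with a small filter over an explicit rule list. This is only one plausible reading of the criterion as stated in the abstract, not the authors' lattice-based algorithm: it ignores support and confidence, compares every pair of rules, and the name `essential_rules` and the sample rules are hypothetical.

```python
def essential_rules(rules):
    """Keep the rules that no other rule dominates.

    A rule (lhs -> rhs) is treated as dominated when a different rule has a
    left-hand side contained in lhs and a right-hand side containing rhs.
    This is an assumed reading of "minimal left, maximal right".
    """
    rules = [(frozenset(lhs), frozenset(rhs)) for lhs, rhs in rules]
    kept = []
    for lhs, rhs in rules:
        dominated = any(
            (other_lhs, other_rhs) != (lhs, rhs)
            and other_lhs <= lhs
            and rhs <= other_rhs
            for other_lhs, other_rhs in rules
        )
        if not dominated:
            kept.append((lhs, rhs))
    return kept


# Hypothetical rules, written as (left-hand side, right-hand side).
RULES = [
    ({"a"}, {"b"}),
    ({"a"}, {"b", "c"}),
    ({"a", "d"}, {"b"}),
    ({"d"}, {"c"}),
]
```

Here `a -> b` is dropped in favour of `a -> b, c`, and `a, d -> b` is dropped because `a` alone already implies `b`. The pairwise comparison is quadratic in the number of rules, which is the kind of cost a lattice with stored parent-child links is meant to avoid.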

Mining Closed Item sets using Partition based Single Scan Algorithm

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.a1920.078219 ◽

2019 ◽

Vol 8 (2) ◽

pp. 3885-3889

Keyword(s):

Efficient Algorithm ◽

Empirical Evaluation ◽

Frequent Itemsets ◽

Frequent Item ◽

Closed Itemsets ◽

Frequent Item Sets

Closed itemsets are frequent itemsets that uniquely determine the exact frequency of all frequent itemsets. They reduce the massive output to a much smaller one without redundancy. In this paper, we present PSS-MCI, an efficient candidate-generation-based approach for mining all closed itemsets. It enumerates closed itemsets using a hash tree, candidate generation, and superset and subset checking. It uses a partition-based strategy to avoid unnecessary computation for itemsets that are not useful, and it determines all closed itemsets from a single scan over the database. However, several unnecessary itemsets are hashed into the buckets; to overcome this limitation, heuristics are incorporated into PSS-MCI. Empirical evaluation shows that PSS-MCI outperforms the candidate-generation-based and other approaches. Further, PSS-MCI explores all closed itemsets.
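
The claim that closed itemsets determine the frequency of every itemset can be shown with a short lookup: the support of an itemset equals the support of its closure, which is the largest support among the closed itemsets containing it. The sketch below is not PSS-MCI; the function name `support_from_closed` and the table `CLOSED` are assumptions, with the table listing the closed itemsets of a four-transaction toy database ({a,b,c}, {a,b}, {a,b,d}, {c,d}).

```python
def support_from_closed(itemset, closed_supports):
    """Support of `itemset`, derived only from the closed itemsets.

    Returns the largest support among the closed itemsets that contain
    `itemset`, or 0 when no closed itemset in the table contains it.
    """
    itemset = frozenset(itemset)
    return max(
        (s for closed, s in closed_supports.items() if itemset <= closed),
        default=0,
    )


# Closed itemsets and their supports for the toy database described above.
CLOSED = {
    frozenset(): 4,
    frozenset("ab"): 3,
    frozenset("c"): 2,
    frozenset("d"): 2,
    frozenset("abc"): 1,
    frozenset("abd"): 1,
    frozenset("cd"): 1,
}
```

For example, `{"a"}` is not closed, yet its support is recovered from `{"a", "b"}`. If the table holds only the frequent closed itemsets, a result of 0 means "below the support threshold" and not necessarily "absent from the data".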

Object oriented concept lattice construction through the combination of formal contexts

2009 International Conference on Machine Learning and Cybernetics ◽

10.1109/icmlc.2009.5212149 ◽

2009 ◽

Author(s):

Ling Wei ◽

Guang Yao ◽

Wei Zhao

Keyword(s):

Concept Lattice ◽

Object Oriented ◽

Lattice Construction

Construction Search Engine Based on Formal Concept Analysis and Association Rule Mining

Advanced Engineering Forum ◽

10.4028/www.scientific.net/aef.6-7.625 ◽

2012 ◽

Vol 6-7 ◽

pp. 625-630 ◽

Cited By ~ 1

Author(s):

Hong Sheng Xu

Keyword(s):

Search Engine ◽

Association Rule ◽

Formal Concept Analysis ◽

Association Rule Mining ◽

High Efficiency ◽

Concept Lattice ◽

Concept Analysis ◽

Frequent Itemsets ◽

Formal Concept ◽

Rule Mining

The concept lattice, which is built from a formal context (background) and ordered by the partial order relation between concepts, is the core data structure of formal concept analysis. The association rule mining process includes two phases: first, find all the frequent itemsets in the data collection; second, generate association rules from these frequent itemsets. This paper analyzes association rule mining algorithms such as Apriori and FP-Growth. The paper presents the construction search engine based on formal concept analysis and association rule mining. Experimental results show that the proposed algorithm has high efficiency.
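
The two phases named in the abstract can be sketched with a compact, textbook-style Apriori pass followed by rule generation. This is an illustration under assumptions, not the paper's search-engine algorithm; the function names `apriori` and `generate_rules`, the thresholds, and the sample transactions are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Phase 1: return {itemset: support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        return {c: sum(1 for t in transactions if c <= t) for c in candidates}

    items = {frozenset([i]) for t in transactions for i in t}
    level = {c: s for c, s in count(items).items() if s >= min_support}
    frequent = dict(level)
    k = 2
    while level:
        prev = list(level)
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
        level = {c: s for c, s in count(candidates).items() if s >= min_support}
        frequent.update(level)
        k += 1
    return frequent


def generate_rules(frequent, min_confidence):
    """Phase 2: return (lhs, rhs, confidence) for rules meeting min_confidence."""
    out = []
    for itemset, support in frequent.items():
        for size in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, size)):
                # Every subset of a frequent itemset is frequent, so the
                # lookup of frequent[lhs] cannot fail.
                confidence = support / frequent[lhs]
                if confidence >= min_confidence:
                    out.append((lhs, itemset - lhs, confidence))
    return out


# Hypothetical transactions for a quick check.
SAMPLE = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}, {"c", "d"}]
```

Calling `generate_rules(apriori(SAMPLE, 2), 0.9)` first collects the frequent itemsets and then derives rules from them, which mirrors the two-phase split. FP-Growth reaches the same frequent itemsets without explicit candidate generation.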

Entity deduplication in big data graphs for scholarly communication

Data Technologies and Applications ◽

10.1108/dta-09-2019-0163 ◽

2020 ◽

Vol 54 (4) ◽

pp. 409-435

Author(s):

Paolo Manghi ◽

Claudio Atzori ◽

Michele De Bonis ◽

Alessia Bardi

Keyword(s):

Big Data ◽

Ground Truth ◽

Open Science ◽

General Purpose ◽

High Level Architecture ◽

Multiple Sources ◽

Actual Problem ◽

Content Type ◽

Infrastructure System ◽

High Level

Purpose: Several online services offer functionalities to access information from “big research graphs” (e.g. Google Scholar, OpenAIRE, Microsoft Academic Graph), which correlate scholarly/scientific communication entities such as publications, authors, datasets, organizations, projects, funders, etc. Depending on the target users, access can vary from searching and browsing content to the consumption of statistics for monitoring and provision of feedback. Such graphs are populated over time as aggregations of multiple sources and therefore suffer from major entity-duplication problems. Although deduplication of graphs is a known and current problem, existing solutions are dedicated to specific scenarios, operate on flat collections, local topology-drive challenges and cannot therefore be re-used in other contexts.

Design/methodology/approach: This work presents GDup, an integrated, scalable, general-purpose system that can be customized to address deduplication over arbitrarily large information graphs. The paper presents its high-level architecture and its implementation as a service used within the OpenAIRE infrastructure system, and reports numbers from real-case experiments.

Findings: GDup provides the functionalities required to deliver a fully-fledged entity deduplication workflow over a generic input graph. The system offers out-of-the-box Ground Truth management, acquisition of feedback from data curators and algorithms for identifying and merging duplicates, to obtain an output disambiguated graph.

Originality/value: To our knowledge GDup is the only system in the literature that offers an integrated and general-purpose solution for the deduplication of graphs, while targeting big data scalability issues. GDup is today one of the key modules of the OpenAIRE infrastructure production system, which monitors Open Science trends on behalf of the European Commission, National funders and institutions.

Improved BVBUC Algorithm to Discover Closed Itemsets in Long Biological Datasets

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.892.157 ◽

2019 ◽

Vol 892 ◽

pp. 157-167

Author(s):

Fatimah Audah Md Zaki ◽

Nurul Fariza Zulkurnain

Keyword(s):

Frequent Itemsets ◽

Suitable Method ◽

Closed Frequent Itemsets ◽

Closed Itemsets ◽

Synthetic Datasets

Mining closed frequent itemsets requires the algorithm to mine the frequent itemsets and then determine their closures. The efficiency of closure computation is very important, as it determines the total mining time and the required memory. Over the years, many closure computation methods have been proposed to achieve these goals. However, to the best of our knowledge, there is no suitable method that can be adapted for algorithms that enumerate the rowset lattice, which is effective for biological datasets. Therefore, this paper proposes a method for computing closures and compares it with the method used in the BVBUC algorithm. Finally, BVBUC_I is proposed, and the performance of these algorithms was evaluated using two synthetic datasets and three real datasets. The results of these tests demonstrate the efficiency of the proposed method.

A New Rapid Incremental Algorithm for Constructing Concept Lattices

Information ◽

10.3390/info10020078 ◽

2019 ◽

Vol 10 (2) ◽

pp. 78 ◽

Cited By ~ 1

Author(s):

Jingpu Zhang ◽

Ronghui Liu ◽

Ligeng Zou ◽

Licheng Zeng

Keyword(s):

Formal Concept Analysis ◽

Concept Lattice ◽

Incremental Algorithm ◽

Formal Context ◽

Formal Concept ◽

Concept Lattices ◽

Covering Relation ◽

Canonical Generator ◽

Lattice Construction ◽

Time And Space Complexity

Formal concept analysis has proven to be a very effective method for data analysis and rule extraction, but building formal concept lattices efficiently remains a difficult and active research topic. In this paper, an efficient and rapid incremental concept lattice construction algorithm is proposed. The algorithm, named FastAddExtent, is seen as a modification of AddIntent in which we improve two fundamental procedures: fixing the covering relation and searching for the canonical generator. The proposed algorithm can locate the desired concept quickly by adding data fields to every concept. The algorithm is described in detail, using a formal context to show how it works and discussing time and space complexity issues. We also present an experimental evaluation of its performance and a comparison with AddExtent. Experimental results show that the FastAddExtent algorithm can improve efficiency compared with the primitive AddExtent algorithm.

Mining materials knowledge with concept lattice algorithm

Materials Today Communications ◽

10.1016/j.mtcomm.2019.100726 ◽

2020 ◽

Vol 22 ◽

pp. 100726

Author(s):

Qiuling Sheng ◽

Quan Qian

Keyword(s):

Concept Lattice ◽

Lattice Algorithm ◽

Materials Knowledge
