RDD-Eclat: Approaches to Parallelize Eclat Algorithm on Spark RDD Framework (Extended Version)

Author(s):  
Pankaj Singh ◽  
Sudhakar Singh ◽  
P K Mishra ◽  
Rakhi Garg

Frequent itemset mining (FIM) is a highly computational and data-intensive task. Parallel and distributed FIM algorithms have therefore been designed to process large volumes of data in less time. Recently, a number of FIM algorithms have been designed on Hadoop MapReduce, a distributed big data processing framework. However, because of heavy disk I/O, MapReduce is inefficient for highly iterative FIM algorithms. Spark, a more efficient distributed data processing framework, was developed with in-memory computation and resilient distributed dataset (RDD) features to support iterative algorithms. Apriori-based and FP-Growth-based FIM algorithms have been designed on the Spark RDD framework, but an Eclat-based algorithm has not yet been explored. This paper proposes RDD-Eclat, a parallel Eclat algorithm on the Spark RDD framework, together with its five variants. The proposed algorithms are evaluated on various benchmark datasets, and the experimental results show that RDD-Eclat outperforms Spark-based Apriori by many times. The results also show that the proposed algorithms scale as the number of cores and the size of the dataset increase.
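
For readers unfamiliar with Eclat, the following is a minimal single-machine sketch of the classic idea behind it: store each item with the set of transaction ids that contain it, then grow itemsets by intersecting those sets. It is not the authors' RDD-Eclat and uses no Spark; the function name eclat, the parameter min_support (an absolute count), and the toy transactions are assumptions made only for illustration.

```python
from collections import defaultdict


def eclat(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent itemsets.

    Illustrative sketch of sequential Eclat; min_support is an absolute count.
    """
    # Vertical layout: map each item to the ids of transactions containing it.
    tidsets = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets[item].add(tid)

    frequent = {}

    def extend(prefix, candidates):
        # Each candidate (item, tids) is already frequent together with prefix.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[itemset] = len(tids)
            # Intersect tidsets with the later items to form longer itemsets.
            next_candidates = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_support:
                    next_candidates.append((other, shared))
            if next_candidates:
                extend(itemset, next_candidates)

    # Start from the frequent single items, in a fixed order.
    initial = sorted(
        ((item, tids) for item, tids in tidsets.items()
         if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend(frozenset(), initial)
    return frequent


# Hypothetical toy data, not taken from the paper's benchmark datasets.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
result = eclat(transactions, min_support=2)
```

Because support is read directly from the size of an intersected id set, the search needs no repeated passes over the transactions, which is the property that makes the vertical layout attractive for iterative mining.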

BMC Genomics ◽  
2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Onur Yukselen ◽  
Osman Turkyilmaz ◽  
Ahmet Rasit Ozturk ◽  
Manuel Garber ◽  
Alper Kucukural

1979 ◽  
Vol 21 (2) ◽  
Author(s):  
L. J. Heinrich

The article explains the subjective understanding of the term "Computerleistung am Arbeitsplatz" (computing power at the workplace) as a catchword for a progressive design philosophy for computer-supported information systems. This philosophy implies both the application of modern hardware and software technologies, of the kind that will shape the 1980s, and growing attention to user needs, both those determined by the work task and the subjective ones. It thereby links "Distributed Data Processing" as a technological concept with "user orientation". The design areas of user orientation (work equipment and work environment, the human-computer interaction interface, and work organization) are explained. Design measures are given by way of example, and reference is made to further literature, chiefly the book "Computerleistung am Arbeitsplatz - benutzerorientiertes Distributed Data Processing", published by Oldenbourg-Verlag.


Author(s):  
V.G. Belenkov ◽  
V.I. Korolev ◽  
V.I. Budzko ◽  
D.A. Melnikov

The article discusses the features of using cryptographic information protection means (CIPM) in the environment of distributed data processing and storage in large information and telecommunication systems (LITS). A brief characterization is given of the properties of the cryptographic protection control subsystem, the key system (KS). Symmetric and asymmetric cryptographic systems are described to the extent required to state the problem of using the KS in LITS. Functional and structural models of the use of the KS and CIPM in LITS are described, and generalized information about the features of using the KS in LITS is given. The results form the basis for further work on the architecture and construction principles of the KS in LITS that implement distributed data processing and storage technologies. They can be used as a methodological guide, in specific work on creating and developing systems that implement these technologies, and in drawing up technical specifications for the creation of such systems.

