Frequent Itemset Mining Based on Development of FP-growth Algorithm and Use MapReduce Technique

Author(s):  
Zakria Mahrousa ◽  
Dima Mufti Alchawafa ◽  
Hasan Kazzaz

Finding frequent itemsets in big data is an important task in data mining and knowledge discovery. With the exponential daily growth of data, known as "Big Data", mining frequent patterns from such huge volumes of data faces many challenges due to memory requirements, multiple data dimensions, heterogeneity of data, and so on. The complexities of mining frequent itemsets from Big Data can be reduced by using a modified FP-growth algorithm and by parallelizing the mining task with the MapReduce framework in Hadoop. In this paper, a modified FP-growth algorithm based on a directed graph, combined with the Hadoop framework, reduces the execution time for massive databases and works efficiently across a number of nodes (computers). The algorithm was tested, and our experimental results demonstrate that it scales well and efficiently processes large datasets. In addition, it improves memory consumption for storing frequent patterns as well as time complexity.
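The parallelization the abstract describes follows the standard MapReduce counting pattern: mappers emit candidate itemsets from each transaction, and reducers sum the counts and filter by a minimum support threshold. The sketch below is not the authors' modified FP-growth or actual Hadoop code; it is a minimal plain-Python illustration of that map/reduce pattern, with all function names and the example transactions invented for illustration.

```python
from collections import Counter
from itertools import combinations

def map_phase(transactions, max_size=2):
    """Map step: for each transaction, emit (itemset, 1) pairs for
    every candidate itemset up to max_size items."""
    for tx in transactions:
        items = sorted(set(tx))  # canonical order so identical itemsets match
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                yield itemset, 1

def reduce_phase(pairs, min_support):
    """Reduce step: sum the counts per itemset and keep only those
    meeting the minimum support threshold."""
    counts = Counter()
    for itemset, count in pairs:
        counts[itemset] += count
    return {k: v for k, v in counts.items() if v >= min_support}

# Toy example (hypothetical data): three market-basket transactions.
transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
]
frequent = reduce_phase(map_phase(transactions), min_support=2)
# {('bread',): 3, ('milk',): 2, ('butter',): 2,
#  ('bread', 'butter'): 2, ('bread', 'milk'): 2}
```

In a real Hadoop deployment the two phases run on different nodes and the framework handles shuffling the `(itemset, count)` pairs between them; the enumeration of candidates inside each mapper is what an FP-growth-style structure makes cheaper than brute-force combinations.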

Author(s):  
Adeel Shiraz Hashmi ◽  
Tanvir Ahmad

We are now in the Big Data era, and there is a growing demand for tools which can process and analyze it. Big data analytics deals with extracting valuable information from complex data which can't be handled by traditional data mining tools. This paper surveys the available tools which can handle large volumes of data as well as evolving data streams. The data mining tools and algorithms which can handle big data are also summarized, and one of the tools is used for mining large datasets with distributed algorithms.


Author(s):  
Devendra Kumar Mishra

Today is the era of the internet; the internet represents a big space to which large amounts of data are added every day. This huge volume of digital data and interconnection is causing data to explode. Big Data mining has the capability to retrieve useful information from large datasets or streams of data, and the analysis can also be done in a distributed environment. The framework needed to analyze this large amount of data must support statistical analysis and data mining. The framework should be designed so that big data and traditional data can be combined, so that results come from analyzing new data together with the old data. Traditional tools are not sufficient to extract the information that remains unseen.


Author(s):  
Feng Ye ◽  
Zhi-Jian Wang ◽  
Fa-Chao Zhou ◽  
Ya-Pu Wang ◽  
Yuan-Chao Zhou
Keyword(s):  
Big Data ◽  
