Applying Improved Apriori Algorithm in Figuring out the Relation between Weather Factors and Rainfall

2020 ◽  
Vol 1 (1) ◽  
pp. 23-26
Author(s):  
Siti Zulaikha ◽  
Martaleli Bettiza ◽  
Nola Ritha

Rainfall data are compelling to study because rainfall is one of the major factors affecting the weather in a region, as well as various aspects of life. Rainfall is generally predicted by analyzing historical data with particular methods, since rainfall tends to follow patterns that repeat over time. Big data mining is expected to yield valuable information that was previously hidden in large data stores. Methods used in data mining include the Apriori algorithm and the Improved Apriori algorithm. Improved Apriori represents the database as a matrix that describes the relations among the items in the database. The data used in this research are the rainfall factors recorded in Tanjungpinang city in 2016. Testing the Improved Apriori algorithm revealed the following relations between rainfall and weather factors. With 2-item sets: if the temperature is low (24.0 - 26.0) and the humidity is high (85 - 100), then the rainfall is mild; if the temperature is low (24.0 - 26.0) and the sunlight intensity is low (0 - 3), then the rainfall is heavy. With 3-item sets: if the temperature is low (24.0 - 26.0), the humidity is high (85 - 100), and the sunlight intensity is low (0 - 3), then the rainfall is medium.
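The kind of rule reported above can be checked with a short support and confidence calculation. The sketch below is illustrative only: the six daily observations are invented (they are not the Tanjungpinang data), the units (degrees Celsius, percent) and the names `DAYS`, `to_items`, `count`, `support`, and `confidence` are assumptions, and only the three category ranges are taken from the abstract.

```python
# Hypothetical daily observations (NOT the data from the paper):
# (temperature, humidity, sunlight intensity, rainfall class).
# Units are assumed to be degrees Celsius and percent.
DAYS = [
    (25.0, 90, 2, "mild"),
    (25.5, 88, 5, "mild"),
    (24.5, 95, 1, "medium"),
    (28.0, 70, 8, "none"),
    (25.8, 86, 6, "mild"),
    (27.5, 80, 7, "none"),
]


def to_items(temp, humidity, light, rain):
    """Discretize one observation into a set of categorical items.

    The ranges follow the abstract: low temperature 24.0-26.0,
    high humidity 85-100, low sunlight intensity 0-3.
    """
    items = {f"rain_{rain}"}
    if 24.0 <= temp <= 26.0:
        items.add("temp_low")
    if 85 <= humidity <= 100:
        items.add("humidity_high")
    if 0 <= light <= 3:
        items.add("light_low")
    return items


TRANSACTIONS = [to_items(*day) for day in DAYS]


def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in TRANSACTIONS if set(itemset) <= t)


def support(itemset):
    """Fraction of all transactions that contain itemset."""
    return count(itemset) / len(TRANSACTIONS)


def confidence(antecedent, consequent):
    """Share of transactions matching the antecedent that also match the consequent."""
    return count(set(antecedent) | set(consequent)) / count(antecedent)


if __name__ == "__main__":
    rule_if = {"temp_low", "humidity_high"}
    rule_then = {"rain_mild"}
    print(f"support    = {support(rule_if | rule_then):.2f}")   # 0.50 on this toy data
    print(f"confidence = {confidence(rule_if, rule_then):.2f}")  # 0.75 on this toy data
```

On this toy data the rule "low temperature and high humidity, then mild rainfall" holds on 3 of the 4 days that match its conditions. A real analysis would compare such values against minimum support and confidence thresholds, which the abstract does not state.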

2014 ◽  
Vol 568-570 ◽  
pp. 798-801
Author(s):  
Ye Qing Xiong ◽  
Shu Dong Zhang

Traditional association rule algorithms run into time and space performance bottlenecks when applied to big data mining. This paper proposes a matrix-based parallel algorithm under cloud computing to improve the Apriori algorithm. The algorithm stores transaction data in a binary matrix, replaces the join between itemsets with the matrix "and" operation, and uses cloud computing technology to mine frequent itemsets in parallel. Simulations under different conditions show that it improves efficiency, resolves the performance bottleneck, and can be widely used in big data mining with strong scalability and stability.
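The binary-matrix idea can be sketched in a few lines. The following is a minimal single-machine illustration, not the authors' parallel cloud implementation: each item's matrix column is stored as a Python integer bitmask (bit i is set when transaction i contains the item), and the support of an itemset is the number of set bits in the bitwise "and" of its columns. The sample transactions and the names `build_columns`, `popcount`, and `frequent_itemsets` are assumptions made for the example.

```python
from itertools import combinations

# Hypothetical transactions, used only for illustration.
TRANSACTIONS = [
    {"A", "B", "C"},
    {"A", "C"},
    {"A", "D"},
    {"B", "C"},
    {"A", "B", "C"},
]


def build_columns(transactions):
    """Return {item: bitmask}; bit i is set if transaction i contains the item."""
    columns = {}
    for i, transaction in enumerate(transactions):
        for item in transaction:
            columns[item] = columns.get(item, 0) | (1 << i)
    return columns


def popcount(mask):
    """Number of set bits in a non-negative integer."""
    return bin(mask).count("1")


def frequent_itemsets(transactions, min_count):
    """Level-wise search; support is counted with bitwise AND on item columns."""
    columns = build_columns(transactions)
    full = (1 << len(transactions)) - 1  # one bit per transaction, all set

    def mask_of(itemset):
        mask = full
        for item in itemset:
            mask &= columns[item]  # the "and" operation replaces a database scan
        return mask

    level = {frozenset([item]) for item, col in columns.items()
             if popcount(col) >= min_count}
    result = {}
    while level:
        for itemset in level:
            result[itemset] = popcount(mask_of(itemset))
        k = len(next(iter(level))) + 1
        # Candidate generation: unions of two frequent (k-1)-itemsets of size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Keep candidates whose (k-1)-subsets are all frequent and that meet min_count.
        level = {c for c in candidates
                 if all(frozenset(sub) in result for sub in combinations(c, k - 1))
                 and popcount(mask_of(c)) >= min_count}
    return result


if __name__ == "__main__":
    for itemset, n in sorted(frequent_itemsets(TRANSACTIONS, 3).items(),
                             key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), n)
```

With a minimum count of 3, this toy data yields the frequent itemsets {A}, {B}, {C}, {A, C}, and {B, C}; the candidate {A, B, C} is pruned because {A, B} appears in only two transactions.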


2021 ◽  
Vol 11 (2) ◽  
pp. 478-486
Author(s):  
Jing Zheng ◽  
Zhongjun Gao ◽  
Lixin Pu ◽  
Mingjie He ◽  
Jipeng Fan ◽  
...  

Using medical big data mining technology, a model of tumor disease was analyzed and studied. With data science methods as the guiding approach, a big-data-based medical service model for oncology diseases was analyzed and constructed, and its development strategy explored; business process analysis was used to analyze and map the business processes of cancer medical services; and service-oriented architecture analysis and design methodology was used to build a highly flexible, configurable, and easily scalable precision medical big data platform. Based on an analysis of the characteristics of medical big data and the shortcomings of the traditional Apriori algorithm, the Hadoop platform is used to improve and optimize the Apriori algorithm. The results show that the improved Apriori algorithm is considerably more efficient and performs better, and can be adapted to mining medical big data. Data mining experiments indicate a correlation between tumors and smoking, chronic infection, occupational pathogenic factors, etc. This finding offers guidance for the prevention and treatment of tumors and also demonstrates that applying the improved Apriori algorithm to clinical research on lung tumors has practical significance.


Due to the massive size and complexity of the data, big data mining on a single computer is a difficult task. As database sizes grow rapidly, parallel and distributed computing systems offer clear benefits for data mining applications. Parallelizing Association Rule Mining (ARM) algorithms is an important task for mining frequent itemsets effectively from large databases. These mining algorithms partition the database horizontally or increase the number of processors to reduce the overall time needed to mine the frequent itemsets. In this paper, a combined Horizontal Parallel-Apriori (HP-Apriori) and Adaptive Frequent Pattern (FP) Growth algorithm is proposed that divides the database both horizontally and vertically into four sub-processes, so that all four tasks are processed in parallel. The Horizontal Parallel-Apriori algorithm speeds up the mining process using an index file. Adaptive Binomial Distribution (ABD) is applied to the Frequent Pattern Growth algorithm to find the minimum support for mining the optimal frequent itemsets. Experimental analysis shows that the combined algorithm performs better, reducing the overall execution time and increasing the computational speed with high scalability.
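Horizontal partitioning can be illustrated with a small sketch: the transaction list is split into row blocks, each block is counted independently by a worker, and the local counts are summed. This is a generic illustration and not the HP-Apriori algorithm itself; it omits the index file, the vertical split, and the Adaptive Binomial Distribution step, and the names `split_horizontally`, `count_pairs`, and `parallel_pair_counts` are assumptions. A thread pool is used here for simplicity; in CPython a process pool would be the usual choice for CPU-bound speedup.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def split_horizontally(transactions, n_parts):
    """Split the transaction list into at most n_parts contiguous row blocks."""
    size = max(1, -(-len(transactions) // n_parts))  # ceiling division, at least 1
    return [transactions[i:i + size] for i in range(0, len(transactions), size)]


def count_pairs(partition):
    """Count every 2-itemset that occurs in one partition (local counts)."""
    counts = Counter()
    for transaction in partition:
        for pair in combinations(sorted(transaction), 2):
            counts[pair] += 1
    return counts


def parallel_pair_counts(transactions, n_parts=4):
    """Count 2-itemsets per partition concurrently, then merge the local counts."""
    parts = split_horizontally(transactions, n_parts)
    total = Counter()
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        for local in pool.map(count_pairs, parts):
            total.update(local)  # Counter.update adds counts together
    return total


if __name__ == "__main__":
    # Hypothetical transactions, used only for illustration.
    data = [
        {"A", "B", "C"},
        {"A", "C"},
        {"A", "D"},
        {"B", "C"},
        {"A", "B", "C"},
    ]
    print(parallel_pair_counts(data, n_parts=4))
```

Because support counts are additive across disjoint row blocks, the merged result equals the count obtained from a single pass over the whole database.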


Author(s):  
Min Ye ◽  
Hongxia Li

The e-commerce platform in the digital economy era has evolved into a data platform ecosystem built around data resources and data mining technology systems. The most typical applications of big data are also concentrated in the field of e-commerce. E-commerce companies should first grasp the interactive relationship among the three major factors of data, technology and innovation. E-commerce platform operation is a multidisciplinary research field, and it is not easy for researchers to obtain a panoramic view of its knowledge structure. A knowledge graph is a kind of graph that takes a field of knowledge as its object and shows the development process and structural relationships of that knowledge. It is not only a visual knowledge map but also a serialized knowledge pedigree, which gives researchers a quantitative method for studying development trends and academic status. The purpose of this research is to help researchers understand the key knowledge, evolutionary trends and research frontiers of current research. This study uses Citespace bibliometric analysis to analyze data from the Science Net database and finds that: 1) the development of the research field has gone through three stages, and some representative key scholars and key documents have been recognized; 2) knowledge mapping of literature co-citation and keyword co-occurrence shows the research hotspots; 3) the results of burst detection and central node analysis reveal research frontiers and development trends. Today, the visualization of big data brings different challenges. The abstraction between the world and today's data visualization occurs when the data is captured. Every user sees their own visualization data generated by standardized calculations. At the same time, there are still many controversies over the theoretical model, structure and structural dimensions. This is the direction that future researchers need to study further.


Author(s):  
Kiran Kumar S V N Madupu

Big Data has a tremendous influence on scientific discovery and on value creation. This paper presents approaches in data mining and modern technologies in Big Data. The difficulties of data mining, and of data mining with big data, are discussed. Some technological developments in data mining and in data mining with big data are also presented.


Author(s):  
Vivek Raich ◽  
Pankaj Maurya

In the age of information technology, ever more data is being stored. As a result, huge amounts of data are available to decision makers, which has driven the progress of information technology and its wide growth in many areas of business, engineering, medicine, and scientific study. Big data refers to data that is not only large in size but also varied in type, which makes it hard to handle without dedicated technology. Because data keeps growing in this way, it is important to study and manage these datasets according to requirements so that the necessary information can be obtained. The aim of this paper is to analyze some of the analytic methods and tools that can be applied to large data. In addition, applications of Big Data are analyzed, with decision makers working on big data and using the resulting insights for different applications.

