A Pattern Storage System using Pattern Warehouse along with Sources of Pattern Generation and Applications

Data mining algorithms can now extract specific sets of data, known as patterns, from huge data repositories, but there is no infrastructure or system for storing the generated patterns persistently. A pattern warehouse provides a foundation for keeping these patterns safe in a dedicated environment for long-term use. Most organizations are more interested in information or patterns than in raw, unprocessed data, because extracted knowledge plays a vital role in making the right decisions for an organization's growth. We examine the sources of patterns generated from large data sets. In this paper, we touch briefly on the application areas of patterns and the idea of a pattern warehouse, and then cover the architecture of the pattern warehouse, the correlation between data warehouse and data mining, the association between data mining and the pattern warehouse, and a critical evaluation of existing, theoretically published approaches, with more stress on association-rule-related review elements. We also compare the pattern warehouse and the data warehouse with respect to factors such as storage space, type of storage unit, and characteristics, and outline several research domains.

PRM87 - CRITICAL EVALUATION OF VARIOUS DATA MINING ALGORITHMS USED FOR SIGNAL DETECTION IN FDA ADVERSE EVENT REPORTING SYSTEM DATABASE

Value in Health ◽ 10.1016/j.jval.2018.09.2208 ◽ 2018 ◽ Vol 21 ◽ pp. S370

Author(s): KS Koonisetty ◽ V Subeesh ◽ E Maheswari ◽ SS Minnikanti ◽ C Pudi

Keyword(s): Data Mining ◽ Adverse Event ◽ Signal Detection ◽ Reporting System ◽ Adverse Event Reporting System ◽ Critical Evaluation ◽ Adverse Event Reporting ◽ Event Reporting ◽ Data Mining Algorithms ◽ Mining Algorithms

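Penggunaan Association Rule Data Mining Untuk Menentukan Pola Lama Studi Mahasiswa F-MIPA UNSRAT [Using Association Rule Data Mining to Determine Study Duration Patterns of F-MIPA UNSRAT Students]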

d'CARTESIAN ◽ 10.35799/dc.3.1.2014.3777 ◽ 2014 ◽ Vol 3 (1) ◽ pp. 1

Author(s): M. Zainal Mahmudin ◽ Altien Rindengan ◽ Winsy Weku

Keyword(s): Data Mining ◽ Association Rule ◽ Large Data ◽ Apriori Algorithm ◽ Data Mining Algorithms ◽ Adequate Information ◽ Processing Data ◽ Rule Method ◽ Mining Algorithms ◽ Confidence Value

Abstract: The demand for information is sometimes not matched by the supply of adequate information, so the information must be mined again from large data. Association rule techniques can extract information from large data sets such as university records. The purposes of this research are to determine the study-duration patterns of students at F-MIPA UNSRAT using the association rule method of data mining, and to compare the Apriori algorithm with a hash-based algorithm. The student master data of F-MIPA UNSRAT were processed with association rule mining using the Apriori algorithm and a hash-based algorithm, with a minimum support of 1% and a minimum confidence of 1%. Both algorithms produced the same result: 49 combinations of 2-itemsets. Among the patterns found, 7.5% of graduates came from the mathematics major and studied for more than 5 years, with a confidence value of 38.5%. Keywords: Apriori algorithm, hash-based algorithm, association rule, data mining.
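The support and confidence measures used in this abstract can be illustrated with a minimal sketch. It assumes each student record is a set of attribute labels; the function name `mine_pair_rules`, the label strings, and the four toy records are hypothetical and are not taken from the paper. The sketch follows the Apriori idea for 2-itemsets only: count single items first, then count only pairs built from frequent items.

```python
from collections import Counter
from itertools import combinations


def mine_pair_rules(transactions, min_support=0.01, min_confidence=0.01):
    """Return (antecedent, consequent, support, confidence) rules for 2-itemsets."""
    n = len(transactions)
    baskets = [frozenset(t) for t in transactions]

    # Pass 1: count single items and keep the frequent ones (Apriori pruning).
    item_counts = Counter(item for b in baskets for item in b)
    frequent = {i for i, c in item_counts.items() if c / n >= min_support}

    # Pass 2: count only pairs whose members are both frequent.
    pair_counts = Counter()
    for b in baskets:
        pair_counts.update(combinations(sorted(b & frequent), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of all records containing both items
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            conf = count / item_counts[x]  # share of records with x that also have y
            if conf >= min_confidence:
                rules.append((x, y, support, conf))
    return rules


# Hypothetical toy records, for illustration only.
records = [
    {"major=math", "duration=>5y"},
    {"major=math", "duration=<=5y"},
    {"major=physics", "duration=<=5y"},
    {"major=math", "duration=>5y"},
]
rules = mine_pair_rules(records, min_support=0.25, min_confidence=0.5)
for rule in rules:
    print(rule)
```

Read this way, the reported pattern means that 7.5% of all records contain both "mathematics major" and "more than 5 years" (support), and 38.5% of the antecedent records also contain the consequent (confidence).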

Applications of Machine Learning in Public Security Information and Resource Management

Scientific Programming ◽ 10.1155/2021/4734187 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-9

Author(s): Zhihui Wang ◽ Jinyu Wang

Keyword(s): Data Mining ◽ Big Data ◽ Data Warehouse ◽ Large Scale ◽ Relevant Information ◽ Practical Significance ◽ Public Security ◽ Data Mining Algorithms ◽ New Findings ◽ Security Information

Data mining and big data technologies could be of utmost importance for investigating outbound and case datasets in police records. New findings and useful information may be obtained through data preprocessing and multidimensional modeling. Public security data is a kind of “big data,” characterized by large volume, rapid growth, varied structures, large-scale storage, low density, and time sensitivity. In this paper, a police data warehouse is constructed and a public security information analysis system is proposed. The proposed system comprises two modules: (i) case management and (ii) public security information mining. The former is responsible for the collection and processing of case information. The latter preprocesses the data of major cases that have occurred in the past ten years to create a data warehouse. Then, we use the model to create a data warehouse based on needs. By separating measurement values from dimensions, the system analyzes and predicts criminals’ characteristics and the case environment and identifies the relationships between them. In the process of mining and processing crime data, data mining algorithms can quickly find the relevant information in the data. Furthermore, the system can find relevant trends and laws to detect criminal cases faster than other methods. This can reduce the emergence of new crimes and provide a basis for decision-making in the public security department, which has practical significance.

Data warehousing and data mining: A case study

Yugoslav journal of operations research ◽ 10.2298/yjor0501125s ◽ 2005 ◽ Vol 15 (1) ◽ pp. 125-145 ◽ Cited By ~ 1

Author(s): Milija Suknovic ◽ Milutin Cupic ◽ Milan Martic ◽ Darko Krulj

Keyword(s): Data Mining ◽ Decision Making ◽ Data Warehouse ◽ Business Decision ◽ Serbia And Montenegro ◽ Data Mining Algorithms ◽ Use Of Data ◽ Business System ◽ Time Period ◽ Mining Algorithms

This paper presents the design and implementation of a data warehouse, as well as the use of data mining algorithms for knowledge discovery as the basic resource for an adequate business decision-making process. The project was carried out for the Student's Service Department of the Faculty of Organizational Sciences (FOS), University of Belgrade, Serbia and Montenegro. The system provides a good base for analysis and prediction over the coming period, in support of quality business decision-making by top management. The first part of the paper shows the steps in designing and developing the data warehouse of this business system. The second part shows the implementation of data mining algorithms for deducing rules, patterns, and knowledge as a resource for decision support.

Research of Production and Growth of Coriander in Various Seasons using K-Means Algorithm

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽ 10.35940/ijitee.l1129.10812s19 ◽ 2019 ◽ Vol 8 (12S) ◽ pp. 518-521

Keyword(s): Data Mining ◽ Meteorological Data ◽ Weather Condition ◽ Large Data ◽ Vital Role ◽ Data Sets ◽ Useful Knowledge ◽ Productivity Data ◽ The People ◽ Important Field

Agriculture is the main source of employment and resources in our country, and it will remain an important field in the coming years. Agriculture plays a vital role in the economic development of India: half of the Indian population depends mainly on it for a living. Compared with previous years, agriculture is now in poor condition. The most important reason is the lack of proper guidance for farmers. As a result, coriander yields suffer from a lack of knowledge about coriander cultivation methodologies: which season to cultivate in, which soil is best for a given weather condition, and when to harvest for the best yield. If farmers are aware of coriander cultivation and harvesting methodologies, this will help people in the real world and increase coriander productivity. Data mining is the process of finding new patterns in large data sets; the technology is used to infer useful, actionable knowledge from vast amounts of data. Climate is one kind of meteorological data that is rich in important knowledge. This paper presents a brief comparative study of techniques used for coriander yield estimation. The data mining technique used for coriander yield estimation is K-Means.
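As a rough illustration of the K-Means technique named in this abstract, the sketch below clusters a handful of made-up (temperature, rainfall) readings into two groups. The data values, the function name `kmeans`, and the choice to initialise centroids from the first k points are all assumptions made for illustration; the abstract does not describe the paper's data or implementation.

```python
def kmeans(points, k, iterations=20):
    """Plain K-Means: assign each point to its nearest centroid, then re-average."""
    centroids = [points[i] for i in range(k)]  # assumed init: first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Index of the centroid with the smallest squared Euclidean distance.
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster (keep it if empty).
        new_centroids = [
            tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centroids == centroids:  # assignments have stabilised
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical (temperature in C, rainfall in mm) readings, for illustration only.
readings = [
    (18.0, 40.0), (32.0, 5.0), (20.0, 45.0),
    (19.0, 42.0), (34.0, 8.0), (33.0, 6.0),
]
centroids, clusters = kmeans(readings, k=2)
print(centroids)
```

In a yield study, each resulting cluster would then be compared against recorded yields to see which weather conditions go with better harvests; that step depends on data the abstract does not give.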

Data mining fool’s gold

Journal of Information Technology ◽ 10.1177/0268396220915600 ◽ 2020 ◽ Vol 35 (3) ◽ pp. 182-194

Author(s): Gary Smith

Keyword(s): Data Mining ◽ Scientific Method ◽ Large Data ◽ Learning Systems ◽ Large Data Sets ◽ Data Sets ◽ Computer Algorithms ◽ Data Mining Algorithms ◽ Rigorous Testing ◽ Mining Algorithms

The scientific method is based on the rigorous testing of falsifiable conjectures. Data mining, in contrast, puts data before theory by searching for statistical patterns without being constrained by prespecified hypotheses. Artificial intelligence and machine learning systems, for example, often rely on data-mining algorithms to construct models with little or no human guidance. However, a plethora of patterns are inevitable in large data sets, and computer algorithms have no effective way of assessing whether the patterns they unearth are truly useful or meaningless coincidences. While data mining sometimes discovers useful relationships, the data deluge has caused the number of possible patterns that can be discovered relative to the number that are genuinely useful to grow exponentially—which makes it increasingly likely that what data mining unearths is likely to be fool’s gold.
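The claim that patterns are inevitable in large data sets can be made concrete with a small simulation. The sketch below is not from the paper: it draws a random target series and many unrelated random candidate series, then tracks the strongest absolute Pearson correlation found so far. The function names, the seed, and the sample sizes are assumptions chosen for illustration.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def running_best_correlation(n_obs, n_candidates, seed=0):
    """Best |r| between a random target and the first 1, 2, ..., n unrelated series."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    best_so_far, history = 0.0, []
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_obs)]  # pure noise
        best_so_far = max(best_so_far, abs(pearson(target, candidate)))
        history.append(best_so_far)
    return history


history = running_best_correlation(n_obs=20, n_candidates=1000)
print("best |r| after 10 candidates:  ", round(history[9], 3))
print("best |r| after 1000 candidates:", round(history[-1], 3))
```

Because every series is independent noise, any correlation found is coincidental, yet the best one can only grow as more candidates are searched. This is the mechanism behind the abstract's warning that unconstrained pattern search tends to surface meaningless coincidences.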

Research on the Parallel Frequent Data Mining Strategy under the Cloud Computing Environment

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.719-720.924 ◽ 2015 ◽ Vol 719-720 ◽ pp. 924-928 ◽ Cited By ~ 1

Author(s): Xiao Chun Sheng ◽ Xiao Feng Xue ◽ Yan Ping Cheng

Keyword(s): Data Mining ◽ Cloud Computing ◽ Large Data ◽ Efficient Solutions ◽ Data Repository ◽ Computing Environment ◽ Cloud Computing Environment ◽ Item Data ◽ Important Basis ◽ Data Mining Strategy

Cloud computing distributes computing tasks across the resources of a large number of networked computers to provide users with cheap and efficient computing power, storage capacity, and service capabilities. Data mining finds useful information in large data repositories. Quickly and accurately finding frequent items in large flows of data is an important basis for forecasting and decision-making. Therefore, a parallel frequent-item data mining strategy for the cloud computing environment, which provides efficient solutions for storing and analyzing vast amounts of data, has important theoretical significance and application value.

A Prediction System for Diagnosis of Diabetes Mellitus

Journal of Computational and Theoretical Nanoscience ◽ 10.1166/jctn.2020.8701 ◽ 2020 ◽ Vol 17 (1) ◽ pp. 6-9

Author(s): Ramya G. Franklin ◽ B. Muthukumar

Keyword(s): Data Mining ◽ Large Data ◽ Heterogeneous Data ◽ Common Disease ◽ Prediction System ◽ Health Issues ◽ Data Set ◽ Data Mining Algorithms ◽ Growth Of Science ◽ Mining Algorithms

The growth of science is a priceless asset to humans and society. The abundance of high-end machines has made life more sophisticated, but the price is paid in health issues. Health care data are complex and large, and these heterogeneous data are used to diagnose patients’ diseases. Predicting diseases at an early stage can save lives and give an upper hand in controlling them. Data mining approaches are very useful for analyzing complex, heterogeneous, and large data sets. The mining algorithms extract the essential data from the raw data. This paper presents a survey of the various data mining algorithms used to predict a very common disease in day-to-day life, diabetes mellitus. Over 246 million people in the world are diabetic, a majority of them women. The WHO reports that by 2025 this number is expected to rise to over 380 million.

Comparison of Various Data Mining Algorithms in the Prediction of Risk for Gestational Diabetes

International Journal of Advanced Research in Computer Science and Software Engineering ◽ 10.23956/ijarcsse.v7i8.26 ◽ 2017 ◽ Vol 7 (8) ◽ pp. 74 ◽ Cited By ~ 1

Author(s): R. Catherine Stephina Mary ◽ B. Satheesh Kumar

Keyword(s): Data Mining ◽ Blood Glucose ◽ Gestational Diabetes ◽ Blood Sugar ◽ Large Data ◽ The Body ◽ High Blood Sugar ◽ Data Mining Algorithms ◽ Increased Risk ◽ Bayes Algorithm

Data mining is a field of computer science used to discover new patterns in large data sets. Classification is an important task in data mining. In different areas of medicine, data mining has helped improve on the results of other methodologies. Gestational diabetes is a condition characterized by high blood sugar (glucose) levels that is first recognized during pregnancy. Diabetes is a disease in which levels of blood glucose, also called blood sugar, are above normal. People with diabetes have problems converting food to energy. Normally, after a meal, the body breaks food down into glucose, which the blood carries to cells throughout the body. Cells use insulin, a hormone made in the pancreas, to help them convert blood glucose into energy. During the second and third trimesters, a mother's diabetes can lead to over-nutrition and excess growth of the baby. Having a large baby increases risks during labour and delivery. For example, large babies often require caesarean delivery, and if delivered vaginally they are at increased risk of trauma to the shoulder. In addition, when foetal over-nutrition occurs and hyperinsulinemia results, the baby's blood sugar can drop very low after birth, since the baby is no longer receiving high blood sugar from the mother. However, with proper treatment, a mother with gestational diabetes can deliver a healthy baby. In this paper, classification algorithms including J48, simple CART, and Naïve Bayes are used to diagnose diabetes in pregnant women and are compared for accuracy.

Problems of KDD Cup 99 Dataset Existed and Data Preprocessing

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.667.218 ◽ 2014 ◽ Vol 667 ◽ pp. 218-225 ◽ Cited By ~ 5

Author(s): Yan Wang ◽ Kun Yang ◽ Xiang Jing ◽ Huang Long Jin

Keyword(s): Data Mining ◽ Intrusion Detection ◽ Detection System ◽ Data Preprocessing ◽ Vital Role ◽ Data Mining Algorithms ◽ Network Intrusion ◽ Depth Analysis ◽ Mining Algorithms ◽ Kdd Cup 99

The KDD Cup 99 dataset is not only the most widely used dataset in intrusion detection, but also the de facto benchmark for evaluating the performance of intrusion detection systems. Nevertheless, the dataset has many issues that cannot be ignored. To build good data mining models for intrusion detection and to find appropriate features for each type of network intrusion attack, researchers need a thorough understanding of this dataset. In this paper, we first make an in-depth analysis of the problems that exist in the dataset and give related solutions. Second, we carry out extensive data preprocessing on the 10% subset of the KDD Cup 99 training set, which gives better results in the subsequent processing. Finally, by comparing 10 common data mining algorithms in our experiment, we conclude that data preprocessing plays a vital role in the performance of data mining algorithms.
