Incremental Discovery of Fuzzy Functional Dependencies

Author(s):  
Shyue-Liang Wang ◽  
Ju-Wen Shen ◽  
Tzung-Pei Hong

Mining functional dependencies (FDs) from databases has been identified as an important database analysis technique and has received considerable research interest in recent years. However, most current data mining techniques for determining functional dependencies deal only with crisp databases. Although various forms of fuzzy functional dependencies (FFDs) have been proposed for fuzzy databases, they emphasize conceptual viewpoints, and only a few mining algorithms have been given. In this research, we propose methods to validate and incrementally search for FFDs in similarity-based fuzzy relational databases. For a given pair of attributes, the validation of FFDs is based on fuzzy projection and fuzzy selection operations. In addition, FFDs are shown to be monotonic in the sense that r1 ⊆ r2 implies FDa(r1) ⊇ FDa(r2). An incremental search algorithm for FFDs based on this property is then presented. Experimental results showing the behavior of the search algorithm are discussed.
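
The monotonicity property can be illustrated with a small sketch. It is not the authors' projection- and selection-based procedure: it assumes a simplified pairwise definition in which X → Y holds at level alpha when every pair of tuples that is alpha-similar on X is also alpha-similar on Y. The helper names (`linear_sim`, `violates`, `discover`, `add_tuple`), the similarity relations, and the sample data are all hypothetical. Under this assumption, adding a tuple can only remove dependencies, so an incremental step needs to re-check only the dependencies that already held, and only against pairs involving the new tuple.

```python
from itertools import combinations

def linear_sim(scale):
    """Hypothetical similarity relation on a numeric domain:
    1 at equality, falling linearly to 0 at distance `scale`."""
    return lambda a, b: max(0.0, 1.0 - abs(a - b) / scale)

def violates(t1, t2, x, y, sim, alpha):
    """A tuple pair violates x -> y if it is alpha-similar on x but not on y."""
    return sim[x](t1[x], t2[x]) >= alpha and sim[y](t1[y], t2[y]) < alpha

def discover(rows, attrs, sim, alpha):
    """Full search: check every ordered attribute pair against every tuple pair."""
    return {
        (x, y)
        for x in attrs for y in attrs
        if x != y and not any(violates(t1, t2, x, y, sim, alpha)
                              for t1, t2 in combinations(rows, 2))
    }

def add_tuple(rows, new_row, held, sim, alpha):
    """Incremental step: by monotonicity only dependencies in `held` can survive,
    and only pairs that involve the new tuple need to be checked."""
    return {
        (x, y) for (x, y) in held
        if not any(violates(t, new_row, x, y, sim, alpha) for t in rows)
    }

# Illustrative data (assumed, not from the paper).
attrs = ["age", "salary"]
sim = {"age": linear_sim(50), "salary": linear_sim(100)}
rows = [
    {"age": 25, "salary": 30},
    {"age": 27, "salary": 32},
    {"age": 50, "salary": 80},
]
alpha = 0.9

held = discover(rows, attrs, sim, alpha)
new_row = {"age": 26, "salary": 60}
updated = add_tuple(rows, new_row, held, sim, alpha)
print(sorted(held), sorted(updated))
```

In this example the new tuple is close in age to an existing tuple but far from it in salary, so the dependency from age to salary is dropped while the reverse dependency survives. The incremental result matches a full re-run of `discover` on the enlarged relation.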

Author(s):  
Ali Bayır ◽  
Sevinç Gülseçen ◽  
Gökhan Türkmen

Political elections are influenced by a number of factors such as political tendencies, voters' perceptions, and preferences. The results of a political election could also be based on specific attributes of candidates: age, gender, occupation, education, etc. Although it is very difficult to understand all the factors that could have influenced the outcome of an election, many of the attributes mentioned above could be included in a data set, and current data mining techniques can then reveal undiscovered patterns. Despite the unpredictability of the human behavior and choices involved, data mining techniques could still help in predicting election outcomes. In this study, the results of a survey prepared by KONDA Research and Consultancy Company before the 2011 elections in Turkey were used as raw data. This study may help in understanding how data mining methods and techniques could be used in political science research. The study may also reveal whether voting tendencies could be a factor in the outcome of an election.


Author(s):  
Alex A. Freitas ◽  
Gisele L. Pappa

At present there is a wide range of data mining algorithms available to researchers and practitioners (Witten & Frank, 2005; Tan et al., 2006). Despite the great diversity of these algorithms, virtually all of them share one feature: they have been manually designed. As a result, current data mining algorithms in general incorporate human biases and preconceptions in their designs. This article proposes an alternative approach to the design of data mining algorithms, namely the automatic creation of data mining algorithms by means of Genetic Programming (GP) (Pappa & Freitas, 2006). In essence, GP is a type of Evolutionary Algorithm – i.e., a search algorithm inspired by the Darwinian process of natural selection – that evolves computer programs or executable structures. This approach opens new avenues for research, providing the means to design novel data mining algorithms that are less limited by human biases and preconceptions, and so offer the potential to discover new kinds of patterns (or knowledge) to the user. It also offers an interesting opportunity for the automatic creation of data mining algorithms tailored to the data being mined.


2014 ◽  
Vol 5 (3) ◽  
pp. 11-28
Author(s):  
Ljiljana Kašćelan ◽  
Vladimir Kašćelan ◽  
Milijana Novović-Burić

This paper proposes a data mining approach for risk assessment in car insurance. Standard methods classify policies into a large number of tariff classes and assess risk on that basis. By applying data mining techniques, it is possible to obtain functional dependencies between the level of risk and risk factors, as well as better prediction results. The case study data show that data mining techniques can predict claim sizes and the occurrence of claims with better accuracy than the standard methods, and this forms the basis for calculating the net risk premium and for risk classification. This paper also discusses the advantages of data mining methods over standard methods for risk assessment in car insurance, as well as the specificities of the results obtained in a small insurance market such as the one in Montenegro.


With improvements in information processing and memory capacity, vast amounts of data are collected for various data analysis purposes. Data mining techniques are used to extract useful knowledge. When data are extracted with data mining techniques, they can be disclosed publicly, which leads to breaches of private data. Privacy-preserving data mining is used to protect sensitive information from unwanted or unsanctioned disclosure. In this paper, we analyze the problem of discovering similarity checks for functional dependencies from a given dataset such that applying the (l, d) inference algorithm with generalization can anonymize the microdata without loss in utility [8]. This work presents a functional-dependency-based perturbation approach that hides sensitive information from the user by applying the (l, d) inference model to the dependency attributes, based on information gain. This approach works on both categorical and numerical attributes. The perturbed dataset does not affect the original dataset and maintains the same or very comparable patterns. Hence the utility of the application remains high compared to other data mining techniques. The accuracy of the original and perturbed datasets is compared and analyzed using data mining tools and classification algorithms.


Author(s):  
Miroslav Hudec ◽  
Miljan Vučetić ◽  
Mirko Vujošević

Data mining methods based on fuzzy logic have been developed recently and have become an increasingly important research area. In this chapter, the authors examine possibilities for discovering potentially useful knowledge from relational databases by integrating fuzzy functional dependencies and linguistic summaries. Both methods use fuzzy logic tools for data analysis and for acquiring and representing expert knowledge. Fuzzy functional dependencies can detect whether a dependency between two examined attributes exists across the whole database. If a dependency exists only between parts of the examined attributes' domains, fuzzy functional dependencies cannot detect its character. Linguistic summaries are a convenient method for revealing this kind of dependency. Using fuzzy functional dependencies and linguistic summaries in a complementary way could mine valuable information from relational databases. Mining the intensities of dependencies between database attributes could support decision making, reduce the number of attributes in databases, and help estimate missing values. The proposed approach is evaluated with case studies using real data from official statistics. Strengths and weaknesses of the described methods are discussed. At the end of the chapter, topics for further research are outlined.
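
How a linguistic summary captures a dependency that holds only on part of a domain can be sketched as follows. The sketch assumes the common relative-quantifier form "most records with high A have high B", whose truth value is the quantifier's membership of the ratio between the matched and the restricted fuzzy cardinalities. The chapter's exact formulation may differ, and the membership functions, their parameters, the field names `a` and `b`, and the data are all hypothetical.

```python
def ramp(lo, hi):
    """Increasing linear membership function: 0 at or below lo, 1 at or above hi."""
    def mu(v):
        if v <= lo:
            return 0.0
        if v >= hi:
            return 1.0
        return (v - lo) / (hi - lo)
    return mu

def summary_truth(records, restriction, summarizer, quantifier):
    """Truth of 'Q records satisfying the restriction also satisfy the summarizer'."""
    weights = [restriction(r) for r in records]
    total = sum(weights)
    if total == 0:
        return 0.0  # no record belongs to the restricted part of the domain
    matched = sum(min(w, summarizer(r)) for w, r in zip(weights, records))
    return quantifier(matched / total)

# Assumed fuzzy sets: 'high' on both attributes and the quantifier 'most'.
high_a = ramp(10, 20)
high_b = ramp(10, 20)
most = ramp(0.5, 0.85)

# Illustrative records (assumed, not from the chapter).
records = [
    {"a": 25, "b": 30},
    {"a": 22, "b": 21},
    {"a": 5, "b": 18},
    {"a": 4, "b": 2},
    {"a": 30, "b": 5},
]

truth = summary_truth(
    records,
    restriction=lambda r: high_a(r["a"]),
    summarizer=lambda r: high_b(r["b"]),
    quantifier=most,
)
print(round(truth, 3))
```

Records with low `a` receive zero weight, so the summary describes only the "high A" part of the domain. This is the kind of partial dependency the abstract says a dependency evaluated over the whole database would miss.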


Big Data ◽  
2016 ◽  
pp. 2028-2046
Author(s):  
Ljiljana Kašćelan ◽  
Vladimir Kašćelan ◽  
Milijana Novović-Burić

This paper proposes a data mining approach for risk assessment in car insurance. Standard methods classify policies into a large number of tariff classes and assess risk on that basis. By applying data mining techniques, it is possible to obtain functional dependencies between the level of risk and risk factors, as well as better prediction results. The case study data show that data mining techniques can predict claim sizes and the occurrence of claims with better accuracy than the standard methods, and this forms the basis for calculating the net risk premium and for risk classification. This paper also discusses the advantages of data mining methods over standard methods for risk assessment in car insurance, as well as the specificities of the results obtained in a small insurance market such as the one in Montenegro.


2014 ◽  
Vol 926-930 ◽  
pp. 2280-2283
Author(s):  
Qiong Ren

As input data size increases, processing time becomes very long, and the explosive growth of Internet data has even exceeded what a single machine can handle. This article introduces the concept and architecture of cloud computing, analyzes current mainstream data mining algorithms, and develops a data mining system based on cloud computing. It demonstrates the operational feasibility of data mining on a cloud computing platform and offers strong guiding significance.


Author(s):  
Pheeha Machaka ◽  
Fulufhelo Nelwamondo

This chapter reviews the evolution of the traditional internet into the Internet of Things (IoT). The characteristics and applications of the IoT are also reviewed, together with its security concerns in terms of Distributed Denial of Service (DDoS) attacks. The chapter explores the characteristics and pervasiveness of DDoS attacks, as well as the motives, mechanisms, and techniques used to execute them. It further investigates the state of the art in data mining techniques currently used to detect and combat DDoS attacks targeting various infrastructures, and explores their advantages and disadvantages. Future directions for the research are also provided.


Author(s):  
G. Ramadevi ◽  
Srujitha Yeruva ◽  
P. Sravanthi ◽  
P. Eknath Vamsi ◽  
S. Jaya Prakash

In a digitized world, data is growing exponentially, and it is difficult to analyze the data and produce results. Data mining techniques play an important role in big data for the healthcare sector. By using data mining algorithms, it is possible to analyze, detect, and predict the presence of disease, which helps doctors detect the disease early and supports decision making. The objective is to design an automated tool that reports a patient's treatment history, disease, and medical data to doctors. Data mining techniques are very useful in analyzing medical data to find meaningful and practical patterns. This project works on diabetes medical data: classification and clustering algorithms (OPTICS, Naive Bayes, and BIRCH) are implemented and their efficiency is examined.

