Multi-Relational Data Mining: A Comprehensive Survey

Data Mining on XML Data

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch079 ◽

2011 ◽

pp. 506-510

Author(s):

Qin Ding

Keyword(s):

Data Mining ◽

Data Storage ◽

Efficient Algorithms ◽

Relational Data ◽

Xml Data ◽

Structure Information ◽

Data Mining Algorithms ◽

Mining Algorithms ◽

Naive Approach

With the growing use of XML for data storage and exchange, there is a pressing need to develop efficient algorithms for mining semistructured XML data. Mining XML data is considerably harder than mining relational data because of the structural complexity of XML. A naïve approach is to first convert the XML data into relational format; however, structure information may be lost during the conversion. Efficient and effective data mining algorithms that can be applied directly to XML data are therefore needed.
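
The loss of structure under a naïve conversion can be illustrated with a small sketch. It is not taken from the chapter: the two sample documents, the function names, and the one-row-per-leaf "relational" layout are all assumptions chosen for illustration. Two documents that differ only in nesting become indistinguishable once flattened, while keeping the element path preserves the difference.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents: same leaf values, different nesting.
DOC_A = "<order><customer><name>Ann</name></customer><item><name>Pen</name></item></order>"
DOC_B = "<order><customer><name>Pen</name></customer><item><name>Ann</name></item></order>"


def flatten(xml_text):
    """Naive conversion: one (tag, text) row per leaf element, nesting discarded."""
    root = ET.fromstring(xml_text)
    return sorted(
        (el.tag, (el.text or "").strip()) for el in root.iter() if len(el) == 0
    )


def flatten_with_paths(xml_text):
    """Same rows, but each row keeps the full path from the root to the leaf."""
    rows = []

    def walk(el, prefix):
        path = prefix + "/" + el.tag
        if len(el) == 0:  # leaf element
            rows.append((path, (el.text or "").strip()))
        for child in el:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return sorted(rows)


# The naive rows cannot tell the documents apart; the path-aware rows can.
same_flat = flatten(DOC_A) == flatten(DOC_B)
same_paths = flatten_with_paths(DOC_A) == flatten_with_paths(DOC_B)
print(same_flat, same_paths)
```

Storing paths is only one possible way to retain structure; it is used here because it keeps the example short.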

Association Rule Mining of Relational Data

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch015 ◽

2011 ◽

pp. 87-93

Author(s):

Anne Denton

Keyword(s):

Data Mining ◽

Data Structures ◽

Association Rule ◽

Association Rule Mining ◽

Relational Data ◽

Rule Mining ◽

Data Mining Algorithms ◽

Mining Algorithms ◽

Relational Database Management ◽

Relational Database Management Systems

Most data of practical relevance are structured in more complex ways than is assumed in traditional data mining algorithms, which are based on a single table. The concept of relations allows for discussing many data structures such as trees and graphs. Relational data have much generality and are of significant importance, as demonstrated by the ubiquity of relational database management systems. It is, therefore, not surprising that popular data mining techniques, such as association rule mining, have been generalized to relational data. An important aspect of the generalization process is the identification of challenges that are new to the generalized setting.
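
For reference, the single-table setting that traditional association rule mining assumes can be sketched as follows. This is a minimal illustration of the standard support and confidence measures, not the relational generalization the chapter discusses; the transactions and function names are hypothetical.

```python
# Hypothetical single-table data: each row is one transaction (a set of items).
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of the whole rule divided by support of its antecedent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


rule_support = support(TRANSACTIONS, {"bread", "milk"})
rule_confidence = confidence(TRANSACTIONS, {"bread"}, {"milk"})
print(rule_support, rule_confidence)
```

In a relational setting the items of one "transaction" may be spread over several joined tables, which is where counting support becomes less straightforward.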

Identifying Decision Structures Underlying Activity Patterns: An Exploration of Data Mining Algorithms

Transportation Research Record Journal of the Transportation Research Board ◽

10.3141/1718-01 ◽

2000 ◽

Vol 1718 (1) ◽

pp. 1-9 ◽

Cited By ~ 39

Author(s):

Geert Wets ◽

Koen Vanhoof ◽

Theo Arentze ◽

Harry Timmermans

Keyword(s):

Data Mining ◽

Decision Tree ◽

Logit Model ◽

Goodness Of Fit ◽

Travel Demand ◽

Activity Patterns ◽

Future Research ◽

Data Set ◽

Data Mining Algorithms ◽

Mining Algorithms

The utility-maximizing framework, in particular the logit model, is the dominant framework in transportation demand modeling. Computational process modeling has been introduced as an alternative approach to deal with the complexity of activity-based models of travel demand. Current rule-based systems, however, lack a methodology to derive rules from data. The relevance and performance of data-mining algorithms that could provide the required methodology are explored. In particular, the C4 algorithm is applied to derive a decision tree for transport mode choice in the context of activity scheduling from a large activity diary data set. The algorithm is compared with both an alternative method of inducing decision trees (CHAID) and a logit model on the basis of goodness-of-fit on the same data set. The proportion of correctly predicted cases in a holdout sample is almost identical for the three methods. This suggests that for data sets of comparable complexity, the accuracy of predictions does not provide grounds for either rejecting or choosing the C4 method. However, the method may have advantages related to robustness. Future research is required to determine the ability of decision tree-based models to predict behavioral change.

Prediction of Wind Farm Power Ramp Rates: A Data-Mining Approach

Journal of Solar Energy Engineering ◽

10.1115/1.3142727 ◽

2009 ◽

Vol 131 (3) ◽

Cited By ~ 65

Author(s):

Haiyang Zheng ◽

Andrew Kusiak

Keyword(s):

Data Mining ◽

Time Series ◽

Wind Farm ◽

Multivariate Time Series ◽

Future Research ◽

Support Vector ◽

Time Series Models ◽

Ramp Rate ◽

Data Mining Algorithms ◽

Mining Algorithms

In this paper, multivariate time series models are built with data-mining algorithms to predict the power ramp rates of a wind farm. The power changes are predicted at 10 min intervals. Five different data-mining algorithms were tested using data collected at a wind farm of 100 turbines. The support vector machine regression algorithm performed best of the five algorithms studied, providing predictions of the power ramp rate for a time horizon of 10–60 min. The boosting tree algorithm selects parameters to improve the prediction accuracy of the power ramp rate. Test results of the multivariate time series models are presented, and suggestions for future research are provided.

Association Rule Mining of Relational Data

Encyclopedia of Data Warehousing and Mining ◽

10.4018/978-1-59140-557-3.ch014 ◽

2011 ◽

pp. 70-73

Author(s):

Anne Denton ◽

Christopher Besemann

Keyword(s):

Data Mining ◽

Data Structures ◽

Association Rule ◽

Association Rule Mining ◽

Relational Data ◽

Rule Mining ◽

Data Mining Algorithms ◽

Mining Algorithms ◽

Relational Database Management ◽

Relational Database Management Systems

Most data of practical relevance are structured in more complex ways than is assumed in traditional data mining algorithms, which are based on a single table. The concept of relations allows for discussing many data structures such as trees and graphs. Relational data have much generality and are of significant importance, as demonstrated by the ubiquity of relational database management systems. It is, therefore, not surprising that popular data mining techniques, such as association rule mining, have been generalized to relational data. An important aspect of the generalization process is the identification of problems that are new to the generalized setting.

Analyzing Customer Behavior Using Online Analytical Mining (OLAM)

Marketing and Consumer Behavior ◽

10.4018/978-1-4666-7357-1.ch040 ◽

2015 ◽

pp. 894-910 ◽

Cited By ~ 1

Author(s):

Thanachart Ritbumroong

Keyword(s):

Data Mining ◽

Behavior Analysis ◽

Customer Behavior ◽

Future Research ◽

Technical Performance ◽

Data Mining Algorithms ◽

Integration Data ◽

Mining Algorithms ◽

Provided Examples

Online Analytical Mining (OLAM) is an architecture integrating data mining into OLAP. With this integration, data mining algorithms can be performed with OLAP abilities. OLAM enables users to choose a particular portion of data and analyze them with data mining models. Previous studies have provided examples of OLAM applications with the motivation to improve technical performance. This chapter reviews the capabilities of OLAM and discusses the well-known concept encompassing the analysis of customer behavior. The underlying motivation of this chapter is to present the opportunities for the development of OLAM to support the customer behavior analysis. Three main directions of the advancement in OLAM are proposed for future research.

Analyzing Customer Behavior Using Online Analytical Mining (OLAM)

Advances in Business Strategy and Competitive Advantage - Integration of Data Mining in Business Intelligence Systems ◽

10.4018/978-1-4666-6477-7.ch006 ◽

2015 ◽

pp. 98-118

Author(s):

Thanachart Ritbumroong

Keyword(s):

Data Mining ◽

Behavior Analysis ◽

Customer Behavior ◽

Future Research ◽

Technical Performance ◽

Data Mining Algorithms ◽

Integration Data ◽

Mining Algorithms ◽

Provided Examples

Online Analytical Mining (OLAM) is an architecture integrating data mining into OLAP. With this integration, data mining algorithms can be performed with OLAP abilities. OLAM enables users to choose a particular portion of data and analyze them with data mining models. Previous studies have provided examples of OLAM applications with the motivation to improve technical performance. This chapter reviews the capabilities of OLAM and discusses the well-known concept encompassing the analysis of customer behavior. The underlying motivation of this chapter is to present the opportunities for the development of OLAM to support the customer behavior analysis. Three main directions of the advancement in OLAM are proposed for future research.

Directions for Future Research on the Automatic Design of Data Mining Algorithms

Natural Computing Series - Automating the Design of Data Mining Algorithms ◽

10.1007/978-3-642-02541-9_7 ◽

2009 ◽

pp. 177-184 ◽

Cited By ~ 1

Author(s):

Gisele L. Pappa ◽

Alex A. Freitas

Keyword(s):

Data Mining ◽

Future Research ◽

Automatic Design ◽

Data Mining Algorithms ◽

Mining Algorithms

Research on Data Mining Algorithm Based on Rough Set

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.433-440.3340 ◽

2012 ◽

Vol 433-440 ◽

pp. 3340-3346 ◽

Cited By ~ 1

Author(s):

Yong Bin Yang

Keyword(s):

Data Mining ◽

Rough Set ◽

Search Space ◽

Data Mining Algorithm ◽

Data Mining Algorithms ◽

Starting Point ◽

Depth Study ◽

Mining Algorithms ◽

Selection Of ◽

Improved Algorithm

Building on an in-depth study of existing rough set and data mining technologies, this paper presents an improved algorithm that addresses the shortcomings of existing rough-set-based data mining algorithms. The algorithm takes the attribute core as the starting point of the reduction calculation, uses a filtered discernibility matrix as the basis for selecting candidate attributes, and uses the information entropy of the condition and decision attributes as heuristic information to find the smallest reduct of the decision information system. The improved algorithm remedies the defects of the heuristic algorithm based on the discernibility matrix and shrinks the attribute search space, thereby speeding up reduction.
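
The starting point of such a reduction can be sketched under textbook rough set definitions. The decision table, attribute names, and function names below are hypothetical, and the sketch covers only the first step: it builds the discernibility entries and reads off the core (the attributes that appear alone in some entry and so cannot be dropped). The paper's matrix filtering and entropy heuristic are not reproduced.

```python
from itertools import combinations

# Hypothetical decision table: three condition attributes, one decision attribute.
CONDITIONS = ["headache", "temp", "cough"]
DECISION = "flu"
TABLE = [
    {"headache": "yes", "temp": "high", "cough": "no", "flu": "yes"},
    {"headache": "yes", "temp": "normal", "cough": "no", "flu": "no"},
    {"headache": "no", "temp": "high", "cough": "yes", "flu": "yes"},
    {"headache": "no", "temp": "normal", "cough": "yes", "flu": "no"},
]


def discernibility_entries(rows, condition_attrs, decision_attr):
    """For each pair of objects with different decisions, collect the
    condition attributes on which the two objects differ."""
    entries = []
    for a, b in combinations(rows, 2):
        if a[decision_attr] != b[decision_attr]:
            diff = frozenset(c for c in condition_attrs if a[c] != b[c])
            if diff:  # skip inconsistent pairs that no attribute can separate
                entries.append(diff)
    return entries


def core(entries):
    """Attributes that are the only way to discern some pair of objects."""
    return {next(iter(e)) for e in entries if len(e) == 1}


entries = discernibility_entries(TABLE, CONDITIONS, DECISION)
core_attrs = core(entries)
print(core_attrs)
```

In this toy table the core already separates every pair with different decisions, so no further candidate attributes would need to be selected; in general a heuristic is used to extend the core to a reduct.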

Novel Adverse Events of Iloperidone: A Disproportionality Analysis in US Food and Drug Administration Adverse Event Reporting System (FAERS) Database

Current Drug Safety ◽

10.2174/1574886313666181026100000 ◽

2019 ◽

Vol 14 (1) ◽

pp. 21-26 ◽

Cited By ~ 2

Author(s):

Viswam Subeesh ◽

Eswaran Maheswari ◽

Hemendra Singh ◽

Thomas Elsa Beulah ◽

Ann Mary Swaroop

Keyword(s):

Data Mining ◽

Adverse Event ◽

Adverse Events ◽

Reporting System ◽

Adverse Event Reporting System ◽

Adverse Event Reporting ◽

Disproportionality Analysis ◽

Positive Signal ◽

Data Mining Algorithms ◽

Mining Algorithms

Background: A signal is defined as “reported information on a possible causal relationship between an adverse event and a drug, of which the relationship is unknown or incompletely documented previously”. Objective: To detect novel adverse events of iloperidone by disproportionality analysis of the FDA Adverse Event Reporting System (FAERS) database using Data Mining Algorithms (DMAs). Methodology: The US FAERS database contains 1028 iloperidone-associated Drug Event Combinations (DECs) reported from 2010 Q1 to 2016 Q3. DECs were considered for disproportionality analysis only if at least ten reports were present in the database for the given adverse event and the event had not been detected earlier (in clinical trials). Two data mining algorithms, namely Reporting Odds Ratio (ROR) and Information Component (IC), were applied retrospectively to the aforementioned time period. Values of ROR-1.96SE>1 and IC-2SD>0 were taken as the thresholds for a positive signal. Results: The mean age of patients with iloperidone-associated events was 44 years [95% CI: 36-51]; age was not mentioned in twenty-one reports. The data mining algorithms showed positive signals for akathisia (ROR-1.96SE=43.15, IC-2SD=2.99), dyskinesia (21.24, 3.06), peripheral oedema (6.67, 1.08), priapism (425.7, 9.09) and sexual dysfunction (26.6, 1.5), as these values were well above the pre-set thresholds. Conclusion: Five potential signals associated with iloperidone were generated by data mining in the FDA AERS database. The results call for further clinical surveillance to quantify and validate the possible risks of the adverse events reported for iloperidone.
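
The ROR criterion can be sketched as follows. This is a minimal illustration, not the authors' code: the 2x2 counts are made up, the function names are hypothetical, and "ROR-1.96SE" is assumed to mean the lower limit of the 95% confidence interval computed on the log scale, which is the usual convention. The IC criterion is not sketched.

```python
import math

MIN_REPORTS = 10  # minimum number of reports per event, as stated in the abstract


def reporting_odds_ratio(a, b, c, d):
    """ROR from a 2x2 contingency table of report counts:
    a: target drug, target event     b: target drug, other events
    c: other drugs, target event     d: other drugs, other events
    """
    return (a * d) / (b * c)


def ror_lower_bound(a, b, c, d, z=1.96):
    """Lower 95% confidence limit of the ROR (assumed reading of 'ROR-1.96SE')."""
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(math.log(reporting_odds_ratio(a, b, c, d)) - z * se_log)


def is_positive_signal(a, b, c, d):
    """Apply the report-count filter, then the lower-bound threshold of 1."""
    return a >= MIN_REPORTS and ror_lower_bound(a, b, c, d) > 1


# Made-up counts for illustration only; they are not FAERS data.
counts = (20, 980, 100, 98900)
print(reporting_odds_ratio(*counts), ror_lower_bound(*counts), is_positive_signal(*counts))
```

Using the lower confidence limit rather than the point estimate makes the criterion more conservative for events with few reports, since the interval widens as the counts shrink.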
