Recovery of Data Dependencies

Author(s):  
Hee Beng Kuan Tan ◽  
Yuan Zhao

Today, many companies struggle to maintain legacy database applications that were developed on old database technology. These applications are becoming increasingly hard to maintain. Reengineering is an important means of addressing these problems and upgrading the applications to newer technology (Hainaut, Englebert, Henrard, Hick, & Roland, 1995). However, much of the design of a legacy database, including its data dependencies, is buried in the transactions that update the database and is not stated explicitly anywhere else. Recovering the designed data dependencies from transactions is essential both to the reengineering of database applications and to frequently encountered maintenance tasks. Without an automated approach, this recovery is difficult and time-consuming. The issue is also important in data mining, since it entails mining the relationships between data from program source code. Until recently, however, no such approach had been proposed in the literature.

Author(s):  
Hee Beng Kuan Tan ◽  
Yuan Zhao

Although the use of statistically probable properties is very common in medicine, it is not so in software engineering. The use of such properties may open a new avenue for the automated recovery of designs from source code. In fact, the recovery of designs can also be called program mining, which in turn can be viewed as an extension of data mining to program source code.


Author(s):  
Xuelong Zhang

With the advent of the era of big data, people are eager to extract valuable knowledge from rapidly expanding data so that they can use these massive data stores more effectively. Traditional data processing technology provides only basic functions such as data query and statistics, and cannot extract the knowledge contained in the data to predict future trends. Data mining (DM) therefore emerged alongside the rapid development of database technology and the rapid improvement of computing power. Research on DM algorithms draws on knowledge from various fields, such as databases, statistics, pattern recognition, and artificial intelligence. Pattern recognition mainly extracts features from known data samples. A DM algorithm that uses pattern recognition technology is a good way to obtain effective information from massive data and thereby provide decision support, and it has good application prospects. The support vector machine (SVM) is a pattern recognition algorithm proposed in recent years that avoids the curse of dimensionality by raising the dimension and linearizing the problem. On this basis, this paper studies DM algorithms based on pattern recognition and proposes a DM algorithm based on SVM. The algorithm divides the vectors of the SV set into two different types and iterates repeatedly until the classifier converges to the final result. Finally, a cross-validation simulation experiment shows that the DM algorithm based on pattern recognition can effectively reduce training time and solve the mining problem for massive data, which indicates that the algorithm is reasonable and feasible.


2012 ◽  
Vol 268-270 ◽  
pp. 1752-1757 ◽  
Author(s):  
Hong Yan Zhao

With the development of database technology and the widespread application of database management systems, the capability to collect data has improved rapidly, and large amounts of data have been accumulated. Data mining was created to excavate the useful knowledge hidden in these data. Data classification is not only an important issue in data mining but also an effective KDD method. The decision tree, a major data classification technology, is widely applied. This article summarizes and analyzes the concrete steps of mining data with a decision tree, the main algorithm, and the basic idea of the decision tree.
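
The abstract does not say which decision tree algorithm the article covers. As a minimal sketch of the basic idea behind decision tree induction, assuming an ID3-style criterion, the code below chooses the attribute to split on by information gain. The function names and the toy data are hypothetical and are not taken from the article.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the rows on one attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Attribute with the highest information gain (ID3-style choice)."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Hypothetical toy data: "outlook" separates the classes, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

if __name__ == "__main__":
    print(best_attribute(rows, labels, ["outlook", "windy"]))
```

A full tree builder would apply this choice recursively to each subset until the labels in a subset are pure or no attributes remain.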


2013 ◽  
Vol 846-847 ◽  
pp. 1435-1438
Author(s):  
Xin Ai Xu

This paper studies current distributed technology and database applications, using examples to analyze the state of distributed technology in present-day computing. Drawing on database theory, distributed technology, and database technology, it examines distributed processing in which the database is accessed through SQL calls. The paper is written from the perspective of combining distributed computing with databases and their applications.


2014 ◽  
Vol 971-973 ◽  
pp. 1820-1823
Author(s):  
Xi Long Ding

Data mining draws on a variety of technologies, such as databases, artificial intelligence, and mathematical statistics. This paper introduces the present state of database technology, data mining methods, and their application to building Bayesian networks. It uses data mining to solve concrete problems encountered in Bayesian network modeling, namely how to find the relationships between variables in a large-scale database and how to determine the conditional probabilities.
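
The abstract does not describe how the conditional probabilities are determined. One common approach, shown here only as a sketch under that assumption, is maximum-likelihood estimation by counting records once the parents of a variable are known. The function `estimate_cpt`, the variable names, and the records are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_cpt(records, child, parents):
    """Maximum-likelihood estimate of P(child | parents) by counting records."""
    joint = Counter()          # counts of (parent values, child value)
    parent_totals = Counter()  # counts of parent values alone
    for record in records:
        key = tuple(record[p] for p in parents)
        joint[(key, record[child])] += 1
        parent_totals[key] += 1
    cpt = defaultdict(dict)
    for (key, value), count in joint.items():
        cpt[key][value] = count / parent_totals[key]
    return dict(cpt)


# Hypothetical records with one parent ("rain") and one child ("wet").
records = [
    {"rain": True, "wet": True},
    {"rain": True, "wet": True},
    {"rain": True, "wet": False},
    {"rain": False, "wet": False},
]
cpt = estimate_cpt(records, "wet", ["rain"])

if __name__ == "__main__":
    for parent_values, distribution in cpt.items():
        print(parent_values, distribution)
```

Plain counting assigns zero probability to combinations that never occur in the data, so practical systems often add smoothing. That refinement is left out of this sketch.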


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
You Wu ◽  
Zheng Wang ◽  
Shengqi Wang

Data mining is currently a frontier research topic in the field of information and database technology and is recognized as one of the most promising key technologies. It involves multiple technologies, such as mathematical statistics, fuzzy theory, neural networks, and artificial intelligence, and is technically demanding and difficult to implement. In this article, we study the basic concepts, processes, and algorithms of association rule mining. To improve the efficiency of data mining in large-scale database applications, we propose an incremental association rule mining algorithm based on fast clustering. First, the feasibility of mining performance appraisal data is studied. Next, the business process needed to realize the information system is analyzed, and the related business process links and the corresponding data input interfaces are designed. The data flow for the data processing is then designed, including the data foundation and the database model. To mine large-scale databases efficiently, database development tools are used to implement the specific system settings and program design of the algorithm. Incorporated into the human resource management system of colleges and universities, the algorithm carried out successful association broadcasting, realized visualization, and finally discovered valuable information.
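
The two basic measures of association rule mining, support and confidence, can be illustrated with a short sketch. This is not the incremental, clustering-based algorithm proposed in the article. It only shows the standard definitions on hypothetical transactions, and the function names are assumptions made for illustration.

```python
from itertools import combinations


def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) over the transactions."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def pair_rules(transactions, min_support, min_confidence):
    """Single-item rules x -> y that pass both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        if support(transactions, {a, b}) < min_support:
            continue  # infrequent pair: neither direction can qualify
        for x, y in ((a, b), (b, a)):
            if confidence(transactions, {x}, {y}) >= min_confidence:
                rules.append((x, y))
    return rules


# Hypothetical market-basket transactions.
transactions = [frozenset(t) for t in (
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
)]

if __name__ == "__main__":
    print(pair_rules(transactions, min_support=0.5, min_confidence=0.9))
```

An incremental algorithm would avoid recomputing these counts from scratch when new transactions arrive. The sketch recomputes them every time to keep the definitions visible.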


Electronics ◽  
2020 ◽  
Vol 9 (9) ◽  
pp. 1348
Author(s):  
Alberto Arteta Albert ◽  
Nuria Gómez Blas ◽  
Luis Fernando de Mingo López

An issue that most databases face is the static and manual character of indexing operations. This old-fashioned way of indexing database objects has been shown to degrade database performance and cause downtime, problems that are usually solved by manually running index rebuild or defrag operations. Many data mining algorithms can be sped up by using appropriate index structures. Choosing the proper index largely depends on the type of query that the algorithm performs against the database. The statistical analyzers embedded in the Database Management System are not always accurate enough to determine automatically when to use an index or when to change its inner structure. This paper provides an algorithm that targets the indexes that are causing performance issues in the database and then performs an automatic operation (defrag, recreation, or modification) that can boost the overall performance of the Database System. The effectiveness of the proposed algorithm has been evaluated in several experiments, which show that this approach consistently leads to a better index configuration. The downtime caused by a damaged, fragmented, or inefficient index is reduced by increasing the chances that the optimizer uses the proper index structure.

