Building Empirical-Based Knowledge for Design Recovery

Author(s):  
Hee Beng Kuan Tan ◽  
Yuan Zhao

Although the use of statistically probable properties is very common in medicine, it is not so in software engineering. The use of such properties may open a new avenue for the automated recovery of designs from source code. In fact, the recovery of designs can also be called program mining, which in turn can be viewed as an extension of data mining to program source code.

Author(s):  
Minh Ngoc Ngo

Due to the need to reengineer and migrate aging software and legacy systems, reverse engineering has started to receive some attention. It has now been established as an area in software engineering concerned with understanding software structure and with recovering or extracting designs and features from programs, mainly from source code. The inference of designs and features from code is closely similar to data mining, which extracts and infers information from data. In view of this similarity, reverse engineering from program code can be called program mining. Traditionally, the latter has been based mainly on invariant properties and heuristic rules. Recently, empirical properties have been introduced to augment the existing methods. This article summarizes some of the work in this area.


Author(s):  
Hee Beng Kuan Tan ◽  
Yuan Zhao

Today, many companies have to deal with problems in maintaining legacy database applications that were developed on old database technology. These applications are getting harder and harder to maintain. Reengineering is an important means to address these problems and to upgrade the applications to newer technology (Hainaut, Englebert, Henrard, Hick, & Roland, 1995). However, much of the design of legacy databases, including data dependencies, is buried in the transactions that update the databases; it is not explicitly stated anywhere else. The recovery of the data dependencies designed into transactions is essential both to the reengineering of database applications and to frequently encountered maintenance tasks. Without an automated approach, the recovery is difficult and time-consuming. The issue is also relevant to data mining, as it entails mining the relationships between data from program source code. However, until recently, no such approach had been proposed in the literature.
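To make the idea concrete, here is a minimal sketch, not the authors' method: it scans SQL UPDATE statements in transaction source for SET clauses and records which columns each updated column is computed from, yielding candidate data dependencies. The regular expressions and the example statement are assumptions for illustration only.

```python
import re

# A sketch only, not the authors' algorithm: scan SQL UPDATE statements for
# "SET target = expression" clauses and record which columns the expression
# reads, yielding candidate data dependencies. Keyword filtering and real SQL
# parsing are omitted to keep the idea visible.
UPDATE_RE = re.compile(
    r"UPDATE\s+(\w+)\s+SET\s+(\w+)\s*=\s*(.+?)(?:\s+WHERE\b|;|$)",
    re.IGNORECASE | re.DOTALL,
)
NAME_RE = re.compile(r"\b[A-Za-z_]\w*\b")

def candidate_dependencies(transaction_source):
    """Yield (source_column, target_column) pairs found in UPDATE statements."""
    for table, target, expression in UPDATE_RE.findall(transaction_source):
        for name in NAME_RE.findall(expression):
            if name != target:
                yield (f"{table}.{name}", f"{table}.{target}")

if __name__ == "__main__":
    source = "UPDATE orders SET total = price * quantity WHERE id = 42;"
    for src, dst in sorted(set(candidate_dependencies(source))):
        print(f"{dst} is computed from {src}")
```

Run on the example statement, the sketch reports that orders.total depends on orders.price and orders.quantity, exactly the kind of dependency that is otherwise buried in transaction code.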


2017 ◽  
Vol 27 (09n10) ◽  
pp. 1579-1589 ◽  
Author(s):  
Reinier Morejón ◽  
Marx Viana ◽  
Carlos Lucena

Data mining is a hot topic that attracts researchers from different areas, such as databases, machine learning, and agent-oriented software engineering. As data volumes grow, there is an increasing need to obtain knowledge from large datasets that are very difficult to handle and process with traditional methods. Software agents can play a significant role in performing data mining processes more efficiently. For instance, they can perform selection, extraction, preprocessing, and integration of data, as well as parallel, distributed, or multisource mining. This paper proposes a framework based on multiagent systems to apply data mining techniques to health datasets. As usage scenarios, we take datasets for hypothyroidism and diabetes and run two different mining processes in parallel on each database.
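As a rough illustration of parallel mining over several health datasets (not the paper's framework), the sketch below runs one mining task per dataset in separate processes; the CSV paths, the "label" column, and the decision-tree learner are all assumptions.

```python
from concurrent.futures import ProcessPoolExecutor

import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def mine(dataset_path):
    """One 'agent': load a health dataset and score a classifier on it."""
    data = pd.read_csv(dataset_path)
    # The "label" column name is an assumption about the dataset layout.
    features, labels = data.drop(columns=["label"]), data["label"]
    score = cross_val_score(DecisionTreeClassifier(), features, labels, cv=5).mean()
    return dataset_path, score

if __name__ == "__main__":
    datasets = ["hypothyroidism.csv", "diabetes.csv"]  # hypothetical file paths
    with ProcessPoolExecutor() as pool:  # one process per mining task
        for path, score in pool.map(mine, datasets):
            print(f"{path}: mean cross-validated accuracy {score:.3f}")
```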


Author(s):  
MIRA KAJKO-MATTSSON ◽  
NED CHAPIN

Consider two independently done software engineering studies that used different approaches to cover some of the same subject area, such as software maintenance. Although done differently and for different purposes, to what extent can each study serve as a validation of the other? Within the scope of the subject area overlap, data mining can be applied to provide a quantitative assessment. This paper reports on data mining that attempted to cross-validate two independently done and published software engineering studies of software maintenance: one on a corrective maintenance maturity model, and the other on an objective classification of software maintenance activities. The data mining established that each of the two independently done studies effectively and very strongly validates the other.
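The abstract does not spell out the statistics used, but one plausible quantitative agreement check, assuming both studies classify the same set of maintenance activities, is Cohen's kappa; the labels below are hypothetical.

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical labels: how each study classifies the same ten maintenance
# activities. Perfect agreement gives kappa = 1.0; chance-level gives ~0.
study_a = ["corrective", "adaptive", "corrective", "perfective", "corrective",
           "adaptive", "perfective", "corrective", "adaptive", "corrective"]
study_b = ["corrective", "adaptive", "corrective", "perfective", "adaptive",
           "adaptive", "perfective", "corrective", "adaptive", "corrective"]

print(f"Cohen's kappa = {cohen_kappa_score(study_a, study_b):.2f}")
```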


2009 ◽  
Vol 34 (1) ◽  
pp. 87-107 ◽  
Author(s):  
Oscar Marbán ◽  
Javier Segovia ◽  
Ernestina Menasalvas ◽  
Covadonga Fernández-Baizán

Computer ◽  
2009 ◽  
Vol 42 (7) ◽  
pp. 55-62 ◽  
Author(s):  
Tao Xie ◽  
Suresh Thummalapenta ◽  
David Lo ◽  
Chao Liu

2019 ◽  
Vol 35 (21) ◽  
pp. 4413-4418 ◽  
Author(s):  
Mustafa Solmaz ◽  
Adam Lane ◽  
Bilal Gonen ◽  
Ogulsheker Akmamedova ◽  
Mehmet H Gunes ◽  
...  

Motivation: An important goal of cancer genomics initiatives is to provide the research community with the resources for the unbiased query of cancer mechanisms. Several excellent web platforms have been developed to enable the visual analyses of molecular alterations in cancers from these datasets. However, there are few tools that allow researchers to mine these resources for mechanisms of cancer processes and their functional interactions in an intuitive, unbiased manner.

Results: To address this need, we developed SEMA, a web platform for building and testing models of cancer mechanisms from large multidimensional cancer genomics datasets. Unlike the existing tools for the analyses and query of these resources, SEMA is explicitly designed to enable exploratory and confirmatory analyses of complex cancer mechanisms through a suite of intuitive visual and statistical functionalities. Here, we present a case study of the functional mechanisms of TP53-mediated tumor suppression in various cancers, using SEMA, and identify its role in the regulation of cell cycle progression, DNA repair and signal transduction in different cancers. SEMA is a first-in-its-class web application designed to allow visual data mining and hypothesis testing from multidimensional cancer datasets. The web application, an extensive tutorial and several video screencasts with case studies are freely available for academic use at https://sema.research.cchmc.org/.

Availability and implementation: SEMA is freely available at https://sema.research.cchmc.org. The web site also contains a detailed Tutorial (also in Supplementary Information), and a link to the YouTube channel for video screencasts of analyses, including the analyses presented here. The Shiny and JavaScript source codes have been deposited to GitHub: https://github.com/msolmazm/sema.

Supplementary information: Supplementary data are available at Bioinformatics online.


Author(s):  
Marcela Ridao ◽  
Jorge Horacio Doorn

Requirements Engineering is frequently seen as the activity of the Software Engineering process with the least tool support; usually only graphic and text editing aids are available. This view is supported by the perception that it is a human-intensive task. This chapter is based on the understanding that such a perception is only partially true. Models used along the Requirements Engineering process have underlying structures holding semantic information that is difficult for the reader to see. In fact, models created with a well-defined objective were designed to maximize their expressiveness for that objective. However, they may also hold some useful hidden information. This is where a specialized tool may become valuable. From an epistemological point of view, this situation is similar to what happens in data mining. This chapter describes a tool able to make visible any clustering existing in Universe of Discourse glossaries; it is based on the automatic construction of graphs from references embedded in the glossary itself.
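A minimal sketch of that idea under assumed data: treat each glossary term as a node, draw an edge whenever one entry's definition references another term, and read clusters off as connected components. The glossary contents below are hypothetical, and networkx stands in for whatever graph machinery the chapter actually uses.

```python
import networkx as nx

glossary = {  # hypothetical Universe of Discourse glossary entries
    "order": "A request placed by a customer for one or more products.",
    "customer": "A person who places an order.",
    "product": "An item that can appear in an order.",
    "invoice": "A billing document issued for an order.",
    "backup": "A copy of the database kept for recovery purposes.",
}

# Draw an edge whenever one entry's definition references another term.
graph = nx.Graph()
graph.add_nodes_from(glossary)
for term, definition in glossary.items():
    for other in glossary:
        if other != term and other in definition.lower():
            graph.add_edge(term, other)

# Clusters surface as the connected components of the reference graph.
for cluster in nx.connected_components(graph):
    print(sorted(cluster))
```

On this toy glossary, the order/customer/product/invoice entries form one cluster while the unrelated backup entry stands alone, which is exactly the kind of shadowed structure the chapter's tool is meant to expose.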

