Building Empirical-Based Knowledge for Design Recovery

Author(s):  
Hee Beng Kuan Tan ◽  
Yuan Zhao

Although the use of statistically probable properties is very common in medicine, it is not so in software engineering. The use of such properties may open a new avenue for the automated recovery of designs from source code. In fact, the recovery of designs can also be called program mining, which in turn can be viewed as an extension of data mining to program source code.

Author(s):  
Minh Ngoc Ngo

Due to the need to reengineer and migrate aging software and legacy systems, reverse engineering has started to receive some attention. It has now been established as an area in software engineering concerned with understanding software structure and with recovering or extracting designs and features from programs, mainly from source code. The inference of designs and features from code is closely similar to data mining, which extracts and infers information from data. In view of this similarity, reverse engineering from program code can be called program mining. Traditionally, the latter has been based mainly on invariant properties and heuristic rules. Recently, empirical properties have been introduced to augment the existing methods. This article summarizes some of the work in this area.


Author(s):  
Hee Beng Kuan Tan ◽  
Yuan Zhao

Today, many companies have to deal with problems in maintaining legacy database applications that were developed on old database technology. These applications are getting harder and harder to maintain. Reengineering is an important means to address these problems and to upgrade the applications to newer technology (Hainaut, Englebert, Henrard, Hick, & Roland, 1995). However, much of the design of legacy databases, including data dependencies, is buried in the transactions that update the databases; it is not explicitly stated anywhere else. The recovery of the data dependencies designed into transactions is essential both to the reengineering of database applications and to frequently encountered maintenance tasks. Without an automated approach, the recovery is difficult and time-consuming. The issue is also relevant to data mining, as it entails mining the relationships between data from program source code. However, until recently, no such approach had been proposed in the literature.
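To make the idea concrete, here is a minimal sketch, not the authors' method: it scans SQL UPDATE statements in transaction source for SET clauses and records which columns each updated column is computed from, yielding candidate data dependencies. The regular expressions and the example statement are assumptions for illustration only.

```python
import re

# A sketch only, not the authors' algorithm: scan SQL UPDATE statements for
# "SET target = expression" clauses and record which columns the expression
# reads, yielding candidate data dependencies. Keyword filtering and real SQL
# parsing are omitted to keep the idea visible.
UPDATE_RE = re.compile(
    r"UPDATE\s+(\w+)\s+SET\s+(\w+)\s*=\s*(.+?)(?:\s+WHERE\b|;|$)",
    re.IGNORECASE | re.DOTALL,
)
NAME_RE = re.compile(r"\b[A-Za-z_]\w*\b")

def candidate_dependencies(transaction_source):
    """Yield (source_column, target_column) pairs found in UPDATE statements."""
    for table, target, expression in UPDATE_RE.findall(transaction_source):
        for name in NAME_RE.findall(expression):
            if name != target:
                yield (f"{table}.{name}", f"{table}.{target}")

if __name__ == "__main__":
    source = "UPDATE orders SET total = price * quantity WHERE id = 42;"
    for src, dst in sorted(set(candidate_dependencies(source))):
        print(f"{dst} is computed from {src}")
```

Run on the example statement, the sketch reports that orders.total depends on orders.price and orders.quantity, exactly the kind of dependency that is otherwise buried in transaction code.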


2017 ◽  
Vol 27 (09n10) ◽  
pp. 1579-1589 ◽  
Author(s):  
Reinier Morejón ◽  
Marx Viana ◽  
Carlos Lucena

Data mining is a hot topic that attracts researchers from different areas, such as databases, machine learning, and agent-oriented software engineering. As data volumes grow, there is an increasing need to obtain knowledge from large datasets that are very difficult to handle and process with traditional methods. Software agents can play a significant role in performing data mining processes more efficiently. For instance, they can perform selection, extraction, preprocessing, and integration of data, as well as parallel, distributed, or multisource mining. This paper proposes a framework based on multiagent systems to apply data mining techniques to health datasets. As usage scenarios, we take datasets for hypothyroidism and diabetes and run two different mining processes in parallel on each database.
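As a rough illustration of parallel mining over several health datasets (not the paper's framework), the sketch below runs one mining task per dataset in separate processes; the CSV paths, the "label" column, and the decision-tree learner are all assumptions.

```python
from concurrent.futures import ProcessPoolExecutor

import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def mine(dataset_path):
    """One 'agent': load a health dataset and score a classifier on it."""
    data = pd.read_csv(dataset_path)
    # The "label" column name is an assumption about the dataset layout.
    features, labels = data.drop(columns=["label"]), data["label"]
    score = cross_val_score(DecisionTreeClassifier(), features, labels, cv=5).mean()
    return dataset_path, score

if __name__ == "__main__":
    datasets = ["hypothyroidism.csv", "diabetes.csv"]  # hypothetical file paths
    with ProcessPoolExecutor() as pool:  # one process per mining task
        for path, score in pool.map(mine, datasets):
            print(f"{path}: mean cross-validated accuracy {score:.3f}")
```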


Author(s):  
MIRA KAJKO-MATTSSON ◽  
NED CHAPIN

Consider two independently done software engineering studies that used different approaches to cover some of the same subject area, such as software maintenance. Although done differently and for different purposes, to what extent can each study serve as a validation of the other? Within the scope of the subject area overlap, data mining can be applied to provide a quantitative assessment. This paper reports on data mining that attempted to cross-validate two independently done and published software engineering studies of software maintenance: one on a corrective maintenance maturity model, and the other on an objective classification of software maintenance activities. The data mining established that each of the two independently done studies effectively and very strongly validates the other.
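The abstract does not spell out the statistics used, but one plausible quantitative agreement check, assuming both studies classify the same set of maintenance activities, is Cohen's kappa; the labels below are hypothetical.

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical labels: how each study classifies the same ten maintenance
# activities. Perfect agreement gives kappa = 1.0; chance-level gives ~0.
study_a = ["corrective", "adaptive", "corrective", "perfective", "corrective",
           "adaptive", "perfective", "corrective", "adaptive", "corrective"]
study_b = ["corrective", "adaptive", "corrective", "perfective", "adaptive",
           "adaptive", "perfective", "corrective", "adaptive", "corrective"]

print(f"Cohen's kappa = {cohen_kappa_score(study_a, study_b):.2f}")
```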


2009 ◽  
Vol 34 (1) ◽  
pp. 87-107 ◽  
Author(s):  
Oscar Marbán ◽  
Javier Segovia ◽  
Ernestina Menasalvas ◽  
Covadonga Fernández-Baizán

Computer ◽  
2009 ◽  
Vol 42 (7) ◽  
pp. 55-62 ◽  
Author(s):  
Tao Xie ◽  
Suresh Thummalapenta ◽  
David Lo ◽  
Chao Liu

2019 ◽  
Vol 35 (21) ◽  
pp. 4413-4418 ◽  
Author(s):  
Mustafa Solmaz ◽  
Adam Lane ◽  
Bilal Gonen ◽  
Ogulsheker Akmamedova ◽  
Mehmet H Gunes ◽  
...  

Motivation: An important goal of cancer genomics initiatives is to provide the research community with the resources for the unbiased query of cancer mechanisms. Several excellent web platforms have been developed to enable the visual analyses of molecular alterations in cancers from these datasets. However, there are few tools that allow researchers to mine these resources for mechanisms of cancer processes and their functional interactions in an intuitive, unbiased manner.

Results: To address this need, we developed SEMA, a web platform for building and testing models of cancer mechanisms from large multidimensional cancer genomics datasets. Unlike the existing tools for the analyses and query of these resources, SEMA is explicitly designed to enable exploratory and confirmatory analyses of complex cancer mechanisms through a suite of intuitive visual and statistical functionalities. Here, we present a case study of the functional mechanisms of TP53-mediated tumor suppression in various cancers, using SEMA, and identify its role in the regulation of cell cycle progression, DNA repair and signal transduction in different cancers. SEMA is a first-in-its-class web application designed to allow visual data mining and hypothesis testing from multidimensional cancer datasets. The web application, an extensive tutorial and several video screencasts with case studies are freely available for academic use at https://sema.research.cchmc.org/.

Availability and implementation: SEMA is freely available at https://sema.research.cchmc.org. The web site also contains a detailed Tutorial (also in Supplementary Information), and a link to the YouTube channel for video screencasts of analyses, including the analyses presented here. The Shiny and JavaScript source codes have been deposited to GitHub: https://github.com/msolmazm/sema.

Supplementary information: Supplementary data are available at Bioinformatics online.


Author(s):  
Marcela Ridao ◽  
Jorge Horacio Doorn

Requirements Engineering is frequently seen as the activity of the Software Engineering process with the least tool support; usually only graphic and text editing aids are available. This view is supported by the perception that it is a human-intensive task. This chapter is based on the understanding that such a perception is only partially true. Models used along the Requirements Engineering process have underlying structures holding semantic information that is difficult for the reader to see. In fact, models created with a well-defined objective were designed to maximize their expressiveness for that objective. However, they may also hold some useful hidden information. This is where a specialized tool may become valuable. From an epistemological point of view, this situation is similar to what happens in data mining. This chapter describes a tool able to make visible any clustering existing in Universe of Discourse glossaries; it is based on the automatic construction of graphs from references embedded in the glossary itself.
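A minimal sketch of that idea under assumed data: treat each glossary term as a node, draw an edge whenever one entry's definition references another term, and read clusters off as connected components. The glossary contents below are hypothetical, and networkx stands in for whatever graph machinery the chapter actually uses.

```python
import networkx as nx

glossary = {  # hypothetical Universe of Discourse glossary entries
    "order": "A request placed by a customer for one or more products.",
    "customer": "A person who places an order.",
    "product": "An item that can appear in an order.",
    "invoice": "A billing document issued for an order.",
    "backup": "A copy of the database kept for recovery purposes.",
}

# Draw an edge whenever one entry's definition references another term.
graph = nx.Graph()
graph.add_nodes_from(glossary)
for term, definition in glossary.items():
    for other in glossary:
        if other != term and other in definition.lower():
            graph.add_edge(term, other)

# Clusters surface as the connected components of the reference graph.
for cluster in nx.connected_components(graph):
    print(sorted(cluster))
```

On this toy glossary, the order/customer/product/invoice entries form one cluster while the unrelated backup entry stands alone, which is exactly the kind of shadowed structure the chapter's tool is meant to expose.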

