Selected Readings on Database Technologies and Applications

Published by IGI Global
ISBN: 9781605660981, 9781605660998

Author(s):  
Maria Chiara Caschera, Arianna D’Ulizia, Leonardo Tininini

An easy, efficient, and effective way to retrieve stored data is obviously one of the key issues of any information system. In the last few years, considerable effort has been devoted to the definition of more intuitive, visual-based querying paradigms, attempting to offer a good trade-off between expressiveness and intuitiveness. In this chapter, we analyze the main characteristics of visual languages specifically designed for querying information systems, concentrating on conventional relational databases, but also considering information systems with a less rigid structure such as Web resources storing XML documents. We consider two fundamental aspects of visual query languages: the adopted visual representation technique and the underlying data model, possibly specialized to specific application contexts.
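
For a flavor of how a visual paradigm maps onto the relational model, the following sketch (not from the chapter) translates a filled-in Query-by-Example skeleton, one of the classic visual query representations, into SQL. The table, columns, and the qbe_to_sql helper are invented for illustration.

```python
# Minimal sketch of a QBE-style visual query translated to SQL.
# Table, column names, and this helper are illustrative assumptions.

def qbe_to_sql(table, example_row):
    """Translate a filled-in QBE skeleton into a SELECT statement.

    Cells marked 'P.' are printed (projected); other non-empty cells
    become equality conditions, mirroring the classic QBE convention."""
    projected = [c for c, v in example_row.items() if v == "P."]
    conditions = {c: v for c, v in example_row.items() if v and v != "P."}
    where = " AND ".join(f"{c} = ?" for c in conditions)
    sql = f"SELECT {', '.join(projected) or '*'} FROM {table}"
    if where:
        sql += f" WHERE {where}"
    return sql, list(conditions.values())

# The user "draws" a query on a table skeleton: print the names of
# employees in the Sales department.
sql, params = qbe_to_sql("employee", {"name": "P.", "dept": "Sales"})
print(sql, params)  # SELECT name FROM employee WHERE dept = ? ['Sales']
```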


Author(s):  
Bruce L. Mann

This chapter will discuss and present examples of Internet database tools, typical instructional methods used with the tools, and implications for Internet-supported action research as a progressively deeper examination of teaching and learning.


Author(s):  
Dimitris Margaritis, Christos Faloutsos, Sebastian Thrun

We present a novel method for answering count queries from a large database approximately and quickly. Our method implements an approximate DataCube of the application domain, which can be used to answer any conjunctive count query that the user can form. The DataCube is a conceptual device that in principle stores the number of matching records for all possible such queries. However, because its size and generation time are inherently exponential, our approach uses one or more Bayesian networks to implement it approximately. Bayesian networks are statistical graphical models that can succinctly represent the underlying joint probability distribution of the domain, and can therefore be used to calculate approximate counts for any conjunctive query combining attribute values and “don’t cares.” The structure and parameters of these networks are learned from the database in a preprocessing stage. By means of such a network, the proposed method, called NetCube, exploits correlations and independencies among attributes to answer a count query quickly without accessing the database. Our preprocessing algorithm scales linearly in the size of the database and admits a straightforward parallel implementation. We also give an algorithm for estimating the count of an arbitrary query whose running time is constant in the database size. Our experimental results show that NetCubes are fast to generate and use, achieve excellent compression, and have low reconstruction error. Moreover, they naturally allow for visualization and data mining at no extra cost.
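
As a rough illustration of the idea (not the authors’ implementation), the sketch below hand-codes a two-attribute Bayesian network and estimates a conjunctive count, including “don’t care” attributes, as the database size times the query’s probability under the network. The network structure, probability tables, and record count are all invented.

```python
# Toy sketch of the NetCube idea: answer a conjunctive count query
# from a learned joint distribution instead of scanning the database.
# The two-node network A -> B and its parameters are made-up stand-ins
# for structures learned in a preprocessing stage.

from itertools import product

N = 1_000_000                        # database size (records), invented
p_a = {0: 0.3, 1: 0.7}               # P(A)
p_b_given_a = {0: {0: 0.9, 1: 0.1},
               1: {0: 0.2, 1: 0.8}}  # P(B | A)

def prob(a, b):
    # Chain rule for the network A -> B: P(A, B) = P(A) * P(B | A)
    return p_a[a] * p_b_given_a[a][b]

def approx_count(query):
    """query maps each attribute to a value, or to None for 'don't care';
    don't-cares are summed out (marginalized)."""
    a_vals = [query["A"]] if query["A"] is not None else [0, 1]
    b_vals = [query["B"]] if query["B"] is not None else [0, 1]
    p = sum(prob(a, b) for a, b in product(a_vals, b_vals))
    return round(N * p)

print(approx_count({"A": 1, "B": None}))  # A=1, any B  -> ~700000
print(approx_count({"A": 1, "B": 1}))     # A=1 AND B=1 -> ~560000
```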


Author(s):  
Mario Cannataro

Bioinformatics involves the design and development of advanced algorithms and computational platforms to solve problems in biomedicine (Jones & Pevzner, 2004). It also deals with methods for acquiring, storing, retrieving and analysing biological data obtained by querying biological databases or provided by experiments. Bioinformatics applications involve different datasets as well as different software tools and algorithms. Such applications need semantic models for basic software components and need advanced scientific portal services able to aggregate such different components and to hide their details and complexity from the final user. For instance, proteomics applications involve datasets, either produced by experiments or available as public databases, as well as a huge number of different software tools and algorithms. To use such applications it is required to know both biological issues related to data generation and results interpretation and informatics requirements related to data analysis. Bioinformatics applications require platforms that are computationally out of standard. Applications are indeed (1) naturally distributed, due to the high number of involved datasets; (2) require high computing power, due to the large size of datasets and the complexity of basic computations; (3) access heterogeneous data both in format and structure; and finally (5) require reliability and security. For instance, applications such as identification of proteins from spectra data (de Hoffmann & Stroobant, 2002), querying of protein databases (Swiss-Prot), predictions of proteins structures (Guerra & Istrail, 2003), and string-based pattern extraction from large biological sequences, are some examples of computationally expensive applications. Moreover, expertise is required in choosing the most appropriate tools. For instance, protein structure prediction depends on proteins family, so choosing the right tool may strongly influence the experimental results.


Author(s):  
Sriram Mohan, Arijit Sengupta

The process of conceptual design is independent of the final platform and the medium of implementation, and it usually takes a form that is understandable and usable by managers and other personnel who may not be familiar with the low-level implementation details but have a major influence on the development process. Although a strong design phase is involved in most current application development processes (e.g., Entity-Relationship design for relational databases), conceptual design for XML has not been explored significantly in the literature or in practice. Most XML design processes start by directly marking up data in XML, and the metadata is typically designed at the time the documents are encoded. In this chapter, the reader is introduced to existing methodologies for modeling XML. A discussion is then presented comparing and contrasting their capabilities and deficiencies, and delineating the future trend in conceptual design for XML applications.
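
To illustrate the gap the chapter addresses, the following sketch derives an XML document skeleton from a small conceptual model rather than marking up data directly. The model format and the emit helper are illustrative assumptions, not a methodology from the chapter.

```python
# Sketch: design the conceptual model first, then generate the XML
# skeleton from it. Model structure and helper are invented.

model = {
    "entity": "book",
    "attributes": ["title", "year"],
    "children": [
        {"entity": "author", "attributes": ["name"], "children": []},
    ],
}

def emit(node, indent=0):
    """Render a conceptual entity (and its sub-entities) as empty
    XML elements, one level of nesting per containment edge."""
    pad = "  " * indent
    lines = [f"{pad}<{node['entity']}>"]
    for attr in node["attributes"]:
        lines.append(f"{pad}  <{attr}/>")
    for child in node["children"]:
        lines.extend(emit(child, indent + 1))
    lines.append(f"{pad}</{node['entity']}>")
    return lines

print("\n".join(emit(model)))
```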


Author(s):  
M. Mehdi Owrang O.

Current database technology involves processing a large volume of data in order to discover new knowledge. However, knowledge discovery on only the most detailed and recent data does not reveal long-term trends. Relational databases also pose problems of their own: they are normalized to avoid redundancies and update anomalies, which makes them unsuitable as a direct input to knowledge discovery. A key issue in any discovery system is to ensure the consistency, accuracy, and completeness of the discovered knowledge. We describe these problems, which affect the quality of the discovered knowledge, and provide some solutions to avoid them.
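
One common workaround for the normalization problem described above is to flatten the schema into a single denormalized view before mining. The sketch below illustrates this with an invented two-table schema in SQLite; the schema, data, and view are assumptions for illustration only.

```python
# Sketch: flattening a normalized schema into one view before mining.
# Schema and data are invented.

import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customer(id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders(id INTEGER PRIMARY KEY,
                        customer_id INTEGER REFERENCES customer(id),
                        amount REAL);
    INSERT INTO customer VALUES (1, 'north'), (2, 'south');
    INSERT INTO orders VALUES (10, 1, 120.0), (11, 1, 80.0), (12, 2, 50.0);

    -- Denormalized view: one row per order with customer context,
    -- suitable as input to a discovery algorithm.
    CREATE VIEW mining_input AS
        SELECT c.region, o.amount
        FROM orders o JOIN customer c ON o.customer_id = c.id;
""")
for row in con.execute("SELECT * FROM mining_input"):
    print(row)   # ('north', 120.0), ('north', 80.0), ('south', 50.0)
```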


Author(s):  
Moh’d A. Radaideh, Hayder Al-Ameed

With the advancement of computer technologies and the World Wide Web, there has been an explosion in the number of available e-services, most of which rely on database processing. Efficient and effective database performance tuning and high-availability techniques should be employed to ensure that all e-services remain reliable and available at all times. To avoid the impact of database downtime, many corporations have taken an interest in database availability. The goal for some is continuous availability, such that a database server never fails. Other companies require their content to be highly available; in such cases, short, planned downtimes are allowed for maintenance purposes. This chapter presents the definition, the background, and the typical measurement factors of high availability. It also demonstrates some approaches to minimizing a database server’s shutdown time.
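
The measurement factors mentioned above usually reduce to simple arithmetic on mean time between failures (MTBF) and mean time to repair (MTTR). The sketch below shows the standard calculation; the sample figures are invented.

```python
# Sketch: the standard availability arithmetic.
# Availability = MTBF / (MTBF + MTTR); figures are invented.

MINUTES_PER_YEAR = 365 * 24 * 60

def availability(mtbf_hours, mttr_hours):
    return mtbf_hours / (mtbf_hours + mttr_hours)

def downtime_minutes_per_year(avail):
    return (1 - avail) * MINUTES_PER_YEAR

a = availability(mtbf_hours=2000, mttr_hours=1)   # ~0.9995
print(f"availability: {a:.5f}")
print(f"downtime: {downtime_minutes_per_year(a):.0f} min/year")

# 'Five nines' (99.999%) allows roughly five minutes of downtime a year:
print(f"{downtime_minutes_per_year(0.99999):.1f} min/year")
```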


Author(s):  
Shigeaki Sakurai

This chapter introduces knowledge discovery methods that learn a fuzzy decision tree from textual data. The author argues that the methods extract features of the textual data based on a key concept dictionary, which is a hierarchical thesaurus, and a key phrase pattern dictionary, which stores characteristic sequences of words and parts of speech, and that they generate knowledge in the format of a fuzzy decision tree. The author also discusses two application tasks: an analysis system for daily business reports and an e-mail analysis system. The author hopes that the methods will provide new knowledge for researchers engaged in text mining studies, facilitating their understanding of the importance of the fuzzy decision tree in processing textual data.
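
As a rough illustration of the feature-extraction step (not the author’s algorithm), the sketch below maps the words of a short report onto a tiny, invented key concept dictionary and produces fuzzy membership degrees of the kind a fuzzy decision tree could consume.

```python
# Sketch: fuzzy features from a key concept dictionary.
# The dictionary and the scoring rule are illustrative assumptions.

key_concepts = {
    "failure": {"error": 1.0, "crash": 0.9, "slow": 0.4},
    "request": {"please": 0.8, "need": 0.6},
}

def fuzzy_features(text):
    """Membership degree of the text in each key concept, taking the
    strongest matching term as the degree."""
    words = text.lower().split()
    return {
        concept: max((terms.get(w, 0.0) for w in words), default=0.0)
        for concept, terms in key_concepts.items()
    }

print(fuzzy_features("The server crash made the report slow"))
# {'failure': 0.9, 'request': 0.0}
```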


Author(s):  
Iris Reinhartz-Berger, Arnon Sturm

UML has been widely adopted as a standard modeling language. The emergence of UML from different modeling languages that address various system aspects causes a wide variety of completeness and correctness problems in UML models. Several methods have been proposed for dealing with correctness issues, mainly providing internal consistency rules but ignoring correctness and completeness with respect to the system requirements and the domain constraints. In this article, we propose addressing both the completeness and the correctness problems of UML models by adopting a domain analysis approach called application-based domain modeling (ADOM). We present experimental results from our study, which examined the quality of application models produced when utilizing ADOM with UML. The results indicate that the availability of a domain model helps achieve more complete application models without reducing their comprehensibility.


Author(s):  
Sikha Bagui

This paper presents a knowledge discovery effort to retrieve meaningful information about crime from a U.S. state database. The raw data were preprocessed, and data cubes were created using Structured Query Language (SQL). The data cubes were then used to derive quantitative generalizations and for further analysis of the data. An entropy-based attribute relevance study was undertaken to determine the relevant attributes. The machine learning workbench WEKA was used for mining association rules, developing a decision tree, and clustering. A self-organizing map (SOM) was used to view the multidimensional clusters on a regular two-dimensional grid.
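
As a concrete illustration of an entropy-based relevance study (with invented toy records, not the paper’s crime data), the sketch below ranks attributes by their information gain on a class label.

```python
# Sketch: entropy-based attribute relevance via information gain.
# Records and attribute names are invented.

from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr, label):
    """Reduction in label entropy from splitting the rows on attr."""
    base = entropy([r[label] for r in rows])
    remainder = 0.0
    for v in {r[attr] for r in rows}:
        subset = [r[label] for r in rows if r[attr] == v]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

rows = [
    {"area": "urban", "night": 1, "violent": 1},
    {"area": "urban", "night": 0, "violent": 0},
    {"area": "rural", "night": 1, "violent": 0},
    {"area": "rural", "night": 0, "violent": 0},
]
for attr in ("area", "night"):
    print(attr, round(info_gain(rows, attr, "violent"), 3))
```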

