Selected Readings on Database Technologies and Applications

Published by IGI Global
ISBN: 9781605660981, 9781605660998

Author(s):  
Maria Chiara Caschera, Arianna D’Ulizia, Leonardo Tininini

An easy, efficient, and effective way to retrieve stored data is obviously one of the key issues of any information system. In the last few years, considerable effort has been devoted to the definition of more intuitive, visual-based querying paradigms, attempting to offer a good trade-off between expressiveness and intuitiveness. In this chapter, we analyze the main characteristics of visual languages specifically designed for querying information systems, concentrating on conventional relational databases, but also considering information systems with a less rigid structure such as Web resources storing XML documents. We consider two fundamental aspects of visual query languages: the adopted visual representation technique and the underlying data model, possibly specialized to specific application contexts.
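
For a flavor of how a visual paradigm maps onto the relational model, the following sketch (not from the chapter) translates a filled-in Query-by-Example skeleton, one of the classic visual query representations, into SQL. The table, columns, and the qbe_to_sql helper are invented for illustration.

```python
# Minimal sketch of a QBE-style visual query translated to SQL.
# Table, column names, and this helper are illustrative assumptions.

def qbe_to_sql(table, example_row):
    """Translate a filled-in QBE skeleton into a SELECT statement.

    Cells marked 'P.' are printed (projected); other non-empty cells
    become equality conditions, mirroring the classic QBE convention."""
    projected = [c for c, v in example_row.items() if v == "P."]
    conditions = {c: v for c, v in example_row.items() if v and v != "P."}
    where = " AND ".join(f"{c} = ?" for c in conditions)
    sql = f"SELECT {', '.join(projected) or '*'} FROM {table}"
    if where:
        sql += f" WHERE {where}"
    return sql, list(conditions.values())

# The user "draws" a query on a table skeleton: print the names of
# employees in the Sales department.
sql, params = qbe_to_sql("employee", {"name": "P.", "dept": "Sales"})
print(sql, params)  # SELECT name FROM employee WHERE dept = ? ['Sales']
```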


Author(s):  
Bruce L. Mann

This chapter will discuss and present examples of Internet database tools, typical instructional methods used with the tools, and implications for Internet-supported action research as a progressively deeper examination of teaching and learning.


Author(s):  
Dimitris Margaritis, Christos Faloutsos, Sebastian Thrun

We present a novel method for answering count queries from a large database approximately and quickly. Our method implements an approximate DataCube of the application domain, which can be used to answer any conjunctive count query that the user can form. The DataCube is a conceptual device that in principle stores the number of matching records for all possible such queries. However, because its size and generation time are inherently exponential, our approach uses one or more Bayesian networks to implement it approximately. Bayesian networks are statistical graphical models that can succinctly represent the underlying joint probability distribution of the domain, and can therefore be used to calculate approximate counts for any conjunctive query combining attribute values and “don’t cares.” The structure and parameters of these networks are learned from the database in a preprocessing stage. By means of such a network, the proposed method, called NetCube, exploits correlations and independencies among attributes to answer a count query quickly without accessing the database. Our preprocessing algorithm scales linearly in the size of the database and admits a straightforward parallel implementation. We also give an algorithm for estimating the count of an arbitrary query whose running time is constant in the database size. Our experimental results show that NetCubes are fast to generate and use, achieve excellent compression, and have low reconstruction error. Moreover, they naturally allow for visualization and data mining at no extra cost.
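
As a rough illustration of the idea (not the authors’ implementation), the sketch below hand-codes a two-attribute Bayesian network and estimates a conjunctive count, including “don’t care” attributes, as the database size times the query’s probability under the network. The network structure, probability tables, and record count are all invented.

```python
# Toy sketch of the NetCube idea: answer a conjunctive count query
# from a learned joint distribution instead of scanning the database.
# The two-node network A -> B and its parameters are made-up stand-ins
# for structures learned in a preprocessing stage.

from itertools import product

N = 1_000_000                        # database size (records), invented
p_a = {0: 0.3, 1: 0.7}               # P(A)
p_b_given_a = {0: {0: 0.9, 1: 0.1},
               1: {0: 0.2, 1: 0.8}}  # P(B | A)

def prob(a, b):
    # Chain rule for the network A -> B: P(A, B) = P(A) * P(B | A)
    return p_a[a] * p_b_given_a[a][b]

def approx_count(query):
    """query maps each attribute to a value, or to None for 'don't care';
    don't-cares are summed out (marginalized)."""
    a_vals = [query["A"]] if query["A"] is not None else [0, 1]
    b_vals = [query["B"]] if query["B"] is not None else [0, 1]
    p = sum(prob(a, b) for a, b in product(a_vals, b_vals))
    return round(N * p)

print(approx_count({"A": 1, "B": None}))  # A=1, any B  -> ~700000
print(approx_count({"A": 1, "B": 1}))     # A=1 AND B=1 -> ~560000
```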


Author(s):  
Mario Cannataro

Bioinformatics involves the design and development of advanced algorithms and computational platforms to solve problems in biomedicine (Jones & Pevzner, 2004). It also deals with methods for acquiring, storing, retrieving and analysing biological data obtained by querying biological databases or provided by experiments. Bioinformatics applications involve different datasets as well as different software tools and algorithms. Such applications need semantic models for basic software components and need advanced scientific portal services able to aggregate such different components and to hide their details and complexity from the final user. For instance, proteomics applications involve datasets, either produced by experiments or available as public databases, as well as a huge number of different software tools and algorithms. To use such applications it is required to know both biological issues related to data generation and results interpretation and informatics requirements related to data analysis. Bioinformatics applications require platforms that are computationally out of standard. Applications are indeed (1) naturally distributed, due to the high number of involved datasets; (2) require high computing power, due to the large size of datasets and the complexity of basic computations; (3) access heterogeneous data both in format and structure; and finally (5) require reliability and security. For instance, applications such as identification of proteins from spectra data (de Hoffmann & Stroobant, 2002), querying of protein databases (Swiss-Prot), predictions of proteins structures (Guerra & Istrail, 2003), and string-based pattern extraction from large biological sequences, are some examples of computationally expensive applications. Moreover, expertise is required in choosing the most appropriate tools. For instance, protein structure prediction depends on proteins family, so choosing the right tool may strongly influence the experimental results.


Author(s):  
Sriram Mohan, Arijit Sengupta

The process of conceptual design is independent of the final platform and the medium of implementation, and it usually takes a form that is understandable and usable by managers and other personnel who may not be familiar with the low-level implementation details but have a major influence on the development process. Although a strong design phase is involved in most current application development processes (e.g., Entity-Relationship design for relational databases), conceptual design for XML has not been explored significantly in the literature or in practice. Most XML design processes start by directly marking up data in XML, and the metadata is typically designed at the time the documents are encoded. In this chapter, the reader is introduced to existing methodologies for modeling XML. A discussion is then presented comparing and contrasting their capabilities and deficiencies, and delineating the future trend in conceptual design for XML applications.
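
To illustrate the gap the chapter addresses, the following sketch derives an XML document skeleton from a small conceptual model rather than marking up data directly. The model format and the emit helper are illustrative assumptions, not a methodology from the chapter.

```python
# Sketch: design the conceptual model first, then generate the XML
# skeleton from it. Model structure and helper are invented.

model = {
    "entity": "book",
    "attributes": ["title", "year"],
    "children": [
        {"entity": "author", "attributes": ["name"], "children": []},
    ],
}

def emit(node, indent=0):
    """Render a conceptual entity (and its sub-entities) as empty
    XML elements, one level of nesting per containment edge."""
    pad = "  " * indent
    lines = [f"{pad}<{node['entity']}>"]
    for attr in node["attributes"]:
        lines.append(f"{pad}  <{attr}/>")
    for child in node["children"]:
        lines.extend(emit(child, indent + 1))
    lines.append(f"{pad}</{node['entity']}>")
    return lines

print("\n".join(emit(model)))
```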


Author(s):  
M. Mehdi Owrang O.

Current database technology involves processing a large volume of data in order to discover new knowledge. However, knowledge discovery on only the most detailed and recent data does not reveal long-term trends. Relational databases also pose problems of their own: they are normalized to avoid redundancies and update anomalies, which makes them unsuitable as a direct input to knowledge discovery. A key issue in any discovery system is to ensure the consistency, accuracy, and completeness of the discovered knowledge. We describe these problems, which affect the quality of the discovered knowledge, and provide some solutions to avoid them.
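
One common workaround for the normalization problem described above is to flatten the schema into a single denormalized view before mining. The sketch below illustrates this with an invented two-table schema in SQLite; the schema, data, and view are assumptions for illustration only.

```python
# Sketch: flattening a normalized schema into one view before mining.
# Schema and data are invented.

import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customer(id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders(id INTEGER PRIMARY KEY,
                        customer_id INTEGER REFERENCES customer(id),
                        amount REAL);
    INSERT INTO customer VALUES (1, 'north'), (2, 'south');
    INSERT INTO orders VALUES (10, 1, 120.0), (11, 1, 80.0), (12, 2, 50.0);

    -- Denormalized view: one row per order with customer context,
    -- suitable as input to a discovery algorithm.
    CREATE VIEW mining_input AS
        SELECT c.region, o.amount
        FROM orders o JOIN customer c ON o.customer_id = c.id;
""")
for row in con.execute("SELECT * FROM mining_input"):
    print(row)   # ('north', 120.0), ('north', 80.0), ('south', 50.0)
```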


Author(s):  
Moh’d A. Radaideh, Hayder Al-Ameed

With the advancement of computer technologies and the World Wide Web, there has been an explosion in the number of available e-services, most of which rely on database processing. Efficient and effective database performance tuning and high-availability techniques should be employed to ensure that all e-services remain reliable and available at all times. To avoid the impact of database downtime, many corporations have taken an interest in database availability. The goal for some is continuous availability, such that a database server never fails. Other companies require their content to be highly available; in such cases, short, planned downtimes are allowed for maintenance purposes. This chapter presents the definition, the background, and the typical measurement factors of high availability. It also demonstrates some approaches to minimizing a database server’s shutdown time.
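
The measurement factors mentioned above usually reduce to simple arithmetic on mean time between failures (MTBF) and mean time to repair (MTTR). The sketch below shows the standard calculation; the sample figures are invented.

```python
# Sketch: the standard availability arithmetic.
# Availability = MTBF / (MTBF + MTTR); figures are invented.

MINUTES_PER_YEAR = 365 * 24 * 60

def availability(mtbf_hours, mttr_hours):
    return mtbf_hours / (mtbf_hours + mttr_hours)

def downtime_minutes_per_year(avail):
    return (1 - avail) * MINUTES_PER_YEAR

a = availability(mtbf_hours=2000, mttr_hours=1)   # ~0.9995
print(f"availability: {a:.5f}")
print(f"downtime: {downtime_minutes_per_year(a):.0f} min/year")

# 'Five nines' (99.999%) allows roughly five minutes of downtime a year:
print(f"{downtime_minutes_per_year(0.99999):.1f} min/year")
```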


Author(s):  
Shigeaki Sakurai

This chapter introduces knowledge discovery methods that learn a fuzzy decision tree from textual data. The author argues that the methods extract features of the textual data based on a key concept dictionary, which is a hierarchical thesaurus, and a key phrase pattern dictionary, which stores characteristic sequences of words and parts of speech, and that they generate knowledge in the format of a fuzzy decision tree. The author also discusses two application tasks: an analysis system for daily business reports and an e-mail analysis system. The author hopes that the methods will provide new knowledge for researchers engaged in text mining studies, facilitating their understanding of the importance of the fuzzy decision tree in processing textual data.
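
As a rough illustration of the feature-extraction step (not the author’s algorithm), the sketch below maps the words of a short report onto a tiny, invented key concept dictionary and produces fuzzy membership degrees of the kind a fuzzy decision tree could consume.

```python
# Sketch: fuzzy features from a key concept dictionary.
# The dictionary and the scoring rule are illustrative assumptions.

key_concepts = {
    "failure": {"error": 1.0, "crash": 0.9, "slow": 0.4},
    "request": {"please": 0.8, "need": 0.6},
}

def fuzzy_features(text):
    """Membership degree of the text in each key concept, taking the
    strongest matching term as the degree."""
    words = text.lower().split()
    return {
        concept: max((terms.get(w, 0.0) for w in words), default=0.0)
        for concept, terms in key_concepts.items()
    }

print(fuzzy_features("The server crash made the report slow"))
# {'failure': 0.9, 'request': 0.0}
```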


Author(s):  
Iris Reinhartz-Berger, Arnon Sturm

UML has been widely adopted as a standard modeling language. The emergence of UML from different modeling languages that address various system aspects causes a wide variety of completeness and correctness problems in UML models. Several methods have been proposed for dealing with correctness issues, mainly providing internal consistency rules but ignoring correctness and completeness with respect to the system requirements and the domain constraints. In this article, we propose addressing both the completeness and the correctness problems of UML models by adopting a domain analysis approach called application-based domain modeling (ADOM). We present experimental results from our study, which examined the quality of application models produced when utilizing ADOM with UML. The results indicate that the availability of a domain model helps achieve more complete application models without reducing their comprehensibility.


Author(s):  
Sikha Bagui

This paper presents a knowledge discovery effort to retrieve meaningful information about crime from a U.S. state database. The raw data were preprocessed, and data cubes were created using Structured Query Language (SQL). The data cubes were then used to derive quantitative generalizations and for further analysis of the data. An entropy-based attribute relevance study was undertaken to determine the relevant attributes. The machine learning workbench WEKA was used for mining association rules, developing a decision tree, and clustering. A self-organizing map (SOM) was used to view the multidimensional clusters on a regular two-dimensional grid.
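
As a concrete illustration of an entropy-based relevance study (with invented toy records, not the paper’s crime data), the sketch below ranks attributes by their information gain on a class label.

```python
# Sketch: entropy-based attribute relevance via information gain.
# Records and attribute names are invented.

from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr, label):
    """Reduction in label entropy from splitting the rows on attr."""
    base = entropy([r[label] for r in rows])
    remainder = 0.0
    for v in {r[attr] for r in rows}:
        subset = [r[label] for r in rows if r[attr] == v]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

rows = [
    {"area": "urban", "night": 1, "violent": 1},
    {"area": "urban", "night": 0, "violent": 0},
    {"area": "rural", "night": 1, "violent": 0},
    {"area": "rural", "night": 0, "violent": 0},
]
for attr in ("area", "night"):
    print(attr, round(info_gain(rows, attr, "violent"), 3))
```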

