Processing and Managing Complex Data for Decision Support
Latest Publications


TOTAL DOCUMENTS: 13 (FIVE YEARS 0)

H-INDEX: 3 (FIVE YEARS 0)

Published By IGI Global

9781591406556, 9781591406570

Author(s):  
Ioannis Karydis ◽  
Alexandros Nanopoulos ◽  
Yannis Manolopoulos

This chapter provides a broad survey of music data mining, including clustering, classification and pattern discovery in music. The data studied are mainly symbolic encodings of musical scores, although digital audio (acoustic data) is also addressed. Throughout the chapter, practical applications of music data mining are presented. Music data mining addresses the discovery of knowledge from music corpora. This chapter encapsulates the theory and methods required to discover knowledge in the form of patterns for music analysis and retrieval, or statistical models for music classification and generation. Music data, with their temporal, highly structured and polyphonic character, introduce new challenges for data mining. Additionally, due to their complex structure and their susceptibility to inaccuracies caused by perceptual effects, music data present challenges in knowledge representation as well.


Author(s):  
Jilin Han ◽  
Le Gruenwald ◽  
Tyrrell Conway

The study of gene expression levels under defined experimental conditions is an important approach to understanding how a living cell works. High-throughput microarray technology is a very powerful tool for simultaneously studying thousands of genes in a single experiment. This revolutionary technology produces an extensive amount of data, which raises an important question: how can meaningful biological information be extracted from these data? In this chapter, we survey data mining techniques that have been used for clustering, classification and association rule mining in gene expression data analysis. In addition, we provide a comprehensive list of currently available commercial and academic data mining software together with their features. Lastly, we suggest future research directions.


Author(s):  
Antonio Badia

Data warehouses, already established as the main repository of data in the enterprise, are now being used to store documents (e-mails, manuals, reports, etc.) so as to capture more domain information. In order to integrate information in natural language (so-called unstructured data) with information in the database (structured and semistructured data), existing techniques from Information Retrieval are being used. In this chapter, which is part overview and part position paper, we review these techniques and discuss their limitations. We argue that true integration cannot be achieved within the framework of Information Retrieval and introduce another paradigm, based on Information Extraction. We discuss the main characteristics of Information Extraction and analyze the challenges that stand in the way of this technology being widely used. Finally, we close with some considerations on future developments in the general area of documents in databases.


Author(s):  
Giovanna Guerrini ◽  
Marco Mesiti ◽  
Elisa Bertino

This chapter discusses existing approaches to evaluating and measuring structural similarity in sources of XML documents. A relevant peculiarity of XML documents is that information on the document structure is available in the document itself. In the chapter we present different approaches to evaluating structural similarity at three different levels: among documents, between a document and a schema, and among schemas. The most relevant applications of such measures are document classification and schema extraction, and document and schema structural clustering, though other interesting applications, such as document change detection and structural querying, can be devised and will be discussed throughout the chapter.


Author(s):  
Maria Luisa Damiani ◽  
Stefano Spaccapietra

This chapter is concerned with multidimensional data models for spatial data warehouses. Over the last few years different approaches have been proposed in the literature for modelling multidimensional data with geometric extent. Nevertheless, the definition of a comprehensive and formal data model is still a major research issue. The main contributions of the chapter are twofold: first, it draws a picture of the research area; second, it introduces a novel spatial multidimensional data model for spatial objects with geometry (MuSD – multigranular spatial data warehouse). MuSD complies with current standards for spatial data modelling, augmented by data warehousing concepts such as spatial fact, spatial dimension and spatial measure. The novelty of the model is the representation of spatial measures at multiple levels of geometric granularity. Besides the representation concepts, the model includes a set of OLAP operators supporting navigation across dimension and measure levels.


Author(s):  
Theodore Dalamagas ◽  
Tao Cheng ◽  
Timos Sellis

The recent proliferation of XML-based standards and technologies demonstrates the need for effective management of hierarchical structures. Such structures are used, for example, to organize data in product catalogs, taxonomies of thematic categories, concept hierarchies, etc. Since XML has become the standard data exchange format on the Web, organizing data in hierarchical structures has become widespread. Even if data are not stored natively in such structures, export mechanisms make data publicly available in hierarchical structures to enable their automatic processing by programs, scripts and agents. Processing data encoded in hierarchical structures has been a popular research issue, resulting in the design of effective query languages. However, the inherent structural aspect of such encodings did not receive much attention until recently, when the need arose for mining tasks on hierarchical structures, such as clustering, classification and similarity ranking. The key to performing such tasks is the design of a structural distance metric that quantifies the structural similarity between hierarchical structures. The chapter studies distance metrics that capture the structural similarity between hierarchical structures, and approaches that exploit structural distance metrics to perform mining tasks on them.
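
As a rough illustration of what a structural distance between hierarchical structures can look like, the sketch below compares two XML documents by the sets of root-to-node tag paths they contain and returns the Jaccard distance between those sets. This is an assumed, simplified measure chosen for brevity, not a metric taken from the chapter; the function names tag_paths and structural_distance and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def tag_paths(xml_text):
    """Return the set of root-to-node tag paths found in an XML document."""
    root = ET.fromstring(xml_text)
    paths = set()
    stack = [(root, root.tag)]
    while stack:
        node, path = stack.pop()
        paths.add(path)
        for child in node:
            stack.append((child, path + "/" + child.tag))
    return paths


def structural_distance(xml_a, xml_b):
    """Jaccard distance between the tag-path sets of two XML documents.

    0.0 means the documents share exactly the same set of paths;
    1.0 means they share none. Text content, attributes, sibling order
    and repetition of elements are ignored (an illustrative simplification).
    """
    paths_a, paths_b = tag_paths(xml_a), tag_paths(xml_b)
    union = paths_a | paths_b
    return 1.0 - len(paths_a & paths_b) / len(union)


# Hypothetical sample documents: two small product catalogs.
doc_a = "<catalog><item><name/><price/></item></catalog>"
doc_b = "<catalog><item><name/></item><vendor/></catalog>"

if __name__ == "__main__":
    print(structural_distance(doc_a, doc_b))
```

Because it ignores element order and repetition, a measure of this kind is cheap to compute but coarser than tree-edit-style distances; it is meant only to make the notion of a structural distance concrete.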


Author(s):  
Jörg Rech

Source code occurs in diverse programming languages, with documentation using miscellaneous standards, comments in individual styles, extracted metrics or associated test cases, all of which are hard to exploit through information retrieval or knowledge-discovery techniques. Typically, the information about object-oriented source code for a software system is distributed across several different sources, which makes processing complex. In this chapter we describe the morphology of object-oriented source code and how we (pre-)process, integrate and use it for knowledge discovery in software engineering in order to support decision-making regarding the refactoring, reengineering and reuse of software systems.


Author(s):  
Vicky Nassis ◽  
Tharam S. Dillon ◽  
Wenny Rahayu ◽  
R. Rajugan

eXtensible Markup Language (XML) has emerged as the dominant standard for describing and exchanging data amongst heterogeneous data sources. The increasing volume of such data creates the need to investigate XML document warehouses (XDW) as a means of handling the data for business intelligence. In our previous work (Nassis, Rajugan, Dillon, & Rahayu, 2004) we proposed a conceptual modelling approach for the development of an XDW with emphasis on the design techniques. We consider it important to capture data warehouse requirements early in the design stage. The elicitation of requirements and their use for data warehouse design is a significant and, as yet, unaddressed issue. For this reason, we explore a requirement engineering (RE) approach, namely the goal-oriented approach. We will extract and extend the notion of this approach to introduce the XML document warehouse (XDW) requirement model. To do so, we consider organisational objectives as well as user viewpoints. Furthermore, these are related to the XDW, focussing particularly on deriving dimensions, as opposed to associating organisational objectives with the system functions, which is what RE traditionally does.


Author(s):  
H. Azzag ◽  
F. Picarougne ◽  
C. Guinot ◽  
G. Venturini

We present in this chapter a new 3D interactive method, named VRMiner, for visualizing multimedia data with virtual reality. We consider that an expert in a specific domain has collected a set of examples described with numeric and symbolic attributes but also with sounds, images, videos and Web sites or 3D models, and that this expert wishes to explore these data to understand their structure. We use a 3D stereoscopic display in order to let the expert easily visualize and observe the data. We add to this display contextual information such as texts and small images, voice synthesis and sound. Larger images, videos and Web sites are displayed on a second computer in order to ensure real-time display. Navigating through the data is done in a very intuitive and precise way with a 3D sensor that simulates a virtual camera. Interactive requests can be formulated by the expert with a data glove that recognizes hand gestures. We show how this tool has been successfully applied to several real-world applications.


Author(s):  
Barbara Catania ◽  
Anna Maddalena

Knowledge-intensive applications rely on the usage of knowledge artifacts, called patterns, to represent huge quantities of heterogeneous raw data in a compact and semantically rich way. Due to the characteristics of patterns, specific systems are required for pattern management in order to model, store, retrieve and manipulate patterns in an efficient and effective way. Several theoretical and industrial approaches (relying on standard proposals, metadata management and business intelligence solutions) have already been proposed for pattern management. However, no critical comparison of the existing approaches has been proposed so far. The aim of this chapter is to provide such a comparison. In particular, specific issues concerning pattern management systems, pattern models and pattern languages are discussed. Several parameters are also identified that will be used in evaluating the effectiveness of theoretical and industrial proposals. The chapter concludes with a discussion of additional issues in the context of pattern management.

