Advances in Data Mining and Database Management - Handbook of Research on Innovative Database Query Processing Techniques
Latest Publications


TOTAL DOCUMENTS

21
(FIVE YEARS 0)

H-INDEX

1
(FIVE YEARS 0)

Published By IGI Global

9781466687677, 9781466687684

Author(s):  
Wei Yan

Parallel processing of kNN queries over big data is an important issue in cloud computing environments. The k nearest neighbor query (kNN query), designed to find the k nearest neighbors in a dataset S for every object in another dataset R, is a primitive operator widely adopted by many applications, including knowledge discovery, data mining, and spatial databases. This chapter proposes a parallel method for kNN queries over big data using the MapReduce programming model. First, it proposes an approximate algorithm that maps multi-dimensional data sets into two-dimensional data sets and transforms kNN queries into a sequence of two-dimensional point searches. Then, in two-dimensional space, it proposes a partitioning method that incorporates the Voronoi diagram into the R-tree. Furthermore, it proposes an efficient algorithm for processing kNN queries based on the R-tree using the MapReduce programming model. Finally, the chapter presents the results of extensive experimental evaluations, which indicate the efficiency of the proposed approach.
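To make the operator described in this abstract concrete, the following is a minimal single-machine sketch of the exact, brute-force version: for every object in R it returns the k nearest objects in S. It is not the chapter's approximate MapReduce method; the function name `knn_join` and the sample points are assumptions chosen for illustration only.

```python
import heapq
import math


def knn_join(r_points, s_points, k):
    """For each point in R, return the k nearest points of S (exact, brute force)."""
    result = {}
    for r in r_points:
        # heapq.nsmallest keeps the k candidates with the smallest Euclidean distance.
        result[r] = heapq.nsmallest(k, s_points, key=lambda s: math.dist(r, s))
    return result


# Hypothetical two-dimensional sample data.
R = [(0.0, 0.0), (5.0, 5.0)]
S = [(1.0, 0.0), (0.0, 2.0), (4.0, 4.0), (9.0, 9.0)]
neighbours = knn_join(R, S, k=2)
```

This baseline compares every pair of objects, so its cost grows with |R| x |S|; that cost is what motivates partitioning and index-based approaches such as the one the abstract outlines.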


Author(s):  
Marlene Goncalves ◽  
Alberto Gobbi

Location-based Skyline queries select the nearest objects to a point that best meet the user's preferences. In particular, this chapter focuses on location-based Skyline queries over web-accessible data. Web-accessible data may have a geographical location and be geotagged with documents containing ratings by web users. Location-based Skyline queries may express preferences based on dynamic features such as distance and changeable ratings. In this context, distance must be recalculated when a user changes position, while ratings must be extracted from external data sources that are updated each time a user scores an item on the Web. This chapter describes and empirically studies four solutions capable of answering location-based Skyline queries, taking into account changes in the user's position and information extraction from the Web inside a search area around the user. They are based on an M-Tree index and the Divide & Conquer principle.


Author(s):  
José Ángel Labbad ◽  
Ricardo R. Monascal ◽  
Leonid Tineo

Traditional database systems and languages are very rigid, and XML data and query languages are no exception. Fuzzy set theory is an appropriate tool for solving this problem. In this sense, Fuzzy XQuery was proposed as an extension of the XQuery standard. This language defines the xs:truth datatype and the xml:truth attribute, and allows the definition and use of fuzzy terms in queries. The main goal of this chapter is to show a high-coupling implementation of Fuzzy XQuery within eXist-db, an open source XML DBMS. This extension strategy could also be used with other similar tools. The chapter also presents a statistical performance analysis of the extended fuzzy query engine using the XMark benchmark with user-defined fuzzy terms. The study presents promising results.
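As a rough illustration of what a user-defined fuzzy term and a truth degree mean, the sketch below models a term as a trapezoidal membership function and ranks items by the degree to which they satisfy it. It is written in Python rather than Fuzzy XQuery and does not reproduce that language's syntax; the `trapezoid` helper, the term `cheap`, and the sample prices are all assumptions made for this example.

```python
def trapezoid(a, b, c, d):
    """Membership function: rises on [a, b], equals 1 on [b, c], falls on [c, d]."""
    def mu(x):
        if b <= x <= c:
            return 1.0                   # core of the fuzzy set
        if x <= a or x >= d:
            return 0.0                   # outside the support
        if x < b:
            return (x - a) / (b - a)     # rising edge, a < x < b
        return (d - x) / (d - c)         # falling edge, c < x < d
    return mu


# Hypothetical fuzzy term: prices up to 20 are fully "cheap", prices of 50 or more are not.
cheap = trapezoid(0, 0, 20, 50)

# Hypothetical items with prices; keep those with a non-zero truth degree.
items = {"book A": 10, "book B": 35, "book C": 60}
ranked = sorted(
    ((name, cheap(price)) for name, price in items.items() if cheap(price) > 0),
    key=lambda pair: -pair[1],
)
```

A crisp query would return either every item under a fixed price or none above it; attaching a degree in [0, 1] to each answer is what lets a fuzzy query rank partial matches.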


Author(s):  
Fu Zhang ◽  
Z. M. Ma ◽  
Haitao Cheng

Ontologies, as a standard (W3C recommendation) for representing knowledge in the Semantic Web, have been employed in many application domains. Real ontologies now tend to be very large. This raises a problem born of practical needs: the efficient querying of ontologies. To this end, there are many proposals for answering queries over ontologies, and the literature on the topic is flourishing. In particular, many proposals build on the efficient and mature techniques of databases, which are useful for querying ontologies. To investigate the querying of ontologies and, more importantly, to identify directions for querying ontologies based on databases, this chapter provides a brief review of answering queries over ontologies based on databases. Some query techniques, their classifications, and directions for future research are introduced. Query formalisms over ontologies that are not related to databases are not covered here.


Author(s):  
Maristela Holanda ◽  
Jane Adriana Souza

This chapter investigates how NoSQL (Not Only SQL) databases provide query languages and data retrieval mechanisms. Users attest to many advantages in using NoSQL databases for specific applications; however, they also report that querying and retrieving data easily continues to be a problem. NoSQL operations require that, during the project, queries be thought of as built into the application code. The authors intend to contribute to the investigation of querying, considering different types of NoSQL databases.


Author(s):  
Marlene Goncalves ◽  
Fabiola Di Bartolo

Skyline queries may be used to filter interesting data from a broad range of data. A Skyline query selects those data that are the best according to multiple user-defined criteria. A special case of Skyline queries is the Spatial Skyline Query (SSQ). SSQ allow users to express preferences on the closeness between a set of data points and a set of query points. We study the problem of answering SSQ in the presence of changing data, i.e., data whose values regularly change over a period of time. This chapter proposes an algorithm to evaluate SSQ on changing data. The proposed algorithm avoids recomputing the whole Skyline with each update to the data. The performance of the proposed algorithm was also studied empirically against state-of-the-art algorithms. The experimental study shows that the proposed algorithm may be 3 times faster than state-of-the-art algorithms.
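The notion of a Spatial Skyline can be sketched with a naive from-scratch computation: a data point belongs to the Skyline if no other data point is at least as close to every query point and strictly closer to at least one. This is a textbook baseline under assumed names (`distance_vector`, `dominates`, `spatial_skyline`) and assumed sample points; it is not the chapter's algorithm, which is designed precisely to avoid recomputing the whole Skyline after each update.

```python
import math


def distance_vector(p, query_points):
    """Distances from data point p to each query point."""
    return tuple(math.dist(p, q) for q in query_points)


def dominates(u, v):
    """u dominates v if u is no farther on every query point and closer on at least one."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))


def spatial_skyline(data_points, query_points):
    """Naive quadratic Skyline over distance vectors (recomputed from scratch)."""
    vectors = {p: distance_vector(p, query_points) for p in data_points}
    return [
        p for p in data_points
        if not any(dominates(vectors[o], vectors[p]) for o in data_points if o != p)
    ]


# Hypothetical sample data: two query points and five data points.
Q = [(0.0, 0.0), (4.0, 0.0)]
P = [(2.0, 0.0), (2.0, 3.0), (0.0, 1.0), (5.0, 0.0), (6.0, 6.0)]
skyline = spatial_skyline(P, Q)
```

With changing data, every update would force this baseline to rebuild all distance vectors and repeat the pairwise comparison, which is the cost an incremental approach aims to reduce.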


Author(s):  
Li Yan ◽  
Z. M. Ma

Imperfect information is widespread in data- and knowledge-intensive applications, where fuzzy data play an important role. Fuzzy set theory has been extensively applied to extend various database models and has resulted in numerous contributions. The chapter concentrates on two main issues in fuzzy data management: fuzzy data models and fuzzy data querying based on those models. It provides a full, up-to-date overview of the current state of the art in fuzzy data modeling and querying and discusses the relationships among various fuzzy data models. In addition to providing a generic overview of the approaches to modeling and querying fuzzy data, the chapter serves to identify possible research opportunities in the area of fuzzy data management.


Author(s):  
Vasileios Zois ◽  
Charalampos Chelmis ◽  
Viktor K. Prasanna

Time series data emerge naturally in many fields of applied sciences and engineering, including but not limited to statistics, signal processing, mathematical finance, and weather and power consumption forecasting. Although time series data have been well studied in the past, they still present a challenge to the scientific community. Advanced operations such as classification, segmentation, prediction, anomaly detection and motif discovery are very useful, especially for machine learning as well as other scientific fields. The advent of Big Data in almost every scientific domain motivates us to provide an in-depth study of the state-of-the-art approaches associated with techniques for efficient querying of time series. This chapter aims at providing a comprehensive review of the existing solutions related to time series representation, processing, indexing and querying operations.


Author(s):  
Gloria Bordogna ◽  
Simone Sterlacchini ◽  
Paolo Arcaini

In this chapter we propose a framework for collecting information in social networks, organizing it into a database, and querying it by specifying content-based, geographic and temporal conditions, with the aim of detecting periodic and aperiodic events. Our proposal could be a basis for developing context-aware services, for example, identifying streets and their rush hours by analyzing the social media messages periodically sent by queuing drivers, and reporting these critical spatio-temporal situations to help other drivers plan alternative routes. Specifically, we rely on a focused crawler to periodically collect messages in social networks related to the contents of interest, and on an original geo-temporal clustering algorithm to explore the geo-temporal distribution of the messages. The clustering algorithm can be customized to identify aperiodic and periodic events at global or local scale based on the specification of geographic and temporal query conditions.


Author(s):  
Soraya O. Carrasquel ◽  
Ricardo R. Monascal ◽  
Rosseline Rodríguez ◽  
Leonid Tineo

There are some data models and query languages based on the application of fuzzy set theory. Their goal is to provide more flexible DBMSs that allow the expression of user preferences in querying as well as imprecision in data. In this sense, the FuzzyEER data model proposes four kinds of fuzzy attributes. One of them, named Type 3, consists of a set of labels provided with a similarity relation. An extension of SQL, named FSQL, allows the expression and use of fuzzy attributes. Nevertheless, FSQL does not allow fuzzy attributes to be used in some clauses based on data ordering, due to a semantic problem. This chapter presents a solution to this problem for Type 3 fuzzy attributes. The main contribution is how to process queries involving such attributes by means of an extension to an existing RDBMS. Formal semantics, grammar, catalogue definition and translation schemas are contained in this chapter.

