Composite Event Processing for Data Streams and Domain Knowledge

In recent years, the processing of composite event queries over data streams has attracted a lot of research attention. Traditional database techniques were not designed for stream processing systems. Furthermore, example continuous queries are often formulated in a declarative query language without specifying their semantics. To overcome these deficiencies, this article presents the design, implementation, and evaluation of a system that processes data streams with semantic information. A set of optimization techniques is then proposed for query handling. Our approach thus not only makes it possible to express queries with sound semantics, but also provides a solid foundation for query optimization. Experimental results show that our approach is effective and efficient for data streams and domain knowledge.

Document SQL (DSQL)

Theoretical and Practical Advances in Information Systems Development ◽

10.4018/978-1-60960-521-6.ch013 ◽

2011 ◽

pp. 316-344

Author(s):

Arijit Sengupta ◽

V. Ramesh

Keyword(s):

Query Optimization ◽

Ad Hoc ◽

Query Language ◽

Optimization Techniques ◽

Order Logic ◽

First Order Logic ◽

Tree Structures ◽

Easy Integration ◽

Query Semantics ◽

Theoretical Foundations

This chapter presents DSQL, a conservative extension of SQL, as an ad-hoc query language for XML. The development of DSQL follows the theoretical foundations of first order logic, and uses common query semantics already accepted for SQL. DSQL represents a core subset of XQuery that lends itself well to query optimization techniques, while at the same time allowing easy integration into current databases and applications that use SQL. The intent of DSQL is not to replace XQuery, the current W3C recommended XML query language, but to serve as an ad-hoc querying frontend to XQuery. Further, the authors present proofs for important query language properties such as complexity and closure. An empirical study comparing DSQL and XQuery for the purpose of ad-hoc querying demonstrates that users perform better with DSQL for both flat and tree structures, in terms of both accuracy and efficiency.

Advanced signal processing system for demodulation, equalization and crosspol optimization of GBPS data streams

10.2514/6.1994-957 ◽

1994 ◽

Author(s):

Thomas Kolze ◽

Pascal Finkenbeiner ◽

Keith Yamashiro

Keyword(s):

Signal Processing ◽

Data Streams ◽

Processing System ◽

Signal Processing System

Query Rewriting for Incremental Continuous Query Evaluation in HIFUN

Algorithms ◽

10.3390/a14050149 ◽

2021 ◽

Vol 14 (5) ◽

pp. 149

Author(s):

Petros Zervoudakis ◽

Haridimos Kondylakis ◽

Nicolas Spyratos ◽

Dimitris Plexousakis

Keyword(s):

Query Optimization ◽

Query Language ◽

Computational Cost ◽

Continuous Queries ◽

Continuous Query ◽

Query Rewriting ◽

Query Evaluation ◽

Clear Separation ◽

Complete Dataset ◽

High Level

HIFUN is a high-level query language for expressing analytic queries of big datasets, offering a clear separation between the conceptual layer, where analytic queries are defined independently of the nature and location of data, and the physical layer, where queries are evaluated. In this paper, we present a methodology based on the HIFUN language, and the corresponding algorithms for the incremental evaluation of continuous queries. In essence, our approach is able to process the most recent data batch by exploiting already computed information, without requiring the evaluation of the query over the complete dataset. We present the generic algorithm, which we translated to both SQL and MapReduce using SPARK; it implements various query rewriting methods. We demonstrate the effectiveness of our approach in terms of query answering efficiency. Finally, we show that by exploiting the formal query rewriting methods of HIFUN, we can further reduce the computational cost, adding another layer of query optimization to our implementation.
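
The idea of folding only the newest batch into an already computed result can be illustrated with a small sketch. This is not the HIFUN algorithm from the paper; it is a minimal Python illustration that assumes a grouped SUM query, for which merging per-group partial results is valid. The names `IncrementalSum`, `add_batch`, and `full_recompute` are hypothetical.

```python
from collections import defaultdict


class IncrementalSum:
    """Hypothetical helper: keeps a grouped SUM up to date batch by batch."""

    def __init__(self):
        # Stored result of the query over all batches seen so far.
        self.totals = defaultdict(int)

    def add_batch(self, batch):
        # Step 1: evaluate the query over the new batch only.
        partial = defaultdict(int)
        for key, value in batch:
            partial[key] += value
        # Step 2: merge the partial result into the stored result.
        # This is valid for SUM because group totals can be added together.
        for key, value in partial.items():
            self.totals[key] += value
        return dict(self.totals)


def full_recompute(batches):
    """Baseline: evaluate the same query over the complete dataset."""
    totals = defaultdict(int)
    for batch in batches:
        for key, value in batch:
            totals[key] += value
    return dict(totals)
```

After every call to `add_batch`, the stored totals should equal what `full_recompute` returns over all batches seen so far, while each step touches only the rows of the newest batch.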

Query Optimization in the Databases Distributed by Means of Product Equivalence Relations

Fundamenta Informaticae ◽

10.3233/fi-1988-11303 ◽

1988 ◽

Vol 11 (3) ◽

pp. 241-265

Author(s):

W. Marek ◽

C. Rauszer

Keyword(s):

Query Optimization ◽

Selection Process ◽

Distributed Databases ◽

Optimization Techniques ◽

Equivalence Relations ◽

Database Operations ◽

By Products

In this paper, we address the problem of query optimization in distributed databases. We show that horizontal partitions of databases, generated by products of equivalence relations, induce optimization techniques for the basic database operations (i.e., the selection, projection, and join operators). In the case of selection, our method allows for restriction of the number of blocks to be searched in the selection process and subsequent simplification of the selection formula at each block. For the natural join operation, we propose an algorithm that reduces the computation of fragments. Proofs of the correctness of our algorithms are also included.
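
The selection case can be sketched as follows. This is only an illustration under assumed data, not the authors' algorithm: rows are partitioned horizontally by the product of two hypothetical equivalence relations ("same region" and "same decade"), so a selection on region and year needs to search a single block, and inside that block the region test can be dropped from the formula. All names and attributes are assumptions.

```python
from collections import defaultdict


def block_key(row):
    # Product of two equivalence relations (both assumed for illustration):
    # rows are equivalent if they share a region AND share a decade.
    return (row["region"], row["year"] // 10)


def partition(rows):
    """Build the horizontal partition: one block per product class."""
    blocks = defaultdict(list)
    for row in rows:
        blocks[block_key(row)].append(row)
    return blocks


def select(blocks, region, year):
    """Evaluate region == <region> AND year == <year> on the partition."""
    # Restriction of blocks: only one block can contain matching rows.
    candidates = blocks.get((region, year // 10), [])
    # Simplified formula: every row in this block already has the right
    # region, so only the year condition still has to be checked.
    return [row for row in candidates if row["year"] == year]


def select_naive(rows, region, year):
    """Baseline: scan every row with the full selection formula."""
    return [r for r in rows if r["region"] == region and r["year"] == year]
```

Both functions should return the same rows; the partitioned version inspects one block instead of the whole relation.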

An Adaptive Throughput-First Packet Scheduling Algorithm for DPDK-Based Packet Processing Systems

Future Internet ◽

10.3390/fi13030078 ◽

2021 ◽

Vol 13 (3) ◽

pp. 78

Author(s):

Chuanhong Li ◽

Lei Song ◽

Xuewen Zeng

Keyword(s):

Packet Loss ◽

High Performance ◽

Packet Scheduling ◽

Scheduling Algorithm ◽

Processing System ◽

System Throughput ◽

Packet Processing ◽

Research Attention ◽

Continuous Increase ◽

Packet Scheduling Algorithm

The continuous increase in network traffic has sharply increased the demand for high-performance packet processing systems. For a high-performance packet processing system based on multi-core processors, the packet scheduling algorithm is critical because of the significant role it plays in load distribution, which is related to system throughput, and it has attracted intensive research attention. However, packet scheduling is not an easy task, since the canonical flow-level packet scheduling algorithm is vulnerable to traffic locality, while the packet-level packet scheduling algorithm fails to maintain cache affinity. In this paper, we propose an adaptive throughput-first packet scheduling algorithm for DPDK-based packet processing systems. Building on DPDK's burst-oriented packet receiving and transmitting, we propose using Subflow as both the scheduling unit and the adjustment unit, so that the proposed algorithm not only maintains the advantages of flow-level packet scheduling algorithms when no adjustment happens but also avoids packet loss as much as possible when the target core may be overloaded. Experimental results show that the proposed method outperforms Round-Robin, HRW (High Random Weight), and CRC32 on system throughput and packet loss rate.

Technology of Continuous Query Optimization over Data Streams

2008 International Symposium on Information Science and Engineering ◽

10.1109/isise.2008.36 ◽

2008 ◽

Author(s):

Feng Weibing ◽

Li Zhanhuai

Keyword(s):

Query Optimization ◽

Data Streams ◽

Continuous Query

A Dynamic Subspace Anomaly Detection Method Using Generic Algorithm for Streaming Network Data

Handbook of Research on Emerging Developments in Data Privacy - Advances in Information Security, Privacy, and Ethics ◽

10.4018/978-1-4666-7381-6.ch018 ◽

2015 ◽

pp. 403-425

Author(s):

Ji Zhang

Keyword(s):

Anomaly Detection ◽

Data Streams ◽

Training Data ◽

Detection Methods ◽

Network Data ◽

Data Generation ◽

Research Attention ◽

Network Connection ◽

Dimensional Network ◽

Anomaly Classification

A great deal of research attention has been paid to data mining on data streams in recent years. In this chapter, the authors carry out a case study of anomaly detection in large and high-dimensional network connection data streams using the Stream Projected Outlier deTector (SPOT), proposed in Zhang et al. (2009) to detect anomalies from data streams using subspace analysis. SPOT is deployed on the 1999 KDD CUP anomaly detection application. Innovative approaches for training data generation, anomaly classification, false positive reduction, and adoptive detection subspace generation are proposed in this chapter as well. Experimental results demonstrate that SPOT is effective and efficient in detecting anomalies from network data streams and outperforms existing anomaly detection methods.

Designing Document SQL (DSQL)

Journal of Database Management ◽

10.4018/jdm.2009062502 ◽

2009 ◽

Vol 20 (4) ◽

pp. 26-53 ◽

Cited By ~ 5

Author(s):

Arijit Sengupta ◽

V. Ramesh

Keyword(s):

Ad Hoc ◽

Query Language ◽

Optimization Techniques ◽

Order Logic ◽

Conservative Extension ◽

Tree Structures ◽

First Order ◽

Easy Integration ◽

Query Semantics ◽

Theoretical Foundations

This article presents DSQL, a conservative extension of SQL, as an ad-hoc query language for XML. The development of DSQL follows the theoretical foundations of first order logic, and uses common query semantics already accepted for SQL. DSQL represents a core subset of XQuery that lends itself well to optimization techniques, while at the same time allowing easy integration into current databases and applications that use SQL. The intent of DSQL is not to replace XQuery, the current W3C recommended XML query language, but to serve as an ad-hoc querying frontend to XQuery. Further, the authors present proofs for important query language properties such as complexity and closure. An empirical study comparing DSQL and XQuery for the purpose of ad-hoc querying demonstrates that users perform better with DSQL for both flat and tree structures, in terms of both accuracy and efficiency.

Query Formation and Information Retrieval with Ontology

Semantic Web Technologies and E-Business ◽

10.4018/978-1-59904-192-6.ch013 ◽

2007 ◽

pp. 310-323

Author(s):

Sheng-Uei Guan

Keyword(s):

Genetic Algorithms ◽

Information Retrieval ◽

Query Optimization ◽

Formation Process ◽

Retrieval System ◽

Mobile Commerce ◽

Information Retrieval System ◽

Optimization Techniques ◽

Tree Structure ◽

Retrieval Scheme

This chapter presents an ontology-based query formation and information retrieval system under the mobile commerce (m-commerce) agent framework. A query formation approach that combines the usage of ontology and keywords is implemented. This approach takes advantage of the tree structure in ontology to form queries visually and efficiently. It also uses additional aids such as keywords to complete the query formation process more efficiently. The proposed information retrieval scheme focuses on using genetic algorithms (GAs) to improve computational effectiveness. Other query optimization techniques used include query restructuring by logical terms and numerical constraints replacement.

Query Optimization in Crowd-Sourcing Using Multi-Objective Ant Lion Optimizer

International Journal of Information Technology and Web Engineering ◽

10.4018/ijitwe.2019100103 ◽

2019 ◽

Vol 14 (4) ◽

pp. 50-63 ◽

Cited By ~ 1

Author(s):

Deepak Kumar ◽

Deepti Mehrotra ◽

Rohit Bansal

Keyword(s):

Query Optimization ◽

Query Language ◽

Database Systems ◽

Amazon Mechanical Turk ◽

Optimization Approach ◽

Crowd Sourcing ◽

Market Place ◽

Multi Objective ◽

Ant Lion Optimizer ◽

Ant Lion

Nowadays, query optimization is a major concern for crowd-sourcing systems, which are developed to relieve the user of the burden of dealing with the crowd. Initially, a user needs to submit a structured query language (SQL) based query, and the system takes responsibility for compiling the query, generating an execution plan, and evaluating it in the crowd-sourcing market place. The input queries have several alternative execution plans, and the crowd-sourcing cost differs between the worst and best plans. As in relational database systems, query optimization is essential for crowd-sourcing systems that provide declarative query interfaces. Here, a multi-objective query optimization approach using an ant-lion optimizer was employed for declarative crowd-sourcing systems. It generates a query plan that achieves a better balance between latency and cost. The proposed methodology was validated experimentally on the UCI automobile and Amazon Mechanical Turk (AMT) datasets. The proposed methodology saves 30%-40% of cost in crowd-sourcing query optimization compared to the existing methods.