Composite Event Processing for Data Streams and Domain Knowledge

2011 ◽  
Vol 219-220 ◽  
pp. 927-931
Author(s):  
Jun Qiang Liu ◽  
Xiao Ling Guan

In recent years, the processing of composite event queries over data streams has attracted a lot of research attention. Traditional database techniques were not designed for stream processing systems. Furthermore, example continuous queries are often formulated in a declarative query language without specifying the semantics. To overcome these deficiencies, this article presents the design, implementation, and evaluation of a system that processes data streams with semantic information. A set of optimization techniques is then proposed for handling queries. Our approach thus not only makes it possible to express queries with sound semantics, but also provides a solid foundation for query optimization. Experimental results show that our approach is effective and efficient for data streams and domain knowledge.

Author(s):  
Arijit Sengupta ◽  
V. Ramesh

This chapter presents DSQL, a conservative extension of SQL, as an ad-hoc query language for XML. The development of DSQL follows the theoretical foundations of first-order logic and uses common query semantics already accepted for SQL. DSQL represents a core subset of XQuery that lends itself well to query optimization techniques, while at the same time allowing easy integration into current databases and applications that use SQL. The intent of DSQL is not to replace XQuery, the current W3C recommended XML query language, but to serve as an ad-hoc querying frontend to XQuery. Further, the authors present proofs for important query language properties such as complexity and closure. An empirical study comparing DSQL and XQuery for the purpose of ad-hoc querying demonstrates that users perform better with DSQL for both flat and tree structures, in terms of both accuracy and efficiency.


Algorithms ◽  
2021 ◽  
Vol 14 (5) ◽  
pp. 149
Author(s):  
Petros Zervoudakis ◽  
Haridimos Kondylakis ◽  
Nicolas Spyratos ◽  
Dimitris Plexousakis

HIFUN is a high-level query language for expressing analytic queries of big datasets, offering a clear separation between the conceptual layer, where analytic queries are defined independently of the nature and location of data, and the physical layer, where queries are evaluated. In this paper, we present a methodology based on the HIFUN language, and the corresponding algorithms for the incremental evaluation of continuous queries. In essence, our approach is able to process the most recent data batch by exploiting already computed information, without requiring the evaluation of the query over the complete dataset. We present the generic algorithm, which we translated to both SQL and MapReduce using SPARK; it implements various query rewriting methods. We demonstrate the effectiveness of our approach in terms of query answering efficiency. Finally, we show that by exploiting the formal query rewriting methods of HIFUN, we can further reduce the computational cost, adding another layer of query optimization to our implementation.
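The idea of incremental evaluation described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' HIFUN algorithm and does not use SQL or MapReduce: it only assumes a grouped-sum query over hypothetical `(key, value)` records, and the names `update_totals` and `recompute` are invented for illustration. The point is that folding in only the newest batch gives the same answer as re-evaluating the query over all data seen so far.

```python
from collections import defaultdict


def update_totals(totals, batch):
    """Fold one new batch of (key, value) pairs into the running per-key sums."""
    for key, value in batch:
        totals[key] += value
    return totals


def recompute(batches):
    """Baseline: evaluate the grouped sum from scratch over every batch."""
    totals = defaultdict(int)
    for batch in batches:
        for key, value in batch:
            totals[key] += value
    return totals


# Hypothetical data: two batches arriving one after the other.
batch_1 = [("branch_a", 10), ("branch_b", 5)]
batch_2 = [("branch_a", 7)]

# Incremental evaluation: each batch is processed once, reusing earlier results.
running = defaultdict(int)
update_totals(running, batch_1)
update_totals(running, batch_2)

# Full evaluation over the complete dataset, for comparison.
full = recompute([batch_1, batch_2])
```

This works for a sum because the aggregate can be updated from the previous result and the new batch alone; aggregates without that property would need extra stored state.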


1988 ◽  
Vol 11 (3) ◽  
pp. 241-265
Author(s):  
W. Marek ◽  
C. Rauszer

In this paper, we address the problem of query optimization in distributed databases. We show that horizontal partitions of databases, generated by products of equivalence relations, induce optimization techniques for the basic database operations (i.e., the selection, projection, and join operators). In the case of selection, our method allows for restriction of the number of blocks to be searched in the selection process and subsequent simplification of the selection formula at each block. For the natural join operation, we propose an algorithm that reduces the computation of fragments. Proofs of the correctness of our algorithms are also included.
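As a rough illustration of the selection case described in this abstract, the Python sketch below partitions rows horizontally by the value of one attribute (a single equivalence relation, rather than the products of equivalence relations the paper considers). It is an assumption-laden sketch, not the authors' algorithm: the helper names `partition` and `select`, and the `region` and `amount` attributes, are hypothetical. A selection that fixes the partitioning attribute needs to search only one block, and inside that block the test on the partitioning attribute is always true, so only the remaining predicate is evaluated.

```python
def partition(rows, key):
    """Group rows into blocks, one block per distinct value of `key`."""
    blocks = {}
    for row in rows:
        blocks.setdefault(row[key], []).append(row)
    return blocks


def select(blocks, key_value, residual):
    """Evaluate `key == key_value AND residual(row)` using the partition.

    Only the block for `key_value` can contain matches, and within it the
    key test holds for every row, so the formula simplifies to `residual`.
    """
    return [row for row in blocks.get(key_value, []) if residual(row)]


# Hypothetical rows for illustration.
rows = [
    {"region": "eu", "amount": 120},
    {"region": "eu", "amount": 40},
    {"region": "us", "amount": 300},
]

blocks = partition(rows, "region")
matches = select(blocks, "eu", lambda row: row["amount"] > 100)
```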


2021 ◽  
Vol 13 (3) ◽  
pp. 78
Author(s):  
Chuanhong Li ◽  
Lei Song ◽  
Xuewen Zeng

The continuous increase in network traffic has sharply increased the demand for high-performance packet processing systems. For a high-performance packet processing system based on multi-core processors, the packet scheduling algorithm is critical because of the significant role it plays in load distribution, which is related to system throughput, and it has attracted intensive research attention. However, packet scheduling is not an easy task, since the canonical flow-level packet scheduling algorithm is vulnerable to traffic locality, while the packet-level packet scheduling algorithm fails to maintain cache affinity. In this paper, we propose an adaptive throughput-first packet scheduling algorithm for DPDK-based packet processing systems. Taking advantage of DPDK's burst-oriented packet receiving and transmitting, we propose using Subflow as both the scheduling unit and the adjustment unit, so that the proposed algorithm not only maintains the advantages of flow-level packet scheduling algorithms when no adjustment happens but also avoids packet loss as much as possible when the target core may be overloaded. Experimental results show that the proposed method outperforms Round-Robin, HRW (High Random Weight), and CRC32 on system throughput and packet loss rate.


Author(s):  
Ji Zhang

A great deal of research attention has been paid to data mining on data streams in recent years. In this chapter, the authors carry out a case study of anomaly detection in large and high-dimensional network connection data streams using the Stream Projected Outlier deTector (SPOT), proposed in Zhang et al. (2009), which detects anomalies in data streams using subspace analysis. SPOT is deployed on the 1999 KDD CUP anomaly detection application. Innovative approaches for training data generation, anomaly classification, false positive reduction, and adaptive detection subspace generation are proposed in this chapter as well. Experimental results demonstrate that SPOT is effective and efficient in detecting anomalies from network data streams and outperforms existing anomaly detection methods.


2009 ◽  
Vol 20 (4) ◽  
pp. 26-53
Author(s):  
Arijit Sengupta ◽  
V. Ramesh

This article presents DSQL, a conservative extension of SQL, as an ad-hoc query language for XML. The development of DSQL follows the theoretical foundations of first-order logic and uses common query semantics already accepted for SQL. DSQL represents a core subset of XQuery that lends itself well to optimization techniques, while at the same time allowing easy integration into current databases and applications that use SQL. The intent of DSQL is not to replace XQuery, the current W3C recommended XML query language, but to serve as an ad-hoc querying frontend to XQuery. Further, the authors present proofs for important query language properties such as complexity and closure. An empirical study comparing DSQL and XQuery for the purpose of ad-hoc querying demonstrates that users perform better with DSQL for both flat and tree structures, in terms of both accuracy and efficiency.


Author(s):  
Sheng-Uei Guan

This chapter presents an ontology-based query formation and information retrieval system under the mobile commerce (m-commerce) agent framework. A query formation approach that combines the usage of ontology and keywords is implemented. This approach takes advantage of the tree structure in ontology to form queries visually and efficiently. It also uses additional aids such as keywords to complete the query formation process more efficiently. The proposed information retrieval scheme focuses on using genetic algorithms (GAs) to improve computational effectiveness. Other query optimization techniques used include query restructuring by logical terms and numerical constraints replacement.


Author(s):  
Deepak Kumar ◽  
Deepti Mehrotra ◽  
Rohit Bansal

Nowadays, query optimization is a major concern for crowd-sourcing systems, which are developed to relieve the user of the burden of dealing with the crowd. Initially, a user submits a structured query language (SQL) based query, and the system takes responsibility for compiling the query, generating an execution plan, and evaluating it in the crowd-sourcing marketplace. An input query has several alternative execution plans, and the crowd-sourcing cost differs between the worst and best plans. As in relational database systems, query optimization is essential for crowd-sourcing systems that provide declarative query interfaces. Here, a multi-objective query optimization approach using an ant-lion optimizer was employed for declarative crowd-sourcing systems. It generates a query plan that achieves a better balance between latency and cost. The proposed methodology was validated experimentally on the UCI automobile and Amazon Mechanical Turk (AMT) datasets. The proposed methodology saves 30%-40% of cost in crowd-sourcing query optimization compared to the existing methods.

