EMD-DSJoin: Efficient Similarity Join Over Probabilistic Data Streams Based on Earth Mover’s Distance

Author(s): Jia Xu, Jiazhen Zhang, Chao Song, Qianzhen Zhang, Pin Lv, ...

2011, Vol 21 (4), pp. 535-559
Author(s): Jia Xu, Zhenjie Zhang, Anthony K. H. Tung, Ge Yu

2010, Vol 25 (3), pp. 389-400
Author(s): Bin Wang, Xiao-Chun Yang, Guo-Ren Wang, Ge Yu

Author(s): MOHAMMAD G. DEZFULI, MOSTAFA S. HAGHJOO

The inherent imprecision of data in many applications motivates support for uncertainty as a first-class concept. Data streams and probabilistic data have each received considerable attention recently, but mostly in isolation. However, many applications, including sensor data management systems and object monitoring systems, need both in tandem. Our main contribution is the design of a probabilistic data stream management system, called Sarcheshmeh, for continuous querying over probabilistic data streams. Sarcheshmeh supports uncertainty from the input data through to the final query results. In this paper, after reviewing the requirements and applications of probabilistic data streams, we present our new data model for probabilistic data streams and formally define our main logical operators. We then present the query language and physical operators. In addition, we introduce the architecture of Sarcheshmeh and describe some major challenges, such as memory management, along with our floating precision mechanism, toward designing a more robust system. Finally, we report an evaluation of our system and the effect of floating precision on the tradeoff between accuracy and efficiency.

