Pruning Optimization over Threshold-Based Historical Continuous Query

Jiwei Qin; Liangli Ma; Qing Liu

doi:10.3390/a12050107

Algorithms ◽

10.3390/a12050107 ◽

2019 ◽

Vol 12 (5) ◽

pp. 107

Author(s):

Jiwei Qin ◽

Liangli Ma ◽

Qing Liu

Keyword(s):

Moving Objects ◽

Real Data ◽

Continuous Query ◽

Data Sets ◽

Trajectory Data ◽

Mobile Location ◽

Location Service ◽

Processing Cost ◽

Pruning Strategy ◽

Spatiotemporal Queries

With the growth of mobile location service applications, spatiotemporal queries over the trajectory data of moving objects have become a research hotspot, and the continuous query is one of the key types of spatiotemporal query. In this paper, we study a sub-domain of continuous queries over moving objects, namely pruning optimization for threshold-based historical continuous queries. First, because the processing cost of the Mindist-based pruning strategy is too high, a pruning strategy based on extended Minimum Bounding Rectangle overlap is proposed to reduce the processing overhead. Second, a best-first traversal algorithm based on the E3DR-tree is proposed to ensure that an accurate pruning candidate set can be obtained while accessing as few index nodes as possible. Finally, experiments on real data sets show that our method significantly outperforms other similar methods.
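As a rough illustration of the overlap-based pruning idea in this abstract, the Python sketch below grows a query rectangle by the distance threshold and keeps only the candidates whose bounding rectangles overlap it. The `MBR` class, the function names, and the two-dimensional setting are assumptions made for illustration; the paper's E3DR-tree and its exact extension rule are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MBR:
    """Axis-aligned minimum bounding rectangle (hypothetical helper type)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def extend(self, d: float) -> "MBR":
        # Grow the rectangle by d on every side.
        return MBR(self.xmin - d, self.ymin - d, self.xmax + d, self.ymax + d)

    def overlaps(self, other: "MBR") -> bool:
        # Rectangles overlap unless one lies entirely to one side of the other.
        return not (
            self.xmax < other.xmin or other.xmax < self.xmin
            or self.ymax < other.ymin or other.ymax < self.ymin
        )


def prune_candidates(query: MBR, candidates: dict[str, MBR], threshold: float) -> list[str]:
    """Return ids whose MBR overlaps the query MBR extended by the threshold."""
    extended = query.extend(threshold)
    return [oid for oid, box in candidates.items() if extended.overlaps(box)]
```

The overlap test needs only a handful of comparisons per candidate and never discards an object that has a point within the threshold of the query rectangle. The surviving candidates would still need an exact distance check.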

An efficient trajectory-clustering algorithm based on an index tree

Transactions of the Institute of Measurement and Control ◽

10.1177/0142331211423284 ◽

2011 ◽

Vol 34 (7) ◽

pp. 850-861 ◽

Cited By ~ 15

Author(s):

Guan Yuan ◽

Shixiong Xia ◽

Lei Zhang ◽

Yong Zhou ◽

Cheng Ji

Keyword(s):

Radio Frequency Identification ◽

Clustering Algorithm ◽

Real Data ◽

Structural Similarity ◽

Location Based Services ◽

Similarity Function ◽

Data Sets ◽

Trajectory Clustering ◽

Trajectory Data ◽

Index Tree

With the development of location-based services, such as the Global Positioning System and Radio Frequency Identification, a great deal of trajectory data can be collected. Therefore, how to mine knowledge from these data has become an attractive topic. In this paper, we propose an efficient trajectory-clustering algorithm based on an index tree. First, an index tree is proposed to store trajectories and their similarity matrix, with which trajectories can be retrieved efficiently; second, a new concept of trajectory structure is introduced to analyse both the internal and external features of trajectories; then, trajectories are partitioned into trajectory segments according to their corners; furthermore, the similarity between every pair of trajectory segments is computed with a structural similarity function; finally, trajectory segments are grouped into different clusters according to their location in the different levels of the index tree. Experimental results on real data sets demonstrate not only the efficiency and effectiveness of our algorithm, but also its flexibility: feature sensitivity can be adjusted through different parameters, and the clustering results are more practically significant.
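The corner-based partitioning step mentioned in this abstract can be sketched as follows. This is a minimal Python illustration that assumes a "corner" is a sample point where the heading changes by more than a chosen angle; the function names and the `max_turn` parameter are hypothetical, and the paper's own corner definition may differ.

```python
import math


def turning_angle(a, b, c):
    """Absolute change of heading (radians) at b on the path a -> b -> c."""
    h1 = math.atan2(b[1] - a[1], b[0] - a[0])
    h2 = math.atan2(c[1] - b[1], c[0] - b[0])
    diff = abs(h2 - h1)
    # Wrap so the result always lies in [0, pi].
    return min(diff, 2 * math.pi - diff)


def split_at_corners(points, max_turn):
    """Cut a trajectory (list of (x, y) tuples) wherever the turn exceeds max_turn."""
    segments = []
    start = 0
    for i in range(1, len(points) - 1):
        if turning_angle(points[i - 1], points[i], points[i + 1]) > max_turn:
            # The corner point closes one segment and opens the next.
            segments.append(points[start:i + 1])
            start = i
    segments.append(points[start:])
    return segments
```

For example, an L-shaped path with a right-angle turn is split into two straight segments that share the corner point when `max_turn` is set to `math.pi / 4`.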

Privacy preserving semantic trajectory data publishing for mobile location-based services

Wireless Networks ◽

10.1007/s11276-019-02058-8 ◽

2019 ◽

Vol 26 (8) ◽

pp. 5551-5560 ◽

Cited By ~ 1

Author(s):

Rong Tan ◽

Yuan Tao ◽

Wen Si ◽

Yuan-Yuan Zhang

Keyword(s):

Moving Objects ◽

Information Source ◽

Location Based Services ◽

Data Publishing ◽

Trajectory Data ◽

Similarity Comparison ◽

Mobile Location ◽

Motion Modes ◽

Network Topologies ◽

Rich Information

The development of wireless technologies and the popularity of mobile devices are responsible for generating large amounts of trajectory data for moving objects. Trajectory datasets have spatiotemporal features and are a rich information source. The mining of trajectory data can reveal interesting patterns of human activities and behaviors. However, trajectory data can also be exploited to disclose users’ private information, e.g., the places they live and work, which could be abused by a malicious user. Therefore, it is very important to protect the users’ privacy before publishing any trajectory data. While most previous research on this subject has only considered the privacy protection of stay points, this paper distinguishes itself by modeling and processing semantic trajectories, which not only contain spatiotemporal data but also involve POI information and the users’ motion modes such as walking, running, driving, etc. Accordingly, in this research, a semantic trajectory anonymization method based on the k-anonymity model is proposed that forms sensitive areas containing k − 1 POI points that are similar to the sensitive points. Then, trajectory ambiguity is executed based on the motion modes, road network topologies and road weights in the sensitive area. Finally, a similarity comparison is performed to obtain the recordable and releasable anonymity trajectory sets. Experimental results show that this method performs efficiently and provides high privacy levels.

Design and Implementation of WiFi Network Based Mobile Location Service System

2012 Second International Conference on Business Computing and Global Informatization ◽

10.1109/bcgin.2012.172 ◽

2012 ◽

Cited By ~ 1

Author(s):

Zhang Lingcui ◽

Zhao Fang ◽

Li Zhaohui ◽

Luo Haiyong ◽

Xu Junjun

Keyword(s):

Service System ◽

Mobile Location ◽

Location Service ◽

Design And Implementation

Transforming variables to central normality

Machine Learning ◽

10.1007/s10994-021-05960-5 ◽

2021 ◽

Author(s):

Jakob Raymaekers ◽

Peter J. Rousseeuw

Keyword(s):

Maximum Likelihood ◽

Maximum Likelihood Estimator ◽

Simulation Study ◽

Real Data ◽

Data Sets ◽

Transformation Parameter ◽

Likelihood Estimator ◽

Extensive Simulation ◽

Highly Sensitive

Many real data sets contain numerical features (variables) whose distribution is far from normal (Gaussian). Instead, their distribution is often skewed. In order to handle such data it is customary to preprocess the variables to make them more normal. The Box–Cox and Yeo–Johnson transformations are well-known tools for this. However, the standard maximum likelihood estimator of their transformation parameter is highly sensitive to outliers, and will often try to move outliers inward at the expense of the normality of the central part of the data. We propose a modification of these transformations as well as an estimator of the transformation parameter that is robust to outliers, so the transformed data can be approximately normal in the center and a few outliers may deviate from it. It compares favorably to existing techniques in an extensive simulation study and on real data.
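For readers unfamiliar with the two transformations named in this abstract, the Python sketch below writes out their standard textbook definitions for a single value. It shows only the classical Box–Cox and Yeo–Johnson formulas; the modified transformations and the robust estimator proposed in the paper are not reproduced, and the function names are chosen here for illustration.

```python
import math


def box_cox(x: float, lam: float) -> float:
    """Classical Box-Cox transform; defined only for x > 0."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive input")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def yeo_johnson(x: float, lam: float) -> float:
    """Classical Yeo-Johnson transform; defined for any real x."""
    if x >= 0:
        if lam == 0:
            return math.log1p(x)
        return ((x + 1.0) ** lam - 1.0) / lam
    # Negative branch mirrors the positive one with exponent 2 - lam.
    if lam == 2:
        return -math.log1p(-x)
    return -((1.0 - x) ** (2.0 - lam) - 1.0) / (2.0 - lam)
```

With `lam = 1`, Yeo–Johnson leaves a value unchanged and Box–Cox shifts it by one. Values of `lam` below 1 compress the right tail, which is why the parameter is usually estimated from the data.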

A New Extension of Thinning-Based Integer-Valued Autoregressive Models for Count Data

Entropy ◽

10.3390/e23010062 ◽

2020 ◽

Vol 23 (1) ◽

pp. 62

Author(s):

Zhengwei Liu ◽

Fukang Zhu

Keyword(s):

Likelihood Estimation ◽

Real Data ◽

Autoregressive Models ◽

Superior Performance ◽

Data Sets ◽

Binomial Thinning ◽

Free Case ◽

Two Parameters ◽

Conditional Maximum ◽

Thinning Operator

The thinning operators play an important role in the analysis of integer-valued autoregressive models, and the most widely used is the binomial thinning. Inspired by the theory about extended Pascal triangles, a new thinning operator named extended binomial is introduced, which is a general case of the binomial thinning. Compared to the binomial thinning operator, the extended binomial thinning operator has two parameters and is more flexible in modeling. Based on the proposed operator, a new integer-valued autoregressive model is introduced, which can accurately and flexibly capture the dispersed features of counting time series. Two-step conditional least squares (CLS) estimation is investigated for the innovation-free case and the conditional maximum likelihood estimation is also discussed. We have also obtained the asymptotic property of the two-step CLS estimator. Finally, three overdispersed or underdispersed real data sets are considered to illustrate a superior performance of the proposed model.
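To make the role of the thinning operator concrete, the following Python sketch simulates a first-order integer-valued autoregressive series with the classical binomial thinning operator, in which each of the previous count's units survives independently with probability `alpha`. It does not implement the extended binomial operator proposed in the paper; the Poisson innovations, the function names, and the parameters are assumptions made for illustration.

```python
import math
import random


def binomial_thinning(alpha: float, x: int, rng: random.Random) -> int:
    """Classical thinning: each of the x units survives with probability alpha."""
    return sum(1 for _ in range(x) if rng.random() < alpha)


def poisson(mean: float, rng: random.Random) -> int:
    """Draw a Poisson variate with Knuth's multiplication method."""
    limit = math.exp(-mean)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_inar1(alpha: float, innovation_mean: float, n: int,
                   x0: int = 0, seed: int = 0) -> list[int]:
    """Simulate X_t = (alpha thinned X_{t-1}) + e_t with Poisson innovations e_t."""
    rng = random.Random(seed)
    path, x = [], x0
    for _ in range(n):
        x = binomial_thinning(alpha, x, rng) + poisson(innovation_mean, rng)
        path.append(x)
    return path
```

Because thinning returns an integer between 0 and the previous count, the simulated series stays integer-valued and non-negative, which a real-valued autoregression would not guarantee.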

Goodness-of-Fit Tests for Bivariate Time Series of Counts

Econometrics ◽

10.3390/econometrics9010010 ◽

2021 ◽

Vol 9 (1) ◽

pp. 10

Author(s):

Šárka Hudecová ◽

Marie Hušková ◽

Simos G. Meintanis

Keyword(s):

Goodness Of Fit ◽

Probability Generating Function ◽

Parametric Bootstrap ◽

Real Data ◽

Data Sets ◽

Test Statistics ◽

Finite Sample ◽

Generalized Poisson ◽

Goodness Of Fit Tests ◽

Monte Carlo Experiments

This article considers goodness-of-fit tests for bivariate INAR and bivariate Poisson autoregression models. The test statistics are based on an L2-type distance between two estimators of the probability generating function of the observations: one being entirely nonparametric and the second one being semiparametric, computed under the corresponding null hypothesis. The asymptotic distribution of the proposed test statistics, both under the null hypotheses and under alternatives, is derived and consistency is proved. The case of testing bivariate generalized Poisson autoregression and the extension of the methods to dimensions higher than two are also discussed. The finite-sample performance of a parametric bootstrap version of the tests is illustrated via a series of Monte Carlo experiments. The article concludes with applications to real data sets and a discussion.

TraceAll: A Real-Time Processing for Contact Tracing Using Indoor Trajectories

Information ◽

10.3390/info12050202 ◽

2021 ◽

Vol 12 (5) ◽

pp. 202

Author(s):

Louai Alarabi ◽

Saleh Basalamah ◽

Abdeltawab Hendawi ◽

Mohammed Abdalla

Keyword(s):

Infectious Diseases ◽

Infected Patient ◽

Public Health Problem ◽

Real Data ◽

Exposure Period ◽

Contact Tracing ◽

Data Sets ◽

Major Public Health Problem ◽

Real Time Processing ◽

Recent Developments

The rapid spread of infectious diseases is a major public health problem. Recent developments in fighting these diseases have heightened the need for a contact tracing process. Contact tracing can be considered an ideal method for controlling the transmission of infectious diseases. The contact tracing process leads to diagnostic testing, treatment or self-isolation of suspected cases, and treatment of infected persons; this eventually limits the spread of diseases. This paper proposes a technique named TraceAll that traces all contacts exposed to the infected patient and produces a list of these contacts to be considered potentially infected patients. First, it treats the infected patient as the querying user and starts to fetch the contacts exposed to that user. Second, it obtains all the trajectories that belong to the objects that moved near the querying user. Next, it examines these trajectories, considering the social distance and exposure period, to determine whether these objects have become infected. The experimental evaluation of the proposed technique with real data sets illustrates the effectiveness of this solution. Comparative analysis experiments confirm that TraceAll outperforms baseline methods by 40% regarding the efficiency of answering contact tracing queries.
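The distance-and-duration check described in this abstract can be illustrated with a small Python sketch. It assumes each trajectory is a mapping from a shared timestamp to an `(x, y)` position and that exposure is accumulated over all timestamps rather than required to be consecutive; these choices, the function names, and the parameters are hypothetical and are not taken from TraceAll itself.

```python
import math


def exposure_time(query, other, social_distance, step=1.0):
    """Total time during which `other` was within social_distance of `query`.

    Both trajectories map timestamp -> (x, y); `step` is the sampling interval.
    """
    shared = query.keys() & other.keys()
    close = sum(1 for t in shared if math.dist(query[t], other[t]) <= social_distance)
    return step * close


def trace_contacts(query, others, social_distance, min_exposure, step=1.0):
    """Return the sorted ids of objects exposed for at least min_exposure."""
    return sorted(
        oid for oid, traj in others.items()
        if exposure_time(query, traj, social_distance, step) >= min_exposure
    )
```

A production system would normally use a spatial index to find nearby trajectories first and run a check like this only on those candidates.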

The Flexible Burr X-G Family: Properties, Inference, and Applications in Engineering Science

Symmetry ◽

10.3390/sym13030474 ◽

2021 ◽

Vol 13 (3) ◽

pp. 474

Author(s):

Abdulhakim A. Al-Babtain ◽

Ibrahim Elbatal ◽

Hazem Al-Mofleh ◽

Ahmed M. Gemeay ◽

Ahmed Z. Afify ◽

...

Keyword(s):

Numerical Simulations ◽

Exponential Distribution ◽

Real Data ◽

Exponential Model ◽

Statistical Properties ◽

Engineering Science ◽

Data Sets ◽

Engineering Sciences ◽

General Statistical ◽

Anderson Darling

In this paper, we introduce a new flexible generator of continuous distributions called the transmuted Burr X-G (TBX-G) family to extend and increase the flexibility of the Burr X generator. The general statistical properties of the TBX-G family are calculated. One special sub-model, TBX-exponential distribution, is studied in detail. We discuss eight estimation approaches to estimating the TBX-exponential parameters, and numerical simulations are conducted to compare the suggested approaches based on partial and overall ranks. Based on our study, the Anderson–Darling estimators are recommended to estimate the TBX-exponential parameters. Using two skewed real data sets from the engineering sciences, we illustrate the importance and flexibility of the TBX-exponential model compared with other existing competing distributions.

Kumaraswamy Generalized Power Lomax Distribution and Its Applications

Stats ◽

10.3390/stats4010003 ◽

2021 ◽

Vol 4 (1) ◽

pp. 28-45

Author(s):

Vasili B.V. Nagarjuna ◽

R. Vishnu Vardhan ◽

Christophe Chesneau

Keyword(s):

Hazard Rate ◽

Real Data ◽

Rate Function ◽

Maximum Likelihood Estimates ◽

Parameter Estimates ◽

Parameter Distribution ◽

Data Sets ◽

Lomax Distribution ◽

Entropy Measures ◽

Modeling Behavior

In this paper, a new five-parameter distribution is proposed using the functionalities of the Kumaraswamy generalized family of distributions and the features of the power Lomax distribution. It is named the Kumaraswamy generalized power Lomax distribution. In a first approach, we derive its main probability and reliability functions, with a visualization of its modeling behavior by considering different parameter combinations. As a prime quality, the corresponding hazard rate function is very flexible; it possesses decreasing, increasing and inverted (upside-down) bathtub shapes. Decreasing-increasing-decreasing shapes are also observed. Some important characteristics of the Kumaraswamy generalized power Lomax distribution are derived, including moments, entropy measures and order statistics. The second approach is statistical. The maximum likelihood estimates of the parameters are described and a brief simulation study shows their effectiveness. Two real data sets are taken to show how the proposed distribution can be applied concretely; parameter estimates are obtained and fitting comparisons are performed with other well-established Lomax-based distributions. The Kumaraswamy generalized power Lomax distribution turns out to be the best, capturing fine details in the structure of the data considered.

The Censored Beta-Skew Alpha-Power Distribution

Symmetry ◽

10.3390/sym13071114 ◽

2021 ◽

Vol 13 (7) ◽

pp. 1114

Author(s):

Guillermo Martínez-Flórez ◽

Roger Tovar-Falón ◽

María Martínez-Guerra

Keyword(s):

Power Distribution ◽

Information Matrix ◽

Real Data ◽

Likelihood Method ◽

Data Sets ◽

Alpha Power ◽

Score Functions ◽

New Family ◽

Two Parameters ◽

Family Of Distributions

This paper introduces a new family of distributions for modelling censored multimodal data. The model extends the widely known tobit model by introducing two parameters that control the shape and the asymmetry of the distribution. Basic properties of this new family of distributions are studied in detail and a model for censored positive data is also studied. The problem of estimating parameters is addressed by considering the maximum likelihood method. The score functions and the elements of the observed information matrix are given. Finally, three applications to real data sets are reported to illustrate the developed methodology.