Highway Event Detection Algorithm Based on Improved Fast Peak Clustering

2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Lili Pei ◽  
Zhaoyun Sun ◽  
Yuxi Han ◽  
Wei Li ◽  
Huaixin Zhao

Aiming at the mining of traffic events based on large amounts of highway data, this paper proposes an improved fast peak clustering algorithm to process highway toll data. The highway toll data are first analyzed, and a data cleaning method based on the sum of similar coefficients is proposed to process the original data. Next, to avoid the shortcomings of the excessive subjectivity of the original algorithm, an improved fast peak clustering algorithm is proposed. Finally, the improved algorithm is applied to highway traffic condition analysis and abnormal event mining to obtain more accurate and intuitive clustering results. Compared with two classical algorithms, namely, the k-means and density-based spatial clustering of applications with noise (DBSCAN) algorithms, as well as the unimproved original fast peak clustering algorithm, the proposed algorithm is faster and more accurate and can reveal the complex relationships among massive data more efficiently. During the process of reforming the toll system, the algorithm can automatically and more efficiently analyze massive toll data and detect abnormal events, thereby providing a theoretical basis and data support for the operation monitoring and maintenance of highways.
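For orientation, the two quantities that the original (unimproved) fast peak clustering method computes for every point can be sketched as follows. This is a minimal illustration of the baseline only, not the improved algorithm proposed in the paper; the function names, the cutoff `d_c`, and the "top-k product" rule for picking centers are assumptions made for the example.

```python
import math

def density_peaks(points, d_c):
    """Return (rho, delta) per point, using the baseline density-peak definitions."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho_i: number of other points strictly closer than the cutoff distance d_c
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # delta_i: distance to the nearest denser point; points with no denser
        # point (ties included) receive their maximum distance instead
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta

def pick_centers(rho, delta, k):
    """Hypothetical selection rule: take the k points with the largest rho * delta."""
    gamma = [r * d for r, d in zip(rho, delta)]
    return sorted(range(len(gamma)), key=lambda i: gamma[i], reverse=True)[:k]
```

Both `d_c` and `k` are chosen by hand here, which is the kind of subjectivity the abstract says the improved algorithm is meant to avoid.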

2020 ◽  
Vol 91 (4) ◽  
pp. 2206-2217
Author(s):  
Qingkai Kong ◽  
Robert Martin-Short ◽  
Richard M. Allen

Abstract The MyShake project aims to build a global smartphone seismic network to facilitate large-scale earthquake early warning and other applications by leveraging the power of crowdsourcing. The MyShake mobile application first detects earthquake shaking on a single phone. The earthquake is then confirmed on the MyShake servers using a “network detection” algorithm that is activated by multiple single-phone detections. In this first part of a two-article series, we present a simulation platform and a network detection algorithm to test earthquake scenarios at various locations around the world. The proposed network detection algorithm is built on the classic density-based spatial clustering of applications with noise (DBSCAN) algorithm, with modifications to take temporal characteristics and the association of new triggers into account. We test our network detection algorithm using real data recorded by MyShake users during the 4 January 2018 M 4.4 Berkeley and the 10 June 2016 M 5.2 Borrego Springs earthquakes to demonstrate the system’s utility. In order to test the entire detection procedure and to understand the first-order performance of MyShake in various locations around the world representing different population and tectonic characteristics, we then present a software platform that can simulate earthquake triggers in hypothetical MyShake networks. Part two of this series explores our MyShake early warning simulation performance in selected regions around the world.
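One way to picture a DBSCAN variant that accounts for time is to require neighbors to be close in both space and time. The sketch below is an illustration under that assumption and is not the MyShake network detection algorithm; the trigger layout `(x_km, y_km, t_s)` and the parameters `eps_km`, `eps_s`, and `min_pts` are hypothetical.

```python
import math

def st_dbscan(triggers, eps_km, eps_s, min_pts):
    """Label triggers (x_km, y_km, t_s) with cluster ids; -1 marks noise."""
    n = len(triggers)

    def neighbors(i):
        xi, yi, ti = triggers[i]
        # A neighbor must be within eps_km in space AND within eps_s in time
        return [j for j, (x, y, t) in enumerate(triggers)
                if math.hypot(x - xi, y - yi) <= eps_km and abs(t - ti) <= eps_s]

    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:
                queue.extend(nb)  # j is a core point, so keep expanding
    return labels
```

With this rule, a trigger at the same location but minutes later is not merged into an earlier cluster, which is the intuition behind adding a temporal constraint.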


Author(s):  
Weiwu Ren ◽  
Jianfei Zhang ◽  
Xiaoqiang Di ◽  
Yinan Lu ◽  
Bochen Zhang ◽  
...  

Clustering by fast search and find of density peak (CFSFDP) is a simple and crisp density-clustering algorithm. The original algorithm is not suitable for direct application to anomaly detection. Its clustering results have a high level of redundant density information. If used directly as behavior profiles, the computation and storage costs of anomaly detection are high. Therefore, an improved algorithm based on CFSFDP is proposed for anomaly detection. The improved algorithm uses a few data points and their radii to support behavior profiles, and deletes the redundant data points that do not support profiles. This method not only reduces the large amount of data storage and distance calculation in the process of generating profiles, but also reduces the search space of profiles in the detection process. Numerous experiments show that the improved algorithm generates profiles faster than density-based spatial clustering of applications with noise (DBSCAN), and has better profile precision than adaptive real-time anomaly detection with incremental clustering (ADWICE). The improved algorithm inherits the arbitrary shape clusters of CFSFDP, and improves the storage and computation performance. Compared with DBSCAN and ADWICE, the improved anomaly-detection algorithm based on CFSFDP has more balanced detection precision and real-time performance.
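The idea of keeping only a few points with a radius as behavior profiles can be illustrated with a deliberately simple stand-in. The greedy pruning below is not the paper's CFSFDP-based procedure; it only shows how a small set of `(center, radius)` pairs can replace the full training set at detection time. The function names and the fixed `radius` are assumptions.

```python
import math

def build_profiles(normal_points, radius):
    """Greedy stand-in: keep a point only if no existing profile ball covers it."""
    profiles = []
    for p in normal_points:
        if all(math.dist(p, c) > r for c, r in profiles):
            profiles.append((p, radius))
    return profiles

def is_anomalous(sample, profiles):
    """A sample is flagged when it lies outside every profile ball."""
    return all(math.dist(sample, c) > r for c, r in profiles)
```

Detection then costs one distance computation per stored profile rather than one per training point, which is the storage and search-space saving the abstract describes.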


2019 ◽  
Vol 291 ◽  
pp. 01008 ◽  
Author(s):  
Bao Lei

The big data acquired by the AIS system contain abundant maritime traffic information. With the wide application of data mining in various fields in recent years, the mining of AIS data has drawn the attention of researchers. Based on ship AIS location data, this paper studies a spot area detection algorithm. First, the sample data are pre-processed from the original data, and the residence points of each ship are identified according to changes in the ship's speed and course. Then a DBSCAN-based clustering algorithm is used to cluster the latitude and longitude lattice cells into spot areas. Experiments on real AIS data sets show that the algorithm is efficient and correct.
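The pre-processing stage can be sketched roughly as follows, assuming residence points are approximated by low-speed reports only (the paper also uses course change, which is omitted here) and that positions are binned into a regular latitude and longitude lattice. The record layout, the thresholds, and the function names are hypothetical, and the subsequent DBSCAN-based clustering of cells is left out.

```python
import math
from collections import Counter

def stay_point_cells(records, max_speed_kn=0.5, cell_deg=0.01):
    """Count low-speed AIS reports per lattice cell.

    records: iterable of (ship_id, lat, lon, speed_kn).
    """
    counts = Counter()
    for _ship, lat, lon, speed in records:
        if speed <= max_speed_kn:  # treat a near-stationary report as a residence point
            cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
            counts[cell] += 1
    return counts

def hot_cells(counts, min_reports):
    """Cells with at least min_reports residence points, in sorted order."""
    return sorted(cell for cell, c in counts.items() if c >= min_reports)
```

The cells returned by `hot_cells` would be the natural input to a density-based clustering step that merges adjacent busy cells into spot areas.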


2020 ◽  
pp. 1-10
Author(s):  
Mengliang Shao ◽  
Deyu Qi ◽  
Huili Xue

Outlier detection is an important branch of data mining. This paper proposes an advanced fast density peak outlier detection algorithm based on the characteristics of big data. The algorithm is an outlier detection method based on the improved density peak clustering algorithm. This paper improves the original algorithm. From the perspective of outlier detection, although it is a clustering idea, it avoids the clustering process, reduces the time complexity of the cluster-based outlier detection algorithm, and absorbs the advantages of neighbor-based outlier detection, such as insensitivity to data dimensions. In the power industry, outlier detection can be used in areas such as grid fault detection, equipment fault detection, and power abnormality detection. The simulation experiment of outlier detection based on the daily load curve of single and multiple transformers in a certain province shows that the improved algorithm can effectively detect outliers in the data.


Symmetry ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 596
Author(s):  
Krishna Kumar Sharma ◽  
Ayan Seal ◽  
Enrique Herrera-Viedma ◽  
Ondrej Krejcar

Calculating and monitoring customer churn metrics is important for companies to retain customers and earn more profit in business. In this study, a churn prediction framework is developed by modified spectral clustering (SC). The similarity measure plays an imperative role in clustering for predicting churn with better accuracy by analyzing industrial data. The linear Euclidean distance in the traditional SC is replaced by the non-linear S-distance (Sd). The Sd is deduced from the concept of S-divergence (SD). Several characteristics of Sd are discussed in this work. Experiments are conducted to validate the proposed clustering algorithm on four synthetic, eight UCI, two industrial databases and one telecommunications database related to customer churn. Three existing clustering algorithms (k-means, density-based spatial clustering of applications with noise, and conventional SC) are also implemented on the above-mentioned 15 databases. The empirical outcomes show that the proposed clustering algorithm beats the three existing clustering algorithms in terms of its Jaccard index, f-score, recall, precision and accuracy. Finally, we also test the significance of the clustering results by Wilcoxon’s signed-rank test, Wilcoxon’s rank-sum test, and sign tests. The comparative study shows that the outcomes of the proposed algorithm are interesting, especially in the case of clusters of arbitrary shape.


2021 ◽  
Vol 11 (10) ◽  
pp. 4497
Author(s):  
Dongming Chen ◽  
Mingshuo Nie ◽  
Jie Wang ◽  
Yun Kong ◽  
Dongqi Wang ◽  
...  

Aiming at analyzing the temporal structures in evolutionary networks, we propose a community detection algorithm based on graph representation learning. The proposed algorithm employs a Laplacian matrix to obtain the node relationship information of the directly connected edges of the network structure at the previous time slice; a deep sparse autoencoder learns to represent the network structure under the current time slice; and the K-means clustering algorithm is used to partition the low-dimensional feature matrix of the network structure under the current time slice into communities. Experiments on three real datasets show that the proposed algorithm outperformed the baselines regarding effectiveness and feasibility.
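As a reminder of what the first stage works with, the unnormalized graph Laplacian of a time slice is L = D - A, where A is the adjacency matrix and D the diagonal degree matrix. The sketch below assumes this unnormalized form and an undirected graph without self-loops; the abstract does not say which Laplacian variant the authors use, and the autoencoder and K-means stages are omitted.

```python
def laplacian(adj):
    """Unnormalized Laplacian L = D - A for an adjacency matrix given as nested lists."""
    n = len(adj)
    return [
        # Diagonal entries carry the node degree; off-diagonal entries are -A[i][j]
        [(sum(adj[i]) if i == j else 0) - adj[i][j] for j in range(n)]
        for i in range(n)
    ]
```

Every row of this matrix sums to zero, which is a quick sanity check when building it from edge lists.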


2014 ◽  
Vol 543-547 ◽  
pp. 1934-1938
Author(s):  
Ming Xiao

As a clustering algorithm for two-dimensional spatial data, Adaptive Resonance Theory not only suffers from pattern drift and the loss of vector modulus information, but also adapts poorly to irregularly distributed spatial data. To address these problems, a Tree-ART2 (TART2) network model is proposed. It retains the memory of the old model, maintaining the spatial distance constraint by learning and adjusting the LTM pattern and the amplitude information of vectors. Meanwhile, introducing a tree structure into the model reduces the subjective dependence on the vigilance parameter and decreases the occurrence of pattern mixing. Comparative experiments show that the TART2 network has higher plasticity and adaptability.


2014 ◽  
Vol 23 (1) ◽  
pp. 59-73
Author(s):  
E. Umamaheswari ◽  
T.V. Geetha

Abstract Traditional document clustering algorithms consider text-based features such as unique word count, concept count, etc. to cluster documents. Meanwhile, event mining is the extraction of specific events, their related sub-events, and the associated semantic relations from documents. This work discusses an approach to event mining through clustering. The Universal Networking Language (UNL)-based subgraph, a semantic representation of the document, is used as the input for clustering. Our research focuses on exploring the use of three different feature sets for event clustering and comparing the approaches used for specific event mining. In our previous work, the clustering algorithm used UNL-based event semantics to represent event context for clustering. However, this approach resulted in different events with similar semantics being clustered together. Hence, instead of considering only UNL event semantics, we considered assigning additional weights to similarity between event contexts with event-related attributes such as time, place, and persons. Although we get specific events in a single cluster, sub-events related to the specific events are not necessarily in a single cluster. Therefore, to improve our cluster efficiency, connective terms between two sentences and their representation as UNL subgraphs were also considered for similarity determination. By combining UNL semantics, event-specific arguments similarity, and connective term concepts between sentences, we were able to obtain clusters for specific events and their sub-events. We have used 112 000 Tamil documents from the Forum for Information Retrieval Evaluation data corpus and achieved good results. We have also compared our approach with the previous state-of-the-art approach for Router-RCV1 corpus and achieved 30% improvements in precision.


2016 ◽  
Vol 33 (4) ◽  
pp. 697-712 ◽  
Author(s):  
R. Andrew Weekley ◽  
R. Kent Goodrich ◽  
Larry B. Cornman

Abstract An image-processing algorithm has been developed to identify aerosol plumes in scanning lidar backscatter data. The images in this case consist of lidar data in a polar coordinate system. Each full lidar scan is taken as a fixed image in time, and sequences of such scans are considered functions of time. The data are analyzed in both the original backscatter polar coordinate system and a lagged coordinate system. The lagged coordinate system is a scatterplot of two datasets, such as subregions taken from the same lidar scan (spatial delay), or two sequential scans in time (time delay). The lagged coordinate system processing allows for finding and classifying clusters of data. The classification step is important in determining which clusters are valid aerosol plumes and which are from artifacts such as noise, hard targets, or background fields. These cluster classification techniques have skill since both local and global properties are used. Furthermore, more information is available since both the original data and the lag data are used. Performance statistics are presented for a limited set of data processed by the algorithm, where results from the algorithm were compared to subjective truth data identified by a human.

