Multivariate weather anomaly detection using DBSCAN clustering algorithm

2021 ◽  
Vol 1869 (1) ◽  
pp. 012077
Author(s):  
S Wibisono ◽  
M T Anwar ◽  
A Supriyanto ◽  
I H A Amin
Author(s):  
Andrius Daranda ◽  
Gintautas Dzemyda

In recent years, marine traffic has increased dramatically. Marine traffic safety depends heavily on the mariner's decisions and on the particular situation. The watch officer must continuously observe marine traffic for anomalies, because anomaly detection is crucial for predicting dangerous situations and for making decisions in time for safe marine navigation. In this paper, we present marine traffic anomaly detection that combines the DBSCAN clustering algorithm (Density-Based Spatial Clustering of Applications with Noise) with k-nearest neighbors analysis among the clusters and particular vessels. The clustering algorithm is applied to historic marine traffic data, a set of vessel turn points. In our experiments, the total number of turn points was about 3 million, and about 160 megabytes of computer storage was used. A formal numerical criterion for comparing an anomaly with the normal traffic flow case is proposed. It makes it possible to detect vessels outside the typical traffic pattern. The proposed method ensures the right decisions at different oceanic scales and under different hydrometeorological conditions when detecting an anomalous vessel situation.
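The combination described in this abstract can be illustrated with a minimal sketch, assuming toy planar coordinates in place of real vessel turn points. The data, the `eps` and `min_pts` values, and the helper `knn_distance` are all hypothetical, and the mean k-nearest-neighbor distance used here is only a stand-in for the authors' numerical criterion, which the abstract does not specify.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels

def knn_distance(point, reference, k):
    """Mean distance from point to its k nearest reference points (hypothetical score)."""
    d = sorted(math.dist(point, r) for r in reference)
    return sum(d[:k]) / k

# Toy "historic turn points": two dense routes and one isolated point.
turn_points = [(0, 0), (0, 1), (1, 0), (1, 1),
               (10, 10), (10, 11), (11, 10), (11, 11),
               (5, 20)]
labels = dbscan(turn_points, eps=1.5, min_pts=3)
clustered = [p for p, lab in zip(turn_points, labels) if lab != -1]

# Score two new vessel positions against the clustered history.
on_route = knn_distance((0.5, 0.5), clustered, k=3)
off_route = knn_distance((5, 5), clustered, k=3)
```

In this sketch a position whose score exceeds `eps` would be treated as lying outside the typical traffic pattern; a real system would need geographic distances and a spatial index rather than the quadratic scan used here.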


IEEE Access ◽  
2021 ◽  
Vol 9 ◽  
pp. 43364-43377
Author(s):  
Xirui Xue ◽  
Shucai Huang ◽  
Jiahao Xie ◽  
Jiashun Ma ◽  
Ning Li

Author(s):  
J. W. Li ◽  
X. Q. Han ◽  
J. W. Jiang ◽  
Y. Hu ◽  
L. Liu

Abstract. Establishing an effective method for analysing large spatio-temporal geographic data, and quickly and accurately finding the hidden value behind geographic information, has become a current research focus. Researchers have found that clustering analysis methods from the data mining field can effectively mine the knowledge and information hidden in complex and massive spatio-temporal data, and density-based clustering is one of the most important clustering methods. However, the traditional DBSCAN clustering algorithm has drawbacks in parameter selection that are difficult to overcome. For example, its two important parameters, the Eps neighborhood and the MinPts density, need to be set manually. Parameters that give reasonable clustering results cannot be selected from the traditional parameter-setting guidelines of DBSCAN, so the algorithm cannot produce accurate clustering results. To solve the problems of misclassification and density sparsity caused by unreasonable parameter selection in DBSCAN, this paper proposes a DBSCAN-based, data-efficient density clustering method with improved parameter optimization. Its evaluation index function (Optimal Distance) is obtained by cycling through k-clustering in turn, and the optimal solution is selected. The optimal k-value in k-clustering is used to cluster the samples. Through mathematical and physical analysis, appropriate values of Eps and MinPts can be determined. Finally, the clustering results are obtained by DBSCAN clustering. Experiments show that this method can select parameters reasonably for DBSCAN clustering, which demonstrates the superiority of the method described in this paper.
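For context on the parameter-selection problem, the sketch below shows the widely used sorted k-distance heuristic for choosing Eps. This is a different technique from the Optimal Distance index proposed in the abstract, whose definition is not given here. The toy points, the choice `k = 3`, and the crude "largest gap" knee rule in `largest_gap_eps` are assumptions made for illustration.

```python
import math

def k_distances(points, k):
    """Distance from each point to its k-th nearest neighbour, sorted descending."""
    result = []
    for i, p in enumerate(points):
        d = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(d[k - 1])
    return sorted(result, reverse=True)

def largest_gap_eps(sorted_kdist):
    """Crude knee rule (assumption): take the value just after the largest drop."""
    gaps = [sorted_kdist[i] - sorted_kdist[i + 1]
            for i in range(len(sorted_kdist) - 1)]
    knee = gaps.index(max(gaps))
    return sorted_kdist[knee + 1]

# Toy data: two tight groups of four points and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 20)]
curve = k_distances(points, k=3)
eps_guess = largest_gap_eps(curve)
```

On this toy data the isolated point has a much larger 3-distance than the rest, so the curve drops sharply after its first entry and the suggested Eps is the 3-distance shared by the grouped points. Real data rarely has such a clean knee, which is part of the difficulty the abstract describes.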


2020 ◽  
Author(s):  
Lucía Prieto Santamaría ◽  
Eduardo P. García del Valle ◽  
Gerardo Lagunes García ◽  
Massimiliano Zanin ◽  
Alejandro Rodríguez González ◽  
...  

Abstract. While classical disease nosology is based on phenotypical characteristics, the increasing availability of biological and molecular data is providing a new understanding of diseases and their underlying relationships, which could lead to a more comprehensive paradigm for modern medicine. In the present work, similarities between diseases are used to study the generation of possible new disease nosologic models that include both phenotypical and biological information. To this end, disease similarity is measured in terms of disease feature vectors, which represent genes, proteins, metabolic pathways and PPIs in the case of biological similarity, and symptoms in the case of phenotypical similarity. An improvement in similarity computation is proposed that uses weighted instead of Boolean feature vectors. Unsupervised learning methods were applied to these data, specifically the density-based DBSCAN clustering algorithm. The silhouette coefficient was chosen as the evaluation metric, although the number of clusters and the number of outliers were also considered. To validate the results, a comparison with randomly distributed data was performed. Results suggest that weighted biological similarities based on proteins, computed according to the cosine index, may provide a good starting point for rearranging disease taxonomy and nosology.
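The difference between Boolean and weighted feature vectors under the cosine index can be seen in a small sketch. The two diseases, the protein identifiers `P1` and `P2`, and the weights are invented for illustration; the abstract does not say how the weights are derived.

```python
import math

def cosine_similarity(a, b):
    """Cosine index between two sparse vectors given as {feature: weight}."""
    dot = sum(w * b[f] for f, w in a.items() if f in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def to_boolean(vector):
    """Collapse weights to presence/absence, as in a Boolean feature vector."""
    return {f: 1.0 for f, w in vector.items() if w > 0}

# Hypothetical diseases sharing the same proteins with opposite weights.
disease_a = {"P1": 0.9, "P2": 0.1}
disease_b = {"P1": 0.1, "P2": 0.9}

boolean_sim = cosine_similarity(to_boolean(disease_a), to_boolean(disease_b))
weighted_sim = cosine_similarity(disease_a, disease_b)

# A density-based clusterer would typically consume a distance such as 1 - similarity.
weighted_distance = 1.0 - weighted_sim
```

With Boolean vectors the two diseases look identical because they share the same features, whereas the weighted vectors give a similarity of about 0.22, which is the kind of distinction the proposed improvement is meant to capture.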


2020 ◽  
Vol 5 ◽  
Author(s):  
Luca Crociani ◽  
Giuseppe Vizzari ◽  
Andrea Gorrini ◽  
Stefania Bandini

Pedestrian behavioural dynamics have been increasingly investigated by means of (semi-)automated computing techniques for almost two decades, exploiting advances in computing power, sensor accuracy and availability, and computer vision algorithms. This has led to a consensus on the existence of a significant difference between unidirectional and bidirectional flows of pedestrians, in which the phenomenon of lane formation seems to play a major role. The collective behaviour of lane formation emerges under conditions of variable density through a self-organisation dynamic, in which pedestrians are induced to follow the persons walking ahead of them in order to avoid and minimize conflict situations. Although the formation of lanes is a well-known phenomenon in this field of study, methods for the (even semi-)automatic identification and quantitative characterization of lanes are still lacking. In this context, the paper proposes an unsupervised learning approach for the automatic detection of lanes in multi-directional pedestrian flows, based on the DBSCAN clustering algorithm. The reliability of the approach is evaluated through an inter-rater agreement test between the results achieved by a human coder and by the algorithm.
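An inter-rater agreement test of this kind can be sketched with Cohen's kappa, one common agreement statistic; the abstract does not state which statistic the authors used, so this choice is an assumption. The lane labels below are invented, and the sketch assumes the algorithm's cluster identifiers have already been matched to the human coder's lane names.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the two raters' marginal label frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical lane assignment for eight pedestrians.
human = ["L1", "L1", "L1", "L2", "L2", "L2", "noise", "noise"]
algorithm = ["L1", "L1", "L2", "L2", "L2", "L2", "noise", "noise"]

kappa = cohen_kappa(human, algorithm)
```

Here the two raters disagree on one pedestrian out of eight, which gives a raw agreement of 0.875 and a kappa of about 0.81 once chance agreement is discounted.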


2021 ◽  
Vol 2113 (1) ◽  
pp. 012062
Author(s):  
Weihong Wang ◽  
Zhuolin Wu ◽  
Xuan Liu ◽  
Lei Jia ◽  
Xiaoguang Wang

Abstract. Modern operation and maintenance systems are usually required to monitor many types and large quantities of machine key performance indicators (KPIs) at the same time with limited resources. In this paper, to tackle this problem, we propose a highly compatible time series anomaly detection model based on the K-means clustering algorithm with a new Wavelet Feature Distance (WFD). Our work is inspired by ideas from the image processing and signal processing domains. Our model detects abnormalities in time series datasets that are first clustered by K-means to boost accuracy. Our experiments show significant accuracy improvements over traditional algorithms, and excellent compatibility and operating efficiency compared with algorithms based on deep learning.

