Detecting Abnormal Traffic in Wireless Networks Using Unsupervised Models

Author(s):  
Alexis Huet

Development of high-speed LTE connections has induced an increasing quantity of traffic data over the network. Detection of abnormal traffic in this continuous stream of data is crucial to identify technical problems or fraudulent intrusions. Unsupervised learning methods can automatically describe the structure of the data and deduce patterns of the network. They are useful to identify unexpected behaviors and to promptly detect new types of anomalies. In this article, traffic in wireless networks is analyzed through different unsupervised models. Emphasis is placed on models combining traffic data with time-stamp information. A model called Gaussian Probabilistic Latent Semantic Analysis (GPLSA) is introduced and compared with other methods such as time-dependent Gaussian Mixture Models (time-GMM). Efficiency and robustness of those models are compared, using both sampled and LTE traffic data. The experimental results suggest that GPLSA can provide robust and early detection of anomalies, in a fully automatic, data-driven solution.

2016 ◽  
Vol 2016 ◽  
pp. 1-11
Author(s):  
Ye Ouyang ◽  
Alexis Huet ◽  
J. P. Shim ◽  
Mantian (Mandy) Hu

Collected telecom data traffic has boomed in recent years, due to the development of 4G mobile devices and other similar high-speed machines. The ability to quickly identify unexpected traffic data in this stream is critical for mobile carriers, as such traffic can be caused by either fraudulent intrusion or technical problems. Clustering models can help to identify issues by showing patterns in network data, which makes it possible to catch anomalies quickly and highlight previously unseen outliers. In this article, we develop and compare clustering models for telecom data, focusing on those that handle time-stamp information. Two main models are introduced, solved in detail, and analyzed: Gaussian Probabilistic Latent Semantic Analysis (GPLSA) and time-dependent Gaussian Mixture Models (time-GMM). These models are then compared with other clustering models, such as the Gaussian model and GMM (which do not contain time-stamp information). We perform computation on both sample and telecom traffic data to show that the efficiency and robustness of GPLSA make it the superior method to detect outliers and provide results automatically, with little parameter tuning or expertise required.
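To illustrate why time-stamp information matters for outlier detection, the following is a minimal sketch, not the authors' GPLSA or time-GMM implementation. It assumes a deliberately simplified stand-in: one Gaussian per time slot, compared with a single global Gaussian, with outliers flagged by a z-score threshold. The function names, the threshold of 3.0, and the toy traffic values are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def fit_per_slot_gaussians(samples):
    """Fit one Gaussian (mean, std) per time slot.

    samples: iterable of (time_slot, traffic_value) pairs.
    Simplified stand-in for a time-dependent model, NOT GPLSA or time-GMM.
    """
    by_slot = defaultdict(list)
    for slot, value in samples:
        by_slot[slot].append(value)
    # Floor the standard deviation to avoid division by zero on constant slots.
    return {slot: (mean(vals), max(pstdev(vals), 1e-9))
            for slot, vals in by_slot.items()}


def flag_outliers(samples, params, z_threshold=3.0):
    """Return the samples whose z-score within their own slot exceeds the threshold.

    Every slot in `samples` must have been seen when fitting `params`.
    """
    flagged = []
    for slot, value in samples:
        mu, sigma = params[slot]
        if abs(value - mu) / sigma > z_threshold:
            flagged.append((slot, value))
    return flagged


# Hypothetical toy data: low traffic at hour 3, high traffic at hour 18.
train = [(3, v) for v in (9, 10, 11, 10, 9, 11, 10, 10)]
train += [(18, v) for v in (95, 100, 105, 100, 98, 102, 100, 100)]

# The same traffic value observed at two different hours.
test = [(3, 100), (18, 100)]

# Time-aware model: one Gaussian per hour.
time_aware_params = fit_per_slot_gaussians(train)
flagged_time_aware = flag_outliers(test, time_aware_params)

# Time-blind baseline: every sample is mapped to a single dummy slot 0.
global_params = fit_per_slot_gaussians([(0, v) for _, v in train])
flagged_global = flag_outliers([(0, v) for _, v in test], global_params)

print("time-aware:", flagged_time_aware)
print("global:", flagged_global)
```

In this toy setting, a value of 100 is ordinary in the evening but highly unusual at night. The time-blind baseline cannot tell the two cases apart, which is the kind of limitation that motivates models combining traffic values with time stamps.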


Author(s):  
Yirui Hu

Modeling co-occurrence data generated by more than one process in a network is a fundamental problem in anomaly detection. Co-occurrence data are joint occurrences of pairs of elementary observations from two sets: traffic data in one set are associated with the generating entities (time) in the other set. Clustering algorithms are valuable because they can extract insights from the varied distributions associated with the generating entities. This chapter leverages co-occurrence data that combine traffic data with time, and compares the Gaussian probabilistic latent semantic analysis (GPLSA) model to a Gaussian Mixture Model (GMM) using temporal network data. Experimental results suggest that GPLSA holds better promise for early detection, with a low false alarm rate and low implementation complexity, in a fully automatic, data-driven solution.


2014 ◽  
Vol 14 (03) ◽  
pp. 1450012
Author(s):  
Yongmei Liu ◽  
Tanakrit Wongwitit ◽  
Linsen Yu

Automatic image annotation is an important and challenging task in image analysis and understanding, for example in content-based image retrieval (CBIR). The relationship between keywords and visual features is complicated by the semantic gap. We present an approach to automatic image annotation based on scene analysis. Under the constraint of scene semantics, the correlation between keywords and visual features becomes simpler and clearer. Our model has two stages. The first stage is the training process, which groups the training image data set into semantic scenes using the extracted semantic features, and into visual scenes constructed from the distances between the visual features of every pair of training images, computed with the Earth mover's distance (EMD). Each pair of semantic and visual scenes is then combined, and a Gaussian mixture model (GMM) is applied to all scenes. The second stage tests the model and annotates keywords for the test image data set. Using the visual features provided by Duygulu, experimental results show that our model outperforms the probabilistic latent semantic analysis (PLSA) & GMM (PLSA&GMM) model on the Corel5K database.


Author(s):  
Samuel Kim ◽  
Panayiotis Georgiou ◽  
Shrikanth Narayanan

We propose the notion of latent acoustic topics to capture contextual information embedded within a collection of audio signals. The central idea is to learn a probability distribution over a set of latent topics of a given audio clip in an unsupervised manner, assuming that there exist latent acoustic topics and that each audio clip can be described in terms of those latent acoustic topics. In this regard, we use latent Dirichlet allocation (LDA) to implement the acoustic topic models over elemental acoustic units, referred to as acoustic words, and perform text-like audio signal processing. Experiments on audio tag classification with the BBC sound effects library demonstrate the usefulness of the proposed latent audio context modeling schemes. In particular, the proposed method is shown to be superior to other latent structure analysis methods, such as latent semantic analysis and probabilistic latent semantic analysis. We also demonstrate that topic models can be used as complementary features to content-based features and offer about 9% relative improvement in audio classification when combined with the traditional Gaussian mixture model (GMM)–Support Vector Machine (SVM) technique.


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Yumeng Sun

The data generated through telecommunication networks has grown exponentially in the last few years, and the resulting traffic data can no longer realistically be processed and analyzed manually. In particular, distinguishing unintended traffic consumption from normal patterns remains an important issue. This area is critical because anomalies may lead to a reduction in network efficiency. The origin of these anomalies may be a technical problem in a cell or a fraudulent intrusion in the network. Usually, they need to be identified and fixed as soon as possible. Therefore, data-driven systems using machine learning algorithms are developed with the aim of identifying anomalies from the raw data and raising alerts when they occur. Unsupervised learning methods can automatically describe the data structure and derive network patterns, which is effective for identifying unintended anomalous behavior and detecting new types of anomalies in a timely manner. In this paper, we use different unsupervised models to analyze traffic data in wireless networks, focusing on models that analyze traffic data combined with timeline information. The factor analysis method is used to derive the results of factor analysis, obtain the three major public factors and comprehensive factor scores, and combine the results with the BP neural network model to conduct a nonlinear simulation study on local governmental debt risk. A potential semantic analysis model based on Gaussian probability is presented and compared with other methods, and experimental results show that this model can provide a robust, over-the-top anomaly detection in a fully automated, data-driven solution.


Author(s):  
YingLi Tian ◽  
Liangliang Cao ◽  
Zicheng Liu ◽  
Zhengyou Zhang

This chapter addresses the problem of action detection from cluttered videos. In recent years, many feature extraction schemes have been designed to describe various aspects of actions. However, due to the difficulty of action detection, e.g., the cluttered background and potential occlusions, a single type of feature cannot effectively solve the action detection problem in cluttered videos. In this chapter, the authors propose a new type of feature, Hierarchically Filtered Motion (HFM), and further investigate the fusion of HFM with Spatiotemporal Interest Point (STIP) features for action detection from cluttered videos. In order to detect actions effectively and efficiently, they propose a new approach that combines Gaussian Mixture Models (GMMs) with Branch-and-Bound search to locate actions of interest in cluttered videos. The proposed HFM features and action detection method have been evaluated on the classical KTH dataset and the challenging MSR Action Dataset II, which consists of crowded videos with moving people or vehicles in the background. Experimental results demonstrate that the proposed method significantly outperforms existing techniques, especially for action detection in crowded videos.


Author(s):  
P.I. Tarasov

Research objective: to study the development of economic and transport infrastructure in the Arctic and Northern Territories of Russia. Research methodology: analysis of transport infrastructure in the Republic of Sakha (Yakutia) and the types of railways used in Russia. Results: the economic development of any region is proportional to the development of its road transport infrastructure and logistics. When a conventional railway is operated in Arctic conditions, it is not always possible to maintain a cargo turnover that would ensure its efficient use, and transshipment from one mode of transport to another is very problematic. A new type of railway is proposed, i.e. a light railway. Conclusions: the proposed new type of transport offers all the main advantages of narrow gauge railroads (high speed of construction, efficiency, etc.) and helps to eliminate their main disadvantage, i.e. the need for transloading when moving from a narrow gauge to the conventional one with a width of 1520 mm, along with a significant reduction in capital costs.

