Modeling for Time Generating Network

Advances in Wireless Technologies and Telecommunication - Big Data Applications in the Telecommunications Industry ◽ 10.4018/978-1-5225-1750-4.ch003 ◽ 2016 ◽ pp. 31-40

Author(s): Yirui Hu

Keyword(s): Semantic Analysis ◽ Fundamental Problem ◽ Clustering Algorithms ◽ Low Complexity ◽ Gaussian Mixture ◽ Probabilistic Latent Semantic Analysis ◽ Traffic Data ◽ Automatic Data ◽ Occurrence Data ◽ Fully Automatic

Modeling co-occurrence data generated by more than one process in a network is a fundamental problem in anomaly detection. Co-occurrence data are joint occurrences of pairs of elementary observations from two sets: traffic data in one set are associated with the generating entities (time) in the other set. Clustering algorithms are valuable because they can extract insights from the varied distributions associated with the generating entities. This chapter leverages co-occurrence data that combine traffic data with time, and compares the Gaussian probabilistic latent semantic analysis (GPLSA) model with a Gaussian mixture model (GMM) using temporal network data. Experimental results suggest that GPLSA holds better promise for early detection and a low false alarm rate, with low implementation complexity, in a fully automatic, data-driven solution.
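To illustrate how a model that ties traffic values to time can score anomalies, here is a minimal Python sketch. It assumes one common reading of GPLSA that the abstract does not spell out: the Gaussian clusters are shared across all time slots, and only the mixing weights P(cluster | time slot) change with time. The parameter values, the slot names, and the threshold are hypothetical and chosen only for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def gplsa_log_density(x, slot, weights, means, stds):
    """log p(x | slot) = log sum_k P(k | slot) * N(x; mean_k, std_k).

    Assumption: clusters (means, stds) are shared by all time slots,
    while weights[slot][k] = P(cluster k | slot) depends on the slot.
    """
    density = sum(
        w * gaussian_pdf(x, m, s)
        for w, m, s in zip(weights[slot], means, stds)
    )
    return math.log(density) if density > 0.0 else float("-inf")


def is_anomaly(x, slot, weights, means, stds, threshold):
    """Flag x when its log-density in the given slot falls below the threshold."""
    return gplsa_log_density(x, slot, weights, means, stds) < threshold


# Hypothetical parameters: a low-traffic and a high-traffic cluster,
# with slot-specific mixing weights for night and day.
MEANS = [10.0, 100.0]
STDS = [3.0, 15.0]
WEIGHTS = {"night": [0.95, 0.05], "day": [0.10, 0.90]}
THRESHOLD = -9.0  # hypothetical log-density cut-off

if __name__ == "__main__":
    for slot in ("night", "day"):
        flagged = is_anomaly(55.0, slot, WEIGHTS, MEANS, STDS, THRESHOLD)
        print(slot, flagged)
```

With these made-up parameters the same traffic value is flagged at night but not during the day, which is the kind of time-dependent behaviour a single GMM fitted on traffic values alone cannot express.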

Detecting Abnormal Traffic in Wireless Networks Using Unsupervised Models

Advances in Wireless Technologies and Telecommunication - Big Data Applications in the Telecommunications Industry ◽ 10.4018/978-1-5225-1750-4.ch001 ◽ 2016 ◽ pp. 1-14

Author(s): Alexis Huet

Keyword(s): High Speed ◽ Semantic Analysis ◽ Technical Problem ◽ Gaussian Mixture Models ◽ Gaussian Mixture ◽ Probabilistic Latent Semantic Analysis ◽ Traffic Data ◽ Time Stamps ◽ Continuous Stream ◽ New Type

The development of high-speed LTE connections has led to an increasing quantity of traffic data over the network. Detecting abnormal traffic in this continuous stream of data is crucial for identifying technical problems or fraudulent intrusions. Unsupervised learning methods can automatically describe the structure of the data and deduce patterns of the network. They are useful for identifying unexpected behaviors and for promptly detecting new types of anomalies. In this article, traffic in a wireless network is analyzed through different unsupervised models. Emphasis is placed on models that combine traffic data with time-stamp information. A model called Gaussian Probabilistic Latent Semantic Analysis (GPLSA) is introduced and compared with other methods such as time-dependent Gaussian Mixture Models (time-GMM). The efficiency and robustness of these models are compared using both sampled and LTE traffic data. The experimental results suggest that GPLSA can provide robust and early detection of anomalies in a fully automatic, data-driven solution.

Latent Clustering Models for Outlier Identification in Telecom Data

Mobile Information Systems ◽ 10.1155/2016/1542540 ◽ 2016 ◽ Vol 2016 ◽ pp. 1-11

Author(s): Ye Ouyang ◽ Alexis Huet ◽ J. P. Shim ◽ Mantian (Mandy) Hu

Keyword(s): High Speed ◽ Semantic Analysis ◽ Gaussian Mixture Models ◽ Gaussian Model ◽ Gaussian Mixture ◽ Probabilistic Latent Semantic Analysis ◽ Time Stamp ◽ Traffic Data ◽ Data Traffic ◽ Technical Problems

Collected telecom data traffic has boomed in recent years, due to the development of 4G mobile devices and other similar high-speed machines. The ability to quickly identify unexpected traffic data in this stream is critical for mobile carriers, as such traffic can be caused by either fraudulent intrusion or technical problems. Clustering models can help to identify issues by revealing patterns in network data, which makes it possible to catch anomalies quickly and to highlight previously unseen outliers. In this article, we develop and compare clustering models for telecom data, focusing on those that take time-stamp information into account. Two main models are introduced, solved in detail, and analyzed: Gaussian Probabilistic Latent Semantic Analysis (GPLSA) and time-dependent Gaussian Mixture Models (time-GMM). These models are then compared with other clustering models, such as a Gaussian model and a GMM, which do not use time-stamp information. We perform computations on both sample and telecom traffic data to show that the efficiency and robustness of GPLSA make it the superior method for detecting outliers and for providing results automatically, with little parameter tuning or expertise required.
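The contrast between a plain GMM and a time-dependent one can be sketched in a few lines of Python. This sketch assumes that time-GMM means fitting one independent GMM per time slot, which is an interpretation and not something the abstract states. The EM routine, its quantile-based initialization, and the function names are illustrative choices, not the authors' implementation.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fit_gmm_1d(data, n_components=2, n_iter=50, min_std=1e-3):
    """Fit a 1-D Gaussian mixture with plain EM; returns (weights, means, stds)."""
    ordered = sorted(data)
    n = len(ordered)
    # Deterministic start: means at evenly spaced quantiles, one shared spread.
    means = [ordered[int((k + 0.5) * n / n_components)] for k in range(n_components)]
    centre = sum(ordered) / n
    spread = max(math.sqrt(sum((x - centre) ** 2 for x in ordered) / n), min_std)
    stds = [spread] * n_components
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in data:
            joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
            total = sum(joint) or 1e-300  # guard against underflow to zero
            resp.append([j / total for j in joint])
        # M-step: re-estimate weight, mean and spread of each component.
        for k in range(n_components):
            mass = sum(r[k] for r in resp)
            if mass < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = mass / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / mass
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / mass
            stds[k] = max(math.sqrt(var), min_std)
    return weights, means, stds


def fit_time_gmm(samples_by_slot, n_components=2):
    """Assumed time-GMM reading: one independent GMM per time slot."""
    return {
        slot: fit_gmm_1d(values, n_components)
        for slot, values in samples_by_slot.items()
    }


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic traffic: nights are quiet, days mix quiet and busy periods.
    samples = {
        "night": [rng.gauss(10, 2) for _ in range(200)],
        "day": [rng.gauss(10, 2) for _ in range(200)]
        + [rng.gauss(100, 10) for _ in range(200)],
    }
    for slot, (weights, means, stds) in fit_time_gmm(samples).items():
        print(slot, [round(m, 1) for m in means])
```

Under this reading, a time-GMM estimates a separate set of clusters for every slot. A GPLSA-style model would instead share the clusters across slots and vary only the mixing weights, so it has fewer parameters to estimate.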

Automatic Image Annotation Based on Scene Analysis

International Journal of Image and Graphics ◽ 10.1142/s0219467814500120 ◽ 2014 ◽ Vol 14 (03) ◽ pp. 1450012

Author(s): Yongmei Liu ◽ Tanakrit Wongwitit ◽ Linsen Yu

Keyword(s): Semantic Analysis ◽ Image Annotation ◽ Image Data ◽ Gaussian Mixture ◽ Training Image ◽ Scene Analysis ◽ Probabilistic Latent Semantic Analysis ◽ Visual Features ◽ Automatic Image Annotation ◽ Data Set

Automatic image annotation is an important and challenging task in image analysis and understanding, for example in content-based image retrieval (CBIR). The relationship between keywords and visual features is complicated by the semantic gap. We present an approach to automatic image annotation based on scene analysis. Under the constraint of scene semantics, the correlation between keywords and visual features becomes simpler and clearer. Our model has two stages. The first stage is training: the training image data set is grouped into semantic scenes using the extracted semantic features, and into visual scenes built from the pairwise distances between the visual features of training images, computed with the Earth mover's distance (EMD). Each pair of semantic and visual scenes is then combined, and a Gaussian mixture model (GMM) is applied to every scene. The second stage tests the model by annotating the test image data set with keywords. Using the visual features provided by Duygulu, experimental results show that our model outperforms the combined probabilistic latent semantic analysis and GMM (PLSA&GMM) model on the Corel5K database.
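As a small illustration of the distance used in the training stage, the sketch below computes the Earth mover's distance for the simplest case only: two 1-D histograms over the same equally spaced bins, where the EMD reduces to the sum of absolute differences between the two cumulative distributions. The article's visual features are very likely multi-dimensional, so this is a simplified special case and not the authors' procedure. The function name is hypothetical.

```python
from itertools import accumulate


def emd_1d(hist_a, hist_b):
    """Earth mover's distance between two 1-D histograms on the same bins.

    Assumptions: bins are equally spaced with unit distance between
    neighbours, and both histograms have positive total mass. Each
    histogram is normalized to sum to 1 before comparison.
    """
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    total_a, total_b = sum(hist_a), sum(hist_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("histograms must have positive total mass")
    cdf_a = accumulate(v / total_a for v in hist_a)
    cdf_b = accumulate(v / total_b for v in hist_b)
    # In 1-D, the optimal transport cost is the area between the two CDFs.
    return sum(abs(a - b) for a, b in zip(cdf_a, cdf_b))


if __name__ == "__main__":
    # All mass moves from the first bin to the last, two bins away.
    print(emd_1d([1, 0, 0], [0, 0, 1]))
```

A pairwise distance matrix built from such a function could then be passed to any distance-based clustering method to form the visual scenes.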

Latent acoustic topic models for unstructured audio classification

APSIPA Transactions on Signal and Information Processing ◽ 10.1017/atsip.2012.7 ◽ 2012 ◽ Vol 1 ◽ Cited By ~ 8

Author(s): Samuel Kim ◽ Panayiotis Georgiou ◽ Shrikanth Narayanan

Keyword(s): Latent Semantic Analysis ◽ Latent Dirichlet Allocation ◽ Semantic Analysis ◽ Topic Models ◽ Audio Signal ◽ Gaussian Mixture ◽ Probabilistic Latent Semantic Analysis ◽ Support Vector ◽ Audio Classification ◽ Audio Clip

We propose the notion of latent acoustic topics to capture contextual information embedded within a collection of audio signals. The central idea is to learn, in an unsupervised manner, a probability distribution over a set of latent topics for a given audio clip, assuming that latent acoustic topics exist and that each audio clip can be described in terms of them. To this end, we use latent Dirichlet allocation (LDA) to implement the acoustic topic models over elemental acoustic units, referred to as acoustic words, and perform text-like audio signal processing. Experiments on audio tag classification with the BBC sound effects library demonstrate the usefulness of the proposed latent audio context modeling schemes. In particular, the proposed method is shown to be superior to other latent structure analysis methods, such as latent semantic analysis and probabilistic latent semantic analysis. We also demonstrate that topic models can be used as features complementary to content-based features, and that they offer about 9% relative improvement in audio classification when combined with the traditional Gaussian mixture model (GMM)–Support Vector Machine (SVM) technique.
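The "text-like" treatment of audio can be illustrated with a short Python sketch that turns a clip into a bag of acoustic words. It assumes the words come from vector quantization, with each feature frame mapped to its nearest codebook entry. The abstract does not describe how the words are built, so the codebook, the feature vectors, and the function names here are all hypothetical.

```python
from collections import Counter


def nearest_codeword(frame, codebook):
    """Index of the codebook vector closest to the frame (squared Euclidean)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: sq_dist(frame, codebook[i]))


def acoustic_word_counts(frames, codebook):
    """Bag-of-acoustic-words histogram for one clip.

    Assumption: each frame is a fixed-length feature vector and each
    codebook entry plays the role of one word in a text vocabulary.
    """
    return Counter(nearest_codeword(frame, codebook) for frame in frames)


if __name__ == "__main__":
    # Toy 2-D features and a two-word codebook, for illustration only.
    codebook = [(0.0, 0.0), (1.0, 1.0)]
    clip = [(0.1, 0.0), (0.9, 1.2), (1.0, 1.0)]
    print(acoustic_word_counts(clip, codebook))
```

Per-clip count vectors of this kind have the same shape as the document-term counts that topic models such as LDA take as input.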