On the Perfect Privacy: a Statistical Analysis of Network Traffic Approach

2016 ◽  
pp. 1-1
Author(s):  
Amir Hossein RezaeiTabar ◽  
Abolfazl Diyanat ◽  
Ahmad Khonsari
Author(s):  
Stevan Novakov ◽  
Chung-Horng Lung ◽  
Ioannis Lambadaris ◽  
Nabil Seddigh

Research into network anomaly detection has become crucial as a result of a significant increase in the number of computer attacks. Many approaches to network anomaly detection have been reported in the literature, but the data or solutions are typically not freely available. Recently, a labeled network traffic flow dataset, Kyoto2006+, has been created and made publicly available. Most existing approaches that use Kyoto2006+ for network anomaly detection apply various clustering techniques. This paper instead leverages well-known statistical analysis and spectral analysis techniques for network anomaly detection. The first is a popular statistical analysis technique called Principal Component Analysis (PCA). PCA describes data in a new dimension to unlock otherwise hidden characteristics. The second is a well-known spectral analysis technique, Haar Wavelet filtering analysis, which measures the number and magnitude of abrupt changes in data. Both approaches have strengths and limitations. In response, this paper proposes a Hybrid PCA–Haar Wavelet Analysis. The hybrid approach first applies PCA to describe the data and then Haar Wavelet filtering for analysis. Based on prototyping and measurement, the Hybrid PCA–Haar Wavelet Analysis technique is investigated using the Kyoto2006+ dataset. The authors consider a number of parameters and present experimental results to demonstrate the effectiveness of the hybrid approach compared with the two algorithms applied individually.
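The two-stage idea described in this abstract (PCA first, then Haar wavelet filtering) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of only the first principal component, the single-level Haar transform, and the median-absolute-deviation threshold with multiplier k are all assumptions chosen for illustration, and the input is synthetic data rather than Kyoto2006+.

```python
import numpy as np

def pca_scores(X, n_components=1):
    """Project standardized rows of X onto the leading principal components."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    Z = (X - X.mean(axis=0)) / std
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T

def haar_detail(signal):
    """One-level Haar detail coefficients: scaled differences of adjacent pairs."""
    s = np.asarray(signal, dtype=float)
    s = s[: len(s) // 2 * 2]  # drop a trailing sample if the length is odd
    return (s[0::2] - s[1::2]) / np.sqrt(2.0)

def hybrid_flags(X, k=5.0):
    """Flag sample pairs whose Haar detail on the first PC score is unusually large.

    The threshold k * 1.4826 * MAD is an illustrative choice (hypothetical),
    not a value taken from the paper.
    """
    scores = pca_scores(X, 1)[:, 0]
    d = haar_detail(scores)
    med = np.median(d)
    mad = np.median(np.abs(d - med))
    return np.abs(d - med) > k * 1.4826 * mad

# Usage on synthetic data: 200 time steps, 4 features, one injected jump.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 4))
X_demo[101] += 20.0  # abrupt change at time step 101
flagged_pairs = np.flatnonzero(hybrid_flags(X_demo))
print("flagged pairs start at time steps:", flagged_pairs * 2)
```

Each flag covers a pair of consecutive time steps, because a one-level Haar transform halves the time resolution. A fuller treatment would apply several decomposition levels and tune the number of retained components.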


Author(s):  
Yu Wang

Statistical software packages and their computing environments are essential to conducting research efficiently and well. If we think of computing and classifying algorithms as the roadmap to our final destination, a statistical package is the vehicle used to get there. Figure 2.1 shows a basic roadmap of the roles that statistical software packages play in network security. One advantage of using a statistical package in network security is that it provides a fairly easy and quick way to explore data, test algorithms and evaluate models. Unfortunately, not every package is suitable for analyzing network traffic. Given the natural characteristics of network traffic data (i.e., its large size and its ability to change dynamically), a package needs several fundamental attributes. First, it should have good data management capabilities, including the capacity to read large data sets and save the resulting files in different formats, the capability to merge and link processed data with other data sources, and the ability to create, modify and delete variables within the data. Second, it should be able to process large amounts of data efficiently, because statistical analyses in network security are usually based on dynamic online data, which requires the application to conduct analyses in a timely manner; this differs from areas such as healthcare, life science, and epidemiology, where statistical analyses are based on static offline data. Third, it should support modern modeling procedures and methods, such as Bayesian methods, hidden Markov models and hierarchical generalized linear models. Finally, because usability is an important factor, the software should be both accessible and user-friendly. These attributes are particularly important during the development phase because they allow us to test hypotheses quickly and examine modeling strategies effectively.
Since many commercial and research-oriented software packages may not have all of the aforementioned attributes, we may need to use multiple packages, such as separate packages for data management, for fitting a particular model, and for displaying results graphically. In the end, we will more likely use a general-purpose programming language, such as C, C++ or Java, to create a customized application that we can later integrate with the other components of the intrusion detection or prevention system. The results obtained from the statistical software can then be used as a gold-standard benchmark to validate the results from the customized application. In this chapter, we will introduce several popular commercial and research-oriented packages that have been widely used in the statistical analysis, data mining, bioinformatics, and computer science communities. Specifically, we will discuss SAS, Stata and R in the sections "The SAS System", "STATA" and "R", respectively, and briefly describe S-Plus, WinBUGS, and MATLAB in the section "Other Packages". The goal of this chapter is to provide a quick overview of these analytic software packages, with some simple examples to help readers become familiar with the computing environments and statistical computing languages referred to in the examples presented in the remaining chapters. We have included some fundamental materials in the Reference section as further reading for readers who would like more detailed information on using these software packages.


2020 ◽  
Vol 48 ◽  
Author(s):  
Liudas Kaklauskas ◽  
Leonidas Sakalauskas

This article presents a statistical study of university network traffic using methods of self-similarity and chaos analysis. The object of measurement is the Šiauliai University node of the LitNet network, which serves educational institutions in the northern Lithuania region. Time series of network traffic characteristics are formed by registering the number of information packets at the node under different traffic regimes and at different discretization intervals of the registered data. The measurement results are processed by calculating the Hurst index and by estimating the reliability of the results with statistical methods. The investigation led to the conclusion that the time series exhibit features of self-similarity, with the aggregated time series showing slowly decaying dependence.
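The Hurst index mentioned in this abstract can be estimated in several ways, and the abstract does not say which estimator the authors used. As a hedged illustration only, the sketch below uses the aggregated variance method, which relies on the relation Var(X^(m)) ~ m^(2H-2) for the series averaged over blocks of size m. The function name and the block sizes are assumptions, and the input is synthetic white noise rather than LitNet measurements.

```python
import numpy as np

def hurst_aggregated_variance(x, block_sizes):
    """Estimate the Hurst index H from the variance of block-averaged series.

    For a self-similar series, log Var(X^(m)) is roughly linear in log m
    with slope 2H - 2, so H = 1 + slope / 2.
    """
    x = np.asarray(x, dtype=float)
    log_m, log_var = [], []
    for m in block_sizes:
        n_blocks = len(x) // m
        if n_blocks < 2:
            continue  # too few blocks to estimate a variance
        # Average the series over non-overlapping blocks of length m.
        means = x[: n_blocks * m].reshape(n_blocks, m).mean(axis=1)
        log_m.append(np.log(m))
        log_var.append(np.log(means.var()))
    # Least-squares line through the variance-time plot (needs >= 2 points).
    slope, _ = np.polyfit(log_m, log_var, 1)
    return 1.0 + slope / 2.0

# Usage on synthetic packet counts with no long-range dependence.
rng = np.random.default_rng(1)
counts = rng.normal(size=2**14)
h = hurst_aggregated_variance(counts, [1, 2, 4, 8, 16, 32, 64])
print(f"estimated Hurst index: {h:.2f}")
```

Uncorrelated noise should give an estimate near 0.5, while values between 0.5 and 1 indicate the slowly decaying dependence associated with self-similar traffic. The sketch does not handle constant series, for which the block variance is zero.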

