Network Data Characteristics
Data represents phenomena in the real world. Data is organized into rows and columns; usually rows represent the observations and columns represent the variables. Observations, also called subjects, records, or data points, each represent a phenomenon in the real world, and variables, also known as data elements or data fields, represent the characteristics of those observations. Variables take different values for different observations, which is what distinguishes one observation from another. Figure 4.1 illustrates a section of TCP/IP traffic data, in which the rows are individual network connections and the columns, separated by spaces, are characteristics of those connections. In this example, the first column is the session index of each connection and the second column is the date on which the connection occurred.

In this chapter, we will discuss some fundamental features of variables and network data. We will present detailed discussions of variable characteristics and distributions in Sections Random Variables and Variables Distributions, and describe network data modules in Section Network Data Modules. The material covered in this chapter will help readers who do not have a solid background in this area gain an understanding of the basic concepts of variables and data. Additional information can be found in Introduction to the Practice of Statistics by Moore and McCabe (1998).
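To make the row-and-column view of data concrete, the following minimal Python sketch reads a few space-separated connection records into observations (rows) and variables (columns). The records are hypothetical and are not taken from Figure 4.1. Only the first two columns (session index and date) follow the description above; the remaining column names (time, service, src_ip, dst_ip) and the helper names parse_records and column are assumptions made for illustration.

```python
# Hypothetical records, NOT copied from Figure 4.1. Only the first two
# columns (session index, date) follow the text; the rest are assumed.
RAW = """\
1 01/23/1998 16:56:12 telnet 192.168.1.30 192.168.0.20
2 01/23/1998 16:56:15 http 192.168.1.30 192.168.0.40
3 01/23/1998 16:57:01 smtp 192.168.0.40 192.168.1.30
"""

# Assumed variable names, one per space-separated column.
COLUMNS = ["session", "date", "time", "service", "src_ip", "dst_ip"]


def parse_records(text, columns):
    """Split space-separated lines into one dict per observation (row)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(columns):
            raise ValueError(f"expected {len(columns)} fields, got {len(fields)}")
        rows.append(dict(zip(columns, fields)))
    return rows


def column(rows, name):
    """Return the values that one variable takes across all observations."""
    return [row[name] for row in rows]


rows = parse_records(RAW, COLUMNS)
print(len(rows), "observations,", len(COLUMNS), "variables")
print(column(rows, "service"))
```

Each dictionary here corresponds to one observation, and each key corresponds to one variable. Collecting a single key across all rows, as column does, gives the set of values from which a variable's distribution can later be studied.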