Classification of Peer-to-Peer Traffic Using a Two-Stage Window-Based Classifier with Fast Decision Tree and IP Layer Attributes

This paper presents a data mining approach to classifying Peer-to-Peer (P2P) traffic in IP networks, built on a two-stage architecture. In the first stage, traffic is filtered by matching standard layer 4 port numbers to label well-known P2P and NonP2P traffic. The labeled traffic produced in the first stage is then used to train a Fast Decision Tree (FDT) classifier. The remaining Unknown traffic is passed to the FDT model, which classifies it as P2P or NonP2P with high accuracy. The two-stage architecture therefore classifies not only well-known P2P applications but also applications that use random or non-standard port numbers and could not be classified otherwise. The authors captured Internet traffic at a gateway router, pre-processed the data, selected the most significant attributes, and prepared a training data set to which the new algorithm was applied. Finally, they built several models using combinations of attribute sets for different ratios of P2P to NonP2P traffic in the training data.
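The two-stage flow described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the port sets are a few well-known defaults chosen for the example, the attribute name `mean_pkt_len` and the sample flows are hypothetical, and a single-threshold decision stump stands in for the Fast Decision Tree.

```python
# Illustrative port sets (assumption): BitTorrent, eDonkey, Gnutella defaults
P2P_PORTS = {6881, 4662, 6346}
# Illustrative port sets (assumption): SMTP, DNS, HTTP, HTTPS
NONP2P_PORTS = {25, 53, 80, 443}

def label_by_port(flow):
    """Stage 1: label a flow from its layer 4 ports, or return 'Unknown'."""
    ports = {flow["src_port"], flow["dst_port"]}
    if ports & P2P_PORTS:
        return "P2P"
    if ports & NONP2P_PORTS:
        return "NonP2P"
    return "Unknown"

def train_stump(rows, feature):
    """Stage 2 stand-in: choose the threshold on one attribute that
    minimizes training errors (a one-level tree, not a real FDT)."""
    values = sorted({r[feature] for r in rows})
    best = None
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        for above in ("P2P", "NonP2P"):
            below = "NonP2P" if above == "P2P" else "P2P"
            errors = sum(
                (above if r[feature] > t else below) != r["label"] for r in rows
            )
            if best is None or errors < best[0]:
                best = (errors, t, above, below)
    _, t, above, below = best
    return lambda flow: above if flow[feature] > t else below

def classify(flows, feature="mean_pkt_len"):
    """Label by port first, then classify the Unknown flows with the stump."""
    labeled, unknown = [], []
    for f in flows:
        lab = label_by_port(f)
        (unknown if lab == "Unknown" else labeled).append({**f, "label": lab})
    model = train_stump(labeled, feature)
    for f in unknown:
        f["label"] = model(f)
    return labeled + unknown

# Hypothetical flows; the last two use non-standard ports.
flows = [
    {"src_port": 50000, "dst_port": 6881, "mean_pkt_len": 1200},
    {"src_port": 51000, "dst_port": 4662, "mean_pkt_len": 1100},
    {"src_port": 52000, "dst_port": 80, "mean_pkt_len": 300},
    {"src_port": 53000, "dst_port": 53, "mean_pkt_len": 90},
    {"src_port": 40000, "dst_port": 40001, "mean_pkt_len": 1150},
    {"src_port": 40002, "dst_port": 40003, "mean_pkt_len": 200},
]
results = classify(flows)
```

The point of the design is that stage 1 needs no training data of its own: the port match produces the labels that stage 2 learns from.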


Peer-to-Peer Traffic Identification by Mining IP Layer Data Streams Using Concept-Adapting Very Fast Decision Tree

2008 20th IEEE International Conference on Tools with Artificial Intelligence ◽

10.1109/ictai.2008.12 ◽

2008 ◽

Cited By ~ 17

Author(s):

Bijan Raahemi ◽

Weicai Zhong ◽

Jing Liu

Keyword(s):

Decision Tree ◽

Data Streams ◽

Peer To Peer ◽

Traffic Identification ◽

Very Fast Decision Tree ◽

Fast Decision


Application of the Naïve Bayes Algorithm for Student Graduation Analysis

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i4.15.23596 ◽

2018 ◽

Vol 7 (4.15) ◽

pp. 421

Author(s):

Erick Akhmad Fahmi Alfa’izy ◽

Khairil Anam ◽

Naidah Naing ◽

Rosanita Tritias Utami ◽

Nur Anim Jauhariyah ◽

...

Keyword(s):

Naive Bayes ◽

Naïve Bayes ◽

Training Data ◽

College System ◽

Student Graduation ◽

Bayes Algorithm ◽

Using Data ◽

Analysis System ◽

Law Student

This study designs an analysis system that predicts student graduation by comparing previous data with existing data, in order to reduce errors in a college system. Available data records are processed using the naïve Bayes algorithm. The research was conducted at Universitas Maarif Hasyim Latif, and its objective is to analyze student data with the naïve Bayes algorithm to determine graduation outcomes. The sample consists of earlier Faculty of Law student data, used as training data; the full data set was retrieved from records already available in the Directorate of Information Systems. The study shows that the naïve Bayes algorithm can be used to classify data in string or textual form, based on the researchers' trials with previously worked example calculations. To evaluate the graduation classification, the naïve Bayes algorithm was tested by comparing training data against test data. From the calculations, the accuracy is 77.78%.
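As an illustration of naïve Bayes on string-valued attributes, here is a minimal sketch. The attribute names (`gpa_band`, `attendance`), the class labels, and the records are all hypothetical, and add-one (Laplace) smoothing is an assumption; the abstract does not say which attributes or smoothing the study used.

```python
import math
from collections import Counter, defaultdict

def train_nb(rows, target):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(r[target] for r in rows)
    value_counts = defaultdict(Counter)  # (attribute, class) -> value counts
    domains = defaultdict(set)           # attribute -> set of seen values
    for r in rows:
        for attr, value in r.items():
            if attr != target:
                value_counts[(attr, r[target])][value] += 1
                domains[attr].add(value)
    return class_counts, value_counts, domains

def predict_nb(model, row):
    """Return the class with the highest log posterior for a categorical row."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    scores = {}
    for cls, n in class_counts.items():
        log_p = math.log(n / total)  # class prior
        for attr, value in row.items():
            # Add-one smoothing so unseen values do not zero out the product
            num = value_counts[(attr, cls)][value] + 1
            den = n + len(domains[attr])
            log_p += math.log(num / den)
        scores[cls] = log_p
    return max(scores, key=scores.get)

# Hypothetical training records
rows = [
    {"gpa_band": "high", "attendance": "good", "graduated": "on_time"},
    {"gpa_band": "high", "attendance": "good", "graduated": "on_time"},
    {"gpa_band": "high", "attendance": "poor", "graduated": "on_time"},
    {"gpa_band": "low", "attendance": "good", "graduated": "on_time"},
    {"gpa_band": "low", "attendance": "poor", "graduated": "late"},
    {"gpa_band": "low", "attendance": "poor", "graduated": "late"},
    {"gpa_band": "high", "attendance": "poor", "graduated": "late"},
]
model = train_nb(rows, "graduated")
strong = predict_nb(model, {"gpa_band": "high", "attendance": "good"})
weak = predict_nb(model, {"gpa_band": "low", "attendance": "poor"})
```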


EFFICIENT CLASSIFICATION OF BIG DATA USING VFDT (VERY FAST DECISION TREE)

International Journal of Research in Engineering and Technology ◽

10.15623/ijret.2016.0501019 ◽

2016 ◽

Vol 05 (01) ◽

pp. 102-107

Author(s):

Sourav Roy

Keyword(s):

Big Data ◽

Decision Tree ◽

Very Fast Decision Tree ◽

Fast Decision


Classification of Thyroid Disease by Using Data Mining Models: A Comparison of Decision Tree Algorithms

The Oxford Journal of Intelligent Decision and Data Science ◽

10.5899/2016/ojids-00002 ◽

2016 ◽

Vol 2016 (2) ◽

pp. 13-28 ◽

Cited By ~ 7

Author(s):

Ebru Turanoglu-Bekar ◽

Gozde Ulutagay ◽

Suzan Kantarcı-Savas

Keyword(s):

Data Mining ◽

Decision Tree ◽

Thyroid Disease ◽

Tree Algorithms ◽

Using Data


Fuzzy Decision Tree applied to defects classification of glass manufacturing using data from a glass furnace model

2012 Annual Meeting of the North American Fuzzy Information Processing Society (NAFIPS) ◽

10.1109/nafips.2012.6291000 ◽

2012 ◽

Author(s):

Herbert R do N Costa ◽

Alessandro La Neve

Keyword(s):

Decision Tree ◽

Glass Furnace ◽

Fuzzy Decision ◽

Fuzzy Decision Tree ◽

Using Data ◽

Defects Classification


Classifying peer-to-peer applications using imbalanced concept-adapting very fast decision tree on IP data stream

Peer-to-Peer Networking and Applications ◽

10.1007/s12083-012-0147-5 ◽

2012 ◽

Vol 6 (3) ◽

pp. 233-246 ◽

Cited By ~ 5

Author(s):

Weicai Zhong ◽

Bijan Raahemi ◽

Jing Liu

Keyword(s):

Decision Tree ◽

Data Stream ◽

Peer To Peer ◽

Very Fast Decision Tree ◽

Fast Decision


Klasifikasi Penentuan Jenis Obat Menggunakan Algoritma Decision Tree

Jurnal Informatika Polinema ◽

10.33795/jip.v7i3.629 ◽

2021 ◽

Vol 7 (3) ◽

pp. 53-60

Author(s):

Rika Nursyahfitri ◽

Alfanda Novebrian Maharadja ◽

Riva Arsyad Farissa ◽

Yuyun Umaidah

Keyword(s):

Decision Tree ◽

Medical Records ◽

Training Data ◽

Accuracy Rate ◽

Data Set ◽

Drug Determination ◽

Decision Tree Method ◽

The Relationship ◽

Tree Method

Classification is a technique used for prediction, where the predicted value is a label. The classification of drug determination aims to predict the appropriate type of drug for a patient from the available dataset. The data used in this study come from patients' medical records, which describe disease symptoms but for which the type of medicine is not yet known. The data set comes from kaggle.com and is presented in the form of a decision tree with a mathematical model. The study uses the decision tree, a data mining classification method, to find the relationship between a number of candidate variables and the classification target variable, dividing the data into 70% testing data and 30% training data. The results are a set of rules and an accuracy rate of 96.36%, along with recall and precision values for each type of drug obtained from a multiclass confusion matrix.
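Per-class precision and recall can be read off a multiclass confusion matrix as in the sketch below. The drug labels and predictions are made up for illustration and are not the study's data.

```python
from collections import Counter

def per_class_precision_recall(actual, predicted):
    """Build a multiclass confusion matrix and derive precision and
    recall for every class label."""
    cm = Counter(zip(actual, predicted))  # (actual, predicted) -> count
    labels = sorted(set(actual) | set(predicted))
    report = {}
    for cls in labels:
        tp = cm[(cls, cls)]
        predicted_as_cls = sum(cm[(a, cls)] for a in labels)  # column total
        actually_cls = sum(cm[(cls, p)] for p in labels)      # row total
        report[cls] = {
            "precision": tp / predicted_as_cls if predicted_as_cls else 0.0,
            "recall": tp / actually_cls if actually_cls else 0.0,
        }
    return report

# Hypothetical test-set labels and model predictions
actual = ["drugA", "drugA", "drugB", "drugB", "drugC", "drugC"]
predicted = ["drugA", "drugB", "drugB", "drugB", "drugC", "drugA"]
report = per_class_precision_recall(actual, predicted)
```

Precision divides the correct predictions for a class by everything predicted as that class, while recall divides by everything that truly belongs to it, which is why the two can differ sharply for the same drug type.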


Naïve Bayes Algorithm for Classification of Student Major’s Specialization

Journal of Intelligent Computing & Health Informatics ◽

10.26714/jichi.v1i1.5570 ◽

2020 ◽

Vol 1 (1) ◽

pp. 17

Author(s):

Astia Weni Syaputri ◽

Erno Irwandi ◽

Mustakim Mustakim

Keyword(s):

Social Sciences ◽

Naive Bayes ◽

Confusion Matrix ◽

High Accuracy ◽

Naïve Bayes ◽

Natural Sciences ◽

Training Data ◽

Average Value ◽

Bayes Algorithm

Majors are important in determining student specialization. If a student is directed to the wrong major, it will affect the student's subsequent education. In SMA Negeri 1 Kampar Timur, there are two majors, namely Natural Sciences and Social Sciences. To determine these majors, it is necessary to reference the average value of student grades from semester 3 to semester 5, which includes the average value of Islamic religious education, Indonesian, Citizenship Education, English, Natural Sciences, Social Sciences, and Mathematics. The Naive Bayes algorithm can be used to classify the majors found in SMA Negeri 1 Kampar Timur. For this classification, the data are split into 70% training data and 30% test data. The model is tested for accuracy using a confusion matrix and produces a fairly high accuracy of 96.19%. With this high accuracy, the Naive Bayes algorithm is very suitable for determining the direction of students in SMA Negeri 1 Kampar Timur.
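The evaluation procedure described here, a 70/30 split followed by accuracy from a confusion matrix, might look like the following sketch. The records and predictions are placeholders, and the simple ordered split is an assumption; the abstract does not say how the split was drawn.

```python
from collections import Counter

def split_70_30(rows):
    """Split rows into the first 70% (training) and the remaining 30% (test)."""
    cut = len(rows) * 7 // 10  # integer arithmetic avoids float rounding
    return rows[:cut], rows[cut:]

def confusion_matrix(actual, predicted):
    """Count (actual, predicted) label pairs."""
    return Counter(zip(actual, predicted))

def accuracy(cm):
    """Fraction of predictions on the diagonal of the confusion matrix."""
    correct = sum(n for (a, p), n in cm.items() if a == p)
    return correct / sum(cm.values())

# Placeholder records: ten student IDs
records = list(range(10))
train, test = split_70_30(records)

# Hypothetical actual majors and predictions for the three test records
actual = ["Natural", "Social", "Natural"]
predicted = ["Natural", "Social", "Social"]
cm = confusion_matrix(actual, predicted)
acc = accuracy(cm)
```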


Detection and Classification of Concrete Patches by Integrating GPR and Surface Imaging

10.5703/1288284317320 ◽

2021 ◽

Author(s):

Peng Cheng ◽

James V. Krogmeier ◽

Mark R. Bell ◽

Joshua Li ◽

Guangwei Yang

Keyword(s):

Imaging System ◽

Training Data ◽

High Rate ◽

False Alarms ◽

Neural Net ◽

Pavement Distress ◽

Software Application ◽

Using Data ◽

Ground Penetrating

This research considers the detection, location, and classification of patches in concrete and asphalt-on-concrete pavements using data taken from ground penetrating radar (GPR) and the WayLink 3D Imaging System. In particular, the project seeks to develop a patching table for “inverted-T” patches. A number of deep neural net methods were investigated for patch detection from 3D elevation and image observation, but the results were inconclusive, partly because of a dearth of training data. Later, a method based on thresholding IRI values computed on a 12-foot window was used to localize pavement distress, particularly as seen by patch settling. This method was far more promising. In addition, algorithms were developed for segmentation of the GPR data and for classification of the ambient pavement and the locations and types of patches found in it. The results so far are promising but far from perfect, with a relatively high rate of false alarms. The two project parts were combined to produce a fused patching table. Several hundred miles of data were captured with the WayLink System to compare with a much more limited GPR dataset. The primary dataset was captured on I-74. A software application for MATLAB has been written to aid in automation of patch table creation.
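The windowed thresholding idea can be illustrated with a short sketch. It assumes the IRI values have already been computed for consecutive, non-overlapping 12-foot windows; the threshold of 170 and the sample values are hypothetical, and the abstract does not describe the project's actual windowing or threshold.

```python
WINDOW_FT = 12  # window length stated in the abstract

def distressed_segments(iri_values, threshold):
    """Return (start_ft, end_ft) spans where consecutive windows
    have an IRI value above the threshold."""
    segments = []
    start = None
    for i, value in enumerate(iri_values):
        if value > threshold and start is None:
            start = i  # a flagged run begins
        elif value <= threshold and start is not None:
            segments.append((start * WINDOW_FT, i * WINDOW_FT))
            start = None  # the run ended at the previous window
    if start is not None:
        # The run extends to the end of the data
        segments.append((start * WINDOW_FT, len(iri_values) * WINDOW_FT))
    return segments

# Hypothetical per-window IRI values
iri = [80, 95, 210, 250, 90, 300, 85]
segments = distressed_segments(iri, threshold=170)
```

Merging adjacent flagged windows into one span gives a location and extent for each candidate distress area rather than a list of isolated windows.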
