Machine Learning Approach for Detection of nonTor Traffic

Author(s):  
Elike Hodo ◽  
Xavier Bellekens ◽  
Ephraim Iorkyase ◽  
Andrew Hamilton ◽  
Christos Tachtatzis ◽  
...  

Intrusion detection has attracted considerable interest from researchers and industry. After many years of research, the community still faces the problem of building reliable and efficient intrusion detection systems (IDS) capable of handling large quantities of data with changing patterns in real-time situations. The Tor network is popular for providing privacy and security to end users by anonymizing the identity of internet users connecting through a series of tunnels and nodes. This work identifies two problems: classification of Tor traffic and nonTor traffic, to expose the activities within Tor traffic that minimize the protection of users, using the UNB-CIC Tor Network Traffic dataset; and classification of the Tor traffic flow in the network. This paper proposes a hybrid classifier: an Artificial Neural Network (ANN) in conjunction with a Correlation feature selection (CFS) algorithm for dimensionality reduction and improved classification performance. The reliability and efficiency of the proposed hybrid classifier are compared with Support Vector Machine and naïve Bayes classifiers in detecting nonTor traffic in the UNB-CIC Tor Network Traffic dataset. Experimental results show that the hybrid classifier, ANN-CFS, proved a better classifier in detecting nonTor traffic and classifying the Tor traffic flow in the UNB-CIC Tor Network Traffic dataset.
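The abstract does not spell out how the correlation feature selection step scores features. As a rough sketch only, assuming it follows the widely used correlation-based merit heuristic (a subset scores well when its features correlate with the class but not with each other), the idea can be illustrated in plain Python. Pearson correlation, the greedy forward search, the function names and the toy data are all assumptions made for illustration, not the authors' implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / math.sqrt(vx * vy)


def cfs_merit(columns, labels, subset):
    """Merit = k * r_cf / sqrt(k + k * (k - 1) * r_ff) for a feature subset."""
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean absolute feature-class correlation.
    r_cf = sum(abs(pearson(columns[i], labels)) for i in subset) / k
    if k == 1:
        return r_cf
    # Mean absolute feature-feature correlation over all pairs.
    pairs = [(i, j) for a, i in enumerate(subset) for j in subset[a + 1:]]
    r_ff = sum(abs(pearson(columns[i], columns[j])) for i, j in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(columns, labels):
    """Forward selection: add the best feature until the merit stops improving."""
    selected, best = [], 0.0
    remaining = list(range(len(columns)))
    while remaining:
        candidate = max(remaining, key=lambda i: cfs_merit(columns, labels, selected + [i]))
        score = cfs_merit(columns, labels, selected + [candidate])
        if score <= best + 1e-12:  # no real improvement, stop
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best = score
    return selected, best


# Toy data (hypothetical): feature 1 duplicates feature 0, feature 2 is mostly noise.
labels = [0, 0, 0, 1, 1, 1]
columns = [
    [0.1, 0.2, 0.1, 0.9, 0.8, 1.0],
    [0.2, 0.4, 0.2, 1.8, 1.6, 2.0],
    [5, 1, 3, 2, 4, 6],
]
selected, merit = greedy_cfs(columns, labels)
print(selected, round(merit, 3))
```

On this toy data the redundant copy adds nothing to the merit, so only one of the two informative columns is kept; the reduced feature set would then be passed to the classifier.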

2019 ◽  
Vol 2019 ◽  
pp. 1-9
Author(s):  
Yizhe Wang ◽  
Cunqian Feng ◽  
Yongshun Zhang ◽  
Sisan He

Precession is a common micromotion form of space targets, introducing additional micro-Doppler (m-D) modulation into the radar echo. Effective classification of space targets is of great significance for further micromotion parameter extraction and identification. Feature extraction is a key step during the classification process, largely influencing the final classification performance. This paper presents two methods for classifying different types of space precession targets from their high-resolution range profiles (HRRPs). We first establish the precession model of space targets and analyze the scattering characteristics and then compute electromagnetic data of the cone target, cone-cylinder target, and cone-cylinder-flare target. Experimental results demonstrate that the support vector machine (SVM) using histograms of oriented gradient (HOG) features achieves a good result, whereas the deep convolutional neural network (DCNN) obtains a higher classification accuracy. DCNN combines the feature extractor and the classifier itself to automatically mine the high-level signatures of HRRPs through a training process. In addition, the efficiency of the two classification processes is compared using the same dataset.


Sensors ◽  
2021 ◽  
Vol 21 (21) ◽  
pp. 7417
Author(s):  
Alex J. Hope ◽  
Utkarsh Vashisth ◽  
Matthew J. Parker ◽  
Andreas B. Ralston ◽  
Joshua M. Roper ◽  
...  

Concussion injuries remain a significant public health challenge. A significant unmet clinical need remains for tools that allow related physiological impairments and longer-term health risks to be identified earlier, better quantified, and more easily monitored over time. We address this challenge by combining a head-mounted wearable inertial motion unit (IMU)-based physiological vibration acceleration (“phybrata”) sensor and several candidate machine learning (ML) models. The performance of this solution is assessed for both binary classification of concussion patients and multiclass predictions of specific concussion-related neurophysiological impairments. Results are compared with previously reported approaches to ML-based concussion diagnostics. Using phybrata data from a previously reported concussion study population, four different machine learning models (Support Vector Machine, Random Forest Classifier, Extreme Gradient Boost, and Convolutional Neural Network) are first investigated for binary classification of the test population as healthy vs. concussion (Use Case 1). Results are compared for two different data preprocessing pipelines, Time-Series Averaging (TSA) and Non-Time-Series Feature Extraction (NTS). Next, the three best-performing NTS models are compared in terms of their multiclass prediction performance for specific concussion-related impairments: vestibular, neurological, both (Use Case 2). For Use Case 1, the NTS model approach outperformed the TSA approach, with the two best algorithms achieving an F1 score of 0.94. For Use Case 2, the NTS Random Forest model achieved the best performance in the testing set, with an F1 score of 0.90, and identified a wider range of relevant phybrata signal features that contributed to impairment classification compared with manual feature inspection and statistical data analysis. 
The overall classification performance achieved in the present work exceeds previously reported approaches to ML-based concussion diagnostics using other data sources and ML models. This study also demonstrates the first combination of a wearable IMU-based sensor and ML model that enables both binary classification of concussion patients and multiclass predictions of specific concussion-related neurophysiological impairments.
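The results above are reported as F1 scores. As a reminder of what that metric measures, here is a minimal sketch that computes F1 for one class from true and predicted labels. The label names and the toy predictions are hypothetical and are not taken from the study.

```python
def f1_score(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical binary example: "c" = concussion, "h" = healthy.
y_true = ["c", "c", "c", "h", "h", "h"]
y_pred = ["c", "c", "h", "h", "h", "c"]
score = f1_score(y_true, y_pred, positive="c")
print(round(score, 3))
```

For a multiclass task such as Use Case 2, one common convention is to compute this per class and average the results; the abstract does not say which averaging was used.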


2019 ◽  
Vol 38 (1) ◽  
pp. 155-169
Author(s):  
Chihli Hung ◽  
You-Xin Cao

Purpose: This paper aims to propose a novel approach which integrates collocations and domain concepts for Chinese cosmetic word of mouth (WOM) sentiment classification. Most sentiment analysis works by collecting sentiment scores from each unigram or bigram. However, not every unigram or bigram in a WOM document contains sentiments. Chinese collocations carry the main sentiments of WOM. This paper reduces the complexity of the document dimensionality and improves sentiment classification. Design/methodology/approach: This paper builds two contextual lexicons for feature words and sentiment words, respectively. Based on these contextual lexicons, this paper uses the techniques of association rules and mutual information to build possible Chinese collocation sets. This paper applies preference vector modelling as the vector representation approach to catch the relationship between Chinese collocations and their associated concepts. Findings: This paper compares the proposed preference vector models with benchmarks, using three classification techniques (i.e. support vector machine, J48 decision tree and multilayer perceptron). According to the experimental results, the proposed models outperform all benchmarks evaluated by the criterion of accuracy. Originality/value: This paper focuses on Chinese collocations and proposes a novel research approach for sentiment classification. The Chinese collocations used in this paper are adaptable to the content and domains. Finally, this paper integrates collocations with the preference vector modelling approach, which not only achieves a better sentiment classification performance for Chinese WOM documents but also avoids the curse of dimensionality.


2019 ◽  
Vol 9 (2) ◽  
Author(s):  
Nur Rafeeqkha Sulaiman ◽  
Maheyzah Md. Siraj

With the growth of the Internet, it has become not only a medium for getting information but also a platform for communicating. Social Network Services (SNS) are one of the main platforms where Internet users can communicate by distributing and sharing information and knowledge. Chatting has become a popular communication medium whereby Internet users can communicate directly and privately with each other. However, due to the privacy of chat rooms and other chatting mediums, the content of chat logs is neither monitored nor filtered, which makes it easier for cyber predators to target their victims. Cyber groomers are cyber predators who prey on children or minors to satisfy their sexual desire. Experts involved in intelligence gathering face growing difficulty as the complexity of crime increases, compounded by human error and time constraints. Hence, it is difficult to prevent undesired content, such as grooming conversations, in chat logs. This study aims to improve content-based classification accuracy on chat logs by comparing two term weighting schemes, on two datasets, in classifying grooming content. The two schemes, namely Term Frequency – Inverse Document Frequency – Inverse Class Space Density Frequency (TF.IDF.ICSdF) and Fuzzy Rough Feature Selection (FRFS), are used as the feature selection process in filtering chat logs. The performance of these techniques was examined on the datasets, and the accuracy of their results was measured with a Support Vector Machine (SVM). TF.IDF.ICSdF and FRFS are judged on accuracy, precision, recall and F-score.
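To make the idea of term weighting concrete, the sketch below computes plain TF-IDF weights in pure Python. This is deliberately the baseline scheme only: it does not implement the class-space-density extension (ICSdF) or FRFS named in the abstract, and the function name and toy token lists are assumptions for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of non-empty token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Toy chat-like messages (hypothetical), already tokenized.
docs = [
    ["hi", "how", "are", "you"],
    ["how", "old", "are", "you"],
    ["you", "like", "games"],
]
weights = tf_idf(docs)
print(weights[1])
```

A term that occurs in every message ("you" here) gets weight zero, while a term confined to one message ("old") gets the highest weight in that message. Class-aware schemes refine this by also considering how a term is spread across classes.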


2018 ◽  
Vol 21 (62) ◽  
pp. 1
Author(s):  
Jorge E. Camargo ◽  
Vladimir Vargas-Calderon ◽  
Nelson Vargas ◽  
Liliana Calderón-Benavides

With the purpose of classifying text based on its sentiment polarity (positive or negative), we proposed an extension of a 68,000-tweet corpus through the inclusion of word definitions from a dictionary of the Real Academia Española de la Lengua (RAE). A set of 28,000 combinations of 6 Word2Vec and support vector machine parameters was considered in order to evaluate how much the inclusion of the RAE dictionary definitions would improve classification performance. We found that such a corpus extension significantly improves the classification accuracy. Therefore, we conclude that the inclusion of the RAE dictionary increases the semantic relations learned by Word2Vec, allowing a better classification accuracy.


Electrocardiogram (ECG) examination via computer techniques that involve feature extraction, pre-processing and post-processing has been adopted because of its significant advantages. Extracting standard ECG signal features, which requires a high level of processing, has been the main focus of many studies. In this paper, up to 6 different ECG signal classes are accurately predicted in the absence of ECG feature extraction. The cornerstone of the proposed technique is Linear predictive coding (LPC), which regresses and normalizes the signal during the pre-processing phase. Prior to feature extraction using Wavelet energy (WE), a direct Wavelet transform (DWT) is applied to convert the ECG signal to the frequency domain. In addition, the dataset was divided into two parts, one for training and the other for testing, which were classified in the proposed algorithm using a support vector machine (SVM). Moreover, using MIT AI2 Companion, developed by the MIT Center for Mobile Learning, the classification result was shared to the patient's mobile phone, which can call an ambulance and send the location in case of a serious emergency. Finally, the confusion matrix values are used to measure the proposed classification performance. For 6 different ECG classes, an accuracy ratio of about 98.15% was recorded. This ratio became 100% for 3 ECG signal classes and decreased to 97.95% when the number of ECG signal classes was increased to 7.
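The wavelet energy features mentioned above can be illustrated with a small pure-Python sketch. It assumes a Haar wavelet and a signal whose length is divisible by 2 raised to the number of levels; the abstract names neither the wavelet family nor the decomposition depth, so both are choices made here for illustration, as are the function names and the toy signal.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_energies(signal, levels):
    """Energy of each detail band (finest first), then of the final approximation."""
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    return energies


# Toy samples standing in for an ECG segment (hypothetical values).
signal = [4, 2, 6, 8, 1, 3, 5, 7]
energies = wavelet_energies(signal, levels=2)
print([round(e, 3) for e in energies])
```

Because the Haar transform used here is orthonormal, the band energies sum to the energy of the original signal, so the vector describes how that energy is distributed across scales. A feature vector of this kind is what would be handed to a classifier such as an SVM.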


2020 ◽  
Vol 3 (2) ◽  
pp. 196-206
Author(s):  
Mausumi Das Nath ◽  
Tapalina Bhattasali

Due to the enormous usage of the Internet, users share resources and exchange voluminous amounts of data. This increases the risk of data theft and other types of attacks. Network security plays a vital role in protecting the electronic exchange of data and attempts to avoid financial loss or service disruption caused by unknown proliferations in the network. Intrusion Detection Systems (IDS) are commonly used to detect such unknown attacks and unauthorized access in a network. Researchers have put forward many approaches that showed satisfactory results in intrusion detection, ranging from traditional approaches to Artificial Intelligence (AI) based approaches. AI-based techniques have gained an edge over other statistical techniques in the research community due to their enormous benefits. Procedures can be designed to display behavior learned from previous experiences. Machine learning algorithms are used to analyze the abnormal instances in a particular network. Supervised learning is essential in terms of training and analyzing the abnormal behavior in a network. In this paper, we propose a model of Naïve Bayes and SVM (Support Vector Machine) to detect anomalies and an ensemble approach to solve the weaknesses and to remove the poor detection results.


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Gongliang Li ◽  
Mingyong Yin ◽  
Siyuan Jing ◽  
Bing Guo

Detection of abnormal network traffic is an important issue when building intrusion detection systems. An effective way to address this issue is time series mining, in which the network traffic is naturally represented as a set of time series. In this paper, we propose a novel, efficient algorithm, called RSFID (Random Shapelet Forest for Intrusion Detection), to detect abnormal traffic flow patterns in periodic network packets. First, the Fast Correlation-based Filter (FCBF) algorithm is employed to remove irrelevant features, reducing both overfitting and time complexity. Then, a random forest built upon a set of shapelet candidates is used to classify the normal and abnormal traffic flow patterns. Specifically, the Symbolic Aggregate approXimation (SAX) and a random sampling technique are adopted to mitigate the high time complexity caused by enumerating shapelet candidates. Experimental results show the effectiveness and efficiency of the proposed algorithm.
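As a rough illustration of the SAX step, the sketch below turns a numeric series into a short symbolic word: z-normalize, average over equal-width segments (piecewise aggregate approximation), then map each average to a letter using breakpoints that split a standard normal distribution into equiprobable regions. The alphabet size of 4, the requirement that the series length be divisible by the number of segments, and all names and data are assumptions for this sketch, not details from the paper.

```python
import math

# Breakpoints dividing a standard normal into 4 equiprobable regions.
BREAKPOINTS = (-0.6745, 0.0, 0.6745)
ALPHABET = "abcd"


def znorm(series):
    """Rescale to zero mean and unit variance (all zeros for a constant series)."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    if std == 0:
        return [0.0] * n
    return [(v - mean) / std for v in series]


def paa(series, segments):
    """Piecewise aggregate approximation: mean of each equal-width segment."""
    size = len(series) // segments
    return [sum(series[i * size:(i + 1) * size]) / size for i in range(segments)]


def sax(series, segments):
    """Symbolic word for the series, one letter per segment."""
    word = []
    for value in paa(znorm(series), segments):
        index = sum(value > b for b in BREAKPOINTS)  # count of breakpoints below value
        word.append(ALPHABET[index])
    return "".join(word)


# Toy traffic-like series (hypothetical): low values followed by high values.
series = [1, 1, 2, 2, 8, 8, 9, 9]
word = sax(series, segments=4)
print(word)
```

Working on such short words instead of raw subsequences is what makes candidate enumeration cheaper: many similar subsequences collapse to the same word and need to be evaluated only once.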


Author(s):  
G. Jayagopi ◽  
S. Pushpa

Heart diseases have emerged as potential threats to human lives, especially for elderly people, in recent days due to dynamically varying food habits. However, these diseases can be readily detected by proper analysis of Electrocardiogram (ECG) signals acquired from individuals. This paper proposes a better method to detect and classify arrhythmia using 15 features, which include 4 R-R interval features, 3 statistical features and 6 chaotic features estimated from ECG signals. Additionally, Entropy and Energy features were obtained by converting one-dimensional ECG signals to two-dimensional data and applying Tetrolet transforms to it. A total of 15 features was used to classify the heart beats from the benchmark MIT-Arrhythmia database using Support Vector Machines (SVM). The classification performance was analyzed under various kernel functions and different Tetrolet decomposition levels. It is found that the Radial Basis Function (RBF) kernel performs better than linear and polynomial kernels. This work yielded an accuracy of 99.35% when compared against existing works. Moreover, the addition of two more features introduced a negligible time overhead. Hence, this method is better suited to detecting and classifying arrhythmia in both online and offline settings.


2021 ◽  
Vol 3 (1) ◽  
pp. 6
Author(s):  
Eren Can Seyrek ◽  
Murat Uysal

Hyperspectral images (HSI) offer detailed spectral reflectance information about sensed objects by providing information on hundreds of narrow spectral bands. HSI have a leading role in a broad range of applications, such as forestry, agriculture, geology, and environmental sciences. The monitoring and management of agricultural lands is of great importance for meeting the nutritional and other needs of a rapidly and continuously increasing world population. In relation to this, classification of HSI is an effective way of creating land use and land cover maps quickly and accurately. In recent years, classification of HSI using convolutional neural networks (CNN), which is a sub-field of deep learning, has become a very popular research topic and several CNN architectures have been developed by researchers. The aim of this study was to investigate the classification performance of a CNN model on agricultural HSI scenes. For this purpose, a 3D-2D CNN framework and a well-known support vector machine (SVM) model were compared using the Indian Pines and Salinas Scene datasets, which contain crop and mixed vegetation classes. As a result of this study, it was confirmed that the use of a 3D-2D CNN offers superior performance for classifying agricultural HSI datasets.

