A Lightweight Flow Feature-Based IoT Device Identification Scheme

2022 ◽  
Vol 2022 ◽  
pp. 1-10
Ruizhong Du ◽  
Jingze Wang ◽  
Shuang Li

Internet of Things (IoT) device identification is a key step in the management of IoT devices, because the devices connected to the network must be controlled by the manager. For this purpose, many schemes have been proposed to identify IoT devices, especially schemes that work on the gateway. However, most researchers pay little attention to cost. Considering the gateway’s limited storage and computational resources, a new lightweight IoT device identification scheme is proposed. First, DFI (deep/dynamic flow inspection) technology is used to efficiently extract flow-related statistical features based on in-depth studies. Then, combining symmetric uncertainty and the correlation coefficient, we propose a novel filter feature selection method based on NSGA-III to select effective features for IoT device identification. We evaluate the proposed method using a real smart home IoT data set and three different ML algorithms. The experimental results show that the proposed method is lightweight and that the feature selection algorithm is effective: using only 6 features, it achieves 99.5% accuracy with a 3-minute time interval.
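Symmetric uncertainty, one of the two filter criteria named in this abstract, is commonly defined as SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)). The following is a minimal sketch of that standard definition for discrete values only; it is not the authors' implementation, and it assumes that continuous flow statistics have already been binned. The function names are illustrative.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1].

    x and y are equal-length sequences of discrete (already binned) values.
    """
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        # Both variables are constant, so there is no information to share.
        return 0.0
    # Mutual information from the joint entropy: I = H(X) + H(Y) - H(X, Y).
    joint = entropy(list(zip(x, y)))
    mutual_info = hx + hy - joint
    return 2.0 * mutual_info / (hx + hy)
```

A feature that determines the device label exactly scores 1, and a feature that is statistically independent of the label scores 0, which is what makes the measure usable as a relevance objective in a filter method.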

2021 ◽  
Vol 2 (3) ◽  
pp. 1-24
Chih-Kai Huang ◽  
Shan-Hsiang Shen

The next-generation 5G cellular networks are designed to support Internet of Things (IoT) networks; network components and services are virtualized and run either in virtual machines (VMs) or containers. Moreover, edge clouds (which are closer to end users) are leveraged to reduce end-to-end latency, especially for IoT applications that require short response times. However, computational resources are limited in edge clouds. To minimize overall service latency, it is crucial to determine carefully which services should be provided in edge clouds so that more mobile or IoT devices are served locally. In this article, we propose a novel service cache framework called S-Cache, which automatically caches popular services in edge clouds. In addition, we design a new cache replacement policy to maximize the cache hit rate. Our evaluation uses real log files from Google to form two datasets. The proposed cache replacement policy is compared with other policies such as greedy-dual-size-frequency (GDSF) and least-frequently-used (LFU). The experimental results show that the cache hit rates are improved by 39% on average, and the average latency of our cache replacement policy decreases by 41% and 38% on the two datasets, respectively. This indicates that our approach is superior to other existing cache policies and is more suitable for multi-access edge computing environments. In the implementation, S-Cache relies on OpenStack to clone services to edge clouds and direct the network traffic. We also evaluate the cost of cloning a service to an edge cloud. The cloning cost of various real applications is studied experimentally under the presented framework and in different environments.
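To make the hit-rate comparison concrete, here is a minimal sketch of how a least-frequently-used (LFU) baseline could be replayed over a request trace. It is not S-Cache's replacement policy, which the abstract does not specify; the function name, the trace format (a list of service identifiers), and the tie-breaking rule are assumptions made for illustration.

```python
from collections import Counter


def lfu_hit_rate(requests, capacity):
    """Replay a request trace through an LFU cache and return the hit rate.

    requests: sequence of hashable service identifiers (assumed format).
    capacity: maximum number of services the edge cache can hold.
    """
    if capacity <= 0 or not requests:
        return 0.0
    cache = {}        # dict used as an insertion-ordered set of cached services
    freq = Counter()  # request counts, kept even after a service is evicted
    hits = 0
    for service in requests:
        freq[service] += 1
        if service in cache:
            hits += 1
            continue
        if len(cache) >= capacity:
            # Evict the least frequently requested cached service.
            # Ties go to the service that was inserted earliest.
            victim = min(cache, key=lambda s: freq[s])
            del cache[victim]
        cache[service] = True
    return hits / len(requests)
```

For the trace `["a", "b", "a", "c", "a", "b"]` with capacity 2, only the two repeat requests for `"a"` hit, so the hit rate is 2/6.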

2021 ◽  
pp. 1063293X2110160
Dinesh Morkonda Gunasekaran ◽  
Prabha Dhandayudam

Nowadays women are commonly diagnosed with breast cancer. Feature selection is an important step in constructing a classification framework. We propose a multi-filter union (MFU) feature selection method for breast cancer data sets. A union model based on the random forest algorithm and the logistic regression (LG) algorithm is used to select the important features in the dataset. The performance of the data analysis is evaluated using the optimal feature subset of the selected dataset. The experiments are run first on the Wisconsin diagnostic breast cancer data set and then on a real data set from a women's health care center. The results show that the proposed approach achieves high performance and is efficient compared with existing feature selection algorithms.
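The union step can be illustrated with a small sketch. It assumes that each filter (for example, random forest importances and logistic regression coefficient magnitudes) has already produced one score per feature, and that each filter contributes its top k features; the abstract does not state how the authors threshold each filter, so `k` and both function names are hypothetical.

```python
def top_k(scores, k):
    """Return the names of the k highest-scoring features as a set.

    scores: dict mapping feature name -> importance score from one filter.
    """
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    return set(ranked[:k])


def multi_filter_union(score_sets, k):
    """Union of the top-k features chosen independently by each filter."""
    selected = set()
    for scores in score_sets:
        selected |= top_k(scores, k)
    return selected
```

A feature survives if at least one filter ranks it highly, so the union is never smaller than any single filter's selection and can capture features that only one model finds informative.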

2018 ◽  
Vol 13 (3) ◽  
pp. 323-336 ◽  
Naeimeh Elkhani ◽  
Ravie Chandren Muniyandi ◽  
Gexiang Zhang

Computational cost is a big challenge for almost all intelligent algorithms that run on a CPU. In this regard, our proposed kernel P system multi-objective binary particle swarm optimization feature selection and classification method should run in efficient time, which we aimed to achieve by using the potential of membrane computing for parallel processing and nondeterminism. Moreover, GPUs perform better with latency-tolerant, highly parallel, and independent tasks. In this study, to exploit the full potential of a membrane-inspired model, particularly parallelism, and to improve the time cost, the feature selection method is implemented on a GPU. A comparison of the time cost of the proposed method on CPU, GPU, and multicore platforms indicates a significant improvement from implementing the method on a GPU.

Nor Idayu Mahat ◽  
Maz Jamilah Masnan ◽  
Ali Yeon Md Shakaff ◽  
Ammar Zakaria ◽  
Muhd Khairulzaman Abdul Kadir

This chapter gives an overview of the issue of multicollinearity in electronic nose (e-nose) classification and investigates some analytical solutions to the problem. Multicollinearity may prevent classification analysis from producing good parameter estimates during the construction of the classification rule. The common approach to dealing with multicollinearity is feature extraction. However, a criterion that extracts the raw features based on variance may not be appropriate for the ultimate goal of classification accuracy. Alternatively, a feature selection method is advisable, as it chooses only valuable features. Two distance-based criteria for determining the right features for classification, Wilks' lambda and the bounded Mahalanobis distance, are applied. Classification with features determined by the bounded Mahalanobis distance performs statistically better than with Wilks' lambda. This chapter suggests that classifying e-nose data with feature selection is a good way to limit the cost of experiments while maintaining good classification performance.

2021 ◽  
Marta Ferreira ◽  
Pierre Lovinfosse ◽  
Johanne Hermesse ◽  
Marjolein Decuypere ◽  
Caroline Rousseau ◽  

Background: Feature reproducibility and model generalizability are currently among the most important limitations when integrating radiomics into the clinic. Radiomic features are sensitive to imaging acquisition protocols, reconstruction algorithms and parameters, as well as to the different steps of the usual radiomics workflow. We propose a framework for comparing the reproducibility of different pre-processing steps in PET/CT radiomic analysis in the prediction of disease-free survival (DFS) across multiple scanners/centers. Results: We evaluated and compared the prediction performance of several models that differ in (i) the type of intensity discretization, (ii) the feature selection method, and (iii) the feature type, i.e., original or tumour-to-liver ratio radiomic features (OR or TLR). We trained our models using data from one scanner/center and tested them on two external scanners/centers. Our results show low reproducibility in predictions across scanners and discretization methods. Despite this, TLR-based models were generally more robust than OR-based models. Maximum relevance minimum redundancy (MRMR) forward feature selection with Pearson correlation had the best mean area under the precision-recall curve for two of the four classifiers when applied to the TLR features combined across all discretization bin numbers (D_All_FBN). Conclusion: We evaluated and compared the prediction performance of several models on a data set of 158 patients with locally advanced cervical cancer (LACC) from three distinct scanners. In our cohort of LACC patients, pre-processing of radiomic features in [18F]FDG PET affects DFS prediction performance across scanners, and combining the D_All_FBN TLR approach with the MRMR forward Pearson feature selection method might help increase the robustness of radiomic studies.
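As an illustration of MRMR forward selection with Pearson correlation, the sketch below greedily adds the feature whose relevance to the target most exceeds its mean redundancy with the features already chosen. It assumes the common "difference" form of the MRMR criterion and absolute correlations; the abstract does not say which variant the authors used, and the function names are illustrative.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_a * var_b)


def mrmr_forward(features, target, k):
    """Greedy MRMR: pick k features maximizing relevance minus redundancy.

    features: dict mapping feature name -> list of values (one per sample).
    target:   list of numeric outcomes, same length as each feature column.
    """
    relevance = {name: abs(pearson(col, target)) for name, col in features.items()}
    selected = []
    remaining = list(features)

    def score(name):
        if not selected:
            return relevance[name]
        # Mean absolute correlation with the features already selected.
        redundancy = sum(
            abs(pearson(features[name], features[s])) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With a near-duplicate of the first chosen feature in the pool, the redundancy penalty pushes the second pick toward a less relevant but uncorrelated feature, which is the behaviour that distinguishes MRMR from ranking by relevance alone.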

As new technologies emerge, data is being generated in larger volumes and higher dimensions. The high dimensionality of data may pose a great challenge for classification. The presence of redundant features and noisy data degrades the performance of the model, so it is necessary to extract the relevant features from a given data set. Feature extraction is an important step in many machine learning algorithms, and many researchers have attempted to extract features. Among the different feature extraction methods, mutual information is a widely used feature selection method because it quantifies the dependency among features well in classification problems. To address this problem, in this paper we propose a simplified mutual information based feature selection method with less computational overhead. The selected feature subset is evaluated with a multilayer perceptron on the KDD CUP 99 data set with 2-class, 5-class, and 4-class classification. The accuracy of these models is almost the same with fewer features.

Sentiment analysis plays a major role in e-commerce and social media these days. With the growth of social media, a huge number of users post their reviews through the Internet and several other sources, and analyzing this data is challenging. In this paper, a new normalization-based feature selection method is proposed; the aim is to select the relevant features, classify the data, and measure the accuracy. Stability of the data is considered the most important challenge in analyzing sentiments. Investigating the sentiments and selecting the relevant features from the data set plays a major role in this paper. The aim is to work with vector-based feature selection and check the classification performance using recurrent networks. Text mining here depends on feature retrieval methods to improve accuracy, and a single matrix normalization method is proposed to reduce the dimensions. The proposed method performs data preprocessing or sentiment classification and feature reduction to improve accuracy. It achieves better accuracy than the N-gram feature selection method. The experimental results show that the proposed method has better accuracy than other traditional feature selection approaches and can decrease the implementation time.

Sensors ◽  
2020 ◽  
Vol 20 (21) ◽  
pp. 6336 ◽  
Mnahi Alqahtani ◽  
Hassan Mathkour ◽  
Mohamed Maher Ben Ismail

Nowadays, Internet of Things (IoT) technology has various network applications and has attracted the interest of many research and industrial communities. Particularly, the number of vulnerable or unprotected IoT devices has drastically increased, along with the amount of suspicious activity, such as IoT botnet and large-scale cyber-attacks. In order to address this security issue, researchers have deployed machine and deep learning methods to detect attacks targeting compromised IoT devices. Despite these efforts, developing an efficient and effective attack detection approach for resource-constrained IoT devices remains a challenging task for the security research community. In this paper, we propose an efficient and effective IoT botnet attack detection approach. The proposed approach relies on a Fisher-score-based feature selection method along with a genetic-based extreme gradient boosting (GXGBoost) model in order to determine the most relevant features and to detect IoT botnet attacks. The Fisher score is a representative filter-based feature selection method used to determine significant features and discard irrelevant features through the minimization of intra-class distance and the maximization of inter-class distance. On the other hand, GXGBoost is an optimal and effective model, used to classify the IoT botnet attacks. Several experiments were conducted on a public botnet dataset of IoT devices. The evaluation results obtained using holdout and 10-fold cross-validation techniques showed that the proposed approach had a high detection rate using only three out of the 115 data traffic features and improved the overall performance of the IoT botnet attack detection process.
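The Fisher score described in this abstract can be sketched for a single feature as the ratio of between-class scatter to within-class scatter. The code below follows the common textbook form, with class-size weights and population variances; whether the authors use exactly this normalization is an assumption, and the function name is illustrative.

```python
from collections import defaultdict


def fisher_score(values, labels):
    """Fisher score of one feature.

    score = sum_k n_k * (mean_k - mean)^2 / sum_k n_k * var_k
    Larger scores mean the classes are far apart relative to their spread.
    """
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)

    overall_mean = sum(values) / len(values)
    between = 0.0  # inter-class distance term (to be maximized)
    within = 0.0   # intra-class distance term (to be minimized)
    for members in groups.values():
        n_k = len(members)
        mean_k = sum(members) / n_k
        var_k = sum((v - mean_k) ** 2 for v in members) / n_k
        between += n_k * (mean_k - overall_mean) ** 2
        within += n_k * var_k

    if within == 0:
        # Perfectly tight classes: infinitely separable unless the means coincide.
        return float("inf") if between > 0 else 0.0
    return between / within
```

Scoring every traffic feature this way and keeping the top few is how a filter of this kind would reduce a large feature set to a handful before any classifier is trained.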

Esraa H. Abd Al-Ameer ◽  
Ahmed H. Aliwy

Document classification is one of the most important fields in natural language processing and text mining, and many algorithms can be used for this task. This paper focuses on improving text classification through feature selection, that is, keeping a subset of the original features without affecting accuracy. We suggest a new feature selection method that can be seen as a general formulation and mathematical model of Recursive Feature Elimination (RFE). The method was compared with two other well-known feature selection methods: Chi-square and threshold. The results showed that the new method is comparable with the other methods. The best results were 83% when 60% of the features were used, 82% when 40% were used, and 82% when 20% were used. The tests were done with the Naïve Bayes (NB) and decision tree (DT) classification algorithms on the well-known English data set “20 newsgroups text”, which consists of approximately 18846 files. The results showed that our suggested feature selection method is comparable with standard methods such as Chi-square.
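For reference, the Chi-square baseline mentioned in this abstract can be sketched for one term and one class using the standard 2x2 contingency form, χ² = N(AD − BC)² / ((A+B)(C+D)(A+C)(B+D)). This shows only the baseline statistic, not the proposed RFE-style method; the document representation (one set of tokens per document) and the function name are assumptions.

```python
def chi_square(docs, labels, term, target_class):
    """Chi-square statistic between a term and a class.

    docs:   list of token sets, one per document (assumed representation).
    labels: list of class labels, aligned with docs.
    """
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == target_class
        if has_term and in_class:
            a += 1  # term present, document in class
        elif has_term:
            b += 1  # term present, document in another class
        elif in_class:
            c += 1  # term absent, document in class
        else:
            d += 1  # term absent, document in another class
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # term or class is degenerate (always or never present)
    return n * (a * d - b * c) ** 2 / denom
```

Terms are then ranked by this score and only the top fraction is kept, which corresponds to the "percentage of features used" settings reported above.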

Omar S. Qasim ◽  
Mohammed Sabah Mahmoud ◽  
Fatima Mahmood Hasan

The aim of feature selection is to obtain the most important information from a given dataset. Improvements in feature selection positively affect the classification process, which is applied in various areas such as machine learning, pattern recognition, and signal processing. In this study, a hybrid algorithm combining the binary dragonfly algorithm (BDA) and statistical dependence (SD) is presented, whereby feature selection in discrete space is modeled as a binary optimization problem, guiding BDA and using the accuracy of the k-nearest neighbors classifier on the dataset as the chosen fitness function. The experimental results demonstrate that the proposed algorithm, which we refer to as SD-BDA, outperforms other algorithms in terms of computational cost and classification accuracy.
