The Impact of Different Botnet Flow Feature Subsets on Prediction Accuracy Using Supervised and Unsupervised Learning Methods

Author(s):  
Sean Miller ◽  
Curtis Busby-Earle

Text mining uses machine learning (ML) and natural language processing (NLP) to recognize implicit knowledge in text; such knowledge serves many domains, such as translation, media searching, and business decision making. Opinion mining (OM) is one of the promising text mining fields; it is used to discover polarity in text and has substantial benefits for business. ML techniques are divided into two approaches, supervised and unsupervised learning, and we test here an OM feature selection (FS) approach using four ML techniques. In this paper, we implemented a number of experiments using four machine learning techniques on the same three Arabic language corpora. This paper aims to increase the accuracy of opinion highlighting for the Arabic language by using enhanced feature selection approaches. The proposed FS model is adopted to enhance opinion highlighting. The experimental results show that the proposed approaches outperform at various levels of supervision, i.e., across different techniques and distinct data domains. Multiple levels of comparison are carried out and discussed to further understand the impact of the proposed model on several ML techniques.


2011 ◽  
Vol 32 (11) ◽  
pp. 1523-1531 ◽  
Author(s):  
Janick V. Frasch ◽  
Aleksander Lodwich ◽  
Faisal Shafait ◽  
Thomas M. Breuel

2021 ◽  
Author(s):  
Hao Wang ◽  
Yi-Qin Dai ◽  
Jie Yu ◽  
Yong Dong

Abstract Improving resource utilization is an important goal of high-performance computing systems in supercomputing centers. To meet this goal, the job scheduler of a high-performance computing system often uses backfilling scheduling to fill short jobs into the gaps between jobs at the front of the queue. Backfilling scheduling needs to know the running time of each job. In the past, job running times were usually given by users and often far exceeded the actual running times, which led to inaccurate backfilling and a waste of computing resources. In particular, when the predicted job running time is lower than the actual time, the damage to the utilization of the system's computing resources becomes more serious. Therefore, the prediction accuracy of the job running time is crucial to the utilization of system resources. Machine learning methods can predict the job running time more accurately. Targeting parallel aerodynamics applications, we propose SU, a job running time prediction framework that combines supervised and unsupervised learning, and verify it on real historical data from the high-performance computing systems of the China Aerodynamics Research and Development Center (CARDC). The experimental results show that SU has a high prediction accuracy (80.46%) and a low underestimation rate (24.85%).
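As a minimal sketch of the two ideas in this abstract, the Python below computes an underestimation rate and checks whether a job fits an idle gap for backfilling. The function names, the sample values, and the definition used here (the share of jobs whose predicted running time is below the actual one) are assumptions made for illustration; the abstract does not state how SU defines or computes its metrics.

```python
def underestimation_rate(predicted, actual):
    """Fraction of jobs whose predicted running time is below the actual one.

    Assumed definition for illustration; not taken from the paper.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one job is required")
    under = sum(1 for p, a in zip(predicted, actual) if p < a)
    return under / len(predicted)


def fits_in_gap(predicted_runtime, gap_seconds):
    """A job is a backfilling candidate only if its predicted time fits the gap."""
    return predicted_runtime <= gap_seconds


# Hypothetical running times in seconds for four jobs.
predicted = [100, 250, 40, 600]
actual = [120, 200, 40, 900]

rate = underestimation_rate(predicted, actual)
print(f"underestimation rate: {rate:.2%}")
```

An underestimated job that passes `fits_in_gap` can still overrun the gap at run time, which is one way to read the abstract's remark that underestimation is the more damaging kind of error.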


1999 ◽  
Vol 32 (2) ◽  
pp. 7772-7777
Author(s):  
István Dalmi ◽  
László Kovács ◽  
István Loránt ◽  
Gábor Terstyánszky

2015 ◽  
Vol 29 (S1) ◽  
Author(s):  
John Bukowy ◽  
Louise Evans ◽  
Elizabeth Broadway ◽  
Alex Dayton ◽  
Allen Cowley

2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Hao Wang ◽  
Yi-Qin Dai ◽  
Jie Yu ◽  
Yong Dong

Abstract Improving resource utilization is an important goal of high-performance computing systems in supercomputing centers. To meet this goal, the job scheduler of a high-performance computing system often uses backfilling scheduling to fill short jobs into job gaps at the front of the queue. Backfilling scheduling needs to know the running time of each job. In the past, the job running time was usually given by users and often far exceeded the actual running time, which led to inaccurate backfilling and a waste of computing resources. In particular, when the predicted job running time is lower than the actual time, the damage to the utilization of the system's computing resources becomes more serious. Therefore, the prediction accuracy of the job running time is crucial to the utilization of system resources. Machine learning methods can predict the job running time more accurately. Targeting parallel aerodynamics applications, we propose SU, a job running time prediction framework that combines supervised and unsupervised learning, and verify it on real historical data from the high-performance computing systems of the China Aerodynamics Research and Development Center (CARDC). The experimental results show that SU has a high prediction accuracy (80.46%) and a low underestimation rate (24.85%).

