Predicting running time of aerodynamic jobs in HPC system by combining supervised and unsupervised learning method

2021, Vol 3 (1)
Author(s): Hao Wang, Yi-Qin Dai, Jie Yu, Yong Dong

Abstract: Improving resource utilization is an important goal of the high-performance computing systems in supercomputing centers. To meet this goal, the job scheduler of a high-performance computing system often uses backfilling scheduling to fill short jobs into the gaps between jobs at the front of the queue. Backfilling scheduling needs to know the running time of each job. In the past, the job running time was usually given by users and often far exceeded the actual running time, which led to inaccurate backfilling and wasted computing resources. The damage to the utilization of the system's computing resources is more serious still when the predicted job running time is lower than the actual time. The prediction accuracy of the job running time is therefore crucial to the utilization of system resources. Machine learning methods can predict the job running time more accurately. Targeting parallel aerodynamics applications, we propose SU, a job running time prediction framework that combines supervised and unsupervised learning, and verify it on real historical data from the high-performance computing systems of the China Aerodynamics Research and Development Center (CARDC). The experimental results show that SU has a high prediction accuracy (80.46%) and a low underestimation rate (24.85%).
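To illustrate why runtime estimates matter for backfilling, here is a minimal sketch of a backfill admission check and an underestimation-rate calculation. It is not the SU framework. The `Job` class, the function names, and the sample numbers are all hypothetical. The underestimation rate is assumed here to be the fraction of jobs whose predicted runtime falls below the actual runtime; the paper may define it differently.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int                # nodes requested
    predicted_runtime: float  # seconds, supplied by a user or a model
    actual_runtime: float     # seconds, known only after the job ends


def can_backfill(job, free_nodes, gap_seconds):
    """Scheduler's view: the job fits the idle gap by its predicted runtime."""
    return job.nodes <= free_nodes and job.predicted_runtime <= gap_seconds


def overruns_gap(job, gap_seconds):
    """Hindsight view: the job actually ran longer than the gap."""
    return job.actual_runtime > gap_seconds


def underestimation_rate(jobs):
    """Assumed definition: share of jobs with predicted < actual runtime."""
    if not jobs:
        return 0.0
    under = sum(1 for j in jobs if j.predicted_runtime < j.actual_runtime)
    return under / len(jobs)


# Hypothetical queue: 8 idle nodes, 700 s until the head job is due to start.
free_nodes, gap_seconds = 8, 700
jobs = [
    Job("a", nodes=4, predicted_runtime=600, actual_runtime=500),
    Job("b", nodes=4, predicted_runtime=600, actual_runtime=900),   # underestimated
    Job("c", nodes=16, predicted_runtime=300, actual_runtime=250),  # too wide
    Job("d", nodes=2, predicted_runtime=3600, actual_runtime=400),  # overestimated
]

admitted = [j.name for j in jobs if can_backfill(j, free_nodes, gap_seconds)]
delaying = [j.name for j in jobs
            if can_backfill(j, free_nodes, gap_seconds) and overruns_gap(j, gap_seconds)]
wasted = [j.name for j in jobs
          if j.nodes <= free_nodes
          and not can_backfill(j, free_nodes, gap_seconds)
          and not overruns_gap(j, gap_seconds)]

print("admitted:", admitted)
print("would delay the head job:", delaying)
print("rejected although it would have fit:", wasted)
print("underestimation rate:", underestimation_rate(jobs))
```

In this toy queue, job `d` shows the cost of overestimation (a gap it could have used stays idle), while job `b` shows the cost of underestimation (it is admitted and then overruns the gap, delaying the job at the head of the queue).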

