Using job scheduler simulator to evaluate the effectiveness of job run time prediction
Journal: Software & Systems (Vol. 35, No. 1)
Publication Date: 2022-03-16
Authors: S.S. Shumilin; M.Yu. Vorobev
Pages: 124-131
Keywords: mvs-10p; simulator; job scheduling; slurm; predictive analytics
Abstract
The paper investigates the efficiency of queue scheduling using pretrained models. A supercomputer cluster uses a scheduler to distribute the incoming job flow among the available computing resources. To place a job in the queue, the scheduler uses data specified by the user, including the requested program runtime. However, users often misjudge the runtime and specify an upper estimate. If a job completes earlier than specified, the scheduler has to reschedule the queue, and a large number of such events can reduce the efficiency of resource allocation. Many recent papers describe the use of machine learning to predict job run time, which makes it possible to use the run time calculated by a pretrained model during scheduling. However, every model has an estimation error, so scheduling efficiency needs to be assessed for a given value of the model error. This paper investigates the effectiveness of the proposed approach by comparing scheduling efficiency in two scenarios: 1) the scheduler uses the time specified by the user and 2) the scheduler uses the real job runtime. For this purpose, a SLURM scheduler simulator runs simulations on statistical data from the MVS-10P OP2 supercomputer installed at the Joint Supercomputer Center of the Russian Academy of Sciences. The results show that in scenario 2 the average waiting time decreased by 25 % and slowdown decreased by 50 %, while resource utilization did not change significantly. The experimental results indicate that it is feasible to use machine learning algorithms to predict the running time of jobs arriving at a supercomputer cluster. Because the experiment assumes one hundred percent prediction accuracy, which none of the published works on runtime prediction has demonstrated to date, the article provides an upper-bound estimate of the achievable optimization.
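The three indicators compared in the abstract (average waiting time, slowdown and resource utilization) can be illustrated with a minimal sketch. The abstract does not give the formulas used by the authors, so the definitions below are assumptions based on common usage: waiting time is start time minus submit time, slowdown is (wait + run time) / run time, and utilization is the share of node-seconds actually used between the first submission and the last completion. The `Job` class, the `schedule_metrics` function and the sample data are hypothetical and are not taken from the paper or from the SLURM simulator.

```python
from dataclasses import dataclass


@dataclass
class Job:
    submit: float   # submission time, seconds
    start: float    # time the scheduler started the job, seconds
    runtime: float  # actual run time, seconds
    nodes: int      # number of nodes occupied while running


def schedule_metrics(jobs, total_nodes):
    """Compute summary metrics for an already simulated schedule.

    Assumed definitions (not taken from the paper):
      wait        = start - submit
      slowdown    = (wait + runtime) / runtime
      utilization = used node-seconds / available node-seconds
    """
    waits = [j.start - j.submit for j in jobs]
    slowdowns = [(w + j.runtime) / j.runtime for w, j in zip(waits, jobs)]
    first_submit = min(j.submit for j in jobs)
    last_end = max(j.start + j.runtime for j in jobs)
    used = sum(j.nodes * j.runtime for j in jobs)
    return {
        "avg_wait": sum(waits) / len(jobs),
        "avg_slowdown": sum(slowdowns) / len(jobs),
        "utilization": used / (total_nodes * (last_end - first_submit)),
    }


# Hypothetical schedule on a 4-node cluster.
jobs = [
    Job(submit=0, start=0, runtime=100, nodes=2),
    Job(submit=0, start=100, runtime=50, nodes=4),
    Job(submit=10, start=10, runtime=40, nodes=2),
]
metrics = schedule_metrics(jobs, total_nodes=4)
print(metrics)
```

Running the same function on the schedules produced in scenario 1 and scenario 2 and comparing the resulting values would give the kind of relative change reported in the abstract.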
Last modified: 2022-07-06 17:45:55