
Modelling a supercomputer job bundling system based on the Alea simulator

Journal: Software & Systems (Vol.35, No. 4)

Publication Date:

Authors : ; ;

Page : 631-643

Keywords : high-performance computing; job management system; simulation; job bundling; Alea


Abstract

Modern supercomputer job management systems (JMS) are complex software that combine many scheduling algorithms, each with its own parameters. The impact of changing these parameters on JMS quality metrics cannot be predicted or calculated, so researchers use simulation modelling to determine the optimal JMS parameters. This article discusses the development of a supercomputer job management system model based on the well-known Alea simulator. The object of study is the authors' scheduling algorithm, which underlies a supercomputer job bundling system. The algorithm bundles jobs with a long initialization time into groups (packets) according to job type. Initialization is performed once for each group, and the jobs of the group are then executed one after another. Bundling reduces the initialization overhead and increases job scheduling efficiency. We implemented the bundling algorithm as part of the Alea simulator and ran comparative simulations of it against the FCFS and Backfill scheduling algorithms built into Alea. Several workloads with different intensities were generated for the simulation. From the simulation results, we determined for each workload the minimum job initialization share threshold, that is, the fraction of a job's time spent on initialization at and above which the bundling system noticeably improves scheduling efficiency compared to FCFS and Backfill. The results show that the developed simulation model can be used as a software tool for the comparative analysis of supercomputer job scheduling algorithms.
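
The bundling idea described in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' implementation and does not use the Alea API; it assumes a single compute resource that runs jobs back to back, and the names `Job`, `bundle_by_type`, `makespan_without_bundling` and `makespan_with_bundling`, as well as all time values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    job_id: int
    job_type: str     # jobs of the same type can share one initialization
    init_time: float  # hypothetical initialization time
    run_time: float   # hypothetical useful computation time


def makespan_without_bundling(jobs):
    """Each job pays its own initialization cost; jobs run back to back."""
    return sum(job.init_time + job.run_time for job in jobs)


def bundle_by_type(jobs):
    """Group jobs into packets by type, keeping first-arrival order of types."""
    packets = {}
    for job in jobs:
        packets.setdefault(job.job_type, []).append(job)
    return list(packets.values())


def makespan_with_bundling(jobs):
    """Initialization is paid once per packet, then its jobs run in sequence."""
    total = 0.0
    for packet in bundle_by_type(jobs):
        # Assumption: one initialization covers the whole packet; the longest
        # init_time in the packet is used as a conservative estimate.
        total += max(job.init_time for job in packet)
        total += sum(job.run_time for job in packet)
    return total


jobs = [
    Job(1, "A", 10.0, 5.0),
    Job(2, "B", 2.0, 8.0),
    Job(3, "A", 10.0, 5.0),
    Job(4, "A", 10.0, 5.0),
]

print(makespan_without_bundling(jobs))  # 55.0
print(makespan_with_bundling(jobs))     # 35.0
```

In this toy workload the three type "A" jobs share a single initialization, so bundling saves two initializations of 10 time units each. The saving grows with the share of a job's time spent on initialization, which is consistent with the threshold behaviour the abstract reports; a real simulation would also have to account for arrival times, queue waiting and multiple nodes.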

Last modified: 2023-04-07 16:45:28