Fault Detection and Tolerance in Cluster of Workstations using Message Passing Interface
Journal: Sir Syed University Research Journal of Engineering & Technology (SSURJ) (Vol. 1, No. 1)
Publication Date: 2011-12-15
Authors: Syed Misbahuddin
Page: 1-4
Keywords: Availability; Cluster of Workstations; MPI Applications
Abstract
A Cluster of Workstations (COW) is a network-based multi-computer system intended to replace supercomputers. A COW operates according to Divisible Load Theory (DLT), under which a job is divided into n subtasks that are delegated to n workstations in the COW architecture. The job is complete only when every subtask is complete, so satisfactory job completion requires all workstations to be functional. A single faulty node can therefore stall the whole job unless fault avoidance and correction measures are taken. This paper presents a fault detection and fault tolerance algorithm that uses the Message Passing Interface (MPI) to identify faulty workstations and transfer the subtasks they were performing to a normally working workstation. A workstation that receives such a subtask continues its original subtask and runs the reassigned one alongside it on a time-sharing basis.
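The reassignment idea in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's algorithm: it uses plain Python instead of MPI calls, it assumes faults are detected through a heartbeat timeout (the abstract does not say how detection works), and it assumes orphaned subtasks are handed out round-robin. The names `detect_faulty` and `reassign_subtasks` are hypothetical.

```python
from typing import Dict, List, Set


def detect_faulty(last_seen: Dict[int, float], now: float, timeout: float) -> Set[int]:
    """Flag workstations whose last heartbeat is older than `timeout` seconds.

    Assumption: heartbeat timeouts stand in for whatever MPI-based
    detection the paper actually uses.
    """
    return {w for w, t in last_seen.items() if now - t > timeout}


def reassign_subtasks(n: int, faulty: Set[int]) -> Dict[int, List[int]]:
    """Map each healthy workstation to the subtasks it must run.

    Subtask i starts on workstation i (n subtasks on n workstations).
    Subtasks of faulty workstations are given round-robin to healthy
    ones, which keep their own subtask as well (time sharing).
    """
    healthy = [w for w in range(n) if w not in faulty]
    if not healthy:
        raise RuntimeError("no working workstation left to take over subtasks")
    assignment = {w: [w] for w in healthy}
    for k, failed in enumerate(sorted(faulty)):
        target = healthy[k % len(healthy)]  # assumed round-robin policy
        assignment[target].append(failed)
    return assignment


# Four workstations; workstation 2 stopped sending heartbeats.
last_seen = {0: 10.0, 1: 10.0, 2: 3.0, 3: 9.5}
faulty = detect_faulty(last_seen, now=10.5, timeout=2.0)
assignment = reassign_subtasks(4, faulty)
print(faulty, assignment)
```

In this run workstation 0 ends up with its own subtask plus the one orphaned by workstation 2, so all four subtasks still have an owner and the job can complete.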
Last modified: 2018-12-21 14:47:35