Reliability in Grid Computing Systems

Author(s)

Christopher E. Dabrowski

Abstract

In recent years, grid technology has emerged as an important tool for solving compute-intensive problems within the scientific community and in industry. To further the development and adoption of this technology, researchers and practitioners from different disciplines have collaborated to produce standard specifications for implementing large-scale, interoperable grid systems. The focus of this activity has been the Open Grid Forum, but other standards development organizations have also produced specifications that are used in grid systems. To date, these specifications have been used to build operational systems that provide basic grid functions. However, to fully realize the potential of grid technology, it will also be critical to ensure that grid systems are highly reliable and that specifications used to build these systems fully support reliable grid services. Similarly, it will be necessary to ensure that grid systems continue to be reliable under conditions of scale, heterogeneity, and dynamism. This study surveys work on grid reliability that has been done in recent years and reviews progress made toward achieving these goals. The survey identifies important issues and problems that researchers are working to overcome in order to develop reliability methods for large-scale, heterogeneous, dynamic environments. The survey also illuminates reliability issues relating to standard specifications used in grid systems, identifying existing specifications that may need to be evolved and areas where new specifications are needed to better support reliability.
Citation
Concurrency and Computation-Practice & Experience

Keywords

Grid computing, reliability, fault tolerance

Citation

Dabrowski, C. (2009), Reliability in Grid Computing Systems, Concurrency and Computation-Practice & Experience (Accessed April 25, 2024)
Created February 13, 2009, Updated February 19, 2017