GT 4.0 Component Fact Sheet: Web Service Grid Resource Allocation and Management (WS GRAM)

1. Brief component overview

The Web Services Grid Resource Allocation and Management (WS GRAM) component comprises a set of WSRF-compliant Web services to locate, submit, monitor, and cancel jobs on Grid computing resources. WS GRAM is not a job scheduler, but rather a set of services and clients for communicating with a range of batch/cluster job schedulers using a common protocol. WS GRAM is meant to address jobs for which reliable operation, stateful monitoring, credential management, and file staging are important.

2. Summary of features

New Features since 3.2

  • Support for mpich-g2 jobs:

    • multi-job submission capabilities
    • ability to coordinate processes in a job
    • ability to coordinate subjobs in a multi-job

  • Publishing of the job's exit code
  • The ability to select the account under which the remote job will be run. If a user's grid credential is mapped to multiple accounts, then the user can specify, in the RSL, under which account the job should be run.
  • Optional client-specified hold on a job state. The hold is released with the new "release" operation.
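
As a rough illustration of the account-selection and hold features listed above, the following sketch builds a small job description document in Python. The element names used here (job, executable, argument, localUserId, holdState), the account name "alice", and the hold state value "StageOut" are assumptions made for illustration; the names, ordering, and allowed values should be checked against the WS GRAM job description schema before use.

```python
import xml.etree.ElementTree as ET


def build_job_description(executable, arguments=(), local_user_id=None, hold_state=None):
    """Return a job description as an XML string.

    All element names below are assumptions for illustration and are
    not verified against the WS GRAM schema.
    """
    job = ET.Element("job")
    ET.SubElement(job, "executable").text = executable
    for arg in arguments:
        ET.SubElement(job, "argument").text = arg
    # Assumed element: the local account under which the job should run,
    # relevant when a grid credential maps to several accounts.
    if local_user_id is not None:
        ET.SubElement(job, "localUserId").text = local_user_id
    # Assumed element: the state at which the client asks the job to be held
    # until a "release" operation is sent.
    if hold_state is not None:
        ET.SubElement(job, "holdState").text = hold_state
    return ET.tostring(job, encoding="unicode")


# Hypothetical values: "alice" and "StageOut" are placeholders.
description = build_job_description(
    "/bin/date", local_user_id="alice", hold_state="StageOut"
)
print(description)
```

The sketch only produces the document; submitting it to a service is left to a client such as globusrun-ws.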

Other Supported Features

  • Remote job execution and management
  • Uniform and flexible interface to batch scheduling systems
  • File staging before and after job execution
  • File / directory clean up after job execution (after file stage out)
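
The idea of a uniform interface in front of different schedulers, together with the ordering of staging and clean up around execution, can be sketched as follows. This is a conceptual Python model only, not the WS GRAM implementation: the class names SchedulerAdapter, ForkAdapter, and BatchAdapter, the run_job function, and the example paths are all hypothetical.

```python
from abc import ABC, abstractmethod


class SchedulerAdapter(ABC):
    """Hypothetical common interface that hides scheduler differences."""

    @abstractmethod
    def submit(self, executable):
        """Hand the executable to the underlying scheduler; return a job handle."""


class ForkAdapter(SchedulerAdapter):
    """Stand-in for running the job directly on the host."""

    def submit(self, executable):
        return "fork:" + executable


class BatchAdapter(SchedulerAdapter):
    """Stand-in for handing the job to a batch queue."""

    def submit(self, executable):
        return "batch:" + executable


def run_job(adapter, executable, stage_in=(), stage_out=(), clean_up=()):
    """Record the steps of one job in the order described in the list above."""
    steps = []
    for path in stage_in:        # files are staged in before execution
        steps.append(("stage_in", path))
    steps.append(("execute", adapter.submit(executable)))
    for path in stage_out:       # results are staged out after execution
        steps.append(("stage_out", path))
    for path in clean_up:        # clean up runs only after stage out
        steps.append(("clean_up", path))
    return steps


steps = run_job(
    BatchAdapter(),
    "/bin/date",
    stage_in=["input.dat"],
    stage_out=["output.dat"],
    clean_up=["scratch/"],
)
for step in steps:
    print(step)
```

The caller's code is the same whichever adapter is passed in, which is the point of placing a common interface in front of the schedulers.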

Deprecated Features

  • managed-job-globusrun has been replaced by globusrun-ws.
  • Service-managed streaming of a job's stdout/stderr during execution.
  • File staging using the GASS protocol
  • File caching of staged files, e.g. GASS Cache

3. Usability summary

WS GRAM usability has improved considerably in GT4!

  • Improved service performance:

    • Job Concurrency. The maximum number of jobs a GRAM service can manage at one time.
    • Job Throughput. The rate at which jobs can be processed (e.g. x /bin/date jobs per minute).
    • Job Latency. The time taken to process a single operation sent to the GRAM service.
    • Details of performance testing can be found here.
  • Fault Tolerance. The ability of the GRAM service to recover after a container or host crash. Recovery includes the continued processing and monitoring of all jobs managed by the GRAM service at the time of the crash. The 4.0 GRAM architecture was simplified relative to GT3, which is the main reason that fault tolerance has improved.
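
To make the three performance measures above concrete, the following Python sketch computes them from a list of start and end timestamps. It is a minimal illustration under stated assumptions: the Interval record and the sample timings are invented for the example, and the peak number of overlapping jobs is used only as an observed proxy for the concurrency limit, which the service itself determines.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    """Hypothetical record of one job or operation, in seconds."""
    start: float
    end: float


def throughput_per_minute(intervals):
    """Completed jobs per minute over the whole observation window."""
    window = max(i.end for i in intervals) - min(i.start for i in intervals)
    return len(intervals) / window * 60.0


def mean_latency_seconds(intervals):
    """Average time taken by a single job or operation."""
    return sum(i.end - i.start for i in intervals) / len(intervals)


def peak_concurrency(intervals):
    """Largest number of intervals in progress at the same moment."""
    events = []
    for i in intervals:
        events.append((i.start, 1))
        events.append((i.end, -1))
    # Sorting tuples puts an end (-1) before a start (+1) at equal times,
    # so back-to-back intervals are not counted as overlapping.
    events.sort()
    running = peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak


# Invented sample data: three jobs observed over a 60-second window.
sample = [Interval(0, 30), Interval(10, 40), Interval(20, 60)]
print(throughput_per_minute(sample))
print(mean_latency_seconds(sample))
print(peak_concurrency(sample))
```

Throughput is a rate (jobs per unit time), while latency is a duration per operation, so the two can move independently: a service may process many jobs per minute in parallel even when each individual operation is slow.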

4. Backward compatibility summary

Protocol changes since GT version 3.2:

  • The protocol has been changed to be WSRF compliant. There is no backward compatibility between this version and any previous versions.

5. Technology dependencies

GRAM depends on the following GT components:

  • Java WS Core
  • Transport-Level Security
  • Delegation Service
  • RFT
  • GridFTP
  • MDS - internal libraries

GRAM depends on the following third-party software. A dependency exists only for the batch schedulers that are configured, since these are what make job submission to the batch scheduling service possible:

Scheduler adapters are included in the GT 4.0.x releases for these schedulers:

Other scheduler adapters available for GT 4.0.x releases:

6. Tested platforms

Tested platforms for WS GRAM:

  • Linux

    • Fedora Core 1 i686
    • Fedora Core 3 i686
    • Fedora Core 3 yup xeon
    • RedHat 7.3 i686
    • RedHat 9 x86
    • Debian Sarge x86
    • Debian 3.1 i686

Tested containers for WS GRAM:

  • Java WS Core container
  • Tomcat 4.1.31

7. Associated standards

WS GRAM does not currently have any associated standards.

8. For More Information

Click here for more information about this component.