GT 4.0 Component Guide to Public Interfaces: WS GRAM

1. Semantics and syntax of APIs

1.1. Programming Model Overview

Abstractly, this component consists of two interfaces: the Managed Job Factory Port Type (MJFPT) and the Managed Job Port Type (MJPT).

In practice there are three service/resource implementations, two of which implement the basic MJPT. The first is the service that talks to a particular local resource manager to execute a process on the remote computer or cluster. It is called the Managed Executable Job Service (MEJS), and its resource is called the Managed Executable Job Resource (MEJR). The second is a special implementation that accepts a multi-job description, breaks it up into single-job descriptions, and then submits each of these so-called "sub-jobs" to an MEJS. This implementation is called the Managed Multi-Job Service (MMJS), and its resource is called the Managed Multi-Job Resource (MMJR).

Because these two job services use the same port type, the API for accessing the MEJR and the MMJR is identical. The Managed Job Factory Service (MJFS) creates the appropriate job resource depending on the factory resource used to qualify the operation call. Most of the factory resources represent local resource managers used by the MEJS (PBS, LSF, Condor). There is also a special Multi factory resource, which represents an abstract multi-job resource manager. Each of the two types of managed job requires its own job description type.
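
The dispatch rule described above can be sketched as follows. This is only an illustrative model, not the service implementation: the function and class names are hypothetical, and the sketch assumes that the special factory resource is identified by the string "Multi" and that the two description types are distinguished by their root elements (job and multiJob).

```python
from dataclasses import dataclass

# Assumption: the special multi-job factory resource is identified as "Multi".
MULTI_FACTORY_ID = "Multi"


@dataclass(frozen=True)
class JobResourceKind:
    resource: str          # job resource the factory would create
    service: str           # job service that manages that resource
    description_root: str  # root element the job description must use


def kind_for_factory(factory_resource_id: str) -> JobResourceKind:
    """Pick the job resource type implied by a factory resource ID."""
    if factory_resource_id == MULTI_FACTORY_ID:
        return JobResourceKind("MMJR", "MMJS", "multiJob")
    # Every other factory resource stands for a local resource manager.
    return JobResourceKind("MEJR", "MEJS", "job")


def check_description(factory_resource_id: str, description_root: str) -> None:
    """Reject a description whose type does not match the factory resource."""
    expected = kind_for_factory(factory_resource_id).description_root
    if description_root != expected:
        raise ValueError(
            f"factory {factory_resource_id!r} expects <{expected}>, "
            f"got <{description_root}>"
        )
```

For example, check_description("PBS", "job") passes silently, while check_description("Multi", "job") raises ValueError.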

2. Semantics and syntax of the WSDL

2.1. Protocol overview

WS-GRAM allows for remote execution and management of programs through the creation of a managed job. The management of the job is taken care of primarily by core toolkit functionality (the WS-ResourceLifetime and WS-BaseNotification implementations). Please see the Java WS Core documentation on notifications and resource lifetime (destruction) for more information.

2.1.1. Managed Job Factory Service (MJFS)

A single MJFS is used to create all jobs for all users. For each local resource manager, a dedicated Managed Job Factory Resource (MJFR) enables the MJFS to publish information about the characteristics of the compute resource, for example:

  • host information
  • GridFTP URL (for file staging and streaming)
  • compute cluster size and configuration, and so on...

In addition, there is a special MJFR which is used for creating MMJRs.

2.1.2. Managed Executable Job Service (MEJS)

A single MEJS is used to manage all executable jobs for all users. Each Managed Executable Job Resource (MEJR) enables the MEJS to publish information about the individual job the MEJR represents. This information can be accessed by querying the MEJS for the resource properties of a given MEJR, such as:

  • current job state
  • stdout location
  • stderr location
  • exit code, and so on.

2.1.3. Managed Multi-Job Service (MMJS)

A single MMJS is used to manage all multi-jobs for all users. Each Managed Multi-Job Resource (MMJR) enables the MMJS to publish information about the individual multi-job the MMJR represents. This information can be accessed by querying the MMJS for the resource properties of a given MMJR, such as:

  • current overall job state
  • list of sub-job EPRs

2.2. Operations

There are just two operations defined in the GRAM port types (not counting the Rendezvous port type which is used for MPI job synchronization): "createManagedJob" in the Managed Job Factory port type, and "release" in the Managed Job port type. All other operations (such as canceling/killing the job and querying for resource properties) are provided by the underlying WSRF implementation of the toolkit.

2.2.1. ManagedJobFactoryPortType

  • createManagedJob: This operation creates either an MEJR or an MMJR, subscribes the client for notifications if requested, and replies with one or two endpoint references (EPRs). The input of this operation consists of a job description, an optional initial termination time for the job resource, and an optional state notification subscription request.

The first EPR:

  • is qualified with the identifier of the newly created MEJR or MMJR
  • points to either the MEJS or MMJS.

The second EPR:

  • is only present if a notification subscription was requested
  • is qualified with the identifier of the newly created subscription resource
  • points to the subscription manager service.

Using the optional subscription request is an efficient way to subscribe to the newly created MEJR or MMJR without additional round-trip messages. Clients that subscribe afterwards must also check the current status of the job, because the inherent race condition means that some state changes may have occurred before the separate subscription request was processed. In either case, there is a slight risk of lost notifications, because the notification delivery mechanism in WS-BaseNotification offers no reliability guarantees.
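
The race described above can be illustrated with a small in-memory simulation. It does not use any toolkit API: SimulatedJob and late_subscribe are hypothetical names, and the state names are illustrative placeholders. It only shows why a client that subscribes in a separate step should also query the current state once.

```python
from typing import Callable, List


class SimulatedJob:
    """Stand-in for a job resource that notifies listeners of state changes."""

    def __init__(self) -> None:
        self.state = "Unsubmitted"
        self._listeners: List[Callable[[str], None]] = []

    def subscribe(self, listener: Callable[[str], None]) -> None:
        self._listeners.append(listener)

    def advance(self, new_state: str) -> None:
        self.state = new_state
        for listener in self._listeners:
            listener(new_state)


def late_subscribe(job: SimulatedJob, seen: List[str]) -> None:
    """Subscribe after creation, then query once to cover missed changes."""
    job.subscribe(seen.append)
    seen.append(job.state)  # explicit status check closes the gap


job = SimulatedJob()
job.advance("Pending")  # happens before the client has subscribed

seen: List[str] = []
late_subscribe(job, seen)
job.advance("Active")
job.advance("Done")

print(seen)  # ['Pending', 'Active', 'Done']
```

Without the explicit status check in late_subscribe, the client would never learn about the "Pending" state. In a truly concurrent setting the query and a notification may report the same state twice, so a client written this way should tolerate duplicates.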

The ManagedJobFactoryPortType also has all the operations and publishes all the resource properties (via the MJFR) defined in the following WS-ResourceProperties port types:

  • GetResourceProperty
  • GetMultipleResourceProperties
  • QueryResourceProperties

2.2.2. ManagedJobPortType

  • release: This operation takes no parameters and returns nothing. Its purpose is to release a hold placed on a state through the use of the "holdState" field in the job description. See the domain-specific WS GRAM component documentation for more information on the "holdState" field.
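
The interaction between "holdState" and release can be pictured with the toy model below. It is a sketch under assumptions, not the service logic: the HeldJob class is hypothetical, the state names are illustrative, and the sketch assumes that the job simply stops advancing when it reaches the held state until release is called.

```python
class HeldJob:
    """Toy job that pauses at a chosen state until release() is called."""

    def __init__(self, lifecycle, hold_state=None):
        self._lifecycle = list(lifecycle)
        self._index = 0
        self._hold_state = hold_state
        self.holding = False  # mirrors the idea of a "holding" indicator

    @property
    def state(self):
        return self._lifecycle[self._index]

    def step(self):
        """Advance one state unless a hold is in effect or the job is finished."""
        if self.holding or self._index == len(self._lifecycle) - 1:
            return self.state
        self._index += 1
        if self.state == self._hold_state:
            self.holding = True
        return self.state

    def release(self):
        """Takes no parameters and returns nothing; it only lifts the hold."""
        self.holding = False
        self._hold_state = None  # assumption: the hold applies only once


# Illustrative state names, chosen for this sketch only.
job = HeldJob(["Unsubmitted", "StageIn", "Pending", "Active", "Done"],
              hold_state="StageIn")
job.step()      # reaches "StageIn" and starts holding
job.step()      # still "StageIn": the hold blocks progress
job.release()   # client lifts the hold
job.step()      # moves on to "Pending"
```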

The ManagedJobPortType also has all the operations and publishes all the resource properties (via the MEJR or MMJR) defined in the following port types:

WS-ResourceProperties port types:

  • GetResourceProperty
  • GetMultipleResourceProperties
  • QueryResourceProperties

WS-ResourceLifetime port types:

  • ScheduledResourceTermination
  • ImmediateResourceTermination

WS-BaseNotification port type:

  • NotificationProducer

2.2.3. Managed Executable Job Port Type

This port type does not define any new operations. See Section 2.3, “Resource properties”.

2.2.4. Managed Multi-Job Port Type

This port type does not define any new operations. See Section 2.3, “Resource properties”.

2.3. Resource properties

2.3.1. Managed Job Factory Port Type

  • {http://www.globus.org/namespaces/2004/10/gram/job}condorArchitecture Condor architecture label.
  • {http://www.globus.org/namespaces/2004/10/gram/job}condorOS Condor OS label.
  • {http://www.globus.org/namespaces/2004/10/gram/job}delegationFactoryEndpoint The endpoint reference to the delegation factory used to delegate credentials to the job.
  • {http://mds.globus.org/glue/ce/1.1}GLUECE GLUE data
  • {http://mds.globus.org/glue/ce/1.1}GLUECESummary GLUE data summary
  • {http://www.globus.org/namespaces/2004/10/gram/job}globusLocation The location of the Globus Toolkit installation that these services are running under.
  • {http://www.globus.org/namespaces/2004/10/gram/job}hostCPUType The job host CPU architecture (i686, x86_64, etc.)
  • {http://www.globus.org/namespaces/2004/10/gram/job}hostManufacturer The host manufacturer name. May be "unknown".
  • {http://www.globus.org/namespaces/2004/10/gram/job}hostOSName The host OS name (Linux, Solaris, etc.)
  • {http://www.globus.org/namespaces/2004/10/gram/job}hostOSVersion The host OS version.
  • {http://www.globus.org/namespaces/2004/10/gram/job}localResourceManager The local resource manager type (e.g. Condor, Fork, LSF, Multi, PBS)
  • {http://mds.globus.org/metadata/2005/02}ServiceMetaDataInfo service start time, Globus Toolkit(R) version, service type name
  • {http://www.globus.org/namespaces/2004/10/gram/job}scratchBaseDirectory The directory recommended by the system administrator to be used for temporary job data.
  • {http://www.globus.org/namespaces/2004/10/gram/job}stagingDelegationFactoryEndpoint The endpoint reference to the delegation factory used to delegate credentials to the staging service (RFT).
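
The resource property names in this section are written in the {namespace}localName form. As a convenience for client code that needs the two parts separately, a minimal parser might look like the following sketch. The QualifiedName type and parse_property_name function are hypothetical helpers, not part of any toolkit API.

```python
from typing import NamedTuple


class QualifiedName(NamedTuple):
    namespace: str
    local_name: str


def parse_property_name(text: str) -> QualifiedName:
    """Split '{namespace}localName' into its namespace and local name."""
    if not text.startswith("{") or "}" not in text:
        raise ValueError(f"not in {{namespace}}localName form: {text!r}")
    namespace, local_name = text[1:].split("}", 1)
    if not namespace or not local_name:
        raise ValueError(f"empty namespace or local name: {text!r}")
    return QualifiedName(namespace, local_name)


name = parse_property_name(
    "{http://www.globus.org/namespaces/2004/10/gram/job}localResourceManager"
)
print(name.namespace)   # http://www.globus.org/namespaces/2004/10/gram/job
print(name.local_name)  # localResourceManager
```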

2.3.2. Managed Job Port Type

  • {http://www.globus.org/namespaces/2004/09/rendezvous}Capacity Used for Rendezvous.
  • {http://docs.oasis-open.org/wsrf/2004/06/wsrf-WS-ResourceLifetime-1.2-draft-01.xsd}CurrentTime The current time as known by the resource.
  • {http://docs.oasis-open.org/wsn/2004/06/wsn-WS-BaseNotification-1.2-draft-01.xsd}FixedTopicSet Indicates whether the set of topics supported by the notification producer is fixed.
  • {http://www.globus.org/namespaces/2004/10/gram/job/faults}fault The fault (if generated) indicating the reason for failure of the job to complete.
  • {http://www.globus.org/namespaces/2004/10/gram/job/types}holding Indicates whether a hold has been placed on this job.
  • {http://www.globus.org/namespaces/2004/10/gram/job/types}localUserId The job owner's local user account name.
  • {http://www.globus.org/namespaces/2004/09/rendezvous}RegistrantData Used for Rendezvous.
  • {http://www.globus.org/namespaces/2004/09/rendezvous}RendezvousCompleted Used for Rendezvous.
  • {http://www.globus.org/namespaces/2005/5/gram/job/description}serviceLevelAgreement A wrapper around fields containing the single-job and multi-job descriptions or RSLs. Only one of these sub-fields shall have a non-null value.
  • {http://www.globus.org/namespaces/2004/10/gram/job/types}state The current state of the job.
  • {http://docs.oasis-open.org/wsrf/2004/06/wsrf-WS-ResourceLifetime-1.2-draft-01.xsd}TerminationTime Time when the resource expires.
  • {http://docs.oasis-open.org/wsn/2004/06/wsn-WS-BaseNotification-1.2-draft-01.xsd}Topic Used in notification.
  • {http://docs.oasis-open.org/wsn/2004/06/wsn-WS-BaseNotification-1.2-draft-01.xsd}TopicExpressionDialects Used in notification.
  • {http://www.globus.org/namespaces/2004/10/gram/job/types}userSubject The GSI certificate DN of the job owner.

2.3.3. Managed Executable Job Port Type

  • {http://www.globus.org/namespaces/2004/09/rendezvous}Capacity Used for Rendezvous.
  • {http://docs.oasis-open.org/wsrf/2004/06/wsrf-WS-ResourceLifetime-1.2-draft-01.xsd}CurrentTime The current time as known by the resource.
  • {http://www.globus.org/namespaces/2005/09/gram/job/exec}credentialPath The path (relative to the job process) to the file containing the user proxy used by the job to authenticate out to other services.
  • {http://www.globus.org/namespaces/2004/10/gram/job/types}exitCode The exit code generated by the job process.
  • {http://docs.oasis-open.org/wsn/2004/06/wsn-WS-BaseNotification-1.2-draft-01.xsd}FixedTopicSet Indicates whether the set of topics supported by the notification producer is fixed.
  • {http://www.globus.org/namespaces/2004/10/gram/job/faults}fault The fault (if generated) indicating the reason for failure of the job to complete.
  • {http://www.globus.org/namespaces/2004/10/gram/job/types}holding Indicates whether a hold has been placed on this job.
  • {http://www.globus.org/namespaces/2004/10/gram/job/types}localUserId The job owner's local user account name.
  • {http://www.globus.org/namespaces/2004/09/rendezvous}RegistrantData Used for Rendezvous.
  • {http://www.globus.org/namespaces/2004/09/rendezvous}RendezvousCompleted Used for Rendezvous.
  • {http://www.globus.org/namespaces/2005/5/gram/job/description}serviceLevelAgreement A wrapper around fields containing the single-job and multi-job descriptions or RSLs. Only one of these sub-fields shall have a non-null value.
  • {http://www.globus.org/namespaces/2004/10/gram/job/types}state The current state of the job.
  • {http://www.globus.org/namespaces/2005/09/gram/job/exec}stderrURL A GridFTP URL to the file generated by the job which contains the stderr.
  • {http://www.globus.org/namespaces/2005/09/gram/job/exec}stdoutURL A GridFTP URL to the file generated by the job which contains the stdout.
  • {http://docs.oasis-open.org/wsrf/2004/06/wsrf-WS-ResourceLifetime-1.2-draft-01.xsd}TerminationTime Time when the resource expires.
  • {http://docs.oasis-open.org/wsn/2004/06/wsn-WS-BaseNotification-1.2-draft-01.xsd}Topic Used in notification.
  • {http://docs.oasis-open.org/wsn/2004/06/wsn-WS-BaseNotification-1.2-draft-01.xsd}TopicExpressionDialects Used in notification.
  • {http://www.globus.org/namespaces/2004/10/gram/job/types}userSubject The GSI certificate DN of the job owner.

2.3.4. Managed Multi-Job Port Type

  • {http://www.globus.org/namespaces/2004/09/rendezvous}Capacity Used for Rendezvous.
  • {http://docs.oasis-open.org/wsrf/2004/06/wsrf-WS-ResourceLifetime-1.2-draft-01.xsd}CurrentTime The current time as known by the resource.
  • {http://docs.oasis-open.org/wsn/2004/06/wsn-WS-BaseNotification-1.2-draft-01.xsd}FixedTopicSet Indicates whether the set of topics supported by the notification producer is fixed.
  • {http://www.globus.org/namespaces/2004/10/gram/job/faults}fault The fault (if generated) indicating the reason for failure of the job to complete.
  • {http://www.globus.org/namespaces/2004/10/gram/job/types}holding Indicates whether a hold has been placed on this job.
  • {http://www.globus.org/namespaces/2004/10/gram/job/types}localUserId The job owner's local user account name.
  • {http://www.globus.org/namespaces/2004/09/rendezvous}RegistrantData Used for Rendezvous.
  • {http://www.globus.org/namespaces/2004/09/rendezvous}RendezvousCompleted Used for Rendezvous.
  • {http://www.globus.org/namespaces/2005/5/gram/job/description}serviceLevelAgreement A wrapper around fields containing the single-job and multi-job descriptions or RSLs. Only one of these sub-fields shall have a non-null value.
  • {http://www.globus.org/namespaces/2004/10/gram/job/types}state The current state of the job.
  • {http://www.globus.org/namespaces/2004/10/gram/job/multi}subJobEndpoint A set of endpoint references to the sub-jobs created by this multi-job.
  • {http://docs.oasis-open.org/wsrf/2004/06/wsrf-WS-ResourceLifetime-1.2-draft-01.xsd}TerminationTime Time when the resource expires.
  • {http://docs.oasis-open.org/wsn/2004/06/wsn-WS-BaseNotification-1.2-draft-01.xsd}Topic Used in notification.
  • {http://docs.oasis-open.org/wsn/2004/06/wsn-WS-BaseNotification-1.2-draft-01.xsd}TopicExpressionDialects Used in notification.
  • {http://www.globus.org/namespaces/2004/10/gram/job/types}userSubject The GSI certificate DN of the job owner.
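
The three job port type listings above overlap heavily. The sketch below restates them as Python sets, using local names only with namespaces omitted for brevity, to make explicit which properties each specialized port type adds to the base Managed Job Port Type. The variable names are illustrative only.

```python
# Local names of the properties listed for the Managed Job Port Type.
MANAGED_JOB = {
    "Capacity", "CurrentTime", "FixedTopicSet", "fault", "holding",
    "localUserId", "RegistrantData", "RendezvousCompleted",
    "serviceLevelAgreement", "state", "TerminationTime", "Topic",
    "TopicExpressionDialects", "userSubject",
}

# Properties listed only for the Managed Executable Job Port Type.
MANAGED_EXECUTABLE_JOB = MANAGED_JOB | {
    "credentialPath", "exitCode", "stderrURL", "stdoutURL",
}

# Property listed only for the Managed Multi-Job Port Type.
MANAGED_MULTI_JOB = MANAGED_JOB | {"subJobEndpoint"}

print(sorted(MANAGED_EXECUTABLE_JOB - MANAGED_JOB))
# ['credentialPath', 'exitCode', 'stderrURL', 'stdoutURL']
print(sorted(MANAGED_MULTI_JOB - MANAGED_JOB))
# ['subJobEndpoint']
```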

3. Command-line tools

Please see the GT 4.0 WS GRAM Command-line Reference.

4. Graphical User Interface

There is no support for this type of interface for WS GRAM.

5. Semantics and syntax of domain-specific interface data

5.1. Single-Job Description

The general form of a job description used to start a single job (meant for creating a Managed Executable Job Resource instance) is as follows:

<job>
    <!--put additional elements here-->
    <executable><!--put executable path here--></executable>
    <!--put additional elements here-->
</job>

Here is a basic example of a job description for a single-job:

<job>
    <executable>/bin/echo</executable>
    <argument>Testing</argument>
    <argument>1...2...3</argument>
    <stdout>${GLOBUS_USER_HOME}/stdout</stdout>
    <stderr>${GLOBUS_USER_HOME}/stderr</stderr>
</job>
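
A single-job description of this shape can also be generated programmatically. The following is a minimal sketch using only the Python standard library; build_job_description is a hypothetical helper, and the element order simply follows the example above rather than being checked against the schema. ET.indent requires Python 3.9 or later.

```python
import xml.etree.ElementTree as ET


def build_job_description(executable, arguments=(), stdout=None, stderr=None):
    """Build a <job> element in the element order used by the example above."""
    job = ET.Element("job")
    ET.SubElement(job, "executable").text = executable
    for argument in arguments:
        ET.SubElement(job, "argument").text = argument
    if stdout is not None:
        ET.SubElement(job, "stdout").text = stdout
    if stderr is not None:
        ET.SubElement(job, "stderr").text = stderr
    return job


job = build_job_description(
    "/bin/echo",
    ["Testing", "1...2...3"],
    stdout="${GLOBUS_USER_HOME}/stdout",
    stderr="${GLOBUS_USER_HOME}/stderr",
)
ET.indent(job, space="    ")  # pretty-print; available in Python 3.9+
print(ET.tostring(job, encoding="unicode"))
```

The substitution variable ${GLOBUS_USER_HOME} is written out literally; the sketch makes no attempt to resolve it on the client side.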

5.2. Multi-Job Description

The general form of a job description used to start a multi-job (meant for creating a Managed Multi-Job Resource instance) is as follows:

<multiJob>
    <!--Put subjob default elements here.-->
    <job>
        <factoryEndpoint
                xmlns:gram="http://www.globus.org/namespaces/2004/10/gram/job"
                xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/03/addressing">
            <wsa:Address>
                <!--put ManagedJobFactoryService address here-->
            </wsa:Address>
            <wsa:ReferenceProperties>
                <gram:ResourceID><!--put scheduler type here--></gram:ResourceID>
            </wsa:ReferenceProperties>
        </factoryEndpoint>
        <executable><!--put executable path here--></executable>
    </job>
    <!--put additional job elements here-->
</multiJob>

Here is a basic example of a job description for a multi-job:

<multiJob>
    <executable>/bin/echo</executable>
    <stdout>${GLOBUS_USER_HOME}/stdout</stdout>
    <stderr>${GLOBUS_USER_HOME}/stderr</stderr>
    <job>
        <factoryEndpoint
                xmlns:gram="http://www.globus.org/namespaces/2004/10/gram/job"
                xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/03/addressing">
            <wsa:Address>
                https://mymachine.mydomain.com:8443/wsrf/services/ManagedJobFactoryService
            </wsa:Address>
            <wsa:ReferenceProperties>
                <gram:ResourceID>Pbs</gram:ResourceID>
            </wsa:ReferenceProperties>
        </factoryEndpoint>
        <argument>Testing</argument>
        <argument>1...2...3</argument>
    </job>
    <job>
        <factoryEndpoint
                xmlns:gram="http://www.globus.org/namespaces/2004/10/gram/job"
                xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/03/addressing">
            <wsa:Address>
                https://myothermachine.myotherdomain.org:8443/wsrf/services/ManagedJobFactoryService
            </wsa:Address>
            <wsa:ReferenceProperties>
                <gram:ResourceID>Pbs</gram:ResourceID>
            </wsa:ReferenceProperties>
        </factoryEndpoint>
        <argument>Hi There!</argument>
        <argument>Dear John!</argument>
    </job>
</multiJob>
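
To make the structure of a multi-job description concrete, the sketch below splits one into per-sub-job single-job descriptions, pairing each with its factory address. It is an illustration only and does not reproduce the MMJS logic: split_multi_job is a hypothetical function, and the merge rule (a default element applies only when the sub-job does not supply an element of the same name) is an assumption.

```python
import copy
import xml.etree.ElementTree as ET

WSA = "http://schemas.xmlsoap.org/ws/2004/03/addressing"


def split_multi_job(xml_text):
    """Return (factory_address, single_job_element) pairs, one per sub-job."""
    root = ET.fromstring(xml_text.strip())
    if root.tag != "multiJob":
        raise ValueError("expected a <multiJob> root element")
    defaults = [child for child in root if child.tag != "job"]
    results = []
    for sub_job in root.findall("job"):
        endpoint = sub_job.find("factoryEndpoint")
        if endpoint is None:
            raise ValueError("every sub-job needs a <factoryEndpoint>")
        address = endpoint.findtext(f"{{{WSA}}}Address", default="").strip()
        own = [c for c in sub_job if c.tag != "factoryEndpoint"]
        own_tags = {c.tag for c in own}
        single = ET.Element("job")
        # Assumed merge rule: defaults fill in only what the sub-job omits.
        for child in defaults:
            if child.tag not in own_tags:
                single.append(copy.deepcopy(child))
        for child in own:
            single.append(copy.deepcopy(child))
        results.append((address, single))
    return results


EXAMPLE = """
<multiJob>
    <executable>/bin/echo</executable>
    <job>
        <factoryEndpoint
                xmlns:gram="http://www.globus.org/namespaces/2004/10/gram/job"
                xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/03/addressing">
            <wsa:Address>
                https://mymachine.mydomain.com:8443/wsrf/services/ManagedJobFactoryService
            </wsa:Address>
            <wsa:ReferenceProperties>
                <gram:ResourceID>Pbs</gram:ResourceID>
            </wsa:ReferenceProperties>
        </factoryEndpoint>
        <argument>Testing</argument>
    </job>
</multiJob>
"""

for address, single_job in split_multi_job(EXAMPLE):
    print(address, [child.tag for child in single_job])
```

Because the parser requires well-formed XML, a description with an unbalanced job element is rejected with a parse error before any splitting takes place.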

5.3. Staging Directives

The WS-GRAM job description schema imports types from the RFT job description schema for specifying staging directives (i.e. fileStageIn, fileStageOut, and fileCleanUp). See the RFT domain-specific interface documentation for details on these imported types.

Since fileStageIn and fileStageOut are of type TransferRequestType and fileCleanUp is of type DeleteRequestType, mentally replace "transferRequest" with "fileStageIn" or "fileStageOut", and "deleteRequest" with "fileCleanUp" in the RFT domain-specific interface documentation. The Request Options section is particularly useful.
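
As a rough illustration of what such directives look like inside a job description, the sketch below builds a fileStageIn and a fileCleanUp element. The child element names (transfer, sourceUrl, destinationUrl, deletion, file) are assumptions about the imported RFT types and should be checked against the RFT documentation; the helper functions and the URLs are hypothetical.

```python
import xml.etree.ElementTree as ET


def stage_in(source_url, destination_url):
    """Build a <fileStageIn> with one transfer (assumed RFT element names)."""
    stage = ET.Element("fileStageIn")
    transfer = ET.SubElement(stage, "transfer")
    ET.SubElement(transfer, "sourceUrl").text = source_url
    ET.SubElement(transfer, "destinationUrl").text = destination_url
    return stage


def clean_up(file_url):
    """Build a <fileCleanUp> with one deletion (assumed RFT element names)."""
    cleanup = ET.Element("fileCleanUp")
    deletion = ET.SubElement(cleanup, "deletion")
    ET.SubElement(deletion, "file").text = file_url
    return cleanup


# Hypothetical URLs, for illustration only.
staged = stage_in(
    "gsiftp://submit.example.org:2811/bin/echo",
    "file:///${GLOBUS_USER_HOME}/my_echo",
)
removed = clean_up("file:///${GLOBUS_USER_HOME}/my_echo")

print(ET.tostring(staged, encoding="unicode"))
print(ET.tostring(removed, encoding="unicode"))
```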

5.4. Job Description Schema Reference

Please see the Job Description Schema documentation for details about the job description elements and substitution variables used to define GRAM jobs.

6. Configuration interface

Please see Configuring WS GRAM.

7. Environment variable interface

There is no support for this type of interface for WS GRAM.