This component consists abstractly of two interfaces: the Managed Job Factory Port Type (MJFPT) and the Managed Job Port Type (MJPT).
In practice there are three service/resource implementations, two of which implement the basic MJPT. The first is the service that talks to a particular local resource manager to execute a process on the remote computer or cluster. It is called the Managed Executable Job Service (MEJS), and its resource is called the Managed Executable Job Resource (MEJR). The second is a special implementation that accepts a multi-job description, breaks it up into single-job descriptions, and then submits each of these so-called "sub-jobs" to an MEJS. This implementation is called the Managed Multi-Job Service (MMJS), and its resource is called the Managed Multi-Job Resource (MMJR).
Because these two job services use the same port type, the API for accessing the MEJR and the MMJR is identical. The Managed Job Factory Service (MJFS) creates the appropriate job resource depending on the factory resource used to qualify the operation call. Most of the factory resources represent local resource managers used by the MEJS (PBS, LSF, Condor). There is also a special Multi factory resource, which represents an abstract multi-job resource manager. Each of the two types of managed job requires its own job description type.
Java API Documentation Links (Javadoc)
C API Documentation Links
WS-GRAM allows for remote execution and management of programs through the creation of a managed job. The management of the job is handled primarily by core toolkit functionality (the WS-ResourceLifetime and WS-BaseNotification implementations). Please see the Java WS Core documentation on notifications and resource lifetime (destruction) for more information.
A single MJFS is used to create all jobs for all users. For each local resource manager, a dedicated Managed Job Factory Resource (MJFR) enables the MJFS to publish information about the characteristics of the compute resource, for example:
- host information
- GridFTP URL (for file staging and streaming)
- compute cluster size and configuration, and so on...
In addition, there is a special MJFR which is used for creating MMJRs.
A single MEJS is used to manage all executable jobs for all users. Each Managed Executable Job Resource (MEJR) enables the MEJS to publish information about the individual job the MEJR represents. This information can be accessed by querying the MEJS for the resource properties of a given MEJR, such as the:
- current job state
- stdout location
- stderr location
- exit code, and so on.
A single MMJS is used to manage all multi-jobs for all users. Each Managed Multi-Job Resource (MMJR) enables the MMJS to publish information about the individual multi-job the MMJR represents. This information can be accessed by querying the MMJS for the resource properties of a given MMJR, such as the:
- current overall job state
- list of sub-job EPRs
There are just two operations defined in the GRAM port types (not counting the Rendezvous port type which is used for MPI job synchronization): "createManagedJob" in the Managed Job Factory port type, and "release" in the Managed Job port type. All other operations (such as canceling/killing the job and querying for resource properties) are provided by the underlying WSRF implementation of the toolkit.
createManagedJob:
This operation creates either an MEJR or an MMJR, subscribes the client for notifications if requested, and replies with one or two endpoint references (EPRs). The input of this operation consists of a job description, an optional initial termination time for the job resource, and an optional state notification subscription request.
The first EPR:
- is qualified with the identifier to the newly created MEJR or MMJR
- points to either the MEJS or MMJS.
The second EPR:
- is only present if a notification subscription was requested
- is qualified with the identifier to the newly created subscription resource
- points to the subscription manager service.
Using the optional subscription request provides an efficient means of subscribing to the newly created MEJR or MMJR without additional round-trip messages. Clients who subscribe afterwards must check the current status of the job, since the inherent race-condition means some state-changes may have occurred prior to the separate subscription request. In any event, there is a slight risk of lost notifications due to the lack of reliability guarantees in the notification delivery mechanism from WS-BaseNotification.
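The race described above can be illustrated with a small in-memory model. This is only a sketch: `FakeJob`, its methods, and the state names used here are hypothetical stand-ins chosen for illustration and are not part of any GRAM client API. It shows why a client that subscribes in a separate request should register first and then read the current state.

```python
class FakeJob:
    """Hypothetical stand-in for a job resource; not a GRAM API."""

    def __init__(self):
        self.state = "Pending"  # illustrative state name
        self.listeners = []

    def subscribe(self, callback):
        # Register a callback that receives every later state change.
        self.listeners.append(callback)

    def set_state(self, new_state):
        self.state = new_state
        for callback in self.listeners:
            callback(new_state)


seen = []
job = FakeJob()

# A state change that happens before the client subscribes produces
# no notification for that client.
job.set_state("Active")

# Separate subscription request: register first, then read the current
# state so that the earlier transition is not missed.
job.subscribe(seen.append)
current = job.state
if not seen or seen[-1] != current:
    seen.append(current)

# Later changes arrive through the subscription as usual.
job.set_state("Done")
print(seen)
```

Subscribing before reading the state means a change that lands between the two steps is delivered as a notification rather than lost. Requesting the subscription as part of createManagedJob avoids the extra check altogether.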
The ManagedJobFactoryPortType also has all the operations and publishes all the resource properties (via the MJFR) defined in the following WS-ResourceProperties port types:
GetResourceProperty
GetMultipleResourceProperties
QueryResourceProperties
release:
This operation takes no parameters and returns nothing. Its purpose is to release a hold placed on a state through the use of the "holdState" field in the job description. See the domain-specific WS GRAM component documentation for more information on the "holdState" field.
The ManagedJobPortType also has all the operations and publishes all the resource properties (via the MEJR or MMJR) defined in the following port types:
WS-ResourceProperties port types:
GetResourceProperty
GetMultipleResourceProperties
QueryResourceProperties
WS-ResourceLifetime port types:
ScheduledResourceTermination
ImmediateResourceTermination
WS-BaseNotification port type:
NotificationProducer
This port type does not define any new operations. See Section 2.3, “Resource properties”.
{http://www.globus.org/namespaces/2004/10/gram/job}condorArchitecture
Condor architecture label.
{http://www.globus.org/namespaces/2004/10/gram/job}condorOS
Condor OS label.
{http://www.globus.org/namespaces/2004/10/gram/job}delegationFactoryEndpoint
The endpoint reference to the delegation factory used to delegate credentials to the job.
{http://mds.globus.org/glue/ce/1.1}GLUECE
GLUE data.
{http://mds.globus.org/glue/ce/1.1}GLUECESummary
GLUE data summary.
{http://www.globus.org/namespaces/2004/10/gram/job}globusLocation
The location of the Globus Toolkit installation that these services are running under.
{http://www.globus.org/namespaces/2004/10/gram/job}hostCPUType
The job host CPU architecture (i686, x86_64, etc.).
{http://www.globus.org/namespaces/2004/10/gram/job}hostManufacturer
The host manufacturer name. May be "unknown".
{http://www.globus.org/namespaces/2004/10/gram/job}hostOSName
The host OS name (Linux, Solaris, etc.).
{http://www.globus.org/namespaces/2004/10/gram/job}hostOSVersion
The host OS version.
{http://www.globus.org/namespaces/2004/10/gram/job}localResourceManager
The local resource manager type (e.g. Condor, Fork, LSF, Multi, PBS).
{http://mds.globus.org/metadata/2005/02}ServiceMetaDataInfo
Service start time, Globus Toolkit(R) version, and service type name.
{http://www.globus.org/namespaces/2004/10/gram/job}scratchBaseDirectory
The directory recommended by the system administrator to be used for temporary job data.
{http://www.globus.org/namespaces/2004/10/gram/job}stagingDelegationFactoryEndpoint
The endpoint reference to the delegation factory used to delegate credentials to the staging service (RFT).
{http://www.globus.org/namespaces/2004/09/rendezvous}Capacity
Used for Rendezvous.
{http://docs.oasis-open.org/wsrf/2004/06/wsrf-WS-ResourceLifetime-1.2-draft-01.xsd}CurrentTime
Time of creation.
{http://docs.oasis-open.org/wsn/2004/06/wsn-WS-BaseNotification-1.2-draft-01.xsd}FixedTopicSet
???
{http://www.globus.org/namespaces/2004/10/gram/job/faults}fault
The fault (if generated) indicating the reason for failure of the job to complete.
{http://www.globus.org/namespaces/2004/10/gram/job/types}holding
Indicates whether a hold has been placed on this job.
{http://www.globus.org/namespaces/2004/10/gram/job/types}localUserId
The job owner's local user account name.
{http://www.globus.org/namespaces/2004/09/rendezvous}RegistrantData
Used for Rendezvous.
{http://www.globus.org/namespaces/2004/09/rendezvous}RendezvousCompleted
Used for Rendezvous.
{http://www.globus.org/namespaces/2005/5/gram/job/description}serviceLevelAgreement
A wrapper around fields containing the single-job and multi-job descriptions or RSLs. Only one of these sub-fields shall have a non-null value.
{http://www.globus.org/namespaces/2004/10/gram/job/types}state
The current state of the job.
{http://docs.oasis-open.org/wsrf/2004/06/wsrf-WS-ResourceLifetime-1.2-draft-01.xsd}TerminationTime
Time when the resource expires.
{http://docs.oasis-open.org/wsn/2004/06/wsn-WS-BaseNotification-1.2-draft-01.xsd}Topic
Used in notification.
{http://docs.oasis-open.org/wsn/2004/06/wsn-WS-BaseNotification-1.2-draft-01.xsd}TopicExpressionDialects
Used in notification.
{http://www.globus.org/namespaces/2004/10/gram/job/types}userSubject
The GSI certificate DN of the job owner.
{http://www.globus.org/namespaces/2004/09/rendezvous}Capacity
Used for Rendezvous.
{http://docs.oasis-open.org/wsrf/2004/06/wsrf-WS-ResourceLifetime-1.2-draft-01.xsd}CurrentTime
Time of creation.
{http://www.globus.org/namespaces/2005/09/gram/job/exec}credentialPath
The path (relative to the job process) to the file containing the user proxy used by the job to authenticate out to other services.
{http://www.globus.org/namespaces/2004/10/gram/job/types}exitCode
The exit code generated by the job process.
{http://docs.oasis-open.org/wsn/2004/06/wsn-WS-BaseNotification-1.2-draft-01.xsd}FixedTopicSet
???
{http://www.globus.org/namespaces/2004/10/gram/job/faults}fault
The fault (if generated) indicating the reason for failure of the job to complete.
{http://www.globus.org/namespaces/2004/10/gram/job/types}holding
Indicates whether a hold has been placed on this job.
{http://www.globus.org/namespaces/2004/10/gram/job/types}localUserId
The job owner's local user account name.
{http://www.globus.org/namespaces/2004/09/rendezvous}RegistrantData
Used for Rendezvous.
{http://www.globus.org/namespaces/2004/09/rendezvous}RendezvousCompleted
Used for Rendezvous.
{http://www.globus.org/namespaces/2005/5/gram/job/description}serviceLevelAgreement
A wrapper around fields containing the single-job and multi-job descriptions or RSLs. Only one of these sub-fields shall have a non-null value.
{http://www.globus.org/namespaces/2004/10/gram/job/types}state
The current state of the job.
{http://www.globus.org/namespaces/2005/09/gram/job/exec}stderrURL
A GridFTP URL to the file generated by the job which contains the stderr.
{http://www.globus.org/namespaces/2005/09/gram/job/exec}stdoutURL
A GridFTP URL to the file generated by the job which contains the stdout.
{http://docs.oasis-open.org/wsrf/2004/06/wsrf-WS-ResourceLifetime-1.2-draft-01.xsd}TerminationTime
Time when the resource expires.
{http://docs.oasis-open.org/wsn/2004/06/wsn-WS-BaseNotification-1.2-draft-01.xsd}Topic
Used in notification.
{http://docs.oasis-open.org/wsn/2004/06/wsn-WS-BaseNotification-1.2-draft-01.xsd}TopicExpressionDialects
Used in notification.
{http://www.globus.org/namespaces/2004/10/gram/job/types}userSubject
The GSI certificate DN of the job owner.
{http://www.globus.org/namespaces/2004/09/rendezvous}Capacity
Used for Rendezvous.
{http://docs.oasis-open.org/wsrf/2004/06/wsrf-WS-ResourceLifetime-1.2-draft-01.xsd}CurrentTime
Time of creation.
{http://docs.oasis-open.org/wsn/2004/06/wsn-WS-BaseNotification-1.2-draft-01.xsd}FixedTopicSet
???
{http://www.globus.org/namespaces/2004/10/gram/job/faults}fault
The fault (if generated) indicating the reason for failure of the job to complete.
{http://www.globus.org/namespaces/2004/10/gram/job/types}holding
Indicates whether a hold has been placed on this job.
{http://www.globus.org/namespaces/2004/10/gram/job/types}localUserId
The job owner's local user account name.
{http://www.globus.org/namespaces/2004/09/rendezvous}RegistrantData
Used for Rendezvous.
{http://www.globus.org/namespaces/2004/09/rendezvous}RendezvousCompleted
Used for Rendezvous.
{http://www.globus.org/namespaces/2005/5/gram/job/description}serviceLevelAgreement
A wrapper around fields containing the single-job and multi-job descriptions or RSLs. Only one of these sub-fields shall have a non-null value.
{http://www.globus.org/namespaces/2004/10/gram/job/types}state
The current state of the job.
{http://www.globus.org/namespaces/2004/10/gram/job/multi}subJobEndpoint
A set of endpoint references to the sub-jobs created by this multi-job.
{http://docs.oasis-open.org/wsrf/2004/06/wsrf-WS-ResourceLifetime-1.2-draft-01.xsd}TerminationTime
Time when the resource expires.
{http://docs.oasis-open.org/wsn/2004/06/wsn-WS-BaseNotification-1.2-draft-01.xsd}Topic
Used in notification.
{http://docs.oasis-open.org/wsn/2004/06/wsn-WS-BaseNotification-1.2-draft-01.xsd}TopicExpressionDialects
Used in notification.
{http://www.globus.org/namespaces/2004/10/gram/job/types}userSubject
The GSI certificate DN of the job owner.
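The resource property names listed above are written as a namespace URI in braces followed by a local name. Python's `xml.etree.ElementTree` uses the same `{namespace}local` notation for element tags, so a small helper can split such names when processing resource property documents. The following is a minimal sketch; `split_qname` and the `PROPERTIES` list are hypothetical names introduced here for illustration only.

```python
def split_qname(qname):
    """Split a '{namespace}local' name into (namespace, local)."""
    if not qname.startswith("{"):
        # No namespace part present.
        return ("", qname)
    namespace, _, local = qname[1:].partition("}")
    return (namespace, local)


# A few property names copied from the lists above.
PROPERTIES = [
    "{http://www.globus.org/namespaces/2004/10/gram/job/types}state",
    "{http://www.globus.org/namespaces/2004/10/gram/job/types}exitCode",
    "{http://www.globus.org/namespaces/2005/09/gram/job/exec}stdoutURL",
]

for qname in PROPERTIES:
    namespace, local = split_qname(qname)
    print(local, "is defined in", namespace)
```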
WSDL links:
Schema links:
Please see the GT 4.0 WS GRAM Command-line Reference.
The general form of a job description used to start a single job (meant for creating a Managed Executable Job Resource instance) is as follows:
<job>
    <!--put additional elements here-->
    <executable><!--put executable path here--></executable>
    <!--put additional elements here-->
</job>
Here is a basic example of a job description for a single-job:
<job>
    <executable>/bin/echo</executable>
    <argument>Testing</argument>
    <argument>1...2...3</argument>
    <stdout>${GLOBUS_USER_HOME}/stdout</stdout>
    <stderr>${GLOBUS_USER_HOME}/stderr</stderr>
</job>
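A single-job description like the one above can also be generated programmatically. The sketch below uses only Python's standard `xml.etree.ElementTree` module; `build_job_description` is a hypothetical helper written for this illustration, not part of the toolkit, and it covers only the elements shown in the example (no namespaces, staging, or other optional elements).

```python
import xml.etree.ElementTree as ET


def build_job_description(executable, arguments=(), stdout=None, stderr=None):
    """Return a <job> element holding the executable and optional fields."""
    job = ET.Element("job")
    ET.SubElement(job, "executable").text = executable
    for argument in arguments:
        ET.SubElement(job, "argument").text = argument
    if stdout is not None:
        ET.SubElement(job, "stdout").text = stdout
    if stderr is not None:
        ET.SubElement(job, "stderr").text = stderr
    return job


job = build_job_description(
    "/bin/echo",
    arguments=["Testing", "1...2...3"],
    stdout="${GLOBUS_USER_HOME}/stdout",
    stderr="${GLOBUS_USER_HOME}/stderr",
)

# Serialize to a string that can be written to a job description file.
xml_text = ET.tostring(job, encoding="unicode")
print(xml_text)
```

The substitution variable `${GLOBUS_USER_HOME}` is passed through as literal text; it is not expanded by this code.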
The general form of a job description used to start a multi-job (meant for creating a Managed Multi Job Resource instance) is as follows:
<multiJob>
    <!--Put subjob default elements here.-->
    <job>
        <factoryEndpoint
                xmlns:gram="http://www.globus.org/namespaces/2004/10/gram/job"
                xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/03/addressing">
            <wsa:Address>
                <!--put ManagedJobFactoryService address here-->
            </wsa:Address>
            <wsa:ReferenceProperties>
                <gram:ResourceID><!--put scheduler type here--></gram:ResourceID>
            </wsa:ReferenceProperties>
        </factoryEndpoint>
        <executable><!--put executable path here--></executable>
    </job>
    <!--put additional job elements here-->
</multiJob>
Here is a basic example of a job description for a multi-job:
<multiJob>
    <executable>/bin/echo</executable>
    <stdout>${GLOBUS_USER_HOME}/stdout</stdout>
    <stderr>${GLOBUS_USER_HOME}/stderr</stderr>
    <job>
        <factoryEndpoint
                xmlns:gram="http://www.globus.org/namespaces/2004/10/gram/job"
                xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/03/addressing">
            <wsa:Address>
                https://mymachine.mydomain.com:8443/wsrf/services/ManagedJobFactoryService
            </wsa:Address>
            <wsa:ReferenceProperties>
                <gram:ResourceID>Pbs</gram:ResourceID>
            </wsa:ReferenceProperties>
        </factoryEndpoint>
        <argument>Testing</argument>
        <argument>1...2...3</argument>
    </job>
    <job>
        <factoryEndpoint
                xmlns:gram="http://www.globus.org/namespaces/2004/10/gram/job"
                xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/03/addressing">
            <wsa:Address>
                https://myothermachine.myotherdomain.org:8443/wsrf/services/ManagedJobFactoryService
            </wsa:Address>
            <wsa:ReferenceProperties>
                <gram:ResourceID>Pbs</gram:ResourceID>
            </wsa:ReferenceProperties>
        </factoryEndpoint>
        <argument>Hi There!</argument>
        <argument>Dear John!</argument>
    </job>
</multiJob>
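Each sub-job in a multi-job description names the factory service and factory resource it should be submitted to. As a rough illustration of that structure, the sketch below parses a shortened, well-formed variant of the example and lists the target of each sub-job. `sub_job_targets` and `MULTI_JOB` are hypothetical names used only here, and the code relies solely on Python's standard library rather than any toolkit API.

```python
import xml.etree.ElementTree as ET

WSA = "http://schemas.xmlsoap.org/ws/2004/03/addressing"
GRAM = "http://www.globus.org/namespaces/2004/10/gram/job"

# Shortened variant of the multi-job example, with two sub-jobs.
MULTI_JOB = """
<multiJob xmlns:gram="http://www.globus.org/namespaces/2004/10/gram/job"
          xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/03/addressing">
  <executable>/bin/echo</executable>
  <job>
    <factoryEndpoint>
      <wsa:Address>
        https://mymachine.mydomain.com:8443/wsrf/services/ManagedJobFactoryService
      </wsa:Address>
      <wsa:ReferenceProperties>
        <gram:ResourceID>Pbs</gram:ResourceID>
      </wsa:ReferenceProperties>
    </factoryEndpoint>
    <argument>Testing</argument>
  </job>
  <job>
    <factoryEndpoint>
      <wsa:Address>
        https://myothermachine.myotherdomain.org:8443/wsrf/services/ManagedJobFactoryService
      </wsa:Address>
      <wsa:ReferenceProperties>
        <gram:ResourceID>Pbs</gram:ResourceID>
      </wsa:ReferenceProperties>
    </factoryEndpoint>
    <argument>Hi There!</argument>
  </job>
</multiJob>
"""


def sub_job_targets(multi_job_xml):
    """Return a (factory address, resource ID) pair for every sub-job."""
    root = ET.fromstring(multi_job_xml.strip())
    targets = []
    for job in root.findall("job"):
        endpoint = job.find("factoryEndpoint")
        address = endpoint.find(f"{{{WSA}}}Address").text.strip()
        resource_id = endpoint.find(
            f"{{{WSA}}}ReferenceProperties/{{{GRAM}}}ResourceID"
        ).text.strip()
        targets.append((address, resource_id))
    return targets


for address, resource_id in sub_job_targets(MULTI_JOB):
    print(resource_id, "at", address)
```

For brevity the namespace prefixes are declared once on the root element here instead of on each `factoryEndpoint`; both placements are equivalent to an XML parser.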
The WS-GRAM job description schema imports types from the RFT job description schema for specifying staging directives (i.e. fileStageIn, fileStageOut, and fileCleanUp). See the RFT domain-specific interface documentation for details on these imported types. Since fileStageIn and fileStageOut are of type TransferRequestType and fileCleanUp is of type DeleteRequestType, mentally replace "transferRequest" with "fileStageIn" or "fileStageOut", and "deleteRequest" with "fileCleanUp", when reading the RFT domain-specific interface documentation. The Request Options section is particularly useful.
Please see the Job Description Schema documentation for details about the job description elements and substitution variables used to define GRAM jobs.
Please see Configuring WS GRAM.