GT 4.0 DRS: User's Guide

1. Introduction

The Data Replication Service (DRS) is a technical preview provided with the Globus Toolkit 4.0 and first appeared in the GT 3.9.5 Beta release. The primary functionality of the component allows users to identify a set of desired files existing in their Grid environment, to make local replicas of those data files by transferring files from one or more source locations, and to register the new replicas in a Replica Location Service. The DRS conforms to the WS-RF specification and exposes a WS-Resource (called a "Replicator" resource) which represents the current state of the requested replication activity and allows users to query or subscribe to various Resource Properties in order to monitor the state of the resource. The DRS is built on the GT 4.0 Java WS Core and uses the Globus RLS to locate and register replicas and the Globus RFT to transfer files.

2. Command-line tools

Please see the GT 4.0 DRS Command-line Reference.

3. Usage scenarios

This section describes a few key usage scenarios and provides examples of using the DRS command-line tools.

3.1. Generate a valid proxy

Before using any of the tools, you must generate a valid user proxy with grid-proxy-init.

% $GLOBUS_LOCATION/bin/grid-proxy-init
Your identity: /O=Grid/OU=GlobusTest/OU=simpleCA.mymachine/OU=mymachine/CN=John Doe
Enter GRID pass phrase for this identity:
Creating proxy ................................. Done
Your proxy is valid until: Tue Oct 26 01:33:42 2004
        

3.2. Delegate user credentials

Once you have generated a valid proxy, you must create a delegated credential. The DRS uses your delegated credential to make secure calls to other services (e.g., RLS and RFT) in order to perform the data replication. Give the delegated credential a lifetime long enough to cover the running time of your replication activities. To delegate your credential, use globus-credential-delegate.

% $GLOBUS_LOCATION/bin/globus-credential-delegate -h myhostname \
 -p 8443 mycredential.epr
EPR will be written to: mycredential.epr
Delegated credential EPR:
Address: https://128.9.72.118:8443/wsrf/services/DelegationService
Reference property[0]:
<ns1:DelegationKey xmlns:ns1="http://www.globus.org/08/2004/delegationService"
>3b6cb210-e9b2-11d9-ab74-f7fa10f094cd</ns1:DelegationKey>
        

3.3. Replication request file

A key parameter for any replication request is the request file. The replication request file is a text file containing CRLF-terminated rows, each holding a tab-delimited pair: a Logical File Name (LFN) and a destination URL. An example of such a file is shown below.

% cat testrun.req
testrun-1      gsiftp://myhost:9001/sandbox/files/testrun-1
testrun-2      gsiftp://myhost:9001/sandbox/files/testrun-2
testrun-3      gsiftp://myhost:9001/sandbox/files/testrun-3
testrun-4      gsiftp://myhost:9001/sandbox/files/testrun-4
testrun-5      gsiftp://myhost:9001/sandbox/files/testrun-5
        
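Because the tab delimiter is easy to lose when a request file is edited by hand, it can help to generate the file instead. The following is a minimal sketch, not part of the DRS tools: make_request_rows is a hypothetical helper name, and the destination prefix and LFNs are simply the values from the example above. It writes CRLF line endings because that is how the format is described here.

```shell
# Hypothetical helper (not part of DRS): print one request-file row per LFN.
# Usage: make_request_rows DEST_PREFIX LFN...
make_request_rows() {
    prefix=$1
    shift
    for lfn in "$@"; do
        # A literal tab separates the LFN from its destination URL;
        # each row ends with CR LF, as the format description states.
        printf '%s\t%s\r\n' "$lfn" "$prefix/$lfn"
    done
}

# Example:
#   make_request_rows gsiftp://myhost:9001/sandbox/files \
#       testrun-1 testrun-2 testrun-3 > testrun.req
```

Using printf with an explicit \t keeps the delimiter independent of editor settings that silently expand tabs to spaces.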

3.4. Create replication resource

The initial step for any replication is to create the replication resource. Creating the resource requires an available DRS service, a delegated credential, and a properly formatted replication request file. The replication request file must be specified by its URL. Currently supported URL schemes for the request file are file, http, and ftp. If the replication client runs on the same host as the service, the file scheme is appropriate; if the client is remote, then one of the latter schemes must be used. It is good practice to specify a filename in which to save the replication resource's endpoint reference. The endpoint reference is required for all other operations on the resource, such as getting resource properties, starting, stopping, and destroying it. Numerous options are available to influence the behavior of the data replication activities (see globus-replication-create(1)). One option of particular interest is the --start option, which starts the replication activities immediately after the replication resource is created. An example of using the globus-replication-create(1) tool is shown below.

% $GLOBUS_LOCATION/bin/globus-replication-create -s \
 https://myhost:8443/wsrf/services/ReplicationService \
 -C mycredential.epr -V myreplicator.epr file:///scratch/testrun.req
        

This command does not write to stdout when successful unless the --debug option is specified.

3.5. Start replication

Once a replication resource has been created, the replication activities may be started. As mentioned in Section 3.4, “Create replication resource”, the replication may be started immediately after it is created. If the immediate start option was not specified, the globus-replication-start(1) tool must be used to start the replication.

% $GLOBUS_LOCATION/bin/globus-replication-start -e myreplicator.epr
        

No output is expected from this command when successful.

3.6. Get replication resource properties

Throughout the lifecycle of the replication resource, and after it completes, you can look up its Resource Properties to monitor its state. The standard WS-RF port types are supported, so the supplied tools (e.g., wsrf-get-property) may be used with the DRS and its resources.

% $GLOBUS_LOCATION/bin/wsrf-get-property -e myreplicator.epr \
 "{http://www.globus.org/namespaces/2005/05/replica/replicator}status"
<ns1:status xmlns:ns1="http://www.globus.org/namespaces/2005/05/replica/replicator">
Active</ns1:status>
        

The count property reports how many replication items are in each state:

% $GLOBUS_LOCATION/bin/wsrf-get-property -e myreplicator.epr \
 "{http://www.globus.org/namespaces/2005/05/replica/replicator}count"
<ns1:count xmlns:ns1="http://www.globus.org/namespaces/2005/05/replica/replicator">
 <ns1:total>10</ns1:total>
 <ns1:finished>0</ns1:finished>
 <ns1:failed>0</ns1:failed>
 <ns1:terminated>0</ns1:terminated>
</ns1:count>
        
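When monitoring from a script, it is often enough to pull a single number out of the count property. The following is a rough sketch under stated assumptions: count_field is a hypothetical helper, not a DRS tool, and it assumes each child element appears on its own line with a namespace prefix, as in the output above. It is a line-oriented filter, not a real XML parser.

```shell
# Hypothetical helper (not part of DRS): read the XML printed for the
# "count" property on stdin and print the value of one child element.
# Usage: count_field NAME   (NAME is total, finished, failed, or terminated)
count_field() {
    # Match <prefix:NAME>digits</...> on a single line and print the digits.
    sed -n "s|.*<[A-Za-z0-9]*:$1>\\([0-9][0-9]*\\)</.*|\\1|p"
}

# Example (command and property name as shown above):
#   $GLOBUS_LOCATION/bin/wsrf-get-property -e myreplicator.epr \
#    "{http://www.globus.org/namespaces/2005/05/replica/replicator}count" \
#    | count_field finished
```

If the output layout differs from the sample (for example, all elements on one line), the filter prints nothing, which a calling script should treat as an error and not as zero.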

3.7. Find replication item status

Throughout the lifecycle of the replication resource, and after it completes, it may be helpful to find individual replication items in the resource and inspect the detailed status of their replication activities. The globus-replication-finditems(1) tool is used to find replication items. The following example finds items with a specified status (-S), starting at an offset into the result set (-O) and returning a limited number of items (-L).

% $GLOBUS_LOCATION/bin/globus-replication-finditems -e myreplicator.epr -S Pending -O 1 -L 2
<ns1:FindItemsResponse xmlns:ns1="http://www.globus.org/namespaces/2005/05/replica/replicator">
 <ns1:items xsi:type="ns1:ReplicationItemType" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <ns1:uri xsi:type="xsd:string" xmlns:xsd="http://www.w3.org/2001/XMLSchema">testrun-2</ns1:uri>
  <ns1:priority xsi:type="xsd:int" xmlns:xsd="http://www.w3.org/2001/XMLSchema">1000</ns1:priority>
  <ns1:status xsi:type="ns1:ReplicationItemStatusEnumerationType">Pending</ns1:status>
  <ns1:destinations xsi:type="ns1:DestinationType">
   <ns1:uri xsi:type="xsd:string" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
     gsiftp://myhost:9001/sandbox/files/testrun-2</ns1:uri>
   <ns1:status xsi:type="ns1:DestinationStatusEnumerationType">Pending</ns1:status>
  </ns1:destinations>
 </ns1:items>
 <ns1:items xsi:type="ns1:ReplicationItemType" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <ns1:uri xsi:type="xsd:string" xmlns:xsd="http://www.w3.org/2001/XMLSchema">testrun-3</ns1:uri>
  <ns1:priority xsi:type="xsd:int" xmlns:xsd="http://www.w3.org/2001/XMLSchema">1000</ns1:priority>
  <ns1:status xsi:type="ns1:ReplicationItemStatusEnumerationType">Pending</ns1:status>
  <ns1:destinations xsi:type="ns1:DestinationType">
   <ns1:uri xsi:type="xsd:string" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
     gsiftp://myhost:9001/sandbox/files/testrun-3</ns1:uri>
   <ns1:status xsi:type="ns1:DestinationStatusEnumerationType">Pending</ns1:status>
  </ns1:destinations>
 </ns1:items>
</ns1:FindItemsResponse>
        

3.8. Destroy replication resource

When the replication is complete, the replication resource may be destroyed. Destroying the replication resource frees system resources (namely, memory), especially when the replication involved a large number of individual replication activities (i.e., many files specified in the replication request file). The standard WS-RF port types are supported, so the supplied wsrf-destroy tool may be used to destroy the DRS resource.

% $GLOBUS_LOCATION/bin/wsrf-destroy -e myreplicator.epr
Destroy operation was successful
        

4. Troubleshooting

This section provides troubleshooting tips for problems that end users commonly encounter.

4.1. Authorization failure: expected hostname

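When authorization is enabled on the container, you may need to reference the DRS service by its proper hostname rather than by localhost. In the following example the service URL uses localhost, and the request fails because the expected target name does not match the identity presented by the host.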

% $GLOBUS_LOCATION/bin/globus-replication-create -s \
 https://localhost:8443/wsrf/services/ReplicationService \
 -C mycredential.epr -V myreplicator.epr file:///scratch/testrun.req
Error: ; nested exception is:
        org.globus.common.ChainedIOException: Authentication failed [Caused by:
        Operation unauthorized (Mechanism level: Authorization failed. Expected 
        "/CN=host/loopback" target but received "/C=US/O=Globus Alliance/OU=
        Service/CN=host/myhost")]
        
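One way to avoid this mistake in scripts is to build the service URL from an explicit host name and refuse loopback names. The following is only a sketch: drs_service_url is a hypothetical helper, and the default port 8443 and the service path are taken from the examples in this guide, not from any configuration lookup.

```shell
# Hypothetical helper (not part of DRS): build the service URL for a host,
# rejecting loopback names that will not match the host's identity.
# Usage: drs_service_url HOST [PORT]
drs_service_url() {
    case ${1:-} in
        localhost|127.*|'')
            echo "use the service's real host name, not '${1:-}'" >&2
            return 1
            ;;
    esac
    # Port defaults to 8443, the value used in this guide's examples.
    printf 'https://%s:%s/wsrf/services/ReplicationService\n' "$1" "${2:-8443}"
}

# Example:
#   url=$(drs_service_url myhost) || exit 1
```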

4.2. Cannot find request file

When using the DRS, ensure that the request file's filename is correct, that it is reachable by the DRS service, and that it has the appropriate permissions for the DRS service to access it.

% $GLOBUS_LOCATION/bin/globus-replication-create -s \
 https://myhost:8443/wsrf/services/ReplicationService -C mycredential.epr \
 -V myreplicator.epr file:///scratch/testrun
Error: java.rmi.RemoteException: Unable to create resource; nested exception is:  
        org.globus.wsrf.ResourceException: Failed to create Replication: 
        /scratch/testrun (No such file or directory); nested exception is:
        java.io.FileNotFoundException: /scratch/testrun (No such file or directory)
        
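When the client runs on the same host as the service and the file scheme is used, a quick local check can catch a mistyped path before the request is sent. The following is a minimal sketch: check_request_url is a hypothetical helper, and it tests readability for the user running the script, which may differ from the account the DRS service runs under.

```shell
# Hypothetical pre-flight check (not part of DRS): for a file:// URL, verify
# that the path exists and is readable on this machine.
# Usage: check_request_url URL
check_request_url() {
    case $1 in
        file://*)
            # Strip the scheme: file:///scratch/x becomes /scratch/x.
            path=${1#file://}
            if [ -r "$path" ]; then
                return 0
            fi
            echo "not readable: $path" >&2
            return 1
            ;;
        *)
            # http and ftp URLs are not checked by this sketch.
            return 0
            ;;
    esac
}

# Example:
#   check_request_url file:///scratch/testrun.req || exit 1
```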

4.3. Malformed request file

It is important to ensure that the request file is well-formed as specified. A malformed request file will result in a runtime exception.

% $GLOBUS_LOCATION/bin/globus-replication-create -s \
 https://myhost:8443/wsrf/services/ReplicationService -C mycredential.epr \
 -V myreplicator.epr file:///scratch/testrun.req
Error: java.rmi.RemoteException: Unable to create resource; nested exception is:  
        org.globus.wsrf.ResourceException: Failed to create Replication: String
        index out of range: -1; nested exception is:
        java.lang.StringIndexOutOfBoundsException: String index out of range: -1
        

The above error was produced by replacing a delimiting tab character with space characters.
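
A request file can be checked for this kind of mistake before it is submitted. The following is a minimal sketch, not part of the DRS tools: check_request_file is a hypothetical helper that only verifies the row layout described in Section 3.3 (two non-empty, tab-separated fields per row) and does not validate the URLs themselves.

```shell
# Hypothetical pre-flight check (not part of DRS): report rows that are not
# exactly two non-empty, tab-separated fields. The exit status is 0 when
# every row is well-formed and 1 otherwise.
# Usage: check_request_file FILE
check_request_file() {
    awk -F '\t' '
        { sub(/\r$/, "") }    # tolerate CRLF line endings
        NF != 2 || $1 == "" || $2 == "" {
            printf "line %d: expected LFN<TAB>URL\n", NR
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}

# Example:
#   check_request_file /scratch/testrun.req || exit 1
```

A row whose tab was replaced with spaces is seen as a single field, so it is reported by line number instead of surfacing later as a StringIndexOutOfBoundsException from the service.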