GT 4.0 RLS: User's Guide

1. Introduction

The Replica Location Service (RLS) maintains and provides access to mapping information from logical names for data items to target names.

RLS was co-developed by the Globus team and Work Package 2 of the DataGrid project. The RLS prototype is currently available as an alpha release for testing and evaluation. The distributed RLS is intended to replace the centralized Globus replica catalog available in earlier GT 2.x releases, and is designed to provide higher performance, reliability, and scalability than that centralized catalog.

Replication of data items can reduce access latency, improve data locality, and increase robustness, scalability, and performance for distributed applications. An RLS typically does not operate in isolation but functions as one component of a data grid architecture; other components include services for reliable file transfer, metadata management, reliable replication, and workflow management.

The RLS implementation is based on the following mechanisms:

  • Consistent local state maintained in Local Replica Catalogs (LRCs). Each local catalog maintains mappings between arbitrary logical file names (LFNs) and the physical file names (PFNs) associated with those LFNs on its storage system(s).
  • Collective state with relaxed consistency maintained in Replica Location Indices (RLIs). Each RLI contains a set of mappings from LFNs to LRCs. A variety of index structures can be defined with different performance characteristics simply by varying the number of RLIs and the amount of redundancy and partitioning among the RLIs.  
  • Soft state maintenance of RLI state. LRCs send information about their state to RLIs using soft state protocols. State information in RLIs times out and must be periodically refreshed by soft state updates.  
  • Compression of state updates. This optional compression uses Bloom filters to summarize the content of a Local Replica Catalog before sending a soft state update to a Replica Location Index Node.  
  • Membership and partitioning information maintenance. The current RLS implementation maintains static information about the LRCs and RLIs participating in the distributed system. As new implementations of the RLS are developed, they will use OGSA mechanisms for registration of services and for service lifetime management.
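
The mechanisms above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the RLS implementation or its API: the class and method names (BloomFilter, LocalReplicaCatalog, ReplicaLocationIndex, receive_update, candidate_lrcs), the filter size, and the hash construction are all hypothetical, and only the Bloom-filter form of soft state update is modeled.

```python
import hashlib
import time


class BloomFilter:
    """Fixed-size Bloom filter that summarizes a set of LFNs (hypothetical sizes)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, name):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{name}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, name):
        for pos in self._positions(name):
            self.bits |= 1 << pos

    def might_contain(self, name):
        # False positives are possible; false negatives are not.
        return all((self.bits >> pos) & 1 for pos in self._positions(name))


class LocalReplicaCatalog:
    """Consistent local state: LFN -> set of PFNs for one site."""

    def __init__(self, name):
        self.name = name
        self.mappings = {}

    def add_mapping(self, lfn, pfn):
        self.mappings.setdefault(lfn, set()).add(pfn)

    def lookup(self, lfn):
        return set(self.mappings.get(lfn, ()))

    def summary(self):
        # Compressed soft state update: a Bloom filter over all known LFNs.
        bloom = BloomFilter()
        for lfn in self.mappings:
            bloom.add(lfn)
        return bloom


class ReplicaLocationIndex:
    """Relaxed-consistency index: per-LRC summaries that expire unless refreshed."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.state = {}  # LRC name -> (summary, time received)

    def receive_update(self, lrc, now=None):
        now = time.time() if now is None else now
        self.state[lrc.name] = (lrc.summary(), now)

    def candidate_lrcs(self, lfn, now=None):
        now = time.time() if now is None else now
        # Drop entries whose soft state has timed out.
        self.state = {name: (summary, received)
                      for name, (summary, received) in self.state.items()
                      if now - received <= self.timeout}
        return sorted(name for name, (summary, _) in self.state.items()
                      if summary.might_contain(lfn))
```

In this model a client would ask the index for candidate LRCs and then query each candidate's lookup() for the actual PFNs. Because a Bloom filter can report false positives, a candidate LRC may turn out to hold no mapping for the requested LFN.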

2. Command line tools

Please see the RLS Command Reference.

3. Graphical user interfaces

RLS does not provide a graphical user interface.

4. Troubleshooting

Information on troubleshooting can be found in the FAQ.

5. Usage statistics collection by the Globus Alliance

By default, the RLS server sends the following usage statistics in a UDP packet:

  • Component identifier
  • Usage data format identifier
  • Time stamp
  • Source IP address
  • Source hostname (to differentiate between hosts with identical private IP addresses)
  • Version number
  • Uptime
  • LRC service indicator
  • RLI service indicator
  • Number of LFNs
  • Number of PFNs
  • Number of Mappings
  • Number of RLI LFNs
  • Number of RLI LRCs
  • Number of RLI Senders
  • Number of RLI Mappings
  • Number of threads
  • Number of connections

The RLS sends the usage statistics at server startup, at server shutdown, and once every 24 hours while the service is running.

If you wish to disable this feature, you can set the following environment variable before running the RLS:

export GLOBUS_USAGE_OPTOUT=1

By default, these usage statistics UDP packets are sent to usage-stats.globus.org:4180. They can be redirected to one or more other host:port targets, separated by spaces, with the following environment variable:

export GLOBUS_USAGE_TARGETS="myhost.mydomain:12345 myhost2.mydomain:54321"
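
To check where redirected packets end up, a small UDP listener bound to one of the target ports can be useful. The following is a minimal sketch, not part of RLS: the helper names (parse_targets, open_listener, describe_packet) are hypothetical, and because the packet layout is not described in this guide, the payload is only reported as a length and a hex string.

```python
import socket


def parse_targets(value):
    """Split a GLOBUS_USAGE_TARGETS-style string into (host, port) pairs."""
    targets = []
    for item in value.split():
        # Split on the last colon so the port is always the final field.
        host, _, port = item.rpartition(":")
        targets.append((host, int(port)))
    return targets


def open_listener(bind_host, port, timeout_seconds=5.0):
    """Return a UDP socket bound to bind_host:port (port 0 picks a free port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout_seconds)
    sock.bind((bind_host, port))
    return sock


def describe_packet(payload):
    """Summarize a raw datagram without assuming anything about its layout."""
    return f"{len(payload)} bytes: {payload.hex()}"
```

A possible use, assuming the listening host is one of the configured targets: call open_listener with that target's port, then pass the payload returned by recvfrom(65535) on the socket to describe_packet.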

You can also dump the usage statistics packets to stderr as they are sent, although most of the content is non-ASCII. To do so, set the following environment variable:

export GLOBUS_USAGE_DEBUG=MESSAGES
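
When the server is started from a script rather than an interactive shell, the same variables can be placed in the environment handed to the child process. The sketch below is an assumption-laden illustration: the function name usage_environment and its parameters are hypothetical, and only the three variable names and values shown in this section are taken from the guide.

```python
import os


def usage_environment(base=None, optout=False, targets=None, debug_messages=False):
    """Return a copy of an environment with the usage statistics variables set."""
    env = dict(os.environ if base is None else base)
    if optout:
        # Disables sending of usage statistics.
        env["GLOBUS_USAGE_OPTOUT"] = "1"
    if targets:
        # Space-separated host:port pairs, as in the export example above.
        env["GLOBUS_USAGE_TARGETS"] = " ".join(f"{host}:{port}" for host, port in targets)
    if debug_messages:
        # Dumps the packets to stderr as they are sent.
        env["GLOBUS_USAGE_DEBUG"] = "MESSAGES"
    return env
```

The returned dictionary can be passed as the env argument of subprocess.run, together with whatever command starts the RLS server at your site.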

Also, please see our policy statement on the collection of usage statistics.