GT 4.0 OGSA-DAI: Developer's Guide

1. Introduction

This guide contains information of interest to developers working with OGSA-DAI. It provides reference information for application developers, including APIs, architecture, and procedures for using the APIs.

2. Before you begin

2.1. Feature summary

Features new in release GT 4.0:

  • Access to data is provided via an "OGSA-DAI data service". For those that have previously used the OGSI version of OGSA-DAI, this service amalgamates the capabilities of the Grid Data Service Factory (GDSF) and Grid Data Service (GDS) services (the metadata and configuration roles of the GDSF and the metadata and perform document processing aspects of the GDS).
  • Each data service exposes zero or more data service resources. A data service resource manages exposure of and interaction with a data resource. A single data service resource can be viewed as offering capabilities analogous to a single GDS in the OGSI version of OGSA-DAI.
  • Multiple data resources can be accessed through a single service. Data service resource identifiers, available from the data service's WS-Addressing endpoint reference, allow a client to target a specific data service resource.
  • A listResources() operation is provided for a data service to list all the identifiers of the data service resources exposed by that service.
  • The data service resource identifiers returned by a data service can subsequently be used by a client to obtain metadata and other information about the data service resource corresponding to each identifier.
  • Access to data service resource metadata (such as database schemas, request status, etc.) is provided by an implementation of the WS-ResourceProperties specification. In particular, support is provided for the QueryResourceProperties, GetResourceProperties and GetMultipleResourceProperties portTypes.
  • Access to version information about the OGSA-DAI Data Service is available through the getVersion() operation.
  • A WSRF version of the OGSA-DAI GridDataTransport portType supports asynchronous data delivery between data services.
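
The relationship between a data service and its data service resources can be pictured with a small in-memory model. This is only a sketch, written in Python rather than against the OGSA-DAI Java API: the class names, identifier strings, version string and method signatures are all hypothetical. Only the operation names listResources() and getVersion() come from the feature list above.

```python
from dataclasses import dataclass, field


@dataclass
class DataServiceResource:
    """Hypothetical stand-in for a data service resource."""
    resource_id: str
    # Metadata such as a database schema or request status would be
    # exposed as resource properties by a real service.
    properties: dict = field(default_factory=dict)


class DataService:
    """Toy model of a data service exposing zero or more resources."""

    def __init__(self, version):
        self._version = version
        self._resources = {}

    def expose(self, resource):
        """Register a resource under its identifier."""
        self._resources[resource.resource_id] = resource

    def list_resources(self):
        """Mirror of listResources(): return every exposed identifier."""
        return sorted(self._resources)

    def get_version(self):
        """Mirror of getVersion()."""
        return self._version

    def get_resource_property(self, resource_id, name):
        """Look up one metadata item for the resource the client targets."""
        return self._resources[resource_id].properties[name]


# Illustrative identifiers and metadata values, not taken from OGSA-DAI.
service = DataService(version="example-version")
service.expose(DataServiceResource("MySQLResource", {"schema": "littleblackbook"}))
service.expose(DataServiceResource("XindiceResource", {"schema": "addressbook"}))
```

A client in this model would first call `list_resources()` and then use one of the returned identifiers to ask for metadata about that particular resource.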

Other Supported Features (features that continue to be supported from previous versions):

  • Perform documents.
  • Extensible activity framework.
  • Statement, delivery and transformation activities.

Deprecated Features

  • The OGSA-DAI Client Toolkit has not been updated yet to support interaction with WSRF-enabled OGSA-DAI services. This will be made available in a future version.
  • The OGSA-DAI registry service, DAISGR, has not been updated for WSRF. This will be addressed in a future version or provided via a third party component.
  • The old graphical demonstrator no longer works with this version of OGSA-DAI.

2.2. Tested platforms

2.3. Backward compatibility summary

Protocol changes since GT version 3.2:

  • Not backwards compatible with the previous OGSI version.

API changes since GT version 3.2:

  • None

Exception changes since GT version 3.2:

  • None

Schema changes since GT version 3.2:

  • WSDL changes to work with new Java WS Core.

2.4. Technology dependencies

OGSA-DAI depends on the following GT components:

  • Java WS Core

OGSA-DAI depends on the following 3rd party software:

  • Java 1.4.0 (OGSA-DAI WSRF has been tested on this version of Java, though it may work with other Java 1.4.x flavours).
  • Jakarta ANT 1.5 (see http://ant.apache.org).

Depending on the underlying data resource that OGSA-DAI is going to expose, you may need one or more of the following:

  • For relational databases such as MySQL, PostgreSQL, SQLServer, Oracle and DB2, the corresponding JDBC drivers are required.
  • For XML databases such as Xindice, the corresponding XMLDB drivers are required.
  • To use a full text search engine against flat files, Jakarta Lucene is required.

2.5. Security considerations

OGSA-DAI does not provide any security over and above that already provided by the Globus Toolkit. However, consideration must be given to the role mapping, which converts a grid credential to a database username/password. OGSA-DAI comes with two basic role mappers: a simple role mapper which reads this information from a plain text file, and one which encrypts the username/password information. More details can be found in the documentation bundled with the distribution. If neither of these schemes is secure enough, it is possible to replace them with your own implementation of the role mapper interface.
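
As a rough illustration of what a simple role mapper does, the sketch below maps the distinguished name (DN) of a grid credential to a database username and password read from plain text. The file layout (one `DN|username|password` entry per line), the function names and the example DN are assumptions made for this sketch. They are not the format or the interface used by OGSA-DAI; see the documentation in the distribution for the real role map format.

```python
import io


def load_role_map(lines):
    """Parse 'DN|username|password' lines into a lookup table.

    The pipe-separated layout is hypothetical, chosen only for this sketch.
    """
    role_map = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        dn, username, password = line.split("|")
        role_map[dn] = (username, password)
    return role_map


def map_credential(role_map, dn):
    """Return database credentials for a DN, or raise if it is unmapped."""
    try:
        return role_map[dn]
    except KeyError:
        raise PermissionError(f"no database role for {dn!r}") from None


# A file-like object stands in for the plain text role map file.
example_file = io.StringIO(
    "# DN|username|password\n"
    "/O=Grid/OU=Example/CN=Alice|alice_db|secret\n"
)
roles = load_role_map(example_file)
```

Storing passwords in readable form, as this sketch does, is the weakness that an encrypting role mapper is meant to reduce.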

3. Architecture and design overview

A high-level schematic representation of the OGSA-DAI architecture is shown in the diagram below. The diagram represents an end goal: not all of the components may yet be available within any one OGSA-DAI distribution. When you download an OGSA-DAI distribution you will only get components specific to one of the WS-I, WSRF or OGSI based infrastructures. The version of OGSA-DAI included in the GT 4.0 release only includes the WSRF components. Also note that there is no client toolkit for WSRF in this release. For information about other releases and their availability, please visit the project website www.ogsadai.org.uk.

OGSA-DAI High-level Architecture

The different components in this diagram are explained below, working from the bottom of the diagram up to the top.

3.1. Data Layer

The data layer consists of data resources which can be exposed via OGSA-DAI. Currently these include:

  • Relational data resources, e.g. MySQL, SQL Server, DB2, Oracle.
  • XML data resources, e.g. Xindice.
  • Files data resources, e.g. files and directories, OMIM, SWISSPROT and EMBL.

3.2. Data Layer-Business Logic Layer Interface

This interface communicates information between the business logic and data layers. This interface is provided by OGSA-DAI classes which invoke JDBC drivers, XMLDB drivers, or other OGSA-DAI classes to manage communications to and from data resources.

3.3. Business Logic Layer

This layer encapsulates the core functionality of OGSA-DAI. This layer consists of components which manage:

  • Execution of Perform documents which encapsulate a pipelined sequence of activities to be executed at the service. These could consist of queries or updates that operate on a data resource and/or data transformation and/or delivery operations acting on the incoming or outgoing data streams.
  • Preparation of responses to client requests for data resource query, update, transformation and delivery activities. Responses include execution status information and can also include data. Responses are in the form of Response documents.
  • Data transformation and delivery management.
  • Connection to, management of and interaction with data resources.

The business logic layer is termed the DAI-Core.
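
The pipelined execution of activities can be sketched as a chain of stages in which each activity consumes the output stream of the previous one. The sketch is in Python and does not use DAI-Core classes: the three activity functions, the `run_pipeline` helper and the shape of the response dictionary are hypothetical, and an in-memory SQLite database stands in for the data resource.

```python
import sqlite3


def sql_query(connection, expression):
    """Statement activity: run a query and yield rows as a data stream."""
    def activity(_upstream):
        yield from connection.execute(expression)
    return activity


def to_csv():
    """Transformation activity: turn each incoming row into a CSV line."""
    def activity(upstream):
        for row in upstream:
            yield ",".join(str(value) for value in row)
    return activity


def deliver_to(sink):
    """Delivery activity: append each item to a sink and pass it on."""
    def activity(upstream):
        for item in upstream:
            sink.append(item)
            yield item
    return activity


def run_pipeline(activities):
    """Connect activities in order and build a response-like result."""
    stream = iter(())
    for activity in activities:
        stream = activity(stream)
    try:
        data = list(stream)  # pulling from the last stage drives the pipeline
    except Exception as error:
        return {"status": "ERROR", "detail": str(error), "data": []}
    return {"status": "COMPLETED", "data": data}


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE people (id INTEGER, name TEXT)")
db.executemany("INSERT INTO people VALUES (?, ?)", [(1, "Ada"), (2, "Alan")])

delivered = []
response = run_pipeline([
    sql_query(db, "SELECT id, name FROM people ORDER BY id"),
    to_csv(),
    deliver_to(delivered),
])
```

Because every stage is lazy, no work happens until the final stage is read, and a failure in any activity surfaces as an error status in the response rather than as partial output.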

3.4. Presentation Layer-Business Logic Layer Interface

This interface communicates information between the presentation and business logic layers. This interface supports invocation of OGSA-DAI functionality within the business logic layer in a way that is independent of any Web or Grid environment, i.e. a way that is also suitable to allow non-Web-enabled clients to access OGSA-DAI functionality directly. This interface provides the following:

  • Components to extract information for the business logic layer from requests arriving via:

    • OGSA-DAI WSRF services.
    • OGSA-DAI WS-I services.
    • OGSA-DAI OGSI services.

    Note that WS-I is being used in a very specific way - services that only require those standards that are addressed in the WS-I Basic Profile 1.1 document.

  • Components to extract information from the business logic layer and build responses to be provided via:

    • OGSA-DAI WSRF services.
    • OGSA-DAI WS-I services.
    • OGSA-DAI OGSI services.

3.4.1. Information From Presentation Layer to Business Logic Layer

  • Client proxy certificates and credentials in a Web- and Grid-independent format.
  • Received data.
  • Perform documents from clients.
  • DAI-Core configuration information including data resource drivers, data resource URIs, database user names and passwords, information on supported activities and the legal form of Perform documents.

3.4.2. Information From Business Logic Layer to Presentation Layer

  • Response documents.
  • Data for delivery.
  • Data resource schema.
  • Information on data resource-related activities that can be requested by the user within Perform documents.

3.5. Presentation Layer

This layer encapsulates the functionality relating to exposing OGSA-DAI to a Grid via Web- or Grid-enabled interfaces. For each realisation there is associated WSDL and XML Schema describing the Web- or Grid-enabled interfaces. The following presentation layer interfaces are supported:

  • OGSA-DAI OGSI-compliant services based on the Globus Toolkit 3.2.
  • OGSA-DAI WSRF-compliant services based on the Globus Toolkit 4.0.
  • OGSA-DAI WS-I-compliant services based on Apache Axis 1.2.

3.6. Clients

OGSA-DAI can support access by any suitable OGSI-, WSRF- or WS-I-compliant client, depending on the OGSA-DAI presentation layer deployed at the server which the client is trying to access.

OGSA-DAI provides a Client Toolkit which offers a higher level of interaction with OGSA-DAI services than that supported by exchanging Perform and Response documents. However, the Client Toolkit is not yet supported by the WSRF version of OGSA-DAI. A version will be provided for the next WSRF version of OGSA-DAI; it will be available from the OGSA-DAI web site and will be included in the next Globus Toolkit release.

4. Public interface

The semantics and syntax of the APIs for this component can be found in the public interface guide.

5. Usage scenarios

5.1. Interacting with Data Resources

OGSA-DAI supports interaction with data resources, and other data manipulation operations, via a document-oriented interface. The basic building blocks for doing this are:

  • Activities - form the basic data resource manipulation, data transformation and delivery operations that a client may want to perform. OGSA-DAI comes with a number of pre-defined activities (see the documentation in the distribution for more detail on what is available). If you do not find the activity that you need, you can write your own and plug it into the existing framework. Activities are the basic building blocks of Perform documents.
  • Perform Documents - allow clients to specify a collection of operations that they would like to perform on a data resource. Activities are collected together in the Perform document with a simple data flow going from one activity to the next. For example, a simple request could perform an SQL query on a relational database, apply an XSL transform to the results (assuming that they are returned in XML format) and then deliver the output to a third party using FTP. A number of activities can be linked together in this way to process the data before it is delivered. The aim is to take the computation to the data and to avoid as many client-service interactions as possible. Perform documents are submitted to data service resources exposed by OGSA-DAI data services - a data service resource manages the exposure of and interaction with a data resource.
  • Response Documents - are returned by data service resources via OGSA-DAI services to clients to inform them as to the execution status of their Perform documents and, often, to also return data directly back to a client if third party delivery is not being used.
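
To make the data flow inside a Perform document concrete, the sketch below builds a small XML request in which each activity names its output stream and the next activity names that stream as its input. The element and attribute names (`perform`, `activity`, `parameter`, `input`, `output`, `stream`), the activity type names and the FTP URL are invented for illustration and are not the OGSA-DAI Perform document schema; consult the schemas shipped with the distribution for the real vocabulary.

```python
import xml.etree.ElementTree as ET


def build_perform_document(steps):
    """Build an illustrative request from (type, parameters) pairs.

    Each activity's output stream is wired to the next activity's input.
    """
    root = ET.Element("perform")
    previous_output = None
    for index, (activity_type, parameters) in enumerate(steps):
        activity = ET.SubElement(root, "activity", type=activity_type)
        for name, value in parameters.items():
            ET.SubElement(activity, "parameter", name=name).text = value
        if previous_output is not None:
            ET.SubElement(activity, "input", stream=previous_output)
        previous_output = f"stream{index}"
        ET.SubElement(activity, "output", stream=previous_output)
    return root


def data_flow(document):
    """Return (input, output) stream names for each activity, in order."""
    flow = []
    for activity in document.findall("activity"):
        source = activity.find("input")
        target = activity.find("output")
        flow.append((source.get("stream") if source is not None else None,
                     target.get("stream")))
    return flow


# Query, transform, then deliver: the example chain described above.
document = build_perform_document([
    ("sqlQuery", {"expression": "SELECT * FROM people"}),
    ("xslTransform", {"stylesheet": "to-html.xsl"}),
    ("deliverToFtp", {"url": "ftp://example.org/results.html"}),
])
request_text = ET.tostring(document, encoding="unicode")
```

The point of the structure is that the whole chain travels to the service in a single request, so intermediate results never have to return to the client.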

Using these fairly basic building blocks, it is possible to achieve fairly complex interaction patterns, as illustrated in the figure below.

Possible OGSA-DAI Scenario Configurations

Thus, one is able to do a simple query-response interaction where the data goes directly back to the analyst querying a data service resource exposed by a data service. In a third party delivery, the data is returned to a third party consumer and the analyst receives the status of the Perform document. In another scenario, the client passes the details of the data service resource (and of the data service which exposes it) to a third party, who then pulls the data from the data service resource via the data service.

The process is relatively similar for updates where the analyst can submit the update directly to a data service resource (via a data service), with the data being transferred in the same message. Alternatively the data for an insert may be pulled or pushed from a third party.
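
The difference between the query-response and third party delivery patterns can be sketched as a choice about where the result data goes. Everything here is hypothetical (the `Response` type, the `execute` function and the list used as a third party consumer). It only illustrates that the analyst always receives the execution status, while the data is returned inline only when no third party delivery is requested.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Response:
    """Illustrative response: a status plus optional inline data."""
    status: str
    data: Optional[List[str]] = None


def execute(rows, third_party=None):
    """Run a request and route the results.

    rows: result data produced at the data service resource.
    third_party: optional consumer (a list here) that receives the data.
    """
    if third_party is None:
        # Query-response: data travels back to the analyst in the response.
        return Response(status="COMPLETED", data=list(rows))
    # Third party delivery: data is pushed to the consumer and the
    # analyst receives only the execution status.
    third_party.extend(rows)
    return Response(status="COMPLETED")


results = ["1,Ada", "2,Alan"]
direct = execute(results)

consumer = []
pushed = execute(results, third_party=consumer)
```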

More complex scenarios are also possible with the ability to perform service-to-service communication. However, although it is possible to do this within OGSA-DAI it has been found that performing data transfers using SOAP is not very efficient. More work remains to be done in this area.

Thus, based on the simple concepts of activities, Perform documents and Response documents, fairly complex scenarios can be achieved using OGSA-DAI.

6. Debugging

There is no particular debugging advice at this time.

7. Troubleshooting

Please check the Administration Troubleshooting section for some common errors encountered with OGSA-DAI. More information will also be available from the OGSA-DAI web site (www.ogsadai.org.uk).

8. Related Documentation

OGSA-DAI related presentations can be found at the OGSA-DAI web site under www.ogsadai.org.uk/docs. Note that this contains information about the different flavours of OGSA-DAI and not just the WSRF version. You will also find links to courses and tutorials at www.ogsadai.org.uk/courses/.