WPS design guide
This guide serves as an introduction to the WPS module. As such, it does not:
- provide a primer on the WPS protocol, which can be found in the WPS specification (the module implements the WPS 1.0 specification)
- repeat what can already be found in the class Javadocs
- explain how to implement an OWS service using the GeoServer OWS framework, which is left to its dedicated guide
In short, it provides a global view of how the module fits together, leaving the details to other information sources.
General architecture
Note
We really need to publish the Javadocs somewhere so that this document can link to them
The module follows the usual structure of a GeoServer OWS framework application:
- a set of KVP parsers and KVP readers to parse the HTTP GET requests, found in the org.geoserver.wps.kvp package
- a set of XML parsers to parse the HTTP POST requests, found in the org.geoserver.wps.xml and org.geoserver.wps.xml.v1_0_0 packages
- a service object interface and implementations responding to the various WPS methods, in particular org.geoserver.wps.DefaultWebProcessingService, which in turn delegates most of the work to the GetCapabilities, DescribeProcess and ExecuteProcess classes
- a set of output transformers taking the results generated by DefaultWebProcessingService and turning them into the appropriate response (usually, XML). You can find some of those in the org.geoserver.wps.response package, whilst some others are generic ones that have been parametrized and declared in the Spring context (see the applicationContext.xml file).
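The flow described by the list above (request parsing, service object, output transformer) can be pictured with a minimal, self-contained sketch. Every type and method name below is a hypothetical stand-in chosen for illustration. None of them is the real GeoServer API, the request object is a plain class rather than an EMF model, and the XML encoding is deliberately naive.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins only: none of these types are real GeoServer classes.
class WpsFlowSketch {

    /** Request object that flows from the parser to the service. */
    static final class DescribeProcessRequest {
        final String version;
        final String identifier;

        DescribeProcessRequest(String version, String identifier) {
            this.version = version;
            this.identifier = identifier;
        }
    }

    /** Step 1: a KVP reader turns raw GET parameters into a request object. */
    static DescribeProcessRequest readKvp(Map<String, String> kvp) {
        String identifier = kvp.get("identifier");
        if (identifier == null) {
            throw new IllegalArgumentException("Missing mandatory parameter: identifier");
        }
        return new DescribeProcessRequest(kvp.getOrDefault("version", "1.0.0"), identifier);
    }

    /** Step 2: the service object answers the request with a result object. */
    static Map<String, String> describeProcess(DescribeProcessRequest request) {
        Map<String, String> description = new LinkedHashMap<>();
        description.put("identifier", request.identifier);
        description.put("version", request.version);
        return description;
    }

    /** Step 3: an output transformer encodes the result (naive XML, no escaping). */
    static String encode(Map<String, String> description) {
        StringBuilder xml = new StringBuilder("<ProcessDescription>");
        description.forEach((key, value) ->
                xml.append('<').append(key).append('>').append(value)
                   .append("</").append(key).append('>'));
        return xml.append("</ProcessDescription>").toString();
    }

    public static void main(String[] args) {
        Map<String, String> kvp = new LinkedHashMap<>();
        kvp.put("request", "DescribeProcess");
        kvp.put("identifier", "buffer");
        System.out.println(encode(describeProcess(readKvp(kvp))));
    }
}
```

The point of the sketch is only the separation of concerns: parsers know about the wire format, the service knows about request and result objects, and the transformers know about the output encoding.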
The module makes extensive use of the following GeoTools modules:
- net.opengis.wps, which contains EMF models of the various elements and types described in the WPS schemas. Those objects are usually what flows between the KVP parsers, XML decoders, the service implementation, and the output transformers
- gt-xsd-wps and gt-xsd, used for all XML encoding and decoding needs
- gt-process that provides the concept of a process, with the ability to self describe its inputs and outputs, and of course execute and produce results
The processes
The module relies on the gt-process SPI-based plugin mechanism to look up and use the processes available in the classpath. Implementing a new process boils down to:
- creating a ProcessFactory implementation
- creating one or more Process implementations
- registering the ProcessFactory in SPI by adding the factory class name in the META-INF/services/org.geotools.process.ProcessFactory file
The WPS module shows an example of the above by bridging the Sextante API to the GeoTools process one, see the org.geoserver.wps.sextante package. This also means it’s possible to rely on libraries of existing processes provided they are wrapped into a GeoTools process API container.
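The three steps listed in this section can be sketched as follows. This is a minimal illustration under stated assumptions: SketchProcess and SketchProcessFactory are simplified stand-ins, not the real org.geotools.process interfaces (which have more methods and different signatures), and the lookup uses the plain java.util.ServiceLoader rather than the GeoTools factory lookup code. Both rely on provider files under META-INF/services.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.ServiceLoader;

// Simplified stand-ins: these are NOT the real GeoTools process interfaces.
class ProcessSpiSketch {

    /** A process: named inputs in, named results out. */
    interface SketchProcess {
        Map<String, Object> execute(Map<String, Object> input);
    }

    /** A factory that describes a process (inputs, outputs) and creates it. */
    interface SketchProcessFactory {
        String getName();
        Map<String, Class<?>> getParameterInfo();
        Map<String, Class<?>> getResultInfo();
        SketchProcess create();
    }

    /** Example factory for a process that doubles a number. */
    public static class DoubleItFactory implements SketchProcessFactory {
        public String getName() { return "doubleIt"; }

        public Map<String, Class<?>> getParameterInfo() {
            return Collections.singletonMap("value", Double.class);
        }

        public Map<String, Class<?>> getResultInfo() {
            return Collections.singletonMap("result", Double.class);
        }

        public SketchProcess create() {
            return input -> {
                double value = ((Number) input.get("value")).doubleValue();
                Map<String, Object> result = new HashMap<>();
                result.put("result", value * 2);
                return result;
            };
        }
    }

    /**
     * Collects the factories registered through a META-INF/services provider
     * file for the factory interface. Returns an empty list if there is none.
     */
    static List<SketchProcessFactory> lookup() {
        List<SketchProcessFactory> found = new ArrayList<>();
        for (SketchProcessFactory factory : ServiceLoader.load(SketchProcessFactory.class)) {
            found.add(factory);
        }
        return found;
    }

    public static void main(String[] args) {
        List<SketchProcessFactory> factories = lookup();
        if (factories.isEmpty()) {
            // No provider file in a standalone run, so register one by hand.
            factories.add(new DoubleItFactory());
        }
        for (SketchProcessFactory factory : factories) {
            System.out.println(factory.getName() + " takes " + factory.getParameterInfo().keySet());
        }
        Map<String, Object> input = new HashMap<>();
        input.put("value", 21);
        System.out.println(factories.get(0).create().execute(input));
    }
}
```

Because the factory describes its own inputs and outputs, a caller such as a DescribeProcess implementation can document a process without knowing anything about it in advance.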
Bridging between objects and I/O formats
The WPS specification is very generic. Any process can take as input pretty much anything, and return anything. It basically means WPS is a complex, XML based RPC protocol.
This means WPS can trade vector data, raster data, plain strings and numbers, spreadsheets, word processor documents, and whatever else the imagination can lead one to. Also, given a single type of data, say a plain geometry, there are many useful ways to represent it: it could be GML2, GML3, WKT, WKB, or a one-row shapefile. Different clients will find some formats easier to use than others, meaning the WPS should try to offer as many options as possible for both input and output.
The classes in the org.geoserver.wps.ppio package serve exactly this purpose: turning a representation format into an in-memory object and vice versa. A new subclass of ProcessParameterIO (PPIO) is needed each time a new format for a known parameter type is desired, or when a process requires a new kind of parameter. The subclass then needs to be registered in the Spring context so that ProcessParameterIO.find(Parameter, ApplicationContext) can find it.
Both the XML reader and the XML encoders do use the PPIO dynamically: the WPS document structure is made so that parameters are actually xs:Any, so bot
The code providing the description of the various processes also scans the available ProcessParameterIO implementations so that each parameter can be matched with all formats in which it can be represented.
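The idea behind the PPIO classes can be illustrated with a small, self-contained sketch. SketchPPIO, its two subclasses, the static registry and the find(Class, String) signature are all hypothetical: the real ProcessParameterIO is looked up through the Spring context and its API is not reproduced here. The MIME type strings are illustrative labels as well.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins: this is NOT the real ProcessParameterIO API.
class PpioSketch {

    /** Binds one in-memory type to one representation format. */
    abstract static class SketchPPIO<T> {
        final Class<T> type;
        final String mimeType;

        SketchPPIO(Class<T> type, String mimeType) {
            this.type = type;
            this.mimeType = mimeType;
        }

        abstract T decode(String text);

        abstract String encode(T value);
    }

    /** In-memory object used by the example. */
    static final class Point {
        final double x;
        final double y;

        Point(double x, double y) {
            this.x = x;
            this.y = y;
        }
    }

    /** WKT-like representation: "POINT (x y)". */
    static final class PointWktPPIO extends SketchPPIO<Point> {
        PointWktPPIO() { super(Point.class, "application/wkt"); }

        Point decode(String text) {
            String[] xy = text.replace("POINT", "").replace("(", "").replace(")", "")
                    .trim().split("\\s+");
            return new Point(Double.parseDouble(xy[0]), Double.parseDouble(xy[1]));
        }

        String encode(Point p) { return "POINT (" + p.x + " " + p.y + ")"; }
    }

    /** Comma-separated representation: "x,y". */
    static final class PointCsvPPIO extends SketchPPIO<Point> {
        PointCsvPPIO() { super(Point.class, "text/csv"); }

        Point decode(String text) {
            String[] xy = text.trim().split(",");
            return new Point(Double.parseDouble(xy[0].trim()), Double.parseDouble(xy[1].trim()));
        }

        String encode(Point p) { return p.x + "," + p.y; }
    }

    /** Stand-in for the registration that the real module does in Spring. */
    static final List<SketchPPIO<?>> REGISTRY =
            Arrays.asList(new PointWktPPIO(), new PointCsvPPIO());

    /** Lists every format in which a parameter of the given type can be represented. */
    static List<String> formatsFor(Class<?> type) {
        List<String> formats = new ArrayList<>();
        for (SketchPPIO<?> ppio : REGISTRY) {
            if (ppio.type.equals(type)) {
                formats.add(ppio.mimeType);
            }
        }
        return formats;
    }

    /** Finds the PPIO for a type and format, or returns null if none is registered. */
    static <T> SketchPPIO<T> find(Class<T> type, String mimeType) {
        for (SketchPPIO<?> ppio : REGISTRY) {
            if (ppio.type.equals(type) && ppio.mimeType.equals(mimeType)) {
                @SuppressWarnings("unchecked")
                SketchPPIO<T> match = (SketchPPIO<T>) ppio;
                return match;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Point p = find(Point.class, "application/wkt").decode("POINT (1 2)");
        System.out.println(find(Point.class, "text/csv").encode(p));
        System.out.println(formatsFor(Point.class));
    }
}
```

Adding a new format for an existing type only requires a new subclass and a registry entry. Neither the decoding side nor the code that describes the processes needs to change, which is the property the PPIO design is after.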
Implementation level
At the moment the WPS is pretty much bare bones protocol-wise: it implements only the required behaviour and leaves out nearly everything else. In particular:
- GetCapabilities and DescribeProcess are supported in both GET and POST form, but Execute is implemented only as a POST request
- there is no raster data I/O support
- there is no asynchronous support, no process monitoring, and no output storage ability
- there is no integration whatsoever with the WMS to visualize the results of an analysis (this will require output storage and per-session catalog extensions)
- the vector processes do not use any kind of disk buffering, meaning everything is kept in memory (this won’t scale to larger amounts of data)
- there is no set of demo requests, nor a GUI to build a request. This is considered fundamental to reduce the time spent figuring out how to build a proper request, so it will be tackled sooner rather than later.
The transmute package
The org.geoserver.wps.transmute package is an earlier attempt at doing what PPIO does. It also attempts to provide a custom schema for each type of input/output, using subsetted schemas that contain only one type (e.g., GML Point) but that still have to reference the full schema definition anyway.
Note
This package is a leftover and should be completely removed and replaced with PPIO usage. At the moment only the DescribeProcess code is using it.