Gnosis and generateDS - Analysis, Comparison, and Evaluation

Dave Kuhlman

http://www.rexx.com/~dkuhlman

Aug 23, 2002

Abstract:

This paper compares the purpose and use of several capabilities of the Gnosis/objectify library with those of generateDS.py.

This paper is concerned mainly with the objectify module in the Gnosis library.



1 Introduction - What It Does

[Note: The Gnosis library actually has several modules; in this paper, however, we discuss only the objectify module.]

Gnosis/objectify -- Gnosis/objectify creates native Python data structures from an XML document. The Python programmer can then extract information from the XML document using Python data structures that are meaningful and that follow the structure of the XML document.

generateDS.py -- Given an XSchema definition of an XML document type, generateDS.py generates Python classes whose instances can represent elements in documents of that type. generateDS.py also generates a parser that can read and parse a document of the given type.

Our focus in this paper is on the ability of these two technologies to enable us to process XML documents using native Python data structures that are more convenient, usable, and meaningful than those offered by DOM.

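In general, we shall be comparing two different approaches to this problem: building Python objects directly from an XML document (Gnosis/objectify), and generating Python classes from an XSchema definition of the document type (generateDS.py).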

 
2 How to Use It

2.1 Gnosis/objectify Mini-How-to

David Mertz, the implementor of Gnosis/objectify, provides excellent documentation and commentary in his articles. Refer to the "See Also" section, below.

Here is a summary.

  1. Import Gnosis/objectify. On my machine, I've installed the Gnosis library, so the following does it:

    >>> import gnosis.xml.objectify
    

  2. Create a Gnosis/objectify XML object:

    >>> xml_obj = gnosis.xml.objectify.XML_Objectify('people.xml')
    

  3. And from that, create a Python object:

    >>> py_obj = xml_obj.make_instance()
    

  4. Now we are ready to inspect and manipulate this object. For example:

    >>> # Get the first person element inside the root element.
    >>> person = py_obj.person[0]
    >>>
    >>> # Get the "ratio" attribute of the person element.
    >>> print person.ratio
    3.2
    >>>
    >>> # Get the "name" element in the person element.
    >>> print person.name
    <gnosis.xml.objectify._objectify._XO_name instance at 0x8253914>
    >>>
    >>> # Get the characters in the name element.
    >>> print person.name.PCDATA
    Alberta
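
To make the idea behind the session above more concrete, here is a minimal sketch of how an XML document can be turned into attribute-style Python objects. It is not the Gnosis implementation and does not use the Gnosis library; it relies only on the standard xml.etree.ElementTree module, is written for current Python 3, and the names Node, objectify, and PEOPLE_XML (including the sample document itself) are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, modeled on the session above.
PEOPLE_XML = """<people>
  <person ratio="3.2">
    <name>Alberta</name>
  </person>
  <person ratio="1.5">
    <name>Bernardo</name>
  </person>
</people>"""


class Node:
    """Hypothetical stand-in for an objectified XML element."""

    def __init__(self, tag):
        self.tag = tag
        self.PCDATA = ""


def objectify(elem):
    """Recursively turn an ElementTree element into a Node."""
    node = Node(elem.tag)
    # XML attributes become plain Python attributes (values stay strings).
    for key, value in elem.attrib.items():
        setattr(node, key, value)
    # Character data directly inside the element.
    node.PCDATA = (elem.text or "").strip()
    # Group child elements by tag name.
    children = {}
    for child in elem:
        children.setdefault(child.tag, []).append(objectify(child))
    # Design choice of this sketch: a lone child is stored directly,
    # repeated children are stored as a list.
    for tag, nodes in children.items():
        setattr(node, tag, nodes[0] if len(nodes) == 1 else nodes)
    return node


py_obj = objectify(ET.fromstring(PEOPLE_XML))
person = py_obj.person[0]
print(person.ratio)        # the "ratio" attribute, as a string
print(person.name.PCDATA)  # character data of the <name> child
```

The access pattern (py_obj.person[0], person.ratio, person.name.PCDATA) mirrors the session above, but a real library must also deal with name clashes, mixed content, and namespaces, all of which this sketch ignores.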
    

2.2 generateDS.py Mini-How-to

David Kuhlman, the implementor of generateDS.py, provides documentation at his Web site. This documentation is included in the generateDS.py distribution, which is also available at that site. Refer to the "See Also" section, below.

Here is a summary. Steps toward using generateDS.py:

  1. Create an XSchema definition of the XML document type that you wish to process.

  2. Process this XSchema definition with generateDS.py, which will generate Python class definitions and, optionally, subclass definitions.

  3. Add your application-specific code to the subclasses.

  4. Modify the import statement at the top of the subclass file so that it imports the file containing the superclasses.

  5. Modify the main function in the subclass file to suit your needs.

  6. You should now be able to parse and process XML documents by running the subclass file with Python.
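
The superclass/subclass arrangement described in the steps above can be sketched roughly as follows. This is hand-written code, not actual generateDS.py output; the class names person and personSub, the factory and build methods, the subclass hook, and the sample document are all assumptions chosen for illustration. For brevity the sketch keeps both "files" in one module and parses with the standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# --- Stand-in for the superclass file (hypothetical, hand-written) ---

class person:
    """Represents one <person> element."""
    subclass = None  # hook that lets the parser build subclass instances

    def __init__(self, ratio=None, name=None):
        self.ratio = ratio
        self.name = name

    @classmethod
    def factory(cls, *args, **kwargs):
        # Build the registered subclass if there is one, else this class.
        target = cls.subclass or cls
        return target(*args, **kwargs)

    @classmethod
    def build(cls, elem):
        # Populate an instance from a parsed <person> element.
        return cls.factory(
            ratio=float(elem.get("ratio")),
            name=elem.findtext("name"),
        )


# --- Stand-in for the subclass file ---
# In a two-file layout, this is where the import of the superclass
# module would go (step 4).

class personSub(person):
    """Application-specific behavior lives here (step 3)."""

    def describe(self):
        return "%s has ratio %.1f" % (self.name, self.ratio)


# Register the subclass so that parsing produces personSub instances.
person.subclass = personSub


def parse(xml_text):
    """Parse a document and return one instance per <person> element."""
    root = ET.fromstring(xml_text)
    return [person.build(elem) for elem in root.findall("person")]


def main():
    # Adapt this function to the needs of the application (step 5).
    doc = '<people><person ratio="3.2"><name>Alberta</name></person></people>'
    for p in parse(doc):
        print(p.describe())


if __name__ == "__main__":
    main()
```

The point of the split is that the superclass file can be regenerated whenever the schema changes, while the application-specific code in the subclasses is left untouched.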

3 Uses and Applications

This section describes several possible uses for these two technologies.

3.1 Loading and Using Configuration Files

Both tools seem suitable for reading and writing an XML document used as a configuration file.

A few notes:

3.2 Transformations on XML

Both Gnosis/objectify and generateDS.py seem very appropriate tools for performing transformations on XML documents.

3.2.1 Transformation with Gnosis/objectify

Two approaches seem appropriate:

A couple of things to be aware of:

3.2.2 Transformation with generateDS.py

Here are several strategies you can use for implementing XML transformations using generateDS.py:

It may be helpful to write a simple tree walk function that collects the instances that represent specific XML elements in a dictionary. For example, if all elements of a certain type have an "id" attribute or a "name" attribute (and the name is unique), then you might collect a dictionary whose keys are the IDs or the names. This will enable your transformation methods to look up instances (elements) that are not directly connected to the element currently being processed.
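
A tree walk of this kind might look like the following minimal sketch. The Element class is a hypothetical stand-in for instances of generated classes, and the attribute names tag, id, and children, as well as the function name collect_by_id, are assumptions; real generated classes will expose their children differently.

```python
class Element:
    """Hypothetical stand-in for an instance of a generated class."""

    def __init__(self, tag, elem_id=None, children=None):
        self.tag = tag
        self.id = elem_id
        self.children = children or []


def collect_by_id(node, tag, index=None):
    """Walk the tree depth-first and index elements of one type by id."""
    if index is None:
        index = {}
    if node.tag == tag and node.id is not None:
        index[node.id] = node
    for child in node.children:
        collect_by_id(child, tag, index)
    return index


# A small hand-built tree standing in for a parsed document.
root = Element("library", children=[
    Element("shelf", elem_id="s1", children=[
        Element("book", elem_id="b1"),
        Element("book", elem_id="b2"),
    ]),
    Element("shelf", elem_id="s2", children=[
        Element("book", elem_id="b3"),
    ]),
])

books = collect_by_id(root, "book")
print(sorted(books))
```

With the dictionary in hand, a transformation method working on one shelf can reach any book directly (for example, books["b3"]) without walking the tree again.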

 
4 Comparisons

4.1 Commonalities

How Gnosis/objectify and generateDS.py are similar:

4.2 Contrasts and differences

How Gnosis/objectify and generateDS.py are different:

4.3 Limitations and Restrictions

Both technologies, like DOM, load the entire document into memory. Therefore, neither is suitable for "very large" XML documents. You will have to decide how large "very large" is.

Both technologies are Python solutions. Python must be installed in order to use them. In addition, both require installation of PyXML, the standard XML support package for Python.

5 Summary

Both technologies make processing XML documents in Python exceptionally easy to do. Both are a step up from DOM. Both make it easy to add application-specific code, and that code is likely to be more meaningful and readable than equivalent code written for use with DOM.

 
See also, Etc.

See Also:

The main Python Web Site
for more information on Python

The Python XML Special Interest Group
for more information on processing XML with Python

Dave's Web Site
for more software and information on using Python for XML and the Web

generateDS.py -- Generate Python data structures from XML Schema
for documentation on and the implementation of generateDS.py

David Mertz's Gnosis download directory
for the latest Gnosis XML library

David Mertz's technical publications at Gnosis
for "XML Matters" articles on Gnosis/objectify and other topics
