Dave Kuhlman
http://www.rexx.com/~dkuhlman
Email: [email protected]
Aug 23, 2002
This paper compares the intent and use of several capabilities of the Gnosis/objectify library with those of generateDS.py.
This paper is concerned mainly with the objectify module in the Gnosis library.
[Note: The Gnosis library actually has several modules; in this paper, however, we discuss only the objectify module.]
Gnosis/objectify -- Gnosis/objectify creates native Python data structures from an XML document. The Python programmer can then extract information from the XML document using Python data structures that are meaningful and that follow the structure of the XML document.
generateDS.py -- Given an XSchema definition of an XML document type, generateDS.py generates Python classes whose instances can represent elements in documents of that type. generateDS.py also generates a parser that can read and parse a document of the given type.
Our focus in this paper is on the ability of these two technologies to enable us to process XML documents using native Python data structures that are more convenient, usable, and meaningful than those offered by DOM.
In general, we shall be comparing two different approaches to this problem:
David Mertz, the implementor of Gnosis/objectify, provides excellent documentation and commentary in his articles. Refer to the ``See Also'' section, below.
Here is a summary.
>>> import gnosis.xml.objectify
>>> xml_obj = gnosis.xml.objectify.XML_Objectify('people.xml')
>>> py_obj = xml_obj.make_instance()
>>> # Get the first person element inside the root element.
>>> person = py_obj.person[0]
>>>
>>> # Get the ``ratio'' attribute of the person element.
>>> print person.ratio
3.2
>>>
>>> # Get the ``name'' element in the person element.
>>> print person.name
<gnosis.xml.objectify._objectify._XO_name instance at 0x8253914>
>>>
>>> # Get the characters in the name element.
>>> print person.name.PCDATA
Alberta
David Kuhlman, the implementor of generateDS.py, provides documentation at his Web site. This documentation is included in the generateDS.py distribution, which is also available at that site. Refer to the ``See Also'' section, below.
Here is a summary. Steps toward using generateDS.py:
This section describes several possible uses for these two technologies.
Both tools seem suitable for reading and writing an XML document used as a configuration file.
A few notes:
Both Gnosis/objectify and generateDS.py seem very appropriate tools for performing transformations on XML documents.
Two approaches seem appropriate:
xml_obj = gnosis.xml.objectify.XML_Objectify(inFileName)
people = xml_obj.make_instance()
for person in people.person:
    print 'Person: %s' % person.name.PCDATA
import gnosis.xml.objectify

class _XO_people:
    def export(self):
        # Delegate to each person child element.
        for person in self.person:
            person.export()

class _XO_person:
    def export(self):
        print 'Person:'
        print ' Name: %s' % self.name.PCDATA

def generate(inFileName):
    # Put our classes in the xml.objectify namespace.
    gnosis.xml.objectify._XO_people = _XO_people
    gnosis.xml.objectify._XO_person = _XO_person
    xml_obj = gnosis.xml.objectify.XML_Objectify(inFileName)
    root = xml_obj.make_instance()
    root.export()
Note the two assignment statements before ``objectifying'' the XML object. These put our classes into the xml.objectify namespace.
A couple of things to be aware of:

An element with no character content gets no ``PCDATA'' member. For example, given:

<person>
    <description></description>
</person>

the Python instance for the ``description'' element will have no member ``PCDATA''. You can use a try:/except: block to check for it.

If you inspect these instances (for example, with dir(obj)), you may do a double-take on the fact that the attribute for a child element sometimes contains a list and sometimes contains a single item, not a list with a single item. And yet, whether list or single item, it responds correctly to the list access protocol (e.g. len(people.person) and for item in people.person:). How can this be? The answer is that these instances belong to a subclass of a class that implements both the __len__ and __getitem__ methods.
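The two points above can be illustrated without the Gnosis library installed. The following is a minimal sketch, not the library's actual source: the class names SingleItemSequence and Element and the helper text_of are hypothetical stand-ins invented for this illustration, and the code is written for current Python rather than the interpreter used in the sessions above. It shows how a base class with __len__ and __getitem__ lets a lone instance behave like a one-item list, and how a try:/except: block can guard against a missing PCDATA member.

```python
class SingleItemSequence(object):
    """Hypothetical base class: makes a lone instance act like a one-item list."""

    def __len__(self):
        # A single child element reports a length of one.
        return 1

    def __getitem__(self, index):
        # Index 0 is the instance itself; anything else ends iteration.
        if index == 0:
            return self
        raise IndexError(index)


class Element(SingleItemSequence):
    """Hypothetical stand-in for an objectified XML element."""
    pass


def text_of(element, default=''):
    """Return the element's PCDATA, or a default if the element was empty."""
    try:
        return element.PCDATA
    except AttributeError:
        return default


# A <name> element with character content.
name = Element()
name.PCDATA = 'Alberta'

# An empty <description></description> element: no PCDATA member at all.
description = Element()

# A single instance still supports len() and iteration.
items = [item for item in name]
```

Here len(name) is 1 and iterating over name yields the instance itself, so code written as a loop works whether the parent holds one child or several. The helper text_of(description) returns the default instead of raising AttributeError.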
Here are several strategies you can use for implementing XML transformations using generateDS.py:
It may be helpful to write a simple tree walk function that collects the instances that represent specific XML elements in a dictionary. For example, if all elements of a certain type have an ``id'' attribute or a ``name'' attribute (and the name is unique), then you might collect a dictionary whose keys are the IDs or the names. This will enable your transformation methods to look up instances (elements) that are not directly connected to the element currently being processed.
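The tree walk described above might be sketched as follows. This is an illustration under stated assumptions, not code produced by generateDS.py: the Node class, its tag, ident, and children attributes, and the function collect_by_id are all hypothetical names chosen for this example. Classes generated by generateDS.py expose their children through their own accessors, so the traversal would need to be adapted to them.

```python
class Node(object):
    """Hypothetical stand-in for an instance representing an XML element."""

    def __init__(self, tag, ident=None, children=None):
        self.tag = tag
        self.ident = ident              # value of the element's ``id'' attribute
        self.children = children or []  # child element instances


def collect_by_id(node, index=None):
    """Walk the tree depth-first and map each ID to its instance."""
    if index is None:
        index = {}
    if node.ident is not None:
        index[node.ident] = node
    for child in node.children:
        collect_by_id(child, index)
    return index


# Build a small tree and index it.
root = Node('people', children=[
    Node('person', ident='p1'),
    Node('person', ident='p2', children=[Node('name', ident='n1')]),
])
index = collect_by_id(root)
```

A transformation method that is processing one element can then reach an unrelated element with a plain dictionary lookup such as index['p2'], without walking the tree again.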
How Gnosis/objectify and generateDS.py are similar:
How Gnosis/objectify and generateDS.py are different:
Both technologies, like DOM, load the entire document into memory. Therefore, neither is suitable for ``very large'' XML documents. You will have to decide how large ``very large'' is.
Both technologies are Python solutions. Python must be installed in order to use them. In addition, both require installation of PyXML, the standard XML support package for Python.
Both technologies make processing XML documents in Python exceptionally easy to do. Both are a step up from DOM. Both make it easy to add application-specific code, and that code is likely to be more meaningful and readable than equivalent code written for use with DOM.
See Also: