libxsltmod -- Simple Python interface to libxslt
What Is libxsltmod?
libxsltmod is a Python extension module that enables you to use
libxslt to perform XSLT transformations from Python scripts.
libxsltmod exposes several functions to Python. There is one
function that can be used to perform an XSLT transform and write
the result to a file. There is another that performs an XSLT
transform and returns the result as a Python string. And then
there are several functions to support the reuse of previously
compiled stylesheets without recompiling.
You can learn more about libxslt at
the libxslt home page.
Installation
Download the distribution. You can find it at:
libxsltmod-1.5a.tar.gz.
See the README file. Basically, you will need to do the following:
- Build and install libxml2. You can find it at
http://xmlsoft.org.
- Build and install libxslt. You can find it at
http://xmlsoft.org/XSLT/.
- Uncompress and untar libxsltmod. Something like the
following should work:
tar xzvf libxsltmod-1.5a.tar.gz
- Build libxsltmod with something like:
python setup.py build
python setup.py install
Since libxsltmod needs the libxml2 and libxslt libraries,
you will need to make them "findable". On my Linux machine, I used
the following:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
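Before building, it can help to check whether the loader is able to locate both libraries. The following is only a minimal sketch that uses Python's standard ctypes.util.find_library; the helper name find_xslt_libs is made up for illustration and is not part of libxsltmod.

```python
from ctypes.util import find_library

def find_xslt_libs():
    """Return a dict mapping each library short name to whatever
    find_library resolves it to, or None when it cannot be found."""
    return dict((name, find_library(name)) for name in ("xml2", "xslt"))

if __name__ == "__main__":
    for name, location in sorted(find_xslt_libs().items()):
        print("lib%s: %s" % (name, location or "NOT FOUND"))
```

A "NOT FOUND" result suggests that the library is not installed in a location the search covers, in which case adjusting LD_LIBRARY_PATH as shown above may be needed.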
Calling libxsltmod
libxsltmod exposes the following functions:
- translate_to_file
- Translate a document and write the result to a file.
- translate_to_string
- Translate a document and return the result as a Python string.
- translate_to_stream
- Translate a document and write the result to a stream, i.e. an
instance of a Python class that exposes a write method.
- compile_stylesheet
- Compile a stylesheet and return a stylesheet object.
- set_flag
- Set a flag to be passed to libxml/libxslt.
translate_to_file
Translate a document and write the result to a file.
Prototype:
translate_to_file(
stylesheetType, ## 's': string; 'f': filename; 'c': compiled
stylesheet, ## stylesheet string, filename, or object
inDocumentType, ## 's': string; 'f': filename
inDocument, ## input document string or filename
outfileName, ## name of file to write result to
messageHandler, ## error message handler [None]
paramDict) ## dictionary of named user parameters [None]
Where:
- stylesheetType is the source for the stylesheet. It
can be 's' for string or 'f' for filename or 'c' for compiled (see
compile_stylesheet).
- stylesheet is the stylesheet. It is either a string
containing the stylesheet (if stylesheetType is 's') or a string
containing the name of a file containing the stylesheet (if
stylesheetType is 'f') or a previously compiled stylesheet (if the
stylesheetType is 'c').
- inDocumentType is the source for the XML document to be
transformed. It can be 's' for string or 'f' for filename.
- inDocument is the XML document to be transformed. It
is either a string containing the document (if inDocumentType is
's') or a string containing the name of a file containing the
document (if inDocumentType is 'f').
- outfileName is the name of the file in which to write
the transformed document.
- messageHandler (optional) is an instance of an error
handler class or None. An error handler class is a class that
contains a write method that takes a single argument (the
text of the error message). The error handling mechanism of the
XSLT processor will call this method for each warning, error, and
fatal error encountered, passing the text of the message. If this
argument is None or omitted, normal error processing will occur,
i.e. the message will be written to stdout.
- paramDict (optional) is a Python dictionary containing
named parameters for XSLT processing. These parameters are passed
through to libxslt for access by XSLT stylesheets. If this argument
is None or omitted, no parameters are assumed (effectively,
an empty dictionary).
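Because the messageHandler only needs a write method that takes the message text, it is easy to route messages somewhere other than the screen. Here is a minimal sketch that forwards each message to Python's standard logging module. The class name LoggingMessageHandler and the logger name "xslt" are assumptions chosen for illustration; they are not part of libxsltmod.

```python
import logging

class LoggingMessageHandler(object):
    """Error handler exposing the single-argument write method.
    Each message is forwarded to a standard-library logger."""

    def __init__(self, logger_name="xslt"):
        self.logger = logging.getLogger(logger_name)

    def write(self, msg):
        text = msg.strip()
        if text:  # skip empty or whitespace-only messages
            self.logger.warning(text)
```

An instance of this class could then be passed as the messageHandler argument in place of a hand-written handler.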
Example:
Here is an example that translates an XML document in a file using
an XSLT stylesheet in a file. It writes the result to a file.
import libxsltmod
class MessageHandler:
    def write(self, msg):
        print 'message:', msg
def translate(stylesheetName, infileName, outfileName):
    messageHandler = MessageHandler()
    paramDict = {'param1': 'Value #1'}
    libxsltmod.translate_to_file(
        'f', stylesheetName,
        'f', infileName,
        outfileName,
        messageHandler, paramDict)
translate_to_string
Translate a document and return the result as a Python string.
Prototype:
translate_to_string(
stylesheetType, ## 's': string; 'f': filename; 'c': compiled
stylesheet, ## stylesheet string, filename, or object
inDocumentType, ## 's' for string; 'f' for filename
inDocument, ## input document string or filename
messageHandler, ## error message handler [None]
paramDict) ## dictionary of named user parameters [None]
Where:
- stylesheetType is the source for the stylesheet. It
can be 's' for string or 'f' for filename or 'c' for compiled (see
compile_stylesheet).
- stylesheet is the stylesheet. It is either a string
containing the stylesheet (if stylesheetType is 's') or a string
containing the name of a file containing the stylesheet (if
stylesheetType is 'f') or a previously compiled stylesheet (if the
stylesheetType is 'c'; see compile_stylesheet).
- inDocumentType is the source for the XML document to be
transformed. It can be 's' for string or 'f' for filename.
- inDocument is the XML document to be transformed. It
is either a string containing the document (if inDocumentType is
's') or a string containing the name of a file containing the
document (if inDocumentType is 'f').
- messageHandler (optional) is an instance of an error
handler class or None. An error handler class is a class that
contains a write method that takes a single argument (the
text of the error message). The error handling mechanism of the
XSLT processor will call this method for each warning, error, and
fatal error encountered, passing the text of the message. If this
argument is None or omitted, normal error processing will occur,
i.e. the message will be written to stdout.
- paramDict (optional) is a Python dictionary containing
named user parameters for XSLT processing. These parameters are
passed through to libxslt for access by XSLT stylesheets. If this
argument is None or omitted, no parameters are assumed (effectively,
an empty dictionary).
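The one-letter type codes are easy to mix up, since 'c' is valid for the stylesheet but not for the input document. The following is a small sketch of a check that could be run before calling any of the translate functions. The function name check_source_types is hypothetical, and whether libxsltmod performs a similar check itself is not stated here.

```python
STYLESHEET_TYPES = ("s", "f", "c")   # string, filename, compiled
DOCUMENT_TYPES = ("s", "f")          # string, filename

def check_source_types(stylesheetType, inDocumentType):
    """Raise ValueError if either type code is not one listed above."""
    if stylesheetType not in STYLESHEET_TYPES:
        raise ValueError("stylesheetType must be one of %r, got %r"
                         % (STYLESHEET_TYPES, stylesheetType))
    if inDocumentType not in DOCUMENT_TYPES:
        raise ValueError("inDocumentType must be one of %r, got %r"
                         % (DOCUMENT_TYPES, inDocumentType))
```

For example, check_source_types('c', 'f') passes silently, while check_source_types('f', 'c') raises ValueError because a document cannot be "compiled".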
Example:
Here is an example that translates an XML document in a file using
an XSLT stylesheet in a string that has been read from a file. It
returns the result as a string. This example also catches error
messages.
import libxsltmod
class MessageHandler:
    def __init__(self):
        self.content = ''
    def write(self, msg):
        self.content = self.content + msg
    def getContent(self):
        return self.content
def translate():
    inFile = open('transform1.xsl', 'r')
    stylesheetStr = inFile.read()
    inFile.close()
    handler = MessageHandler()
    paramDict = {'param1': 'Value #1'}
    try:
        result = libxsltmod.translate_to_string(
            's', stylesheetStr,
            'f', 'document1.xml',
            handler, paramDict)
        return (0, result)
    except RuntimeError:
        messages = handler.getContent()
        return (1, messages)
translate_to_stream
Translate a document and write the result to a stream. For our
purposes here, a stream is an instance of any Python class that
supports a write method, which takes one argument, the
content. This method will be called with each chunk of content
that translate_to_stream produces.
Prototype:
translate_to_stream(
stylesheetType, ## 's': string; 'f': filename; 'c': compiled
stylesheet, ## stylesheet string, filename, or object
inDocumentType, ## 's': string; 'f': filename
inDocument, ## input document string or filename
messageHandler, ## error message handler
streamObject, ## the object to which output is written
paramDict) ## dictionary of named user parameters [None]
Where:
- stylesheetType is the source for the stylesheet. It
can be 's' for string or 'f' for filename or 'c' for compiled (see
compile_stylesheet).
- stylesheet is the stylesheet. It is either a string
containing the stylesheet (if stylesheetType is 's') or a string
containing the name of a file containing the stylesheet (if
stylesheetType is 'f') or a previously compiled stylesheet (if the
stylesheetType is 'c').
- inDocumentType is the source for the XML document to be
transformed. It can be 's' for string or 'f' for filename.
- inDocument is the XML document to be transformed. It
is either a string containing the document (if inDocumentType is
's') or a string containing the name of a file containing the
document (if inDocumentType is 'f').
- messageHandler is an instance of an error handler class
or None. An error handler class is a class that contains a
write method that takes a single argument (the text of the
error message). The error handling mechanism of the XSLT processor
will call this method for each warning, error, and fatal error
encountered, passing the text of the message. If this argument is
None, normal error processing will occur, i.e. the message will be
written to stdout.
- streamObject is an instance of a class that
implements a write method which takes one argument (the
content to be written). Any of the following can be used: file
objects, sys.stdout, StringIO and cStringIO objects, and instances
of classes you define yourself containing a write method.
The write method will be called (possibly multiple times)
with the results of the transformation.
- paramDict (optional) is a Python dictionary containing
named user parameters for XSLT processing. These parameters are
passed through to libxslt for access by XSLT stylesheets. If this
argument is None or omitted, no parameters are assumed (effectively,
an empty dictionary).
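Since the write method of a stream object may be called several times, a stream class usually needs to accumulate the chunks it receives. Here is a minimal sketch of such a class. The name ChunkCollector and its getvalue method are assumptions made for illustration, and the sketch assumes that each chunk arrives as an ordinary string.

```python
class ChunkCollector(object):
    """Stream object that gathers every chunk passed to write()."""

    def __init__(self):
        self.chunks = []

    def write(self, content):
        # Called once per chunk of transformed output.
        self.chunks.append(content)

    def getvalue(self):
        # Join the chunks in arrival order to rebuild the full result.
        return "".join(self.chunks)
```

An instance could be passed as the streamObject argument; afterward, getvalue() would return the complete result, much as a StringIO object does.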
Example:
Here is an example that translates an XML document in a file using
an XSLT stylesheet in a file. It writes the result to a stream
object that prints each chunk of content it receives.
import libxsltmod
class MessageHandler:
    def write(self, msg):
        print 'message:', msg
class ContentHandler:
    def write(self, msg):
        print 'content:', msg
def translate(stylesheetName, infileName):
    messageHandler = MessageHandler()
    paramDict = {'param1': 'Value #1'}
    contentHandler = ContentHandler()
    libxsltmod.translate_to_stream(
        'f', stylesheetName,
        'f', infileName,
        messageHandler,
        contentHandler, paramDict)
compile_stylesheet
Compile a stylesheet and return a stylesheet object.
A stylesheet object is a Python datatype that exposes no methods.
Its purpose is to be passed as the stylesheet to
translate_to_file, translate_to_string, or translate_to_stream
with the 'c' stylesheetType.
Prototype:
compile_stylesheet(
stylesheetType, ## 's' for string; 'f' for filename
stylesheet, ## stylesheet string or filename
messageHandler) ## error message handler
Where:
- stylesheetType is the source for the stylesheet. It
can be 's' for string or 'f' for filename.
- stylesheet is the stylesheet. It is either a string
containing the stylesheet (if stylesheetType is 's') or a string
containing the name of a file containing the stylesheet (if
stylesheetType is 'f').
- messageHandler is an instance of an error handler class
or None. An error handler class is a class that contains a
write method that takes a single argument (the text of the
error message). The error handling mechanism of the XSLT processor
will call this method for each warning, error, and fatal error
encountered, passing the text of the message. If this argument is
None, normal error processing will occur, i.e. the message will be
written to stdout.
Example:
Here is an example that compiles a stylesheet from a file, then
transforms two documents.
import libxsltmod
stylesheet = libxsltmod.compile_stylesheet('f', 'transform1.xsl')
libxsltmod.translate_to_file(
    'c', stylesheet,
    'f', 'document1.xml',
    'result1.html')
libxsltmod.translate_to_file(
    'c', stylesheet,
    'f', 'document2.xml',
    'result2.html')
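When a script uses several stylesheets, the compiled objects can be kept in a dictionary so that each file is compiled only once. The following is a sketch under stated assumptions: the class StylesheetCache is hypothetical, the compile function is passed in rather than imported so that the sketch stands alone, and the three-argument call follows the prototype above.

```python
class StylesheetCache(object):
    """Compile each stylesheet file at most once and reuse the result."""

    def __init__(self, compile_fn, messageHandler=None):
        self._compile = compile_fn      # e.g. libxsltmod.compile_stylesheet
        self._handler = messageHandler
        self._compiled = {}             # filename -> compiled stylesheet

    def get(self, filename):
        if filename not in self._compiled:
            self._compiled[filename] = self._compile(
                'f', filename, self._handler)
        return self._compiled[filename]
```

With cache = StylesheetCache(libxsltmod.compile_stylesheet), the value of cache.get('transform1.xsl') would be passed as the stylesheet together with the 'c' stylesheetType. Note that this sketch does not notice when a stylesheet file changes on disk.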
set_flag
Set a flag to be passed to libxml/libxslt.
Prototype:
set_flag(
flagName, ## the name of the flag to be set
flagValue) ## the new value for the flag (0/1)
Where:
- flagName is the name (a string) of the flag to be set.
Currently, the following flags are supported: "XML_DETECT_IDS" and
"XML_COMPLETE_ATTRS".
- flagValue is the value to which the flag is to be set.
May be 0 or 1.
Here is some documentation on the flags from the libxml2 source code
(in libxml2-2.4.17/include/libxml/parser.h):
/**
* XML_DETECT_IDS:
*
* Bit in the loadsubset context field to tell to do ID/REFs lookups
* Use it to initialize xmlLoadExtDtdDefaultValue
*/
#define XML_DETECT_IDS 2
/**
* XML_COMPLETE_ATTRS:
*
* Bit in the loadsubset context field to tell to do complete the
* elements attributes lists with the ones defaulted from the DTDs
* Use it to initialize xmlLoadExtDtdDefaultValue
*/
#define XML_COMPLETE_ATTRS 4
Note that the flags XML_DETECT_IDS and XML_COMPLETE_ATTRS require a DTD.
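The two flags are separate bits (2 and 4), so each can be turned on or off without affecting the other. The following sketch shows how the two values from parser.h combine into a single number of the kind the header comments say is used to initialize xmlLoadExtDtdDefaultValue. The function combine_flags is purely illustrative; how libxsltmod stores the flags internally is not described here.

```python
XML_DETECT_IDS = 2        # value from libxml2's parser.h, quoted above
XML_COMPLETE_ATTRS = 4    # value from libxml2's parser.h, quoted above

def combine_flags(detect_ids, complete_attrs):
    """Return the bitwise OR of the flags whose argument is true."""
    value = 0
    if detect_ids:
        value |= XML_DETECT_IDS
    if complete_attrs:
        value |= XML_COMPLETE_ATTRS
    return value
```

With both flags set, the combined value is 6; with only XML_DETECT_IDS set, it is 2.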
Example:
Here is an example that sets the flags. Setting the
XML_COMPLETE_ATTRS flag causes omitted attributes which have a
default value in the DTD to be given that value.
import libxsltmod
class MessageHandler:
    def write(self, msg):
        print 'message:', msg
def translate(stylesheetName, infileName, outfileName):
    messageHandler = MessageHandler()
    libxsltmod.set_flag('XML_DETECT_IDS', 1)
    libxsltmod.set_flag('XML_COMPLETE_ATTRS', 1)
    libxsltmod.translate_to_file(
        'f', stylesheetName,
        'f', infileName,
        outfileName,
        messageHandler)
Additional Information
Examples
You can find examples of the use of libxsltmod in the Examples
directory.
XSLT
You can find more information about XSLT at The XML Cover Pages,
in particular at the Extensible Stylesheet Language (XSL) page.
Dave Kuhlman
Last update: 5/15/02