Symbian
Symbian OS Library

SYMBIAN OS V9.3




How to use XML Parser


How to parse an XML document



Summary

This section explains how to write applications that respond to the output of the parser framework by implementing its content handler interface.



Example scenario

You are writing a complex application with numerous settings which the user customises from a tab window. The application must save these settings so that they are available when it is restarted. You want a generic solution which will work on a wide range of platforms, for a wide range of applications and in a wide range of environments. A good way to do this while ensuring cross-platform compatibility is to save the settings in an XML file which the application parses on start-up. You provide the functionality with which your application responds to the XML statements by implementing the MContentHandler interface. As the parser detects tags and their content, it calls the associated content handler functions, which respond with the required behaviour.



Implementing MContentHandler

The callback functions which an implementation of MContentHandler must provide are listed below. They correspond to functions defined in the ContentHandler interface of the SAX specification. The last parameter of each function is an error code. If no error has occurred, the framework passes the null error code KErrNone; other error codes are defined in the file XMLFrameworkErrors.h.

MContentHandler callback        SAX specification function

OnStartDocumentL()              startDocument()
OnEndDocumentL()                endDocument()
OnStartElementL()               startElement()
OnEndElementL()                 endElement()
OnContentL()                    characters()
OnStartPrefixMappingL()         startPrefixMapping()
OnEndPrefixMappingL()           endPrefixMapping()
OnIgnorableWhiteSpaceL()        ignorableWhitespace()
OnSkippedEntityL()              skippedEntity()
OnProcessingInstructionL()      processingInstruction()
OnError()                       (none)
GetExtendedInterface()          (none)

You need to know the following classes, which are used as parameter types by the functions you have to implement.

RDocumentParameters

A class containing the character set the document uses.

RAttributeArray

A class consisting of an array of RAttribute objects: these hold the name (as an RTagInfo object), value and type of each attribute of the element.

RTagInfo

A class containing information about an XML tag: its namespace URI, its namespace prefix and its local name.

Sample code

// iConsole, the iNum... counters, iLeaveOnStartElement and the K... constants
// are assumed to be declared elsewhere in the class and its source file.
class CMyContentHandler : public CBase, public MContentHandler
{
public:

// A callback to indicate the start of the document.
void OnStartDocumentL(const RDocumentParameters&, TInt)
    {
    iConsole->Printf(KOnStartDoc);
    iConsole->Printf(KPressAKey);
    iConsole->Getch();

    // Reset the counters for the new document.
    iNumElements = 0;
    iNumSkippedEntities = 0;
    iNumPrefixMappings = 0;
    iNumPrefixUnmappings = 0;
    }

// A callback to indicate that the start tag of an element has been parsed.
void OnStartElementL(const RTagInfo&, const RAttributeArray&, TInt)
    {
    iConsole->Printf(KOnStartEle);

    // Count each element exactly once.
    iNumElements++;

    // Optionally leave on the first element to exercise error handling.
    if(iLeaveOnStartElement && iNumElements == 1)
        {
        iConsole->Printf(KOnStartErr, KExpectedLeaveCode);
        User::After(1);
        User::Leave(KExpectedLeaveCode);
        }
    }

// implementations of the other callbacks
// ...

};
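
For the settings scenario described earlier, the way a handler reads element names and attributes can be sketched in portable C++. This is a minimal model under stated assumptions, not Symbian code: TagInfo, Attribute, AttributeArray and ContentHandler are hypothetical stand-ins for RTagInfo, RAttribute, RAttributeArray and MContentHandler, only three callbacks are modelled, and the <setting name="..." value="..."/> layout is an invented example format.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for RTagInfo and RAttribute (not the Symbian types).
struct TagInfo {
    std::string uri;
    std::string prefix;
    std::string localName;
};

struct Attribute {
    TagInfo name;
    std::string value;
};

using AttributeArray = std::vector<Attribute>;

const int kErrNone = 0;  // models the null error code KErrNone

// Hypothetical, reduced callback interface with three callbacks only.
class ContentHandler {
public:
    virtual ~ContentHandler() = default;
    virtual void OnStartDocument(int errorCode) = 0;
    virtual void OnStartElement(const TagInfo& element,
                                const AttributeArray& attributes,
                                int errorCode) = 0;
    virtual void OnEndDocument(int errorCode) = 0;
};

// Collects every <setting name="..." value="..."/> element into a map.
class SettingsHandler : public ContentHandler {
public:
    void OnStartDocument(int errorCode) override {
        if (errorCode != kErrNone) return;
        settings_.clear();
        finished_ = false;
    }

    void OnStartElement(const TagInfo& element,
                        const AttributeArray& attributes,
                        int errorCode) override {
        // Ignore failed callbacks and elements other than <setting>.
        if (errorCode != kErrNone || element.localName != "setting") return;
        std::string name;
        std::string value;
        for (const Attribute& attribute : attributes) {
            if (attribute.name.localName == "name") name = attribute.value;
            if (attribute.name.localName == "value") value = attribute.value;
        }
        if (!name.empty()) settings_[name] = value;
    }

    void OnEndDocument(int errorCode) override {
        finished_ = (errorCode == kErrNone);
    }

    const std::map<std::string, std::string>& Settings() const { return settings_; }
    bool Finished() const { return finished_; }

private:
    std::map<std::string, std::string> settings_;
    bool finished_ = false;
};

// Plays the role of the parser: feeds a fixed sequence of events to a handler.
inline void SimulateParse(ContentHandler& handler) {
    Attribute nameAttr{{"", "", "name"}, "theme"};
    Attribute valueAttr{{"", "", "value"}, "dark"};
    AttributeArray attributes{nameAttr, valueAttr};

    handler.OnStartDocument(kErrNone);
    handler.OnStartElement(TagInfo{"", "", "settings"}, AttributeArray{}, kErrNone);
    handler.OnStartElement(TagInfo{"", "", "setting"}, attributes, kErrNone);
    handler.OnEndDocument(kErrNone);
}
```

After SimulateParse() has run on a SettingsHandler, Settings() holds the single entry read from the simulated document. A real handler would do the same work inside OnStartElementL(), using the accessors of RTagInfo and RAttribute.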



Using your Content Handler object

To have your content handler object called from the XML framework, you instantiate it in the client application code and pass it to the constructor method of a parser object.

Sample code

...
CMyContentHandler* ch = CMyContentHandler::NewL();
CParser* parser = CParser::NewLC(KXmlMimeType, *ch); // the handler is passed by reference
parser->ParseL(myXMLdata); // this will result in callbacks to ch
...



Choosing a Parser Plugin

Constructing an instance of a parser plugin

The XML framework contains several different parsers, called parser plugins. If you are writing an application using the framework you do not need to specify a particular parser, because the framework selects one for you using data which you supply. There are two ways of doing this; in both cases you need to know the MIME type of the file you want to parse. If that is all you know about your files, you call the constructor method of a CParser object with the MIME type as a parameter. However, you may want to specify a particular parser variant (usually identified by the name of its supplier). This is a two-step process. First you construct a CMatchData object and pass it the data about the files and the parser variant. Then you pass the CMatchData object to the constructor method of a CParser object, as shown in the following code.

// 1. Create CMatchData object
CMatchData *matchData = CMatchData::NewLC();

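// 2. Set Type (8-bit literal; KMyMimeType is a name chosen for this example)
_LIT8(KMyMimeType, "text/xml");
matchData->SetMimeTypeL(KMyMimeType);

// 3. Set variant string (KMyVariant is a name chosen for this example)
_LIT8(KMyVariant, "LicenseeX");
matchData->SetVariantL(KMyVariant);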

// 4. Call creation method (assumption that content handler was created previously)
CParser* parser = CParser::NewLC(*matchData, *contentHandler);

// 5. Use the parser
// ….

// 6. Destroy the parser and CMatchData object
CleanupStack::PopAndDestroy(2, matchData);
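
The idea of selecting a plugin from a MIME type and an optional variant can be illustrated with a small portable C++ model. This is only a sketch under stated assumptions: PluginInfo, MatchData and SelectPlugin() are hypothetical names, the plugin list is invented, and the rule used here (first plugin whose MIME type matches, narrowed by the variant when one is given) is an assumption for illustration, not the framework's documented resolution algorithm.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of an installed parser plugin.
struct PluginInfo {
    std::string mimeType;
    std::string variant;
    std::string name;
};

// Hypothetical model of the selection criteria held by a match-data object.
struct MatchData {
    std::string mimeType;
    std::string variant;  // empty means "any variant"
};

// Returns the first plugin that satisfies the criteria, or nullptr if none does.
// Assumption: a MIME type match is mandatory, a variant match only if requested.
inline const PluginInfo* SelectPlugin(const std::vector<PluginInfo>& plugins,
                                      const MatchData& wanted) {
    for (const PluginInfo& plugin : plugins) {
        if (plugin.mimeType != wanted.mimeType) continue;
        if (wanted.variant.empty() || plugin.variant == wanted.variant) {
            return &plugin;
        }
    }
    return nullptr;
}
```

With such a rule, supplying only the MIME type returns whichever matching plugin is listed first, while supplying a variant as well pins the choice to one supplier's parser.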

Calling a parser plugin

To parse a document you must write code which somewhere includes calls to the parse methods of a CParser object. It is often simplest to call them indirectly from the global parse methods provided with the framework.

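The global parse methods are:

A. Xml::ParseL(Xml::CParser& aParser, const TDesC8& aDocument)
B. Xml::ParseL(Xml::CParser& aParser, RFs& aFs, const TDesC& aFilename)
C. Xml::ParseL(Xml::CParser& aParser, RFile& aFile)

The global parse methods call the CParser object parse methods, which are CParser::ParseBeginL(), CParser::ParseL() and CParser::ParseEndL().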

Which methods you call depends on the nature of the input to the parser. Input may consist of one file or several files and it may be received in one piece or asynchronously in chunks. The files may be of the same or different types and asynchronous input may or may not be buffered before parsing.

Global parse method A makes a single call to each of the CParser parse methods. This is the simplest approach but it only works when you are parsing a single file which has previously been buffered.

Global parse methods B and C have the same functionality: the only difference is how they identify the input file (by name or from an RFile object). They call CParser::ParseL() in a loop and then call CParser::ParseEndL(). The use of a loop means that input does not have to be buffered, but only one file can be parsed by this technique. This restriction follows from the way the CParser parse methods work.

To parse several unbuffered documents of the same type, you need multiple calls to global parse method B or C. To parse several buffered documents possibly of different types, you need multiple calls to global parse method A. Other eventualities require individual calls to the parse methods of CParser.
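
The difference between the single-call pattern of method A and the loop used by methods B and C can be sketched in portable C++. This is a model under stated assumptions, not Symbian code: MockParser is a hypothetical stand-in for CParser that only records what it is fed, ParseWhole() and ParseInChunks() are invented names, and the opening ParseBegin() call in the chunked pattern is an assumption, since the text above mentions only the loop and the closing call.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for CParser: it records fragments instead of parsing.
class MockParser {
public:
    void ParseBegin() {
        assert(!open_);  // one document at a time
        open_ = true;
        document_.clear();
    }

    void Parse(const std::string& fragment) {
        assert(open_);
        document_ += fragment;
        ++fragments_;
    }

    void ParseEnd() {
        assert(open_);
        open_ = false;
        ++documents_;
    }

    const std::string& Document() const { return document_; }
    int Fragments() const { return fragments_; }  // total fragments received
    int Documents() const { return documents_; }  // total documents completed

private:
    bool open_ = false;
    std::string document_;
    int fragments_ = 0;
    int documents_ = 0;
};

// Pattern of method A: the whole document is already in a buffer,
// so each parse method is called exactly once.
inline void ParseWhole(MockParser& parser, const std::string& document) {
    parser.ParseBegin();
    parser.Parse(document);
    parser.ParseEnd();
}

// Pattern of methods B and C: the input is read piece by piece, so
// Parse() is called in a loop and ParseEnd() closes the single document.
inline void ParseInChunks(MockParser& parser, const std::string& source,
                          std::size_t chunkSize) {
    assert(chunkSize > 0);
    parser.ParseBegin();
    for (std::size_t pos = 0; pos < source.size(); pos += chunkSize) {
        parser.Parse(source.substr(pos, chunkSize));
    }
    parser.ParseEnd();
}
```

In both patterns the closing call marks the end of exactly one document, which is why parsing several documents means repeating the whole sequence once per document.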



Adding a Content Processor Plugin

Example scenario

Sometimes you want to do more with a document than just parse it: for instance you might want to validate the parsed text against a specification, and then autocorrect spelling mistakes in the validated text.

Implementing MContentProcessor

In such a case you write three applications: a parser, a validator and an autocorrector. They should implement the MContentProcessor interface. Writing a content processor is almost exactly the same as writing a content handler, as explained in How to parse an XML document: MContentProcessor is a small extension of MContentHandler, and you write each application by implementing the callback functions of MContentHandler. The only difference is that MContentProcessor has a mechanism for directing output, so that the output of your parser is the input to your validator and the output of your validator is the input to your autocorrector.

You direct output of a content processor by implementing its SetContentSink() function so that your parser outputs to your validator and your validator outputs to your autocorrector. A sequence of several applications linked in this way is called a plugin chain.

To perform the actual parsing, you use a CParser object as explained in Choosing a Parser Plugin. To ensure that parsing is followed by validation and autocorrection, you associate the CParser object with the plugin chain. You do this by calling the SetProcessorChainL() function of the CParser object with a list of the items in the plugin chain as a parameter.
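
The way a plugin chain passes content from one stage to the next can be sketched in portable C++. This is a minimal model under stated assumptions, not Symbian code: ChainHandler and ChainProcessor are hypothetical, single-callback stand-ins for MContentHandler and MContentProcessor, the validation and correction rules are toy examples, and the stages are linked by hand in RunChain() rather than through SetProcessorChainL().

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical one-callback version of the handler interface.
class ChainHandler {
public:
    virtual ~ChainHandler() = default;
    virtual void OnContent(const std::string& text) = 0;
};

// Hypothetical model of a content processor: a handler that also has a sink.
class ChainProcessor : public ChainHandler {
public:
    void SetContentSink(ChainHandler& next) { next_ = &next; }

protected:
    // Passes text on to the next stage, if one has been set.
    void Forward(const std::string& text) {
        if (next_ != nullptr) next_->OnContent(text);
    }

private:
    ChainHandler* next_ = nullptr;
};

// Toy validator: rejects empty content and forwards everything else.
class Validator : public ChainProcessor {
public:
    void OnContent(const std::string& text) override {
        if (text.empty()) {
            ++rejected_;
            return;
        }
        Forward(text);
    }
    int Rejected() const { return rejected_; }

private:
    int rejected_ = 0;
};

// Toy autocorrector: fixes one known misspelling.
class Autocorrector : public ChainProcessor {
public:
    void OnContent(const std::string& text) override {
        Forward(text == "teh" ? std::string("the") : text);
    }
};

// Final stage: the client's handler, which stores what reaches it.
class Collector : public ChainHandler {
public:
    void OnContent(const std::string& text) override { words_.push_back(text); }
    const std::vector<std::string>& Words() const { return words_; }

private:
    std::vector<std::string> words_;
};

// Links validator -> autocorrector -> collector and feeds parsed text through.
inline std::vector<std::string> RunChain(const std::vector<std::string>& parsedText) {
    Validator validator;
    Autocorrector autocorrector;
    Collector collector;
    validator.SetContentSink(autocorrector);
    autocorrector.SetContentSink(collector);
    for (const std::string& text : parsedText) {
        validator.OnContent(text);
    }
    return collector.Words();
}
```

Each stage knows only its own sink, so stages can be added, removed or reordered without changing the others.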