
ACEXML_Parser Class Reference

A SAX based parser.

#include "ACEXML/parser/parser/Parser.h"

Inherits ACEXML_XMLReader.

Public Methods

 ACEXML_Parser (void)
 Default constructor.

virtual ~ACEXML_Parser (void)
 Destructor.

virtual ACEXML_ContentHandler * getContentHandler (void) const
 Return the current content handler.

virtual ACEXML_DTDHandler * getDTDHandler (void) const
 Return the current DTD handler.

virtual ACEXML_EntityResolver * getEntityResolver (void) const
 Return the current entity resolver.

virtual ACEXML_ErrorHandler * getErrorHandler (void) const
 Return the current error handler.

virtual int getFeature (const ACEXML_Char *name, ACEXML_Env &xmlenv)
 Look up the value of a feature.

virtual void setFeature (const ACEXML_Char *name, int boolean_value, ACEXML_Env &xmlenv)
 Activate or deactivate a feature.

virtual void * getProperty (const ACEXML_Char *name, ACEXML_Env &xmlenv)
 Look up the value of a property.

virtual void setProperty (const ACEXML_Char *name, void *value, ACEXML_Env &xmlenv)
 Set the value of a property.

virtual void parse (ACEXML_InputSource *input, ACEXML_Env &xmlenv)
 Parse an XML document.

virtual void parse (const ACEXML_Char *systemId, ACEXML_Env &xmlenv)
 Parse an XML document from a system identifier (URI).

virtual void setContentHandler (ACEXML_ContentHandler *handler)
 Allow an application to register a content event handler.

virtual void setDTDHandler (ACEXML_DTDHandler *handler)
 Allow an application to register a DTD event handler.

virtual void setEntityResolver (ACEXML_EntityResolver *resolver)
 Allow an application to register an entity resolver.

virtual void setErrorHandler (ACEXML_ErrorHandler *handler)
 Allow an application to register an error event handler.

ACEXML_Char skip_whitespace (ACEXML_Char **whitespace)
 Skip whitespace until the first non-whitespace character is encountered and consumed from the current input CharStream.

int skip_whitespace_count (ACEXML_Char *peek=0)
 Skip whitespace until the first non-whitespace character.

int is_whitespace (ACEXML_Char c)
 Check if a character c is a whitespace.

int is_whitespace_or_equal (ACEXML_Char c)
 Check if a character c is a whitespace or '='.

int is_nonname (ACEXML_Char c)
 Check if a character c is a valid character for nonterminal NAME.

int skip_equal (void)
 Skip an equal sign.

int get_quoted_string (ACEXML_Char *&str)
 Get a quoted string.

int parse_processing_instruction (ACEXML_Env &xmlenv)
 Parse a PI statement.

int grok_comment ()
 Skip over a comment.

ACEXML_Char * read_name (ACEXML_Char ch=0)
 Read a name from the input CharStream (until white space).

int parse_doctypedecl (ACEXML_Env &xmlenv)
 Parse the DOCTYPE declaration.

void parse_element (int is_root, ACEXML_Env &xmlenv)
 Parse an XML element.

void parse_xml_prolog (ACEXML_Env &xmlenv)
 Parse XML Prolog.

int parse_char_reference (ACEXML_Char *buf, size_t len)
 Parse a character reference, e.g., "&#x20;" or "&#30;".

const ACEXML_String * parse_reference (void)
 Parse an entity reference, e.g., "&amp;".

int parse_cdata (ACEXML_Env &xmlenv)
 Parse a CDATA section.

int parse_internal_dtd (ACEXML_Env &xmlenv)
 Parse a "markupdecl" section. This includes both the "markupdecl" and "DeclSep" sections of the XML specification.

int parse_element_decl (ACEXML_Env &xmlenv)
 Parse an "ELEMENT" decl.

int parse_entity_decl (ACEXML_Env &xmlenv)
 Parse an "ENTITY" decl.

int parse_attlist_decl (ACEXML_Env &xmlenv)
 Parse an "ATTLIST" decl.

int parse_notation_decl (ACEXML_Env &xmlenv)
 Parse a "NOTATION" decl.

int parse_external_id_and_ref (ACEXML_Char *&publicId, ACEXML_Char *&systemId, ACEXML_Env &xmlenv)
 Parse an ExternalID or a reference to PUBLIC ExternalID.

int parse_children_definition (ACEXML_Env &xmlenv)
 Parse the "children" and "Mixed" non-terminals in contentspec.

int parse_child (int skip_open_paren, ACEXML_Env &xmlenv)
 Parse a cp non-terminal.


Protected Methods

ACEXML_Char get (void)
 Get a character.

ACEXML_Char peek (void)
 Peek a character.

int try_grow_cdata (size_t size, size_t &len, ACEXML_Env &xmlenv)
 Check if more data can be added to a character buffer in obstack.


Static Protected Attributes

const ACEXML_Char simple_parsing_feature_ [] = { 'S', 'i', 'm', 'p', 'l', 'e', 0 }
 This constant string defines the name of "simple XML parsing" feature.

const ACEXML_Char namespaces_feature_ [] = {'h', 't', 't', 'p', ':', '/', '/', 'x', 'm', 'l', '.', 'o', 'r', 'g', '/', 's', 'a', 'x', '/', 'f', 'e', 'a', 't', 'u', 'r', 'e', 's', '/', 'n', 'a', 'm', 'e', 's', 'p', 'a', 'c', 'e', 's', 0 }
 This constant string defines the SAX XML Namespace feature.

const ACEXML_Char namespace_prefixes_feature_ [] = {'h', 't', 't', 'p', ':', '/', '/', 'x', 'm', 'l', '.', 'o', 'r', 'g', '/', 's', 'a', 'x', '/', 'f', 'e', 'a', 't', 'u', 'r', 'e', 's', '/', 'n', 'a', 'm', 'e', 's', 'p', 'a', 'c', 'e', '-', 'p', 'r', 'e', 'f', 'i', 'x', 'e', 's', 0 }
 This constant string defines the SAX XML Namespace prefixes feature.


Private Methods

void report_error (const ACEXML_Char *message, ACEXML_Env &xmlenv)
 Dispatch errors to ErrorHandler.

void report_warning (const ACEXML_Char *message, ACEXML_Env &xmlenv)
 Dispatch warnings to ErrorHandler.

void report_fatal_error (const ACEXML_Char *message, ACEXML_Env &xmlenv)
 Dispatch fatal errors to ErrorHandler.

void report_prefix_mapping (const ACEXML_Char *prefix, const ACEXML_Char *uri, const ACEXML_Char *name, int start, ACEXML_Env &xmlenv)
 Dispatch prefix mapping calls to the ContentHandler.

int parse_token (const ACEXML_Char *keyword)
 Parse a keyword.


Private Attributes

ACEXML_DTDHandler * dtd_handler_
 Keeping track of the handlers. We do not manage the memory for handlers.

ACEXML_EntityResolver * entity_resolver_
ACEXML_ContentHandler * content_handler_
ACEXML_ErrorHandler * error_handler_
ACEXML_CharStream * instream_
 Feature and properties management structure here. Current input char stream.

ACEXML_Char * doctype_
 My doctype, if any.

ACEXML_Char * dtd_system_
 External DTD System Literal, if any.

ACEXML_Char * dtd_public_
 External DTD Public Literal, if any.

ACE_Obstack_T< ACEXML_Char > obstack_
ACEXML_NamespaceSupport xml_namespace_
ACEXML_Entity_Manager entities_
ACEXML_LocatorImpl locator_
int simple_parsing_
int namespaces_
int namespace_prefixes_

Detailed Description

A SAX based parser.


Constructor & Destructor Documentation

ACEXML_Parser::ACEXML_Parser (void)

Default constructor.

ACEXML_Parser::~ACEXML_Parser (void) [virtual]

Destructor.


Member Function Documentation

ACEXML_INLINE ACEXML_Char ACEXML_Parser::get (void) [protected]

Get a character.

int ACEXML_Parser::get_quoted_string (ACEXML_Char *& str)

Get a quoted string.

Quoted strings are used to specify attribute values. This routine replaces character and entity references on the fly. Parameter entities are neither allowed nor replaced in this function, but regular entities are.

Parameters:
str  returns the un-quoted string.
Return values:
0  on success, -1 otherwise.
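The un-quoting and on-the-fly replacement described above can be illustrated with a small standalone sketch. It is not the ACEXML implementation: it assumes plain `char` and `std::string` in place of ACEXML_Char and the obstack, a string-plus-cursor pair in place of the input CharStream, and it expands only the five entities predefined by XML 1.0. The name `unquote_attribute_value` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical sketch: read a quoted string starting at in[pos] and store the
// un-quoted value in out. Only the five predefined XML entities are expanded;
// any other reference is treated as an error here.
// Returns 0 on success, -1 otherwise (mirroring the documented convention).
int unquote_attribute_value (const std::string &in, std::size_t &pos,
                             std::string &out)
{
  assert (pos <= in.size ());
  if (pos == in.size () || (in[pos] != '\'' && in[pos] != '"'))
    return -1;                       // no opening quote
  const char quote = in[pos++];
  out.clear ();

  while (pos < in.size () && in[pos] != quote)
    {
      if (in[pos] != '&')
        {
          out += in[pos++];          // ordinary character
          continue;
        }
      const std::size_t semi = in.find (';', pos);
      if (semi == std::string::npos)
        return -1;                   // reference without ';'
      const std::string name = in.substr (pos + 1, semi - pos - 1);
      if (name == "lt")        out += '<';
      else if (name == "gt")   out += '>';
      else if (name == "amp")  out += '&';
      else if (name == "apos") out += '\'';
      else if (name == "quot") out += '"';
      else
        return -1;                   // entity not known to this sketch
      pos = semi + 1;
    }

  if (pos == in.size ())
    return -1;                       // no closing quote
  ++pos;                             // consume the closing quote
  return 0;
}
```

On success the cursor is left just past the closing quote, which is one plausible reading of how the real routine leaves the stream.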

ACEXML_INLINE ACEXML_ContentHandler * ACEXML_Parser::getContentHandler (void) const [virtual]

Return the current content handler.

Reimplemented from ACEXML_XMLReader.

ACEXML_INLINE ACEXML_DTDHandler * ACEXML_Parser::getDTDHandler (void) const [virtual]

Return the current DTD handler.

Reimplemented from ACEXML_XMLReader.

ACEXML_INLINE ACEXML_EntityResolver * ACEXML_Parser::getEntityResolver (void) const [virtual]

Return the current entity resolver.

Reimplemented from ACEXML_XMLReader.

ACEXML_INLINE ACEXML_ErrorHandler * ACEXML_Parser::getErrorHandler (void) const [virtual]

Return the current error handler.

Reimplemented from ACEXML_XMLReader.

int ACEXML_Parser::getFeature (const ACEXML_Char * name, ACEXML_Env & xmlenv) [virtual]

Look up the value of a feature.

This method allows programmers to check whether a specific feature has been activated in the parser.

Reimplemented from ACEXML_XMLReader.
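As an illustration of name-based feature lookup, the following sketch keeps one flag per feature name listed under Static Protected Attributes. It is an assumption-laden stand-in, not the ACEXML code: the class `FeatureSet` is hypothetical, all flags start at 0, plain `char` replaces ACEXML_Char, and an unknown name is reported with a -1 return value instead of through ACEXML_Env.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the feature handling of a SAX parser.
class FeatureSet
{
public:
  FeatureSet () { flags_[0] = flags_[1] = flags_[2] = 0; }

  // Returns 0 or 1 for a known feature, -1 for an unknown name.
  int get (const char *name) const
  {
    const int i = index_of (name);
    return i < 0 ? -1 : flags_[i];
  }

  // Returns 0 on success, -1 for an unknown name.
  int set (const char *name, int boolean_value)
  {
    const int i = index_of (name);
    if (i < 0)
      return -1;
    flags_[i] = boolean_value ? 1 : 0;
    return 0;
  }

private:
  // Maps a feature name to its slot in flags_, or -1 if it is not known.
  static int index_of (const char *name)
  {
    assert (name != 0);
    static const char *const names[] = {
      "Simple",
      "http://xml.org/sax/features/namespaces",
      "http://xml.org/sax/features/namespace-prefixes"
    };
    for (int i = 0; i < 3; ++i)
      if (std::strcmp (name, names[i]) == 0)
        return i;
    return -1;
  }

  int flags_[3];
};
```

A caller would typically set a feature before parsing and query it afterwards, for example `features.set ("http://xml.org/sax/features/namespaces", 1)` followed by `features.get (...)`.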

void * ACEXML_Parser::getProperty (const ACEXML_Char * name, ACEXML_Env & xmlenv) [virtual]

Look up the value of a property.

Reimplemented from ACEXML_XMLReader.

int ACEXML_Parser::grok_comment (void)

Skip over a comment.

The first character encountered should always be the first '-' in the comment prefix "<!--".
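A minimal sketch of skipping a comment from that starting position follows. It assumes a string-plus-cursor pair in place of the input CharStream and a 0 / -1 return convention, which this page does not state for grok_comment; the name `skip_comment` is hypothetical. It also applies the XML 1.0 rule that "--" may not appear inside a comment body.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical sketch: in[pos] must be the first '-' of "<!--".
// On success, pos is advanced past the closing "-->".
// Returns 0 on success, -1 otherwise (assumed convention).
int skip_comment (const std::string &in, std::size_t &pos)
{
  assert (pos <= in.size ());
  if (in.compare (pos, 2, "--") != 0)
    return -1;                       // not positioned at the "--" of "<!--"

  // The first "--" after the prefix must be the start of "-->".
  const std::size_t end = in.find ("--", pos + 2);
  if (end == std::string::npos
      || end + 2 >= in.size ()
      || in[end + 2] != '>')
    return -1;                       // unterminated, or "--" inside the body

  pos = end + 3;                     // first character after "-->"
  return 0;
}
```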

ACEXML_INLINE int ACEXML_Parser::is_nonname (ACEXML_Char c)

Check if a character c is a valid character for nonterminal NAME.

Return values:
1  if true, 0 otherwise.

ACEXML_INLINE int ACEXML_Parser::is_whitespace (ACEXML_Char c)

Check if a character c is a whitespace.

Return values:
1  if c is a valid white space character. 0 otherwise.
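For reference, XML 1.0 defines white space (production S) as space, tab, carriage return, and line feed. A check along those lines might look like the sketch below; it uses plain `char` instead of ACEXML_Char, the name `is_xml_whitespace` is hypothetical, and the actual ACEXML implementation may differ.

```cpp
#include <cassert>

// Hypothetical stand-in for a whitespace test following XML 1.0 production S:
// space (0x20), tab (0x09), carriage return (0x0D), line feed (0x0A).
// Returns 1 if c is white space, 0 otherwise.
inline int is_xml_whitespace (char c)
{
  switch (c)
    {
    case 0x20:
    case 0x09:
    case 0x0D:
    case 0x0A:
      return 1;
    default:
      return 0;
    }
}
```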

ACEXML_INLINE int ACEXML_Parser::is_whitespace_or_equal (ACEXML_Char c)

Check if a character c is a whitespace or '='.

Return values:
1  if true, 0 otherwise.

void ACEXML_Parser::parse (const ACEXML_Char * systemId, ACEXML_Env & xmlenv) [virtual]

Parse an XML document from a system identifier (URI).

Reimplemented from ACEXML_XMLReader.

void ACEXML_Parser::parse (ACEXML_InputSource * input, ACEXML_Env & xmlenv) [virtual]

Parse an XML document.

Reimplemented from ACEXML_XMLReader.

int ACEXML_Parser::parse_attlist_decl (ACEXML_Env & xmlenv)

Parse an "ATTLIST" decl.

The first character this method expects is always the 'A' (the first char) in the word "ATTLIST".

Return values:
0  on success, -1 otherwise.

int ACEXML_Parser::parse_cdata (ACEXML_Env & xmlenv)

Parse a CDATA section.

The first character should always be the first '[' in CDATA definition.

Return values:
0  on success.
-1  if fail.

int ACEXML_Parser::parse_char_reference (ACEXML_Char * buf, size_t len)

Parse a character reference, e.g., "&#x20;" or "&#30;".

The first character encountered should be the '#' char.

Parameters:
buf  points to a character buffer for the result.
len  specifies the capacity of the buffer.
Return values:
0  on success and -1 otherwise.
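The decoding step can be sketched as follows, under several stated assumptions: plain `char` replaces ACEXML_Char, a string-plus-cursor pair replaces the input CharStream, and only code points 1 to 127 are handled so that the result fits in one `char` plus a terminating NUL. The name `decode_char_reference` is hypothetical, and the real routine may encode wider values differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical sketch: in[pos] must be '#', i.e. the character after '&'.
// Decodes "#x20;" (hexadecimal) or "#32;" (decimal) into buf.
// Returns 0 on success and -1 otherwise.
int decode_char_reference (const std::string &in, std::size_t &pos,
                           char *buf, std::size_t len)
{
  assert (buf != 0);
  if (len < 2 || pos >= in.size () || in[pos] != '#')
    return -1;
  ++pos;

  unsigned base = 10;
  if (pos < in.size () && in[pos] == 'x')
    {
      base = 16;
      ++pos;
    }

  unsigned long value = 0;
  int digits = 0;
  for (; pos < in.size () && in[pos] != ';'; ++pos)
    {
      const char c = in[pos];
      unsigned d;
      if (c >= '0' && c <= '9')
        d = c - '0';
      else if (base == 16 && c >= 'a' && c <= 'f')
        d = c - 'a' + 10;
      else if (base == 16 && c >= 'A' && c <= 'F')
        d = c - 'A' + 10;
      else
        return -1;                   // not a digit in the current base
      value = value * base + d;
      if (value > 127)
        return -1;                   // outside the range this sketch handles
      ++digits;
    }

  if (pos >= in.size () || digits == 0 || value == 0)
    return -1;                       // missing ';', no digits, or NUL
  ++pos;                             // consume ';'
  buf[0] = static_cast<char> (value);
  buf[1] = '\0';
  return 0;
}
```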

int ACEXML_Parser::parse_child (int skip_open_paren, ACEXML_Env & xmlenv)

Parse a cp non-terminal.

cp can either be a seq or a choice. This function calls itself recursively.

Parameters:
skip_open_paren  when non-zero, it indicates that the open paren of the seq or choice has already been removed from the input stream.
Return values:
0  on success, -1 otherwise.
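The recursion can be illustrated with a standalone recognizer for the XML 1.0 cp grammar, where a cp is a Name, a choice, or a seq, each optionally followed by '?', '*' or '+'. This is a sketch under assumptions, not the ACEXML routine: it works on a string and cursor instead of the input CharStream, accepts only a simplified ASCII subset of Name characters, reports nothing to a handler, and uses the hypothetical name `parse_cp`.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

static void cp_skip_ws (const std::string &in, std::size_t &pos)
{
  while (pos < in.size ()
         && std::string (" \t\r\n").find (in[pos]) != std::string::npos)
    ++pos;
}

static void cp_skip_occurrence (const std::string &in, std::size_t &pos)
{
  if (pos < in.size ()
      && (in[pos] == '?' || in[pos] == '*' || in[pos] == '+'))
    ++pos;
}

// Hypothetical sketch. When skip_open_paren is non-zero, the '(' of the
// seq or choice is assumed to have been consumed already.
// Returns 0 on success, -1 otherwise.
int parse_cp (const std::string &in, std::size_t &pos, int skip_open_paren)
{
  assert (pos <= in.size ());
  const bool at_paren = pos < in.size () && in[pos] == '(';
  if (!skip_open_paren && !at_paren)
    {
      // A bare Name (simplified ASCII character set).
      const std::size_t start = pos;
      while (pos < in.size ()
             && (std::isalnum (static_cast<unsigned char> (in[pos]))
                 || in[pos] == '_' || in[pos] == ':'
                 || in[pos] == '-' || in[pos] == '.'))
        ++pos;
      if (pos == start)
        return -1;
      cp_skip_occurrence (in, pos);
      return 0;
    }
  if (!skip_open_paren)
    ++pos;                           // consume '('

  char separator = 0;                // ',' for a seq, '|' for a choice
  for (;;)
    {
      cp_skip_ws (in, pos);
      if (parse_cp (in, pos, 0) != 0)   // recursive call for each member
        return -1;
      cp_skip_ws (in, pos);
      if (pos >= in.size ())
        return -1;                   // missing ')'
      const char c = in[pos++];
      if (c == ')')
        break;
      if (c != ',' && c != '|')
        return -1;
      if (separator == 0)
        separator = c;
      else if (separator != c)
        return -1;                   // seq and choice cannot be mixed
    }
  cp_skip_occurrence (in, pos);
  return 0;
}
```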

int ACEXML_Parser::parse_children_definition (ACEXML_Env & xmlenv)

Parse the "children" and "Mixed" non-terminals in contentspec.

The first character this function sees must be the first open paren '(' in children.

Return values:
0  on success, -1 otherwise.

int ACEXML_Parser::parse_doctypedecl (ACEXML_Env & xmlenv)

Parse the DOCTYPE declaration.

The first character encountered should always be 'D' in doctype prefix: "<!DOCTYPE".

void ACEXML_Parser::parse_element (int is_root, ACEXML_Env & xmlenv)

Parse an XML element.

The first character encountered should be the first character of the element "Name".

Parameters:
is_root  If not 0, then we are expecting to see the "root" element now, and the next element's name needs to match the name defined in the DOCTYPE definition, i.e., this->doctype_.
Todo:
Instead of simply checking for the root element based on the argument is_root, we should instead either pass in some sort of validator or allow the function to return the element name so it can be used in a validator.

int ACEXML_Parser::parse_element_decl (ACEXML_Env & xmlenv)

Parse an "ELEMENT" decl.

The first character this method expects is always the 'L' (the second char) in the word "ELEMENT".

Return values:
0  on success, -1 otherwise.

int ACEXML_Parser::parse_entity_decl (ACEXML_Env & xmlenv)

Parse an "ENTITY" decl.

The first character this method expects is always the 'N' (the second char) in the word "ENTITY".

Return values:
0  on success, -1 otherwise.

int ACEXML_Parser::parse_external_id_and_ref (ACEXML_Char *& publicId, ACEXML_Char *& systemId, ACEXML_Env & xmlenv)

Parse an ExternalID or a reference to PUBLIC ExternalID.

Possible cases are in the forms of:

SYSTEM 'quoted string representing system resource'
PUBLIC 'quoted name of public ID' 'quoted resource'
PUBLIC 'quoted name we are referring to'

The first character this function sees must be either 'S' or 'P'. When the function finishes parsing, the input stream points at the first non-whitespace character.

Parameters:
publicId  returns the unquoted publicId read. If none is available, it will be reset to 0.
systemId  returns the unquoted systemId read. If none is available, it will be reset to 0.
Return values:
0  on success, -1 otherwise.
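The three forms can be recognized with a sketch like the one below. It is illustrative only: it assumes a string-plus-cursor pair instead of the input CharStream, returns the IDs in a hypothetical `ExternalId` struct with presence flags instead of pointers reset to 0, and does no entity or character validation. The names `parse_external_id`, `ext_skip_ws` and `ext_read_quoted` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result type; has_* is false where the real routine would
// reset the corresponding pointer to 0.
struct ExternalId
{
  bool has_public;
  bool has_system;
  std::string public_id;
  std::string system_id;
};

static void ext_skip_ws (const std::string &in, std::size_t &pos)
{
  while (pos < in.size ()
         && std::string (" \t\r\n").find (in[pos]) != std::string::npos)
    ++pos;
}

static bool ext_is_quote (const std::string &in, std::size_t pos)
{
  return pos < in.size () && (in[pos] == '\'' || in[pos] == '"');
}

static bool ext_read_quoted (const std::string &in, std::size_t &pos,
                             std::string &out)
{
  if (!ext_is_quote (in, pos))
    return false;
  const std::size_t end = in.find (in[pos], pos + 1);
  if (end == std::string::npos)
    return false;                    // no matching closing quote
  out = in.substr (pos + 1, end - pos - 1);
  pos = end + 1;
  return true;
}

// in[pos] must be the 'S' of SYSTEM or the 'P' of PUBLIC.
// Returns 0 on success, -1 otherwise; on success pos is left at the
// first non-whitespace character after the ID.
int parse_external_id (const std::string &in, std::size_t &pos, ExternalId &id)
{
  assert (pos <= in.size ());
  id.has_public = id.has_system = false;
  id.public_id.clear ();
  id.system_id.clear ();

  if (in.compare (pos, 6, "SYSTEM") == 0)
    {
      pos += 6;
      ext_skip_ws (in, pos);
      if (!ext_read_quoted (in, pos, id.system_id))
        return -1;
      id.has_system = true;
    }
  else if (in.compare (pos, 6, "PUBLIC") == 0)
    {
      pos += 6;
      ext_skip_ws (in, pos);
      if (!ext_read_quoted (in, pos, id.public_id))
        return -1;
      id.has_public = true;
      ext_skip_ws (in, pos);
      if (ext_is_quote (in, pos))    // optional system literal
        {
          if (!ext_read_quoted (in, pos, id.system_id))
            return -1;
          id.has_system = true;
        }
    }
  else
    return -1;

  ext_skip_ws (in, pos);
  return 0;
}
```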

int ACEXML_Parser::parse_internal_dtd (ACEXML_Env & xmlenv)

Parse a "markupdecl" section. This includes both the "markupdecl" and "DeclSep" sections of the XML specification.

int ACEXML_Parser::parse_notation_decl (ACEXML_Env & xmlenv)

Parse a "NOTATION" decl.

The first character this method expects is always the 'N' (the first char) in the word "NOTATION".

Return values:
0  on success, -1 otherwise.

int ACEXML_Parser::parse_processing_instruction (ACEXML_Env & xmlenv)

Parse a PI statement.

The first character encountered should always be '?' in the PI prefix "<?".

Return values:
0  on success, -1 otherwise.

const ACEXML_String * ACEXML_Parser::parse_reference (void)

Parse an entity reference, e.g., "&amp;".

The first character encountered should be the character following '&'.

Returns:
A pointer to the resolved const ACEXML_String if the entity was previously defined, 0 otherwise.

int ACEXML_Parser::parse_token (const ACEXML_Char * keyword) [private]

Parse a keyword.

void ACEXML_Parser::parse_xml_prolog (ACEXML_Env & xmlenv)

Parse XML Prolog.

ACEXML_INLINE ACEXML_Char ACEXML_Parser::peek (void) [protected]

Peek a character.

ACEXML_Char * ACEXML_Parser::read_name (ACEXML_Char ch = 0)

Read a name from the input CharStream (until white space).

If ch != 0, then the first name character has already been consumed from the input CharStream; otherwise, read_name will use this->get() to acquire the initial character.

Returns:
A pointer to the string in the obstack, 0 if it's not a valid name.

void ACEXML_Parser::report_error (const ACEXML_Char * message, ACEXML_Env & xmlenv) [private]

Dispatch errors to ErrorHandler.

void ACEXML_Parser::report_fatal_error (const ACEXML_Char * message, ACEXML_Env & xmlenv) [private]

Dispatch fatal errors to ErrorHandler.

void ACEXML_Parser::report_prefix_mapping (const ACEXML_Char * prefix, const ACEXML_Char * uri, const ACEXML_Char * name, int start, ACEXML_Env & xmlenv) [private]

Dispatch prefix mapping calls to the ContentHandler.

Parameters:
prefix  Namespace prefix
uri  Namespace URI
name  Local name
start  1 => startPrefixMapping, 0 => endPrefixMapping

void ACEXML_Parser::report_warning (const ACEXML_Char * message, ACEXML_Env & xmlenv) [private]

Dispatch warnings to ErrorHandler.

ACEXML_INLINE void ACEXML_Parser::setContentHandler (ACEXML_ContentHandler * handler) [virtual]

Allow an application to register a content event handler.

Reimplemented from ACEXML_XMLReader.

ACEXML_INLINE void ACEXML_Parser::setDTDHandler (ACEXML_DTDHandler * handler) [virtual]

Allow an application to register a DTD event handler.

Reimplemented from ACEXML_XMLReader.

ACEXML_INLINE void ACEXML_Parser::setEntityResolver (ACEXML_EntityResolver * resolver) [virtual]

Allow an application to register an entity resolver.

Reimplemented from ACEXML_XMLReader.

ACEXML_INLINE void ACEXML_Parser::setErrorHandler (ACEXML_ErrorHandler * handler) [virtual]

Allow an application to register an error event handler.

Reimplemented from ACEXML_XMLReader.

void ACEXML_Parser::setFeature (const ACEXML_Char * name, int boolean_value, ACEXML_Env & xmlenv) [virtual]

Activate or deactivate a feature.

Reimplemented from ACEXML_XMLReader.

void ACEXML_Parser::setProperty (const ACEXML_Char * name, void * value, ACEXML_Env & xmlenv) [virtual]

Set the value of a property.

Reimplemented from ACEXML_XMLReader.

int ACEXML_Parser::skip_equal (void)

Skip an equal sign.

Return values:
0  on success, -1 if no equal sign is found.

ACEXML_Char ACEXML_Parser::skip_whitespace (ACEXML_Char ** whitespace)

Skip whitespace until the first non-whitespace character is encountered and consumed from the current input CharStream.

Parameters:
whitespace  Return a pointer to the string of skipped whitespace after proper conversion. Null if there's no whitespace found.
Returns:
The first non-whitespace character (which will be consumed from the CharStream). If no whitespace is found, it returns 0.
See also:
skip_whitespace_count

int ACEXML_Parser::skip_whitespace_count (ACEXML_Char * peek = 0)

Skip whitespace until the first non-whitespace character.

The first non-whitespace character is not consumed. This method does peek into the input CharStream and therefore is more expensive than skip_whitespace.

Parameters:
peek  If non-null, peek points to an ACEXML_Char where skip_whitespace_count stores the first non-whitespace character it sees (the character is not removed from the stream).
Returns:
The number of whitespace characters consumed.
See also:
skip_whitespace
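The difference from skip_whitespace is that the terminating character stays in the input. A minimal sketch of that behaviour, assuming a string-plus-cursor pair in place of the input CharStream, plain `char` in place of ACEXML_Char, and the hypothetical name `skip_ws_count`:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical sketch: advance pos over white space and return how many
// characters were skipped. If peek is non-null, *peek receives the first
// non-whitespace character, which is left unconsumed ('\0' at end of input).
int skip_ws_count (const std::string &in, std::size_t &pos, char *peek = 0)
{
  assert (pos <= in.size ());
  int count = 0;
  while (pos < in.size ()
         && (in[pos] == ' ' || in[pos] == '\t'
             || in[pos] == '\r' || in[pos] == '\n'))
    {
      ++pos;
      ++count;
    }
  if (peek != 0)
    *peek = pos < in.size () ? in[pos] : '\0';
  return count;
}
```

Because the cursor stops on the non-whitespace character, a caller can inspect `*peek` and still hand the same character to the next parsing routine.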

int ACEXML_Parser::try_grow_cdata (size_t size, size_t & len, ACEXML_Env & xmlenv) [protected]

Check if more data can be added to a character buffer in obstack.

If not, the existing data in the buffer will be cleared out by freezing the segment and passing it out through a content_handler_->characters () call. counter records the length of the existing data in obstack.


Member Data Documentation

ACEXML_ContentHandler* ACEXML_Parser::content_handler_ [private]
 

ACEXML_Char* ACEXML_Parser::doctype_ [private]
 

My doctype, if any.

ACEXML_DTDHandler* ACEXML_Parser::dtd_handler_ [private]
 

Keeping track of the handlers. We do not manage the memory for handlers.

ACEXML_Char* ACEXML_Parser::dtd_public_ [private]
 

External DTD Public Literal, if any.

ACEXML_Char* ACEXML_Parser::dtd_system_ [private]
 

External DTD System Literal, if any.

ACEXML_Entity_Manager ACEXML_Parser::entities_ [private]
 

ACEXML_EntityResolver* ACEXML_Parser::entity_resolver_ [private]
 

ACEXML_ErrorHandler* ACEXML_Parser::error_handler_ [private]
 

ACEXML_CharStream* ACEXML_Parser::instream_ [private]
 

Feature and properties management structure here. Current input char stream.

ACEXML_LocatorImpl ACEXML_Parser::locator_ [private]
 

int ACEXML_Parser::namespace_prefixes_ [private]
 

int ACEXML_Parser::namespaces_ [private]
 

ACE_Obstack_T<ACEXML_Char> ACEXML_Parser::obstack_ [private]
 

int ACEXML_Parser::simple_parsing_ [private]
 

ACEXML_NamespaceSupport ACEXML_Parser::xml_namespace_ [private]
 


The documentation for this class was generated from the following files:
Generated on Thu Oct 10 17:28:03 2002 for ACEXML by doxygen1.2.13.1 written by Dimitri van Heesch, © 1997-2001