Class RegexpChunkParser

       object --+        
                |        
parse.api.ParserI --+    
                    |    
     api.ChunkParserI --+
                        |
                       RegexpChunkParser

A regular-expression-based chunk parser. RegexpChunkParser uses a sequence of rules to find chunks of a single type within a text. The chunking of the text is encoded using a ChunkString, and each rule acts by modifying the chunking in the ChunkString. The rules are all implemented using regular expression matching and substitution.

The RegexpChunkRule class and its subclasses (ChunkRule, ChinkRule, UnChunkRule, MergeRule, and SplitRule) define the rules that are used by RegexpChunkParser. Each rule defines an apply method, which modifies the chunking encoded by a given ChunkString.
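
The idea of rules as regular expression substitutions over an encoded chunking can be illustrated with a minimal, self-contained sketch. It does not use NLTK: SketchRule, encode_tags, and the brace-based encoding are hypothetical stand-ins chosen for illustration. In the sketch, apply returns a new string, whereas the rules described above modify a ChunkString.

```python
import re


def encode_tags(tags):
    """Encode a tag sequence as one flat string, one <TAG> per token."""
    return "".join("<%s>" % tag for tag in tags)


class SketchRule:
    """Hypothetical stand-in for a rule: a single regex substitution."""

    def __init__(self, pattern, repl, descr):
        self.regexp = re.compile(pattern)
        self.repl = repl
        self.descr = descr

    def apply(self, encoded):
        # Return the rewritten encoding; the input string is left unchanged.
        return self.regexp.sub(self.repl, encoded)


# "Chunk" rule: wrap an optional determiner, any adjectives, and a noun in braces.
chunk_np = SketchRule(r"((?:<DT>)?(?:<JJ>)*<NN>)", r"{\1}", "chunk DT? JJ* NN")

# "Unchunk" rule: strip the braces from a chunk made of a lone noun.
unchunk_bare_nn = SketchRule(r"\{(<NN>)\}", r"\1", "unchunk a bare NN")

encoded = encode_tags(["DT", "JJ", "NN", "VBD", "NN"])
after_chunk = chunk_np.apply(encoded)
after_unchunk = unchunk_bare_nn.apply(after_chunk)

print(after_chunk)    # {<DT><JJ><NN>}<VBD>{<NN>}
print(after_unchunk)  # {<DT><JJ><NN>}<VBD><NN>
```

Each rule only ever sees the encoded string, so the order in which rules are applied determines the final chunking.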

Instance Methods

  • __init__(self, rules, chunk_node='NP', top_node='S', trace=0) - Construct a new RegexpChunkParser.
  • _trace_apply(self, chunkstr, verbose) - Apply each of this RegexpChunkParser's rules to chunkstr, in turn, generating trace output between rules. Returns None.
  • _notrace_apply(self, chunkstr) - Apply each of this RegexpChunkParser's rules to chunkstr, in turn. Returns None.
  • parse(self, chunk_struct, trace=None) - Returns the best chunk structure for the given tokens, as a tree (Tree).
  • rules(self) - Returns the sequence of rules used by RegexpChunkParser (list of RegexpChunkRule).
  • __repr__(self) - Returns a concise string representation of this RegexpChunkParser (string).
  • __str__(self) - Returns a verbose string representation of this RegexpChunkParser (string).

Inherited from parse.api.ParserI: batch_iter_parse, batch_nbest_parse, batch_parse, batch_prob_parse, grammar, iter_parse, nbest_parse, prob_parse

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__

    Deprecated

Inherited from parse.api.ParserI: batch_test, get_parse, get_parse_dict, get_parse_list, get_parse_prob

Instance Variables
  • _rules (list of RegexpChunkRule) - The list of rules that should be applied to a text.
  • _trace (int) - The default level of tracing.

Properties

Inherited from object: __class__

Method Details

__init__(self, rules, chunk_node='NP', top_node='S', trace=0)
(Constructor)

Construct a new RegexpChunkParser.

Parameters:
  • rules (list of RegexpChunkRule) - The sequence of rules that should be used to generate the chunking for a tagged text.
  • chunk_node (string) - The node value that should be used for chunk subtrees. This is typically a short string describing the type of information contained by the chunk, such as "NP" for base noun phrases.
  • top_node (string) - The node value that should be used for the top node of the chunk structure.
  • trace (int) - The level of tracing that should be used when parsing a text. 0 will generate no tracing output; 1 will generate normal tracing output; and 2 or higher will generate verbose tracing output.
Overrides: object.__init__
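
To make the roles of chunk_node and top_node concrete, the following sketch turns an encoded chunking back into a nested structure. It is an illustration only: build_structure is a hypothetical helper, plain tuples stand in for the Tree class, and the brace-based encoding is assumed for the example.

```python
import re


def build_structure(tokens, encoded, chunk_node="NP", top_node="S"):
    """Group tokens according to an encoded chunking such as '{<DT><NN>}<VBD>'."""
    children = []
    index = 0
    # Each piece is either a braced chunk or a single unchunked tag.
    for piece in re.findall(r"\{[^{}]*\}|<[^<>]+>", encoded):
        width = piece.count("<")  # number of tokens covered by this piece
        group = tokens[index:index + width]
        index += width
        if piece.startswith("{"):
            # Chunked tokens become a subtree labelled with chunk_node.
            children.append((chunk_node, group))
        else:
            # Unchunked tokens hang directly under the top node.
            children.extend(group)
    return (top_node, children)


tokens = [("the", "DT"), ("little", "JJ"), ("cat", "NN"), ("sat", "VBD")]
structure = build_structure(tokens, "{<DT><JJ><NN>}<VBD>")
print(structure)
```

With the default labels, the three chunked tokens end up under an "NP" node and the verb sits directly under "S".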

_trace_apply(self, chunkstr, verbose)

Apply each of this RegexpChunkParser's rules to chunkstr, in turn. Generate trace output between each rule. If verbose is true, then generate verbose output.

Parameters:
  • chunkstr (ChunkString) - The chunk string to which each rule should be applied.
  • verbose (boolean) - Whether output should be verbose.
Returns: None
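
Applying rules in turn while tracing can be sketched as a simple loop. This is not the library's code: apply_rules_with_trace is a hypothetical function, rules are modelled as (description, pattern, replacement) triples, and the trace is returned as a list of lines so it can be inspected, whereas the method documented here returns None.

```python
import re


def apply_rules_with_trace(rules, encoded, verbose=False):
    """Apply each rule in order and record the encoding after every step."""
    trace_lines = ["Input: " + encoded]
    for descr, pattern, repl in rules:
        encoded = re.sub(pattern, repl, encoded)
        if verbose:
            # Verbose tracing also shows the substitution that was applied.
            trace_lines.append("%s (%s -> %s): %s" % (descr, pattern, repl, encoded))
        else:
            trace_lines.append("%s: %s" % (descr, encoded))
    return encoded, trace_lines


rules = [
    ("chunk nouns", r"(<NN>)", r"{\1}"),
    ("merge adjacent chunks", r"\}\{", ""),
]
final, trace_lines = apply_rules_with_trace(rules, "<NN><NN><VBD>")
for line in trace_lines:
    print(line)
```

The second rule depends on the output of the first, which is why the intermediate states in the trace are useful when debugging a rule sequence.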

_notrace_apply(self, chunkstr)

Apply each of this RegexpChunkParser's rules to chunkstr, in turn.

Parameters:
  • chunkstr (ChunkString) - The chunk string to which each rule should be applied.
Returns: None

parse(self, chunk_struct, trace=None)

Parameters:
  • chunk_struct - The list of (word, tag) tokens to be chunked. (The inherited documentation calls this parameter tokens.)
Returns: Tree
the best chunk structure for the given tokens, as a tree.
Overrides: api.ChunkParserI.parse
(inherited documentation)
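
The signature gives trace a default of None, and _trace is described as the default level of tracing. One plausible reading, sketched here as an assumption and not as the library's code, is that a call without an explicit trace falls back to the default level, and that the level then selects between no tracing, normal tracing, and verbose tracing as described for the constructor. The function resolve_trace is hypothetical.

```python
def resolve_trace(trace, default_trace=0):
    """Return (tracing_enabled, verbose) for a single parse call."""
    if trace is None:
        # Assumption: an omitted trace level falls back to the parser's default.
        trace = default_trace
    # 0 means no tracing, 1 means normal tracing, 2 or higher means verbose.
    return trace > 0, trace > 1


print(resolve_trace(None, default_trace=2))
print(resolve_trace(1, default_trace=0))
```

Under this reading, an explicit trace value passed to a call takes precedence over the level given at construction time.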

rules(self)

Returns: list of RegexpChunkRule
the sequence of rules used by RegexpChunkParser.

__repr__(self)
(Representation operator)

repr(x)

Returns: string
a concise string representation of this RegexpChunkParser.
Overrides: object.__repr__

__str__(self)
(Informal representation operator)

str(x)

Returns: string
a verbose string representation of this RegexpChunkParser.
Overrides: object.__str__