Class RegexpChunkRule
object --+
         |
        RegexpChunkRule
A rule specifying how to modify the chunking in a
ChunkString, using a transformational regular expression.
The RegexpChunkRule class itself can be used to implement
any transformational rule based on regular expressions. There are also a
number of subclasses, which can be used to implement simpler types of
rules, based on matching regular expressions.
Each RegexpChunkRule has a regular expression and a
replacement expression. When a RegexpChunkRule is applied to a
ChunkString, it searches the ChunkString for
any substring that matches the regular expression, and replaces it using
the replacement expression. This search/replace operation has the same
semantics as re.sub.
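The search/replace step can be illustrated with re.sub alone. This is a minimal sketch: the plain string below merely stands in for the text of a ChunkString, and the pattern and replacement are illustrative choices, not rules shipped with the library.

```python
import re

# A plain string standing in for a ChunkString's text:
# angle-bracketed tags, with braces marking chunks.
tags = "<DT><NN><VBD><IN><DT><NN>"

# An ordinary regular expression (not a tag pattern) and a replacement
# expression that wraps each match in chunk braces.
regexp = re.compile(r"(<DT><NN>)")
repl = r"{\1}"

chunked = regexp.sub(repl, tags)
print(chunked)  # {<DT><NN>}<VBD><IN>{<DT><NN>}
```

Every non-overlapping match is rewritten, exactly as re.sub does for any other string.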
Each RegexpChunkRule also has a description string, which
gives a short (typically less than 75 characters) description of the
purpose of the rule.
The transformation defined by a RegexpChunkRule
should only add and remove braces; it should not modify the
sequence of angle-bracket-delimited tags. Furthermore, the
transformation may not result in nested or mismatched bracketing.
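The two constraints above can be checked mechanically. The helpers below are hypothetical names written for this illustration and operate on plain strings; they are not the validation code the library itself uses.

```python
def strip_braces(chunk_text):
    """Remove chunk braces, leaving only the tag sequence."""
    return chunk_text.replace("{", "").replace("}", "")


def has_flat_braces(chunk_text):
    """Return True if braces are matched and never nested."""
    inside = False
    for ch in chunk_text:
        if ch == "{":
            if inside:
                return False  # nested opening brace
            inside = True
        elif ch == "}":
            if not inside:
                return False  # closing brace with no open chunk
            inside = False
    return not inside  # False if a chunk was left open


def is_legal_rewrite(before, after):
    """True if `after` differs from `before` only by legal brace changes."""
    return strip_braces(before) == strip_braces(after) and has_flat_braces(after)


print(is_legal_rewrite("<DT><NN><VBD>", "{<DT><NN>}<VBD>"))    # True
print(is_legal_rewrite("<DT><NN><VBD>", "{<DT>{<NN>}}<VBD>"))  # False (nested)
print(is_legal_rewrite("<DT><NN><VBD>", "{<DT>}<VBD>"))        # False (tag removed)
```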
Methods inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __str__
Properties inherited from object: __class__

__init__(self, regexp, repl, descr)
(Constructor)
Construct a new RegexpChunkRule.
- Parameters:
regexp (regexp or string) - This RegexpChunkRule's regular expression. When this
rule is applied to a ChunkString, any substring that
matches regexp will be replaced using the
replacement string repl. Note that this must be a
normal regular expression, not a tag pattern.
repl (string) - This RegexpChunkRule's replacement expression. When
this rule is applied to a ChunkString, any substring
that matches regexp will be replaced using
repl.
descr (string) - A short description of the purpose and/or effect of this rule.
- Overrides:
object.__init__

apply(self, chunkstr)
Apply this rule to the given ChunkString. See the class
reference documentation for a description of what it means to apply a
rule.
- Parameters:
chunkstr (ChunkString) - The chunkstring to which this rule is applied.
- Returns:
None
- Raises:
ValueError - If this transformation generated an invalid chunkstring.
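The interface described in this section (constructor, apply, descr, and the repr form) can be mimicked in a few lines. This is only a sketch under loud assumptions: SketchRule is a hypothetical stand-in, a plain dict plays the role of a ChunkString, and the validity check is a simple regular expression rather than the library's own verification.

```python
import re

# Flat, balanced bracing: a run of brace-free characters and non-empty
# {...} groups that contain no further braces.
_VALID = re.compile(r"(?:\{[^{}]+\}|[^{}])*")


class SketchRule:
    """Hypothetical stand-in for the rule interface; not library code."""

    def __init__(self, regexp, repl, descr):
        # Accept either a pattern string or an already compiled pattern.
        self._regexp = re.compile(regexp) if isinstance(regexp, str) else regexp
        self._repl = repl
        self._descr = descr

    def apply(self, chunkstr):
        """Rewrite chunkstr["text"] in place; returns None."""
        new_text = self._regexp.sub(self._repl, chunkstr["text"])
        if _VALID.fullmatch(new_text) is None:
            raise ValueError("invalid chunkstring: %r" % new_text)
        chunkstr["text"] = new_text

    def descr(self):
        return self._descr

    def __repr__(self):
        return "<SketchRule: %r->%r>" % (self._regexp.pattern, self._repl)


rule = SketchRule(r"(<DT><NN>)", r"{\1}", "chunk determiner + noun")
cs = {"text": "<DT><NN><VBD>"}  # dict standing in for a ChunkString
result = rule.apply(cs)
print(cs["text"])  # {<DT><NN>}<VBD>
```

Applying the same rule a second time would wrap the existing chunk in another pair of braces, so the sketch raises ValueError and leaves the text unchanged.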

descr(self)
- Returns:
string - a short description of the purpose and/or effect of this rule.

__repr__(self)
repr(x)
- Returns:
string - A string representation of this rule. This string representation
has the form:
    <RegexpChunkRule: '{<IN|VB.*>}'->'<IN>'>
Note that this representation does not include the description
string; that string can be accessed separately with the
descr method.
- Overrides:
object.__repr__

parse(s)
(Static Method)
Create a RegexpChunkRule from a string description. Currently, the
following formats are supported:
    {regexp}         # chunk rule
    }regexp{         # chink rule
    regexp}{regexp   # split rule
    regexp{}regexp   # merge rule
where regexp is a regular expression for the rule. Any
text following the comment marker (#) will be used as the
rule's description:
    >>> RegexpChunkRule.parse('{<DT>?<NN.*>+}')
    <ChunkRule: '<DT>?<NN.*>+'>
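
The four formats listed above can be told apart by where the braces sit. The function below is a naive sketch, not the library's parser: classify_rule is a hypothetical name, and it assumes that '#' never occurs inside the pattern and that the brace pairs "}{" and "{}" appear only as the split and merge markers.

```python
def classify_rule(s):
    """Return (kind, pattern, description) for a rule string."""
    # Everything after the first '#' is treated as the description.
    rule, _, descr = s.partition("#")
    rule, descr = rule.strip(), descr.strip()

    if rule.startswith("{") and rule.endswith("}"):
        return ("chunk", rule[1:-1], descr)
    if rule.startswith("}") and rule.endswith("{"):
        return ("chink", rule[1:-1], descr)
    if "}{" in rule:
        left, right = rule.split("}{", 1)
        return ("split", (left, right), descr)
    if "{}" in rule:
        left, right = rule.split("{}", 1)
        return ("merge", (left, right), descr)
    raise ValueError("unrecognized rule format: %r" % s)


print(classify_rule("{<DT>?<NN.*>+}  # chunk noun phrases"))
# ('chunk', '<DT>?<NN.*>+', 'chunk noun phrases')
```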