Class RegexpChunkRule
object --+
|
RegexpChunkRule
- Known Subclasses:
-
A rule specifying how to modify the chunking in a ChunkString, using a
transformational regular expression.

The RegexpChunkRule class itself can be used to implement any
transformational rule based on regular expressions. There are also a
number of subclasses, which can be used to implement simpler types of
rules, based on matching regular expressions.

Each RegexpChunkRule has a regular expression and a replacement
expression. When a RegexpChunkRule is applied to a ChunkString, it
searches the ChunkString for any substring that matches the regular
expression, and replaces it using the replacement expression. This
search/replace operation has the same semantics as re.sub.
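The search/replace step can be illustrated with re.sub alone. The following is a minimal sketch, not the library's implementation: the plain string chunk_string stands in for the encoded contents of a ChunkString, and the regular expression and replacement are hypothetical values chosen for illustration.

```python
import re

# Assumption: a plain string stands in for the encoded ChunkString,
# using the angle-bracket tag notation from this documentation.
chunk_string = "<DT><JJ><NN><VBD><DT><NN>"

# Hypothetical rule: wrap an optional determiner, any adjectives,
# and a noun in braces. This is a normal regular expression, so each
# tag is grouped explicitly before a quantifier is applied to it.
regexp = re.compile(r"((?:<DT>)?(?:<JJ>)*<NN>)")
repl = r"{\1}"

# Same semantics as re.sub: every non-overlapping match is replaced.
result = regexp.sub(repl, chunk_string)
print(result)  # {<DT><JJ><NN>}<VBD>{<DT><NN>}
```

Note that the replacement only inserts braces around the matched tags; the tags themselves are copied through unchanged by the \1 backreference.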
Each RegexpChunkRule also has a description string, which gives a short
(typically less than 75 characters) description of the purpose of the
rule.

The transformation defined by a RegexpChunkRule should only add and
remove braces; it should not modify the sequence of angle-bracket
delimited tags. Furthermore, the transformation may not result in nested
or mismatched bracketing.
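The two constraints above can be expressed as a small check on the string before and after a transformation. This is a sketch under stated assumptions: is_valid_transformation is a hypothetical helper name, not part of the documented API, and it operates on plain strings rather than on ChunkString objects.

```python
import re


def is_valid_transformation(before, after):
    """Return True if `after` differs from `before` only in its braces,
    and those braces are balanced and not nested.

    Hypothetical helper for illustration; not part of the library.
    """
    # Constraint 1: stripping every brace must leave the same tag sequence.
    if re.sub(r"[{}]", "", before) != re.sub(r"[{}]", "", after):
        return False

    # Constraint 2: braces must alternate open/close with depth at most 1.
    depth = 0
    for ch in after:
        if ch == "{":
            depth += 1
            if depth > 1:      # nested bracketing
                return False
        elif ch == "}":
            depth -= 1
            if depth < 0:      # close without a matching open
                return False
    return depth == 0          # every open brace was closed
```

For example, is_valid_transformation("<DT><NN><VBD>", "{<DT><NN>}<VBD>") holds, while a result such as "{{<DT>}<NN>}<VBD>" is rejected as nested.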
Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __str__
Inherited from object: __class__
__init__(self, regexp, repl, descr)
(Constructor)
Construct a new RegexpChunkRule.
- Parameters:
regexp (regexp or string) - This RegexpChunkRule's regular expression. When this
rule is applied to a ChunkString, any substring that
matches regexp will be replaced using the
replacement string repl. Note that this must be a
normal regular expression, not a tag pattern.
repl (string) - This RegexpChunkRule's replacement expression. When
this rule is applied to a ChunkString, any substring
that matches regexp will be replaced using
repl.
descr (string) - A short description of the purpose and/or effect of this rule.
- Overrides:
object.__init__
apply(self, chunkstr)
Apply this rule to the given ChunkString. See the class
reference documentation for a description of what it means to apply a
rule.
- Parameters:
chunkstr (ChunkString) - The chunkstring to which this rule is applied.
- Returns:
None
- Raises:
ValueError - If this transformation generated an invalid chunkstring.
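The behaviour described here (modify the chunkstring in place, return None, raise ValueError on an invalid result) can be sketched with stand-in classes. SketchChunkString and SketchRule below are hypothetical names that only mimic the documented interface; the validity check is an assumption about what "invalid" means, based on the class description, and is not the library's own code.

```python
import re


class SketchChunkString:
    """Hypothetical stand-in for ChunkString: holds the encoded text."""

    def __init__(self, text):
        self.text = text


class SketchRule:
    """Hypothetical stand-in mirroring the (regexp, repl, descr) interface."""

    def __init__(self, regexp, repl, descr):
        self.regexp = re.compile(regexp)
        self.repl = repl
        self.descr = descr

    def apply(self, chunkstr):
        new_text = self.regexp.sub(self.repl, chunkstr.text)

        # Assumed validity check 1: the tag sequence must be unchanged.
        def strip(s):
            return re.sub(r"[{}]", "", s)

        tags_ok = strip(new_text) == strip(chunkstr.text)
        # Assumed validity check 2: with tags removed, the braces must
        # form a flat sequence of "{}" pairs (no nesting, no mismatch).
        skeleton = re.sub(r"<[^<>]*>", "", new_text)
        braces_ok = re.fullmatch(r"(?:\{\})*", skeleton) is not None

        if not (tags_ok and braces_ok):
            raise ValueError("invalid chunkstring: %r" % new_text)
        chunkstr.text = new_text  # updated in place; nothing is returned


cs = SketchChunkString("<DT><NN><VBD>")
rule = SketchRule(r"((?:<DT>)?<NN>)", r"{\1}", "chunk optional det + noun")
rule.apply(cs)
print(cs.text)  # {<DT><NN>}<VBD>
```

In this sketch the chunkstring is left untouched when the check fails, so a caller that catches the ValueError still sees the previous chunking.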
descr(self)
- Returns:
string - a short description of the purpose and/or effect of this rule.
__repr__(self)
- Returns:
string - A string representation of this rule. This string representation
has the form:
<RegexpChunkRule: '{<IN|VB.*>}'->'<IN>'>
Note that this representation does not include the description
string; that string can be accessed separately with the
descr method.
- Overrides:
object.__repr__
parse(s)
Create a RegexpChunkRule from a string description. Currently, the
following formats are supported:
{regexp} # chunk rule
}regexp{ # chink rule
regexp}{regexp # split rule
regexp{}regexp # merge rule
where regexp is a regular expression for the rule. Any
text following the comment marker (#) will be used as the
rule's description:
>>> RegexpChunkRule.parse('{<DT>?<NN.*>+}')
<ChunkRule: '<DT>?<NN.*>+'>
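The four formats can be told apart by where the braces sit in the string. The following is a simplified sketch of that dispatch, not the library's parser: classify_rule is a hypothetical function, it returns plain tuples rather than rule objects, and it assumes that the pattern itself contains no '#' character and no brace quantifiers such as {2}.

```python
def classify_rule(rule_string):
    """Split a rule description into (kind, pattern, description).

    Hypothetical helper for illustration; not part of the library.
    """
    # Everything after the first '#' is treated as the description.
    pattern, _, descr = rule_string.partition("#")
    pattern = pattern.strip()
    descr = descr.strip()

    if pattern.startswith("{") and pattern.endswith("}"):
        return ("chunk", pattern[1:-1], descr)       # {regexp}
    if pattern.startswith("}") and pattern.endswith("{"):
        return ("chink", pattern[1:-1], descr)       # }regexp{
    if "}{" in pattern:
        left, right = pattern.split("}{", 1)
        return ("split", (left, right), descr)       # regexp}{regexp
    if "{}" in pattern:
        left, right = pattern.split("{}", 1)
        return ("merge", (left, right), descr)       # regexp{}regexp
    raise ValueError("unsupported rule format: %r" % rule_string)


print(classify_rule("{<DT>?<NN.*>+}  # chunk determiners and nouns"))
# ('chunk', '<DT>?<NN.*>+', 'chunk determiners and nouns')
```

The chunk and chink cases are tested first because their delimiters sit at the ends of the string, which keeps them from being confused with the infix split and merge markers.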