Class PunktToken (base class: object)
Stores a token of text with annotations produced during sentence boundary detection.
Class variables | Value
---|---
_properties |

Regular expressions for properties | Value (truncated in this listing)
---|---
_RE_ELLIPSIS | re.compile(r'\.\.
_RE_NUMERIC | re.compile(r'^-
_RE_INITIAL | re.compile(r'
_RE_ALPHA | re.compile(r'

Properties | Description
---|---
abbr |
ellipsis |
linestart |
parastart |
period_final |
sentbreak |
tok |
type |

Derived properties | Description
---|---
type_no_period | The type with its final period removed if it has one.
type_no_sentperiod | The type with its final period removed if it is marked as a sentence break.
first_upper | True if the token's first character is uppercase.
first_lower | True if the token's first character is lowercase.
first_case |
is_ellipsis | True if the token text is that of an ellipsis.
is_number | True if the token text is that of a number.
is_initial | True if the token text is that of an initial.
is_alpha | True if the token text is all alphabetic.
is_non_punct | True if the token is either a number or is alphabetic.
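
The derived properties are all computed from the stored token text and its annotations. The sketch below illustrates that idea with a hypothetical stand-in class, SketchToken. It is not the library's implementation: the lowercased type, the labels returned by first_case, and the crude number check are all assumptions made for illustration, and it deliberately avoids the regular expressions, which are truncated in this listing.

```python
class SketchToken:
    """Hypothetical stand-in for a token; not the library's class."""

    def __init__(self, tok, sentbreak=False):
        self.tok = tok
        # Assumption: the "type" is simply the lowercased token text.
        self.type = tok.lower()
        self.period_final = tok.endswith(".")
        self.sentbreak = sentbreak

    @property
    def type_no_period(self):
        # Drop one trailing period, but leave a lone "." untouched.
        if len(self.type) > 1 and self.type.endswith("."):
            return self.type[:-1]
        return self.type

    @property
    def type_no_sentperiod(self):
        # Only strip the period when the token is marked as a sentence break.
        return self.type_no_period if self.sentbreak else self.type

    @property
    def first_upper(self):
        return self.tok[:1].isupper()

    @property
    def first_lower(self):
        return self.tok[:1].islower()

    @property
    def first_case(self):
        # Assumption: a label summarizing the two flags above.
        if self.first_lower:
            return "lower"
        if self.first_upper:
            return "upper"
        return "none"

    @property
    def is_alpha(self):
        return self.tok.isalpha()

    @property
    def is_number(self):
        # Crude check (assumption): digits with optional sign and separators.
        digits = self.tok.lstrip("-").replace(",", "").replace(".", "")
        return digits.isdigit()

    @property
    def is_non_punct(self):
        return self.is_number or self.is_alpha
```

With this sketch, SketchToken("Mr.").type_no_period strips the period, while type_no_sentperiod leaves it in place unless the token was created with sentbreak=True.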

Methods | Description
---|---
__init__ | x.__init__(...) initializes x; see x.__class__.__doc__ for signature
__repr__ | A string representation of the token that can reproduce it with eval(); it lists all of the token's non-default annotations.
__str__ | A string representation akin to that used by Kiss and Strunk.
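
The two string representations serve different purposes: one is meant to round-trip through eval(), the other to be read. The following is a minimal sketch of that distinction using a hypothetical class, AnnotatedToken. The annotation names are taken from the property listing above; the assumption that they default to False, and the <A>, <E>, <S> suffix markers used for the readable form, are illustrative choices and not confirmed by this page.

```python
class AnnotatedToken:
    """Hypothetical stand-in for a token; not the library's class."""

    # Boolean annotations, all assumed to default to False.
    _flags = ("parastart", "linestart", "sentbreak", "abbr", "ellipsis")

    def __init__(self, tok, **params):
        self.tok = tok
        for name in self._flags:
            setattr(self, name, params.get(name, False))

    def __repr__(self):
        # List only annotations that differ from their default, so that
        # eval(repr(token)) rebuilds an equivalent token.
        extras = "".join(
            ", %s=%r" % (name, getattr(self, name))
            for name in self._flags
            if getattr(self, name)
        )
        return "%s(%r%s)" % (type(self).__name__, self.tok, extras)

    def __str__(self):
        # Assumed marker convention: <A> abbreviation, <E> ellipsis,
        # <S> sentence break, appended to the token text.
        out = self.tok
        if self.abbr:
            out += "<A>"
        if self.ellipsis:
            out += "<E>"
        if self.sentbreak:
            out += "<S>"
        return out


token = AnnotatedToken("Mr.", abbr=True)
rebuilt = eval(repr(token))  # round-trips through the repr string
```

Keeping default-valued annotations out of the repr keeps the output short for the common case of an unannotated token, while still letting eval() restore every annotation that was set.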
Generated by Epydoc 3.0beta1 on Wed Aug 27 15:08:58 2008 | http://epydoc.sourceforge.net |