__init__(self,
pattern,
gaps=False,
discard_empty=True,
flags=56)
(Constructor)
| source code
|
Construct a new tokenizer that splits strings using the given regular
expression pattern . By default, pattern will
be used to find tokens; but if gaps is set to
False , then patterns will be used to find
separators between tokens instead.
- Parameters:
pattern (str ) - The pattern used to build this tokenizer. This pattern may safely
contain grouping parenthases.
gaps (bool ) - True if this tokenizer's pattern should be used to find
separators between tokens; False if this tokenizer's pattern
should be used to find the tokens themselves.
discard_empty (bool ) - True if any empty tokens ('' ) generated by the
tokenizer should be discarded. Empty tokens can only be
generated if _gaps is true.
flags (int ) - The regexp flags used to compile this tokenizer's pattern. By
default, the following flags are used: re.UNICODE |
re.MULTILINE | re.DOTALL .
- Overrides:
object.__init__
|