Class TaggerI

object --+
         |
        TaggerI

A processing interface for assigning a tag to each token in a list. Tags are case-sensitive strings that identify some property of each token, such as its part of speech or its sense.

Some taggers require specific types for their tokens. This is generally indicated by the use of a sub-interface to TaggerI. For example, featureset taggers, which are subclassed from FeaturesetTaggerI, require that each token be a featureset.
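The token-type requirement can be illustrated with a small sketch. It does not import NLTK and assumes that a featureset is represented as a dict mapping feature names to values. The names word_features and SuffixFeaturesetTagger, and the tagging rule itself, are hypothetical and serve only as illustration.

```python
def word_features(word):
    # Hypothetical feature extractor: turns a word into a featureset (dict).
    return {"lower": word.lower(), "suffix2": word[-2:]}


class SuffixFeaturesetTagger(object):
    """Toy featureset tagger (hypothetical): every token must be a dict."""

    def tag(self, tokens):
        tagged = []
        for featureset in tokens:
            # Enforce the token type this kind of tagger requires.
            if not isinstance(featureset, dict):
                raise TypeError("each token must be a featureset (dict)")
            # Illustrative rule only: words ending in "ng" are tagged VBG.
            tag = "VBG" if featureset.get("suffix2") == "ng" else "NN"
            tagged.append((featureset, tag))
        return tagged


featureset_tagger = SuffixFeaturesetTagger()
featureset_tokens = [word_features(w) for w in ["running", "dog"]]
featureset_tagged = featureset_tagger.tag(featureset_tokens)
```

Passing plain strings to this tagger raises TypeError, which mirrors the idea that a sub-interface narrows the token types it accepts.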

Subclasses must define either tag() or batch_tag() (or both).

tag(self, tokens)
    Determine the most appropriate tag sequence for the given token sequence, and return a corresponding list of tagged tokens. Returns a list of (token, tag).
batch_tag(self, sentences)
    Apply self.tag() to each element of sentences.
Inherited from object: __delattr__, __getattribute__, __hash__, __init__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__

Inherited from object: __class__

tag(self, tokens)

Determine the most appropriate tag sequence for the given token sequence, and return a corresponding list of tagged tokens. A tagged token is encoded as a tuple (token, tag).

Returns: list of (token, tag)
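As a minimal sketch of this contract, the class below implements tag() for a toy lookup tagger. It does not import NLTK. The class name LookupTagger, its lexicon argument, and the default tag "NN" are assumptions made for illustration and are not part of the interface.

```python
class LookupTagger(object):
    """Toy tagger (hypothetical): looks each token up in a dict."""

    def __init__(self, lexicon, default="NN"):
        self._lexicon = lexicon    # maps token -> tag
        self._default = default    # tag used for tokens not in the lexicon

    def tag(self, tokens):
        # Return one (token, tag) tuple per input token, in input order.
        return [(tok, self._lexicon.get(tok, self._default)) for tok in tokens]


lookup_tagger = LookupTagger({"the": "DT", "runs": "VBZ"})
lookup_tagged = lookup_tagger.tag(["the", "dog", "runs"])
```

The result has the same length and order as the input, which is what lets callers line tags up with their original tokens.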

batch_tag(self, sentences)

Apply self.tag() to each element of sentences. That is, the method is equivalent to:

    def batch_tag(self, sentences):
        return [self.tag(tokens) for tokens in sentences]
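The relationship between the two methods can be sketched without NLTK as follows. BatchTagMixin and LengthTagger are hypothetical names, and the length-based rule is only a stand-in for a real tagging model.

```python
class BatchTagMixin(object):
    """Hypothetical base class supplying batch_tag() in terms of tag()."""

    def batch_tag(self, sentences):
        # One tagged sentence per input sentence, in input order.
        return [self.tag(tokens) for tokens in sentences]


class LengthTagger(BatchTagMixin):
    """Toy tagger: labels each token LONG or SHORT by its length."""

    def tag(self, tokens):
        return [(tok, "LONG" if len(tok) > 3 else "SHORT") for tok in tokens]


batch_sentences = [["a", "tagger"], ["tags", "it"]]
batch_tagged = LengthTagger().batch_tag(batch_sentences)
```

Each element of sentences is itself a list of tokens, so the result is a list of lists of (token, tag) tuples.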