Package nltk :: Module util :: Class AbstractLazySequence

Class AbstractLazySequence

object --+
         |
        AbstractLazySequence
Known Subclasses:

An abstract base class for read-only sequences whose values are computed as needed. Lazy sequences act like tuples -- they can be indexed, sliced, and iterated over; but they may not be modified.

The most common application of lazy sequences in NLTK is for corpus view objects, which provide access to the contents of a corpus without loading the entire corpus into memory, by loading pieces of the corpus from disk as needed.

The result of modifying a mutable element of a lazy sequence is undefined. In particular, the modifications made to the element may or may not persist, depending on whether and when the lazy sequence caches that element's value or reconstructs it from scratch.
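
The caveat can be seen with a toy sequence. The sketch below is not NLTK code: RebuildingSequence is a hypothetical class that reconstructs each element on every access, which is one of the two behaviors the paragraph above allows. A caching sequence could just as well keep the edit.

```python
class RebuildingSequence:
    """Hypothetical lazy sequence that rebuilds each element on access."""

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n

    def __getitem__(self, i):
        if not 0 <= i < self._n:
            raise IndexError("index out of range")
        # A fresh list is constructed every time; nothing is cached.
        return [i, i * i]


seq = RebuildingSequence(3)
item = seq[2]
item.append("edited")   # mutates only the object we were handed

print(item)     # [2, 4, 'edited']
print(seq[2])   # [2, 4]  (the edit did not persist)
```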

Subclasses are required to define two methods: __len__() and iterate_from().
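
A minimal sketch of a subclass that supplies __len__() and iterate_from(), written against a small stand-in base class rather than NLTK itself so that it runs without NLTK installed. MiniLazySequence and LazySquares are hypothetical names; the stand-in only shows how iteration can be derived from the two methods and is not NLTK's implementation.

```python
class MiniLazySequence:
    """Stand-in for the abstract base (hypothetical, not NLTK's code)."""

    def __len__(self):
        raise NotImplementedError("subclasses must define __len__")

    def iterate_from(self, start):
        raise NotImplementedError("subclasses must define iterate_from")

    def __iter__(self):
        # Whole-sequence iteration is iteration from position 0.
        return self.iterate_from(0)


class LazySquares(MiniLazySequence):
    """Squares of 0..n-1, computed only when requested."""

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n

    def iterate_from(self, start):
        for i in range(start, self._n):
            yield i * i


squares = LazySquares(5)
print(len(squares))                     # 5
print(list(squares))                    # [0, 1, 4, 9, 16]
print(list(squares.iterate_from(3)))    # [9, 16]
```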

Instance Methods
 
__len__(self)
Return the number of tokens in the corpus file underlying this corpus view.
 
iterate_from(self, start)
Return an iterator that generates the tokens in the corpus file underlying this corpus view, starting at the token number start.
 
__getitem__(self, i)
Return the ith token in the corpus file underlying this corpus view.
 
__iter__(self)
Return an iterator that generates the tokens in the corpus file underlying this corpus view.
 
count(self, value)
Return the number of times this list contains value.
 
index(self, value, start=None, stop=None)
Return the index of the first occurrence of value in this list whose position is greater than or equal to start and less than stop.
 
__contains__(self, value)
Return true if this list contains value.
 
__add__(self, other)
Return a list concatenating self with other.
 
__radd__(self, other)
Return a list concatenating other with self.
 
__mul__(self, count)
Return a list concatenating self with itself count times.
 
__rmul__(self, count)
Return a list concatenating self with itself count times.
 
__repr__(self)
Returns: A string representation for this corpus view that is similar to a list's representation; but if it would be more than 60 characters long, it is truncated.
 
__cmp__(self, other)
Return a number indicating how self relates to other.
 
__hash__(self)
hash(x)

Inherited from object: __delattr__, __getattribute__, __init__, __new__, __reduce__, __reduce_ex__, __setattr__, __str__

Class Variables
  _MAX_REPR_SIZE = 60
Properties

Inherited from object: __class__

Method Details

iterate_from(self, start)

Return an iterator that generates the tokens in the corpus file underlying this corpus view, starting at the token number start. If start>=len(self), then this iterator will generate no tokens.
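
The start-offset behavior can be illustrated on a stream that has no random access. This is a sketch only: tokens_from is a hypothetical helper, and splitting on whitespace is an assumed tokenization, not something a corpus view is required to do.

```python
from itertools import islice


def tokens_from(lines, start):
    """Yield whitespace-separated tokens, skipping the first `start` tokens."""
    all_tokens = (tok for line in lines for tok in line.split())
    # islice simply runs dry when `start` is at or past the end.
    return islice(all_tokens, start, None)


lines = ["the cat sat", "on the mat"]   # six tokens in total
print(list(tokens_from(lines, 4)))      # ['the', 'mat']
print(list(tokens_from(lines, 6)))      # []
```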

__getitem__(self, i)
(Indexing operator)

Return the ith token in the corpus file underlying this corpus view. Negative indices and spans (slices) are both supported.
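
One way indexing could be layered on top of iterate_from() is sketched below. LazyRange is a hypothetical class, the slice branch returns a plain list for simplicity, and only step 1 is handled; none of this is taken from the NLTK source.

```python
from itertools import islice


class LazyRange:
    """Hypothetical lazy sequence of the integers 0..n-1."""

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n

    def iterate_from(self, start):
        return iter(range(start, self._n))

    def __getitem__(self, i):
        if isinstance(i, slice):
            # slice.indices() resolves negative and missing bounds.
            start, stop, step = i.indices(len(self))
            if step != 1:
                raise ValueError("this sketch supports only step 1")
            return list(islice(self.iterate_from(start), max(stop - start, 0)))
        if i < 0:
            i += len(self)   # negative indices count from the end
        if not 0 <= i < len(self):
            raise IndexError("index out of range")
        return next(self.iterate_from(i))


seq = LazyRange(10)
print(seq[-1])    # 9
print(seq[2:5])   # [2, 3, 4]
print(seq[-3:])   # [7, 8, 9]
```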

index(self, value, start=None, stop=None)

Return the index of the first occurrence of value in this list whose position is greater than or equal to start and less than stop. Negative start and stop values are treated like negative slice bounds -- i.e., they count from the end of the list.
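
The bound handling can be sketched with slice.indices(), which applies the standard slice rules to None and negative values. lazy_index is a hypothetical stand-in for the method, and raising ValueError when nothing is found is an assumption borrowed from list.index().

```python
def lazy_index(seq, value, start=None, stop=None):
    """Hypothetical stand-in for index(); `seq` is any sequence."""
    # Resolve None and negative bounds exactly as a slice would.
    start, stop, _ = slice(start, stop).indices(len(seq))
    for i in range(start, stop):
        if seq[i] == value:
            return i
    raise ValueError("value not found in the requested range")


tokens = ["a", "b", "a", "c", "a"]
print(lazy_index(tokens, "a"))          # 0
print(lazy_index(tokens, "a", 1))       # 2
print(lazy_index(tokens, "a", -2))      # 4
print(lazy_index(tokens, "a", 1, -1))   # 2
```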

__repr__(self)
(Representation operator)

repr(x)

Returns:
A string representation for this corpus view that is similar to a list's representation; but if it would be more than 60 characters long, it is truncated.
Overrides: object.__repr__
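
A truncating representation along these lines might look like the sketch below. The 60-character limit comes from the _MAX_REPR_SIZE class variable; the function name truncated_repr and the exact bookkeeping (the starting allowance of 5 and the rule of always keeping at least two elements) are assumptions made for illustration.

```python
MAX_REPR_SIZE = 60  # mirrors the documented _MAX_REPR_SIZE value


def truncated_repr(items, max_size=MAX_REPR_SIZE):
    """List-like repr that stops consuming `items` once it grows too long."""
    pieces = []
    length = 5  # assumed allowance for the brackets and the ellipsis
    for item in items:
        pieces.append(repr(item))
        length += len(pieces[-1]) + 2   # the element plus ", "
        if length > max_size and len(pieces) > 2:
            # Drop the element that overflowed and mark the truncation.
            return "[%s, ...]" % ", ".join(pieces[:-1])
    return "[%s]" % ", ".join(pieces)


print(truncated_repr([1, 2, 3]))
# [1, 2, 3]
print(truncated_repr(range(100)))
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, ...]
```

Because the loop returns as soon as the limit is crossed, only a short prefix of the sequence is ever computed, which matters when each element is loaded from disk.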

__cmp__(self, other)
(Comparison operator)

Return a number indicating how self relates to other.

  • If other is not a corpus view or a list, return -1.
  • Otherwise, return cmp(list(self), list(other)).

Note: corpus views do not compare equal to tuples containing equal elements. Otherwise, transitivity would be violated, since tuples do not compare equal to lists.
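
The two rules can be sketched as follows. Python 3 has no built-in cmp(), so the example defines one; ToyView and compare are hypothetical names standing in for a corpus view and its __cmp__ method.

```python
def cmp(a, b):
    """Python 2 style three-way comparison (not a Python 3 builtin)."""
    return (a > b) - (a < b)


class ToyView:
    """Hypothetical stand-in for a corpus view."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)


def compare(view, other):
    # Rule 1: anything that is not a view or a list compares as "less".
    if not isinstance(other, (ToyView, list)):
        return -1
    # Rule 2: otherwise compare element by element, as lists.
    return cmp(list(view), list(other))


view = ToyView([1, 2, 3])
print(compare(view, [1, 2, 3]))      # 0
print(compare(view, (1, 2, 3)))      # -1  (tuples never compare equal)
print(compare(view, ToyView([0])))   # 1
```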

__hash__(self)
(Hashing function)

hash(x)

Raises:
  • ValueError - Corpus view objects are unhashable.
Overrides: object.__hash__
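
A sketch of the documented behavior, using a hypothetical UnhashableView class. The wording of the error message is an assumption; only the exception type comes from the text above.

```python
class UnhashableView:
    """Hypothetical stand-in whose hash always fails."""

    def __hash__(self):
        raise ValueError("%s objects are unhashable" % type(self).__name__)


try:
    hash(UnhashableView())
except ValueError as err:
    message = str(err)

print(message)   # UnhashableView objects are unhashable
```

Built-in unhashable types such as list raise TypeError instead, so code that catches only TypeError will not intercept this error.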