Functions to find and load NLTK resource files, such as corpora, grammars, and
saved processing objects. Resource files are identified using URLs, such as
"nltk:corpora/abc/rural.txt" or "http://nltk.org/sample/toy.cfg". The following
URL protocols are supported:
- "file:path": Specifies the file whose path is path. Both relative and absolute paths may be used.
- "http://host/path": Specifies the file stored at path on the web server host.
- "nltk:path": Specifies the file stored in the NLTK data package at path. NLTK will search for these files in the directories specified by nltk.data.path.
If no protocol is specified, then the default protocol "nltk:" will be used.
This module provides two functions that can be used to access a resource file, given its URL: load() loads a given resource and adds it to a resource cache, and retrieve() copies a given resource to a local file.
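The URL convention described above (an optional protocol prefix, with "nltk:" as the default) can be illustrated with a small standard-library sketch. The helper name `split_resource_url` and the `KNOWN_PROTOCOLS` tuple are hypothetical and are not part of NLTK; this is not how the module itself parses URLs, only one way to express the rule.

```python
# Protocols named in the module description; the tuple itself is illustrative.
KNOWN_PROTOCOLS = ("file", "http", "nltk")


def split_resource_url(resource_url, default_protocol="nltk"):
    """Return (protocol, path) for a resource URL (hypothetical helper)."""
    protocol, sep, rest = resource_url.partition(":")
    if sep and protocol in KNOWN_PROTOCOLS:
        return protocol, rest
    # No recognised protocol prefix: fall back to the default protocol.
    return default_protocol, resource_url
```

With this sketch, a bare name such as "corpora/abc/rural.txt" is treated the same as "nltk:corpora/abc/rural.txt".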
- PathPointer: An abstract base class for "path pointers", used by NLTK's data package to identify specific paths.
- FileSystemPathPointer: A path pointer that identifies a file which can be accessed directly via a given absolute path.
- GzipFileSystemPathPointer: A subclass of FileSystemPathPointer that identifies a gzip-compressed file located at a given absolute path.
- ZipFilePathPointer: A path pointer that identifies a file contained within a zipfile, which can be accessed by reading that zipfile.
- LazyLoader
- OpenOnDemandZipFile: A subclass of zipfile.ZipFile that closes its file pointer whenever it is not using it, and re-opens it when it needs to read data from the zipfile.

Seekable Unicode Stream Reader
- SeekableUnicodeStreamReader: A stream reader that automatically decodes the source byte stream into unicode (like codecs.StreamReader), but still supports the seek() and tell() operations correctly.
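The open-on-demand idea behind OpenOnDemandZipFile can be sketched with the standard library alone. The class below is a hypothetical wrapper named `OnDemandZipReader`, not NLTK's implementation: NLTK's class is described as a subclass of zipfile.ZipFile, whereas this sketch simply stores the archive path and opens the archive for the duration of each call.

```python
import zipfile


class OnDemandZipReader:
    """Hypothetical wrapper that holds only the archive's path between reads."""

    def __init__(self, filename):
        self.filename = filename

    def read(self, member_name):
        # Open the archive just for this read; the with-block closes it again.
        with zipfile.ZipFile(self.filename) as archive:
            return archive.read(member_name)

    def namelist(self):
        # Same pattern: no file handle stays open after the call returns.
        with zipfile.ZipFile(self.filename) as archive:
            return archive.namelist()
```

The trade-off is one extra open per read in exchange for not keeping a file handle open while many archives are referenced.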
- path: A list of directories where the NLTK data package might reside.
- _resource_cache (WeakValueDictionary): A weakref dictionary used to cache resources so that they won't need to be loaded more than once.
- FORMATS: A dictionary describing the formats that are supported by NLTK's load() method.
- AUTO_FORMATS: A dictionary mapping from file extensions to format names, used by load() when format="auto" to decide the format for a given resource URL.
- d
Find the given resource from the NLTK data package, and return a
corresponding path name. If the given resource is not found, raise a

Copy the given resource to a local file. If no filename is specified,
then use the URL's filename. If there is already a file named

Load a given resource from the NLTK data package. The following resource formats are currently supported:
If no format is specified,

Write out a grammar file, ignoring escaped and empty lines.

Remove all objects from the resource cache. See Also: load()
Helper function that returns an open file object for a resource, given
its resource URL. If the given resource URL uses the 'nltk' protocol, or
uses no protocol, then use nltk.data.find to find its path, and open it with the
given mode; if the resource URL uses the 'file' protocol, then open the
file with the given mode; otherwise, delegate to
path: A list of directories where the NLTK data package might reside. These directories will be checked in order when looking for a resource in the data package. This allows users to substitute their own versions of resources, if they have them (e.g., in their home directory under ~/nltk/data).
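The in-order directory search described for path can be sketched like this. The function `find_in_paths` is hypothetical, and raising LookupError on a miss is an assumption of this sketch rather than something taken from the text above.

```python
import os


def find_in_paths(resource_name, search_paths):
    """Return the first existing match for resource_name (hypothetical helper)."""
    for directory in search_paths:
        # Resource names use "/" separators; join them portably.
        candidate = os.path.join(directory, *resource_name.split("/"))
        if os.path.exists(candidate):
            return candidate
    # Assumption of this sketch: signal a miss with LookupError.
    raise LookupError("Resource %r not found in %r" % (resource_name, search_paths))
```

Because earlier directories win, placing a personal data directory first in the list is enough to override a system-wide copy of the same resource.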
FORMATS: A dictionary describing the formats that are supported by NLTK's load() method. Keys are format names, and values are format descriptions.
AUTO_FORMATS: A dictionary mapping from file extensions to format names, used by load() when format="auto" to decide the format for a given resource URL.
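An extension-to-format lookup of the kind AUTO_FORMATS provides might look like the sketch below. The mapping `SAMPLE_AUTO_FORMATS` and the function `guess_format` are hypothetical, the listed entries are illustrative rather than a copy of NLTK's table, and returning None for an unknown extension is an assumption of this sketch.

```python
import os

# Illustrative sample only; not NLTK's actual AUTO_FORMATS contents.
SAMPLE_AUTO_FORMATS = {"pickle": "pickle", "cfg": "cfg", "yaml": "yaml"}


def guess_format(resource_url, auto_formats=SAMPLE_AUTO_FORMATS):
    """Map a resource URL's file extension to a format name (hypothetical helper)."""
    extension = os.path.splitext(resource_url)[1].lstrip(".")
    # Assumption of this sketch: unknown extensions yield None.
    return auto_formats.get(extension)
```

A caller would then fall back to an explicit format argument whenever the guess comes back empty.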
Generated by Epydoc 3.0beta1 on Wed Aug 27 15:08:50 2008 | http://epydoc.sourceforge.net |