Package encodings


Standard "encodings" Package

    Standard Python encoding modules are stored in this package
    directory.

    Codec modules must have names corresponding to normalized encoding
    names as defined in the normalize_encoding() function below, e.g.
    'utf-8' must be implemented by the module 'utf_8.py'.

    Each codec module must export the following interface:

    * getregentry() -> codecs.CodecInfo object
    The getregentry() API must return a CodecInfo object with encoder, decoder,
    incrementalencoder, incrementaldecoder, streamwriter and streamreader
    attributes which adhere to the Python Codec Interface Standard.

    In addition, a module may optionally also define the following
    APIs which are then used by the package's codec search function:

    * getaliases() -> sequence of encoding name strings to use as aliases

    Alias names returned by getaliases() must be normalized encoding
    names as defined by normalize_encoding().
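
    The interface above can be sketched as a small codec module. This is
    an illustration only, not a module shipped in the package: the codec
    name "demo_utf8" and the alias "demo_u8" are hypothetical, and the
    sketch simply delegates to the built-in UTF-8 functions in codecs.

```python
import codecs

# Stateless encode/decode functions; each returns (output, length_consumed).
def encode(input, errors="strict"):
    return codecs.utf_8_encode(input, errors)

def decode(input, errors="strict"):
    return codecs.utf_8_decode(input, errors, True)

class IncrementalEncoder(codecs.IncrementalEncoder):
    def encode(self, input, final=False):
        return codecs.utf_8_encode(input, self.errors)[0]

class IncrementalDecoder(codecs.BufferedIncrementalDecoder):
    # The base class buffers incomplete byte sequences between calls.
    _buffer_decode = codecs.utf_8_decode

class StreamWriter(codecs.StreamWriter):
    encode = codecs.utf_8_encode

class StreamReader(codecs.StreamReader):
    decode = codecs.utf_8_decode

def getregentry():
    # Required: return a CodecInfo describing this codec.
    return codecs.CodecInfo(
        name="demo_utf8",  # hypothetical name, for illustration only
        encode=encode,
        decode=decode,
        incrementalencoder=IncrementalEncoder,
        incrementaldecoder=IncrementalDecoder,
        streamreader=StreamReader,
        streamwriter=StreamWriter,
    )

def getaliases():
    # Optional: aliases must already be in normalized form.
    return ["demo_u8"]
```

    Saved as demo_utf8.py inside the package directory, such a module
    would be found under its normalized name; outside the package it
    only demonstrates the shape of the interface.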

Written by Marc-Andre Lemburg (mal@lemburg.com).

(c) Copyright CNRI, All Rights Reserved. NO WARRANTY.

Classes
  CodecRegistryError
Functions
  normalize_encoding(encoding)
      Normalize an encoding name.
  search_function(encoding)
Variables
  _cache = {}
  _unknown = '--unknown--'
  _import_tail = ['*']
  _norm_encoding_map = ' ...
  _aliases = {'037': 'cp037', '1026': 'cp1026', '1140': 'cp1140'...

Imports: codecs, types, aliases, ascii, cp869, latin_1, utf_8


Function Details

normalize_encoding(encoding)

Normalize an encoding name.

Normalization works as follows: each run of non-alphanumeric characters, other than the dot used for Python package names, is collapsed into a single underscore, e.g. ' -;#' becomes '_'. Leading and trailing underscores are removed.

Note that encoding names should be ASCII only; if they do use non-ASCII characters, these must be Latin-1 compatible.
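
The rules above can be sketched with a regular expression. This is a minimal re-implementation for illustration, assuming ASCII input; the helper name normalize_name is hypothetical and is not the package's own function.

```python
import re

def normalize_name(encoding):
    """Illustrative sketch of the normalization rules (hypothetical helper)."""
    # Collapse every run of characters that are not ASCII letters,
    # digits, or the dot into a single underscore.
    collapsed = re.sub(r"[^A-Za-z0-9.]+", "_", encoding)
    # Drop any leading and trailing underscores.
    return collapsed.strip("_")

print(normalize_name("utf-8"))           # utf_8
print(normalize_name("iso -;# 8859.1"))  # iso_8859.1
```

Under these rules 'utf-8' maps to 'utf_8', which is why that codec must live in the module 'utf_8.py'.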


Variables Details

_norm_encoding_map

Value:
'                                              . 0123456789       ABCD\
EFGHIJKLMNOPQRSTUVWXYZ      abcdefghijklmnopqrstuvwxyz                \
                                                                      \
                                               '
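
The value shown is a 256-character translation table in which letters, digits and the dot map to themselves and every other position is a space. A sketch of how such a table could be built and applied follows; the names table and normalize_with_table are assumptions made for illustration, and the split-and-join step is one plausible way to turn the spaces into single underscores.

```python
import string

# Characters that survive normalization unchanged.
KEEP = string.ascii_letters + string.digits + "."

# One entry per code point 0..255: the character itself or a space.
table = "".join(chr(i) if chr(i) in KEEP else " " for i in range(256))

def normalize_with_table(encoding):
    # str.translate indexes the table by code point; split() then drops
    # runs of spaces, and join() inserts a single underscore per run.
    return "_".join(encoding.translate(table).split())

print(normalize_with_table("utf-8"))  # utf_8
```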

_aliases

Value:
{'037': 'cp037',
 '1026': 'cp1026',
 '1140': 'cp1140',
 '1250': 'cp1250',
 '1251': 'cp1251',
 '1252': 'cp1252',
 '1253': 'cp1253',
 '1254': 'cp1254',
...
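
The alias table maps a normalized alias to the name of the module that implements the codec. A minimal sketch of how a lookup might combine normalization with the alias table is given below. It uses only the entries visible in the listing above; the names SAMPLE_ALIASES and candidate_modules are hypothetical, and the candidate order (aliased name first, then the normalized name) is an assumption about the search function.

```python
import re

# A few entries copied from the listing above; the real table is longer.
SAMPLE_ALIASES = {
    "037": "cp037",
    "1026": "cp1026",
    "1140": "cp1140",
    "1250": "cp1250",
}

def candidate_modules(encoding, aliases=SAMPLE_ALIASES):
    """Return module names to try for an encoding (illustrative sketch)."""
    # Normalize: runs of non-alphanumerics (except '.') become one '_'.
    norm = re.sub(r"[^A-Za-z0-9.]+", "_", encoding).strip("_")
    candidates = []
    aliased = aliases.get(norm)
    if aliased is not None:
        candidates.append(aliased)  # try the aliased module first
    candidates.append(norm)         # then the normalized name itself
    return candidates

print(candidate_modules("1250"))   # ['cp1250', '1250']
print(candidate_modules("utf-8"))  # ['utf_8']
```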