Python Comments

Author: Dave Kuhlman
Address:
dkuhlman@rexx.com
http://www.rexx.com/~dkuhlman
Revision: 1.1a
Date: Feb. 4, 2007
Copyright: Copyright (c) 2006 Dave Kuhlman. All Rights Reserved. This software is subject to the provisions of the MIT License http://www.opensource.org/licenses/mit-license.php.

Abstract

Various notes on Python, Jython, training, etc.

2007/02/04

sys.path, PYTHONPATH, site.py, etc

First, let's understand what we are trying to accomplish. We are trying to figure out where Python looks for modules to be imported and how to tell Python where to look.

When Python processes an import statement, it searches directories listed in sys.path and it searches in those directories in the order that they occur in sys.path. This implies the following:

  1. If a directory is listed in your sys.path, Python will find modules to import there. If not, it won't.
  2. Python checks (searches) the directories listed in sys.path in the order in which they occur in sys.path.
  3. If Python finds a module to be imported in a directory in your sys.path, it stops looking for that module, and, therefore, it will not find the module in a directory listed later in your sys.path.
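The third point can be checked with a small experiment. The following is a minimal sketch, written for Python 3, in which the module name dupmod and the two temporary directories are invented purely for illustration. It places a module with the same name in two directories, puts both directories at the front of sys.path, and then looks at which copy the import picks up:

```python
import importlib
import os
import sys
import tempfile

def make_module(directory, value):
    """Write a tiny module named dupmod.py (a hypothetical name) into directory."""
    with open(os.path.join(directory, 'dupmod.py'), 'w') as f:
        f.write('VALUE = %r\n' % value)

# Two scratch directories, each holding its own copy of dupmod.py.
first_dir = tempfile.mkdtemp()
second_dir = tempfile.mkdtemp()
make_module(first_dir, 'first')
make_module(second_dir, 'second')

# Put both directories at the front of sys.path, first_dir ahead of second_dir.
sys.path[0:0] = [first_dir, second_dir]
importlib.invalidate_caches()

import dupmod

# The copy in the earlier directory is the one that was found.
print(dupmod.VALUE)
print(dupmod.__file__)
```

The copy in second_dir is never looked at, because the search stops at the first directory that contains a match.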

You can find out what is in sys.path using something like the following:

>>> import sys
>>> for path in sys.path:
...     print path

So, if sys.path is the key to determining where Python looks for modules to be imported and the key to controlling where Python looks, how do items (directories) get into sys.path? The next section explains this.

The process

These notes were taken from my python2.x/site.py and then a bit of experimentation. You may want to look at python2.x/site.py yourself.

  1. xxx
  2. xxx

An important thing to note is that .pth files are processed last.

Complications -- easy_install and setuptools

If you have used easy_install on your system and you did not specify the -m or --multi-version option, then easy_install creates and adds directories to an easy-install.pth file. But, it is even a bit more complex than that, because easy_install also puts a bit of code in easy-install.pth that forces the paths in which it has installed a package to the top of sys.path. Furthermore, this apparently happens after the PYTHONPATH environment variable is processed. So, you cannot

And, easy_install also appears to add its own site.py, and tricks Python into using the easy_install version of site.py instead of the standard one. See these notes:

So perhaps a message to somebody would help, but I don't know who:

I've noticed some strange behavior on my machine since installing pylons. I believe that it is done by easy_install, but I'm not sure.

There is also a site.py in lib/python2.5/site-packages (in addition to the one in lib/python2.5). Where did that come from? Did easy_install put it there? There are very few comments in it, as though someone does not want me to know who put it there.

Does anyone have a recommended method of overriding paths inserted into sys.path by easy_install? Basically, I want to force lib/python2.5/site-packages to the front/top. I've been using a sitecustomize.py file in my current directory, but that has the feel of a kludge to it.

Thanks for help.

Dave

What to do

Some options:

  • Modify python2.x/site.py -- Wrong! This file is distributed with Python. It will be overwritten the next time you upgrade. That is probably not what you want.
  • Add a sitecustomize.py to each directory where you need your custom path. The down-side here is remembering to add this file. On Linux/UNIX you can make it slightly more convenient by placing a single copy in a common place, then creating a symbolic link to it from each needed directory. That will make it easier to update someday.
  • Add your own .pth file. If you name your file something like zzz_mypaths.pth, your file will be processed last and will win.
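To see how a .pth file is picked up, here is a minimal sketch for Python 3. It uses site.addsitedir() from the standard library, which processes the .pth files in a given directory the same way they are processed in site-packages. The two temporary directories are stand-ins invented for this example, and zzz_mypaths.pth is the file name suggested above:

```python
import os
import site
import sys
import tempfile

site_dir = tempfile.mkdtemp()    # stands in for a site-packages directory
extra_dir = tempfile.mkdtemp()   # a directory we want Python to search

# Each plain line in a .pth file names a directory to add to sys.path.
with open(os.path.join(site_dir, 'zzz_mypaths.pth'), 'w') as f:
    f.write(extra_dir + '\n')

# Process the .pth files found in site_dir.
site.addsitedir(site_dir)

added = os.path.abspath(extra_dir) in sys.path
print(added)
```

Note that a plain directory line is appended to the end of sys.path, so it is worth printing sys.path on your own installation to check where your entry lands relative to the ones that easy_install has inserted.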

2006/07/28

Namespaces

Explanation

A variable is a name bound to a value in a namespace.

A namespace is a dictionary in which Python can look up a name to obtain its value (if the name is bound there). Note that the use of dictionaries to implement namespaces is an implementation detail specific to Python. Other programming languages implement namespaces differently.

Names refer to objects. Objects can be integers, tuples, lists, dictionaries, strings, functions, classes (themselves), instances of classes, and other Python built-in types. And, don't forget, references to objects can also be held in data structures (lists, dictionaries, etc).

Determining which namespace a name is in is static. It can be determined by a lexical scan of the code. If a variable is assigned a value anywhere in a scope (specifically within a function or method body), then that variable is local to that scope. If Python does not find a variable in the local scope, then it looks next in the global scope (also sometimes called the module scope) and then in the built-ins scope. But, the global statement can be used to force Python to find and use a global variable (a variable defined at top level in a module) rather than create a local one.

Determining whether a name is bound (has a value) in a namespace is dynamic. You must follow the logic of the code in order to determine (1) when a variable has been bound to a value and (2) what value the variable has been bound to at any given point in the execution of the program. A variable is given a value at a certain time during the execution of the code in a scope. For example, in the following function, the variable count is not bound to a value until the end of the first iteration of the loop:

def test_dynamic():
    for idx in range(5):
        if idx > 1:
            x = count
            print 'x:', x
        count = idx

In Python, since the use of objects and references to them are so pervasive and consistent, we sometimes conflate a variable and the object it refers to. So, for example, if we have the following code:

total = 25
items = [11, 22, 33]
def func1():
    pass
class Class1:
    pass

we sometimes say:

  • total is an integer.
  • items is a list.
  • func1 is a function.
  • Class1 is a class.

But, if we were more careful, we might say:

  • total is a variable that refers to an integer object.
  • items is a variable that refers to a list object.
  • func1 is a variable that refers to a function object.
  • Class1 is a variable that refers to a class object.

Or, even:

  • total is a name bound to an integer object in the current namespace.
  • items is a name bound to a list object in the current namespace.
  • func1 is a name bound to a function object in the current namespace.
  • Class1 is a name bound to a class object in the current namespace.
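One place where the more careful wording pays off is when two names are bound to the same object. A minimal sketch (the name alias is invented for this example):

```python
items = [11, 22, 33]
alias = items            # binds a second name to the same list object

alias.append(44)         # modifies the one object that both names refer to

same_object = alias is items
print(same_object)
print(items)

items = [1, 2]           # rebinds the name 'items'; 'alias' is unaffected
print(alias)
```

Saying "items is a list" hides the fact that the assignment on the second line copies a reference, not the list.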

Rules for determining in which scope a name is bound

Trying to keep this as simple as possible ... might have to ignore a few corner cases ...

Use these rules to determine the scope of a name/variable:

  1. If the name or variable is assigned in a scope, then the variable is bound in that scope.
  2. Else, the name/variable is bound in the global or module scope, or is a built-in.
  3. Else, the name is undefined.

In order to force Python to use a global variable when it is assigned a value in a function, use the global statement.

A few additional notes:

  • Modules and functions (and methods) create scopes.

  • Python does not treat statement blocks as separate scopes. For example, the body or block in an if/else or for statement does not create a separate scope.

  • A class does not create a scope that encloses its methods (although each method creates its own scope). For example an instance of the following class:

    ClassGlobal1 = 'Class global data'
    
    class TestScope:
        ClassGlobal1 = 'Class local data'
        def show(self):
            print '(TestScope.show) ClassGlobal1: %s' % ClassGlobal1
    

    references the global (module) variable ClassGlobal1, not the one defined in the class, and will print out:

    (TestScope.show) ClassGlobal1: Class global data
    

    In order to access the class level variable, I must qualify it with the class, for example, as follows:

    TestScope.ClassGlobal1
    
  • Although it may seem obvious, if from moduleA I import moduleB, then call function1 in moduleB, it is moduleB's module scope that is seen by function1, not moduleA's. In other words, it is lexical scope that matters, not dynamic scope. This is true by design, and so that I can understand moduleB and the code in it by reading that code and without knowing the code in moduleA or any other module from which function1 might be called.
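The notes about statement blocks and classes can be tried out directly. The following is a minimal sketch for Python 3, with names (blocks_share_scope, Widget, LABEL) invented for illustration:

```python
def blocks_share_scope():
    for idx in range(3):
        if idx == 2:
            found = idx          # bound inside an if block inside a for block
    # Both names are still visible here: blocks are not scopes.
    return idx, found

LABEL = 'module level'

class Widget:
    LABEL = 'class level'

    def bare(self):
        # An unqualified name skips the class and finds the module global.
        return LABEL

    def qualified(self):
        # Qualifying with the class reaches the class-level variable.
        return Widget.LABEL

print(blocks_share_scope())
print(Widget().bare())
print(Widget().qualified())
```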

Namespace binding statements etc

All the following Python statements bind a value to a name (a variable) in a namespace:

  • The assignment statement -- Binds the computed value of an expression (on the right-hand side of the assignment operator) to a name.
  • The def statement -- Creates a function object and binds that function object to a name.
  • The class statement -- Creates a class object and binds that class object to a name.
  • The import statement -- Evaluates a module (or a package) and binds that module object or an object within the module object to a name.
  • The for statement -- Binds the next item(s) from a sequence or iterator to a name.
  • The except: clause in a try: statement -- Binds an exception to a name.
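As a rough check on the list above, the following sketch for Python 3 uses each of those statements inside one function and then asks locals() which names ended up bound. The function and variable names are invented for illustration. One detail differs from the Python 2 described in these notes: Python 3 unbinds the name in an except clause when the clause ends, so the sketch copies the message to another name first.

```python
def binding_demo():
    total = 1 + 2                     # assignment statement

    def helper():                     # def statement
        pass

    class Thing:                      # class statement
        pass

    import math                       # import statement

    for item in [1, 2]:               # for statement
        pass

    try:
        raise ValueError('oops')
    except ValueError as err:         # except clause
        message = str(err)            # 'err' itself is unbound after the clause

    return sorted(locals())

names = binding_demo()
print(names)
```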

Also, function or method parameters bind an actual parameter value from a call of the function to the formal parameter name in the local (function/method) scope. Or, if the actual value is omitted in the call and there is a default value, the default value is bound to the formal parameter name in the local scope. Additional notes: (1) The binding of actual values to formal parameters happens each time the function is called. (2) Default values, if the function definition has any, are evaluated only once, which is why you usually do not want to use mutable objects as default values. Specifically, do:

def f(values=None):
    if values is None:
        values = []
    ...

Do not do:

def f(values=[]):
    ...

The second of the above creates only one empty list which is shared by all invocations that omit the parameter.
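The sharing is easy to observe. In this minimal sketch (function names invented for illustration), each function appends to the list it was given, or to its default when none is passed:

```python
def append_shared(value, values=[]):
    # The default list is created once, when the def statement runs.
    values.append(value)
    return values

def append_fresh(value, values=None):
    # A new list is created on each call that omits the parameter.
    if values is None:
        values = []
    values.append(value)
    return values

# Copy each result so that later calls do not change what we recorded.
shared_results = [list(append_shared(n)) for n in (1, 2)]
fresh_results = [list(append_fresh(n)) for n in (1, 2)]

print(shared_results)
print(fresh_results)
```

The second call to append_shared sees the item left behind by the first call, while each call to append_fresh starts with an empty list.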

Functions and classes and other objects

In Python, it is important to understand that functions and classes (and other types of objects too) are objects that can be referred to by a variable, stored in a data structure (e.g. a list or dictionary), passed to a function, returned by a function, etc.

It is also important to realize that variables in Python are simply names in a namespace that refer to objects of some kind.

Or, summarized in other words:

  1. Names are references to objects and

  2. Objects are first class, which means that we can:

    • Store them in variables.
    • Store them in data structures.
    • Pass them to functions and methods.
    • Return them from functions and methods.

    Although, perhaps we should qualify the above a bit by saying that variables and data structures hold references to objects, and that we can pass references to objects into functions, and so on.
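Each of the four capabilities can be shown in a few lines. This is a minimal sketch with names invented for illustration:

```python
def double(x):
    return x * 2

def square(x):
    return x * x

# Store (references to) functions in a variable and in a data structure.
op = double
operations = {'double': double, 'square': square}

def apply_twice(func, value):
    # Receive a function as an argument and call it.
    return func(func(value))

def make_adder(n):
    # Build a new function object and return it.
    def add(x):
        return x + n
    return add

print(op(4))
print(apply_twice(operations['square'], 3))
print(make_adder(5)(1))
```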

Accessing namespace dictionaries

The built-in functions globals() and locals() return the dictionaries that represent the global and the local namespaces respectively. Caution: Although you can get a dictionary that represents the current global or local namespace, it is questionable that you can modify a namespace by modifying the dictionary returned by globals() or locals(). Actually, it seems to work with globals(), but definitely fails in some cases with locals().
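The caution can be illustrated with a small experiment. This is a minimal sketch for Python 3 (names invented for illustration). It shows what CPython does, and it is not a recommendation to create variables this way:

```python
# Adding a key to the globals() dictionary creates a module-level name.
globals()['created_via_globals'] = 42
print(created_via_globals)

def try_to_set_local():
    value = 1
    # Inside a function, writing to the dictionary returned by locals()
    # does not change the real local variable.
    locals()['value'] = 99
    return value

print(try_to_set_local())
```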

Nested scopes -- Note that for lexically/statically nested scopes (for example, a function defined inside a function), it seems that globals() and locals() still give access to all items in the accessible namespaces, but do not give dictionary style access to all visible scopes. In particular, variables in nested scopes are not included in locals() and globals(). For more on this, see PEP 227: Statically Nested Scopes.

When you want to inspect Python's namespaces and symbol tables, also look at the built-in functions dir() and vars(). dir() returns a (not necessarily complete) list of names in the current local symbol table. vars() returns a dictionary corresponding to the current local symbol table. Note that dir() and vars() can take an optional object as an argument. See the Python documentation (below) for more on this.
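Here is a minimal sketch of dir() and vars() applied to an object, rather than to the current scope. The class Point is invented for this example:

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def norm1(self):
        return abs(self.x) + abs(self.y)

p = Point(1, 2)

# vars(obj) returns the object's own attribute dictionary.
instance_attrs = vars(p)
print(instance_attrs)

# dir(obj) lists names reachable from the object, including those
# that come from its class.
names = dir(p)
print('x' in names, 'norm1' in names)
```

vars(p) holds only the instance attributes, while dir(p) also reports the method norm1, which lives in the class.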

For more information on the above built-in functions, see 2.1 Built-in Functions in the Python Library Reference.

Additional information

More help and explanation on names, namespaces, bindings, etc is here:

2006/06/27

Global variables

When and why are global variables considered evil? Are there ways to reduce their evil-ness?

Many programming "authorities" consider global variables harmful. Why? Perhaps if we can answer this question we could ...

What is wrong with global variables? Why are they harmful?

  • Global variables make our code more difficult to understand. Our code becomes less logical and coherent if its behavior changes depending on a variable that is modified by unrelated parts of a system or if a section of our code can change the (future) behavior of unrelated sections of our code.
  • Global variables increase the difficulty of maintaining our code. One reason is that with increased use of global variables, it becomes more difficult to find all the places where a global variable is modified. So, when we read a section of code whose behavior is affected by a global variable, before we can understand that code, we must track down and understand all the locations where that global variable is modified.

A few comments on global variables in Python:

  • Global variables used as constants are less dangerous than those which are modified during execution.
  • The global statement in Python alerts us to functions that might modify a global variable.
  • Most globals in Python are global within a module. If you avoid modifying module globals from another module, then it is usually easy to search and determine where a global variable is used and modified.

If and when you must use global variables, here are things that you can do:

  • Constants are less evil than globals which are modified.
  • Limit the range of access of a global variable. Module global is better than system-wide global. A (class) variable that is global across an entire class is better than a variable that is global across an entire module.
  • Controlled and restricted access and modification is better. If you can specify and enforce simple rules that control (1) the access to and (2) the modification of a global variable, then your code will be both more understandable and more maintainable.
  • Documented variables are better than un-documented ones. If you can explain in several brief sentences, what a global variable is for, who (what function, method, or class) can use it, who can modify it, and under what conditions, and why, then your use of that global variable is less harmful.
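One way to apply several of these suggestions at once is to keep the variable private to its module, document it, and route every modification through one small function that enforces the rules. This is only a sketch of that idea. The names (_level, get_level, set_level) and the rule that the level must be a non-negative integer are invented for illustration:

```python
# DEFAULT_LEVEL is a constant: it is never modified after this point.
DEFAULT_LEVEL = 1

# _level: the current verbosity level for this module.
# Read it with get_level(); modify it only with set_level().
_level = DEFAULT_LEVEL

def get_level():
    """Return the current verbosity level."""
    return _level

def set_level(new_level):
    """Set the verbosity level; it must be a non-negative integer."""
    global _level    # the global statement flags this as the one modifier
    if not isinstance(new_level, int) or new_level < 0:
        raise ValueError('level must be a non-negative integer')
    _level = new_level

set_level(3)
print(get_level())
```

With this arrangement, a search for "global _level" finds the only place where the variable is changed.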

2006/06/20

Interfaces

Introduction

Python is less rigid than some languages with respect to interfaces. And, while there are several implementations of some of the capabilities provided by interfaces (for example, in Zope), there is, as of this date, no implementation of interfaces in the Python standard library.

So, in Python, what do we mean by an interface? An interface, sometimes called a protocol, is a description of the methods and their signatures that must be implemented by a class in order to satisfy the protocol. If a class implements those methods, then we say that the class implements the interface/protocol and we say that the interface is provided by the class (that implements it).

Why would we care about defining and implementing interfaces? An interface can serve as a way for a service provider to tell a service consumer/user about the characteristics of a class or instance that the user must provide. An example is provided by the standard input, output, and error streams in the Python sys module:

stdin stdout stderr

File objects corresponding to the interpreter's standard input, output and error streams. stdin is used for all interpreter input except for scripts but including calls to input() and raw_input(). stdout is used for the output of print and expression statements and for the prompts of input() and raw_input(). The interpreter's own prompts and (almost all of) its error messages go to stderr. stdout and stderr needn't be built-in file objects: any object is acceptable as long as it has a write() method that takes a string argument. (Changing these objects doesn't affect the standard I/O streams of processes executed by os.popen(), os.system() or the exec*() family of functions in the os module.) [emphasis added]

See: 3.1 sys -- System-specific parameters and functions (and search for "stdin").

This suggests that we adopt the following point of view -- If I learn the required interface for a replacement for sys.stdout, then I have learned how to implement a class that redirects or filters output from my program. In a similar way, the developer of a framework can tell me how to implement a class, so that I can pass the class or an instance of the class to the framework, and by doing so can customize the behavior of the framework.
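As an illustration of that point of view, here is a minimal sketch for Python 3 of a replacement for sys.stdout. The class UpperCaseWriter is invented for this example. The quoted documentation requires only a write() method; the flush() method is an extra added here as a precaution, since some code calls it on output streams:

```python
import sys

class UpperCaseWriter:
    """Collects everything written to it, converted to upper case."""
    def __init__(self):
        self.chunks = []

    def write(self, text):
        # The one method the protocol requires.
        self.chunks.append(text.upper())

    def flush(self):
        # Not required by the quoted description; harmless to provide.
        pass

writer = UpperCaseWriter()
saved_stdout = sys.stdout
sys.stdout = writer
try:
    print('hello')               # goes to writer.write(), not the terminal
finally:
    sys.stdout = saved_stdout    # always restore the real stream

captured = ''.join(writer.chunks)
print(repr(captured))
```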

Let's categorize several approaches to interfaces in Python from strict to loose:

  1. Strict -- Example, Zope interfaces. This implementation actually gives interfaces behaviors that I can use in my code. For example, I can ask whether a class or instance implements a given interface.
  2. Python abstract classes -- Use a Python class with method headers and documentation but no implementation as a means of describing your interface.
  3. Loose -- "Duck-typing" -- If it looks like a duck and quacks like a duck ... With this approach, I give a text description of the requirements for a class, and by doing so, inform a user of the minimum requirements that must be satisfied in order to use that class or an instance of it for a specific purpose.

Zope interfaces

Zope provides an implementation of interfaces for Python. Using Zope interfaces, if you are not already using Zope, might be more trouble than it is worth. And, it will require all of your users to install it also. Still, especially for someone who is interested in tutoring and teaching new Python programmers, possibly programmers who are familiar with Java, Zope interfaces may be worth thinking about.

Although interfaces are implemented in the Zope distribution, the interface package can also be installed separately and can be used outside of Zope applications.

Here are a few of the capabilities provided by Zope interfaces (copied from my_zope_install/lib/python/zope/interface/README.txt):

  • We can ask an interface whether it is implemented by a class:

    >>> IFoo.implementedBy(Foo)
    True
    
  • We can ask whether an interface is provided by an object:

    >>> foo = Foo()
    >>> IFoo.providedBy(foo)
    True
    
  • We can ask what interfaces are implemented by an object:

    >>> list(zope.interface.implementedBy(Foo))
    [<InterfaceClass __main__.IFoo>]
    

For more information on the Zope implementation of interfaces, see:

Here is a trivial example:

from zope.interface import Interface, implements

class IA(Interface):
    def show(msg):
        """Show this object.
        """

class A:
    implements(IA)
    def show(self, msg):
        print '(A.show) msg: "%s"' % msg

def test():
    a = A()
    a.show('hello')
    print IA.implementedBy(A)

test()

Notes:

  • In order for the above example to work, you will need to do one of the following:
    1. Add your_zope_install/lib/python to your PYTHONPATH environment variable. This requires installing Zope.
    2. Install the Zope interface package, which is available at Welcome to the Interfaces Wiki
  • One advantage of using Zope interfaces (over the more informal methods described below) is that interfaces that you define and the zope.interface module itself have capabilities that help you with the use of interfaces.

Python abstract classes

We can document an interface by providing a Python class containing method headers and method doc-strings but no method implementations.

Here is an example:

class MyAbstract:
    def __init__(self, name):
        if self.__class__ == MyAbstract:
            raise NotImplementedError, 'class MyAbstract is abstract'
    def write(self, msg):
        """Write a message.
        """
    def show(self):
        """Display the name etc.
        """

class MyConcrete(MyAbstract):
    """This class implements interface MyAbstract.
    """
    def __init__(self, name):
        MyAbstract.__init__(self, name)
        self.name = name
    def write(self, msg):
        print '(MyConcrete:%s) msg: %s' % (self.name, msg, )
    def show(self):
        print '(MyConcrete) name: %s' % self.name

Notes:

  • The check for the creation of an instance of the abstract class is only for the paranoid among us.
  • There is actually no need for class MyConcrete to inherit from class MyAbstract. Class MyAbstract is merely serving to document the interface.
  • Our empty methods in the abstract class do not require a pass statement, even though their bodies are empty. The doc-string serves this purpose.

Using documentation and "duck-typing"

We can describe the interface or protocol in text, saying something like "any class that has a method named this with these arguments and a method named that with those arguments and ...". Needless to say, if there are more than a couple of methods, you are likely to want to switch to the use of an abstract class to document your interface.
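A minimal sketch of this loose approach follows. The whole interface is stated in one sentence of the doc-string, and any object that satisfies it will do. The names log_to and ListSink are invented for illustration, and io.StringIO is from the Python 3 standard library:

```python
import io

def log_to(destination, message):
    """Write message to destination.

    destination may be any object that has a write(text) method
    taking one string argument.
    """
    destination.write(message + '\n')

class ListSink:
    """An unrelated class that happens to satisfy the description."""
    def __init__(self):
        self.lines = []

    def write(self, text):
        self.lines.append(text)

buffer = io.StringIO()
sink = ListSink()
for target in (buffer, sink):
    log_to(target, 'started')

print(repr(buffer.getvalue()))
print(sink.lines)
```

Neither class inherits from, or declares, anything; having the right method is enough.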

Error checking for interfaces

The following example might provide suggestions for those who want to do a bit of error checking in code that requires a class or instance that must provide an interface:

import inspect

class A:
    def show(self, level):
        print '(A.show) level: %s' % level

class B:
    def show(self):
        print '(B.show)'

#
# This function requires an object that implements a method named 'show'
#   which takes one argument in addition to self.
#
def test_interface(obj):
    # Does it have a 'show' attribute?
    if not hasattr(obj, 'show'):
        raise RuntimeError, 'obj must support method show'
    meth = getattr(obj, 'show')
    argnames = inspect.getargspec(meth)[0]
    # Does it take exactly one argument (in addition to self)?
    if len(argnames) != 2:
        raise RuntimeError, 'method show must take two args (self + level)'
    obj.show(25)

def test():
    a = A()
    # This call is OK.
    test_interface(a)
    b = B()
    # This call generates an exception.  Class B does not
    #   support the required protocol.
    test_interface(b)

test()

Running the above code produces the following output:

(A.show) level: 25
Traceback (most recent call last):
  File "tmp.py", line 33, in ?
    test()
  File "tmp.py", line 31, in test
    test_interface(b)
  File "tmp.py", line 24, in test_interface
    raise RuntimeError, 'method show must take two args (self + level)'
RuntimeError: method show must take two args (self + level)

Notes:

  • Module inspect is from the Python standard library.
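The example above is written for Python 2. For readers on Python 3, where inspect.getargspec() is no longer available in recent releases, the same check can be sketched with inspect.signature(). This is one possible translation, not the only one. The function name check_show and the choice of TypeError are assumptions made for this sketch. Note that the signature of a bound method does not include self:

```python
import inspect

class A:
    def show(self, level):
        return '(A.show) level: %s' % level

class B:
    def show(self):
        return '(B.show)'

def check_show(obj):
    """Raise TypeError unless obj has a method show taking one argument."""
    show = getattr(obj, 'show', None)
    if not callable(show):
        raise TypeError('obj must support method show')
    # For a bound method, 'self' is already excluded from the parameters.
    params = inspect.signature(show).parameters
    if len(params) != 1:
        raise TypeError('method show must take one arg in addition to self')
    return show(25)

print(check_show(A()))
try:
    check_show(B())
except TypeError as exc:
    print('rejected:', exc)
```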

Summary

A few words in summary:

  • Interfaces are sometimes called "protocols" in the Python community.
  • Interfaces are "a good thing". They make our code more understandable. They give the developer of a framework a way to enable a user of the framework to customize the behavior of the framework. We should use more of them.
  • In Python, interfaces (also known as protocols) are about as informal as you want to make them.
  • Python (as of this date) has no built-in support for interfaces. But, see Python enhancement proposal PEP 245 -- Python Interface Syntax and the Zope Interfaces Wiki ("Welcome to the Interfaces Wiki").
  • Since there is no language support in Python for interfaces, when you specify an interface that a user must follow, you are likely to want to do some error checking in the code that uses a class or instance that should support an interface. For example, you might want to check to determine that a class or instance implements a particular method. See section Error checking for interfaces above for an example.

The following articles provide additional reading on interfaces:

2006/05/24

Simplicity and complexity

Python is a simple language.

Python is an advanced and complex language.

The classes I have taught have been 3 and 4 day classes. It is not possible to teach all of Python in that amount of time. So, designing the contents of a class requires balance and compromise between covering the basic parts and covering the advanced parts.

So, to start off a discussion, I'll list the basic parts and the more advanced parts.

Basic:

  • Lexical structure
  • Built-in data types
  • Statements
  • Functions -- (1) Calling; (2) defining.
  • Modules and import
  • Basic OOP (object-oriented programming): (1) Creating and using instances; (2) defining simple classes.

Advanced:

  • Iterators and generators -- (1) List comprehensions; (2) generator expressions; (3) defining generator functions; (4) defining generator classes.
  • New style classes
  • Advanced OOP: (1) Defining sub-classes; (2) class variables; (3) emulating built-in data types; (4) generator/iterator classes.

2006/05/20

Course coverage; student needs

A Jython course, in particular, seems to be attended by students with different needs: (1) some need to write and maintain scripts; (2) some need to extend Jython with Java; (3) some need to embed Jython in a Java application. I'm suggesting that you get this divergence out on the table and that you teach separate sections for these separate needs.

And, what if you cannot separate your training into two separate classes for two groups of students? Some suggestions:

  • Try to keep everyone interested. Argue that review is good for those who already know Python and are here for the Java/Jython-specific parts. Argue that a little knowledge of the Java/Jython aspects will help script writers (those not interested in embedding and extending Jython) to know what they can ask for and expect from the maintainers of their Jython environment.
  • Be prepared with separate sets of exercises. At least try to keep class members busy with tasks that are in their area of interest.
  • Frequently switch back and forth between these areas. Explain how knowledge in one area is useful in the other. For example:
    • Explain how knowing Python and knowing the needs of Python programmers will be helpful to the person who will embed Jython into a Java application.
    • Explain how knowing how to embed Jython in a Java application will be helpful to the person who needs to use Jython in an embedded environment by giving them hints about what features and capabilities to ask for.
    • Explain how knowing what can be added to a Java class in support of Jython can help a script maintainer/writer to know what support to ask for from the Java people.

The next three sub-sections give a very high level description of suggested contents for these separate courses (or sessions).

Script writers and maintainers

Prerequisites -- None, but familiarity with some other programming language is helpful, and familiarity with an object-oriented language is a plus.

Content:

  • Programming in Python: (1) Introduction to Python; (2) Python lexical matters; (3) built-in data types; (4) statements; (5) functions; (6) modules; (7) classes and OOP.
  • Using Java class libraries from Jython.

Extending Jython in Java

Advantages and limitations of adding special Jython support to a Java class.

Features:

  • Doc strings
  • Emulating Python/Jython built-in types.

Jython embedders

Prerequisites -- (1) Knowledge of Python; (2) knowledge of Java.

Content:

  • Creating the interpreter.
  • Evaluating scripts.
  • Passing values between Java and Jython.
  • Catching exceptions -- Trapping errors and exceptions thrown in Jython scripts.

2006/05/16

Class materials

I typically use my class notes as my materials for the class. These materials are posted at my Web site here: http://www.rexx.com/~dkuhlman/#proposed-python-courses

I usually ask the class members to open a Web browser window and point it at the relevant materials. I also point out that the text version in reST (reStructuredText) can be found by following the link at the bottom of the page. Some students like to take notes by loading the text version into a text editor and adding their notes to it.

Docutils can also be used to produce LaTeX from reST and then I use pdflatex to produce PDF. One training company I worked with took the PDF and printed paper copies. Some students prefer that, although I do not know why.

2006/05/12

Jython training, cont'd

Questions

This is one of the aspects of Python/Jython training that gives me the most difficulty. There are classes where I stop and ask for questions, and cannot get a response. I try probing. I try to start a question session by asking questions of the class. But, sometimes there is nothing. I believe that some class members believe that they are being polite by not putting me on the spot with a difficult question. But, actually, responding to questions is the part I like the most.

If pulling questions out of the class is the part that gives you difficulty, too, here are a few things you might try:

  • Threaten that if you do not get questions, you will ask questions.
  • Refuse to move on to the next topic until you get at least one question.
  • After finishing a topic, ask the class, "Now, how would you use that?" What you are hoping to do is to initiate a discussion.

Agenda

Having an agenda or schedule is important. It establishes goals for the class, which will be of help and guidance for you, and also informs class members on what they can expect. It also gives you something to keep yourself on track.

Prepare a one to two page agenda. I usually break it up into a description of morning and afternoon sessions. You can find an example here: Jython Agenda. Note that this, like most of my documents, was generated from reStructuredText (reST) using Docutils, and you will find a link to the source document at the bottom of the page.

2006/05/11

Jython training

A few notes based on my experiences in teaching Jython/Python:

Teaching Python vs. Teaching Jython

Teaching Jython breaks down into two major areas:

  1. Learning the Python language, because Jython is Python. That's the beauty of Jython.
  2. Learning how Jython connects with Java, because Jython gives us such powerful and easy access to Java. That's also the beauty of Jython.

In a typical class, you are likely to face two categories of students:

  1. Those who do not know Python and want to learn it so that they can write Python/Jython code.
  2. Those who already know Python, and want to learn the Jython-specific Java part.

The difficulty is that the first type of student will be lost when you get to the Java part, and the second type of student will be bored while you do the Python part.

An ideal solution would be to teach two separate classes for the two types of students. But, you will often not have that option.

My own solution so far has been to try to make the Python sessions as lively as possible and to make the Jython/Java sessions as basic as possible. That's not the best of solutions, but it's the best I have been able to do so far. One possible tweak on this approach is to try to save the last afternoon of the training for practical exercises and individual projects. This allows each student to pick the area where s/he has an interest and feels a need to learn. In my last training, this approach was reasonably successful.

Practical Exercises

If you intend to teach Jython/Python, planning for practical exercises is almost as important as planning the teaching, explaining, and lecturing that you will do. Exercises are especially problematic in classes I teach because these classes are relatively short. How do you teach programming in Python and Jython's connections with Java and include practical exercises that are reasonably realistic all in a three or four day training schedule?

Homework? That might help stretch the available time. But, I don't think assigning work for after-hours is realistic.