Differences between PyPy and CPython¶
This page documents the few differences and incompatibilities between the PyPy Python interpreter and CPython. Some of these differences are “by design”, since we think that there are cases in which the behaviour of CPython is buggy, and we do not want to copy bugs.
Differences that are not listed here should be considered bugs of PyPy.
Extension modules¶
List of extension modules that we support:
Supported as built-in modules (in pypy/module/):
__builtin__ __pypy__ _ast _codecs _collections _continuation _ffi _hashlib _io _locale _lsprof _md5 _minimal_curses _multiprocessing _random _rawffi _sha _socket _sre _ssl _warnings _weakref _winreg array binascii bz2 cStringIO cmath cpyext crypt errno exceptions fcntl gc imp itertools marshal math mmap operator parser posix pyexpat select signal struct symbol sys termios thread time token unicodedata zipimport zlib
When translated on Windows, a few Unix-only modules are skipped, and the following module is built instead:
_winreg
Supported by being rewritten in pure Python (possibly using cffi): see the lib_pypy/ directory. Examples of modules that we support this way: ctypes, cPickle, cmath, dbm, datetime... Note that some modules are both in there and in the list above; by default, the built-in module is used (but can be disabled at translation time).
The extension modules (i.e. modules written in C, in the standard CPython) that are neither mentioned above nor in lib_pypy/ are not available in PyPy. (You may have a chance to use them anyway with cpyext.)
Subclasses of built-in types¶
Officially, CPython has no rule at all for when exactly overridden method of subclasses of built-in types get implicitly called or not. As an approximation, these methods are never called by other built-in methods of the same object. For example, an overridden __getitem__() in a subclass of dict will not be called by e.g. the built-in get() method.
The above is true both in CPython and in PyPy. Differences can occur about whether a built-in function or method will call an overridden method of another object than self. In PyPy, they are generally always called, whereas not in CPython. For example, in PyPy, dict1.update(dict2) considers that dict2 is just a general mapping object, and will thus call overridden keys() and __getitem__() methods on it. So the following code prints 42 on PyPy but foo on CPython:
>>>> class D(dict):
.... def __getitem__(self, key):
.... return 42
....
>>>>
>>>> d1 = {}
>>>> d2 = D(a='foo')
>>>> d1.update(d2)
>>>> print d1['a']
42
Mutating classes of objects which are already used as dictionary keys¶
Consider the following snippet of code:
class X(object):
pass
def __evil_eq__(self, other):
print 'hello world'
return False
def evil(y):
d = {x(): 1}
X.__eq__ = __evil_eq__
d[y] # might trigger a call to __eq__?
In CPython, __evil_eq__ might be called, although there is no way to write a test which reliably calls it. It happens if y is not x and hash(y) == hash(x), where hash(x) is computed when x is inserted into the dictionary. If by chance the condition is satisfied, then __evil_eq__ is called.
PyPy uses a special strategy to optimize dictionaries whose keys are instances of user-defined classes which do not override the default __hash__, __eq__ and __cmp__: when using this strategy, __eq__ and __cmp__ are never called, but instead the lookup is done by identity, so in the case above it is guaranteed that __eq__ won’t be called.
Note that in all other cases (e.g., if you have a custom __hash__ and __eq__ in y) the behavior is exactly the same as CPython.
Ignored exceptions¶
In many corner cases, CPython can silently swallow exceptions. The precise list of when this occurs is rather long, even though most cases are very uncommon. The most well-known places are custom rich comparison methods (like __eq__); dictionary lookup; calls to some built-in functions like isinstance().
Unless this behavior is clearly present by design and documented as such (as e.g. for hasattr()), in most cases PyPy lets the exception propagate instead.
Object Identity of Primitive Values, is and id¶
Object identity of primitive values works by value equality, not by identity of the wrapper. This means that x + 1 is x + 1 is always true, for arbitrary integers x. The rule applies for the following types:
- int
- float
- long
- complex
This change requires some changes to id as well. id fulfills the following condition: x is y <=> id(x) == id(y). Therefore id of the above types will return a value that is computed from the argument, and can thus be larger than sys.maxint (i.e. it can be an arbitrary long).
Miscellaneous¶
- Hash randomization (-R) is ignored in PyPy. As documented in http://bugs.python.org/issue14621, some of us believe it has no purpose in CPython either.
- sys.setrecursionlimit(n) sets the limit only approximately, by setting the usable stack space to n * 768 bytes. On Linux, depending on the compiler settings, the default of 768KB is enough for about 1400 calls.
- since the implementation of dictionary is different, the exact number which __hash__ and __eq__ are called is different. Since CPython does not give any specific guarantees either, don’t rely on it.
- assignment to __class__ is limited to the cases where it works on CPython 2.5. On CPython 2.6 and 2.7 it works in a bit more cases, which are not supported by PyPy so far. (If needed, it could be supported, but then it will likely work in many more case on PyPy than on CPython 2.6/2.7.)
- the __builtins__ name is always referencing the __builtin__ module, never a dictionary as it sometimes is in CPython. Assigning to __builtins__ has no effect.
- directly calling the internal magic methods of a few built-in types with invalid arguments may have a slightly different result. For example, [].__add__(None) and (2).__add__(None) both return NotImplemented on PyPy; on CPython, only the latter does, and the former raises TypeError. (Of course, []+None and 2+None both raise TypeError everywhere.) This difference is an implementation detail that shows up because of internal C-level slots that PyPy does not have.
- on CPython, [].__add__ is a method-wrapper, and list.__add__ is a slot wrapper. On PyPy these are normal bound or unbound method objects. This can occasionally confuse some tools that inspect built-in types. For example, the standard library inspect module has a function ismethod() that returns True on unbound method objects but False on method-wrappers or slot wrappers. On PyPy we can’t tell the difference, so ismethod([].__add__) == ismethod(list.__add__) == True.
- the __dict__ attribute of new-style classes returns a normal dict, as opposed to a dict proxy like in CPython. Mutating the dict will change the type and vice versa. For builtin types, a dictionary will be returned that cannot be changed (but still looks and behaves like a normal dictionary).
- PyPy prints a random line from past #pypy IRC topics at startup in interactive mode. In a released version, this behaviour is supressed, but setting the environment variable PYPY_IRC_TOPIC will bring it back. Note that downstream package providers have been known to totally disable this feature.