Differences between PyPy and CPython
This page documents the few differences and incompatibilities between the PyPy Python interpreter and CPython. Some of these differences are “by design”, since we think that there are cases in which the behaviour of CPython is buggy, and we do not want to copy bugs.
Differences that are not listed here should be considered bugs of PyPy.
Extension modules
List of extension modules that we support:
Supported as built-in modules (in pypy/module/):
__builtin__ __pypy__ _ast _codecs _collections _continuation _ffi _hashlib _io _locale _lsprof _md5 _minimal_curses _multiprocessing _random _rawffi _sha _socket _sre _ssl _warnings _weakref _winreg array binascii bz2 cStringIO cmath cpyext crypt errno exceptions fcntl gc imp itertools marshal math mmap operator parser posix pyexpat select signal struct symbol sys termios thread time token unicodedata zipimport zlib
When translated on Windows, a few Unix-only modules are skipped, and the following module is built instead:
_winreg
Supported by being rewritten in pure Python (possibly using cffi): see the lib_pypy/ directory. Examples of modules that we support this way: ctypes, cPickle, cmath, dbm, datetime... Note that some modules are both in there and in the list above; by default, the built-in module is used (but can be disabled at translation time).
The extension modules (i.e. modules written in C, in the standard CPython) that are neither mentioned above nor in lib_pypy/ are not available in PyPy. (You may have a chance to use them anyway with cpyext.)
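As a rough way to see which of these categories a module falls into on a given interpreter, a sketch along the following lines may help. It relies only on the standard `sys.builtin_module_names` and `platform.python_implementation()`; the helper name `describe_module` is hypothetical and not part of PyPy.

```python
import platform
import sys


def describe_module(name):
    """Classify a module as 'built-in', 'importable' or 'missing'.

    'built-in' means the module is compiled into the interpreter;
    'importable' covers everything else that can be imported, for
    example pure-Python modules such as those in lib_pypy/.
    """
    if name in sys.builtin_module_names:
        return "built-in"
    try:
        __import__(name)
    except ImportError:
        return "missing"
    return "importable"


implementation = platform.python_implementation()  # e.g. "PyPy" or "CPython"
for module_name in ("sys", "zlib", "datetime", "no_such_module_xyz"):
    print("%s on %s: %s" % (module_name, implementation,
                            describe_module(module_name)))
```

The classification of a given module is expected to differ between interpreters, and between builds of the same interpreter, so the output is best read as a diagnostic rather than a guarantee.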
Subclasses of built-in types
Officially, CPython has no rule at all for when exactly overridden methods of subclasses of built-in types get implicitly called or not. As an approximation, these methods are never called by other built-in methods of the same object. For example, an overridden __getitem__() in a subclass of dict will not be called by e.g. the built-in get() method.
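A minimal sketch of this rule, written to run on both Python 2 and Python 3 (the stored value "stored" is only an illustration):

```python
class D(dict):
    def __getitem__(self, key):
        # Overridden lookup: ignores the stored value entirely.
        return "%r from D" % (key,)


d = D(foo="stored")

direct = d["foo"]       # subscripting goes through the overridden __getitem__
via_get = d.get("foo")  # the built-in get() does not call the override

print(direct)   # 'foo' from D
print(via_get)  # stored
```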
The above is true both in CPython and in PyPy. Differences can occur about whether a built-in function or method will call an overridden method of another object than self. In PyPy, they are often called in cases where CPython would not.
Two examples:
    class D(dict):
        def __getitem__(self, key):
            return "%r from D" % (key,)

    class A(object):
        pass

    a = A()
    a.__dict__ = D()
    a.foo = "a's own foo"
    print a.foo
    # CPython => a's own foo
    # PyPy => 'foo' from D

    glob = D(foo="base item")
    loc = {}
    exec "print foo" in glob, loc
    # CPython => base item
    # PyPy => 'foo' from D
Mutating classes of objects which are already used as dictionary keys
Consider the following snippet of code:
    class X(object):
        pass

    def __evil_eq__(self, other):
        print 'hello world'
        return False

    def evil(y):
        d = {X(): 1}
        X.__eq__ = __evil_eq__
        d[y] # might trigger a call to __eq__?
In CPython, __evil_eq__ might be called, although there is no way to write a test which reliably calls it. It happens if y is not x and hash(y) == hash(x), where hash(x) is computed when x is inserted into the dictionary. If by chance the condition is satisfied, then __evil_eq__ is called.
PyPy uses a special strategy to optimize dictionaries whose keys are instances of user-defined classes which do not override the default __hash__, __eq__ and __cmp__: when using this strategy, __eq__ and __cmp__ are never called, but instead the lookup is done by identity, so in the case above it is guaranteed that __eq__ won’t be called.
Note that in all other cases (e.g., if you have a custom __hash__ and __eq__ in y) the behavior is exactly the same as CPython.
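The one case that behaves identically everywhere is looking up the very same object that was inserted. The sketch below records calls to a replacement __eq__ (the names `recording_eq` and `eq_calls` are hypothetical). On PyPy the identity-based strategy described above applies; on CPython the dictionary checks identity before it would ever call __eq__, so no call is recorded there either.

```python
class X(object):
    pass


eq_calls = []


def recording_eq(self, other):
    # Record every comparison instead of printing, so it can be inspected.
    eq_calls.append((self, other))
    return False


key = X()
d = {key: 1}
X.__eq__ = recording_eq  # mutate the class after the key was inserted

value = d[key]           # same object: found without calling __eq__
print("value=%d, __eq__ calls=%d" % (value, len(eq_calls)))
```

Looking up a different object whose hash happens to collide is the case where the two interpreters can diverge, which is why code should not depend on whether __eq__ runs.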
Ignored exceptions
In many corner cases, CPython can silently swallow exceptions. The precise list of when this occurs is rather long, even though most cases are very uncommon. The most well-known places are custom rich comparison methods (like __eq__); dictionary lookup; calls to some built-in functions like isinstance().
Unless this behavior is clearly present by design and documented as such (as e.g. for hasattr()), in most cases PyPy lets the exception propagate instead.
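The hasattr() case is one where the swallowing is by design. A minimal sketch, assuming a hypothetical class `Lazy` whose property raises AttributeError; hasattr() turns that exception into a plain False on both interpreters:

```python
class Lazy(object):
    @property
    def value(self):
        # Simulate an attribute that cannot be computed yet.
        raise AttributeError("value is not ready")


has_value = hasattr(Lazy(), "value")  # the AttributeError is swallowed
print(has_value)  # False
```

Code that needs to see the underlying error can access the attribute directly inside a try/except block instead of going through hasattr().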
Object Identity of Primitive Values, is and id
Object identity of primitive values works by value equality, not by identity of the wrapper. This means that x + 1 is x + 1 is always true, for arbitrary integers x. The rule applies to the following types:
int
float
long
complex
This change requires some changes to id as well. id fulfills the following condition: x is y <=> id(x) == id(y). Therefore id of the above types will return a value that is computed from the argument, and can thus be larger than sys.maxint (i.e. it can be an arbitrary long).
Notably missing from the list above are str and unicode. If your code relies on comparing strings with is, then it might break in PyPy.
Note that for floats there “is” only one object per “bit pattern” of the float. So float('nan') is float('nan') is true on PyPy, but not on CPython because they are two objects; but 0.0 is -0.0 is always False, as the bit patterns are different. As usual, float('nan') == float('nan') is always False. When used in containers (as list items or in sets for example), the exact rule of equality used is “if x is y or x == y” (on both CPython and PyPy); as a consequence, because all nans are identical in PyPy, you cannot have several of them in a set, unlike in CPython. (Issue #1974)
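Code that should behave the same on both interpreters can compare by value and test for NaN explicitly rather than relying on is. The following is a sketch under that assumption; the helper names `identity_matches_id` and `count_nans` are hypothetical.

```python
import math


def identity_matches_id(a, b):
    """Check the invariant  x is y  <=>  id(x) == id(y)  for two live objects."""
    return (a is b) == (id(a) == id(b))


def count_nans(values):
    """Count NaNs by value, independently of how many distinct objects exist."""
    return sum(1 for v in values if isinstance(v, float) and math.isnan(v))


x = 10 ** 6
nan_a, nan_b = float("nan"), float("nan")

# The invariant holds on both interpreters, whatever "is" itself returns.
print(identity_matches_id(x + 1, x + 1))

# Comparing by value does not depend on object identity.
print((x + 1) == (x + 1))

# Counting NaNs by value gives 2 regardless of how the objects are stored.
print(count_nans([nan_a, nan_b, 1.0]))

# The size of this set is implementation-dependent, as described above.
print(len(set([nan_a, nan_b])))
```

Only the last line is expected to differ between PyPy and CPython; the first three are meant as the portable alternatives.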
Miscellaneous
Hash randomization (-R) is ignored in PyPy. As documented in http://bugs.python.org/issue14621, some of us believe it has no purpose in CPython either.
You can’t store non-string keys in type objects. For example, the following won’t work:
    class A(object): locals()[42] = 3
sys.setrecursionlimit(n) sets the limit only approximately, by setting the usable stack space to n * 768 bytes. On Linux, depending on the compiler settings, the default of 768KB is enough for about 1400 calls.
Since the implementation of dictionaries is different, the exact number of times that __hash__ and __eq__ are called is different. Since CPython does not give any specific guarantees either, don’t rely on it.
Assignment to __class__ is limited to the cases where it works on CPython 2.5. On CPython 2.6 and 2.7 it works in a few more cases, which are not supported by PyPy so far. (If needed, it could be supported, but then it will likely work in many more cases on PyPy than on CPython 2.6/2.7.)
The __builtins__ name always references the __builtin__ module, never a dictionary as it sometimes is in CPython. Assigning to __builtins__ has no effect.
Directly calling the internal magic methods of a few built-in types with invalid arguments may have a slightly different result. For example, [].__add__(None) and (2).__add__(None) both return NotImplemented on PyPy; on CPython, only the latter does, and the former raises TypeError. (Of course, [] + None and 2 + None both raise TypeError everywhere.) This difference is an implementation detail that shows up because of internal C-level slots that PyPy does not have.
On CPython, [].__add__ is a method-wrapper, and list.__add__ is a slot wrapper. On PyPy these are normal bound or unbound method objects. This can occasionally confuse some tools that inspect built-in types. For example, the standard library inspect module has a function ismethod() that returns True on unbound method objects but False on method-wrappers or slot wrappers. On PyPy we can’t tell the difference, so ismethod([].__add__) == ismethod(list.__add__) == True.
In pure Python, if you write a class A(object) with a method def f(self): pass and have a subclass B which doesn’t override f(), then B.f(x) still checks that x is an instance of B. In CPython, types written in C use a different rule. If A is written in C, any instance of A will be accepted by B.f(x) (and actually, B.f is A.f in this case). Some code that could work on CPython but not on PyPy includes: datetime.datetime.strftime(datetime.date.today(), ...) (here, datetime.date is the superclass of datetime.datetime). Anyway, the proper fix is arguably to use a regular method call in the first place: datetime.date.today().strftime(...)
The __dict__ attribute of new-style classes returns a normal dict, as opposed to a dict proxy like in CPython. Mutating the dict will change the type and vice versa. For builtin types, a dictionary will be returned that cannot be changed (but still looks and behaves like a normal dictionary).
Some functions and attributes of the gc module behave in a slightly different way: for example, gc.enable and gc.disable are supported, but instead of enabling and disabling the GC, they just enable and disable the execution of finalizers.
PyPy prints a random line from past #pypy IRC topics at startup in interactive mode. In a released version, this behaviour is suppressed, but setting the environment variable PYPY_IRC_TOPIC will bring it back. Note that downstream package providers have been known to totally disable this feature.
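Two of the items above, the direct __add__ calls and the datetime.datetime.strftime() call, have portable spellings that avoid the difference altogether. A minimal sketch, in which the helper name `add_or_none` and the format string are only illustrations:

```python
import datetime


def add_or_none(a, b):
    """Add two values with the + operator, returning None if unsupported.

    Using the operator instead of calling a.__add__(b) directly avoids
    depending on whether the magic method returns NotImplemented or
    raises TypeError.
    """
    try:
        return a + b
    except TypeError:
        return None


# A regular method call works whether the type is implemented in C or
# in pure Python.
formatted = datetime.date.today().strftime("%Y-%m-%d")

print(add_or_none([1], [2]))   # [1, 2]
print(add_or_none([], None))   # None
print(add_or_none(2, None))    # None
print(formatted)
```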