Containers for storing coercion data
This module provides TripleDict and MonoDict. These are structures similar to
WeakKeyDictionary in Python’s weakref module, and are optimized for lookup
speed. The keys for TripleDict consist of triples (k1,k2,k3) and are looked up
by identity rather than equality. The keys are stored by weakrefs if possible.
If any one of the components k1, k2, k3 gets garbage collected, then the entry
is removed from the TripleDict.
Key components that do not allow for weakrefs are stored via a normal
refcounted reference. That means that any entry stored using a triple
(k1,k2,k3) such that none of k1, k2, k3 allows a weak reference behaves
like an entry in a normal dictionary: its existence in TripleDict prevents it
from being garbage collected.
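The fallback from weak to strong references can be sketched in plain Python as follows. This is only an illustration, not the Cython code used in Sage; the helper name `make_ref` and the class `Parent` are hypothetical.

```python
import weakref


def make_ref(obj, callback=None):
    """Return a weak reference to obj if its type allows one, else obj itself."""
    try:
        return weakref.ref(obj, callback)
    except TypeError:
        # Types such as int or str do not support weak references, so the
        # container ends up holding a normal (strong) reference instead.
        return obj


class Parent:
    pass


p = Parent()
r_weak = make_ref(p)      # a weakref.ref pointing at p
r_strong = make_ref(42)   # the integer itself
```

A container that stores `r_strong` keeps the integer alive, whereas storing `r_weak` leaves `p` free to be garbage collected.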
This container is currently used to store coercion and conversion maps between
two parents (trac ticket #715) and to store homsets of pairs of objects of a
category (trac ticket #11521). In both cases, it is essential that the parent
structures remain garbage collectable and that data access is faster than with
a usual WeakKeyDictionary. Moreover, we enforce the “unique parent condition”
in Sage (parent structures should be identical if they are equal), so looking
keys up by identity is sufficient.
MonoDict behaves similarly, but it takes a single item as a key. It is used
for caching the parents which allow a coercion map into a fixed other parent
(trac ticket #12313).
By trac ticket #14159, MonoDict and TripleDict can optionally be used with
weak references on the values.
class sage.structure.coerce_dict.MonoDict
Bases: object
This is a hashtable specifically designed for (read) speed in the coercion model.
It differs from a Python WeakKeyDictionary in the following important ways:
- Comparison is done using the ‘is’ rather than ‘==’ operator.
- Only weak references to the keys are stored if at all possible. Keys that do not allow for weak references are stored with a normal refcounted reference.
- The callback of the weak references is safe against recursion, see below.
There are special cdef set/get methods for faster access. It is bare-bones in the sense that not all dictionary methods are implemented.
IMPLEMENTATION:
It is implemented as a hash table with open addressing, similar to Python’s dict.
For each key ki, the table stores a reference ri. If ki supports weak references, then ri is a weak reference to ki with a callback that removes the entry from the dictionary when ki gets garbage collected. If ki does not support weak references, then ri is identical to ki. In the latter case, the presence of the key in the dictionary prevents it from being garbage collected.
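The behaviour described above (identity lookup, weak keys with an erasing callback, strong fallback) can be imitated in pure Python. The class `IdentityWeakDict` below is a hypothetical toy that uses an ordinary dict keyed by `id(key)` instead of an open-addressing table; it is a sketch of the idea, not the actual implementation.

```python
import gc
import weakref


class IdentityWeakDict:
    """Toy mapping (hypothetical): identity lookup, weak keys when possible."""

    def __init__(self):
        # id(key) -> (reference, is_weak, value)
        self._table = {}

    def __setitem__(self, key, value):
        key_id = id(key)
        table = self._table

        def erase(_dead_ref):
            # Weakref callback: drop the entry once the key is collected.
            table.pop(key_id, None)

        try:
            entry = (weakref.ref(key, erase), True, value)
        except TypeError:
            # No weak reference possible: keep the key alive instead.
            entry = (key, False, value)
        table[key_id] = entry

    def __getitem__(self, key):
        entry = self._table.get(id(key))
        if entry is not None:
            ref, is_weak, value = entry
            if (ref() if is_weak else ref) is key:
                return value
        raise KeyError(key)

    def __len__(self):
        return len(self._table)


class Parent:
    pass


d = IdentityWeakDict()
p = Parent()
d[p] = "coercion data"

s1 = "".join(["-", "15"])  # two equal strings built at run time,
s2 = "".join(["-", "15"])  # hence two distinct objects
d[s1] = 3                  # str has no weakref support: strong reference

try:
    d[s2]
    equal_key_found = True
except KeyError:
    equal_key_found = False  # equality is not enough, identity is required

size_before = len(d)
del p
gc.collect()
size_after = len(d)          # the entry for p has been erased
```

The entry keyed by `s1` survives because it is held by a strong reference, while the entry keyed by `p` disappears together with `p`.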
INPUT:
- size – unused parameter, present for backward compatibility.
- data – optional iterable defining initial data.
- threshold – unused parameter, present for backward compatibility.
- weak_values – optional bool (default False). If it is true, weak references to the values in this dictionary will be used, when possible.
EXAMPLES:
    sage: from sage.structure.coerce_dict import MonoDict
    sage: L = MonoDict()
    sage: a = 'a'; b = 'ab'; c = '-15'
    sage: L[a] = 1
    sage: L[b] = 2
    sage: L[c] = 3
The key is expected to be a unique object. Hence, the item stored for c cannot be obtained by providing another equal string:
    sage: L[a]
    1
    sage: L[b]
    2
    sage: L[c]
    3
    sage: L['-15']
    Traceback (most recent call last):
    ...
    KeyError: '-15'
Not all features of Python dictionaries are available, but iteration over the dictionary items is possible:
    sage: # for some reason the following failed in "make ptest"
    sage: # on some installations, see #12313 for details
    sage: sorted(L.iteritems()) # random layout
    [('-15', 3), ('a', 1), ('ab', 2)]
    sage: # the following seems to be more consistent
    sage: set(L.iteritems())
    {('-15', 3), ('a', 1), ('ab', 2)}
    sage: del L[c]
    sage: sorted(L.iteritems())
    [('a', 1), ('ab', 2)]
    sage: len(L)
    2
    sage: for i in range(1000):
    ....:     L[i] = i
    sage: len(L)
    1002
    sage: L['a']
    1
    sage: L['c']
    Traceback (most recent call last):
    ...
    KeyError: 'c'
Note that this kind of dictionary is also used for caching actions and coerce maps. In previous versions of Sage, the cache used strong references, which resulted in a memory leak in the following example. This leak was fixed by trac ticket #715, using weak references:
    sage: K = GF(1<<55,'t')
    sage: for i in range(50):
    ....:     a = K.random_element()
    ....:     E = EllipticCurve(j=a)
    ....:     P = E.random_point()
    ....:     Q = 2*P
    sage: import gc
    sage: n = gc.collect()
    sage: from sage.schemes.elliptic_curves.ell_finite_field import EllipticCurve_finite_field
    sage: LE = [x for x in gc.get_objects() if isinstance(x, EllipticCurve_finite_field)]
    sage: len(LE) # indirect doctest
    1
Here, we demonstrate the use of weak values:
    sage: M = MonoDict(13)
    sage: MW = MonoDict(13, weak_values=True)
    sage: class Foo: pass
    sage: a = Foo()
    sage: b = Foo()
    sage: k = 1
    sage: M[k] = a
    sage: MW[k] = b
    sage: M[k] is a
    True
    sage: MW[k] is b
    True
    sage: k in M
    True
    sage: k in MW
    True
While M uses a strong reference to a, MW uses a weak reference to b, and after deleting b, the corresponding item of MW will be removed during the next garbage collection:
    sage: import gc
    sage: del a,b
    sage: _ = gc.collect()
    sage: k in M
    True
    sage: k in MW
    False
    sage: len(MW)
    0
    sage: len(M)
    1
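For comparison, the standard library shows a similar contrast between an ordinary dict and `weakref.WeakValueDictionary`. The sketch below is an analogy only: unlike MonoDict, these containers compare keys with `==`, and the names `strong` and `weak` are made up for the illustration.

```python
import gc
import weakref


class Foo:
    pass


a = Foo()
b = Foo()
k = 1

strong = {k: a}                               # plays the role of M
weak = weakref.WeakValueDictionary({k: b})    # plays the role of MW

present_before = (k in strong, k in weak)
del a, b
gc.collect()
present_after = (k in strong, k in weak)      # only the strong entry is left
```

The strong dictionary keeps its value alive, while the weak-value dictionary loses the entry as soon as the last outside reference to the value is gone.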
Note that MW also accepts values that do not allow for weak references:
    sage: MW[k] = int(5)
    sage: MW[k]
    5
The following demonstrates that MonoDict is safer than WeakKeyDictionary against recursions created by nested callbacks; compare trac ticket #15069 (the mechanism used now is different, though):
    sage: M = MonoDict(11)
    sage: class A: pass
    sage: a = A()
    sage: prev = a
    sage: for i in range(1000):
    ....:     newA = A()
    ....:     M[prev] = newA
    ....:     prev = newA
    sage: len(M)
    1000
    sage: del a
    sage: len(M)
    0
The corresponding example with a Python weakref.WeakKeyDictionary would result in a too deep recursion during deletion of the dictionary items:
    sage: import weakref
    sage: M = weakref.WeakKeyDictionary()
    sage: a = A()
    sage: prev = a
    sage: for i in range(1000):
    ....:     newA = A()
    ....:     M[prev] = newA
    ....:     prev = newA
    sage: len(M)
    1000
    sage: del a
    Exception RuntimeError: 'maximum recursion depth exceeded while calling a Python object' in <function remove at ...> ignored
    sage: len(M)>0
    True
Check that also in the presence of circular references, MonoDict gets properly collected:
    sage: import gc
    sage: def count_type(T):
    ....:     return len([c for c in gc.get_objects() if isinstance(c,T)])
    sage: _=gc.collect()
    sage: N=count_type(MonoDict)
    sage: for i in range(100):
    ....:     V = [ MonoDict(11,{"id":j+100*i}) for j in range(100)]
    ....:     n= len(V)
    ....:     for i in range(n): V[i][V[(i+1)%n]]=(i+1)%n
    ....:     del V
    ....:     _=gc.collect()
    ....:     assert count_type(MonoDict) == N
    sage: count_type(MonoDict) == N
    True
AUTHORS:
- Simon King (2012-01)
- Nils Bruin (2012-08)
- Simon King (2013-02)
- Nils Bruin (2013-11)
iteritems()
EXAMPLES:
    sage: from sage.structure.coerce_dict import MonoDict
    sage: L = MonoDict(31)
    sage: L[1] = None
    sage: L[2] = True
    sage: list(sorted(L.iteritems()))
    [(1, None), (2, True)]
class sage.structure.coerce_dict.MonoDictEraser
Bases: object
Erase items from a MonoDict when a weak reference becomes invalid.
This is for internal use only. Instances of this class will be passed as a callback function when creating a weak reference.
EXAMPLES:
    sage: from sage.structure.coerce_dict import MonoDict
    sage: class A: pass
    sage: a = A()
    sage: M = MonoDict()
    sage: M[a] = 1
    sage: len(M)
    1
    sage: del a
    sage: import gc
    sage: n = gc.collect()
    sage: len(M) # indirect doctest
    0
AUTHORS:
- Simon King (2012-01)
- Nils Bruin (2013-11)
class sage.structure.coerce_dict.TripleDict
Bases: object
This is a hashtable specifically designed for (read) speed in the coercion model.
It differs from a Python dict in the following important ways:
- All keys must be sequences of exactly three elements. All sequence types (tuple, list, etc.) map to the same item.
- Comparison is done using the ‘is’ rather than ‘==’ operator.
There are special cdef set/get methods for faster access. It is bare-bones in the sense that not all dictionary methods are implemented.
It is implemented as a list of lists (hereafter called buckets). The bucket is chosen according to a very simple hash based on the object pointer, and each bucket is of the form [id(k1), id(k2), id(k3), r1, r2, r3, value, id(k1), id(k2), id(k3), r1, r2, r3, value, ...], on which a linear search is performed. If a key component ki supports weak references then ri is a weak reference to ki; otherwise ri is identical to ki.
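The flat bucket layout and the linear search can be sketched in pure Python. This is a simplified illustration under stated assumptions: the names `find`, `triple_set` and `triple_get` are hypothetical, the bucket is chosen by a placeholder hash (the sum of the three ids), and r1, r2, r3 are stored as plain strong references rather than weak ones.

```python
N_BUCKETS = 31   # hypothetical table size (a prime, as recommended)
ENTRY_LEN = 7    # id(k1), id(k2), id(k3), r1, r2, r3, value


def find(bucket, ids):
    """Linear search: return the offset of the entry with these ids, or -1."""
    for i in range(0, len(bucket), ENTRY_LEN):
        if bucket[i:i + 3] == ids:
            return i
    return -1


def triple_set(buckets, k1, k2, k3, value):
    ids = [id(k1), id(k2), id(k3)]
    bucket = buckets[sum(ids) % N_BUCKETS]   # placeholder hash
    i = find(bucket, ids)
    if i < 0:
        # Simplification: r1, r2, r3 are the keys themselves (strong refs).
        bucket.extend(ids + [k1, k2, k3, value])
    else:
        bucket[i + 6] = value


def triple_get(buckets, k1, k2, k3):
    ids = [id(k1), id(k2), id(k3)]
    bucket = buckets[sum(ids) % N_BUCKETS]
    i = find(bucket, ids)
    if i < 0:
        raise KeyError((k1, k2, k3))
    return bucket[i + 6]


buckets = [[] for _ in range(N_BUCKETS)]
a, b, c = object(), object(), object()
triple_set(buckets, a, b, c, 1)
triple_set(buckets, c, b, a, -1)   # same bucket here, different entry
```

With the placeholder hash, the two permuted keys land in the same bucket, so the example exercises the linear search over two consecutive seven-slot entries.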
INPUT:
- size – an integer, the initial number of buckets. To spread objects evenly, the size should ideally be a prime, and certainly not divisible by 2.
- data – optional iterable defining initial data.
- threshold – optional number, default \(0.7\). It determines how frequently the dictionary will be resized (large threshold implies rare resizing).
- weak_values – optional bool (default False). If it is true, weak references to the values in this dictionary will be used, when possible.
If any of the key components k1,k2,k3 (this can happen for a key component that supports weak references) gets garbage collected then the entire entry disappears. In that sense this structure behaves like a nested WeakKeyDictionary.
EXAMPLES:
    sage: from sage.structure.coerce_dict import TripleDict
    sage: L = TripleDict()
    sage: a = 'a'; b = 'b'; c = 'c'
    sage: L[a,b,c] = 1
    sage: L[a,b,c]
    1
    sage: L[c,b,a] = -1
    sage: list(L.iteritems()) # random order of output.
    [(('c', 'b', 'a'), -1), (('a', 'b', 'c'), 1)]
    sage: del L[a,b,c]
    sage: list(L.iteritems())
    [(('c', 'b', 'a'), -1)]
    sage: len(L)
    1
    sage: for i in range(1000):
    ....:     L[i,i,i] = i
    sage: len(L)
    1001
    sage: L = TripleDict(L)
    sage: L[c,b,a]
    -1
    sage: L[a,b,c]
    Traceback (most recent call last):
    ...
    KeyError: ('a', 'b', 'c')
    sage: L[a]
    Traceback (most recent call last):
    ...
    KeyError: 'a'
    sage: L[a] = 1
    Traceback (most recent call last):
    ...
    KeyError: 'a'
Note that this kind of dictionary is also used for caching actions and coerce maps. In previous versions of Sage, the cache used strong references, which resulted in a memory leak in the following example. This leak was fixed by trac ticket #715, using weak references:
    sage: K = GF(1<<55,'t')
    sage: for i in range(50):
    ....:     a = K.random_element()
    ....:     E = EllipticCurve(j=a)
    ....:     P = E.random_point()
    ....:     Q = 2*P
    sage: import gc
    sage: n = gc.collect()
    sage: from sage.schemes.elliptic_curves.ell_finite_field import EllipticCurve_finite_field
    sage: LE = [x for x in gc.get_objects() if isinstance(x, EllipticCurve_finite_field)]
    sage: len(LE) # indirect doctest
    1
Here, we demonstrate the use of weak values:
    sage: class Foo: pass
    sage: T = TripleDict(13)
    sage: TW = TripleDict(13, weak_values=True)
    sage: a = Foo()
    sage: b = Foo()
    sage: k = 1
    sage: T[a,k,k]=1
    sage: T[k,a,k]=2
    sage: T[k,k,a]=3
    sage: T[k,k,k]=a
    sage: TW[b,k,k]=1
    sage: TW[k,b,k]=2
    sage: TW[k,k,b]=3
    sage: TW[k,k,k]=b
    sage: len(T)
    4
    sage: len(TW)
    4
    sage: (k,k,k) in T
    True
    sage: (k,k,k) in TW
    True
    sage: T[k,k,k] is a
    True
    sage: TW[k,k,k] is b
    True
Now, T holds a strong reference to a, namely in T[k,k,k]. Hence, when we delete a, all items of T survive:
    sage: del a
    sage: _ = gc.collect()
    sage: len(T)
    4
Only when we remove the strong reference do the items become collectable:
    sage: del T[k,k,k]
    sage: _ = gc.collect()
    sage: len(T)
    0
The situation is different for TW, since it only holds weak references to b. Therefore, all items become collectable after deleting b:
    sage: del b
    sage: _ = gc.collect()
    sage: len(TW)
    0
Note
The index \(h\) corresponding to the key [k1, k2, k3] is computed as a value of unsigned type size_t as follows:
\[h = \mathrm{id}(k1) + 13 \cdot \mathrm{id}(k2) \;\mathrm{xor}\; 503 \cdot \mathrm{id}(k3)\]
The natural type for this quantity is Py_ssize_t, which is a signed quantity with the same length as size_t. Storing it in a signed way gives the most efficient storage into PyInt, while preserving sign information.
Previously, there were some problems with ending up with negative indices, which required casting to an unsigned type, i.e., (<size_t> h) % N, since C has a sign-preserving % operation. This caused problems on 32-bit systems; see trac ticket #715 for details. This is irrelevant for the current implementation.
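The arithmetic in this note can be tried out in Python. The sketch below rests on two assumptions that the note does not state: that size_t is 64 bits wide, and that the formula groups as (id(k1) + 13*id(k2)) xor (503*id(k3)), which is how both C and Python parse it. The function names are hypothetical.

```python
SIZE_T_MASK = (1 << 64) - 1   # assumption: size_t is 64 bits wide


def triple_hash(k1, k2, k3):
    """Hash of a key triple, wrapped around like an unsigned size_t.

    Assumption: '+' binds tighter than 'xor', as it does in C and Python.
    """
    return ((id(k1) + 13 * id(k2)) ^ (503 * id(k3))) & SIZE_T_MASK


def c_remainder(h, n):
    """Remainder with the sign of h, as produced by C's % operator (n > 0)."""
    r = abs(h) % n
    return -r if h < 0 else r


h_signed = -5
bad_index = c_remainder(h_signed, 3)          # negative: unusable as an index
good_index = (h_signed & SIZE_T_MASK) % 3     # mimics the cast (<size_t> h) % N

index = triple_hash("k1", "k2", "k3") % 31    # always in range(31)
```

Masking with `SIZE_T_MASK` plays the role of the cast to size_t: the signed value is reinterpreted as a non-negative one before the remainder is taken, so the resulting index is never negative.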
AUTHORS:
- Robert Bradshaw, 2007-08
- Simon King, 2012-01
- Nils Bruin, 2012-08
- Simon King, 2013-02
- Nils Bruin, 2013-11
iteritems()
EXAMPLES:
    sage: from sage.structure.coerce_dict import TripleDict
    sage: L = TripleDict(31)
    sage: L[1,2,3] = None
    sage: list(L.iteritems())
    [((1, 2, 3), None)]
class sage.structure.coerce_dict.TripleDictEraser
Bases: object
Erases items from a TripleDict when a weak reference becomes invalid.
This is for internal use only. Instances of this class will be passed as a callback function when creating a weak reference.
EXAMPLES:
    sage: from sage.structure.coerce_dict import TripleDict
    sage: class A: pass
    sage: a = A()
    sage: T = TripleDict()
    sage: T[a,ZZ,None] = 1
    sage: T[ZZ,a,1] = 2
    sage: T[a,a,ZZ] = 3
    sage: len(T)
    3
    sage: del a
    sage: import gc
    sage: n = gc.collect()
    sage: len(T) # indirect doctest
    0
AUTHORS:
- Simon King (2012-01)
- Nils Bruin (2013-11)