``sets`` --- Unordered collections of unique elements ***************************************************** New in version 2.3. Deprecated since version 2.6: The built-in ``set``/``frozenset`` types replace this module. The ``sets`` module provides classes for constructing and manipulating unordered collections of unique elements. Common uses include membership testing, removing duplicates from a sequence, and computing standard math operations on sets such as intersection, union, difference, and symmetric difference. Like other collections, sets support ``x in set``, ``len(set)``, and ``for x in set``. Being an unordered collection, sets do not record element position or order of insertion. Accordingly, sets do not support indexing, slicing, or other sequence-like behavior. Most set applications use the ``Set`` class which provides every set method except for ``__hash__()``. For advanced applications requiring a hash method, the ``ImmutableSet`` class adds a ``__hash__()`` method but omits methods which alter the contents of the set. Both ``Set`` and ``ImmutableSet`` derive from ``BaseSet``, an abstract class useful for determining whether something is a set: ``isinstance(obj, BaseSet)``. The set classes are implemented using dictionaries. Accordingly, the requirements for set elements are the same as those for dictionary keys; namely, that the element defines both ``__eq__()`` and ``__hash__()``. As a result, sets cannot contain mutable elements such as lists or dictionaries. However, they can contain immutable collections such as tuples or instances of ``ImmutableSet``. For convenience in implementing sets of sets, inner sets are automatically converted to immutable form, for example, ``Set([Set(['dog'])])`` is transformed to ``Set([ImmutableSet(['dog'])])``. class class sets.Set([iterable]) Constructs a new empty ``Set`` object. If the optional *iterable* parameter is supplied, updates the set with elements obtained from iteration. All of the elements in *iterable* should be immutable or be transformable to an immutable using the protocol described in section *Protocol for automatic conversion to immutable*. class class sets.ImmutableSet([iterable]) Constructs a new empty ``ImmutableSet`` object. If the optional *iterable* parameter is supplied, updates the set with elements obtained from iteration. All of the elements in *iterable* should be immutable or be transformable to an immutable using the protocol described in section *Protocol for automatic conversion to immutable*. Because ``ImmutableSet`` objects provide a ``__hash__()`` method, they can be used as set elements or as dictionary keys. ``ImmutableSet`` objects do not have methods for adding or removing elements, so all of the elements must be known when the constructor is called. Set Objects =========== Instances of ``Set`` and ``ImmutableSet`` both provide the following operations: +---------------------------------+--------------+-----------------------------------+ | Operation | Equivalent | Result | +=================================+==============+===================================+ | ``len(s)`` | | cardinality of set *s* | +---------------------------------+--------------+-----------------------------------+ | ``x in s`` | | test *x* for membership in *s* | +---------------------------------+--------------+-----------------------------------+ | ``x not in s`` | | test *x* for non-membership in | | | | *s* | +---------------------------------+--------------+-----------------------------------+ | ``s.issubset(t)`` | ``s <= t`` | test whether every element in *s* | | | | is in *t* | +---------------------------------+--------------+-----------------------------------+ | ``s.issuperset(t)`` | ``s >= t`` | test whether every element in *t* | | | | is in *s* | +---------------------------------+--------------+-----------------------------------+ | ``s.union(t)`` | ``s | t`` | new set with elements from both | | | | *s* and *t* | +---------------------------------+--------------+-----------------------------------+ | ``s.intersection(t)`` | ``s & t`` | new set with elements common to | | | | *s* and *t* | +---------------------------------+--------------+-----------------------------------+ | ``s.difference(t)`` | ``s - t`` | new set with elements in *s* but | | | | not in *t* | +---------------------------------+--------------+-----------------------------------+ | ``s.symmetric_difference(t)`` | ``s ^ t`` | new set with elements in either | | | | *s* or *t* but not both | +---------------------------------+--------------+-----------------------------------+ | ``s.copy()`` | | new set with a shallow copy of | | | | *s* | +---------------------------------+--------------+-----------------------------------+ Note, the non-operator versions of ``union()``, ``intersection()``, ``difference()``, and ``symmetric_difference()`` will accept any iterable as an argument. In contrast, their operator based counterparts require their arguments to be sets. This precludes error-prone constructions like ``Set('abc') & 'cbs'`` in favor of the more readable ``Set('abc').intersection('cbs')``. Changed in version 2.3.1: Formerly all arguments were required to be sets. In addition, both ``Set`` and ``ImmutableSet`` support set to set comparisons. Two sets are equal if and only if every element of each set is contained in the other (each is a subset of the other). A set is less than another set if and only if the first set is a proper subset of the second set (is a subset, but is not equal). A set is greater than another set if and only if the first set is a proper superset of the second set (is a superset, but is not equal). The subset and equality comparisons do not generalize to a complete ordering function. For example, any two disjoint sets are not equal and are not subsets of each other, so *all* of the following return ``False``: ``ab``. Accordingly, sets do not implement the ``__cmp__()`` method. Since sets only define partial ordering (subset relationships), the output of the ``list.sort()`` method is undefined for lists of sets. The following table lists operations available in ``ImmutableSet`` but not found in ``Set``: +---------------+--------------------------------+ | Operation | Result | +===============+================================+ | ``hash(s)`` | returns a hash value for *s* | +---------------+--------------------------------+ The following table lists operations available in ``Set`` but not found in ``ImmutableSet``: +----------------------------------------+---------------+-----------------------------------+ | Operation | Equivalent | Result | +========================================+===============+===================================+ | ``s.update(t)`` | *s* |= *t* | return set *s* with elements | | | | added from *t* | +----------------------------------------+---------------+-----------------------------------+ | ``s.intersection_update(t)`` | *s* &= *t* | return set *s* keeping only | | | | elements also found in *t* | +----------------------------------------+---------------+-----------------------------------+ | ``s.difference_update(t)`` | *s* -= *t* | return set *s* after removing | | | | elements found in *t* | +----------------------------------------+---------------+-----------------------------------+ | ``s.symmetric_difference_update(t)`` | *s* ^= *t* | return set *s* with elements from | | | | *s* or *t* but not both | +----------------------------------------+---------------+-----------------------------------+ | ``s.add(x)`` | | add element *x* to set *s* | +----------------------------------------+---------------+-----------------------------------+ | ``s.remove(x)`` | | remove *x* from set *s*; raises | | | | ``KeyError`` if not present | +----------------------------------------+---------------+-----------------------------------+ | ``s.discard(x)`` | | removes *x* from set *s* if | | | | present | +----------------------------------------+---------------+-----------------------------------+ | ``s.pop()`` | | remove and return an arbitrary | | | | element from *s*; raises | | | | ``KeyError`` if empty | +----------------------------------------+---------------+-----------------------------------+ | ``s.clear()`` | | remove all elements from set *s* | +----------------------------------------+---------------+-----------------------------------+ Note, the non-operator versions of ``update()``, ``intersection_update()``, ``difference_update()``, and ``symmetric_difference_update()`` will accept any iterable as an argument. Changed in version 2.3.1: Formerly all arguments were required to be sets. Also note, the module also includes a ``union_update()`` method which is an alias for ``update()``. The method is included for backwards compatibility. Programmers should prefer the ``update()`` method because it is supported by the builtin ``set()`` and ``frozenset()`` types. Example ======= >>> from sets import Set >>> engineers = Set(['John', 'Jane', 'Jack', 'Janice']) >>> programmers = Set(['Jack', 'Sam', 'Susan', 'Janice']) >>> managers = Set(['Jane', 'Jack', 'Susan', 'Zack']) >>> employees = engineers | programmers | managers # union >>> engineering_management = engineers & managers # intersection >>> fulltime_management = managers - engineers - programmers # difference >>> engineers.add('Marvin') # add element >>> print engineers # doctest: +SKIP Set(['Jane', 'Marvin', 'Janice', 'John', 'Jack']) >>> employees.issuperset(engineers) # superset test False >>> employees.update(engineers) # update from another set >>> employees.issuperset(engineers) True >>> for group in [engineers, programmers, managers, employees]: # doctest: +SKIP ... group.discard('Susan') # unconditionally remove element ... print group ... Set(['Jane', 'Marvin', 'Janice', 'John', 'Jack']) Set(['Janice', 'Jack', 'Sam']) Set(['Jane', 'Zack', 'Jack']) Set(['Jack', 'Sam', 'Jane', 'Marvin', 'Janice', 'John', 'Zack']) Protocol for automatic conversion to immutable ============================================== Sets can only contain immutable elements. For convenience, mutable ``Set`` objects are automatically copied to an ``ImmutableSet`` before being added as a set element. The mechanism is to always add a *hashable* element, or if it is not hashable, the element is checked to see if it has an ``__as_immutable__()`` method which returns an immutable equivalent. Since ``Set`` objects have a ``__as_immutable__()`` method returning an instance of ``ImmutableSet``, it is possible to construct sets of sets. A similar mechanism is needed by the ``__contains__()`` and ``remove()`` methods which need to hash an element to check for membership in a set. Those methods check an element for hashability and, if not, check for a ``__as_temporarily_immutable__()`` method which returns the element wrapped by a class that provides temporary methods for ``__hash__()``, ``__eq__()``, and ``__ne__()``. The alternate mechanism spares the need to build a separate copy of the original mutable object. ``Set`` objects implement the ``__as_temporarily_immutable__()`` method which returns the ``Set`` object wrapped by a new class ``_TemporarilyImmutableSet``. The two mechanisms for adding hashability are normally invisible to the user; however, a conflict can arise in a multi-threaded environment where one thread is updating a set while another has temporarily wrapped it in ``_TemporarilyImmutableSet``. In other words, sets of mutable sets are not thread-safe. Comparison to the built-in ``set`` types ======================================== The built-in ``set`` and ``frozenset`` types were designed based on lessons learned from the ``sets`` module. The key differences are: * ``Set`` and ``ImmutableSet`` were renamed to ``set`` and ``frozenset``. * There is no equivalent to ``BaseSet``. Instead, use ``isinstance(x, (set, frozenset))``. * The hash algorithm for the built-ins performs significantly better (fewer collisions) for most datasets. * The built-in versions have more space efficient pickles. * The built-in versions do not have a ``union_update()`` method. Instead, use the ``update()`` method which is equivalent. * The built-in versions do not have a ``_repr(sorted=True)`` method. Instead, use the built-in ``repr()`` and ``sorted()`` functions: ``repr(sorted(s))``. * The built-in version does not have a protocol for automatic conversion to immutable. Many found this feature to be confusing and no one in the community reported having found real uses for it.