|
1 |
| -\section{\module{sets} --- |
2 |
| - Unordered collections of unique elements} |
3 |
| - |
4 |
| -\declaremodule{standard}{sets} |
5 |
| -\modulesynopsis{Implementation of sets of unique elements.} |
6 |
| -
8000
\moduleauthor{Greg V. Wilson}{gvwilson@nevex.com} |
7 |
| -\moduleauthor{Alex Martelli}{aleax@aleax.it} |
8 |
| -\moduleauthor{Guido van Rossum}{guido@python.org} |
9 |
| -\sectionauthor{Raymond D. Hettinger}{python@rcn.com} |
10 |
| - |
11 |
| -\versionadded{2.3} |
12 |
| - |
13 |
| -The \module{sets} module provides classes for constructing and manipulating |
14 |
| -unordered collections of unique elements. Common uses include membership |
15 |
| -testing, removing duplicates from a sequence, and computing standard math |
16 |
| -operations on sets such as intersection, union, difference, and symmetric |
17 |
| -difference. |
18 |
| - |
19 |
| -Like other collections, sets support \code{\var{x} in \var{set}}, |
20 |
| -\code{len(\var{set})}, and \code{for \var{x} in \var{set}}. Being an |
21 |
| -unordered collection, sets do not record element position or order of |
22 |
| -insertion. Accordingly, sets do not support indexing, slicing, or |
23 |
| -other sequence-like behavior. |
24 |
| - |
25 |
| -Most set applications use the \class{Set} class which provides every set |
26 |
| -method except for \method{__hash__()}. For advanced applications requiring |
27 |
| -a hash method, the \class{ImmutableSet} class adds a \method{__hash__()} |
28 |
| -method but omits methods which alter the contents of the set. Both |
29 |
| -\class{Set} and \class{ImmutableSet} derive from \class{BaseSet}, an |
30 |
| -abstract class useful for determining whether something is a set: |
31 |
| -\code{isinstance(\var{obj}, BaseSet)}. |
32 |
| - |
33 |
| -The set classes are implemented using dictionaries. Accordingly, the |
34 |
| -requirements for set elements are the same as those for dictionary keys; |
35 |
| -namely, that the element defines both \method{__eq__} and \method{__hash__}. |
36 |
| -As a result, sets |
37 |
| -cannot contain mutable elements such as lists or dictionaries. |
38 |
| -However, they can contain immutable collections such as tuples or |
39 |
| -instances of \class{ImmutableSet}. For convenience in implementing |
40 |
| -sets of sets, inner sets are automatically converted to immutable |
41 |
| -form, for example, \code{Set([Set(['dog'])])} is transformed to |
42 |
| -\code{Set([ImmutableSet(['dog'])])}. |
43 |
| - |
44 |
| -\begin{classdesc}{Set}{\optional{iterable}} |
45 |
| -Constructs a new empty \class{Set} object. If the optional \var{iterable} |
46 |
| -parameter is supplied, updates the set with elements obtained from iteration. |
47 |
| -All of the elements in \var{iterable} should be immutable or be transformable |
48 |
| -to an immutable using the protocol described in |
49 |
| -section~\ref{immutable-transforms}. |
50 |
| -\end{classdesc} |
51 |
| - |
52 |
| -\begin{classdesc}{ImmutableSet}{\optional{iterable}} |
53 |
| -Constructs a new empty \class{ImmutableSet} object. If the optional |
54 |
| -\var{iterable} parameter is supplied, updates the set with elements obtained |
55 |
| -from iteration. All of the elements in \var{iterable} should be immutable or |
56 |
| -be transformable to an immutable using the protocol described in |
57 |
| -section~\ref{immutable-transforms}. |
58 |
| - |
59 |
| -Because \class{ImmutableSet} objects provide a \method{__hash__()} method, |
60 |
| -they can be used as set elements or as dictionary keys. \class{ImmutableSet} |
61 |
| -objects do not have methods for adding or removing elements, so all of the |
62 |
| -elements must be known when the constructor is called. |
63 |
| -\end{classdesc} |
64 |
| - |
65 |
| - |
66 |
| -\subsection{Set Objects \label{set-objects}} |
67 |
| - |
68 |
| -Instances of \class{Set} and \class{ImmutableSet} both provide |
69 |
| -the following operations: |
70 |
| - |
71 |
| -\begin{tableiii}{c|c|l}{code}{Operation}{Equivalent}{Result} |
72 |
| - \lineiii{len(\var{s})}{}{cardinality of set \var{s}} |
73 |
| - |
74 |
| - \hline |
75 |
| - \lineiii{\var{x} in \var{s}}{} |
76 |
| - {test \var{x} for membership in \var{s}} |
77 |
| - \lineiii{\var{x} not in \var{s}}{} |
78 |
| - {test \var{x} for non-membership in \var{s}} |
79 |
| - \lineiii{\var{s}.issubset(\var{t})}{\code{\var{s} <= \var{t}}} |
80 |
| - {test whether every element in \var{s} is in \var{t}} |
81 |
| - \lineiii{\var{s}.issuperset(\var{t})}{\code{\var{s} >= \var{t}}} |
82 |
| - {test whether every element in \var{t} is in \var{s}} |
83 |
| - |
84 |
| - \hline |
85 |
| - \lineiii{\var{s}.union(\var{t})}{\var{s} \textbar{} \var{t}} |
86 |
| - {new set with elements from both \var{s} and \var{t}} |
87 |
| - \lineiii{\var{s}.intersection(\var{t})}{\var{s} \&\ \var{t}} |
88 |
| - {new set with elements common to \var{s} and \var{t}} |
89 |
| - \lineiii{\var{s}.difference(\var{t})}{\var{s} - \var{t}} |
90 |
| - {new set with elements in \var{s} but not in \var{t}} |
91 |
| - \lineiii{\var{s}.symmetric_difference(\var{t})}{\var{s} \^\ \var{t}} |
92 |
| - {new set with elements in either \var{s} or \var{t} but not both} |
93 |
| - \lineiii{\var{s}.copy()}{} |
94 |
| - {new set with a shallow copy of \var{s}} |
95 |
| -\end{tableiii} |
96 |
| - |
97 |
| -Note, the non-operator versions of \method{union()}, |
98 |
| -\method{intersection()}, \method{difference()}, and |
99 |
| -\method{symmetric_difference()} will accept any iterable as an argument. |
100 |
| -In contrast, their operator based counterparts require their arguments to |
101 |
| -be sets. This precludes error-prone constructions like |
102 |
| -\code{Set('abc') \&\ 'cbs'} in favor of the more readable |
103 |
| -\code{Set('abc').intersection('cbs')}. |
104 |
| -\versionchanged[Formerly all arguments were required to be sets]{2.3.1} |
105 |
| - |
106 |
| -In addition, both \class{Set} and \class{ImmutableSet} |
107 |
| -support set to set comparisons. Two sets are equal if and only if |
108 |
| -every element of each set is contained in the other (each is a subset |
109 |
| -of the other). |
110 |
| -A set is less than another set if and only if the first set is a proper |
111 |
| -subset of the second set (is a subset, but is not equal). |
112 |
| -A set is greater than another set if and only if the first set is a proper |
113 |
| -superset of the second set (is a superset, but is not equal). |
114 |
| - |
115 |
| -The subset and equality comparisons do not generalize to a complete |
116 |
| -ordering function. For example, any two disjoint sets are not equal and |
117 |
| -are not subsets of each other, so \emph{all} of the following return |
118 |
| -\code{False}: \code{\var{a}<\var{b}}, \code{\var{a}==\var{b}}, or |
119 |
| -\code{\var{a}>\var{b}}. |
120 |
| -Accordingly, sets do not implement the \method{__cmp__} method. |
121 |
| - |
122 |
| -Since sets only define partial ordering (subset relationships), the output |
123 |
| -of the \method{list.sort()} method is undefined for lists of sets. |
124 |
| - |
125 |
| -The following table lists operations available in \class{ImmutableSet} |
126 |
| -but not found in \class{Set}: |
127 |
| - |
128 |
| -\begin{tableii}{c|l}{code}{Operation}{Result} |
129 |
| - \lineii{hash(\var{s})}{returns a hash value for \var{s}} |
130 |
| -\end{tableii} |
131 |
| - |
132 |
| -The following table lists operations available in \class{Set} |
133 |
| -but not found in \class{ImmutableSet}: |
134 |
| - |
135 |
| -\begin{tableiii}{c|c|l}{code}{Operation}{Equivalent}{Result} |
136 |
| - \lineiii{\var{s}.update(\var{t})} |
137 |
| - {\var{s} \textbar= \var{t}} |
138 |
| - {return set \var{s} with elements added from \var{t}} |
139 |
| - \lineiii{\var{s}.intersection_update(\var{t})} |
140 |
| - {\var{s} \&= \var{t}} |
141 |
| - {return set \var{s} keeping only elements also found in \var{t}} |
142 |
| - \lineiii{\var{s}.difference_update(\var{t})} |
143 |
| - {\var{s} -= \var{t}} |
144 |
| - {return set \var{s} after removing elements found in \var{t}} |
145 |
| - \lineiii{\var{s}.symmetric_difference_update(\var{t})} |
146 |
| - {\var{s} \textasciicircum= \var{t}} |
147 |
| - {return set \var{s} with elements from \var{s} or \var{t} |
148 |
| - but not both} |
149 |
| - |
150 |
| - \hline |
151 |
| - \lineiii{\var{s}.add(\var{x})}{} |
152 |
| - {add element \var{x} to set \var{s}} |
153 |
| - \lineiii{\var{s}.remove(\var{x})}{} |
154 |
| - {remove \var{x} from set \var{s}; raises \exception{KeyError} |
155 |
| - if not present} |
156 |
| - \lineiii{\var{s}.discard(\var{x})}{} |
157 |
| - {removes \var{x} from set \var{s} if present} |
158 |
| - \lineiii{\var{s}.pop()}{} |
159 |
| - {remove and return an arbitrary element from \var{s}; raises |
160 |
| - \exception{KeyError} if empty} |
161 |
| - \lineiii{\var{s}.clear()}{} |
162 |
| - {remove all elements from set \var{s}} |
163 |
| -\end{tableiii} |
164 |
| - |
165 |
| -Note, the non-operator versions of \method{update()}, |
166 |
| -\method{intersection_update()}, \method{difference_update()}, and |
167 |
| -\method{symmetric_difference_update()} will accept any iterable as |
168 |
| -an argument. |
169 |
| -\versionchanged[Formerly all arguments were required to be sets]{2.3.1} |
170 |
| - |
171 |
| -Also note, the module also includes a \method{union_update()} method |
172 |
| -which is an alias for \method{update()}. The method is included for |
173 |
| -backwards compatibility. Programmers should prefer the |
174 |
| -\method{update()} method because it is supported by the builtin |
175 |
| -\class{set()} and \class{frozenset()} types. |
176 |
| - |
177 |
| -\subsection{Example \label{set-example}} |
178 |
| - |
179 |
| -\begin{verbatim} |
180 |
| ->>> from sets import Set |
181 |
| ->>> engineers = Set(['John', 'Jane', 'Jack', 'Janice']) |
182 |
| ->>> programmers = Set(['Jack', 'Sam', 'Susan', 'Janice']) |
183 |
| ->>> managers = Set(['Jane', 'Jack', 'Susan', 'Zack']) |
184 |
| ->>> employees = engineers | programmers | managers # union |
185 |
| ->>> engineering_management = engineers & managers # intersection |
186 |
| ->>> fulltime_management = managers - engineers - programmers # difference |
187 |
| ->>> engineers.add('Marvin') # add element |
188 |
| ->>> print engineers |
189 |
| -Set(['Jane', 'Marvin', 'Janice', 'John', 'Jack']) |
190 |
| ->>> employees.issuperset(engineers) # superset test |
191 |
| -False |
192 |
| ->>> employees.union_update(engineers) # update from another set |
193 |
| ->>> employees.issuperset(engineers) |
194 |
| -True |
195 |
| ->>> for group in [engineers, programmers, managers, employees]: |
196 |
| -... group.discard('Susan') # unconditionally remove element |
197 |
| -... print group |
198 |
| -... |
199 |
| -Set(['Jane', 'Marvin', 'Janice', 'John', 'Jack']) |
200 |
| -Set(['Janice', 'Jack', 'Sam']) |
201 |
| -Set(['Jane', 'Zack', 'Jack']) |
202 |
| -Set(['Jack', 'Sam', 'Jane', 'Marvin', 'Janice', 'John', 'Zack']) |
203 |
| -\end{verbatim} |
204 |
| - |
205 |
| - |
206 |
| -\subsection{Protocol for automatic conversion to immutable |
207 |
| - \label{immutable-transforms}} |
208 |
| - |
209 |
| -Sets can only contain immutable elements. For convenience, mutable |
210 |
| -\class{Set} objects are automatically copied to an \class{ImmutableSet} |
211 |
| -before being added as a set element. |
212 |
| - |
213 |
| -The mechanism is to always add a hashable element, or if it is not |
214 |
| -hashable, the element is checked to see if it has an |
215 |
| -\method{__as_immutable__()} method which returns an immutable equivalent. |
216 |
| - |
217 |
| -Since \class{Set} objects have a \method{__as_immutable__()} method |
218 |
| -returning an instance of \class{ImmutableSet}, it is possible to |
219 |
| -construct sets of sets. |
220 |
| - |
221 |
| -A similar mechanism is needed by the \method{__contains__()} and |
222 |
| -\method{remove()} methods which need to hash an element to check |
223 |
| -for membership in a set. Those methods check an element for hashability |
224 |
| -and, if not, check for a \method{__as_temporarily_immutable__()} method |
225 |
| -which returns the element wrapped by a class that provides temporary |
226 |
| -methods for \method{__hash__()}, \method{__eq__()}, and \method{__ne__()}. |
227 |
| - |
228 |
| -The alternate mechanism spares the need to build a separate copy of |
229 |
| -the original mutable object. |
230 |
| - |
231 |
| -\class{Set} objects implement the \method{__as_temporarily_immutable__()} |
232 |
| -method which returns the \class{Set} object wrapped by a new class |
233 |
| -\class{_TemporarilyImmutableSet}. |
234 |
| - |
235 |
| -The two mechanisms for adding hashability are normally invisible to the |
236 |
| -user; however, a conflict can arise in a multi-threaded environment |
237 |
| -where one thread is updating a set while another has temporarily wrapped it |
238 |
| -in \class{_TemporarilyImmutableSet}. In other words, sets of mutable sets |
239 |
| -are not thread-safe. |
240 |
| - |
241 |
| - |
242 |
| -\subsection{Comparison to the built-in \class{set} types |
243 |
| - \label{comparison-to-builtin-set}} |
244 |
| - |
245 |
| -The built-in \class{set} and \class{frozenset} types were designed based |
246 |
| -on lessons learned from the \module{sets} module. The key differences are: |
247 |
| - |
248 |
| -\begin{itemize} |
249 |
| -\item \class{Set} and \class{ImmutableSet} were renamed to \class{set} and |
250 |
| - \class{frozenset}. |
251 |
| -\item There is no equivalent to \class{BaseSet}. Instead, use |
252 |
| - \code{isinstance(x, (set, frozenset))}. |
253 |
| -\item The hash algorithm for the built-ins performs significantly better |
254 |
| - (fewer collisions) for most datasets. |
255 |
| -\item The built-in versions have more space efficient pickles. |
256 |
| -\item The built-in versions do not have a \method{union_update()} method. |
257 |
| - Instead, use the \method{update()} method which is equivalent. |
258 |
| -\item The built-in versions do not have a \method{_repr(sorted=True)} method. |
259 |
| - Instead, use the built-in \function{repr()} and \function{sorted()} |
260 |
| - functions: \code{repr(sorted(s))}. |
261 |
| -\item The built-in version does not have a protocol for automatic conversion |
262 |
| - to immutable. Many found this feature to be confusing and no one |
263 |
| - in the community reported having found real uses for it. |
264 |
| -\end{itemize} |
0 commit comments