@@ -15,7 +15,7 @@ that CPython counts how many different places there are that have a reference to
object. Such a place could be another object, or a global (or static) C variable, or
a local variable in some C function. When an object’s reference count becomes zero,
the object is deallocated. If it contains references to other objects, their
- reference count is decremented. Those other objects may be deallocated in turn, if
+ reference counts are decremented. Those other objects may be deallocated in turn, if
this decrement makes their reference count become zero, and so on. The reference
count field can be examined using the ``sys.getrefcount`` function (notice that the
value returned by this function is always 1 more as the function also has a reference
@@ -46,7 +46,7 @@ does not handle reference cycles. For instance, consider this code:

In this example, ``container`` holds a reference to itself, so even when we remove
our reference to it (the variable "container") the reference count never falls to 0
- because it still has its own internal reference and therefore it will never be
+ because it still has its own internal reference. Therefore it would never be
cleaned just by simple reference counting. For this reason some additional machinery
is needed to clean these reference cycles between objects once they become
unreachable. This is the cyclic garbage collector, usually called just Garbage
@@ -92,7 +92,7 @@ As is explained later in the `Optimization: reusing fields to save memory`_ sect
these two extra fields are normally used to keep doubly linked lists of all the
objects tracked by the garbage collector (these lists are the GC generations, more on
that in the `Optimization: generations`_ section), but they are also
- reused to fullfill other pourposes when the full doubly linked list structure is not
+ reused to fullfill other purposes when the full doubly linked list structure is not
needed as a memory optimization.

Doubly linked lists are used because they efficiently support most frequently required operations. In
@@ -144,8 +144,8 @@ lists are maintained: one list contains all objects to be scanned, and the other
contain all objects "tentatively" unreachable.

To understand how the algorithm works, Let’s take the case of a circular linked list
- which has one link referenced by a variable A, and one self-referencing object which
- is completely unreachable
+ which has one link referenced by a variable ``A``, and one self-referencing object which
+ is completely unreachable:

.. code-block:: python

@@ -156,14 +156,15 @@ is completely unreachable
    ...         self.next_link = next_link

    >>> link_3 = Link()
-     >>> link_2 = Link(link3)
-     >>> link_1 = Link(link2)
+     >>> link_2 = Link(link_3)
+     >>> link_1 = Link(link_2)
    >>> link_3.next_link = link_1
+     >>> A = link_1
+     >>> del link_1, link_2, link_3

    >>> link_4 = Link()
    >>> link_4.next_link = link_4

-     >>> del link_4
    >>> gc.collect()
    2

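The scenario in this hunk can also be written as a standalone script. The following is only a minimal sketch under stated assumptions, not part of the changed file: the ``build_and_collect`` helper and the two ``weakref`` probes are hypothetical names introduced here for illustration, and the self-referencing link is explicitly deleted so that it is, as the text puts it, completely unreachable. The sketch checks which objects survive a collection rather than relying on the number returned by ``gc.collect()``, since that count may differ between Python versions.

```python
import gc
import weakref


class Link:
    def __init__(self, next_link=None):
        self.next_link = next_link


def build_and_collect():
    """Build both structures, run one collection, and report what survived.

    Returns (ring_alive, lone_alive). Hypothetical helper for illustration.
    """
    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collections from interfering with the demo
    try:
        # Circular linked list: link_1 -> link_2 -> link_3 -> link_1
        link_3 = Link()
        link_2 = Link(link_3)
        link_1 = Link(link_2)
        link_3.next_link = link_1
        A = link_1  # the only outside reference into the ring
        ring_probe = weakref.ref(link_3)  # observes the ring without owning it
        del link_1, link_2, link_3

        # Self-referencing object with no outside references at all
        link_4 = Link()
        link_4.next_link = link_4
        lone_probe = weakref.ref(link_4)
        del link_4

        gc.collect()

        ring_alive = ring_probe() is not None
        lone_alive = lone_probe() is not None
        del A  # release the ring; a later collection can reclaim it
        return ring_alive, lone_alive
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print(build_and_collect())
```

Under these assumptions the ring stays alive because ``A`` still reaches it, while the self-referencing link is reclaimed by the cyclic collector even though its reference count never dropped to zero.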
@@ -194,16 +195,16 @@ This is because another object that is reachable from the outside (``gc_refs > 0
can still have references to it. For instance, the ``link_2`` object in our example
ended having ``gc_refs == 0`` but is referenced still by the ``link_1`` object that
is reachable from the outside. To obtain the set of objects that are really
- unreachable, the garbage collector scans again the container objects using the
- ``tp_traverse`` slot with a different traverse function that marks objects with
+ unreachable, the garbage collector re-scans the container objects using the
+ ``tp_traverse`` slot; this time with a different traverse function that marks objects with
``gc_refs == 0`` as "tentatively unreachable" and then moves them to the
tentatively unreachable list. The following image depicts the state of the lists in a
- moment when the GC processed the ``link 3`` and ``link 4`` objects but has not
- processed ``link 1`` and ``link 2`` yet.
+ moment when the GC processed the ``link_3`` and ``link_4`` objects but has not
+ processed ``link_1`` and ``link_2`` yet.

.. figure:: images/python-cyclic-gc-3-new-page.png

- Then the GC scans the next ``link 1`` object. Because its has ``gc_refs == 1``
+ Then the GC scans the next ``link_1`` object. Because its has ``gc_refs == 1``
the gc does not do anything special because it knows it has to be reachable (and is
already in what will become the reachable list):

@@ -213,9 +214,9 @@ When the GC encounters an object which is reachable (``gc_refs > 0``), it traver
its references using the ``tp_traverse`` slot to find all the objects that are
reachable from it, moving them to the end of the list of reachable objects (where
they started originally) and setting its ``gc_refs`` field to 1. This is what happens
- to ``link 2`` and ``link 3`` below as they are reachable from ``link 1``. From the
- state in the previous image and after examining the objects referred to by ``link1``
- the GC knows that ``link 3`` is reachable after all, so it is moved back to the
+ to ``link_2`` and ``link_3`` below as they are reachable from ``link_1``. From the
+ state in the previous image and after examining the objects referred to by ``link_1``
+ the GC knows that ``link_3`` is reachable after all, so it is moved back to the
original list and its ``gc_refs`` field is set to one so if the GC visits it again, it
does know that is reachable. To avoid visiting a object twice, the GC marks all
objects that are already visited once (by unsetting the ``PREV_MASK_COLLECTING`` flag)
@@ -273,13 +274,13 @@ follows these steps in order:
set is going to be destroyed and has weak references with callbacks, these
callbacks need to be honored. This process is **very** delicate as any error can
cause objects that will be in an inconsistent state to be resurrected or reached
- by some python functions invoked from the callbacks. To avoid these weak references
+ by some python functions invoked from the callbacks. In addition, weak references
that also are part of the unreachable set (the object and its weak reference
- are in a cycles that are unreachable) then the weak reference needs to be cleaned
- immediately and the callback must not be executed so it does not trigger later
- when the ``tp_clear`` slot is called, causing havoc. This is fine because both
- the object and the weakref are going away, so it's legitimate to pretend the
- weak reference is going away first so the callback is never executed.
+ are in a cycles that are unreachable) need to be cleaned
+ immediately, without executing the callback. Otherwise it will be triggered later,
+ when the ``tp_clear`` slot is called, causing havoc. Ignoring the weak reference's
+ callback is fine because both the object and the weakref are going away, so it's
+ legitimate to say the weak reference is going away first.

2. If an object has legacy finalizers (``tp_del`` slot) move them to the
``gc.garbage`` list.
@@ -296,7 +297,7 @@ follows these steps in order:
Optimization: generations
-------------------------

- In order to limit the time each garbage collection takes, the GC is uses a popular
+ In order to limit the time each garbage collection takes, the GC uses a popular
optimization: generations. The main idea behind this concept is the assumption that
most objects have a very short lifespan and can thus be collected shortly after their
creation. This has proven to be very close to the reality of many Python programs as
@@ -314,7 +315,7 @@ it will be moved to the last generation (generation 2) where it will be
surveyed the least often.

Generations are collected when the number of objects that they contain reach some
- predefined threshold which is unique for each generation and is lower than the older
+ predefined threshold, which is unique for each generation and is lower the older
generations are. These thresholds can be examined using the ``gc.get_threshold``
function:

@@ -411,7 +412,7 @@ of ``PyGC_Head`` discussed in the `Memory layout and object structure`_ section:
dereferenced directly and the extra information must be stripped off before
obtaining the real memory address. Special care needs to be taken with
functions that directly manipulate the linked lists, as these functions
- normally asume the pointers inside the lists are in a consistent state.
+ normally assume the pointers inside the lists are in a consistent state.


* The ``_gc_prev`` field is normally used as the "previous" pointer to maintain the