Pickle handle self references in classes · Issue #82951 · python/cpython · GitHub
Pickle handle self references in classes #82951


Closed
SaimRaza mannequin opened this issue Nov 11, 2019 · 4 comments
Assignees
Labels
3.7 (EOL): end of life; stdlib: Python modules in the Lib dir; type-bug: An unexpected behavior, bug, or error

Comments

@SaimRaza
Mannequin
SaimRaza mannequin commented Nov 11, 2019
BPO 38770
Nosy @serhiy-storchaka, @furkanonder

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = 'https://github.com/serhiy-storchaka'
closed_at = None
created_at = <Date 2019-11-11.21:10:17.849>
labels = ['3.7', 'type-bug', 'library']
title = 'Pickle handle self references in classes'
updated_at = <Date 2020-04-21.23:57:29.528>
user = 'https://bugs.python.org/SaimRaza'

bugs.python.org fields:

activity = <Date 2020-04-21.23:57:29.528>
actor = 'furkanonder'
assignee = 'serhiy.storchaka'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2019-11-11.21:10:17.849>
creator = 'Saim Raza'
dependencies = []
files = []
hgrepos = []
issue_num = 38770
keywords = []
message_count = 3.0
messages = ['356388', '366951', '366956']
nosy_count = 3.0
nosy_names = ['serhiy.storchaka', 'Saim Raza', 'furkanonder']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue38770'
versions = ['Python 2.7', 'Python 3.7']

Linked PRs

@SaimRaza
Mannequin Author
SaimRaza mannequin commented Nov 11, 2019

If the __qualname__ of a class is set so that it refers back to the class itself, pickle behaves differently depending on the protocol. The following script demonstrates the issue:

======================================================

from __future__ import print_function

import pickle, sys

class Foo:
    __name__ = __qualname__ = "Foo.ref"
    pass

Foo.ref = Foo

print(sys.version_info)

for proto in range(0, pickle.HIGHEST_PROTOCOL + 1):
    print("{}:".format(proto), end=" ")
    try:
        pkl = pickle.dumps(Foo, proto)
        print("Dump OK,", end=" ")
        assert(pickle.loads(pkl) is Foo)
        print("Load OK,")
    except Exception as err:
        print(repr(err))

======================================================
OUTPUT:
Python2.7:
sys.version_info(major=2, minor=7, micro=16, releaselevel='final', serial=0)
0: Dump OK, Load OK,
1: Dump OK, Load OK,
2: Dump OK, Load OK,

Python3.7:
sys.version_info(major=3, minor=7, micro=3, releaselevel='final', serial=0)
0: RecursionError('maximum recursion depth exceeded while pickling an object')
1: RecursionError('maximum recursion depth exceeded while pickling an object')
2: RecursionError('maximum recursion depth exceeded while pickling an object')
3: RecursionError('maximum recursion depth exceeded while pickling an object')
4: Dump OK, Load OK,
======================================================

This was introduced as a side effect of bpo-23611 (?). I can think of the following approaches to fix the issue and make the behavior consistent:

  1. Check if the class has a self-reference and raise an error for all protocols.
  2. Use memoization to handle self-references. I am not sure what should be dumped in this case. In the example above Foo will exist in the namespace but not Foo.ref.
  3. Dump such classes similar to Python 2 pickle and Python 3 pickle protocol >= 4.
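
A minimal sketch of the check that approach 1 would need, under stated assumptions: the helper name `qualname_passes_through_self` and the synthetic module `selfref_demo` are hypothetical and exist only for this illustration; nothing like this is part of pickle.py. The helper walks the dotted `__qualname__` from the module and reports whether a parent on that path is the object itself.

```python
import sys
import types


def qualname_passes_through_self(obj):
    """Return True if a parent on obj's __qualname__ path is obj itself."""
    current = sys.modules[obj.__module__]
    # Every component except the last one names a parent object.
    *parents, _last = obj.__qualname__.split(".")
    for name in parents:
        current = getattr(current, name)
        if current is obj:
            return True
    return False


# Hypothetical module, registered so that lookups by module name succeed.
demo = types.ModuleType("selfref_demo")
sys.modules["selfref_demo"] = demo


class Foo:
    __module__ = "selfref_demo"
    __qualname__ = "Foo.ref"


class Plain:
    __module__ = "selfref_demo"
    __qualname__ = "Plain"


Foo.ref = Foo
demo.Foo = Foo
demo.Plain = Plain

print(qualname_passes_through_self(Foo))
print(qualname_passes_through_self(Plain))
```

A pickler following approach 1 could raise an error for every protocol whenever such a check succeeds, which would at least make the behavior consistent.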

I took a stab at pickle.py and had some success with approach 3 above. I am posting this issue for discussion and would be happy to submit a PR.

Thanks,
Saim Raza

@SaimRaza SaimRaza mannequin added the 3.7 (EOL), stdlib, and type-bug labels Nov 11, 2019
@serhiy-storchaka serhiy-storchaka self-assigned this Nov 11, 2019
@furkanonder
Mannequin
furkanonder mannequin commented Apr 21, 2020

I ran your script and didn't get RecursionError. The issue seems to be fixed.

Python 3.8.2 (default, Apr  8 2020, 14:31:25) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from __future__ import print_function
>>> 
>>> import pickle, sys
>>> 
>>> class Foo:
...     __name__ = __qualname__ = "Foo.ref"
...     pass
... 
>>> Foo.ref = Foo
>>> 
>>> print(sys.version_info)
sys.version_info(major=3, minor=8, micro=2, releaselevel='final', serial=0)
>>> for proto in range(0, pickle.HIGHEST_PROTOCOL + 1):
...     print("{}:".format(proto), end=" ")
...     try:
...         pkl = pickle.dumps(Foo, proto)
...         print("Dump OK,", end=" ")
...         assert(pickle.loads(pkl) is Foo)
...         print("Load OK,")
...     except Exception as err:
...         print(repr(err))
... 
0: PicklingError("Can't pickle <class '__main__.Foo.ref'>: import of module '__main__' failed")
1: PicklingError("Can't pickle <class '__main__.Foo.ref'>: import of module '__main__' failed")
2: PicklingError("Can't pickle <class '__main__.Foo.ref'>: import of module '__main__' failed")
3: PicklingError("Can't pickle <class '__main__.Foo.ref'>: import of module '__main__' failed")
4: Dump OK, Load OK,
5: Dump OK, Load OK,
>>>

@furkanonder
Mannequin
furkanonder mannequin commented Apr 21, 2020

Ahh, I misunderstood the problem. Pickle behaves differently when there is a circular reference. If you have a solution, I would be curious to see it.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Jul 23, 2024
Serializing objects with complex __qualname__ (such as unbound methods and
nested classes) by name no longer involves serializing parent objects by value
in pickle protocols < 4.
@serhiy-storchaka
Member

Pickling unbound methods and nested classes is natively supported in protocol 4. In protocol 3 and lower, only top-level classes and functions (i.e. those that do not have a dot in __qualname__) were initially supported. Later, support for nested names was implemented via the getattr() function (see bpo-23611): Foo.ref is now represented as getattr(Foo, 'ref'). The problem in your example is that if Foo.__qualname__ is 'Foo.ref', Foo itself is pickled as getattr(Foo, 'ref'), and pickling the first argument Foo again leads to infinite recursion.
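
The difference between the protocols can be observed with pickletools. This is a sketch for illustration only: it uses `json.JSONDecoder.decode` as a convenient stdlib function with a dot in its `__qualname__`, and the helper `opcode_names` is a hypothetical name introduced here. Protocol 3 should emit a REDUCE opcode (the getattr() call), while protocol 4 should emit STACK_GLOBAL with the dotted name.

```python
import json
import pickle
import pickletools


def opcode_names(data):
    """Return the names of the opcodes in a pickle byte string, in order."""
    return [opcode.name for opcode, _arg, _pos in pickletools.genops(data)]


# A plain Python function whose __qualname__ is 'JSONDecoder.decode'.
target = json.JSONDecoder.decode

by_protocol = {proto: pickle.dumps(target, proto) for proto in (3, 4)}

for proto, data in by_protocol.items():
    names = opcode_names(data)
    print(proto, "REDUCE" in names, "STACK_GLOBAL" in names)
    # In both cases loading resolves back to the very same function object.
    assert pickle.loads(data) is target
```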

This problem can be solved by forcing all parent objects to be serialized by name without using the normal dispatch mechanism: represent Foo.ref as getattr(getattr(module, 'Foo'), 'ref').
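
The lookup that such a representation performs when loading can be sketched as follows. This is an illustration of the idea, not the code in pickle.py: `resolve_by_name` and the synthetic module `selfref_demo` are hypothetical names. The point is that every parent is reached purely by name from the module, so a self-referential `__qualname__` cannot cause recursion.

```python
import functools
import importlib
import json
import sys
import types


def resolve_by_name(module_name, qualname):
    """Apply getattr() along the dotted path, starting from the module."""
    module = importlib.import_module(module_name)
    return functools.reduce(getattr, qualname.split("."), module)


# Equivalent to getattr(getattr(json.decoder, 'JSONDecoder'), 'decode').
decode = resolve_by_name("json.decoder", "JSONDecoder.decode")
assert decode is json.JSONDecoder.decode

# The self-referential class from the report, placed in a hypothetical module.
demo = types.ModuleType("selfref_demo")
sys.modules["selfref_demo"] = demo


class Foo:
    __module__ = "selfref_demo"
    __qualname__ = "Foo.ref"


Foo.ref = Foo
demo.Foo = Foo

# Two getattr() calls, both by name; Foo is never serialized by value.
resolved = resolve_by_name("selfref_demo", "Foo.ref")
assert resolved is Foo
```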

I consider this a bugfix because it brings the behavior in protocols < 4 in line with the behavior in protocols >= 4, which has been the default since 3.8 (see bpo-23403). I wish I had done it this way from the beginning.

serhiy-storchaka added a commit that referenced this issue Jul 25, 2024
Serializing objects with complex __qualname__ (such as unbound methods and
nested classes) by name no longer involves serializing parent objects by value
in pickle protocols < 4.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jul 25, 2024
…onGH-122149)

Serializing objects with complex __qualname__ (such as unbound methods and
nested classes) by name no longer involves serializing parent objects by value
in pickle protocols < 4.
(cherry picked from commit dc07f65)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
serhiy-storchaka added a commit that referenced this issue Jul 25, 2024
…122149) (GH-122265)

Serializing objects with complex __qualname__ (such as unbound methods and
nested classes) by name no longer involves serializing parent objects by value
in pickle protocols < 4.
(cherry picked from commit dc07f65)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
serhiy-storchaka added a commit that referenced this issue Jul 25, 2024
…122149) (GH-122264)

Serializing objects with complex __qualname__ (such as unbound methods and
nested classes) by name no longer involves serializing parent objects by value
in pickle protocols < 4.
(cherry picked from commit dc07f65)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Jul 26, 2024
It was introduced in the previous commit.
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Jul 26, 2024