Calling unittest.assertDictEqual for medium-size dictionaries takes too long · Issue #99151 · python/cpython · GitHub

Calling unittest.assertDictEqual for medium-size dictionaries takes too long #99151

Open
boaza opened this issue Nov 6, 2022 · 2 comments
Labels
stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)

Comments

boaza commented Nov 6, 2022

Calling assertDictEqual(d1, d2) takes forever, even for medium-size dictionaries. To reproduce:

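from random import randint
from unittest import TestCase


class TestAssertDict(TestCase):
    def test_assert_dict(self):
        r = 10000000
        num = 10000
        # Two unrelated random dicts: they almost never compare equal,
        # so the assertion fails and has to build its diff message.
        d1 = dict((randint(0, r), randint(0, r)) for _ in range(num))
        d2 = dict((randint(0, r), randint(0, r)) for _ in range(num))
        self.assertDictEqual(d1, d2)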

Probably related to issue #63416.

Tested on Python 3.10, Windows 11


@boaza boaza added the type-bug An unexpected behavior, bug, or error label Nov 6, 2022
@ronaldoussoren ronaldoussoren added the stdlib Python modules in the Lib dir label Dec 22, 2022
ronaldoussoren (Contributor) commented

See also #51180 about slowness in difflib.ndiff (used by assertDictEqual to report differences).

One avenue to explore would be to pick one of the other diff functions: unified_diff appears to be a lot faster, although this does change the output.
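
A minimal sketch of that comparison, assuming the failure message is built by diffing the pprint.pformat lines of both dictionaries. The helper names ndiff_report and unified_report are hypothetical and exist only for this illustration; they are not part of unittest.

```python
import difflib
import pprint


def ndiff_report(d1, d2):
    """Diff the pretty-printed dicts with difflib.ndiff (slow on large inputs)."""
    a = pprint.pformat(d1).splitlines()
    b = pprint.pformat(d2).splitlines()
    return list(difflib.ndiff(a, b))


def unified_report(d1, d2):
    """Same inputs, but diffed with difflib.unified_diff (different output format)."""
    a = pprint.pformat(d1).splitlines()
    b = pprint.pformat(d2).splitlines()
    return list(difflib.unified_diff(a, b, fromfile="first", tofile="second", lineterm=""))


# Large enough that pformat puts one key per line.
d1 = {f"key{i}": i for i in range(30)}
d2 = dict(d1, key7=-1)

nd = ndiff_report(d1, d2)
unified = unified_report(d1, d2)
print("\n".join(unified))
```

Both helpers mark the changed key with "-" and "+" lines. ndiff additionally emits "?" hint lines for near-matching lines, and unified_diff only shows a few lines of context around each change, so switching functions changes what a failing test prints.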

merlinz01 commented

I think a simple difflib.ndiff over pprint output is a bad choice for `assertDictEqual`. The output it gives is often very confusing, like this:

E       AssertionError: {'second': 2, 'seventh': 7, 'sixth': 6, 'fo[53 chars]': 3} != {'first': 1, 'seventh': 7, 'ninth': 9, 'sec[66 chars]': 3}
E         {'eighth': 8,
E          'fifth': 5,
E          'first': 1,
E          'fourth': 4,
E       +  'ninth': 9,
E          'second': 2,
E          'seventh': 7,
E       -  'sixth': 6,
E       -  'third': 3}
E       ?            ^
E       
E       +  'third': 3,
E       ?            ^
E       
E       +  'zeroth': 0}

I put together a better implementation which gives much nicer output, handles nested structures, and is WAY faster than the current implementation for the test given.

E   AssertionError: {'second': 2, 'seventh': 7, 'sixth': 6, 'fo[53 chars]': 3} != {'first': 1, 'seventh': 7, 'ninth': 9, 'sec[66 chars]': 3}
E   
E   Diff:
E   {
E       'eighth': 8,
E       'fifth': 5,
E       'first': 1,
E       'fourth': 4,
E       'second': 2,
E       'seventh': 7,
E       'third': 3,
E     - 'sixth': 6,
E     + 'ninth': 9,
E     + 'zeroth': 0,
E   }

I would be happy to contribute my code if that is desired.
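
For readers who want a feel for the idea, here is a rough sketch of a key-wise dictionary diff. It is not the implementation mentioned above (that code is not shown in the thread); dict_diff_lines is a hypothetical helper, it does not handle nested structures, and it keeps insertion order rather than sorting keys.

```python
def dict_diff_lines(d1, d2):
    """Return a line-per-key diff of two dicts (hypothetical helper)."""
    lines = ["{"]
    # Items that are identical in both dicts are shown unmarked.
    for key in d1:
        if key in d2 and d1[key] == d2[key]:
            lines.append(f"    {key!r}: {d1[key]!r},")
    # Items only in d1, or whose value differs, are marked "-".
    for key in d1:
        if key not in d2 or d1[key] != d2[key]:
            lines.append(f"  - {key!r}: {d1[key]!r},")
    # Items only in d2, or whose value differs, are marked "+".
    for key in d2:
        if key not in d1 or d1[key] != d2[key]:
            lines.append(f"  + {key!r}: {d2[key]!r},")
    lines.append("}")
    return lines


print("\n".join(dict_diff_lines({"first": 1, "sixth": 6}, {"first": 1, "ninth": 9})))
```

Because each key is looked up directly instead of aligning text lines, the work grows linearly with the number of keys, which is why an approach along these lines can avoid the slowdown seen with ndiff.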
