Optimize regex compilation in Distribution._read_sections · Issue #292 · python/importlib_metadata · GitHub
Optimize regex compilation in Distribution._read_sections #292
@tirkarthi

Description

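In the lines below, re.match is called with the raw pattern string for every line in the loop, so the pattern has to be resolved (compiled, or at best looked up in re's internal cache) on each iteration. This is unnecessary: we can call re.compile once and reuse the pattern object. Doing so improves speed, as the benchmark below shows.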

def _read_sections(lines):
    section = None
    for line in filter(None, lines):
        section_match = re.match(r'\[(.*)\]$', line)
        if section_match:
            section = section_match.group(1)
            continue
        yield locals()

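Benchmark parsing the requires.txt from dask, using the same command for both runs (results saved as no_pattern.json without the precompiled pattern and pattern.json with it):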

$  wc requires.txt
 38  31 427 requires.txt
$ python -m pyperf timeit -s 'from importlib_metadata import Distribution; requires = open("requires.txt").read()' 'Distribution._deps_from_requires_text(requires)' -o no_pattern.json
$ python -m pyperf timeit -s 'from importlib_metadata import Distribution; requires = open("requires.txt").read()' 'Distribution._deps_from_requires_text(requires)' -o pattern.json
$ python -m pyperf compare_to no_pattern.json pattern.json --table
+-----------+------------+-----------------------+
| Benchmark | no_pattern | pattern               |
+===========+============+=======================+
| timeit    | 53.4 us    | 30.4 us: 1.76x faster |
+-----------+------------+-----------------------+
