-
-
Notifications
You must be signed in to change notification settings - Fork 32.2k
gh-86618: Improve colorsys.rgb_to_hls code #23306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hate to be that guy, but since this is about optimization I think some before/after benchmarks need to be included together with the test data and some details of the test environment.
@jstasiak I'm not sure if I'm benchmarking wrongly, but I don't see any performance improvements either: import timeit
timeit.timeit(stmt="rgb_to_hls(255, 255, 255)", setup="from colorsys import rgb_to_hls") On main branch: On this PR's branch: Basically no significant change. |
Using this simple — yet more comprehensive — script which uses the test data, as recommended by @jstasiak : # /usr/bin/env python
import timeit
import statistics
from Lib.test.test_colorsys import ColorsysTest
if __name__ == '__main__':
number = 100
repeat = 100
results = timeit.repeat('ColorsysTest().test_hls_roundtrip()',
setup='from Lib.test.test_colorsys import ColorsysTest',
number=number,
repeat=repeat)
running_times = list(sorted(results))
mean = statistics.mean(running_times)
std = statistics.stdev(running_times)
print(f"Running time: {mean} ± {std} ms. "
f"(number, repeat) = ({number}, {repeat})") I get on:
Testing environment: $ uname -a ; gcc -v
Linux torus 5.8.15-201.fc32.x86_64 #1 SMP Thu Oct 15 15:56:44 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Using built-in specs.
COLLECT_GCC=/usr/bin/gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/10/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.2.1 20201016 (Red Hat 10.2.1-6) (GCC) |
@jstasiak: I guess it would be better to close this PR, open a bpo-issue and attach a new PR to it. What do you think? |
You can open the bpo issue then prepend the issue to the current PR title, it will then auto link :). There isn't really a need to close this PR. |
Done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this module is one of those proposed for removal. But that proposal was rejected for 3.9 and the module has not yet been deprecated.
Addition and subtraction are so fast that I am not surprised that a speedup in hard to detect. But the change makes the code easier to read. Since the PR is made and the change is easy enough to verify as correct and this function is thoroughly tested, I think it worth making.
While this change is conceptually trivial, the code change itself is not, for the purpose skipping the CLA and bpo issue. I will, however, skip a news item.
There is no report available for the Pipelines failure. Given the nature of the change and its passing on multiple other tests, I am going to assume that the Pipelines failure is unrelated.
Cache repeated sum and difference to make code slightly faster and easier to read.
Cache repeated sum and difference to make code slightly faster and easier to read.