ENH: optimize STRING_compare by using memcmp #4572
Maybe because memcmp does not stop on NUL, so short strings with long padding are slower?
Oh wait, this is wrong when comparing strings with garbage behind the NUL. Edit: no, the old function does not stop there either. It's probably not worth thinking about what is fastest anyway; this should not be an important function.
@juliantaylor I don't remember ;) Some of the gcc library functions were inefficient and slow back in the day,
Turns out this is actually useful when using a void type to sort an array by byte values.
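To illustrate the byte-value sorting use case, here is a minimal standalone sketch in plain C. It is not NumPy code: `RECORD_SIZE`, `compare_records`, and `sort_records` are hypothetical names, and the fixed element width is an assumption made only so the comparator fits `qsort`'s signature.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed element width in bytes (hypothetical, for illustration only). */
#define RECORD_SIZE 4

/* qsort comparator: orders two fixed-width records by their raw bytes.
 * memcmp compares bytes as unsigned char and does not stop at NUL bytes. */
static int
compare_records(const void *a, const void *b)
{
    return memcmp(a, b, RECORD_SIZE);
}

/* Sorts `count` contiguous records of RECORD_SIZE bytes each, in place. */
static void
sort_records(unsigned char *data, size_t count)
{
    qsort(data, count, RECORD_SIZE, compare_records);
}
```

Because the comparison is over raw bytes, records that contain zero bytes in the middle are still ordered by everything that follows them.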
@@ -2598,12 +2599,14 @@ STRING_compare(char *ip1, char *ip2, PyArrayObject *ap)
     const unsigned char *c1 = (unsigned char *)ip1;
     const unsigned char *c2 = (unsigned char *)ip2;
     const size_t len = PyArray_DESCR(ap)->elsize;
Looks good to me. I think it is correct that we compare all bytes, since only trailing 0-bytes are stripped (`'as\x00d\x00\x00'` is `'as\x00d'`). Probably doesn't matter, but `elsize` is actually an int, I think.
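The point about comparing all bytes can be checked with a small standalone sketch. This is an illustration under assumptions, not the code in this PR: `fixed_width_compare` and `nul_terminated_compare` are hypothetical helpers that contrast `memcmp` (every byte counts) with `strncmp` (stops at the first NUL).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: compares all `len` bytes, embedded NULs included.
 * The result is normalized to -1, 0, or 1. */
static int
fixed_width_compare(const char *a, const char *b, size_t len)
{
    int r = memcmp(a, b, len);
    return (r > 0) - (r < 0);
}

/* Hypothetical helper for contrast: strncmp stops at the first NUL,
 * so any bytes after it are ignored. */
static int
nul_terminated_compare(const char *a, const char *b, size_t len)
{
    int r = strncmp(a, b, len);
    return (r > 0) - (r < 0);
}
```

For the 6-byte elements `as\0d\0\0` and `as\0e\0\0`, the `memcmp` version orders the first before the second, while the `strncmp` version treats them as equal because it never looks past the first NUL.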
Oh, heh, in my mind I combined the old and new usage of `i`, so nvm.
I guess this can go in; it speeds up at least one use case and is simple.