ENH: optimize STRING_compare by using memcmp #4572

juliantaylor · 2014-03-31T20:10:02Z

No description provided.

juliantaylor · 2014-03-31T20:13:22Z

@charris back in 2008 (382e672) you changed this function to use a loop instead of strncmp, as that does not handle nulls. Is there a reason you did not consider memcmp?
It's faster on my glibc 2.19 even for small arrays (4 bytes on amd64)

juliantaylor · 2014-03-31T20:19:15Z

maybe because memcmp does not stop on null, so short strings with long padding are slower?
maybe it's not such a good idea then

juliantaylor · 2014-03-31T20:20:36Z

oh wait this is wrong when comparing strings with garbage behind NULL

edit: no, the old function does not stop either, but it's probably not worth thinking about what is fastest, this should not be an important function
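
For readers following the null-handling question, here is a minimal C sketch of the difference being discussed. It is not the NumPy source: `loop_compare` is a hypothetical stand-in for a byte-by-byte loop, and `sign` is a helper added only to normalize return values, since `strncmp` and `memcmp` guarantee only the sign of their result.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a byte-by-byte comparison loop.
 * Illustration only, not the actual NumPy implementation. */
static int loop_compare(const unsigned char *a, const unsigned char *b,
                        size_t len)
{
    size_t i;
    for (i = 0; i < len; i++) {
        if (a[i] != b[i]) {
            /* Does not stop at a null byte; only at the first difference. */
            return (a[i] > b[i]) ? 1 : -1;
        }
    }
    return 0;
}

/* Reduce any comparison result to -1, 0 or 1. */
static int sign(int v)
{
    return (v > 0) - (v < 0);
}
```

With the two 6-byte fields `as\0d\0\0` and `as\0e\0\0`, `strncmp(x, y, 6)` reports equality because it stops at the first null, while `memcmp(x, y, 6)` and `loop_compare` both look past it and report the first field as smaller.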

charris · 2014-03-31T20:52:26Z

@juliantaylor I don't remember ;) Some of the gcc library functions were inefficient and slow back in the day, memcpy was notorious that way. The function was probably supposed to bail at the first null, so the function may actually be incorrect as stands, although I think strings are filled out with zeros. Looks like strncmp should work and may be better these days.

juliantaylor · 2014-05-16T17:05:31Z

turns out this is actually useful when using a void type to sort an array by byte values
http://stackoverflow.com/a/22750502/1633169
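
The general idea of ordering fixed-width records by their raw bytes can be sketched in plain C with `qsort` and `memcmp`. This is only an illustration of the concept, not what NumPy or the linked answer does internally; `RECORD_SIZE`, `record_compare` and `sort_records` are hypothetical names chosen for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed record width in bytes (an assumption for this sketch). */
#define RECORD_SIZE 4

/* qsort comparator: orders records lexicographically by unsigned byte value,
 * which is how memcmp interprets its inputs. */
static int record_compare(const void *a, const void *b)
{
    return memcmp(a, b, RECORD_SIZE);
}

/* Sort `count` contiguous records of RECORD_SIZE bytes each, in place. */
static void sort_records(unsigned char *records, size_t count)
{
    qsort(records, count, RECORD_SIZE, record_compare);
}
```

Because `memcmp` compares bytes as `unsigned char`, a record starting with `01 FF` sorts after one starting with `01 00`, and embedded zero bytes take part in the ordering like any other value.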

seberg · 2014-05-26T14:37:08Z

numpy/core/src/multiarray/arraytypes.c.src

@@ -2598,12 +2599,14 @@ STRING_compare(char *ip1, char *ip2, PyArrayObject *ap)
    const unsigned char *c1 = (unsigned char *)ip1;
    const unsigned char *c2 = (unsigned char *)ip2;
    const size_t len = PyArray_DESCR(ap)->elsize;


Looks good to me, I think it is correct that we compare all bytes, since only trailing 0-bytes are stripped (`'as\x00d\x00\x00'` is `'as\x00d'`). Probably doesn't matter, but `elsize` actually is int I think

Oh, heh, in my mind I combined the old and new usage of i, so nvm.
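
The trailing-zero rule mentioned in this review comment can be sketched as a small C helper. `stripped_length` is a hypothetical name used here for illustration and is not a NumPy function; it only shows why embedded nulls still count while trailing ones do not.

```c
#include <stddef.h>

/* Hypothetical helper: length of a fixed-width field once trailing zero
 * bytes are ignored. Zero bytes in the middle of the field are kept. */
static size_t stripped_length(const unsigned char *buf, size_t elsize)
{
    size_t n = elsize;
    while (n > 0 && buf[n - 1] == 0) {
        n--;
    }
    return n;
}
```

For the 6-byte field `as\0d\0\0` this gives 4, matching the `'as\x00d'` example: the embedded null survives, so a comparison has to look at every byte of the field rather than stopping at the first null.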

juliantaylor · 2014-09-02T21:40:05Z

I guess this can go in, speeds up at least one usecase and is simple

ENH: optimize STRING_compare by using memcmp

juliantaylor closed this Mar 31, 2014

juliantaylor reopened this May 16, 2014

ENH: optimize STRING_compare by using memcmp

2f6da63

seberg reviewed May 26, 2014

juliantaylor added a commit that referenced this pull request Sep 2, 2014

Merge pull request #4572 from juliantaylor/string-cmp

0f0575c

ENH: optimize STRING_compare by using memcmp

juliantaylor merged commit 0f0575c into numpy:master Sep 2, 2014

juliantaylor deleted the string-cmp branch September 2, 2014 21:40
