ENH: Added libdivide for floor divide #17727
Changes from 1 commit
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -832,7 +832,6 @@ NPY_NO_EXPORT void | |
* #TYPE = BYTE, SHORT, INT, LONG, LONGLONG# | ||
* #type = npy_byte, npy_short, npy_int, npy_long, npy_longlong# | ||
* #c = ,,,l,ll# | ||
- * #div = s32, s32, s32, s64, s64# | ||
*/ | ||
|
||
NPY_NO_EXPORT NPY_GCC_OPT_3 void | ||
|
@@ -847,6 +846,19 @@ NPY_NO_EXPORT NPY_GCC_OPT_3 void | |
UNARY_LOOP_FAST(@type@, @type@, *out = in > 0 ? 1 : (in < 0 ? -1 : 0)); | ||
} | ||
|
||
+ /* Using nested loops, few more fields to be added in the future */ | ||
+ /**begin repeat1 | ||
+ * #kind = t, gen, do# | ||
+ */ | ||
+ /* libdivide only supports 32 and 64 bit types | ||
+ * We try to pick the best possible one */ | ||
+ #if NPY_BITSOF_@TYPE@ <= 32 | ||
+ #define libdivide_@type@_@kind@ libdivide_s32_@kind@ | ||
+ #else | ||
+ #define libdivide_@type@_@kind@ libdivide_s64_@kind@ | ||
+ #endif | ||
+ /**end repeat1**/ | ||
|
||
#ifndef USE_LEGACY_DIVISION | ||
NPY_NO_EXPORT void | ||
@TYPE@_divide(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func)) | ||
|
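The aliasing in the hunk above picks a 32-bit or 64-bit libdivide backend per integer type at preprocessing time. A minimal sketch of the same pattern follows, assuming hypothetical stand-in functions (`demo_s32_do`, `demo_s64_do`) in place of the real libdivide calls and using `SHRT_MAX`/`LONG_MAX` in place of NumPy's `NPY_BITSOF_@TYPE@` constants. It is an illustration of the technique, not the PR's code.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-ins for libdivide_s32_do and libdivide_s64_do.
 * The real functions take a precomputed divider struct, not a raw divisor. */
static inline int32_t demo_s32_do(int32_t n, int32_t d) { return n / d; }
static inline int64_t demo_s64_do(int64_t n, int64_t d) { return n / d; }

/* Alias a per-type name to the narrowest backend that can hold the type.
 * SHRT_MAX and LONG_MAX play the role of NPY_BITSOF_@TYPE@ here. */
#if SHRT_MAX <= INT32_MAX
#define demo_short_do demo_s32_do
#else
#define demo_short_do demo_s64_do
#endif

#if LONG_MAX <= INT32_MAX
#define demo_long_do demo_s32_do
#else
#define demo_long_do demo_s64_do
#endif

/* Callers use the per-type alias and never name the backend directly. */
short demo_divide_short(short n, short d)
{
    assert(d != 0);
    return (short)demo_short_do(n, d);
}

long demo_divide_long(long n, long d)
{
    assert(d != 0);
    return (long)demo_long_do(n, d);
}
```

Because the choice is made by the preprocessor, `long` resolves to the 32-bit backend on platforms where it is 32 bits wide and to the 64-bit backend elsewhere, without a hand-maintained table like the removed `#div` line.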
@@ -857,7 +869,7 @@ NPY_NO_EXPORT void | |
const @type@ in2 = *(@type@ *)ip2; | ||
|
||
/* Creating a divisor of 0 is treated as an error by libdivide */ | ||
- struct libdivide_@div@_t fast_d = in2 ? libdivide_@div@_gen(in2) : (struct libdivide_@div@_t){0}; | ||
+ struct libdivide_@type@_t fast_d = in2 ? libdivide_@type@_gen(in2) : (struct libdivide_@type@_t){0}; | ||
BINARY_LOOP_FIXED { | ||
const @type@ in1 = *(@type@ *)ip1; | ||
/* | ||
|
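The hunk above builds the divider once, outside the loop, and substitutes a zeroed struct when the divisor is 0 because libdivide treats generating a zero divisor as an error. A rough sketch of that shape, assuming a hypothetical `demo_div_t` stand-in for the libdivide struct and plain `/` in place of the precomputed fast path:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct libdivide_s32_t. */
struct demo_div_t { int32_t divisor; };

/* Stand-in for libdivide_s32_gen: a zero divisor is rejected. */
static inline struct demo_div_t demo_gen(int32_t d)
{
    assert(d != 0);
    return (struct demo_div_t){d};
}

/* Stand-in for libdivide_s32_do: truncating division. */
static inline int32_t demo_do(int32_t n, const struct demo_div_t *den)
{
    return n / den->divisor;
}

/* Divide every element by one scalar divisor.
 * Writes 0 for every element when d == 0.
 * The INT32_MIN / -1 overflow case is NOT handled in this sketch. */
void demo_divide_by_scalar(const int32_t *in, int32_t *out, size_t n, int32_t d)
{
    /* Built once; the zeroed struct is never passed to demo_do. */
    struct demo_div_t fast_d = d ? demo_gen(d) : (struct demo_div_t){0};

    for (size_t i = 0; i < n; i++) {
        out[i] = (d == 0) ? 0 : demo_do(in[i], &fast_d);
    }
}
```

The point of the guard is only to keep the generator from seeing 0; the loop body still has to branch on the zero divisor before using the divider.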
@@ -872,10 +884,10 @@ NPY_NO_EXPORT void | |
*((@type@ *)op1) = 0; | ||
} | ||
else if (((in1 > 0) != (in2 > 0)) && (in1 % in2 != 0)) { | ||
Can

I honestly think we can avoid this

So I still don't know how removing the

Oh, interesting. @ganesh-k13 two things: First make sure you are dividing a positive by a negative number (or vice versa), otherwise this is not hit at all. Second, was the timing difference with libdivide? I guess it might be the compiler is smart enough to optimize the modulo away, but I would be surprised if it is smart enough when libdivide is being used?

They seem to have not done it yet: ridiculousfish/libdivide#9

All this changes is subtract one for rounding purposes. Now unless there is some edge case again, I think you can just do without the subtract, and then move the if to later, so that

You were right about not hitting the case, @seberg, seems like in the profile script I forgot to invert the signs. Above method seems to work, few edge cases to iron out (like <= 0, etc), will try them.

I found three edge cases:
Let me know if any more are there. |
||
- *((@type@ *)op1) = libdivide_@div@_do(in1, &fast_d) - 1; | ||
+ *((@type@ *)op1) = libdivide_@type@_do(in1, &fast_d) - 1; | ||
} | ||
else { | ||
- *((@type@ *)op1) = libdivide_@div@_do(in1, &fast_d); | ||
+ *((@type@ *)op1) = libdivide_@type@_do(in1, &fast_d); | ||
} | ||
} | ||
} | ||
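The branch in this hunk turns a truncating quotient into a floor quotient by subtracting 1 when the operands have opposite signs and the division is inexact, and the review thread asks whether the `in1 % in2` test can be avoided. One possible way to sketch that idea, which is an assumption here and not necessarily what the PR settled on, is to compute the quotient first and recover the remainder with a multiply. `demo_floor_div` is a hypothetical name, and plain `/` stands in for `libdivide_@type@_do`.

```c
#include <assert.h>
#include <stdint.h>

/* Floor division built on a single truncating quotient.
 * Preconditions (asserted, not handled): d != 0 and not INT32_MIN / -1. */
int32_t demo_floor_div(int32_t n, int32_t d)
{
    assert(d != 0);
    assert(!(n == INT32_MIN && d == -1));

    int32_t q = n / d;      /* stand-in for the libdivide "do" call */
    int32_t r = n - q * d;  /* remainder via multiply, no second division */

    /* Truncation rounds toward zero; floor rounds down. They differ only
     * when the result is negative and inexact. */
    if (r != 0 && ((n > 0) != (d > 0))) {
        q -= 1;
    }
    return q;
}
```

The product `q * d` cannot overflow because its magnitude never exceeds that of `n`. Whether this is faster than the modulo depends on the compiler and on how the quotient is produced, so it would need the same kind of mixed-sign benchmark the thread mentions.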
|