-
-
Notifications
You must be signed in to change notification settings - Fork 56.2k
GAPI Fluid: Dynamic dispatching for Split3 kernel #21441
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GAPI Fluid: Dynamic dispatching for Split3 kernel #21441
Conversation
@@ -207,6 +207,18 @@ ABSDIFFC_SIMD(float) | |||
|
|||
#undef ABSDIFFC_SIMD | |||
|
|||
#define SPLIT3_SIMD(SRC, DST) \ | |||
int split3_simd(const SRC in[], DST out1[], DST out2[], \ | |||
DST out3[], const int width) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes no sense to make this function a macro for this kernel. Make this function as a normal function.
} | ||
#endif | ||
#if CV_SIMD | ||
w = split3_simd(in, out1, out2, out3, width); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove empty line.
//------------------------- | ||
|
||
#define SPLIT3_SIMD(SRC, DST) \ | ||
int split3_simd(const SRC in[], DST out1[], DST out2[], \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes no sense to make this function as a macro for this kernel. Make this function as a normal function with uchar type for all parameters.
@@ -184,6 +184,14 @@ ABSDIFFC_SIMD(float) | |||
|
|||
#undef ABSDIFFC_SIMD | |||
|
|||
#define SPLIT3_SIMD(DST1, DST2, DST3, SRC) \ | |||
int split3_simd(const SRC in[], DST1 out1[], DST2 out2[], \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And there is only one DST type. Three different DST types are redundant.
This construction must be forward declaration for this function. So this macro would be the same as this
NOTE:
It makes no sense to make this function as a macro for this kernel. Please make this function as a usual function.
int x = 0; \ | ||
for (; x <= width - nlanes; x += nlanes) \ | ||
{ \ | ||
v_uint8 a, b, c; \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to move creation and initialization of these variables from the loop.
2d19f3f
to
0b8d606
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
{ | ||
constexpr int nlanes = v_uint8::nlanes; | ||
int x = 0; | ||
v_uint8 a, b, c; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
v_uint8 a, b, c;
Why is it outside of the loop body?
Keep variables declaration near their usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it necessary to create vectors at each iteration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is necessary to keep vector values between iterations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is necessary to keep vector values between iterations?
@alalek
There is no necessity to do setzero()
for each iteration of the loop since they are initialized by the v_deinterleave()
function before calculations. So creation and first initialization with zeroes can be moved from the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where do you see "setzero()" here?
Good rule of thumb for local variables to declare them in place where they are used.
No need to cross for loop
boundary or any other, especially for SIMD register wrappers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
8bb494b
to
5e89b9a
Compare
Split3 SIMD.xlsx
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.