MAINT: restore auto-vectorization of inplace operations by juliantaylor · Pull Request #8852 · numpy/numpy · GitHub

MAINT: restore auto-vectorization of inplace operations #8852


Merged: 1 commit, Mar 27, 2017


juliantaylor (Contributor) commented Mar 27, 2017

GCC 6/7 lost the ability to vectorize in-place operations with our
current hinting. This makes in-place operations slower than
out-of-place operations, which is bad, especially as we now automatically
avoid temporaries.
The issue has been filed as GCC PR80198.

Luckily, GCC also has a no-loop-dependence pragma (ivdep), which we can use to
enforce vectorization in the in-place code path.
In the in-place scalar path, an extra code hint is sufficient.
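
To illustrate the pragma approach described above, here is a minimal sketch, not the code from this PR. The function name `add_inplace` is hypothetical; the only thing taken from GCC itself is `#pragma GCC ivdep`, which tells the compiler to assume there are no loop-carried dependencies in the loop that follows. That assumption is only valid if the caller has already checked that the output either aliases the input exactly or does not overlap it at all.

```c
#include <stddef.h>

/*
 * Hypothetical in-place add, for illustration only (not NumPy's loop macro).
 * Precondition: io and in are either the same pointer or do not overlap.
 * Under that precondition each iteration touches only element i, so there
 * is no dependence between iterations.
 */
static void add_inplace(double *io, const double *in, size_t n)
{
    /* Only GCC understands this spelling; other compilers skip the hint. */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC ivdep
#endif
    for (size_t i = 0; i < n; i++) {
        io[i] = io[i] + in[i];
    }
}
```

Whether the loop is actually vectorized can be checked with GCC's `-fopt-info-vec` report at `-O3`. The pragma is a promise by the programmer, not a request the compiler verifies, so using it on partially overlapping buffers can silently produce wrong results.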

* unfortunately gcc 6/7 regressed and we need to give it additional hints to
* vectorize inplace operations (PR80198)
* must only be used after op1 == ip1 or ip2 has been checked
* TODO: using ivdep might allow other compilers to vectorize too
Contributor Author

If someone cares about other compilers, this type of pragma can likely be used there too. Last I checked, clang could not vectorize this code; with the pragma it might be able to, but that needs testing.
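
As a rough sketch of what such a portable hint could look like (an assumption, not something this PR implements), the macro below selects a per-compiler pragma through `_Pragma`. The macro name `HYPOTHETICAL_IVDEP` and the function `scale_inplace` are made up for illustration. `#pragma clang loop vectorize(assume_safety)` is clang's documented way to assert that a loop is safe to vectorize; whether it is enough to make clang vectorize NumPy's loops is untested here.

```c
#include <stddef.h>

/* Hypothetical wrapper macro: pick a "no loop dependence" hint per compiler. */
#if defined(__clang__)
#define HYPOTHETICAL_IVDEP _Pragma("clang loop vectorize(assume_safety)")
#elif defined(__GNUC__)
#define HYPOTHETICAL_IVDEP _Pragma("GCC ivdep")
#else
#define HYPOTHETICAL_IVDEP /* no hint available; loop stays correct */
#endif

/* Illustrative in-place scalar operation: multiply every element by factor. */
static void scale_inplace(float *io, float factor, size_t n)
{
    HYPOTHETICAL_IVDEP
    for (size_t i = 0; i < n; i++) {
        io[i] = io[i] * factor;
    }
}
```

On compilers that match neither branch the macro expands to nothing, so the loop keeps its meaning and only the optimization hint is lost.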

charris (Member) commented Mar 27, 2017

Thanks Julian.

@juliantaylor juliantaylor deleted the gcc-inplace-fix branch March 27, 2017 18:08