Does anyone have any insights on getting 16-byte aligned memory (at least on x86/x86_64 systems)? I saw #1165 and #1166, but these don't seem to have gone anywhere.
I have implemented dSFMT as an alternative to the random-kit generator, as part of an attempt at a more general structure for RandomState, here: https://github.com/bashtage/ng-numpy-randomstate/ . Unfortunately I can't use -DHAVE_SSE2, since on occasion the allocated memory is not aligned.
The obvious solution is to over-allocate memory to ensure that the key array is 16-byte aligned, but this feels clumsy at best. If anyone has experience with this, it would be appreciated.
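For reference, the over-allocation approach mentioned above could look roughly like the sketch below. It is not taken from ng-numpy-randomstate or NumPy; the names `aligned_alloc_sketch` and `aligned_free_sketch` are hypothetical. It assumes the alignment is a power of two no smaller than `sizeof(void *)`, and it does not guard against overflow in the size computation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: return `size` bytes aligned to `alignment` by
   over-allocating with malloc and stashing the raw pointer in the
   bytes just before the aligned block. */
static void *aligned_alloc_sketch(size_t size, size_t alignment)
{
    /* Assumption: power-of-two alignment, at least pointer-sized. */
    assert(alignment >= sizeof(void *) && (alignment & (alignment - 1)) == 0);

    /* Extra room: alignment slack plus one slot for the raw pointer. */
    void *raw = malloc(size + alignment - 1 + sizeof(void *));
    if (raw == NULL)
        return NULL;

    /* Skip past the pointer slot, then round up to the alignment. */
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);

    /* Remember what malloc returned so it can be freed later. */
    ((void **)aligned)[-1] = raw;
    return (void *)aligned;
}

/* Must be used instead of free() for pointers from the helper above. */
static void aligned_free_sketch(void *ptr)
{
    if (ptr != NULL)
        free(((void **)ptr)[-1]);
}
```

A state buffer would then be obtained with `aligned_alloc_sketch(nbytes, 16)` and released with `aligned_free_sketch`. Where available, `posix_memalign` (POSIX) or `aligned_alloc` (C11) avoid the manual bookkeeping.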
Btw, on x86_64 systems allocated memory is always 16-byte aligned.
Also, modern Intel and AMD CPUs don't care much about alignment anymore; that ended with Haswell.
Thanks - I'm surprised I missed #5312 which is almost exactly what I need, and done better than my one-off-solution.
I think alignment still matters when interfacing with 3rd party code which uses SSE2 intrinsics that require alignment (short of rewriting SSE2 parts of the code).