Closed
Description
I noticed that in many of my codes, seemingly harmless lines like
numpy.add.at(target, idx, vals)
take a large share of the runtime. I investigated and found that one gets a speed-up of a factor of 40 (!) by simply moving the critical code to C++.
(Not sure what's actually done in numpy.ufunc
.) I put this into a very simple module, https://github.com/nschloe/fastfunc, so feel free to take some code out of there.