8000 Remove unnessary numpy lookups for int, float, complex, bool and str by MSeifert04 · Pull Request #6603 · astropy/astropy · GitHub
[go: up one dir, main page]

Skip to content

Remove unnessary numpy lookups for int, float, complex, bool and str #6603

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Sep 28, 2017
Merged

Remove unnessary numpy lookups for int, float, complex, bool and str #6603

merged 12 commits into from
Sep 28, 2017

Conversation

MSeifert04
Copy link
Contributor
@MSeifert04 MSeifert04 commented Sep 24, 2017

The np.int, np.float, etc. are just aliases for the Python builtins so we really shouldn't be using them. Note that NumPy recently did the same (based on this issue).

The aliases are:

  • np.object
  • np.str
  • np.int
  • np.float
  • np.complex
  • np.bool

But not the ones that end with an underscore (np.float_) or a number (np.int32) or np.intp and np.intc which are actually different.

Note that the PR probably fixed "too much". I just replaced the np. in front of these or removed it altogether where it didn't make sense. However, in some cases it could just be that the author wanted to use np.*_ with the trailing underscore. So this probably requires a careful review.

Checklist:

  • convolution
  • coordinates
  • cosmology
  • io.ascii
  • io.fits
  • io.misc
  • modeling
  • nddata
  • stats
  • table
  • time
  • units
  • utils
  • wcs

@MSeifert04 MSeifert04 added this to the v2.0.3 milestone Sep 24, 2017
@astropy-bot
Copy link
astropy-bot bot commented Sep 24, 2017

Hi there @MSeifert04 👋 - thanks for the pull request! I'm just a friendly 🤖 that checks for issues related to the changelog and making sure that this pull request is milestoned and labelled correctly. This is mainly intended for the maintainers, so if you are not a maintainer you can ignore this, and a maintainer will let you know if any action is required on your part 😃.

Everything looks good from my point of view! 👍

If there are any issues with this message, please report them here

@mhvk
Copy link
Contributor
mhvk commented Sep 24, 2017

Now at the right PR: @MSeifert04 - I didn't look at everything, but time and units are definitely OK.

@MSeifert04 MSeifert04 modified the milestones: v2.0.3, v3.0.0 Sep 24, 2017
@@ -32,7 +32,7 @@
MexicanHat1DKernel, Tophat2DKernel, AiryDisk2DKernel, Ring2DKernel]


NUMS = [1, 1., np.float(1.), np.float32(1.), np.float64(1.)]
NUMS = [1, 1., np.float32(1.), np.float64(1.)]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one of the cases where I wasn't sure what was intended. Given that np.float(1.) is identical to 1. I simply removed it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

@@ -229,7 +229,7 @@ def test_item_access_type(self, Column):
Tests for #3095, which forces integer item access to always return a plain
ndarray or MaskedArray, even in the case of a multi-dim column.
"""
integer_types = (int, np.int)
integer_types = [int]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one was a bit weird. Was it indentded to test np.int_?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked back at #3095, and, yes, this is definitely meant to test all possible integer types, so replace this with np.int_. (The actual code looks for subclasses of int and np.integer)

@pllim
Copy link
Member
pllim commented Sep 25, 2017

This touches too many sub-packages of various functionalities. I recommend that we keep a check list of sub-packages being touched and whether each is approved by its lead/deputy maintainer; Just to be on the safe side.

@MSeifert04
Copy link
Contributor Author
MSeifert04 commented Sep 25, 2017

Actually the changes don't change anything, except that they removes some obvious redundancy. But a checklist is probably in order. 😄

@pllim
Copy link
Member
pllim commented Sep 25, 2017

Just want to ensure due diligence to ensure that the redundancy wasn't added on purpose. But if I am not making sense, then feel free to ignore. 😅

@MSeifert04
Copy link
Contributor Author

Well, given that np.int is int it probably doesn't make sense to have redundancy (test for int and np.int). But it's always possible that the original author wanted to test np.int_ which would be different.

However it's impossible to be really sure on that. But I can re-add the redundancy so the PR doesn't change anything except for the np lookup.

@@ -386,7 +386,7 @@ def default(self, obj):
import numpy as np
if isinstance(obj, (np.ndarray, np.number)):
return obj.tolist()
elif isinstance(obj, (complex, np.complex)):
elif isinstance(obj, (complex, np.complexfloating)):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if that check is correct or if I should rather have removed the second type.

Copy link
Member
@pllim pllim Sep 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, dear. I added this in #552 to support Cone Search. I'll have to find time to dig through the 181 comments to see why I wrote this the way I did...

Copy link
Contributor Author
@MSeifert04 MSeifert04 Sep 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, on second thought I should just remove the second type. Currently it was only checking for complex, complex. But np.complexfloating would be already catched by the np.number isinstance check (np.complexfloating is a subclass of np.number) above so it's likewise a no-op.

@@ -17,7 +17,6 @@
('U4', STRING_TYPE_NAMES[(True, 'U')] + '4'),
(np.void, 'void'),
(np.int32, 'int32'),
(np.bool, 'bool'),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was identical to the next line so I removed it.

MSeifert04
@@ -793,7 +793,7 @@ class Column(BaseColumn):
Examples include:

- Python non-string type (float, int, bool)
- Numpy non-string type (e.g. np.float32, np.int64, np.bool)
- Numpy non-string type (e.g. np.float32, np.int64, bool)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this should've been np.bool_ given that it refers to the NumPy type.

@MSeifert04
Copy link
Contributor Author

I think I commented all non-trivial changes made here. The other changes were rather trivial: dtype specifiers for arrays and conversions. Maybe that makes reviewing it easier. :)

Copy link
Contributor
@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MSeifert04 - coordinates is all OK, and for cosmology most of the dtype can be removed (those in tests are OK).

@@ -169,12 +169,12 @@ def __init__(self, H0, Om0, Ode0, Tcmb0=0, Neff=3.04,
self.name = name

# Tcmb may have units
self._Tcmb0 = u.Quantity(Tcmb0, unit=u.K, dtype=np.float)
self._Tcmb0 = u.Quantity(Tcmb0, unit=u.K, dtype=float)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be removed altogether (the default is float64, which is what one would want here).

if not self._Tcmb0.isscalar:
raise ValueError("Tcmb0 is a non-scalar quantity")

# Hubble parameter at z=0, km/s/Mpc
self._H0 = u.Quantity(H0, unit=u.km / u.s / u.Mpc, dtype=np.float)
self._H0 = u.Quantity(H0, unit=u.km / u.s / u.Mpc, dtype=float)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here dtype can be removed as well.

@@ -371,15 +371,15 @@ def m_nu(self):
if not self._massivenu:
# Only massless
return u.Quantity(np.zeros(self._nmasslessnu), u.eV,
dtype=np.float)
dtype=float)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here.

if self._nmasslessnu == 0:
# Only massive
return u.Quantity(self._massivenu_mass, u.eV,
dtype=np.float)
dtype=float)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here...

F438
# A mix -- the most complicated case
numass = np.append(np.zeros(self._nmasslessnu),
self._massivenu_mass.value)
return u.Quantity(numass, u.eV, dtype=np.float)
return u.Quantity(numass, u.eV, dtype=float)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here again.

@@ -671,7 +671,7 @@ def Onu(self, z):
if isiterable(z):
z = np.asarray(z)
if self._Onu0 == 0:
return np.zeros(np.asanyarray(z).shape, dtype=np.float)
return np.zeros(np.asanyarray(z).shape, dtype=float)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove dtype here too.

@@ -771,7 +771,7 @@ def nu_relative_density(self, z):
return prefac * self._Neff
else:
return prefac * self._Neff *\
np.ones(np.asanyarray(z).shape, dtype=np.float)
np.ones(np.asanyarray(z).shape, dtype=float)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For ones, float is also the default.

@@ -1592,7 +1592,7 @@ def w(self, z):
if np.isscalar(z):
return -1.0
else:
return -1.0 * np.ones(np.asanyarray(z).shape, dtype=np.float)
return -1.0 * np.ones(np.asanyarray(z).shape, dtype=float)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same.

@@ -1616,7 +1616,7 @@ def de_density_scale(self, z):
if np.isscalar(z):
return 1.
else:
return np.ones(np.asanyarray(z).shape, dtype=np.float)
return np.ones(np.asanyarray(z).shape, dtype=float)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same

@@ -1938,7 +1938,7 @@ def w(self, z):
if np.isscalar(z):
return self._w0
else:
return self._w0 * np.ones(np.asanyarray(z).shape, dtype=np.float)
return self._w0 * np.ones(np.asanyarray(z).shape, dtype=float)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same

@MSeifert04
Copy link
Contributor Author

@taldcroft Could you have a look?

Especially the table test which checked int and np.int was a bit strange.

Copy link
Contributor
@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked through all remaining stuff and think it is OK modulo the comments for table and io.misc. I think this is ready to go in with those taken into account.

@@ -229,7 +229,7 @@ def test_item_access_type(self, Column):
Tests for #3095, which forces integer item access to always return a plain
ndarray or MaskedArray, even in the case of a multi-dim column.
"""
integer_types = (int, np.int)
integer_types = [int]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked back at #3095, and, yes, this is definitely meant to test all possible integer types, so replace this with np.int_. (The actual code looks for subclasses of int and np.integer)

@@ -599,7 +599,7 @@ def _check_interpolate_indices(self, indices_orig, indices_clipped, max_input_mj

# See explanation in _refresh_table_as_needed for these conditions
auto_max_age = (conf.auto_max_age if conf.auto_max_age is not None
F987 else np.finfo(np.float).max)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is OK.

@@ -32,7 +32,7 @@
MexicanHat1DKernel, Tophat2DKernel, AiryDisk2DKernel, Ring2DKernel]


NUMS = [1, 1., np.float(1.), np.float32(1.), np.float64(1.)]
NUMS = [1, 1., np.float32(1.), np.float64(1.)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

@@ -23,11 +23,11 @@

ALL_DTYPES = [np.uint8, np.uint16, np.uint32, np.uint64, np.int8,
np.int16, np.int32, np.int64, np.float32, np.float64,
np.bool, '|S3']
bool, '|S3']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't really matter, but I think this more logically is np.bool_ (or is alias, np.bool8)

np.float(2.0), np.float64(),
np.complex(3, 4), np.complex_(3 + 4j),
2.0, np.float64(),
complex(3, 4), np.complex_(3 + 4j),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just 3 + 4j?

@@ -262,15 +262,15 @@ def ignore_aliases(self, data):
# Numpy dtypes
AstropyDumper.add_representer(np.bool_,
yaml.representer.SafeRepresenter.represent_bool)
for np_type in [np.int_, np.int, np.intc, np.intp, np.int8, np.int16, np.int32,
for np_type in [np.int_, int, np.intc, np.intp, np.int8, np.int16, np.int32,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1241

I think here int can be removed, since yaml already has it built in.

F438
np.int64, np.uint8, np.uint16, np.uint32, np.uint64]:
AstropyDumper.add_representer(np_type,
yaml.representer.SafeRepresenter.represent_int)
for np_type in [np.float_, np.float, np.float16, np.float32, np.float64,
for np_type in [np.float_, float, np.float16, np.float32, np.float64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for float

np.longdouble]:
AstropyDumper.add_representer(np_type,
yaml.representer.SafeRepresenter.represent_float)
for np_type in [np.complex_, np.complex, np.complex64, np.complex128]:
for np_type in [np.complex_, complex, np.complex64, np.complex128]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But not for complex since this is a custom representer.

@taldcroft
Copy link
Member

Well I admit I did not realize that int and np.int (etc) were the same thing. Is there a real performance driver for doing this PR? I imagine in most cases in the code these lookups are not in an inner loop and the extra 30 ns is not actually affecting anything. In that case my general attitude would be to leave well enough alone since testing is not always perfect.

@MSeifert04
Copy link
Contributor Author

Well I admit I did not realize that int and np.int (etc) were the same thing.

Yes, that's the main driver for the PR. It's not actually about the lookup cost or if NumPy will ever deprecate them (probably not). However as can be seen by the existing "complicated" cases it has been leading to unintended behavior, almost "problems". For example testing the wrong thing, overwriting the YAML writer for plain ints and such like.

@mhvk
Copy link
Contributor
mhvk commented Sep 28, 2017

Agreed with @MSeifert04 here - I didn't know either that np.float is float - and I think it is worthwhile just for our code not reinforcing that misconception! (And this is the right time to do it, with nearly every file touched anyway for the python2 removal...)

@taldcroft
Copy link
Member

OK, fair enough.

@mhvk
Copy link
Contributor
mhvk commented Sep 28, 2017

OK, this seems all done, so merging. Thanks, @MSeifert04!

@mhvk mhvk merged commit 9e64eb0 into astropy:master Sep 28, 2017
@MSeifert04
Copy link
Contributor Author

Thanks everyone for the reviews and comments. 👍

@MSeifert04 MSeifert04 deleted the unnessary_numpy_lookup branch September 28, 2017 23:17
maxnoe pushed a commit to maxnoe/astropy that referenced this pull request Jan 6, 2020
As discussed in astropy#6603 (and numpy/numpy#6103) it's
unnecessary and misleading to use the Python types from the NumPy namespace.

With this PR this uses the default NumPy dtype corresponding to these
Python types.
ntessore pushed a commit to glass-dev/cosmology that referenced this pull request Dec 7, 2020
…py_lookup

Remove unnessary numpy lookups for int, float, complex, bool and str
ntessore pushed a commit to glass-dev/cosmology that referenced this pull request Dec 8, 2020
…py_lookup

Remove unnessary numpy lookups for int, float, complex, bool and str
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants
0