default data-type allocation for arrays could be misleading and harmful #18624
Comments
I think this falls into the category of things that will never change. Or, perhaps better described, this is an intentional feature.
I don't think we can change the default data-type. I would go as far as to argue that it's convenient that the default is integer (when all inputs are integers). The second point seems possible to me (a warning when the assignment is unsafe). A bit similar to the …
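A minimal sketch of the behavior being discussed (variable names are illustrative, not taken from the original report):

```python
import numpy as np

# When every input is a Python int, the inferred dtype is an integer type.
n = np.array([1, 2, 3])
print(n.dtype)   # int64 on most platforms

# Assigning a float into the integer array silently truncates it; this is
# the kind of "unsafe" assignment a warning could flag.
n[0] = 0.5
print(n)         # [0 2 3]
```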
I am not sure about this. I believe NumPy converts your new data to match the array's defined dtype (NOT a default dtype). Converting only a single value to fit the dtype is certainly much faster than converting the data in the whole array, especially when it is a large array, e.g. 50000 rows × 1000 columns. Moreover, this behavior ensures that your data matches the other values already in the array.
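To illustrate the point above (a sketch, with illustrative names): only the assigned value is cast to the array's existing dtype; the array itself is never reallocated or converted.

```python
import numpy as np

a = np.zeros((50000, 1000), dtype=np.int64)  # large integer array
a[0, 0] = 3.7        # only this one value is cast (truncated to 3)
print(a.dtype)       # still int64 -- the whole array is never converted
print(a[0, 0])       # 3
```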
We get the desired output if we use a Python list instead of a NumPy array for defining `n`.
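For comparison, a small sketch of the two cases (assuming `n` holds the integers from the report):

```python
import numpy as np

n_arr = np.array([1, 2, 3])   # integer dtype is inferred
n_arr[0] = 0.5
print(n_arr)                  # [0 2 3] -- the float was truncated

n_list = [1, 2, 3]            # plain Python list
n_list[0] = 0.5
print(n_list)                 # [0.5, 2, 3] -- the float is kept
```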
@hamed4343 yeah, there have even been code comments for years around this type of thing, saying things like "We really should be using same-kind casting here" (which would give you an error in most of these cases when doing things like …
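One way to see what same-kind casting would do is `np.copyto`, which takes an explicit `casting` argument (a sketch; the original comment is truncated, so the exact operation it referred to is unknown):

```python
import numpy as np

a = np.array([1, 2, 3])                 # integer dtype

# Default assignment uses unsafe casting and silently stores 0:
a[0] = 0.5

# Under same-kind casting the float -> int assignment is rejected:
np.copyto(a, 0.5, casting='same_kind')  # raises TypeError
```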
I think there is. I've seen (and probably written) code that performs some kind of rounding (usually floor) and then stores a float in an integer array.
When typing support is more complete, so that it is common to tell type checkers that `x` is an integer array, and this becomes a type-checking error, then it will probably be fixed. Until then it is not very compelling.
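The floor-then-store pattern mentioned above, as a sketch (names are illustrative):

```python
import numpy as np

counts = np.zeros(5, dtype=np.int64)           # integer array
values = np.array([1.2, 2.7, 3.5, 4.9, 5.1])

# Deliberate pattern: floor first, then store into the integer array.
counts[:] = np.floor(values)                   # the float -> int cast is intended here
print(counts)                                  # [1 2 3 4 5]
```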
Yeah, I am not sure it's feasible considering the downstream impact. And right now there are more important things that will also annoy downstream, and we try to avoid doing too much of that at once. One way to tune it down – and a cool feature – might be to make it an "unsafe-but-warn-on-loss" casting level (is float64 -> float32 lossy there?). That would not really be a casting level though, since it is a parameter for the actual cast loop! (It has to inspect the values as it goes; casting safety is defined only based on the dtypes themselves – or should be. Unless you limit this to some scalars only, at least.) That would be more like adding a parameter to …
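A value-inspecting check of the kind described (warn only when the cast actually loses information) might look roughly like this. This is a hypothetical sketch of the idea, not an existing NumPy API:

```python
import numpy as np
import warnings

def assign_warn_on_loss(arr, index, value):
    """Hypothetical helper: assign, but warn when the cast changes the value."""
    cast_back = np.asarray(value).astype(arr.dtype)
    if not np.array_equal(np.asarray(value), cast_back):
        warnings.warn(f"lossy cast: {value!r} stored as {cast_back!r}")
    arr[index] = value

a = np.array([1, 2, 3])
assign_warn_on_loss(a, 0, 2.0)   # exact cast: no warning, stores 2
assign_warn_on_loss(a, 1, 0.5)   # lossy cast: warns, stores 0
```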
Thanks for the report. It is very unlikely we will touch the way this currently works; we could change how assignment does things, but that is a duplicate of e.g. gh-8733.
Reproducing code example:
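The original snippet was not preserved in this copy of the issue; based on the discussion above, the reproduction presumably looked something like this (names are illustrative):

```python
import numpy as np

n = np.array([1, 2, 3])   # dtype is inferred as an integer type
n[0] = 0.5                # silently truncated to 0; no warning or error
print(n)                  # [0 2 3]
```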
Error message:
NumPy/Python version information: