ENH: change default empty array dtype, currently 'float64' #10405
Comments
The default data type is float64 for zeros, empty, et al. as well. It's also useful to keep in mind that the NumPy API is no longer as open to changes as it was 10 years ago when it was established, and small cosmetic changes (which are usually subjective) are more likely to be harmful than useful, in that they can break many existing codebases. |
I do not even believe there is a reasonable "minimal dtype". Even if a change were reasonably possible, I doubt it makes sense. Yes, this is sometimes annoying, but no, changing the default type is not the solution. Generally, I personally think the solution is: too bad, sometimes the user may have to put in the dtype explicitly by doing the array cast by hand. |
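For readers who want to see what "putting in the dtype by hand" looks like, here is a minimal sketch. The variable names are hypothetical and only serve the illustration; the point is that the dtype is stated when the array is built, or applied afterwards with a cast.

```python
import numpy as np

values = []  # hypothetical input that happens to be empty

# Without an explicit dtype, the empty array falls back to float64.
default_arr = np.array(values)

# Stating the dtype up front avoids the float64 default.
int_arr = np.array(values, dtype=np.int64)

# Equivalent fix after the fact: cast the existing array by hand.
cast_arr = default_arr.astype(np.int64)

print(default_arr.dtype, int_arr.dtype, cast_arr.dtype)
```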
Thank you very much for the comments. I agree with some and disagree with some other comments.
I presented one example where changing the dtype does fix a problem. The change is certainly not cosmetic.
Indeed, changing something can always lead to problems downstream, and that is particularly risky if there is no estimate whatsoever of how much could break. I would suggest that I make the change locally and build a handful of large numerical libraries against it (e.g., scipy, sympy, pandas, scikit-learn, and astropy). If too many downstream changes are required, I'd say let's forget about it. If no changes are needed, perhaps we could continue discussing. |
Maybe it is not just cosmetic, but I disagree with the solution. It has a bad smell for me, basically it seems like replacing one problem with another one, and yes that might fix many problems but it also will create weird new ones (of course those might be better, but they might also be more tricky?!). |
Thanks again for the comment. Let me first say that I'm not emotionally attached to the fix; it's just an inconsistency I noticed which I thought could be fixed easily enough. I can also understand that a package like numpy takes a rather conservative approach to software development. When rejecting, it certainly makes things clearer if a small counterexample is given, something along the lines of
That said, feel free to close. |
Well, I do not have a very concrete example, but if you sum up an empty array or use its dtype for anything specifically, you will then get an integer dtype (possibly a small integer). |
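To make the point about sums concrete, here is a small sketch comparing today's float64 default with an empty array that is explicitly given the int8 dtype, standing in for the proposed default. The exact integer width of the second result depends on the platform, so only its kind is checked.

```python
import numpy as np

# Current behaviour: an empty array is float64, so its sum is a float.
float_total = np.array([]).sum()

# Stand-in for the proposal: an explicitly int8 empty array.
# Its sum comes back as an integer rather than a float.
int_total = np.array([], dtype=np.int8).sum()

print(float_total.dtype)
print(np.issubdtype(int_total.dtype, np.integer))
```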
While I agree with the initial idea, that
# default is np.int8
>>> np.concatenate([[100], []]) + np.concatenate([[100], []])
np.array([-56], dtype=int8)
But, saying this, I still believe that if the default type were chosen to be NumPy's default integer type, this would be more reasonable to me. |
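The wraparound behind that -56 can be reproduced today with arrays that are explicitly int8. This sketch only shows the int8 arithmetic itself; it makes no claim about what concatenate would return under a changed default.

```python
import numpy as np

# 100 fits in int8 (range -128..127), but 100 + 100 = 200 does not.
a = np.array([100], dtype=np.int8)

# Array arithmetic stays in int8 and wraps around silently: 200 - 256 = -56.
total = a + a

print(total, total.dtype)
```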
I think the toy example could be made more explicit to solve this
Seems to solve that particular edge case at least :S if list concatenation is all that is needed, I would be wary of stating that |
very similar discussion happening here: #1586 |
Would it be worthwhile to have |
Pretty much just requires a merge: gh-16134 (although in the meantime a merge conflict happened). EDIT: Although, since that still requires casting, I am not sure it would help you much, it would still be a floating point array, so it is an unsafe cast to integer. That PR also adds |
(From #10135.)
In NumPy, empty arrays have the default data type numpy.float64. I'm questioning whether this is the right choice.
In principle, the default dtype choice for empty arrays is arbitrary since there is no data to work with. One good reason to choose a small data type is that operations with empty arrays should not needlessly augment the dtype. The current setting leads to rather surprising dtype changes like
This would not have happened with, e.g., numpy.array([]).dtype == numpy.int8.
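As an illustration of the kind of dtype change described above (a sketch under my own assumptions, not necessarily the snippet from the original report): concatenating an integer array with an empty list promotes the result to float64, while an empty array with a small integer dtype leaves the integer dtype alone.

```python
import numpy as np

a = np.array([1, 2, 3], dtype=np.int64)

# The empty list becomes a float64 array, so the result is promoted to float64.
with_default_empty = np.concatenate([a, []])

# An explicitly int8 empty array does not widen the result beyond int64.
with_int8_empty = np.concatenate([a, np.array([], dtype=np.int8)])

print(with_default_empty.dtype, with_int8_empty.dtype)
```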