Description
Context: when returning indices and/or counts of array elements, the `unique_*()` APIs may have to promote the data type of the returned arrays (currently specified as the default integer data type) depending on the situation, so they should have used the "default array index data type" instead.
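For concreteness, here is what such return values look like in NumPy (used purely as an illustration; `np.unique`'s keyword-based API is NumPy's own, not the standard's `unique_*()` family):

```python
import numpy as np

x = np.asarray([3, 1, 2, 2, 3, 3])

# Indices and counts returned alongside the unique values: NumPy sizes
# them with an index-sized integer (intp), which is not necessarily the
# same as the library's default integer dtype.
values, indices, counts = np.unique(x, return_index=True, return_counts=True)
print(indices.dtype, counts.dtype)  # e.g. int64 on a 64-bit platform
```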
IIUC, the term "default array index data type" is still referred to by some functions, such as `argsort()`, but:
- it was never properly defined (originally added in Specify output array data types #57)
- I could be wrong, but the discussion around its behavior (specifically, when to promote) is still missing
- it was removed from the "Data Types" section in Update formatting and organization of dtype guidance #206 and has become a lingering term
Copying @kgryte from #317 (comment):
> Originally, when writing the `unique` specification, the output dtype was the "default index data type". I wonder if we need to revive that distinction. Namely, that an array library should have three default data types:
>
> - floating-point data type
> - integer data type
> - index data type
>
> Here, having a default index data type makes sense, as counts should align accordingly (i.e., a count should never exceed the maximum array index).
>
> Furthermore, while it may often be the case that indices will have the same dtype as the default integer dtype, this need not be the case. For example, indices may be `int64`, while the default integer dtype could be `int32` due to better target hardware support (e.g., GPUs).
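As a purely illustrative sketch of that distinction (all names below are hypothetical and not defined by the standard or any particular library), a library with these three defaults might size the counts of a `unique_counts()`-style function like this:

```python
from typing import NamedTuple

import numpy as np

# Hypothetical defaults for a library targeting 32-bit-friendly hardware
# such as GPUs; these names are illustrative only.
DEFAULT_FLOAT_DTYPE = np.float32
DEFAULT_INTEGER_DTYPE = np.int32   # e.g. used for Python int literals
DEFAULT_INDEX_DTYPE = np.int64     # must be able to index the largest array


class UniqueCountsResult(NamedTuple):
    values: np.ndarray
    counts: np.ndarray


def unique_counts(x: np.ndarray) -> UniqueCountsResult:
    """Sketch of a unique_counts() that sizes counts to the index dtype.

    A count can never exceed the maximum array index, so the default
    index dtype is always wide enough and no promotion is ever needed,
    even though it differs from the default integer dtype above.
    """
    values, counts = np.unique(x, return_counts=True)
    return UniqueCountsResult(values, counts.astype(DEFAULT_INDEX_DTYPE))
```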