BUG: np.digitize casts integers to float64

Which leads to:

# all numbers appear between their successor and predecessor, right?
>>> check = lambda x: np.digitize(x, [x - 1, x + 1]) == 1

>>> check(1)
True
>>> check(2**52)
True
>>> check(2**53)  # uh oh
False
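
The failure at 2**53 is consistent with float64's 53-bit significand: 2**53 + 1 has no exact float64 representation and rounds to 2**53, so after the cast the upper bin edge is equal to x itself. A minimal sketch of that rounding, using only Python's built-in float (assumed here to be an IEEE 754 double, as it is on common platforms):

```python
x = 2**53

# 2**53 - 1 still fits in the 53-bit significand, so the lower edge survives the cast.
lower_exact = float(x - 1) == x - 1

# 2**53 + 1 does not fit and rounds to 2**53, so the upper edge collapses onto x.
upper_collapsed = float(x + 1) == float(x)

# One power of two lower, both neighbours are still exactly representable.
y = 2**52
neighbours_exact = float(y - 1) == y - 1 and float(y + 1) == y + 1

print(lower_exact, upper_collapsed, neighbours_exact)
```

With the upper edge no longer strictly greater than x, x falls outside the half-open interval it was supposed to land in.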

The workaround is:

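import numpy as np

def digitize(x, bins, right=False):
    # Assumes bins is already sorted in increasing order; no monotonicity check is done.
    # np.searchsorted takes (bins, x), the reverse of np.digitize's (x, bins),
    # so the meaning of the side is reversed too.
    side = 'left' if right else 'right'
    return np.searchsorted(bins, x, side=side)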

The issue right now is that the monotonicity detection in digitize forces everything to be cast to float64. In almost all cases the user has probably already sorted their input, so this is not only pointless overhead but also the cause of harmful behavior.
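
As a quick check that going straight to np.searchsorted avoids the problem, here is a sketch that runs the same successor/predecessor test on explicit int64 inputs. The helper name digitize_sorted is hypothetical, and the sketch assumes the bins are sorted in increasing order and that both arguments share an integer dtype, so no float64 conversion is involved.

```python
import numpy as np

def digitize_sorted(x, bins, right=False):
    # Hypothetical helper: like np.digitize for increasing bins, but without
    # any monotonicity check. The caller is trusted to pass sorted bins.
    side = 'left' if right else 'right'
    return np.searchsorted(bins, x, side=side)

def check(x):
    # Keep everything in int64 so the comparison is done on exact integers.
    bins = np.array([x - 1, x + 1], dtype=np.int64)
    return digitize_sorted(np.int64(x), bins) == 1

results = [bool(check(v)) for v in (1, 2**52, 2**53, 2**62)]
print(results)
```

Under those assumptions every value lands in bin 1, including 2**53 and larger integers that float64 cannot distinguish from their neighbours.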
