unique and NaN entries (Trac #1514) · Issue #2111 · numpy/numpy · GitHub
unique and NaN entries (Trac #1514) #2111
Closed
@thouis

Description

Original ticket http://projects.scipy.org/numpy/ticket/1514 on 2010-06-18 by trac user rspringuel, assigned to unknown.

When unique operates on an array with multiple NaN entries, its return value includes one NaN for each NaN entry in the original array.

Examples:
from numpy import nan, random, unique #names the session below relies on

a = random.randint(5,size=100).astype(float)
a[12] = nan #add a single nan entry
unique(a)
array([ 0., 1., 2., 3., 4., NaN])
a[20] = nan #add a second
unique(a)
array([ 0., 1., 2., 3., 4., NaN, NaN])
a[13] = nan #and a third
unique(a)
array([ 0., 1., 2., 3., 4., NaN, NaN, NaN])

This is probably because x == y evaluates to False when both x and y are NaN. unique needs "or (isnan(x) and isnan(y))" added to the conditional that checks whether a value is already among the identified values. I don't know where unique lives in numpy and couldn't find it when I went looking, so I can't make the change myself (or even be sure what the exact syntax of the conditional should be).
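The comparison behavior described above can be checked directly. The snippet below is only a sketch of the proposed conditional, not the code inside numpy's unique; the helper name same_value is hypothetical and introduced here for illustration.

```python
import numpy as np

x = float("nan")

# NaN never compares equal to anything, including itself.
nan_equals_itself = (x == x)

def same_value(x, y):
    """Hypothetical helper: equality that treats two NaNs as the same value."""
    return bool(x == y or (np.isnan(x) and np.isnan(y)))

print(nan_equals_itself)   # the plain comparison
print(same_value(x, x))    # the NaN-aware comparison
```

With a test like same_value in place of a plain ==, a second NaN would be recognized as already seen instead of being treated as a new value.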

Also, the following function can be used to patch over the behavior.

import numpy

def nanunique(x):
    a = numpy.unique(x)
    r = []
    for i in a:
        # skip values already kept; NaN != NaN, so NaN needs its own check
        if i in r or (numpy.isnan(i) and numpy.any(numpy.isnan(r))):
            continue
        else:
            r.append(i)
    return numpy.array(r)
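The same workaround can probably be written without the Python-level loop. The following is a minimal sketch under the assumption that the input can be converted to a flat float array; the name nanunique_vectorized is hypothetical and this is not a fix inside numpy itself. It removes the NaN entries before calling numpy.unique, then appends a single NaN if any were present.

```python
import numpy as np

def nanunique_vectorized(x):
    """Hypothetical variant: unique values with all NaNs collapsed to one."""
    x = np.asarray(x, dtype=float).ravel()
    nan_mask = np.isnan(x)
    # unique is applied only to non-NaN values, where == behaves normally
    result = np.unique(x[~nan_mask])
    if nan_mask.any():
        # represent every NaN in the input by one trailing NaN
        result = np.append(result, np.nan)
    return result

r = nanunique_vectorized([1.0, np.nan, 2.0, np.nan, 1.0])
print(r)
```

Because the NaN entries never reach unique, the result does not depend on how unique compares NaN values.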
