unique and NaN entries (Trac #1514)

Original ticket http://projects.scipy.org/numpy/ticket/1514 on 2010-06-18 by trac user rspringuel, assigned to unknown.

When unique operates on an array with multiple NaN entries, the returned array includes one NaN for each NaN entry in the original array, instead of a single NaN.

Examples:
from numpy import * #assumed: the session uses NumPy names imported into the namespace

a = random.randint(5,size=100).astype(float)

a[12] = nan #add a single nan entry
unique(a)
array([ 0., 1., 2., 3., 4., NaN])
a[20] = nan #add a second
unique(a)
array([ 0., 1., 2., 3., 4., NaN, NaN])
a[13] = nan #and a third
unique(a)
array([ 0., 1., 2., 3., 4., NaN, NaN, NaN])

This is probably due to the fact that x == y evaluates to False if both x and y are NaN. unique needs to have "or (isnan(x) and isnan(y))" added to the conditional that checks whether a value is already among the identified values. I don't know where unique lives in numpy and couldn't find it when I went looking, so I can't make the change myself (or even be sure what the exact syntax of the conditional should be).
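To make the comparison issue concrete, here is a minimal sketch of the conditional described above, written with the standard library only. The helper name same_value is hypothetical and is not part of numpy; how unique actually compares elements internally is not something this ticket establishes.

```python
import math

def same_value(x, y):
    """Hypothetical helper: equality that also treats two NaNs as equal."""
    return x == y or (math.isnan(x) and math.isnan(y))

nan = float("nan")

print(nan == nan)            # False: NaN never compares equal to itself
print(same_value(nan, nan))  # True: the extra isnan clause catches this case
print(same_value(1.0, nan))  # False: a number and NaN are still different
print(same_value(2.0, 2.0))  # True: ordinary equality is unchanged
```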

Also, the following function can be used to patch over the behavior.

import numpy

def nanunique(x):
    a = numpy.unique(x)
    r = []
    for i in a:
        # skip values already kept, and any NaN once one NaN has been kept
        if i in r or (numpy.isnan(i) and numpy.any(numpy.isnan(r))):
            continue
        else:
            r.append(i)
    return numpy.array(r)
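The loop-based workaround above can also be sketched without a Python-level loop. This is only an illustration under the assumption that the input is a floating-point array; the name nanunique_vectorized is hypothetical, and this is not presented as the fix numpy itself adopted.

```python
import numpy as np

def nanunique_vectorized(x):
    """Hypothetical helper: unique values of x with all NaNs collapsed to one."""
    x = np.asarray(x, dtype=float).ravel()
    nan_mask = np.isnan(x)
    # unique() behaves as expected once the NaN entries are set aside
    out = np.unique(x[~nan_mask])
    # put back a single NaN at the end if the input contained any
    if nan_mask.any():
        out = np.append(out, np.nan)
    return out

result = nanunique_vectorized([1.0, np.nan, 0.0, np.nan, 1.0])
print(result)
```

Separating the NaN entries first avoids relying on how unique compares NaN values, so the sketch gives one NaN at most regardless of that behavior.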
