8000 nditer "out" operands are not always accessible · Issue #10956 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

nditer "out" operands are not always accessible #10956

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
ahaldane opened this issue Apr 23, 2018 · 4 comments
Open

nditer "out" operands are not always accessible #10956

ahaldane opened this issue Apr 23, 2018 · 4 comments

Comments

@ahaldane
Copy link
Member

See some discussion in #10951.

The issue is that the nditer operands attribute does not always refer to the actual operand array, but rather to the temporary buffer/copy of the operand array, despite what the nditer docstring says.

This is a problem for code like:

>>> def square(a):
...     with np.nditer([a, None]) as it:
...         for x, y in it:
...             y[...] = x*x
...         return it.operands[1]

since the returned value could in principle be a temporary buffer, rather than the newly created 'out' array. In practice, this hasn't been a problem because the out arrays are never buffered, I think.

Because of our recent improvements to the nditer as a context manager, and #10951, there is a second problem that the operands cannot be accessed once the iterator context is exited. For instance, this means the return statement above can't currently be placed outside the with statement, even though in principle that is where it should go because the operand is only "written back to" at context exit.

@ahaldane
Copy link
Member Author

I am thinking the solution is to make the it.operands attribute refer to the original array, rather than the buffer. I think this is back-compatible in all real cases.

@mattip
Copy link
Member
mattip commented Apr 25, 2018

Not clear. What should operands point to if dtype is specified:

    a = np.arange(6, dtype='f4')
    au = a.byteswap().newbyteorder()
    it = np.nditer(au, [], [['readwrite', 'updateifcopy']],
                        casting='equiv', op_dtypes=[np.dtype('f4')])

If it.operand[0] is au, then there is no interface to the internal ndarray.

Maybe the issue is more "what do we mean by operands" When using it.copy(), the operands are shared and it is not really clear when to resolve writeback of temporary data, should it be done multiple times, once when the first iterator is closed or somehow else?:

def test_copy():
    a = np.arange(6, dtype='f4')
    au = a.byteswap().newbyteorder()
    it1 = np.nditer(au, [], [['readwrite', 'updateifcopy']],
                        casting='equiv', op_dtypes=[np.dtype('f4')])
    with it1:
        it2 = it1.copy()
        x = it1.operands[0]
        assert x is it2.operands[0]
        assert x.flags.writebackifcopy is True
    assert x.flags.writebackifcopy is False
    # no warning on it2 deallocate, the operand is resolved via it1

or using close:

def test_copy():
    a = np.arange(6, dtype='f4')
    au = a.byteswap().newbyteorder()
    it1 = np.nditer(au, [], [['readwrite', 'updateifcopy']],
                        casting='equiv', op_dtypes=[np.dtype('f4')])
    it2 = it1.copy()
    x = it1.operands[0]
    assert x is it2.operands[0]
    assert x.flags.writebackifcopy is True
    it1.close()
    assert x.flags.writebackifcopy is False

@mattip
Copy link
Member
mattip commented Jun 20, 2018

I should have addressed the original issue and not diverged into discussion of nditer.copy.

the returned value could in principle be a temporary buffer, rather than the newly created 'out' array

I am reasonably sure this could not happen when the out array is specified as None.

the operands cannot be accessed once the iterator context is exited

Replace that with "cannot be accessed once the writeback has occurred" and the issue no longer is specific to the new code, it existed before as well since the only way to trigger the writeback was via del it or it = None.

Changing it.operands to return a list of [x.base if x.base else x for x in it.operands] rather than [x for x in it.operands] would be a breaking change since we would no longer have access to the actual values used during iteration. For instance, it would not allow one to initialize the temporary operand values before iterating in this example of in-place operation:

a = np.arange(9, dtype='f4').reshape(3,3)
au = a.byteswap().newbyteorder()
it = np.nditer([a, au, au], ...) # set up so that the second operand is readonly and third is writeonly
with it:
    it.operands[2][...] = 0 # will not modify au
    for x,y,z  in it:
        z = operation(x,y)

This is used in the C level by einsum, as I discovered when fixing a related issue #11381, see the use of PyArray_AssignZero in 4c75113

If desired, one could always access the it.operand[n].base, although it will have a readonly flag until writeback occurs.

@ahaldane
Copy link
Member Author
ahaldane commented Jun 20, 2018

True, that would break. On the other hand, if you added a return it.operands[2] in there, and operand 2 was cast to another type, then I think you would be returning the wrong dtype result. Also you need an it.reset() in there if you are using buffers.

By the way, Just to avoid confusing myself, there are three arrays we are discussing: The "original array" operand, the "temp array" operand currently accessible as it.operands, and the buffers ('x' in your example).

The issue here is that it.operands refers to the temp array, which contradicts the doc and doesn't give us a way to access the "original array" after writeback occurs from the temp to original array.

What about simply adding a new attribute, it.outputs (or a better name) which refers to the "original arrays" and is accessible after the iterator is closed. Most old code doing return it.operands[2] which worked correctly before still does, but we can make clear in the docs what is going on and recommend using return it.outputs[2] instead, after the iterator context is exited.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants
0