8000 ENH: Added prepend and append to cumsum by madphysicist · Pull Request #14542 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

ENH: Added prepend and append to cumsum #14542

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

madphysicist
Copy link
Contributor

Similar to diff, and equally useful, this PR adds prepend and append arguments to np.cumsum. It does not add them to the C implemented np.ndarray.cumsum, but rather re-uses the code in diff at the top level.

Inspiration: https://stackoverflow.com/questions/57965499/fill-vector-values-between-specified-indices/57965631#comment102355646_57965631

This is still a WIP, but I needed a PR number to create a release notes snippet.

@mattip
Copy link
Member
mattip commented Sep 18, 2019

API changes need to hit the mailing list numpy-discussion@python.org

@madphysicist
Copy link
Contributor Author

@mattip Definitely will do. I want to shape up the PR a bit first. There are currently no tests, for example.

@paul-the-noob
Copy link

prepend would indeed be super handy the obvious application being undoing a diff. Not so sure about append, though. What's the use case?

@madphysicist
Copy link
Contributor Author

@paul-the-noob. Full invertibility of diff, which now supports both arguments. I hadn't thought beyond that, really.

@seberg
Copy link
Member
seberg commented Sep 19, 2019

To be honest, on first sight, I am not sure I quite like this. Mainly because previously this was just a super thin wrapper around np.add.accumulate and after this it is not. If we compare with reductions for example, if it was mainly about your prepend, this could also be a generic initial.

@paul-the-noob
Copy link

I was going to suggest that this might be useful for accumulators in general. Not sure about the best name, though. initial is unambiguous for reduce, but for accumulate does it mean "shift first term" and default to "neutral element" or "prepend" and default to None?

@WarrenWeckesser
Copy link
Member

Quick question: why is the default value for the new parameters np._NoValue? Why not None?

@eric-wieser
Copy link
Member

@WarrenWeckesser: In general the answer to that question is "because we are considering object arrays, where None is no longer an out-of-band value". I suppose for cumsum specifically you could argue that any type that implements addition against None is not worth supporting, but there are other operation where that statement is harder to justify.

@seberg
Copy link
Member
seberg commented Feb 14, 2020

In most cases we currently use np._NoValue to signal that a parameter was not passed and thus is not forwarded to the subclass/array-like which may not support it.

But, I think Eric is right, that here it may make sense to allow None as an actual value here. Maybe it already makes sense here and allows that None means the same as np.nan for floating point, even if float np.nan is not a good reason itself.

@WarrenWeckesser
Copy link
Member

@eric-wieser & @seberg, thanks.

Maybe it already makes sense here and allows that None means the same as np.nan for floating point, even if float np.nan is not a good reason itself.

I think I missed something. Are you talking about a hypothetical object where None acts like nan, or does this actually occur somewhere in existing code?

@seberg
Copy link
Member
seberg commented Feb 14, 2020

I thought it might do np.asarray(prepend, dtype=arr.dtype) in which case it would interpret None as np.nan (it does not seem to; and there probably would be an issue around casting safety if it did).

@charris
Copy link
Member
charris commented Feb 14, 2020

About _NoValue,

commit 4b4d8510739c9ea7a80391e2656d84a3c5e4a9c3
Author: Matthew Brett <matthew.brett@gmail.com>
Date:   Sun Mar 15 11:31:37 2015 -0700

    ENH: _NoValue class at top-level to test kwargs
    
    Add _NoValue class at top level to make it possible to detect when
    non-default values got passed to a keyword argument, as in:
    
        def func(a, b=np._NoValue):
            if b is not np._NoValue:
                warnings.warn("Argument b is deprecated",
                              DeprecationWarning)

There is a discussion thread about that use. The original use case was changing the name of an optional keyword argument.

_NoValue has also been used when adding new keywords that might not be implemented in downstream duck arrays or ndarray subclasses. That usage has changed over the years and may no longer be needed. It isn't clear to me that non-implemented keyword arguments are even detected these days.

@WarrenWeckesser
Copy link
Member

This PR is tagged "needs decision", so the following comments are meant to help guide that decision.

By far the most common use-case that I've seen is prepending a scalar. I think it would be valuable to expand the API to cover at least this use-case.

The typical case arises when you have a sequence of sizes, and you want to create a corresponding sequence of offsets, e.g.:

In [14]: sizes = np.array([4, 3, 1, 5])

The offsets:

In [15]: np.concatenate(([0], sizes)).cumsum()
Out[15]: array([ 0,  4,  7,  8, 13])

Actually, in this case we typically don't need the final value:

In [16]: np.concatenate(([0], sizes[:-1])).cumsum()
Out[16]: array([0, 4, 7, 8])

Some examples in the numpy source code:

And a few from scipy (there are more than this):

If we enhance cumsum like this, for consistency we should also make a similar enhancement to cumprod. An example from scipy:

@mhvk
Copy link
Contributor
mhvk commented Feb 15, 2020

Bit of a drive-by comment, but my sense would be to add initial to .accumulate for symmetry with reductions and to help the use cases found. Thus, there would be a possibility to prepend but not to append.

@WarrenWeckesser
Copy link
Member

@madphysicist, thanks for contributing this PR.

We discussed this in the developers call yesterday. The conclusion there matches the discussion that has taken place so far in the PR:

  • There is not a strong interest in adding prepend and append to cumsum, so we won't continue with the PR.
  • There is interest in the simpler option of prepending a single initial value. If possible, this should be done in the accumulate method of the ufuncs, by adding the parameter initial.

Would you be interested in working on adding initial to the accumulate method? It looks like this would involve modifying the C functions PyUFunc_GenericReduction and PyUFunc_Accumulate in the file ufunc_object.c.

@madphysicist
Copy link
Contributor Author
madphysicist commented Mar 12, 2020 via email

@WarrenWeckesser
Copy link
Member

I'll take a look at the functions you mentioned and will let you know if this is something I can do.

Great, thanks for taking a look.

In the meantime, I'll close this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants
0