8000 functools.partial placeholders · Issue #119127 · python/cpython · GitHub
[go: up one dir, main page]

Skip to content

functools.partial placeholders #119127

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
dg-pb opened this issue May 17, 2024 · 19 comments
Closed

functools.partial placeholders #119127

dg-pb opened this issue May 17, 2024 · 19 comments
Assignees
Labels
extension-modules C modules in the Modules dir type-feature A feature request or enhancement

Comments

@dg-pb
Copy link
Contributor
dg-pb commented May 17, 2024

Feature or enhancement

Proposal:

I know that this has been brought up in the past, but I would appreciate if this was reconsidered. I would be happy to bring this up to a nice level.

from functools import partial
from partial2 import partial as partial2, Placeholder

_ = VOID = Placeholder()

%timeit partial(opr.sub, 1, 2)     # 130 ns
%timeit partial2(opr.sub, 1, 2)    # 155 ns

p1 = partial(opr.sub, 1)
p2 = partial2(opr.sub, 1)
p3 = partial2(opr.sub, _, 1)
p4 = partial2(opr.sub, VOID, 1)

print(p1(2))    # -1
print(p2(2))    # -1
print(p3(2))    # 1
print(p4(2))    # 1

%timeit p1(2)   # 48 ns
%timeit p2(2)   # 52 ns
%timeit p3(2)   # 54 ns
%timeit p4(2)   # 54 ns

The logic is as follows:

def partial(func, *p_args, **p_kwds):
    p_args = list(p_args)
    nvoid = sum(a is VOID for a in p_args)
    arng = range(len(p_args))

    def wrapper(*args, **kwds):
        if len(args) < nvoid:
            raise ValueError('Not all VOIDs are filled')
        t_args = p_args.copy()
        j = 0
        for i in arng:
            if p_args[i] is VOID:
                t_args[i] = args[j]
                j += 1
        t_args.extend(args[j:])
        return func(*t_args, **{**p_kwds, **kwds})
    return wrapper

vectorcall change looks like:

if (np) {
        nargs_new = pto_nargs + nargs - np;
        Py_ssize_t ip = 0;
        for (Py_ssize_t i=0; i < pto_nargs; i++) {
            if (PyObject_TypeCheck(pto_args[i], &placeholderType)){
                memcpy(stack + i, args + ip, 1 * sizeof(PyObject*));
                ip += 1;
            } else {
                memcpy(stack + i, pto_args + i, 1 * sizeof(PyObject*));
            }
        }
        if (nargs_total > np){
            memcpy(stack + pto_nargs, args + np, (nargs_total - np) * sizeof(PyObject*));
        }
    // EO Placeholders ------
    } else {

Has this already been discussed elsewhere?

I have already discussed this feature proposal on Discourse

Links to previous discussion of this feature:

https://discuss.python.org/t/functools-partial-placeholder-arguments/53487

Linked PRs

@dg-pb dg-pb added the type-feature A feature request or enhancement label May 17, 2024
@rhettinger rhettinger self-assigned this May 17, 2024
@Eclips4 Eclips4 added the extension-modules C modules in the Modules dir label May 17, 2024
@rhettinger
Copy link
Contributor
rhettinger commented May 17, 2024

@dg-pb
Copy link
Contributor Author
dg-pb commented May 17, 2024

[quote="Raymond Hettinger, post:20, topic:19163, username:rhettinger"]
I’m 100% opposed to including better-partial or anything like it in the standard library. At best, it would be an attractive nuisance.
[/quote]

In https://discuss.python.org/t/functools-partial-placeholder-arguments/53487 I have laid out several examples and use cases for this. But the main gain here is speed improvement by being able to efficiently utilise functions from operator module to use as predicates to functionals, where argument order is not convenient. I have been lately working on iterator recipes and from my experience the partial wrap is 20 ns vs lambda call being ~50ns can result in considerable speed-ups. Especially if it is used more than once.

[quote="Raymond Hettinger, post:9, topic:19163, username:rhettinger"]
Every variant I’ve seen is more complex, less speedy, and harder to debug than a simple def or lambda. Every variant required something extra to learn and remember, unlike def or lambda which are Python basics.
[/quote]

  • More complex. I acknowledge that complexity here is an issue, but I believe that I have found sufficiently simple solution, which can be expressed in 17 lines of python code (see above). The code is fully working and includes all needed error checks.
  • Less speedy. The solution has minimal call overhead. Only few nanoseconds and even less if it does not contain placeholders. It did add 25ns to initialisation, but I am still figuring out why it is so much as id doesn't look right. See initial comparisons above.
  • Harder to debug. It is inevitably harder to debug than simple lambda. But this is also the case for functools.partial as it is now.
  • Requires new learning. From my experience it is very easy to learn and understand. It is visually intuitive. All needed attributes are exposed so that the user can easily investigate and understand.
p3 = partial2(opr.sub, _, 1)
print(inspect.signature(p3.func))    # <Signature (a, b, /)>
print(p3.args)                # (<partial2.Placeholder object at 0x111e95ad0>, 1)

Placeholder sentinel still needs work, it would have much nicer representation. E.g: (functools.PPHoldder, 1)

Personally, I would make use of this functionality in algorithm recipes, iterator recipes, utility functions and similar places of lower level code where performance is important. And I don't see this as lambda or def replacement, but rather functionality that is well suited for a clear purpose.

Also, partial already exists with many of the problems that are being used against this functionality. But given the fact that partial exists already why not make it better and make use of it in the area where it excels (performance)?

Why I think this deserves reconsideration is that this hasn't been implemented and benchmarked before. All the previous attempts were in python, where instead of performance gain which has led me to this point, the result was a significant performance loss - up to 20x slower than current standard library implementation. So naturally, those are not used, because lambda/def is a simpler, much faster than better_partial or any other 3rd party implementation and has all the flexibility.

While before supporting reasoning was convenience, Linters, MyPy issues and similar, my main argument is performance.

@rhettinger
Copy link
Contributor
rhettinger commented Jun 6, 2024

I've left this open for comment and no one else has stepped in to opine, so I'm going to go ahead and approve it.

The API is a minimal addition that gives more flexibility without venturing into creating a mini-DSL. It is unclear whether it will present any typing challenges beyond what partial() already does, but I expect that will only be a minor obstacle.

The PR needs a lot of work but the concept is sound. I've been trying it out for a while and it looks reasonable in real code. It supports multiple placeholders but in the examples I've found only one is needed.

I'm dubious about the motivation as an optimization (PyPy doesn't need this and CPython may yet optimize a plain lambda version). However, it does make partial more broadly useful and that is a sufficient reason.

@rhettinger
Copy link
Contributor
rhettinger commented Jun 6, 2024

FWIW, the example I most enjoyed is bin8 = partial(format, Placeholder, '08b').

@picnixz
Copy link
Member
picnixz commented Jun 8, 2024

Personally, most of the time I need a placeholder for is when I want right partialization, not left partialization. While there are times where partialization in arbitrary order makes sense, an alternative would to provide rpartial interface without introducing new singletons or immortal objects. The specs on how to use rpartial would still remain an open question (i.e., is rpartial(f, a, b)(x) for f(x, a, b) or f(x, b, a)?)

@serhiy-storchaka
Copy link
Member

rpartial() could be simpler, and it would cover most of the use cases, but the placeholder provides more flexibility. If Raymond is fine with this, we should choose this option.

Although, there is a case which is supported by rpartial(), but not be the proposed feature with a placeholder -- when we want to add positional arguments after the variable number of positional arguments for the partial function. I think that we could support *Placeholder as a placeholder for variable number of arguments. But this can be implemented in a separate issue, for the first step it is enough to support Placeholder for individual arguments.

@sobolevn
Copy link
Member

I like functional programming a lot (proof: https://github.com/dry-python/returns), but I don't think that this API is nice. Here are my thoughts:

  1. The use-case is rather niche. There are many other ways of doing this: named arguments, intermediate functions, lambdas, signature changes, etc. Most of these ways are way simplier.
  2. I feel like this opens a black hole for possible confusions and bugs (and confused bugs).
  3. Some linters would have to special case _ usage in argument passing, because right now _ marks an unused variable for most cases.
  4. Typing support for this would be very hard. And since partial is already special-cased in mypy (https://github.com/python/mypy/blob/master/mypy/plugins/functools.py), it would make the plugin even more complex
  5. Given all the complexity it adds, it still does not provide all the features regular functions can do: annotations, defaults, named args. This solution is very specific: only right positional arguments. Is it worth it?

It is nice as a separate package, but I don't feel like I would like to use it in the stdlib. Or teach how to use it in the stdlib. Sorry :(

@dg-pb
Copy link
Contributor Author
dg-pb commented Jun 26, 2024

5. This solution is very specific: only right positional arguments.

This is either not correct or I am failing to understand what you mean.

@serhiy-storchaka
Copy link
Member

There is a question: how interpret Placeholder passed as a keyword argument? It could be expected that partial(func, a=Placeholder) is a callable that passes its first positional argument as the "a" keyword argument to func(). But this is difficult to implement, and we currently have no good use cases for this (although it may be changed in future). So I suggest to reject passing Placeholder as a keyword argument.

@dg-pb
Copy link
Contributor Author
dg-pb commented Aug 12, 2024

partial(func, a=Placeholder) is a callable that passes its first positional argument as the "a" keyword argument to func()

Yeah, I don't see any need for this. And it would indeed be difficult, I think there is enough of complexity for now to settle.

I think there is no harm in literal treatment of such. At least I can not see any potential issues here. It just gets treated as any other object.

I tried to come up with potential benefits. E.g. passing Placeholder via partial keyword which constructs another partial, where Placeholder (or not) is put into positional argument. But can not come up with anything realistic. Although I potentially see myself trying to exploit this fact in the future.

So I am generally neutral, but slightly positive leaving it as it is due to performance.

@serhiy-storchaka
Copy link
Member

It is easier to make valid something that was previously invalid than to change semantic of already valid expression.

@dg-pb
Copy link
Contributor Author
dg-pb commented Aug 12, 2024

So I am still unsure about this.

If it was free, it would be natural.

However as I suspected it is quite costly. Effect on instantiation for both versions given the number of keyword arguments:

# no-kw    1      3     8
C:        5%    10%     20%
PY       15%    18%     20%

For someone who mostly cares about __call__ performance this is not a big impact, but for applications that rely on creating many partial objects this is fairly significant.

Is the chance to make use of Placeholder in keywords in the future actually realistic? It is unlikely that adding complexity to keywords will be worth it given the effort required and performance decrease - such extension would have much higher performance penalty (compared to this extension) as manipulating dicts is much more costly than tuples.

Also, is there any significant risk that someone is going to pass Placeholder to partial via keywords? Even if by any chance it is desired to extend functionality to keywords (in some way) in the future, I don't think it would break a lot of code as it is very unlikely anyone will be passing Placeholder to keywords.

@dg-pb
Copy link
Contributor Author
dg-pb commented Aug 12, 2024

Btw, python wrapper for your mentioned functionality:

import functools as ftl

def partial(func, *args, **kwds):
    """keyword Placeholders will be filled with priority by positionals
    Examples:
        >>> def foo(a, b, *, c=0, d=0):
        ...     return a, b, c, d
        >>> p = partial_kwph(foo, Placeholder, 2, c=Placeholder, d=4)
        >>> p(3, 1)
        (1, 2, 3, 4)
    """
    kw_keys = [k for k, v in kwds.items() if v is ftl.Placeholder]
    if not kw_keys:
        return ftl.partial(func, *args, **kwds)
    else:
        kw_enum = list(enumerate(kw_keys))
        n_kwph = len(kw_enum)
        fprtl = ftl.partial(func, *args)
        def wrapped(*call_args, **call_kwds):
            total_kw = kwds
            for i, k in kw_enum:
                total_kw[k] = call_args[i]
            if call_kwds:
                total_kw = {**total_kw, **call_kwds}
            return fprtl(*call_args[n_kwph:], **total_kw)
        return wrapped

def foo(a, b, *, c, d):
    return a, b, c, d

p = partial(foo, ftl.Placeholder, 2, c=ftl.Placeholder, d=4)
print(p(3, 1))     # (1, 2, 3, 4)

@dg-pb
Copy link
Contributor Author
dg-pb commented Aug 12, 2024

This is slightly less than 2x slower, but given the rarity of such need I think it is good enough:

def foo(a, b, *, c, d):
    return a, b, c, d

p0 = ftl.partial(foo, 1, c=3, d=4)
p1 = ftl.partial(foo, ftl.Placeholder, 2, c=3, d=4)
p2 = partial(foo, ftl.Placeholder, 2, c=ftl.Placeholder, d=4)

lembdas = {
    'without_PH': lambda: without_PH(2),
    'with_PH': lambda: with_PH(1),
    'with_kwPH': lambda: with_kwPH(3, 1)
}
# Note, all functions return: (1, 2, 3, 4)

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃  macOS-11.7.10-x86_64-i386-64bit-Mach-O | CPython: 3.14.0a0   ┃
┃                 5 repeats, 100,000 times                      ┃
┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
┃                                Units: ns                    0 ┃
┃                                          ┏━━━━━━━━━━━━━━━━━━━━┫
┃                               without_PH304 ┃
┃                                  with_PH298 ┃
┃                                with_kwPH587 ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━━━┛

@serhiy-storchaka
Copy link
Member

Well, this is not critical. If the use of Placeholder is always documented only for positional arguments, we can leave the use as a keyword argument unspecified. We can add additional checks before 3.14 release.

The only thing I have concern about right now is unnecessary complication of __new__ for the Placeholder type. It is not require for the main function. Most singletons do not bother with such thing. It is easy to add later if needed. Without this, the PR LGTM.

@dg-pb
Copy link
Contributor Author
dg-pb commented Aug 16, 2024

Apologies for delay, but this, although might seem trivial, to me is the hardest part of this PR.

Mostly because there is little to grab on to as there are no clear guidelines/protocols/examples to follow for this yet.

Placeholder is not as significant as builtin sentinels, but is more significant than various sentinel collections, which are closer to Enum than sentinel, while https://peps.python.org/pep-0661/ is in draft and has been stale forever.

Another reason is that advantages/disadvantages of different options are very small in absolute sense. Both are valid and neither has any significant issues.

So the question is the following a) or b)?:
a) type(Placeholder)() is Placeholder
b) type(Placeholder)() -> SomeError

  1. Similar Objects

Apart from builtins sentinels, there is one other, which is more closely comparable to Placeholder in significance and placement.

import typing
type(typing.NoDefault)() is typing.NoDefault    # True

Note, it's type is not serializable:

import pickle
pickle.dumps(type(typing.NoDefault))
# _pickle.PicklingError: Can't pickle <class 'NoDefaultType'>: attribute lookup NoDefaultType on builtins failed
  1. Complexity.

If __new__ can be used to get sentinel instance, then it eliminates the need for custom __reduce__ methods. Thus, standard serialization protocol can be used. Although, keeping those methods could still be beneficial for performance.

Thus, in terms of technical complexity, a) is slightly simpler as it breaks fewer standard protocols (in case the type is serializable).

In terms of conceptual complexity, I would say it is ties. Special exception to raise error, which breaks standard instantiation protocol is as complicated as returning singleton instance instead.

  1. Convenience of API

It is not uncommon to have a pattern, where default is lazy function. E.g.:

def get(self, key, lazy_default=None):
    if key not in self and default is not None:
        return lazy_default()
    return self[key]

In this case, there is an option to:
get(..., default=type(Placeholder))
instead of:
get(..., default=lambda: Placeholder).

There are possibly other similar small benefits of object and its type mimicking standard type/object instantiation protocol instead of raising an error.

So in this respect, I would say a) has a slight advantage.

@dg-pb
Copy link
Contributor Author
dg-pb commented Sep 5, 2024

There is 1 point which is in favour of b).

There is less chance that the user will depend on type(Placeholder)() raising an error than returning a singleton object. Thus, it is easier to revert from b) to a) than from a) to b).

However, I don't think this is the main consideration as I think choosing more appropriate option from the very beginning would be better so that the probability for future need of reversion is minimised.

After all, the cost function for reverting is not simply a) vs b), but rather:

P(A to B) x ReversionCost(A to B)
# versus
P(B to A) x ReversionCost(B to A)

Where P(A to B) is probability that in the future this will need to be changed from A to B.

So I would rather pick more appropriate option from the beginning.

Maybe someone who knows the rationale for making None & friends singletons could help. I.e. why was decision made for type(None)() to return None instead of raising an error? Are there any specific reasons or is it more along the lines of "smoother user experience" for constant which is highly exposed to the user?

To me Placeholder still seems closer to None (& friends) or typing.NoReturn than for example, to constants in inspect module, such as _ParameterKind enums or any of the results of '(?m)^[^\W\d]+\w* \= object\(\)', where none of sentinels seem to be user facing.

So currently my position is that A is more appropriate.
@serhiy-storchaka, given this comment and the one above, are you still convinced that B is better?
@rhettinger, do you have any preferences here?

@serhiy-storchaka
Copy link
Member

I still prefer option b. We should not add feature without a clear use case. Actually, we should discourage any use of Placeholder besides passing it as a positional argument to partial() or partialmethod(). Even partial(func, arg1, arg2) can be unsafe if arg1 is dynamically generated and can be Placeholder. So using a fabric function that returns Placeholder is very unlikely.

Waiting on @rhettinger.

@dg-pb dg-pb closed this as completed Sep 26, 2024
628C
JelleZijlstra added a commit to JelleZijlstra/cpython that referenced this issue Sep 26, 2024
Edits for improving the clarity and writing of the docs for partial.
vstinner added a commit to vstinner/cpython that referenced this issue Sep 26, 2024
* The module state now stores a strong reference to the Placeholder
  singleton.
* Use a regular dealloc function.
* Add Py_TPFLAGS_HAVE_GC flag and traverse function to help the GC to
  collect the type when _functools extension is unloaded.
vstinner added a commit to vstinner/cpython that referenced this issue Sep 26, 2024
* The module state now stores a strong reference to the Placeholder
  singleton.
* Use a regular dealloc function.
* Add Py_TPFLAGS_HAVE_GC flag and traverse function to help the GC to
  collect the type when _functools extension is unloaded.
vstinner added a commit to vstinner/cpython that referenced this issue Sep 26, 2024
* The module state now stores a strong reference to the Placeholder
  singleton.
* Use a regular dealloc function.
* Add Py_TPFLAGS_HAVE_GC flag and traverse function to help the GC to
  collect the type when _functools extension is unloaded.
vstinner added a commit that referenced this issue Sep 26, 2024
* The module state now stores a strong reference to the Placeholder
  singleton.
* Use a regular dealloc function.
* Add Py_TPFLAGS_HAVE_GC flag and a traverse function to help the GC
  to collect the type when a _functools extension is unloaded.
@dg-pb
Copy link
Contributor Author
dg-pb commented Oct 17, 2024

3 follow-up PRs.
Largely independent from each other.
All ready for review.

  1. gh-125028: Prohibit placeholders in partial keywords #126062
  2. gh-124652: partialmethod simplifications #124788
    • @rhettinger, in my opinion, all fell into places reasonably well. Simplified inspect as expected.
  3. gh-119109: improve functools.partial vectorcall with keywords #124584
    • @vstinner, hopefully you can take a look when you have some time.

millerdev added a commit to dimagi/commcare-hq that referenced this issue Apr 8, 2025
FutureWarning: functools.partial will be a method descriptor in future
Python versions; wrap it in staticmethod() if you want to preserve the
old behavior

Context:
- python/cpython#121027
- python/cpython#119127
millerdev added a commit to dimagi/commcare-hq that referenced this issue Apr 8, 2025
FutureWarning: functools.partial will be a method descriptor in future
Python versions; wrap it in staticmethod() if you want to preserve the
old behavior

Context:
- python/cpython#121027
- python/cpython#119127
millerdev added a commit to dimagi/commcare-hq that referenced this issue Apr 8, 2025
FutureWarning: functools.partial will be a method descriptor in future
Python versions; wrap it in staticmethod() if you want to preserve the
old behavior

Context:
- python/cpython#121027
- python/cpython#119127
millerdev added a commit to dimagi/commcare-hq that referenced this issue Apr 8, 2025
FutureWarning: functools.partial will be a method descriptor in future
Python versions; wrap it in staticmethod() if you want to preserve the
old behavior

Context:
- python/cpython#121027
- python/cpython#119127
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
extension-modules C modules in the Modules dir type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

6 participants
0