Additional function arg unpacking (PEP 448) #5807

dlech · 2020-03-26T22:55:04Z

This is an attempt to implement PEP 448 for function arguments. It allows multiple *args and **args to be unpacked. (Popular request in #1329).

I'm curious to see how it affects code size and performance.

There are still some todo items:

There is still a null object emitted in the bytecodes that used to be the **args that needs to be removed
There are probably edge cases where memory allocation is wrong, especially in the case where one or more of the *args are iterators instead of list/tuple.

tve · 2020-03-27T03:26:41Z

might want to fix the typo in the title ;-)

dlech · 2020-03-27T03:47:34Z

If that is my worst mistake, we are in good shape 😆

dpgeorge · 2020-03-28T13:14:23Z

Thanks for working on this. It's a tough problem, especially to make it efficient (in RAM usage, execution time, code size). Properly reviewing this would take some time.

For the record, the code size change is:

   bare-arm:   +48 +0.072% 
minimal x86:  +128 +0.086% 
   unix x64:  +144 +0.029% 
unix nanbox:  +272 +0.061% 
      stm32:   +52 +0.013% PYBV10
     cc3200:   +48 +0.026% 
    esp8266:   +72 +0.011% GENERIC
      esp32:   +36 +0.003% GENERIC[incl -32(data)]
        nrf:   +52 +0.036% pca10040
       samd:   +48 +0.047% ADAFRUIT_ITSYBITSY_M4_EXPRESS

…coded-samples audiomp3: add decoded_samples property

dpgeorge · 2022-03-01T07:27:42Z

Can you please rebase this on latest master?

dlech · 2022-03-04T21:57:34Z

Rebased.

dlech · 2022-03-04T23:49:55Z

There is still a null object emitted in the bytecodes that used to be the **args that needs to be removed

Added a commit to address this todo item.

py/compile.c

tests/basics/python34.py

dpgeorge · 2022-03-10T04:18:41Z

There is still a null object emitted in the bytecodes that used to be the **args that needs to be removed

Added a commit to address this todo item.

IMO it makes sense to squash this commit into the first one, so that the cmd_showbc test does not change so much and then get reverted. That would make it a bit clearer as to the impact of this change, and how it changes generated bytecode.

There are probably edge cases where memory allocation is wrong, especially in the case where one or more of the *args are iterators instead of list/tuple.

Do you mean it might segfault in an edge case? Or that it simply makes a bad guess at how much memory to preallocate? As far as I can tell it all looks OK.

In fact this has exposed a bug in current master, the following will segfault:

def f(*a, **b):
    print(a)
    print(b)
f(*range(100), **{str(i):i for i in range(100)})

With this PR, that bug is fixed.

dpgeorge · 2022-03-10T04:26:52Z

I think the general approach here is very clever. When *seq or **dict is used as an argument the Python stack looks like this:

fun(arg0, arg1, *seq, key0=value0, key1=value1, **dict)
old: fun arg0 arg1 key0 value0 key1 value1 seq dict
new: fun arg0 arg1 seq key0 value0 key1 value1 NULL dict 4

For simple cases:

fun(*seq)
old: fun seq NULL
new: fun seq 1

fun(**dict)
old: fun NULL dict
new: fun NULL dict 0

Also in the case of using **dict the bytecode for the opcode increased by 1 byte because in now needs 2 bytes to encode the unum = (npos | nkw <<8) value. As such fun(**dict) uses 2 extra bytes in total in the bytecode (one for the unum, one for the seq bitmap).

I think this can be optimised. Eg if npos==0 then don't store the bitmap. And if npos==0 and nkw==0 then implicitly add 1 to nkw.

dlech · 2022-03-10T17:19:48Z

IMO it makes sense to squash this commit into the first one,

Agreed.

Do you mean it might segfault in an edge case? Or that it simply makes a bad guess at how much memory to preallocate?

The latter.

I think this can be optimised. Eg if npos==0 then don't store the bitmap. And if npos==0 and nkw==0 then implicitly add 1 to nkw.

I will have a look.

dlech · 2022-03-24T01:51:06Z

I have updated this with the suggested changes. With regard to the suggested optimizations, I did it a bit differently. " if npos==0 then don't store the bitmap." was simple enough but "And if npos==0 and nkw==0 then implicitly add 1 to nkw." got messy since it requires checking for this in many places. So, I figured that if we were going to add all of checks everywhere anyway, we could do it slightly differently and optimize (at least in terms of fewer bytes emitted) more cases. See the commit message for the full explanation.

Given that arg unpacking is probably not frequently used, I'm not sure the trade off in compiler size/complexity is worth the savings in emitted byte code size.

dpgeorge · 2022-03-24T03:17:20Z

Given that arg unpacking is probably not frequently used, I'm not sure the trade off in compiler size/complexity is worth the savings in emitted byte code size.

You might be right, it does look complicated. I will review it.

peterhinch · 2022-03-24T09:48:08Z

FWIW I think this is a very useful feature - subject to code size, of course :)

For example in GUI code, widgets may have multiple constructor args to define colors, shapes, fonts, sizes, callbacks etc. A dict enables a set of common attributes to be assigned to multiple widgets. The ability to assign more than one dict would be useful. You might have one to define attributes common to all controls and another for additional attributes applicable only to (say) Meter instances.

dpgeorge · 2022-03-29T05:17:15Z

tests/basics/fun_kwargs.py

+    return x
+
+
+f4(*print_ret(["a", "b"]), kw_arg=print_ret(None))


this test should go in tests/baics/fun_kwvarargs.py

dpgeorge · 2022-03-29T05:39:23Z

Code size change for supporting multiple * and ** args is:

   bare-arm:   +80 +0.139% 
minimal x86:  +133 +0.081% 
   unix x64:  +144 +0.028% 
unix nanbox:  +300 +0.065% 
      stm32:   +68 +0.017% PYBV10
     cc3200:   +80 +0.044% 
    esp8266:   +92 +0.013% GENERIC
      esp32:   +92 +0.006% GENERIC[incl -16(data)]
        nrf:   +16 +0.009% pca10040
        rp2:   +48 +0.010% PICO
       samd:   +64 +0.046% ADAFRUIT_ITSYBITSY_M4_EXPRESS

Then the additional code size change on top of that for the encoding optimisation is:

   bare-arm:  +120 +0.208% 
minimal x86:  +204 +0.124% 
   unix x64:  +374 +0.071% 
unix nanbox:  +592 +0.128% 
      stm32:  +160 +0.041% PYBV10
     cc3200:  +128 +0.070% 
    esp8266:  +156 +0.022% GENERIC
      esp32:  +188 +0.012% GENERIC
        nrf:  +148 +0.085% pca10040
        rp2:  +136 +0.028% PICO
       samd:   -56 -0.040% ADAFRUIT_ITSYBITSY_M4_EXPRESS

I'd say that optimisation is not worth it. It's relatively rare to use * / ** args and there are better ways than that to spend bytes on reducing/optimising the bytecode.

So please remove that encoding optimisation commit from this PR. That should also fix the CI issues with native emitter.

dpgeorge · 2022-03-29T05:41:22Z

py/vm.c

@@ -927,8 +927,8 @@ unwind_jump:;
                    // unum & 0xff == n_positional
                    // (unum >> 8) & 0xff == n_keyword
                    // We have following stack layout here:
-                    // fun arg0 arg1 ... kw0 val0 kw1 val1 ... seq dict <- TOS
-                    sp -= (unum & 0xff) + ((unum >> 7) & 0x1fe) + 2;
+                    // fun arg0 arg1 ... kw0 val0 kw1 val1 ... bitmap <- TOS


In this commit the bitmap does not exist so it should be seq instead. I.e., this commit should just delete the dict entry. Then in the next commit, seq is renamed to bitmap.

dpgeorge · 2022-03-29T05:49:01Z

Performance change (without the encoding optimisation commit), on PYBv1.0:

diff of scores (higher is better)
N=100 M=100                        p0 ->         p1         diff      diff% (error%)
bm_chaos.py                    351.40 ->     355.38 :      +3.98 =  +1.133% (+/-0.01%)
bm_fannkuch.py                  77.31 ->      77.98 :      +0.67 =  +0.867% (+/-0.01%)
bm_fft.py                     2526.30 ->    2549.84 :     +23.54 =  +0.932% (+/-0.00%)
bm_float.py                   5905.82 ->    5939.79 :     +33.97 =  +0.575% (+/-0.00%)
bm_hexiom.py                    47.17 ->      47.27 :      +0.10 =  +0.212% (+/-0.01%)
bm_nqueens.py                 4419.24 ->    4451.92 :     +32.68 =  +0.739% (+/-0.00%)
bm_pidigits.py                 638.01 ->     640.70 :      +2.69 =  +0.422% (+/-0.23%)
core_import_mpy_multi.py       553.34 ->     556.64 :      +3.30 =  +0.596% (+/-0.01%)
core_import_mpy_single.py       68.36 ->      68.69 :      +0.33 =  +0.483% (+/-0.01%)
core_qstr.py                    63.73 ->      63.86 :      +0.13 =  +0.204% (+/-0.01%)
core_yield_from.py             356.12 ->     355.00 :      -1.12 =  -0.315% (+/-0.00%)
misc_aes.py                    411.90 ->     413.86 :      +1.96 =  +0.476% (+/-0.01%)
misc_mandel.py                3350.85 ->    3414.97 :     +64.12 =  +1.914% (+/-0.00%)
misc_pystone.py               2225.86 ->    2232.44 :      +6.58 =  +0.296% (+/-0.00%)
misc_raytrace.py               368.11 ->     372.18 :      +4.07 =  +1.106% (+/-0.00%)
viper_call0.py                 572.83 ->     572.73 :      -0.10 =  -0.017% (+/-0.00%)
viper_call1a.py                546.71 ->     546.71 :      +0.00 =  +0.000% (+/-0.00%)
viper_call1b.py                436.01 ->     435.91 :      -0.10 =  -0.023% (+/-0.01%)
viper_call1c.py                440.59 ->     440.51 :      -0.08 =  -0.018% (+/-0.00%)
viper_call2a.py                532.82 ->     532.85 :      +0.03 =  +0.006% (+/-0.01%)
viper_call2b.py                377.24 ->     377.17 :      -0.07 =  -0.019% (+/-0.00%)

dlech · 2022-03-29T16:06:39Z

I have made the requested changes, dropped the extra commits and rebased.

codecov-commenter · 2022-03-29T16:22:27Z

Codecov Report

Merging #5807 (dcd0bdb) into master (5e685a9) will decrease coverage by 0.00%.
The diff coverage is 98.94%.

❗ Current head dcd0bdb differs from pull request most recent head 12755dc. Consider uploading reports for the commit 12755dc to get more accurate results

@@            Coverage Diff             @@
##           master    #5807      +/-   ##
==========================================
- Coverage   98.24%   98.24%   -0.01%     
==========================================
  Files         154      154              
  Lines       20273    20286      +13     
==========================================
+ Hits        19918    19929      +11     
- Misses        355      357       +2

Impacted Files	Coverage Δ
py/compile.c	`99.81% <94.44%> (-0.13%)`	⬇️
py/emitbc.c	`100.00% <100.00%> (ø)`
py/emitnative.c	`99.31% <100.00%> (ø)`
py/runtime.c	`99.72% <100.00%> (+<0.01%)`	⬆️
py/vm.c	`98.98% <100.00%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5e685a9...12755dc. Read the comment docs.

8000

dlech · 2022-03-29T17:08:53Z

The changes in code coverage show some potential issues:

this branch is not hit because the cpydiff isn't run in the coverage tests.
this branch and this branch is no longer hit because we are now allocating one mp_obj_t for each kw arg, including the ** args plus the size of the ** arg if available. So we are allocating one extra mp_obj_t that isn't being used for each ** arg. Probably the same applies to *args as well.

dlech · 2022-03-29T17:39:00Z

I have added a commit to address the overallocation issue. But it doesn't seem to affect the code coverage, so there is probably something else I am missing.

dlech · 2022-03-29T18:20:54Z

I've added a few more commits that should take care of the dropped coverage issues. So the only question that remains is should we add a test that duplicates the cpydiff to get the last bit of coverage or should we just include all cpydiff in the coverage run or don't worry about coverage for that branch?

This is a partial implementation of PEP 448 to allow multiple ** unpackings when calling a function or method. The compiler is modified to encode the argument as a None: obj key-value pair (similar to how regular keyword arguments are encoded as str: obj pairs). The extra object that was pushed on the stack to hold a single ** unpacking object is no longer used and is removed. The runtime is modified to decode this new format. Signed-off-by: David Lechner <david@pybricks.com>

dpgeorge · 2022-03-29T22:56:19Z

So the only question that remains is should we add a test that duplicates the cpydiff to get the last bit of coverage

Yes I think we should add another test for this in the main test suite (this is to check when there are too many positional args, right?). Because cpydiff tests are intended to be changed/removed if/when the feature becomes compatible with CPython. Thus cpydiff tests cannot be relied upon for coverage.

docs/differences/python_35.rst

py/runtime.c

This is a partial implementation of PEP 448 to allow unpacking multiple star args in a function or method call. This is implemented by changing the emitted bytecodes so that both positional args and star args are stored as positional args. A bitmap is added to indicate if an argument at a given position is a positional argument or a star arg. In the bytecodes, this new bitmap takes the place of the old star arg. It is stored as a small int, so this means only the first N arguments can be star args where N is the number of bits in a small int. The runtime is modified to interpret this new bytecode format while still trying to perform as few memory reallocations as possible. Signed-off-by: David Lechner <david@pybricks.com>

This fixes overallocating an extra mp_obj_t when the length of *args and **args is known. Previously we were allocating 1 mp_obj_t for each n_args and n_kw plus the length of each *arg and **arg (if they are known). Since n_args includes *args and n_kw includes **args, this was allocating an extra mp_obj_t in addition to the length of these args when unpacked. To fix this, we just subtract 1 from the length to account for the 1 already implicitly allocated by n_args and n_kw. Signed-off-by: David Lechner <david@pybricks.com>

To reach this check, n_kw has to be >= 1 and therefore args2_alloc has to be >= 2. Therefore new_alloc will always be >= 4. So this check will never be true and can be removed. Signed-off-by: David Lechner <david@pybricks.com>

This fixes code coverage for the case where a *arg without __len__ is unpacked and uses exactly the amount of memory that was allocated for kw args. This triggers the code branch where the memory for the kw args gets reallocated since it was used already by the *arg unpacking. Signed-off-by: David Lechner <david@pybricks.com>

This replaces instances of uint with size_t and int with ssize_t in the mp_call_prepare_args_n_kw_var() function since all of the variables are used as array offsets. Also sort headers while we are touching this. Signed-off-by: David Lechner <david@pybricks.com>

dpgeorge · 2022-03-31T06:19:07Z

Thanks for the updates. This is looking very good now. Merged in 1e99d29 through 2e3f204

dpgeorge · 2022-03-31T06:19:38Z

I have a few extra coverage tests to add in a separate PR.

dlech force-pushed the pep-448 branch 3 times, most recently from 84a6ff9 to 129849f Compare March 27, 2020 01:59

dlech changed the title ~~Additional function arg unacking (PEP 448)~~ Additional function arg unpacking (PEP 448) Mar 27, 2020

dpgeorge added the py-core Relates to py/ directory in source label Mar 28, 2020

dlech mentioned this pull request May 31, 2021

Syntax error with * in function argument list #7338

Closed

dpgeorge mentioned this pull request Oct 28, 2021

Discussion of Python 3.5 support #1329

Open

50 tasks

tannewt pushed a commit to tannewt/circuitpython that referenced this pull request Jan 7, 2022

Merge pull request micropython#5807 from unwiredben/add-mp3decoder-de…

f2f38c7

…coded-samples audiomp3: add decoded_samples property

dlech mentioned this pull request Feb 24, 2022

Static and memory-mappable .mpy files #8191

Merged

dlech force-pushed the pep-448 branch 3 times, most recently from b50221d to d6aa8de Compare March 4, 2022 21:37

dlech force-pushed the pep-448 branch from f7d249f to 860c694 Compare March 4, 2022 23:43

dpgeorge reviewed Mar 10, 2022

View reviewed changes

py/compile.c Show resolved Hide resolved

tests/basics/python34.py Show resolved Hide resolved

dlech force-pushed the pep-448 branch from 860c694 to 0b62643 Compare March 24, 2022 01:44

dpgeorge reviewed Mar 29, 2022

View reviewed changes

dlech force-pushed the pep-448 branch from 0b62643 to 11e46f0 Compare March 29, 2022 16:05

dlech force-pushed the pep-448 branch 2 times, most recently from dcd0bdb to 12755dc Compare March 29, 2022 19:14

dpgeorge reviewed Mar 29, 2022

View reviewed changes

docs/differences/python_35.rst Outdated Show resolved Hide resolved

dpgeorge reviewed Mar 29, 2022

View reviewed changes

py/runtime.c Outdated Show resolved Hide resolved

dlech and others added 4 commits March 30, 2022 10:49

py/runtime: Drop new_alloc < 4 check.

6ed9ebd

To reach this check, n_kw has to be >= 1 and therefore args2_alloc has to be >= 2. Therefore new_alloc will always be >= 4. So this check will never be true and can be removed. Signed-off-by: David Lechner <david@pybricks.com>

dlech force-pushed the pep-448 branch from 12755dc to 1c846c0 Compare March 30, 2022 15:50

dlech force-pushed the pep-448 branch from 1c846c0 to e60ac9e Compare March 30, 2022 15:57

dpgeorge closed this Mar 31, 2022

dlech deleted the pep-448 branch March 31, 2022 14:46

dlech mentioned this pull request May 7, 2023

Weird (mis)behaviour when passing kwarg along with *args #11439

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Additional function arg unpacking (PEP 448) #5807

Additional function arg unpacking (PEP 448) #5807

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Additional function arg unpacking (PEP 448) #5807

Additional function arg unpacking (PEP 448) #5807

Uh oh!

Conversation

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Codecov Report

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!