py/objint,py/binary: Add int.to_bytes(signed) parameter, add common overflow checks. #16311

projectgus · 2024-11-27T06:56:56Z

Summary

This PR is attempting to bring some CircuitPython changes upstream to MicroPython. From these PRs:

The changes:

Implement overflow checks for buffer conversions, in a similar but more general way to py/objint: Fix int.to_bytes() buffer size checks. #13087.
Add optional int.to_bytes(signed=bool) parameter. This removes the CPython incompatibility where currently int.to_bytes in MicroPython behaves as if signed=self < 0 when converting.

After cherry-picking commits directly from CircuitPython and resolving conflicts, some additional commits were added:

Update existing MicroPython tests to pass. Most tests are now simpler, as behaviour now closer to CPython.
Restore optional passing of arguments for byteorder and length, as per py/objint: Make byteorder argument optional in byte-conversion methods. #14387 and CPython >=3.11.
Do not raise OverflowError from array constructors if a value is out of range, truncate the value instead. This is a documented difference between CPython and MicroPython, and after discussing with @dpgeorge decided changing it here has the potential to break existing MP code. The overflow checks are kept if MICROPY_PREVIEW_VERSION_2 is set, so that we can change behaviour to match CPython (and CircuitPython) in MicroPython 2. The original justification for not checking here was to save code size, but the firmware code size will actually become smaller after making this change (due to a common code path for overflow checks in array constructors and int.to_bytes). I guess Python code size may become a littler bigger if masking needs to be added, but it also eliminates a potential silent bug.
Additional refactoring commit to merge similar code paths and bring the code size back down closer to current master.
Add support for V2 non-truncating setters to moductypes.
Extend coverage of mp_binary_set_val_array_from_int so the project coverage metric stays passing.
Double the allowed stack size on the Windows port. This seems necessary to allow debug builds to pass the new unit tests in this PR.

Testing

Unit tests passing on unix and rp2 ports.
Unit tests passing on Zephyr port (uses long long implementation for bigint).

Trade-offs and Alternatives

We already have some of this bounds checking functionality in master now, but the approach taken by CircuitPython is more general and has potential to apply the same checks in multiple places (i.e. array constructors) with no further impact on code size.

codecov · 2024-11-27T07:02:39Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.36%. Comparing base (b4cf82b) to head (6f3c7e0).

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #16311      +/-   ##
==========================================
- Coverage   98.54%   98.36%   -0.18%     
==========================================
  Files         169      169              
  Lines       21864    21892      +28     
==========================================
- Hits        21545    21534      -11     
- Misses        319      358      +39

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

github-actions · 2024-11-27T07:09:45Z

Code size report:

   bare-arm:  +120 +0.211% 
minimal x86:  +320 +0.170% [incl +32(data)]
   unix x64:   +48 +0.006% standard
      stm32:  +148 +0.038% PYBV10
     mimxrt:   -96 -0.026% TEENSY40
        rp2:    -8 -0.001% RPI_PICO_W
       samd:   -84 -0.031% ADAFRUIT_ITSYBITSY_M4_EXPRESS
  qemu rv32:  -172 -0.038% VIRT_RV32

ports/unix/coverage.c

According to the micropython#16311. Signed-off-by: Ihor Nehrutsa <Ihor.Nehrutsa@gmail.com>

Add `length`, `byteorder`, `signed` according to the micropython#16311. Signed-off-by: Ihor Nehrutsa <Ihor.Nehrutsa@gmail.com>

projectgus · 2025-02-26T06:31:02Z

Only two things left here...

Project coverage has somehow gone down even though the patch coverage is 100%. I've added some additional cove in coverage.c but it's still not quite enough.
The Windows x64 Debug builds are failing in the uctypes_load_store test with RuntimeError: maximum recursion depth exceeded. I don't know if this is an actual stack usage problem, a bug triggered when a unittest assert fails, or something else...

projectgus · 2025-02-26T06:32:56Z

The Windows x64 Debug builds are failing in the uctypes_load_store test with RuntimeError: maximum recursion depth exceeded

Ah OK, there's a whole section about stack usage in the windows port README so I'm guessing we're just running out of stack.

Add `length`, `byteorder`, `signed` according to the micropython#16311. Signed-off-by: Ihor Nehrutsa <Ihor.Nehrutsa@gmail.com>

For both small and long integers, raise an exception if calling struct.pack, adding an element to an array.array, or formatting an int with int.to_bytes would overflow the requested size. (Cherry-picked from CircuitPython commit 095c844.) Signed-off-by: Angus Gratton <angus@redyak.com.au>

Cherry-picked from CircuitPython commit d103ac1. Signed-off-by: Angus Gratton <angus@redyak.com.au>

Cherry-picked from CircuitPython commit c592bd6. Signed-off-by: Angus Gratton <angus@redyak.com.au>

Signed-off-by: Angus Gratton <angus@redyak.com.au>

Fixes case where destination nbytes is non-power-of-2 size. Also adds tests. Cherry-picked from CircuitPython commit 8664a65. Signed-off-by: Angus Gratton <angus@redyak.com.au>

Originally added in 80c5e76 and 0b432b3, then replaced by cherry-pick in 7e4ee62, now restored. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>

- CPython and CircuitPython both raise OverflowError if an array constructor passes an out of bounds value. - MicroPython V1.x truncates the integer to suit. - The plan is for MicroPython V2 to change this to be the same as CircuitPython. The bounds checking cherry-picked from CircuitPython in 50bd33b and d352a73 adds these checks to array constructors. Move them behind a macro guard, and also rewrite the tests from CircuitPython to pass on MicroPython V1.x (but should be easy to convert over for MP2 in the future). This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>

Refactors similar code paths to a common mp_obj_int_to_bytes() function to reduce code size. This commit should have no functional changes. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au> Signed-off-by: Angus Gratton <angus@redyak.com.au>

Expands tests to match, although I think there may still be some corner cases where this doesn't work as expected on V2. Should be no change on V1. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au> Signed-off-by: Angus Gratton <angus@redyak.com.au>

Signed-off-by: Angus Gratton <angus@redyak.com.au>

This seems like it's only really a problem on Debug builds, but I think can't hurt to increase it on all windows builds. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>

projectgus · 2025-03-04T05:10:57Z

      stm32:  +156 +0.040% PYBV10

Had a quick look into why stm32 got bigger while all the other bare metal ports got smaller. Didn't find anything unusual: no significant new symbols are being linked compared to master. Several functions shrunk as they now call a common helper. I think maybe they just didn't shrink as much as on the other bare metal ports...

Diff of linked symbol sizes

--- ./build-PYBV10-master/firmware.syms 2025-03-04 15:59:56.145991980 +1100
+++ ./build-PYBV10/firmware.syms        2025-03-04 16:00:08.656797386 +1100
@@ -5948,0 +5949 @@
+00000018 t allowed_args.0
@@ -6199 +6200 @@
-00000264 t compile_scope_inline_asm
+00000260 t compile_scope_inline_asm
@@ -6503 +6504 @@
-00000970 t emit_inline_thumb_op
+00000978 t emit_inline_thumb_op
@@ -7326 +7327 @@
-000000d8 t int_to_bytes
+0000005c t int_to_bytes
@@ -7770,3 +7771,3 @@
-0000002a T mp_binary_set_int
-00000114 T mp_binary_set_val
-0000008c T mp_binary_set_val_array
+0000005a T mp_binary_set_int
+000000f2 T mp_binary_set_val
+0000006e T mp_binary_set_val_array
@@ -8329 +8330 @@
-00000042 T mp_obj_exception_attr
+00000040 T mp_obj_exception_attr
@@ -8373 +8374 @@
-00000016 T mp_obj_int_to_bytes_impl
+00000114 T mp_obj_int_to_bytes
@@ -8521 +8522 @@
-0000091c T mp_qstr_const_hashes
+00000922 T mp_qstr_const_hashes
@@ -8523 +8524 @@
-0000048e T mp_qstr_const_lengths
+00000491 T mp_qstr_const_lengths
@@ -8525 +8526 @@
-00001250 T mp_qstr_const_pool
+0000125c T mp_qstr_const_pool
@@ -8891 +8892 @@
-000000d2 T mpz_as_bytes
+0000009e T mpz_as_bytes
@@ -9768 +9769 @@
-00000084 t set_aligned
+00000080 t set_aligned
@@ -10078 +10079 @@
-0000007c t task_attr
+00000078 t task_attr
@@ -10216 +10217 @@
-00000200 t uctypes_struct_attr_op
+00000204 t uctypes_struct_attr_op

(Files generated with nm --print-size firmware.elf | cut -d' ' -f2- | tee firmware.syms)

I don't see much we can do to improve this. Interestingly, the firmware size increase on PYBV10 is about the same if LTO is enabled (although the firmwares are overall smaller.)

dpgeorge · 2025-03-04T05:16:45Z

Had a quick look into why stm32 got bigger while all the other bare metal ports got smaller. Didn't find anything unusual

What about:

-00000016 T mp_obj_int_to_bytes_impl
+00000114 T mp_obj_int_to_bytes

That's 254 extra bytes. Is that similar on other ports that did decrease?

projectgus · 2025-03-04T06:15:08Z

That's 254 extra bytes. Is that similar on other ports that did decrease?

I think so, this is the common function that the others now call. They grow by basically the same amount on both samd and rp2 ports. Here's samd:

@@ -6342 +6338 @@
-00000016 T mp_obj_int_to_bytes_impl
+00000114 T mp_obj_int_to_bytes

Full size diffs

--- build-ADAFRUIT_ITSYBITSY_M4_EXPRESS-master/firmware.syms    2025-03-04 17:06:29.044966391 +1100
+++ build-ADAFRUIT_ITSYBITSY_M4_EXPRESS/firmware.syms   2025-03-04 17:07:24.983870265 +1100
@@ -993 +992,0 @@
-t $d
@@ -4381,3 +4379,0 @@
-t $t
-t $t
-t $t
@@ -4404,0 +4401 @@
+00000018 t allowed_args.0
@@ -4620 +4617 @@
-00000260 t compile_scope_inline_asm
+00000264 t compile_scope_inline_asm
@@ -4822 +4819 @@
-00000968 t emit_inline_thumb_op
+00000970 t emit_inline_thumb_op
@@ -5312 +5309 @@
-000000d8 t int_to_bytes
+0000005c t int_to_bytes
@@ -5758,4 +5755,3 @@
-0000002a T mp_binary_set_int
-0000016c T mp_binary_set_val
-0000008c T mp_binary_set_val_array
-000000f2 T mp_binary_set_val_array_from_int
+0000005a T mp_binary_set_int
+00000150 T mp_binary_set_val
+0000006e T mp_binary_set_val_array
@@ -5777 +5773 @@
-0000008c t mp_builtin_compile
+00000088 t mp_builtin_compile
@@ -6300 +6296 @@
-0000003a T mp_obj_exception_attr
+00000038 T mp_obj_exception_attr
@@ -6342 +6338 @@
-00000016 T mp_obj_int_to_bytes_impl
+00000114 T mp_obj_int_to_bytes
@@ -6477 +6473 @@
-00000542 T mp_qstr_const_hashes
+00000536 T mp_qstr_const_hashes
@@ -6479 +6475 @@
-000002a1 T mp_qstr_const_lengths
+0000029b T mp_qstr_const_lengths
@@ -6481 +6477 @@
-00000a9c T mp_qstr_const_pool
+00000a84 T mp_qstr_const_pool
@@ -6802 +6798 @@
-000000d2 T mpz_as_bytes
+0000009e T mpz_as_bytes
@@ -7172 +7168 @@
-00000084 t set_aligned
+00000080 t set_aligned
@@ -7245 +7241 @@
-00000034 t slice_attr
+00000030 t slice_attr
@@ -7402 +7398 @@
-00000070 t task_attr
+0000007c t task_attr
@@ -7553 +7549 @@
-00000200 t uctypes_struct_attr_op
+00000204 t uctypes_struct_attr_op
@@ -7599 +7595 @@
-00000038 t usb_device_attr
+00000034 t usb_device_attr

--- build-RPI_PICO-master/firmware.syms        2025-03-04 17:01:17.898379768 +1100
+++ build-RPI_PICO/firmware.syms 2025-03-04 17:01:28.760234428 +1100
@@ -31,4 +30,0 @@
-t $d
-t $d
-t $d
-t $d
@@ -244 +239,0 @@
-t $d
@@ -630,0 +626,2 @@
+t $d
+r $d
@@ -1202,0 +1200 @@
+t $d
@@ -1536,0 +1535 @@
+t $d
@@ -5462,3 +5460,0 @@
-t $t
-t $t
-t $t
@@ -5503,0 +5500 @@
+00000018 r allowed_args.0
@@ -5826 +5823 @@
-00000278 t compile_scope_inline_asm
+00000274 t compile_scope_inline_asm
@@ -6144 +6141 @@
-00000498 t emit_inline_thumb_op
+00000494 t emit_inline_thumb_op
@@ -6781 +6778 @@
-000000d4 t int_to_bytes
+00000060 t int_to_bytes
@@ -6786 +6783 @@
-00000044 t iobase_ioctl
+00000040 t iobase_ioctl
@@ -7352,4 +7349,3 @@
-0000002a T mp_binary_set_int
-00000154 T mp_binary_set_val
-00000084 T mp_binary_set_val_array
-00000084 T mp_binary_set_val_array_from_int
+00000056 T mp_binary_set_int
+0000012e T mp_binary_set_val
+00000064 T mp_binary_set_val_array
@@ -7370 +7366 @@
-00000064 t mp_builtin_compile
+00000068 t mp_builtin_compile
@@ -7885 +7881 @@
-0000005c T mp_native_type_from_qstr
+00000064 T mp_native_type_from_qstr
@@ -7962 +7958 @@
-00000016 T mp_obj_int_to_bytes_impl
+00000110 T mp_obj_int_to_bytes
@@ -8109 +8105 @@
-000002ee R mp_qstr_const_hashes
+000002f1 R mp_qstr_const_hashes
@@ -8111 +8107 @@
-000002ee R mp_qstr_const_lengths
+000002f1 R mp_qstr_const_lengths
@@ -8113 +8109 @@
-00000bd0 R mp_qstr_const_pool
+00000bdc R mp_qstr_const_pool
@@ -8355 +8351 @@
-00000064 T mp_vfs_blockdev_init
+00000060 T mp_vfs_blockdev_init
@@ -8365 +8361 @@
-00000068 T mp_vfs_getcwd
+0000006c T mp_vfs_getcwd
@@ -8367,2 +8363,2 @@
-00000060 T mp_vfs_ilistdir
-00000070 t mp_vfs_ilistdir_it_iternext
+00000064 T mp_vfs_ilistdir
+00000074 t mp_vfs_ilistdir_it_iternext
@@ -8370 +8366 @@
-0000007c T mp_vfs_import_stat
+0000007a T mp_vfs_import_stat
@@ -8433 +8429 @@
-0000002e T mp_vfs_rename
+00000030 T mp_vfs_rename
@@ -8435 +8431 @@
-00000016 T mp_vfs_rmdir
+00000018 T mp_vfs_rmdir
@@ -8437 +8433 @@
-00000040 T mp_vfs_stat
+0000003c T mp_vfs_stat
@@ -8439 +8435 @@
-00000054 T mp_vfs_statvfs
+00000058 T mp_vfs_statvfs
@@ -8441 +8437 @@
-0000009c T mp_vfs_umount
+000000a0 T mp_vfs_umount
@@ -8450 +8446 @@
-00000140 T mpz_as_bytes
+000000ea T mpz_as_bytes
@@ -9025 +9021 @@
-00000084 t set_aligned
+0000007c t set_aligned
@@ -9102 +9098 @@
-00000038 t slice_attr
+00000034 t slice_attr

Ah! But the very big difference on these ports is that, unlike stm32, the mp_binary_set_val_array_from_int function is no longer linked in this PR! That's about 127 bytes.

That's the function which is still called from stm32/adc.c . So if we can find another way to implement that, the code would go down on stm32 (and we also wouldn't need the changes in #16858).

projectgus · 2025-03-04T06:15:52Z

EDIT: Accidentally buried the useful info in the <details> tag above.

projectgus · 2025-03-04T06:25:49Z

That's the function which is still called from stm32/adc.c . So if we can find another way to implement that, the code would go down on stm32 (and we also wouldn't need the changes in #16858).

Specifically pyb.ADC.read_timed and read_timed_multi are documented as supporting either an array of 8/16-bit integers or a bytearray.

mp_binary_set_val_array_from_int supports any numeric typecode. So I think there's potentially a much simpler version we can embed into adc.c and remove the other function entirely. But probably best to do in a separate PR.

projectgus · 2025-03-04T06:30:22Z

py/binary.c

-                return;
-            }
-            #endif
-            mp_binary_set_val_array_from_int(typecode, p, index, mp_obj_get_int(val_in));


Double checking re: discussion in comments above, this is the place that previously called mp_binary_set_val_array_from_int() on all ports.

I think this is a valid change (and covered by tests), non-integer typecodes are handled above so all of the other possibilities are integers of different sizes.

We can probably call mp_binary_set_val_array from adc.c as well, will need to test to be certain.

projectgus · 2025-03-04T06:32:27Z

py/binary.c

+        default: {
+            size_t size = mp_binary_get_size('@', typecode, NULL);
+            p = (uint8_t *)p + index * size;
+            mp_obj_int_to_bytes(val_in, size, p, MP_ENDIANNESS_BIG, is_signed(typecode), OVERFLOW_CHECKS);


This endianness choice seems wrong, though, at least mp_binary_set_val_array_from_int would have used native endianness...? Need to make sure this is being tested.

According to the micropython#16311. Signed-off-by: Ihor Nehrutsa <Ihor.Nehrutsa@gmail.com>

Add `length`, `byteorder`, `signed` according to the micropython#16311. Signed-off-by: Ihor Nehrutsa <Ihor.Nehrutsa@gmail.com>

Support signed param: result = int.from_bytes(bytearray(), order='big'|'little', signed=False|True) Add `length`, `byteorder`, `signed` according to the micropython#16311. Signed-off-by: Ihor Nehrutsa <Ihor.Nehrutsa@gmail.com>

projectgus added the py-core Relates to py/ directory in source label Nov 27, 2024

projectgus force-pushed the feature/int_tobytes_signed branch from 77160cc to 0fb050a Compare November 27, 2024 06:59

projectgus force-pushed the feature/int_tobytes_signed branch 3 times, most recently from c84ab58 to 88b516a Compare December 4, 2024 07:27

projectgus force-pushed the feature/int_tobytes_signed branch from 88b516a to 140c2c9 Compare December 11, 2024 05:14

projectgus mentioned this pull request Dec 17, 2024

extmod/uctypes: fix addressof when returning large values #16440

Merged

projectgus mentioned this pull request Feb 24, 2025

WIP py/objint.c: Add signed param to int.from_bytes(). #16798

Open

projectgus force-pushed the feature/int_tobytes_signed branch 4 times, most recently from 4faf739 to b3b3d4e Compare February 25, 2025 07:14

projectgus commented Feb 25, 2025

View reviewed changes

ports/unix/coverage.c Outdated Show resolved Hide resolved

IhorNehrutsa added a commit to IhorNehrutsa/micropython that referenced this pull request Feb 25, 2025

py/objint: Rename signd to is_signed.

6d5e840

According to the micropython#16311. Signed-off-by: Ihor Nehrutsa <Ihor.Nehrutsa@gmail.com>

IhorNehrutsa added a commit to IhorNehrutsa/micropython that referenced this pull request Feb 25, 2025

py/objint.c: Use kw args in int.from_bytes().

2f466fd

Add `length`, `byteorder`, `signed` according to the micropython#16311. Signed-off-by: Ihor Nehrutsa <Ihor.Nehrutsa@gmail.com>

IhorNehrutsa added a commit to IhorNehrutsa/micropython that referenced this pull request Feb 26, 2025

py/objint.c: Use kw args in int.from_bytes().

4977a78

Add `length`, `byteorder`, `signed` according to the micropython#16311. Signed-off-by: Ihor Nehrutsa <Ihor.Nehrutsa@gmail.com>

projectgus mentioned this pull request Mar 4, 2025

unix/coverage: Increase coverage of mp_binary_set_val_array_from_int. #16858

Closed

Matt Wozniski and others added 9 commits March 4, 2025 15:58

py/objint: Handle truth values; speed up smallint checks.

89c14f0

Cherry-picked from CircuitPython commit d103ac1. Signed-off-by: Angus Gratton <angus@redyak.com.au>

py/objint: Implement to_bytes(..., signed=True).

8daac9a

Cherry-picked from CircuitPython commit c592bd6. Signed-off-by: Angus Gratton <angus@redyak.com.au>

tests: Update for new int.to_bytes() behaviour.

b8c67a9

Signed-off-by: Angus Gratton <angus@redyak.com.au>

py/objint: Use approx of original @godlygeek code for smallints.

4207702

Fixes case where destination nbytes is non-power-of-2 size. Also adds tests. Cherry-picked from CircuitPython commit 8664a65. Signed-off-by: Angus Gratton <angus@redyak.com.au>

py/objint: Restore CPython >= 3.11 defaults for int.to_bytes().

46dc37f

Originally added in 80c5e76 and 0b432b3, then replaced by cherry-pick in 7e4ee62, now restored. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>

projectgus added 2 commits March 4, 2025 15:58

ci: Allow unittest to be imported when testing mpy merge.

92a77be

Signed-off-by: Angus Gratton <angus@redyak.com.au>

windows: Double the stack size allowance on windows port.

6f3c7e0

This seems like it's only really a problem on Debug builds, but I think can't hurt to increase it on all windows builds. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>

projectgus force-pushed the feature/int_tobytes_signed branch from 2c3f941 to 6f3c7e0 Compare March 4, 2025 05:11

projectgus commented Mar 4, 2025

View reviewed changes

IhorNehrutsa added a commit to IhorNehrutsa/micropython that referenced this pull request Mar 13, 2025

py/objint: Rename signd to is_signed.

735435a

According to the micropython#16311. Signed-off-by: Ihor Nehrutsa <Ihor.Nehrutsa@gmail.com>

IhorNehrutsa added a commit to IhorNehrutsa/micropython that referenced this pull request Mar 13, 2025

py/objint.c: Use kw args in int.from_bytes().

44e5e3c

Add `length`, `byteorder`, `signed` according to the micropython#16311. Signed-off-by: Ihor Nehrutsa <Ihor.Nehrutsa@gmail.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

py/objint,py/binary: Add int.to_bytes(signed) parameter, add common overflow checks. #16311

py/objint,py/binary: Add int.to_bytes(signed) parameter, add common overflow checks. #16311

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

py/objint,py/binary: Add int.to_bytes(signed) parameter, add common overflow checks. #16311

Are you sure you want to change the base?

py/objint,py/binary: Add int.to_bytes(signed) parameter, add common overflow checks. #16311

Conversation

Uh oh!

Summary

Testing

Trade-offs and Alternatives

Uh oh!

Uh oh!

Codecov Report

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!