8000 py/objint,py/binary: Add int.to_bytes(signed) parameter, add common overflow checks. by projectgus · Pull Request #16311 · micropython/micropython · GitHub
[go: up one dir, main page]

Skip to content

py/objint,py/binary: Add int.to_bytes(signed) parameter, add common overflow checks. #16311

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

projectgus
Copy link
Contributor
@projectgus projectgus commented Nov 27, 2024

Summary

This PR is attempting to bring some CircuitPython changes upstream to MicroPython. From these PRs:

  1. Add overflow checks for int to bytes conversions adafruit/circuitpython#1860
  2. Handle truth values; speed up smallint checks adafruit/circuitpython#1879
  3. Implement to_bytes(..., signed=True) adafruit/circuitpython#2625

The changes:

  • Implement overflow checks for buffer conversions, in a similar but more general way to py/objint: Fix int.to_bytes() buffer size checks. #13087.
  • Add optional int.to_bytes(signed=bool) parameter. This removes the CPython incompatibility where currently int.to_bytes in MicroPython behaves as if signed=self < 0 when converting.

After cherry-picking commits directly from CircuitPython and resolving conflicts, some additional commits were added:

  • Update existing MicroPython tests to pass. Most tests are now simpler, as behaviour now closer to CPython.
  • Restore optional passing of arguments for byteorder and length, as per py/objint: Make byteorder argument optional in byte-conversion methods. #14387 and CPython >=3.11.
  • Do not raise OverflowError from array constructors if a value is out of range, truncate the value instead. This is a documented difference between CPython and MicroPython, and after discussing with @dpgeorge decided changing it here has the potential to break existing MP code. The overflow checks are kept if MICROPY_PREVIEW_VERSION_2 is set, so that we can change behaviour to match CPython (and CircuitPython) in MicroPython 2. The original justification for not checking here was to save code size, but the firmware code size will actually become smaller after making this change (due to a common code path for overflow checks in array constructors and int.to_bytes). I guess Python code size may become a littler bigger if masking needs to be added, but it also eliminates a potential silent bug.
  • Additional refactoring commit to merge similar code paths and bring the code size back down closer to current master.
  • Add support for V2 non-truncating setters to moductypes.
  • Extend coverage of mp_binary_set_val_array_from_int so the project coverage metric stays passing.
  • Double the allowed stack size on the Windows port. This seems necessary to allow debug builds to pass the new unit tests in this PR.

Testing

  • Unit tests passing on unix and rp2 ports.
  • Unit tests passing on Zephyr port (uses long long implementation for bigint).

Trade-offs and Alternatives

  • We already have some of this bounds checking functionality in master now, but the approach taken by CircuitPython is more general and has potential to apply the same checks in multiple places (i.e. array constructors) with no further impact on code size.

@projectgus projectgus added the py-core Relates to py/ directory in source label Nov 27, 2024
@projectgus projectgus force-pushed the feature/int_tobytes_signed branch from 77160cc to 0fb050a Compare November 27, 2024 06:59
Copy link
codecov bot commented Nov 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.36%. Comparing base (b4cf82b) to head (6f3c7e0).

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #16311      +/-   ##
==========================================
- Coverage   98.54%   98.36%   -0.18%     
==========================================
  Files         169      169              
  Lines       21864    21892      +28     
==========================================
- Hits        21545    21534      -11     
- Misses        319      358      +39     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
github-actions bot commented Nov 27, 2024

Code size report:

   bare-arm:  +120 +0.211% 
minimal x86:  +320 +0.170% [incl +32(data)]
   unix x64:   +48 +0.006% standard
      stm32:  +148 +0.038% PYBV10
     mimxrt:   -96 -0.026% TEENSY40
        rp2:    -8 -0.001% RPI_PICO_W
       samd:   -84 -0.031% ADAFRUIT_ITSYBITSY_M4_EXPRESS
  qemu rv32:  -172 -0.038% VIRT_RV32

@projectgus projectgus force-pushed the feature/int_tobytes_signed branch 3 times, most recently from c84ab58 to 88b516a Compare December 4, 2024 07:27
@projectgus projectgus force-pushed the feature/int_tobytes_signed branch from 88b516a to 140c2c9 Compare December 11, 2024 05:14
@projectgus projectgus force-pushed the feature/int_tobytes_signed branch 4 times, most recently from 4faf739 to b3b3d4e Compare February 25, 2025 07:14
IhorNehrutsa added a commit to IhorNehrutsa/micropython that referenced this pull request Feb 25, 2025
According to the micropython#16311.

Signed-off-by: Ihor Nehrutsa <Ihor.Nehrutsa@gmail.com>
IhorNehrutsa added a commit to IhorNehrutsa/micropython that referenced this pull request Feb 25, 2025
Add `length`, `byteorder`, `signed`
according to the micropython#16311.

Signed-off-by: Ihor Nehrutsa <Ihor.Nehrutsa@gmail.com>
@projectgus
Copy link
Contributor Author

Only two things left here...

  • Project coverage has somehow gone down even though the patch coverage is 100%. I've added some additional cove in coverage.c but it's still not quite enough.
  • The Windows x64 Debug builds are failing in the uctypes_load_store test with RuntimeError: maximum recursion depth exceeded. I don't know if this is an actual stack usage problem, a bug triggered when a unittest assert fails, or something else...

@projectgus
Copy link
Contributor Author
  • The Windows x64 Debug builds are failing in the uctypes_load_store test with RuntimeError: maximum recursion depth exceeded

Ah OK, there's a whole section about stack usage in the windows port README so I'm guessing we're just running out of stack.

IhorNehrutsa added a commit to IhorNehrutsa/micropython that referenced this pull request Feb 26, 2025
Add `length`, `byteorder`, `signed`
according to the micropython#16311.

Signed-off-by: Ihor Nehrutsa <Ihor.Nehrutsa@gmail.com>
Matt Wozniski and others added 9 commits March 4, 2025 15:58
For both small and long integers, raise an exception if calling
struct.pack, adding an element to an array.array, or formatting an int
with int.to_bytes would overflow the requested size.

(Cherry-picked from CircuitPython commit 095c844.)

Signed-off-by: Angus Gratton <angus@redyak.com.au>
Cherry-picked from CircuitPython commit d103ac1.

Signed-off-by: Angus Gratton <angus@redyak.com.au>
Cherry-picked from CircuitPython commit c592bd6.

Signed-off-by: Angus Gratton <angus@redyak.com.au>
Signed-off-by: Angus Gratton <angus@redyak.com.au>
Fixes case where destination nbytes is non-power-of-2 size.

Also adds tests.

Cherry-picked from CircuitPython commit 8664a65.

Signed-off-by: Angus Gratton <angus@redyak.com.au>
Originally added in 80c5e76 and 0b432b3, then replaced by cherry-pick in
7e4ee62, now restored.

This work was funded through GitHub Sponsors.

Signed-off-by: Angus Gratton <angus@redyak.com.au>
- CPython and CircuitPython both raise OverflowError if an array
  constructor passes an out of bounds value.
- MicroPython V1.x truncates the integer to suit.
- The plan is for MicroPython V2 to change this to be the same
  as CircuitPython.

The bounds checking cherry-picked from CircuitPython in 50bd33b and d352a73
adds these checks to array constructors. Move them behind a macro guard,
and also rewrite the tests from CircuitPython to pass on MicroPython
V1.x (but should be easy to convert over for MP2 in the future).

This work was funded through GitHub Sponsors.

Signed-off-by: Angus Gratton <angus@redyak.com.au>
Refactors similar code paths to a common mp_obj_int_to_bytes() function
to reduce code size. This commit should have no functional changes.

This work was funded through GitHub Sponsors.

Signed-off-by: Angus Gratton <angus@redyak.com.au>
Signed-off-by: Angus Gratton <angus@redyak.com.au>
Expands tests to match, although I think there may still be some corner
cases where this doesn't work as expected on V2. Should be no change on V1.

This work was funded through GitHub Sponsors.

Signed-off-by: Angus Gratton <angus@redyak.com.au>
Signed-off-by: Angus Gratton <angus@redyak.com.au>
Signed-off-by: Angus Gratton <angus@redyak.com.au>
This seems like it's only really a problem on Debug builds, but
I think can't hurt to increase it on all windows builds.

This work was funded through GitHub Sponsors.

Signed-off-by: Angus Gratton <angus@redyak.com.au>
@projectgus
Copy link
Contributor Author
projectgus commented Mar 4, 2025
      stm32:  +156 +0.040% PYBV10

Had a quick look into why stm32 got bigger while all the other bare metal ports got smaller. Didn't find anything unusual: no significant new symbols are being linked compared to master. Several functions shrunk as they now call a common helper. I think maybe they just didn't shrink as much as on the other bare metal ports...

Diff of linked symbol sizes
--- ./build-PYBV10-master/firmware.syms 2025-03-04 15:59:56.145991980 +1100
+++ ./build-PYBV10/firmware.syms        2025-03-04 16:00:08.656797386 +1100
@@ -5948,0 +5949 @@
+00000018 t allowed_args.0
@@ -6199 +6200 @@
-00000264 t compile_scope_inline_asm
+00000260 t compile_scope_inline_asm
@@ -6503 +6504 @@
-00000970 t emit_inline_thumb_op
+00000978 t emit_inline_thumb_op
@@ -7326 +7327 @@
-000000d8 t int_to_bytes
+0000005c t int_to_bytes
@@ -7770,3 +7771,3 @@
-0000002a T mp_binary_set_int
-00000114 T mp_binary_set_val
-0000008c T mp_binary_set_val_array
+0000005a T mp_binary_set_int
+000000f2 T mp_binary_set_val
+0000006e T mp_binary_set_val_array
@@ -8329 +8330 @@
-00000042 T mp_obj_exception_attr
+00000040 T mp_obj_exception_attr
@@ -8373 +8374 @@
-00000016 T mp_obj_int_to_bytes_impl
+00000114 T mp_obj_int_to_bytes
@@ -8521 +8522 @@
-0000091c T mp_qstr_const_hashes
+00000922 T mp_qstr_const_hashes
@@ -8523 +8524 @@
-0000048e T mp_qstr_const_lengths
+00000491 T mp_qstr_const_lengths
@@ -8525 +8526 @@
-00001250 T mp_qstr_const_pool
+0000125c T mp_qstr_const_pool
@@ -8891 +8892 @@
-000000d2 T mpz_as_bytes
+0000009e T mpz_as_bytes
@@ -9768 +9769 @@
-00000084 t set_aligned
+00000080 t set_aligned
@@ -10078 +10079 @@
-0000007c t task_attr
+00000078 t task_attr
@@ -10216 +10217 @@
-00000200 t uctypes_struct_attr_op
+00000204 t uctypes_struct_attr_op

(Files generated with nm --print-size firmware.elf | cut -d' ' -f2- | tee firmware.syms)

I don't see much we can do to improve this. Interestingly, the firmware size increase on PYBV10 is about the same if LTO is enabled (although the firmwares are overall smaller.)

@projectgus projectgus force-pushed the feature/int_tobytes_signed branch from 2c3f941 to 6f3c7e0 Compare March 4, 2025 05:11
@dpgeorge
Copy link
Member
dpgeorge commented Mar 4, 2025

Had a quick look into why stm32 got bigger while all the other bare metal ports got smaller. Didn't find anything unusual

What about:

-00000016 T mp_obj_int_to_bytes_impl
+00000114 T mp_obj_int_to_bytes

That's 254 extra bytes. Is that similar on other ports that did decrease?

@projectgus
Copy link
Contributor Author
projectgus commented Mar 4, 2025

That's 254 extra bytes. Is that similar on other ports that did decrease?

I think so, this is the common function that the others now call. They grow by basically the same amount on both samd and rp2 ports. Here's samd:

@@ -6342 +6338 @@
-00000016 T mp_obj_int_to_bytes_impl
+00000114 T mp_obj_int_to_bytes
Full size diffs
--- build-ADAFRUIT_ITSYBITSY_M4_EXPRESS-master/firmware.syms    2025-03-04 17:06:29.044966391 +1100
+++ build-ADAFRUIT_ITSYBITSY_M4_EXPRESS/firmware.syms   2025-03-04 17:07:24.983870265 +1100
@@ -993 +992,0 @@
-t $d
@@ -4381,3 +4379,0 @@
-t $t
-t $t
-t $t
@@ -4404,0 +4401 @@
+00000018 t allowed_args.0
@@ -4620 +4617 @@
-00000260 t compile_scope_inline_asm
+00000264 t compile_scope_inline_asm
@@ -4822 +4819 @@
-00000968 t emit_inline_thumb_op
+00000970 t emit_inline_thumb_op
@@ -5312 +5309 @@
-000000d8 t int_to_bytes
+0000005c t int_to_bytes
@@ -5758,4 +5755,3 @@
-0000002a T mp_binary_set_int
-0000016c T mp_binary_set_val
-0000008c T mp_binary_set_val_array
-000000f2 T mp_binary_set_val_array_from_int
+0000005a T mp_binary_set_int
+00000150 T mp_binary_set_val
+0000006e T mp_binary_set_val_array
@@ -5777 +5773 @@
-0000008c t mp_builtin_compile
+00000088 t mp_builtin_compile
@@ -6300 +6296 @@
-0000003a T mp_obj_exception_attr
+00000038 T mp_obj_exception_attr
@@ -6342 +6338 @@
-00000016 T mp_obj_int_to_bytes_impl
+00000114 T mp_obj_int_to_bytes
@@ -6477 +6473 @@
-00000542 T mp_qstr_const_hashes
+00000536 T mp_qstr_const_hashes
@@ -6479 +6475 @@
-000002a1 T mp_qstr_const_lengths
+0000029b T mp_qstr_const_lengths
@@ -6481 +6477 @@
-00000a9c T mp_qstr_const_pool
+00000a84 T mp_qstr_const_pool
@@ -6802 +6798 @@
-000000d2 T mpz_as_bytes
+0000009e T mpz_as_bytes
@@ -7172 +7168 @@
-00000084 t set_aligned
+00000080 t set_aligned
@@ -7245 +7241 @@
-00000034 t slice_attr
+00000030 t slice_attr
@@ -7402 +7398 @@
-00000070 t task_attr
+0000007c t task_attr
@@ -7553 +7549 @@
-00000200 t uctypes_struct_attr_op
+00000204 t uctypes_struct_attr_op
@@ -7599 +7595 @@
-00000038 t usb_device_attr
+00000034 t usb_device_attr
--- build-RPI_PICO-master/firmware.syms        2025-03-04 17:01:17.898379768 +1100
+++ build-RPI_PICO/firmware.syms 2025-03-04 17:01:28.760234428 +1100
@@ -31,4 +30,0 @@
-t $d
-t $d
-t $d
-t $d
@@ -244 +239,0 @@
-t $d
@@ -630,0 +626,2 @@
+t $d
+r $d
@@ -1202,0 +1200 @@
+t $d
@@ -1536,0 +1535 @@
+t $d
@@ -5462,3 +5460,0 @@
-t $t
-t $t
-t $t
@@ -5503,0 +5500 @@
+00000018 r allowed_args.0
@@ -5826 +5823 @@
-00000278 t compile_scope_inline_asm
+00000274 t compile_scope_inline_asm
@@ -6144 +6141 @@
-00000498 t emit_inline_thumb_op
+00000494 t emit_inline_thumb_op
@@ -6781 +6778 @@
-000000d4 t int_to_bytes
+00000060 t int_to_bytes
@@ -6786 +6783 @@
-00000044 t iobase_ioctl
+00000040 t iobase_ioctl
@@ -7352,4 +7349,3 @@
-0000002a T mp_binary_set_int
-00000154 T mp_binary_set_val
-00000084 T mp_binary_set_val_array
-00000084 T mp_binary_set_val_array_from_int
+00000056 T mp_binary_set_int
+0000012e T mp_binary_set_val
+00000064 T mp_binary_set_val_array
@@ -7370 +7366 @@
-00000064 t mp_builtin_compile
+00000068 t mp_builtin_compile
@@ -7885 +7881 @@
-0000005c T mp_native_type_from_qstr
+00000064 T mp_native_type_from_qstr
@@ -7962 +7958 @@
-00000016 T mp_obj_int_to_bytes_impl
+00000110 T mp_obj_int_to_bytes
@@ -8109 +8105 @@
-000002ee R mp_qstr_const_hashes
+000002f1 R mp_qstr_const_hashes
@@ -8111 +8107 @@
-000002ee R mp_qstr_const_lengths
+000002f1 R mp_qstr_const_lengths
@@ -8113 +8109 @@
-00000bd0 R mp_qstr_const_pool
+00000bdc R mp_qstr_const_pool
@@ -8355 +8351 @@
-00000064 T mp_vfs_blockdev_init
+00000060 T mp_vfs_blockdev_init
@@ -8365 +8361 @@
-00000068 T mp_vfs_getcwd
+0000006c T mp_vfs_getcwd
@@ -8367,2 +8363,2 @@
-00000060 T mp_vfs_ilistdir
-00000070 t mp_vfs_ilistdir_it_iternext
+00000064 T mp_vfs_ilistdir
+00000074 t mp_vfs_ilistdir_it_iternext
@@ -8370 +8366 @@
-0000007c T mp_vfs_import_stat
+0000007a T mp_vfs_import_stat
@@ -8433 +8429 @@
-0000002e T mp_vfs_rename
+00000030 T mp_vfs_rename
@@ -8435 +8431 @@
-00000016 T mp_vfs_rmdir
+00000018 T mp_vfs_rmdir
@@ -8437 +8433 @@
-00000040 T mp_vfs_stat
+0000003c T mp_vfs_stat
@@ -8439 +8435 @@
-00000054 T mp_vfs_statvfs
+00000058 T mp_vfs_statvfs
@@ -8441 +8437 @@
-0000009c T mp_vfs_umount
+000000a0 T mp_vfs_umount
@@ -8450 +8446 @@
-00000140 T mpz_as_bytes
+000000ea T mpz_as_bytes
@@ -9025 +9021 @@
-00000084 t set_aligned
+0000007c t set_aligned
@@ -9102 +9098 @@
-00000038 t slice_attr
+00000034 t slice_attr

Ah! But the very big difference on these ports is that, unlike stm32, the mp_binary_set_val_array_from_int function is no longer linked in this PR! That's about 127 bytes.

That's the function which is still called from stm32/adc.c . So if we can find another way to implement that, the code would go down on stm32 (and we also wouldn't need the changes in #16858).

@projectgus
Copy link
Contributor Author

EDIT: Accidentally buried the useful info in the <details> tag above.

@projectgus
Copy link
Contributor Author

That's the function which is still called from stm32/adc.c . So if we can find another way to implement that, the code would go down on stm32 (and we also wouldn't need the changes in #16858).

Specifically pyb.ADC.read_timed and read_timed_multi are documented as supporting either an array of 8/16-bit integers or a bytearray.

mp_binary_set_val_array_from_int supports any numeric typecode. So I think there's potentially a much simpler version we can embed into adc.c and remove the other function entirely. But probably best to do in a separate PR.

return;
}
#endif
mp_binary_set_val_array_from_int(typecode, p, index, mp_obj_get_int(val_in));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double checking re: discussion in comments above, this is the place that previously called mp_binary_set_val_array_from_int() on all ports.

I think this is a valid change (and covered by tests), non-integer typecodes are handled above so all of the other possibilities are integers of different sizes.

We can probably call mp_binary_set_val_array from adc.c as well, will need to test to be certain.

default: {
size_t size = mp_binary_get_size('@', typecode, NULL);
p = (uint8_t *)p + index * size;
mp_obj_int_to_bytes(val_in, size, p, MP_ENDIANNESS_BIG, is_signed(typecode), OVERFLOW_CHECKS);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This endianness choice seems wrong, though, at least mp_binary_set_val_array_from_int would have used native endianness...? Need to make sure this is being tested.

IhorNehrutsa added a commit to IhorNehrutsa/micropython that referenced this pull request Mar 13, 2025
According to the micropython#16311.

Signed-off-by: Ihor Nehrutsa <Ihor.Nehrutsa@gmail.com>
IhorNehrutsa added a commit to IhorNehrutsa/micropython that referenced this pull request Mar 13, 2025
Add `length`, `byteorder`, `signed`
according to the micropython#16311.

Signed-off-by: Ihor Nehrutsa <Ihor.Nehrutsa@gmail.com>
IhorNehrutsa added a commit to IhorNehrutsa/micropython that referenced this pull request Mar 14, 2025
Support signed param:
result = int.from_bytes(bytearray(),
order='big'|'little', signed=False|True)

Add `length`, `byteorder`, `signed`
according to the micropython#16311.

Signed-off-by: Ihor Nehrutsa <Ihor.Nehrutsa@gmail.com>
IhorNehrutsa added a commit to IhorNehrutsa/micropython that referenced this pull request Mar 14, 2025
Support signed param:
result = int.from_bytes(bytearray(),
order='big'|'little', signed=False|True)

Add `length`, `byteorder`, `signed`
according to the micropython#16311.

Signed-off-by: Ihor Nehrutsa <Ihor.Nehrutsa@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
py-core Relates to py/ directory in source
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0