8000 Add a BufferedReader and allow BufferedWriter to handle partial writes and errors after some data was written. by klondi · Pull Request #13390 · micropython/micropython · GitHub
[go: up one dir, main page]

Skip to content

Add a BufferedReader and allow BufferedWriter to handle partial writes and errors after some data was written. #13390

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

klondi
Copy link
@klondi klondi commented Jan 9, 2024

The commits should already describe much of what is going on but in summary.

Fix a missing include in py/mpconfig.h
Update BufferedWriter to handle partial writes and errors after some data is written.
Add a BufferedReader class handling also partial reads and errors after some data is read.

This is my first time sending code to micropython so I hope I did not make any mistakes.

(Also in case it is necessary, yes the code is mine, it is owned by me and not by any of my employers and I agree to release it under the MIT License).

Copy link
github-actions bot commented Jan 9, 2024

Code size report:

   bare-arm:    +0 +0.000% 
minimal x86:    +0 +0.000% 
   unix x64:    +0 +0.000% standard
      stm32:    +0 +0.000% PYBV10
     mimxrt:    +0 +0.000% TEENSY40
        rp2:    +0 +0.000% RPI_PICO_W
       samd:    +0 +0.000% ADAFRUIT_ITSYBITSY_M4_EXPRESS
  qemu rv32:    +0 +0.000% VIRT_RV32

@klondi
Copy link
Author
klondi commented Jan 9, 2024

I have fixed the spelling error, I am unsure what should I do with the Signed-Of-By headers.

Copy link
codecov bot commented Jan 9, 2024

Codecov Report

Attention: Patch coverage is 44.64286% with 31 lines in your changes missing coverage. Please review.

Project coverage is 98.40%. Comparing base (883dc41) to head (2c4d1db).
Report is 47 commits behind head on master.

Files with missing lines Patch % Lines
py/modio.c 44.64% 31 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #13390      +/-   ##
==========================================
- Coverage   98.54%   98.40%   -0.14%     
==========================================
  Files         169      169              
  Lines       21897    21937      +40     
==========================================
+ Hits        21579    21588       +9     
- Misses        318      349      +31     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

py/misc.h Outdated
@@ -33,6 +33,8 @@
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>
// Needed for NORETURN
#include "py/mpconfig.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think this is explicitly documented, but see the rest of the code (grep for all occurrences of misch.h): the idea is that any file needing to include py/misc.h first includes py/mpconfig.h, so this change should not be made.

// Writes out the data stored in the buffer so far
STATIC int bufwriter_do_write(mp_obj_bufwriter_t *self) {
int rv = 0;
// This cannot return 0 without an error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on what you mean with this? Like: why would it be an issue for this bit of code if it were 0?

STATIC mp_uint_t bufwriter_write(mp_obj_t self_in, const void *buf, mp_uint_t size, int *errcode) {
mp_obj_bufwriter_t *self = MP_OBJ_TO_PTR(self_in);

mp_uint_t org_size = size;
// Alloc should always remain the same so cache it.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor, but here the sentence ends with a dot (imo best), but not t 8000 he other comments. Also the comment below is after its statement instead of on the line above it.

@stinos
Copy link
Contributor
stinos commented Jan 9, 2024

Would it be possible to add tests? The code is complex enogh to warrant testing all possible edge cases imo.

@dpgeorge dpgeorge added the py-core Relates to py/ directory in source label Jan 16, 2024
@projectgus
Copy link
Contributor

This is an automated heads-up that we've just merged a Pull Request
that removes the STATIC macro from MicroPython's C API.

See #13763

A search suggests this PR might apply the STATIC macro to some C code. If it
does, then next time you rebase the PR (or merge from master) then you should
please replace all the STATIC keywords with static.

Although this is an automated message, feel free to @-reply to me directly if
you have any questions about this.

klondi added 2 commits May 14, 2025 17:46
To simplify the logic create bufwriter_do_write. This function keeps
track of the data left on self->len and, if a partial write happens,
this function uses memmove to move the remaining data back to the
beginning of the buffer.

This allows simplifying bufwriter_write and bufwriter_flush significantly.

bufwriter_flush now only needs to call this function if the buffer has
some data stored and check for any error codes.
Additionally, if the buffer is only partially flushed we notify the user
too (so that they know there might be data left).

bufwriter_write now just needs to call this function whenever the buffer
gets full and copy the input into the buffer.
It will return when either no data is written at all or when all of the
input is consumed. In the case of a partial write it returns exactly the
amount of data which was written.

Additionally allow caching of errors to better handle partial writes.
Until now if an error occurred during the write, the error would be
raised and the caller had no way to know if any data was written at all
(for example in prior calls if more than one block of data was passed as
input). Now when we have written out some data and an error happens, we
reset the buffer to the state it would have if it did not contain the
data that was not written (and which was not buffered previously), and
then, we return the data that was written (if any) or raise an error if
no data from the input was written.

This allows the programmer better control of writes. In particular, the
programmer will know exactly how much of its last input data was written,
consequently allowing it to handle whatever data left to be written in
a better way.

Signed-off-by: Francisco Blas (klondike) Izquierdo Riera <klondike@klondike.es>
Some times there is a need for a BufferedReader in a way similar to how
BufferedWritter works. A clear example is when using an underlying device
requiring aligned reads, but a less clear example is when using
deflate.DeflateIO which will do only 1-byte reads and can become
crippling quickly when the underlying object is a python implemented
stream instead of a native one.

The BufferedReader will only attempt to do full-buffer reads and
ensures word-alignment in a way similar to how the writer does.
Similarly, it will also hide any errors when partial reads happen to
ensure that any data copied so far can be returned first.

Signed-off-by: Francisco Blas (klondike) Izquierdo Riera <klondike@klondike.es>
@DvdGiessen
Copy link
Contributor
DvdGiessen commented May 14, 2025

I've rebased them to match all the changes that happened in the master branch since this PR was made, and applied the review suggestion to import mpconfig.h directly in modio.h (we logically need it for MICROPY_PY_IO_BUFFEREDWRITER || MICROPY_PY_IO_BUFFEREDREADER anyway). Please feel free to copy these over: DvdGiessen/micropython@master...bufferedreaderwriter

Tested it and it works well for significantly speeding up DeflateIO when it's doing many small reads from a not so efficient custom stream. I'd love to see this contribution included.

I'll open a small PR for the Signed-Off-By breaking the rules. Long names should be allowed. (EDIT: #17302)

@projectgus
Copy link
Contributor

I've rebased them to match all the changes that happened in the master branch since this PR was made, and applied the review suggestion to import mpconfig.h directly in modio.h (we logically need it for MICROPY_PY_IO_BUFFEREDWRITER || MICROPY_PY_IO_BUFFEREDREADER anyway). Please feel free to copy these over: DvdGiessen/micropython@master...bufferedreaderwriter

Thanks for picking this up, @DvdGiessen. Unsure how best to handle it, as the PR's been open for a long time. However I've taken the liberty of updating @klondi's branch in this PR with your ch 8000 anges. If we end up going back and forth more than a little then it might be best to open a new PR instead, though.

@projectgus
Copy link
Contributor
projectgus commented May 22, 2025

To keep code coverage happy (and to give us the confidence of test coverage), this probably needs some unit tests. The coverage configuration of the unix port is built at "everything" level so it should have io.BufferedReader already, so the tests can be run in CI (as well as manually on the esp32 port).

It seems to me that the "standard" config of the unix port might benefit from having BufferedReader and BufferedWriter manually enabled, as well.

@dpgeorge
Copy link
Member

IMO this is a worthy feature to have (although at what level to enable it is a separate question). But it will need:

  • tests, full test coverage
  • docs
  • check API compatibility with CPython

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
py-core Relates to py/ directory in source
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants
0