-
-
Notifications
You must be signed in to change notification settings - Fork 8.2k
Add a BufferedReader and allow BufferedWriter to handle partial writes and errors after some data was written. #13390
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Code size report:
|
I have fixed the spelling error, I am unsure what should I do with the Signed-Of-By headers. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #13390 +/- ##
==========================================
- Coverage 98.54% 98.40% -0.14%
==========================================
Files 169 169
Lines 21897 21937 +40
==========================================
+ Hits 21579 21588 +9
- Misses 318 349 +31 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
py/misc.h
Outdated
@@ -33,6 +33,8 @@ | |||
#include <stdbool.h> | |||
#include <stdint.h> | |||
#include <stddef.h> | |||
// Needed for NORETURN | |||
#include "py/mpconfig.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't think this is explicitly documented, but see the rest of the code (grep for all occurrences of misch.h): the idea is that any file needing to include py/misc.h first includes py/mpconfig.h, so this change should not be made.
// Writes out the data stored in the buffer so far | ||
STATIC int bufwriter_do_write(mp_obj_bufwriter_t *self) { | ||
int rv = 0; | ||
// This cannot return 0 without an error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you elaborate on what you mean with this? Like: why would it be an issue for this bit of code if it were 0?
STATIC mp_uint_t bufwriter_write(mp_obj_t self_in, const void *buf, mp_uint_t size, int *errcode) { | ||
mp_obj_bufwriter_t *self = MP_OBJ_TO_PTR(self_in); | ||
|
||
mp_uint_t org_size = size; | ||
// Alloc should always remain the same so cache it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor, but here the sentence ends with a dot (imo best), but not t 8000 he other comments. Also the comment below is after its statement instead of on the line above it.
Would it be possible to add tests? The code is complex enogh to warrant testing all possible edge cases imo. |
This is an automated heads-up that we've just merged a Pull Request See #13763 A search suggests this PR might apply the STATIC macro to some C code. If it Although this is an automated message, feel free to @-reply to me directly if |
To simplify the logic create bufwriter_do_write. This function keeps track of the data left on self->len and, if a partial write happens, this function uses memmove to move the remaining data back to the beginning of the buffer. This allows simplifying bufwriter_write and bufwriter_flush significantly. bufwriter_flush now only needs to call this function if the buffer has some data stored and check for any error codes. Additionally, if the buffer is only partially flushed we notify the user too (so that they know there might be data left). bufwriter_write now just needs to call this function whenever the buffer gets full and copy the input into the buffer. It will return when either no data is written at all or when all of the input is consumed. In the case of a partial write it returns exactly the amount of data which was written. Additionally allow caching of errors to better handle partial writes. Until now if an error occurred during the write, the error would be raised and the caller had no way to know if any data was written at all (for example in prior calls if more than one block of data was passed as input). Now when we have written out some data and an error happens, we reset the buffer to the state it would have if it did not contain the data that was not written (and which was not buffered previously), and then, we return the data that was written (if any) or raise an error if no data from the input was written. This allows the programmer better control of writes. In particular, the programmer will know exactly how much of its last input data was written, consequently allowing it to handle whatever data left to be written in a better way. Signed-off-by: Francisco Blas (klondike) Izquierdo Riera <klondike@klondike.es>
Some times there is a need for a BufferedReader in a way similar to how BufferedWritter works. A clear example is when using an underlying device requiring aligned reads, but a less clear example is when using deflate.DeflateIO which will do only 1-byte reads and can become crippling quickly when the underlying object is a python implemented stream instead of a native one. The BufferedReader will only attempt to do full-buffer reads and ensures word-alignment in a way similar to how the writer does. Similarly, it will also hide any errors when partial reads happen to ensure that any data copied so far can be returned first. Signed-off-by: Francisco Blas (klondike) Izquierdo Riera <klondike@klondike.es>
I've rebased them to match all the changes that happened in the Tested it and it works well for significantly speeding up DeflateIO when it's doing many small reads from a not so efficient custom stream. I'd love to see this contribution included. I'll open a small PR for the |
3ce4d08
to
2c4d1db
Compare
Thanks for picking this up, @DvdGiessen. Unsure how best to handle it, as the PR's been open for a long time. However I've taken the liberty of updating @klondi's branch in this PR with your ch 8000 anges. If we end up going back and forth more than a little then it might be best to open a new PR instead, though. |
To keep code coverage happy (and to give us the confidence of test coverage), this probably needs some unit tests. The It seems to me that the "standard" config of the unix port might benefit from having |
IMO this is a worthy feature to have (although at what level to enable it is a separate question). But it will need:
|
The commits should already describe much of what is going on but in summary.
Fix a missing include in py/mpconfig.h
Update BufferedWriter to handle partial writes and errors after some data is written.
Add a BufferedReader class handling also partial reads and errors after some data is read.
This is my first time sending code to micropython so I hope I did not make any mistakes.
(Also in case it is necessary, yes the code is mine, it is owned by me and not by any of my employers and I agree to release it under the MIT License).