no-copy io.BytesIO
#9456
andrewleech started this conversation in Ideas
Replies: 1 comment
-
I would find a non-std arg more discoverable and less confusing: it would state what is different, whereas if there are two classes I'd have to start wondering what all the differences are.
-
I've run into a number of situations where I'd like a stream interface to an existing buffer, often when assembling / parsing packets of data.
Conceptually it'd be great to create an io.BytesIO backed by an existing bytearray or memoryview, stream part of it in/out, then get the overall value at the end without a copy (unlike io.BytesIO.getvalue()).
I've built a thing like this in Python code using memoryview and keeping track of start/end slice indices, and it ends up being more involved / harder to read than I'd like.
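For illustration, a minimal sketch of that kind of pure-Python workaround might look like the following. The class name BufferCursor and its methods are hypothetical (they are not the code referred to above, nor any MicroPython API); it only shows the bookkeeping of start/end indices over a memoryview.

```python
class BufferCursor:
    """Hypothetical helper: stream-style access to a caller-owned buffer."""

    def __init__(self, buf):
        self._view = memoryview(buf)  # wraps the buffer without copying it
        self._start = 0               # next index to read from
        self._end = 0                 # next index to write at

    def write(self, data):
        n = len(data)
        if self._end + n > len(self._view):
            raise ValueError("buffer full")
        # Slice assignment writes straight into the caller's buffer.
        self._view[self._end:self._end + n] = data
        self._end += n
        return n

    def read(self, n=-1):
        avail = self._end - self._start
        if n < 0 or n > avail:
            n = avail
        chunk = self._view[self._start:self._start + n]  # a view, not a copy
        self._start += n
        return chunk

    def getvalue(self):
        # Everything written so far, still as a view onto the original buffer.
        return self._view[:self._end]


packet = bytearray(16)
cur = BufferCursor(packet)
cur.write(b"\x01\x02")
cur.write(b"payload")
header = bytes(cur.read(2))
body = bytes(cur.getvalue())
```

Even this small version needs two indices, bounds checks and slice arithmetic, which is the readability cost mentioned above.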
I see that when a BytesIO is created from an existing string/bytearray it's initially built by reference, no copy, and includes a reference back to the original object (in ref_obj): https://github.com/micropython/micropython/blob/bdbc444/py/objstringio.c#L201
If the BytesIO is used to read from this string, then there's no copy of the entire buffer, great!
As soon as it's written to, however, this ref_obj-backed BytesIO is converted into a copied instance https://github.com/micropython/micropython/blob/bdbc444/py/objstringio.c#L83 and the ref_obj linkage is deleted https://github.com/micropython/micropython/blob/bdbc444/py/objstringio.c#L73
This certainly makes sense when it's been applied to an immutable str / bytes object, and it matches CPython.
It would take very few changes in C, however, to provide a means of using this same streaming interface for read/write to a bytearray, which would be great for efficient use of existing memory buffers!
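The user-visible effect of this behaviour can be checked from Python with a short snippet like the one below. It is only a sketch: the results in the comments are what CPython produces, and the assumption (based on the description above) is that MicroPython gives the same results, differing only in when the copy is made.

```python
import io

backing = bytearray(b"abcdef")
stream = io.BytesIO(backing)

first = stream.read(3)      # reading works on the initial contents
stream.seek(0)
stream.write(b"XYZ")        # the write lands in the stream's own storage

result = stream.getvalue()  # returns a copy of the stream contents

print(first)    # b'abc'
print(result)   # b'XYZdef'
print(backing)  # bytearray(b'abcdef'), the original buffer is untouched
```

So after a write, the caller's bytearray and the stream no longer share data, and getting the result back out costs another copy.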
How would people feel about either:
- io.BytesIO(bytearray, byref=True) to tell it to maintain this reference and update in-place, and similarly not make a copy with .getvalue(). Note: there's already one non-CPython-compatible extension in the alloc_size usage of BytesIO.
- io.BufferIO which works in this referenced mode instead?
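To make the intended semantics concrete, here is a rough pure-Python emulation of the by-reference mode. Neither byref nor BufferIO exists today; the class below is a hypothetical stand-in, and the real thing would live in C in objstringio.c. It assumes the stream position stays within the buffer.

```python
class BufferIO:
    """Hypothetical emulation of the proposed by-reference stream."""

    def __init__(self, buf):
        self._buf = buf  # keep the caller's bytearray; it is never copied
        self._pos = 0

    def seek(self, pos):
        self._pos = pos
        return pos

    def read(self, n=-1):
        end = len(self._buf) if n < 0 else min(self._pos + n, len(self._buf))
        data = bytes(self._buf[self._pos:end])
        self._pos = end
        return data

    def write(self, data):
        end = self._pos + len(data)
        # Overwrites in place; the bytearray grows if the write runs past its end.
        self._buf[self._pos:end] = data
        self._pos = end
        return len(data)

    def getvalue(self):
        return self._buf  # the very same object that was passed in


backing = bytearray(b"abcdef")
stream = BufferIO(backing)
stream.write(b"XYZ")
rest = stream.read()
same_object = stream.getvalue() is backing
```

The two properties that matter are that writes show up in the caller's bytearray immediately and that getvalue() hands back the same object rather than a copy.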