8000 Upcoming stream classes refactor (avoid short reads for .read(), etc.) · Issue #2056 · micropython/micropython · GitHub
[go: up one dir, main page]

Skip to content

Upcoming stream classes refactor (avoid short reads for .read(), etc.) #2056

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
pfalcon opened this issue May 9, 2016 · 6 comments
Closed
Labels
rfc Request for Comment

Comments

@pfalcon
Copy link
Contributor
pfalcon commented May 9, 2016

I advanced in my understanding on Python3's io module class hierarchy - it's actually intended to be as advanced as Java I/O, but avoid combinatoric explosion. Based on that, I intend to do following changes:

  1. Drop .readall() method. It doesn't apply to all streams in Python3 standard hierarchy, and where it applies, the same effect can be achieved by read() (w/o params), so nobody in real world uses .readall().
  2. .read() from a non-blocking stream should either return when it reads enough bytes, encounters EOF, or error. This is how .read() behavior for buffered Python3 streams is defined.

The latter is required to do concise socket programing, at least in non-blocking case. Note that unbuffered I/O in Python3 still allows short reads. Until we implement full types hierarchy, different objects can thus decide themselves whether they want to emulate buffered or unbuffered interface. The point of this change is to provide buffered-like API for sockets. The alternative to this change is implementing e.g. .recvexactly() method on sockets, which was suggested for CPython, patches exist, but it never gathered enough critical mass to be included.

@pfalcon pfalcon added the rfc Request for Comment label May 9, 2016
@pfalcon pfalcon changed the title Upcoming stream classes refactor Upcoming stream classes refactor (avoid short reads for .read(), etc.) May 9, 2016
@pfalcon
Copy link
Contributor Author
pfalcon commented May 9, 2016

Ah, and it's time to finally implement .write(data, offset, len)

@dpgeorge
Copy link
Member
dpgeorge commented May 9, 2016

Drop .readall() method.

This sounds ok, especially since read() does the same thing. But we'd need to be prepared to break existing code, since I guess some people will be using readall(). Also, not having this method means that IO classes (eg UART) no longer inhert RawIOBase (well, they never did, but at least they tried to).

.read() from a non-blocking stream should either return when it reads enough bytes, encounters EOF, or error.

Do you intend to provide the original implementation as well as the new one that does not allow short reads? In this case a given IO object can choose which one to use.

Quoting CPython docs for BufferedIOBase:

If the argument is positive, and the underlying raw stream is not interactive, multiple raw reads may be issued to satisfy the byte count (unless EOF is reached first). But for interactive raw streams, at most one raw read will be issued, and a short result does not imply that EOF is imminent.

Will you cover the case of "interactive" streams? Or only non-interactive?

A BlockingIOError is raised if the underlying raw stream is in non blocking-mode, and has no data available at the moment.

Will this be the new behaviour for non-blocking streams with no data available? And then what happens if the non-blocking stream has only 1 byte, but you ask for 2, and then it blocks waiting for the second? That doesn't sound very "non blocking".

With your change, how will non-blocking differ from blocking?

@pfalcon
Copy link
Contributor Author
pfalcon commented May 9, 2016

Also, not having this method means that IO classes (eg UART) no longer inhert RawIOBase (well, they never did, but at least they tried to).

Well, we just don't implement that method as superfluous. They still should adhere to the rest of the interface (even though we don't physically inherit from that base class).

Do you intend to provide the original implementation as well as the new one that does not allow short reads?

Yes, exactly, so a particular object can choose whether it "inherits" from RawIOBase or BufferedIOBase.

Will you cover the case of "interactive" streams? Or only non-interactive?

Well, that's hard question. The obvious answer is "no" and then see how bad it is. Using sized reads on interactive streams isn't that reliable anyway (for any assumptions of a particular behavior), so making it block until it really reads all isn't much worse than anything else.

A BlockingIOError

Will this be the new behaviour for non-blocking streams with no data available? And then what happens if the non-blocking stream has only 1 byte, but you ask for 2, and then it blocks waiting for the second? That doesn't sound very "non blocking".

Non-blocking behavior won't change - the non-blocking setting prevails, as much data as possible is returned, None returned if nothing available. I.e. even if "buffered" .read() will behave as "raw" in case of non-blocking stream. I don't see a big problem in that, my treatment of that is it's not obvious how buffering and no-short-reads requirement should behave with non-blocking streams. So, it gets into "implementation-dependent" area. CPython chose to break consistency with RawIOBase and create extra entities like BlockingIOError. We choose not to do any of that.

Note that not raising an exception is a critical point why this way is being chosen. I was almost ready to implement .recvexactly(), but avoiding exception case was deciding (availability of BufferedIOBase.read was also one of the responses to adding .recvexactly to CPython: https://bugs.python.org/issue1103213).

So, all in all, it's not a big deal: we'll have RawIOBase methods which additionally (MicroPython impl detail) behave like BufferedIOBase's for blocking streams, that's all.

@dpgeorge
Copy link
Member

So, all in all, it's not a big deal: we'll have RawIOBase methods which additionally (MicroPython impl detail) behave like BufferedIOBase's for blocking streams, that's all.

Ok.

I guess moving forward we should note (in docs/code) what a given stream-like object "inherits" from (Raw or Buffered).

@pfalcon
Copy link
Contributor Author
pfalcon commented Jun 11, 2016

Continued in #2180

@pfalcon pfalcon closed this as completed Jun 11, 2016
@pfalcon
Copy link
Contributor Author
pfalcon commented Nov 13, 2016

Removal of .readall() escaped all this time. Now finally done in 59a1201. (Scavenging bytes to add more code for generators, etc.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rfc Request for Comment
Projects
None yet
Development

No branches or pull requests

2 participants
0