iostream: Fix unreleased memoryview by methane · Pull Request #2008 · tornadoweb/tornado · GitHub

iostream: Fix unreleased memoryview #2008

Closed
wants to merge 2 commits

Conversation

methane
Copy link
Contributor
@methane methane commented Apr 12, 2017

memoryview should be released explicitly. Otherwise, a bytearray
held by the memoryview is not resizable until GC.

See also: https://bugs.python.org/issue29178
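
For example, a minimal illustration on CPython (not part of the PR diff):

ba = bytearray(b"hello world")
m = memoryview(ba)    # exports the bytearray's buffer
try:
    del ba[:6]        # resize attempt while the export is live
except BufferError as e:
    print(e)          # Existing exports of data: object cannot be re-sized
m.release()
del ba[:6]            # works once the view is released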

methane added 2 commits April 12, 2017 18:23
@ploxiln
Copy link
Contributor
ploxiln commented Apr 12, 2017

I think you don't need the second nested with if you don't keep a reference to the sliced memoryview. I think this should be sufficient:

with memoryview(self._read_buffer) as m:
    b = m[self._read_buffer_pos:self._read_buffer_pos + loc].tobytes()

@methane
Copy link
Contributor Author
methane commented Apr 12, 2017

I think you don't need the second nested with if you don't keep a reference to the sliced memoryview.

Even if I don't keep a reference, the temporary memoryview may be created and then closed lazily in Python 3 implementations other than CPython.

@methane
Copy link
Contributor Author
methane commented Apr 12, 2017

FYI:

  • PyPy3.5 allows bytearray resizing even when there are unreleased memoryviews, so this is a theoretical issue rather than a real-world problem.
  • In most cases, a temporary bytearray (below) is faster than the double __enter__ and __exit__ calls.
b = bytes(self._read_buffer[self._read_buffer_pos:self._read_buffer_pos + loc])

@bdarnell
Copy link
Member

The with syntax is not supported on Python 2 (and we apparently have a bug in our CI scripts that runs some tests on Python 2 even for the configurations that are supposed to be Python 3).

@methane
Copy link
Contributor Author
methane commented Apr 13, 2017

@bdarnell You're right.

What do you think about using a temporary bytearray like I described above?

When reading MBs of data, avoiding a temporary copy is nice.
But two memoryviews + two __enter__/__exit__ calls (or try...finally: m.release()) is too expensive
for common cases like reading an HTTP header.

@bdarnell
Copy link
Member

CC @pitrou who originally wrote the code here that's using memoryview

Hmm, if this analysis is accurate, it makes memoryview much less attractive to use. I think we should be able to write some tests that reproduce BufferErrors if there is a real issue here. There are two uses of memoryview in IOStream.

_read_buffer

There are several ways we could write this code. Version 1 is the one we have today:

    b = (memoryview(self._read_buffer)
         [self._read_buffer_pos:self._read_buffer_pos + loc]
         ).tobytes()

It does a single copy (in the tobytes() call), but it doesn't release its memoryview objects. If these are not deallocated promptly, they will cause BufferErrors the next time the _read_buffer is resized. But why wouldn't they be deallocated promptly? In CPython, there are no other references to either memoryview, so they should be freed at the end of the statement. (And in PyPy the BufferErrors don't exist. Is there a possibility that they might be introduced in the future?)

This PR introduces version 2, which uses the with statement to deallocate the buffers:

with memoryview(self._read_buffer) as m:
    with m[self._read_buffer_pos:self._read_buffer_pos + loc] as m2:
        b = m2.tobytes()

We can't use this in Python 2, so we would need to use version 3 instead:

m = memoryview(self._read_buffer)
m2 = m[self._read_buffer_pos:self._read_buffer_pos + loc]
b = m2.tobytes()
m2.release()
m.release()

This releases both memoryviews, but is ugly and introduces extra function calls which have a runtime cost.

@ploxiln suggested version 4, which only releases one memory view and relies on the other to be deallocated automatically:

with memoryview(self._read_buffer) as m:
    b = m[self._read_buffer_pos:self._read_buffer_pos + loc].tobytes()

But I'm not sure this makes sense - if one is deallocated promptly, why isn't the other?

Finally, we have version 5:

b = bytes(self._read_buffer[self._read_buffer_pos:self._read_buffer_pos + loc])

This is the simple and obvious way to write this code, but it makes two copies (once in slicing the bytearray, and once in converting it to bytes).

My preference is for either version 1 or 5: if we can count on immediate deallocation, then we should be able to keep version 1. If we can't, then we should switch to version 5 and not worry about this micro-optimization for now.
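
For scale, here's a rough micro-benchmark sketch of versions 1 and 5 (buffer size and slice length are arbitrary, and the numbers are machine-dependent; illustrative only):

import timeit

setup = "buf = bytearray(b'x' * 4096); pos, loc = 0, 128"
v1 = "(memoryview(buf)[pos:pos + loc]).tobytes()"  # one copy
v5 = "bytes(buf[pos:pos + loc])"                   # two copies
print("version 1:", timeit.timeit(v1, setup=setup))
print("version 5:", timeit.timeit(v5, setup=setup))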

_write_buffer

Based on this SO question it looks like this is not just a theoretical concern and ought to be fixed before 4.5 is released.

num_bytes = self.write_to_fd(
    memoryview(self._write_buffer)[start:start + size])

Here we construct a memoryview and pass it to write_to_fd, which may fail. If it raises an exception, a reference to the memoryview will be held in the exception's traceback, and this could be retained long enough to break future use of the _write_buffer. I think this line needs to be rewritten as

m = memoryview(self._write_buffer)[start:start + size]
try:
    self.write_to_fd(m)
finally:
    m.release()

(assuming we stick with version 1 above. If we go with version 5, we would get rid of the memoryview here too.)

What do y'all think?

@ploxiln
Copy link
Contributor
ploxiln commented Apr 14, 2017

My suggestion was based on this comment in the bug discussion: https://bugs.python.org/issue29178#msg284837 ... on a second reading, it looks like the bytes() constructor has some magic which takes care of the memoryview passed to it, which I still don't really understand and which probably isn't applicable to this case anyway. So ignore me :)

@pitrou
Copy link
Contributor
pitrou commented Apr 14, 2017

@methane, is it a theoretical concern or did you actually get an error in the wild (using PyPy perhaps)? CPython is reference-counted so the memoryview is immediately collected after it gets out of scope...

@pitrou
Copy link
Contributor
pitrou commented Apr 14, 2017

If it raises an exception, a reference to the memoryview will be held in the exception's traceback

Tracebacks keep references to local variables. The memoryview there is not bound to a local variable, so it shouldn't be kept alive by the traceback.
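
For instance (a contrived sketch): a view bound to a frame local stays alive as long as the traceback does, while an unbound temporary does not appear in the frame at all.

import sys

def f():
    buf = bytearray(b"payload")
    m = memoryview(buf)   # bound to a local, so the frame keeps it alive
    raise RuntimeError

try:
    f()
except RuntimeError:
    tb = sys.exc_info()[2]
    # The traceback holds f's frame, and the frame holds its locals:
    print("m" in tb.tb_next.tb_frame.f_locals)   # True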

@methane
Copy link
Contributor Author
methane commented Apr 14, 2017

@pitrou When I created this pull request, I thought it would be a real problem on PyPy. But as I said:

PyPy3.5 allows bytearray resizing even when there are unreleased memoryviews, so this is a theoretical issue rather than a real-world problem.

I haven't tried MicroPython yet, and Jython, IronPython, and Pyston are still on Python 2.7.
So I think the current code is OK for a while.
A _speedup module could provide a safe and efficient function in the future.

On the other hand, _write_buffer seems like a real problem.

@bdarnell
Copy link
Member
 num_bytes = self.write_to_fd(
     memoryview(self._write_buffer)[start:start + size])

Tracebacks keep references to local variables. The memoryview there is not bound to a local variable, so it shouldn't be kept alive by the traceback.

There's no local variable in that line, but the memoryview does get bound to a local variable in write_to_fd.

@bdarnell
Copy link
Member

I've tried to reproduce the write error, but I'm unable to. Any error that escapes from write_to_fd will cause the IOStream to be closed, which results in a StreamClosedError on subsequent attempts to write (before it could reach this BufferError). SSLWantWriteError is caught within SSLIOStream.write_to_fd, but the exception object is deallocated promptly on CPython.

On a closer read of the SO question, I think the poster is probably using threads incorrectly in a way that is manifesting in a variety of unexpected errors. So maybe there's nothing to be done here after all. The only Python implementations we currently support are CPython and PyPy. PyPy doesn't have this error at all, and CPython's refcounting appears to make this naive use of memoryview safe enough.

@methane
Copy link
Contributor Author
methane commented Apr 15, 2017

OK, let's leave it for now.

@methane methane closed this Apr 15, 2017
@methane methane deleted the unclosed-memoryview branch April 15, 2017 13:03
@s0undt3ch
Copy link

I think I may have stumbled on something related to this:

20:11:47 [tests-group-compiled-1] 19:11:46,0990 [tornado.general                                     :554 ][MainProcess][ERROR   ] Uncaught exception, closing connection.
20:11:47 [tests-group-compiled-1] Traceback (most recent call last):
20:11:47 [tests-group-compiled-1]   File "/var/lib/jenkins/.pyenv/versions/Raas-Py3.5.3/lib/python3.5/site-packages/tornado/iostream.py", line 523, in _handle_events
20:11:47 [tests-group-compiled-1]     self._handle_write()
20:11:47 [tests-group-compiled-1]   File "/var/lib/jenkins/.pyenv/versions/Raas-Py3.5.3/lib/python3.5/site-packages/tornado/iostream.py", line 872, in _handle_write
20:11:47 [tests-group-compiled-1]     del self._write_buffer[:self._write_buffer_pos]
20:11:47 [tests-group-compiled-1] BufferError: Existing exports of data: object cannot be re-sized
20:11:47 [tests-group-compiled-1] 19:11:47,0001 [tornado.application                                 :638 ][MainProcess][ERROR   ] Exception in callback None
20:11:47 [tests-group-compiled-1] Traceback (most recent call last):
20:11:47 [tests-group-compiled-1]   File "/var/lib/jenkins/.pyenv/versions/Raas-Py3.5.3/lib/python3.5/site-packages/tornado/ioloop.py", line 888, in start
20:11:47 [tests-group-compiled-1]     handler_func(fd_obj, events)
20:11:47 [tests-group-compiled-1]   File "/var/lib/jenkins/.pyenv/versions/Raas-Py3.5.3/lib/python3.5/site-packages/tornado/stack_context.py", line 277, in null_wrapper
20:11:47 [tests-group-compiled-1]     return fn(*args, **kwargs)
20:11:47 [tests-group-compiled-1]   File "/var/lib/jenkins/.pyenv/versions/Raas-Py3.5.3/lib/python3.5/site-packages/tornado/iostream.py", line 523, in _handle_events
20:11:47 [tests-group-compiled-1]     self._handle_write()
20:11:47 [tests-group-compiled-1]   File "/var/lib/jenkins/.pyenv/versions/Raas-Py3.5.3/lib/python3.5/site-packages/tornado/iostream.py", line 872, in _handle_write
20:11:47 [tests-group-compiled-1]     del self._write_buffer[:self._write_buffer_pos]
20:11:47 [tests-group-compiled-1] BufferError: Existing exports of data: object cannot be re-sized

@s0undt3ch
Copy link

The above is using tornado 4.5.1

@bdarnell
Copy link
Member

@s0undt3ch Can you point to the code that's running here? Are you doing anything with threads? Our (my) working theory for this issue is that it has something to do with improper use of threads, but if you've found a way it can happen without threads then we'd be interested in learning more.

@s0undt3ch
Copy link

@bdarnell If running a loop in its own thread and using add_callback to interact with it counts as improperly using threads, then I'm improperly using threads; however, I haven't seen any other way to run a loop in its own thread. The problematic code can be found here, and I start getting all sorts of errors when the loop is stopped, as you can see in the comments.

You might ask why I'm running the loop in its own thread, or why I'm not using run_sync if the code is synchronous. The answer is that this client code is going to run inside code which might or might not be synchronous (I can't just change the code where this client runs) and might or might not have a loop running. If the code is asynchronous and has a loop running, I can't use run_sync; if the code is synchronous, I can't just use the async client. So, with that in mind, and accepting the possible performance penalty of forcing the client to be synchronous, I made the sync client run its own loop in its own thread.

If you see any way that I can improve that code, and specifically stop running into those exceptions, please let me know, since learning is always a pleasant experience for me, even when it proves me wrong.

@bdarnell
Copy link
Member

@s0undt3ch The errors in the comments of that gist appear to stem from a confusion of IOLoop.stop() and IOLoop.close(). stop() is called on a running IOLoop, i.e. from a callback. close() is called after the IOLoop has stopped and start() has returned. So line 493 should be self._loop.add_callback(self._loop.stop) (instead of close()), and the stop call at line 371 (and similar except blocks) doesn't do anything, because by that point the IOLoop is no longer running. (That would be the time to call IOLoop.close(), except that you already do that later on.)

Other than that, the code looks correct in its handling of the IOLoop across threads. I don't quite see why you can't use run_sync() since you have self._async_call_done apparently enforcing one-at-a-time access to the IOLoop (with some refactoring you could support multiple concurrent calls on the same IOLoop thread, which would justify staying away from run_sync()), but that shouldn't cause any problems here (maybe you just need to pass make_current=False when creating the IOLoop to avoid clobbering the thread's running IOLoop? That's what I do in Tornado's sync HTTPClient).
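
In sketch form, the lifecycle I'd expect is something like this (simplified; structure assumed from the gist):

import threading
from tornado.ioloop import IOLoop

loop = IOLoop(make_current=False)   # don't clobber this thread's current loop
t = threading.Thread(target=loop.start)
t.start()
# ... interact with the loop only via loop.add_callback(...) ...
loop.add_callback(loop.stop)   # stop() runs inside the loop's own thread
t.join()                       # start() has returned once stop() takes effect
loop.close()                   # close() only after the loop has fully stopped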

As for the BufferError, was it related to a call to close() or did it happen while everything is running normally? The call to async_client.close() on line 484 looks unsafe - it needs to use add_callback, same as connect and call. Were there two IOLoops running in different threads, or was the one created by this file the only one at the time?

@bdarnell
Copy link
Member
bdarnell commented Jul 5, 2017

stackimpact/stackimpact-python#1 has identified a non-threading-related scenario in which this error can occur: a signal handler (especially one that calls sys._current_frames(), or maybe one that raises and catches an exception) can cause the local variable to live too long. This particular case has been fixed by changing the signal handler.


@lucasts
Copy link
lucasts commented Oct 10, 2017

Seeing the same with tornado 4.5.1 / Python 3.6.1:

Stacktrace (most recent call last):

  File "tornado/web.py", line 1513, in _execute
    self.finish()
  File "newrelic/hooks/framework_tornado_r3/web.py", line 136, in _nr_wrapper_RequestHandler_finish_
    return request_handler_finish_finalize(wrapped, instance, args, kwargs)
  File "newrelic/hooks/framework_tornado_r3/util.py", line 256, in request_handler_finish_finalize
    return wrapped(*args, **kwargs)
  File "tornado/web.py", line 991, in finish
    self.flush(include_footers=True)
  File "tornado/web.py", line 947, in flush
    start_line, self._headers, chunk, callback=callback)
  File "newrelic/hooks/framework_tornado_r3/http1connection.py", line 32, in _nr_wrapper_HTTP1Connection_write_headers_
    return wrapped(*args, **kwargs)
  File "tornado/http1connection.py", line 400, in write_headers
    self._pending_write = self.stream.write(data)
  File "tornado/iostream.py", line 406, in write
    self._handle_write()
  File "tornado/iostream.py", line 872, in _handle_write
    del self._write_buffer[:self._write_buffer_pos]


@bdarnell
Copy link
Member

@lucasts Are you doing anything with threads or signals? Is the newrelic package doing anything with them? (Monitoring packages are likely candidates for using signals in a way that breaks here, as we saw with stackimpact, but grepping the newrelic code I can find, it doesn't appear to be doing anything like that.)

In any case, signal-based monitoring tools are useful enough that we should probably do something to fix this: either introduce the necessary release() calls or revert this memoryview optimization and go back to the simpler copying code.

@pitrou
Copy link
Contributor
pitrou commented Oct 21, 2017

Adding release() calls is fine, except that you must guard them as memoryview.release doesn't exist on Python 2.
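
For example (a sketch, reusing the _handle_write snippet from earlier in this thread):

m = memoryview(self._write_buffer)[start:start + size]
try:
    num_bytes = self.write_to_fd(m)
finally:
    if hasattr(m, "release"):   # memoryview.release() exists only on Python 3.2+
        m.release()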

@pitrou
Copy link
Contributor
pitrou commented Oct 21, 2017

IOW, I'd really like to see a small reproducer, @lucasts.

pitrou added a commit to pitrou/tornado that referenced this pull request Oct 21, 2017
See tornadoweb#2008 for reasons why this
might be necessary.
@pitrou
Copy link
Contributor
pitrou commented Oct 21, 2017

Ok, I've got a reproducer:
https://gist.github.com/pitrou/55b6fb4cbb487ef2748b34001f762e54

@bdarnell
Copy link
Member

Adding release() calls is fine, except that you must guard them as memoryview.release doesn't exist on Python 2.

D'oh, I was relying on my earlier comment #2008 (comment)

Was this exception present in python 2? If so, was relying on destruction order the only way to avoid it? If not, were operations involving memoryviews simply unsafe?

That said, I'm a bit surprised that a signal handler finds it useful to keep all frames alive (that's what happens, right?).

I think it's an accident, rather than something deliberate. If the result of sys._current_frames() is assigned to a local variable in any function, that's a cycle and it means that all local variables are held until the next real GC run instead of being cleaned up via reference counts. The fix is to delete the local variable that's holding on to stack frames (failing to do so is an efficiency issue, but usually a minor one).
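
For example, a hypothetical monitoring hook (names illustrative) that avoids the problem:

import sys

def sample_stacks():
    frames = sys._current_frames()   # thread id -> frame; this mapping includes
                                     # the current frame, creating a reference cycle
    try:
        pass   # ... record whatever the profiler needs from the frames ...
    finally:
        # Drop the reference so frame locals (e.g. a memoryview passed to
        # write_to_fd) are freed by refcounting instead of waiting for GC.
        del frames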

Note the current IOStream code doesn't even keep any memoryview in a local variable

It does, indirectly: we construct a temporary memoryview in _handle_write, then pass it in to write_to_fd where it is bound to the local variable data (so explicitly unbinding data in each of the write_to_fd implementations might also be a solution here).
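
In other words, something along these lines (a sketch of the idea, not the exact patch):

def write_to_fd(self, data):
    try:
        return self.socket.send(data)   # may raise, e.g. socket.error
    finally:
        # Unbind the memoryview so this frame (and any traceback that
        # captures it) no longer pins the underlying _write_buffer.
        del data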

@pitrou
Copy link
Contributor
pitrou commented Oct 21, 2017

Was this exception present in python 2? If so, was relying on destruction order the only way to avoid it? If not, were operations involving memoryviews simply unsafe?

Do you mean the BufferError exception? Yes, it's present. But it doesn't have anything to do with destruction order. A temporary (stack) variable is always cleared as soon as the interpreter is done with it. As long as there's nobody capturing the whole stack from another thread...

It does, indirectly: we construct a temporary memoryview in _handle_write, then pass it in to write_to_fd where it is bound to the local variable data (so explicitly unbinding data in each of the write_to_fd implementations might also be a solution here).

Hmm, I see. Let me try...

@pitrou
Copy link
Contributor
pitrou commented Oct 21, 2017

(so explicitly unbinding data in each of the write_to_fd implementations might also be a solution here)

Indeed, it works. I'll open an alternative PR for that.

pitrou added a commit to pitrou/tornado that referenced this pull request Oct 21, 2017
See tornadoweb#2008 for reasons
why this might be necessary.
@pitrou
Copy link
Contributor
pitrou commented Oct 21, 2017

So, there are two competing PRs for this: #2172 (call release() explicitly) and #2173 (be careful not to keep the memoryview in any local variable).

bdarnell pushed a commit that referenced this pull request Oct 22, 2017
See #2008 for reasons
why this might be necessary.
@bdarnell
Copy link
Member

Thanks @pitrou ! I've merged #2173.

@lucasts
Copy link
lucasts commented Oct 23, 2017

@bdarnell @pitrou sorry, I was offline during the weekend.

Are you doing anything with threads or signals? Is the newrelic package?

  • Yes, we use thread pool executors here and there.
  • We don't use signals.
  • newrelic doesn't seem to use either.

--

Catching up: did you end up taking one of the paths discussed above?

@bdarnell
Copy link
Member

Yes, we believe #2173 will fix the problem.

bdarnell pushed a commit to bdarnell/tornado that referenced this pull request Jan 5, 2018
See tornadoweb#2008 for reasons why this
might be necessary.
bdarnell pushed a commit to bdarnell/tornado that referenced this pull request Jan 5, 2018
See tornadoweb#2008 for reasons
why this might be necessary.
@jeffreydwalter
Copy link

I'm still seeing a similar error using 4.5.3 in Python 2.7.

$ pip freeze | grep tor
tornado==4.5.3
ERROR:tornado.general:Uncaught exception, closing connection.
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/tornado/iostream.py", line 523, in _handle_events
    self._handle_write()
  File "/usr/local/lib/python2.7/site-packages/tornado/iostream.py", line 847, in _handle_write
    assert self._write_buffer_size >= 0
AssertionError
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "arlowebsocket.py", line 86, in stream_video
    self.write_message(data, binary=True)
  File "/usr/local/lib/python2.7/site-packages/tornado/websocket.py", line 252, in write_message
    return self.ws_connection.write_message(message, binary=binary)
  File "/usr/local/lib/python2.7/site-packages/tornado/websocket.py", line 793, in write_message
    return self._write_frame(True, opcode, message, flags=flags)
  File "/usr/local/lib/python2.7/site-packages/tornado/websocket.py", line 776, in _write_frame
    return self.stream.write(frame)
  File "/usr/local/lib/python2.7/site-packages/tornado/iostream.py", line 406, in write
    self._handle_write()
  File "/usr/local/lib/python2.7/site-packages/tornado/iostream.py", line 872, in _handle_write
    del self._write_buffer[:self._write_buffer_pos]
BufferError: Existing exports of data: object cannot be re-sized

ERROR:tornado.application:Exception in callback None
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/tornado/ioloop.py", line 888, in start
    handler_func(fd_obj, events)
  File "/usr/local/lib/python2.7/site-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/tornado/iostream.py", line 523, in _handle_events
    self._handle_write()
  File "/usr/local/lib/python2.7/site-packages/tornado/iostream.py", line 847, in _handle_write
    assert self._write_buffer_size >= 0
AssertionError

@bdarnell
Copy link
Member

@jeffreydwalter Interesting. I don't think we've seen that AssertionError fire before. (It makes me think about thread-safety issues - are you possibly accessing the IOStream from multiple threads?) Please open a new issue with more details about what you're doing instead of commenting on this closed one.

bmerry added a commit to ska-sa/katsdpdockerbase that referenced this pull request Feb 22, 2018
This contains the fix for tornadoweb/tornado#2008, which I recently
encountered.
@aayla-secura
Copy link

I am seeing the exact same AssertionError as @jeffreydwalter with python 3.6 and tornado 4.5.3. It only happens when I am updating the document (from a callback called from another thread via session_context.with_locked_document) at a very high rate (several times a second). And it happens at random times. It doesn't happen if I add a delay and limit the rate at which the callback is executed, or if I don't change anything in the document from within the callback. I'm wondering if session_context.with_locked_document is not the right way to update a session's document from another thread.

@jeffreydwalter
Copy link

@aayla-secura my issue turned out to be that I was starting threads inside the on_message() callback.

@jeffreydwalter
Copy link

@aayla-secura I don't know if it'll help you, but here's the issue that I filed in relation to my problem: #2271

@aayla-secura
Copy link

@jeffreydwalter I don't think that's my problem here, but it was likewise my incorrect use of threads. I'm still not sure why updating the document via session_context.with_locked_document doesn't work, but I "fixed" it by adding an intermediate step: the function given to with_locked_document is a dummy that schedules the real updater via doc.add_next_tick_callback. The warning that

The ONLY safe operations to perform on a document from a different thread is add_next_tick_callback() and remove_next_tick_callback()

is given in the manual, so that's my bad for not heeding it.
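
Roughly what I ended up with (a sketch; the Bokeh calls are as I used them, so treat the exact signatures as my assumption):

def locked_dummy(doc):
    # The only thread-safe document operation: schedule the real update
    # on the next tick of the session's event loop.
    doc.add_next_tick_callback(real_update)   # real_update defined elsewhere

session_context.with_locked_document(locked_dummy)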

@rahultaing
Copy link

I see this issue has been closed, but I am seeing the same error on 4.5.3 with Python 2. I have a consistent repro. This is the line of code that throws the error:
self._write_buffer += data in the write method.

@bdarnell
Copy link
Member
bdarnell commented Jul 4, 2019

Every known occurrence of this has been related to incorrect use of threads. If you're seeing this without threads, please file a new issue with details.

@rahultaing
Copy link
rahultaing commented Jul 4, 2019

Definitely will do. Thanks for the quick response.
