-
-
Notifications
You must be signed in to change notification settings - Fork 32.4k
gh-116810: fix memory leak in ssl module #123249
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 8 commits
119d33a
c98be51
e12cb28
13196b4
8fff948
507266e
c66f537
45df217
537c5bb
0ddc1e2
46126b4
7131837
5ced40b
96ab92b
cb415bc
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Resolved SSL session memory leak in :mod:`ssl`. | ||
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2853,10 +2853,6 @@ PySSL_get_session(PySSLSocket *self, void *closure) { | |
if ((session = _ssl_session_dup(session)) == NULL) { | ||
return NULL; | ||
} | ||
session = SSL_get1_session(self->ssl); | ||
if (session == NULL) { | ||
Py_RETURN_NONE; | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Judging by no test failures, this probably does not have any coverage. Any chance of including the tests in this PR? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I can check the tests this week. I doubt I will be able to finish that work today. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since this change still results in returning a valid copy of the ssl session, I doubt tests would fail. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, agreeing there. I think it would be difficult to test this. With that being said, are there any existing tests for checking that the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I had some life issues come up. I will try to get back to this in the next few weeks. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There's a cacophony of mistakes here on the CPython side. Let me try to correct things:
The memory leak seems to have been introduced by 39258d3. Anyway, the problem with that commit is that you all turned That commit not only introduced a memory leak, but it also mostly broke your workaround. I haven't tried it, but I suspect you would see this if you uncleanly destroyed the other connection before calling
That is not what is going on here. There were, as far as I know, no changes in OpenSSL 3+ relating to this.
Although the code was, prior to the regression, gated on a
While I have not tested it (have you all?), based on openssl/openssl#1550 (comment), I highly suspect that CPython "needs" the workaround in OpenSSL 3.x too. I say "needs" because...
No, that's not what I was saying in openssl/openssl#1550 (comment). See the immediately preceding comment:
The bug report was very confused because the minimal reproducer that CPython provided did not actually reflect what CPython does. With all the meandering with the bad reproducer, I suspect it was very confusing. But the CPython folks on the bug never followed up, so my past self assumed you all understood, were planning on fixing it, and I didn't have time to check on you all. 😞 What is actually going on here is that OpenSSL will invalidate sessions on connection destruction if you haven't called
https://www.rfc-editor.org/rfc/rfc2246.html#section-7.2.1 Now, I personally think this behavior is unhelpful. If a connection just got closed, that doesn't make the old key material broken or anything. Also if we were getting any security properties out of this, it wouldn't work anyway. There are loads of cases where you cannot reliably knock out the other session.
Interestingly, OpenSSL doesn't actually look for a full bidirectional shutdown to preserve your session. It only looks for whether you attempted to send TLS 1.1, incidentally, already softened this recommendation:
https://www.rfc-editor.org/rfc/rfc4346.html#section-7.2.1 But OpenSSL retains this requirement, for better or worse. This requirement was not new in 1.1.x, nor was it removed in 3.x. The only reason you all needed to work around it later on is that, due to other quirks of how CPython was using the API (arguably incorrectly), as detailed in my comment on the OpenSSL issue, CPython was sneaking through a gap in this OpenSSL policy. OpenSSL fixed that gap, and now CPython was affected. Now, contrary to the claim in openssl/openssl#1550 (comment), CPython did not call So, what to do given this? I think you have two options:
In either case, you can remove the workaround because this workaround has been self-inflicted from day one. If you pick option 2, please write some clear comments in the code so the next time this happens, I don't have to write another essay for you all. :-P
So, per the above, this question is based on several incorrect premises:
Since this "workaround" is really a workaround for CPython's own mishaps, if you all do the thing I suggest above, it should work just fine in BoringSSL. But also this is all moot in BoringSSL because BoringSSL does not invalidate sessions on There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is where OpenSSL documents their behavior, by the way:
For completeness, there were two different bits of spec text that led to this OpenSSL behavior, and I only cited one of them. The bits I cited were about unclean shutdown and removed very early. Fatal error lasted all the way to TLS 1.2:
https://www.rfc-editor.org/rfc/rfc5246.html#section-7.2.2 It wasn't removed until TLS 1.3, in tlswg/tls13-spec@dc668dc, for the same reasons. In practice, you cannot actually do this consistently, for the reasons I listed above. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks!
Hah, that explains the code smell here. I'd only been glancing at current version of the code in git blame view to try and understand what this was even for without following the individual commits (git blame doing its job and not exposing history of merely deleted lines)... I agree completely on getting rid of that There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, definitely makes sense to just start with something minimal (probably this PR) to fix the immediate leak. I think probably the thing to do is:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Calling Oh and thanks for the detailed explanation @davidben . |
||
pysess = PyObject_GC_New(PySSLSession, self->ctx->state->PySSLSession_Type); | ||
if (pysess == NULL) { | ||
SSL_SESSION_free(session); | ||
|
Uh oh!
There was an error while loading. Please reload this page.