8000 [3.3] bpo-30500: urllib: Simplify splithost by calling into urlparse. (#1849) by vstinner · Pull Request #2292 · python/cpython · GitHub
[go: up one dir, main page]

Skip to content

[3.3] bpo-30500: urllib: Simplify splithost by calling into urlparse. (#1849) #2292

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 26, 2017
Merged

[3.3] bpo-30500: urllib: Simplify splithost by calling into urlparse. (#1849) #2292

merged 2 commits into from
Jul 26, 2017

Conversation

vstinner
Copy link
Member
@vstinner vstinner commented Jun 20, 2017

The current regex based splitting produces a wrong result. For example::

http://abc#@def

Web browsers parse that URL as http://abc/#@def, that is, the host
is abc, the path is /, and the fragment is #@def.
(cherry picked from commit 90e01e5)

https://bugs.python.org/issue30500

@mention-bot
Copy link

@Haypo, thanks for your PR! By analyzing the history of the files in this pull request, we identified @gvanrossum, @ezio-melotti and @ncoghlan to be potential reviewers.

@@ -860,8 +860,7 @@ def splithost(url):
"""splithost('//host[:port]/path') --> 'host[:port]', '/path'."""
global _hostprog
if _hostprog is None:
import re
_hostprog = re.compile('^//([^/?]*)(.*)$')
_hostprog = re.compile('//([^/#?]*)(.*)', re.DOTALL)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure sure of the addition of DOTALL, see:
http://bugs.python.org/issue30500#msg296422

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same change was done in the master branch, so it's fine.

@vstinner
Copy link
Member Author

I backported the commit master: I had to remove subTest() from test_urlparse.py, since this feature was added to Python 3.4.

@vstinner
Copy link
Member Author
vstinner commented Jun 20, 2017

@ned-deily: Would you mind to review this change? I tested manually that "./python -m test test_urlparse" pass. Sadly, 3.3 is no pre-commit CI yet :-(

vstinner added 2 commits July 25, 2017 12:36
Keep test_main() and remove subTest() (introduced in Python 3.4).
… (#2291)

The current regex based splitting produces a wrong result. For example::

  http://abc#@def

Web browsers parse that URL as ``http://abc/#@def``, that is, the host
is ``abc``, the path is ``/``, and the fragment is ``#@def``.
(cherry picked from commit 90e01e5)
(cherry picked from commit cc54c1c)
@ned-deily ned-deily merged commit 052f9d6 into python:3.3 Jul 26, 2017
@vstinner vstinner deleted the splithost33 branch July 26, 2017 03:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-security A security issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0