Closed
Labels
💰 Bounty $300: If you complete this issue we'll pay you $300 on OpenCollective!
Description
Context
Schemeless URLs can cause a lot of parsing confusion because they aren't standardized in the RFC. Examples like the following are especially weird, and could cause problems when interoperability with other parsers matters:
"evil.com://good.com"
Here are some popular URL parsers' interpretations:
urllib3:
Scheme: (nil)
Userinfo: (nil)
Host: evil.com
Port: (nil)
Path: //good.com
Query: (nil)
Fragment: (nil)
cpython urllib:
Scheme: evil.com
Userinfo: (nil)
Host: good.com
Port: (nil)
Path: (nil)
Query: (nil)
Fragment: (nil)
furl:
Scheme: evil.com
Userinfo: (nil)
Host: good.com
Port: (nil)
Path: (nil)
Query: (nil)
Fragment: (nil)
hyperlink:
Scheme: evil.com
Userinfo: (nil)
Host: good.com
Port: (nil)
Path: /
Query: (nil)
Fragment: (nil)
rfc3986:
Scheme: evil.com
Userinfo: (nil)
Host: good.com
Port: (nil)
Path: (nil)
Query: (nil)
Fragment: (nil)
yarl:
Scheme: evil.com
Userinfo: (nil)
Host: good.com
Port: (nil)
Path: /
Query: (nil)
Fragment: (nil)
As you can see, we (urllib3) are the outlier here.
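The CPython urllib row above can be reproduced with the standard library alone. This is a minimal sketch: `describe` is a hypothetical helper written for this illustration, not part of any library, and it only covers `urllib.parse`, since the other parsers' results may vary by version.

```python
from urllib.parse import urlsplit


def describe(url: str) -> dict:
    """Hypothetical helper: split `url` with CPython's urllib and list its components."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "userinfo": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


if __name__ == "__main__":
    for name, value in describe("evil.com://good.com").items():
        print(f"{name}: {value!r}")
```

Here urllib treats `evil.com` as the scheme and `good.com` as the host, which matches the listing above (empty components come back as `''` or `None` rather than `(nil)`).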
In my opinion, this is something worth fixing, but I imagine that schemeless URLs are in pretty widespread use with urllib3. Thus, we might consider adding a DeprecationWarning that encourages people to explicitly state their schemes.
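One way the proposed warning could look is sketched below. This is an assumption-laden illustration, not urllib3's actual implementation: `warn_if_schemeless` and `_EXPLICIT_SCHEME` are hypothetical names, and the check simply asks whether the string starts with an RFC 3986-style scheme followed by `://`. Under that assumption, `evil.com://good.com` counts as having the scheme `evil.com`, as the other parsers read it.

```python
import re
import warnings

# Assumed check: an RFC 3986-style scheme (a letter, then letters, digits,
# "+", "-", or ".") immediately followed by "://".
_EXPLICIT_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*://")


def warn_if_schemeless(url: str) -> bool:
    """Hypothetical helper: warn and return True when `url` has no explicit scheme."""
    if _EXPLICIT_SCHEME.match(url):
        return False
    warnings.warn(
        f"URL {url!r} has no explicit scheme; please state one explicitly",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this helper
    )
    return True
```

Calling `warn_if_schemeless("good.com/path")` would emit the warning, while `warn_if_schemeless("https://good.com/path")` would not. Note that Python hides `DeprecationWarning` by default outside of `__main__` and test runners, so users would only see it with warnings enabled.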