Description
I did this
With anything listening locally on port 3128 (e.g. `nc -l 3128`), issue a request that uses a proxy server and specifies port 80 explicitly within the request URL:
$ curl -v -x http://localhost:3128 http://example.org:80/index.html
* Trying ::1:3128...
* Connected to localhost (::1) port 3128 (#0)
> GET http://example.org:80/index.html HTTP/1.1
> Host: example.org
> User-Agent: curl/7.71.1
> Accept: */*
> Proxy-Connection: Keep-Alive
Note that:
- The HTTP request target correctly uses the absolute-form as defined by RFC 7230 section 5.3.2, since we have used the `-x` option to specify the use of a proxy server
- The HTTP request target URI authority portion has the value `example.org:80` (i.e. including the port number)
- The HTTP `Host` header contains the value `example.org` (i.e. not including the port number)
This violates RFC 7230 section 5.4, which states in part that
If the target URI includes an authority component, then a
client MUST send a field-value for Host that is identical to that
authority component, excluding any userinfo subcomponent and its "@"
delimiter
This discrepancy between the target URI authority portion and the `Host` header causes a failure when the request happens to pass through HAProxy, which will report a `400 Bad Request` error due to the mismatch.
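To make the rule concrete, here is a minimal sketch of the kind of check a strict intermediary could apply. It is not HAProxy's code: `host_matches_target()` is a hypothetical helper, it assumes a well-formed absolute-form target, and it uses a strict byte-for-byte comparison.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not taken from curl or HAProxy): return true when the
   Host field-value is identical to the authority of an absolute-form request
   target, ignoring any "userinfo@" prefix. */
static bool host_matches_target(const char *target, const char *host)
{
  const char *auth = strstr(target, "://");
  const char *at = NULL;
  const char *p;
  size_t len;

  if(!auth)
    return false; /* not absolute-form */
  auth += 3;

  /* the authority ends at the first '/', '?' or '#', or at end of string */
  len = strcspn(auth, "/?#");

  /* locate the last '@' inside the authority, if any */
  for(p = auth; p < auth + len; p++)
    if(*p == '@')
      at = p;

  if(at) {
    /* skip the userinfo subcomponent and its "@" delimiter */
    len -= (size_t)(at + 1 - auth);
    auth = at + 1;
  }

  return strlen(host) == len && memcmp(auth, host, len) == 0;
}
```

With the request shown above, `host_matches_target("http://example.org:80/index.html", "example.org")` is false, while passing `"example.org:80"` as the Host value makes it true.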
Within curl, the `Host` header is constructed to explicitly omit the port number if it matches the default (80 for HTTP or 443 for HTTPS):
Lines 2108 to 2123 in 03c8cef
    if(((conn->given->protocol&CURLPROTO_HTTPS) &&
        (conn->remote_port == PORT_HTTPS)) ||
       ((conn->given->protocol&CURLPROTO_HTTP) &&
        (conn->remote_port == PORT_HTTP)) )
      /* if(HTTPS on port 443) OR (HTTP on port 80) then don't include
         the port number in the host string */
      data->state.aptr.host = aprintf("Host: %s%s%s\r\n",
                                      conn->bits.ipv6_ip?"[":"",
                                      host,
                                      conn->bits.ipv6_ip?"]":"");
    else
      data->state.aptr.host = aprintf("Host: %s%s%s:%d\r\n",
                                      conn->bits.ipv6_ip?"[":"",
                                      host,
                                      conn->bits.ipv6_ip?"]":"",
                                      conn->remote_port);
and the target URI is constructed to exclude the userinfo subcomponent but will leave the port number present even if it would be omitted from the `Host` header:
Lines 2176 to 2188 in 03c8cef
    if(strcasecompare("http", data->state.up.scheme)) {
      /* when getting HTTP, we don't want the userinfo the URL */
      uc = curl_url_set(h, CURLUPART_USER, NULL, 0);
      if(uc) {
        curl_url_cleanup(h);
        return CURLE_OUT_OF_MEMORY;
      }
      uc = curl_url_set(h, CURLUPART_PASSWORD, NULL, 0);
      if(uc) {
        curl_url_cleanup(h);
        return CURLE_OUT_OF_MEMORY;
      }
    }
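As an illustration of what removing a default port from the request target could look like, here is a self-contained sketch. `strip_default_port()` is a hypothetical helper written for this report, not an existing curl function, and it only knows the defaults for `http` and `https`.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of curl: copy `url` into `out`, dropping an
   explicit port that equals the scheme default (80 for http, 443 for https).
   Returns 0 on success, -1 if the URL is not absolute or `out` is too small. */
static int strip_default_port(const char *url, char *out, size_t outlen)
{
  const char *sep = strstr(url, "://");
  const char *auth, *auth_end, *colon = NULL, *p;
  long def = 0;
  int n;

  if(!sep)
    return -1;
  if(sep - url == 4 && !strncmp(url, "http", 4))
    def = 80;
  else if(sep - url == 5 && !strncmp(url, "https", 5))
    def = 443;

  auth = sep + 3;
  auth_end = auth + strcspn(auth, "/?#");

  /* remember the last ':' that comes after any userinfo '@' and after the
     closing ']' of an IPv6 literal; that one introduces the port */
  for(p = auth; p < auth_end; p++) {
    if(*p == ':')
      colon = p;
    else if(*p == '@' || *p == ']')
      colon = NULL;
  }

  if(colon && def && isdigit((unsigned char)colon[1])) {
    char *endp;
    long port = strtol(colon + 1, &endp, 10);
    if(endp == auth_end && port == def) {
      /* everything before the ':' plus everything after the port digits */
      n = snprintf(out, outlen, "%.*s%s", (int)(colon - url), url, auth_end);
      return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
    }
  }

  /* nothing to strip: copy the URL unchanged */
  n = snprintf(out, outlen, "%s", url);
  return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Applied to `http://example.org:80/index.html` this yields `http://example.org/index.html`, which would agree with a `Host: example.org` header. libcurl's URL API also has a `CURLU_NO_DEFAULT_PORT` flag for `curl_url_get()` that may be a better fit inside curl itself; whether it suits this code path is an assumption left to the maintainers.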
For reference, the relevant code within HAProxy that rejects the mismatched request target URI and `Host` header seems to be: https://github.com/haproxy/haproxy/blob/19d14710e941a366afd5b4ff8720090c011c83c1/src/h1.c#L871-L896
I expected the following
curl should construct a request that conforms to RFC 7230. This could be achieved by any of:
- Retaining the port number within the `Host` header unconditionally, or
- Retaining the port number within the `Host` header when issuing a request via a proxy, or
- Stripping default port numbers from the request URI target using the same logic as in `Curl_http_host()`
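The two `Host`-side options can be sketched as a small policy switch. This is a standalone illustration, not a patch: `format_host_header()`, the `host_port_policy` enum, and the `via_proxy` parameter are hypothetical names that do not exist in curl.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical policies, named for illustration only */
enum host_port_policy {
  HOST_PORT_OMIT_DEFAULT, /* behaviour described in this report */
  HOST_PORT_ALWAYS,       /* first option: keep the port unconditionally */
  HOST_PORT_WHEN_PROXIED  /* second option: keep it only via a proxy */
};

/* Write a Host header line into buf. Returns 0 on success, -1 if buf is
   too small. */
static int format_host_header(char *buf, size_t buflen, const char *scheme,
                              const char *host, int port, bool ipv6,
                              bool via_proxy, enum host_port_policy policy)
{
  bool is_default = (!strcmp(scheme, "http") && port == 80) ||
                    (!strcmp(scheme, "https") && port == 443);
  /* a non-default port is always kept; a default port only if the policy
     asks for it */
  bool keep_port = !is_default ||
                   policy == HOST_PORT_ALWAYS ||
                   (policy == HOST_PORT_WHEN_PROXIED && via_proxy);
  int n;

  if(keep_port)
    n = snprintf(buf, buflen, "Host: %s%s%s:%d\r\n",
                 ipv6 ? "[" : "", host, ipv6 ? "]" : "", port);
  else
    n = snprintf(buf, buflen, "Host: %s%s%s\r\n",
                 ipv6 ? "[" : "", host, ipv6 ? "]" : "");

  return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

This only addresses the `Host` side. For the header and the authority to stay identical, the request target would need to carry the same port, so either of these options would likely have to be paired with a check of what the URL actually contains.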
I am happy to put together a pull request if a maintainer could indicate which of the above would be the preferred approach.
curl/libcurl version
curl 7.71.1 (x86_64-redhat-linux-gnu) libcurl/7.71.1 OpenSSL/1.1.1i-fips zlib/1.2.11 brotli/1.0.9 libidn2/2.3.0 libpsl/0.21.1 (+libidn2/2.3.0) libssh/0.9.5/openssl/zlib nghttp2/1.43.0
Release-Date: 2020-07-01
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: AsynchDNS brotli GSS-API HTTP2 HTTPS-proxy IDN IPv6 Kerberos Largefile libz Metalink NTLM NTLM_WB PSL SPNEGO SSL TLS-SRP UnixSockets
operating system
Linux 5.10.16-200.fc33.x86_64 #1 SMP Sun Feb 14 03:02:32 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux