[go: up one dir, main page]

Mixed Content Level 2

W3C First Public Working Draft,

This version:
https://www.w3.org/TR/2020/WD-mixed-content-2-2-20201014/
Latest published version:
https://www.w3.org/TR/mixed-content-2/
Editor's Draft:
https://w3c.github.io/webappsec-mixed-content/level2.html
Version History:
https://github.com/w3c/webappsec-mixed-content/commits/master/level2.src.html
Feedback:
public-webappsec@w3.org with subject line “[mixed-content-2] … message topic …” (archives)
File an issue (open issues)
Editors:
(Google Inc.)
(Google Inc.)
(Google Inc.)
Participate:
File an issue (open issues)

Abstract

This specification describes how a user agent should handle fetching of content over unencrypted or unauthenticated connections in the context of an encrypted and authenticated document.

Status of this document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.

This document was published by the Web Application Security Working Group as a Working Draft. This document is intended to become a W3C Recommendation.

The (archived) public mailing list public-webappsec@w3.org (see instructions) is preferred for discussion of this specification. When sending e-mail, please put the text “mixed-content-2” in the subject, preferably like this: “[mixed-content-2] …summary of comment…

This document is a First Public Working Draft.

Publication as a First Public Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by the Web Application Security Working Group.

This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This document is governed by the 1 March 2019 W3C Process Document.

1. Introduction

This section is not normative.

When a user successfully loads a webpage from example.com over a secure channel (HTTPS, for example), the user is guaranteed that no entity between the user agent and example.com eavesdropped on or tampered with the data transmitted between them. However, this guarantee is weakened if the webpage loads subresources such as script or images over an insecure connection. For example, an insecurely-loaded script can allow an attacker to read or modify data on behalf of the user. An insecurely-loaded image can allow an attacker to communicate incorrect information to the user (e.g., a fabricated stock chart), mutate client-side state (e.g., set a cookie), or induce the user to take an unintended action (e.g., changing the label on a button). These requests are known as mixed content.

[mixed-content] details how a user agent can mitigate these risks by blocking certain types of mixed content, and behaving more strictly in some contexts.

However, as implemented in today’s web browsers, [mixed-content] does not fully protect the confidentiality and integrity of users' data. Insecure content such as images, audio, and video can currently be loaded by default in secure contexts. Secure pages can even initiate insecure downloads which escape the user agent’s sandbox entirely.

Moreover, users do not have a clear security indicator when mixed content is loaded. When a webpage loads mixed content, browsers display an "in-between" security indicator (such as removing the padlock icon), which does not give users a clear indication of whether they should trust the page. This UX has also not proved a sufficient incentive for developers to avoid mixed content, since it is still common for webpages to load mixed content. Blocking all mixed content would provide the user with a simpler mental model -- the webpage is either loaded over a secure transport or it isn’t -- and encourage developers to securely load any mixed content that is necessary for their webpage to function properly.

Mixed Content Level 2 therefore updates and extends [mixed-content] to provide users with better security and privacy guarantees and a better security UX, while minimizing breakage. Instead of advising browsers to simply strictly block all mixed content, the core idea of Level 2 is mixed content autoupgrading. That is, mixed content that user agents are not already blocking should be autoupgraded to a secure transport. If the request cannot be autoupgraded, it will be blocked. Autoupgrading avoids loading insecure resources on secure webpages, while minimizing the amount of developer effort needed to avoid breakage.

This specification (Level 2) only recommends autoupgrading types of mixed content subresources that are not currently blocked by default, and does not recommend autougprading types of content that are already blocked. This is to minimize the amount of web-visible change; we only want to autoupgrade content if it advances us towards the goal of blocking all mixed content by default.

This specification also explicitly introduces the concept of mixed downloads. A mixed download is a resource that a user agent handles as a download, which was initiated by a secure context but is downloaded over an insecure connection. User agents should block mixed downloads because they can escape the user agent’s sandbox (in the case of an executable) or contain sensitive information (e.g., a downloaded bank statement). This is especially misleading because user agents typically indicates to the user that they are on a secure page while initiating and completing a mixed download.

2. Key Concepts and Terminology

mixed content
A request is mixed content if its url is not a priori authenticated, and the context responsible for loading it prohibits mixed security contexts (see § 4.3 Does settings prohibit mixed security contexts? for a normative definition of the latter).

A response is mixed content if it is an unauthenticated response, and the context responsible for loading it requires prohibits mixed security contexts.

Inside a context that restricts mixed content (https://secure.example.com/, for example):
  1. A request for the script http://example.com/script.js is mixed content. As script requests are blockable, the user agent will return a network error rather than loading the resource.

  2. A request for the image http://example.com/image.png is mixed content. As image requests are upgradeable, the user agent might rewrite the URL as http://example.com/image.png, otherwise it will block the load.

Note: "Mixed content" was originally defined in section 5.3 of [WSC-UI]. This document updates that initial definition.

Note: [XML] also defines an unrelated "mixed content". concept. This is potentially confusing, but given the term’s near ubiquitious usage in a security context across user agents for more than a decade, the practical risk of confusion seems low.

a priori authenticated URL
We know a priori that a request to a particular URL (url) will be delivered in a way that mitigates the risks of interception and modifications if either of the following statements is true:
  1. url is a potentially trustworthy URL [SECURE-CONTEXTS].

  2. url’s scheme is "data".

    Note: We special case data URLs here, as we don’t consider them particularly trustworthy, but we also don’t wish to block them as mixed content, as they never hit the network.

unauthenticated response
We know a posteriori that a response (response) is unauthenticated if response’s url is not a priori authenticated.
embedding document
Given a Document A, the embedding document of A is A’s browsing context's container document [HTML].
mixed download
A mixed download is a resource that a user agent handles as a download, which was initiated by a secure context but is downloaded over an insecure connection.

3. Content Categories

In a perfect world, each user agent would be required to block all mixed content without exception. Unfortunately, that is impractical on today’s Internet; a user agent needs to be more nuanced in its restrictions to avoid degrading the experience on a substantial number of websites.

With that in mind, we here split mixed content into two categories: § 3.1 Upgradeable Content and § 3.2 Blockable Content.

3.1. Upgradeable Content

Upgradeable content was previously referred as optionally-blockable in [mixed-content]

Mixed content is upgradeable when the risk of allowing its usage as mixed content is outweighed by the risk of breaking significant portions of the web. This could be because mixed usage of the resource type is sufficiently high, and because the resource is low-risk in and of itself. The fact that these resource types are upgradeable does not mean that they are safe, simply that they’re less catastrophically dangerous than other resource types. For example, images and icons are often the central UI elements in an application’s interface. If an attacker reversed the "Delete email" and "Reply" icons, there would be real impact to users.

This category includes:

We further limit this category in § 4.4 Should fetching request be blocked as mixed content? by force-failing any CORS-enabled request. This means, for example, that mixed content images loaded via <img crossorigin ...> will be blocked. This is a good example of the general principle that content falls into this category only when it is too widely used to be blocked outright. The Working Group intends to carve out more blockable subsets as time goes on.

3.2. Blockable Content

Any mixed content that is not upgradeable as defined above is considered to be blockable. Typical examples of this kind of content include scripts, plugin data, data requested via XMLHttpRequest, and so on.

Note: Navigation requests might target top-level browsing contexts; these are not considered mixed content. See § 4.4 Should fetching request be blocked as mixed content? for details.

Note: Note that requests made on behalf of a plugin are blockable. We recognize, however, that user agents aren’t always in a position to mediate these requests. NPAPI plugins, for instance, often have direct network access, and can generally bypass the user agent entirely. We recommend that user agents block these requests when possible, and that plugin vendors implement mixed content checking themselves to mitigate the risks outlined in this document.

4. Algorithms

4.1. Upgrade request to an a priori authenticated URL as mixed content, if appropriate

Note: The Fetch specification will hook into this algorithm to upgrade upgradeable mixed content to HTTPS.

Given a Request request, this algorithm will rewrite its url if the request is deemed to be upgradeable mixed content, via the following algorithm:

  1. If one or more of the following conditions is met, return without modifying request:
    1. request’s url is an a priori authenticated URL
    2. § 4.3 Does settings prohibit mixed security contexts? returns "Does Not Restrict Mixed Security Contents" when applied to request’s client.
    3. request’s mode is CORS.
    4. request’s destination is not "image", "audio", or "video".
    5. request’s destination is "image" and request’s initiator is "imageset".
  2. If request’s url’s scheme is http, set request’s url’s scheme to https, and return.

    Note: Per [url], we do not modify the port because it will be set to null when the scheme is http, and interpreted as 443 once the scheme is changed to https

4.2. Existing Mixed Content Algorithms and Modifications

Note: This section specifies modifications to existing mixed content algorithms to ignore the distinction between optionally-blockable and blockable mixed content, since all optionally-blockable mixed content will now be autoupgraded. Mixed Content §should-block-fetch and Mixed Content §should-block-response are replaced by the versions below, while Mixed Content §categorize-settings-object is left unmodified (but included below since this will replace the [mixed-content] spec).

section>

4.3. Does settings prohibit mixed security contexts?

Both documents and workers have environment settings objects which may be examined according to the following algorithm in order to determine whether they restrict mixed content. This algorithm returns "Prohibits Mixed Security Contexts" or "Does Not Prohibit Mixed Security Contexts", as appropriate.

Given an environment settings object (settings):

  1. If settingsorigin is a priori authenticated., then return "Prohibits Mixed Security Contexts".

  2. If settings has a responsible document document, then:

    1. While document has an embedding document:

      1. Set document to document’s embedding document.

      2. Let embedder settings be document’s global object's relevant settings object.

      3. If embedder settingsorigin is a priori authenticated, then return "Prohibits Mixed Security Contexts".

  3. Return "Does Not Restrict Mixed Security Contexts".

If a document has an embedding document, a user agent needs to check not only the document itself, but also the top-level browsing context in which the document is nested, as that is the context which controls the user’s expectations regarding the security status of the resource she’s loaded. For example:
http://a.com loads http://evil.com. The insecure request will be allowed, as a.com was not loaded over a secure connection.
https://a.com loads http://evil.com. The insecure request will be blocked, as a.com was loaded over a secure connection.
http://a.com frames https://b.com, which loads http://evil.com. In this case, the insecure request to evil.com will be blocked, as b.com was loaded over a secure connection, even though a.com was not.
https://a.com frames a data: URL, which loads http://evil.com. In this case, the insecure request to evil.com will be blocked, as a.com was loaded over a secure connection, even though the framed data: URL would not block mixed content if loaded in a top-level context.

4.4. Should fetching request be blocked as mixed content?

Note: The Fetch specification hooks into this algorithm to determine whether a request should be entirely blocked (e.g. because the request is for blockable content, and we can assume that it won’t be loaded over a secure connection).

Given a Request request, a user agent determines whether the Request request should proceed or not via the following algorithm:

  1. Return allowed if one or more of the following conditions are met:
    1. § 4.3 Does settings prohibit mixed security contexts? returns "Does Not Restrict Mixed Security Contexts" when applied to request’s client.
    2. request’s url is a priori authenticated.
    3. The user agent has been instructed to allow mixed content, as described in § 7.2 User Controls).
    4. request’s destination is "document", and request’s target browsing context has no parent browsing context.

      Note: We exclude top-level navigations from mixed content checks, but user agents MAY choose to enforce mixed content checks on insecure form submissions (see § 7.1 Form Submission).

  2. Return blocked.

4.5. Should response to request be blocked as mixed content?

Note: If a request proceeds, we still might want to block the response based on the state of the connection that generated the response (e.g. because the request is blockable, but the connection is unauthenticated), and we also need to ensure that a Service Worker doesn’t accidentally return an unauthenticated response for a blockable request. This algorithm is used to make that determination.

Given a request request and response response, the user agent determines what response should be returned via the following algorithm:

  1. Return allowed if one or more of the following conditions are met:
    1. § 4.3 Does settings prohibit mixed security contexts? returns Does Not Restrict Mixed Content when applied to request’s client.
    2. response’s url is a priori authenticated
    3. The user agent has been instructed to allow mixed content, as described in § 7.2 User Controls).
    4. request’s destination is "document", and request’s target browsing context has no parent browsing context.

      Note: We exclude top-level navigations from mixed content checks, but user agents MAY choose to enforce mixed content checks on insecure form submissions (see § 7.1 Form Submission).

  2. Return blocked.

5. Integrations

5.1. Modifications to Fetch

Fetch Standard §4.1 Main fetch should be modified to call § 4.1 Upgrade request to an a priori authenticated URL as mixed content, if appropriate on request between steps 3 and 4. That is, upgradeable mixed content should be autoupgraded to HTTPS before applying mixed content blocking.

5.2. Modifications to HTML

Process a navigate response should be modified as follows. Step 3 should abort the download and return if source’s active document’s url is an a priori authenticated URLs and any URL in response’s URL list is not an a priori authenticated URLs.

A similar change should be made to downloads a hyperlink. In this algorithm, step 6.2 should be modified to return (aborting the download) if subject’s node document’s url is an a priori authenticated URLs and any URL in response’s URL list is not an a priori authenticated URLs (where response is the result of fetching request).

Note: Downloads are not autoupgraded like other types of mixed content, because the user agent does not always know before requesting a resource that it will be downloaded.

Note: Resources are downloaded before the user agent decides whether to abort them based on an insecure connection. Sensitive information may therefore traverse the network even though the user agent eventually blocks the download. This is generally unavoidable because the user agent may not know that a resource is to be downloaded until it receives the final response, but by blocking the resource, user agents will encourage website operators to serve downloads over secure connections.

6. Obsolescences

6.1. Strict Mixed Content Checking

This specification renders the Mixed Content §strict-checking mode and the block-all-mixed-content CSP directive obsolete, because all mixed content is now blocked if it can’t be autoupgraded.

Note: The upgrade-insecure-requests ([upgrade-insecure-requests]) directive is not obsolete because it allows developers to upgrade blockable content. This specification only upgrades upgradeable content by default.

6.2. Mixed Content

This specification renders the [mixed-content] specification obsolete, and replaces it.

7. Security and Privacy Considerations

Overall, autoupgrading upgradeable mixed content is expected to be security- and privacy-positive, by protecting more user traffic from network eavesdropping and tampering.

There is a risk of introducing a security or privacy issue in a webpage by loading a resource that the developer did not intend. For example, suppose that a website includes an innocuous image from http://www.example.com/image.jpg, and for some reason https://www.example.com/image.jpg redirects to a tracking site. The browser will now have introduced a privacy issue without the developer’s or user’s explicit consent. However, these cases are expected to be exceedingly rare. The risk is mitigated by autoupgrading only upgradeable content, not blockable content as well. Blockable content could present more risk, for example the risk of loading an out-of-date and vulnerable JavaScript library.

7.1. Form Submission

If § 4.3 Does settings prohibit mixed security contexts? returns Restricts Mixed Content when applied to a Document's relevant settings object, then a user agent MAY choose to warn users of the presence of one or more form elements with action attributes whose values are not a priori authenticated URLs

A user agent MAY choose to warn users on submission of a form element with action attributes whose values are not a priori authenticated URLs and allow users to abort the submission.

Further, a user agent MAY treat form submissions from such a Document as a blockable request, even if the submission occurs in the top-level browsing context.

7.2. User Controls

A user agent MAY offer users the ability to override its decision to block blockable mixed content on a particular page.

Note: Practically, a user agent probably can’t get away with not offering such a back door. That said, allowing mixed script is in particular a very dangerous option, and each user agent REALLY SHOULD NOT [RFC6919] present such a choice to users without careful consideration and communication of the risk involved.

A user agent MAY offer users the ability to override its decision to automatically upgrade upgradeable mixed content on a particular page.

Any such controls offered by a user agent MUST also be offered through accessibility APIs for users of assistive technologies.

8. Acknowledgements

In addition to the wonderful feedback gathered from the WebAppSec WG, the Chrome security team was invaluable in preparing this specification. In particular, Chris Palmer, Chris Evans, Ryan Sleevi, Michal Zalewski, Ken Buchanan, and Tom Sepez gave lots of early feedback. Anne van Kesteren explained Fetch, helped define the interface to this specification, and provided valuable feedback on the Level 2 update. Brian Smith helped keep the spec focused, trim, and sane.

Conformance

Document conventions

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

Conformant Algorithms

Requirements phrased in the imperative as part of algorithms (such as "strip any leading space characters" or "return false and abort these steps") are to be interpreted with the meaning of the key word ("must", "should", "may", etc) used in introducing the algorithm.

Conformance requirements phrased as algorithms or specific steps can be implemented in any manner, so long as the end result is equivalent. In particular, the algorithms defined in this specification are intended to be easy to understand and are not intended to be performant. Implementers are encouraged to optimize.

Index

Terms defined by this specification

Terms defined by reference

References

Normative References

[DOM]
Anne van Kesteren. DOM Standard. Living Standard. URL: https://dom.spec.whatwg.org/
[FETCH]
Anne van Kesteren. Fetch Standard. Living Standard. URL: https://fetch.spec.whatwg.org/
[HTML]
Anne van Kesteren; et al. HTML Standard. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[MIXED-CONTENT]
Mike West. Mixed Content. 2 August 2016. CR. URL: https://www.w3.org/TR/mixed-content/
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119
[SECURE-CONTEXTS]
Mike West. Secure Contexts. 15 September 2016. CR. URL: https://www.w3.org/TR/secure-contexts/
[URL]
Anne van Kesteren. URL Standard. Living Standard. URL: https://url.spec.whatwg.org/
[XHR]
Anne van Kesteren. XMLHttpRequest Standard. Living Standard. URL: https://xhr.spec.whatwg.org/

Informative References

[CSS-BACKGROUNDS-3]
Bert Bos; Elika Etemad; Brad Kemper. CSS Backgrounds and Borders Module Level 3. 17 October 2017. CR. URL: https://www.w3.org/TR/css-backgrounds-3/
[RFC6919]
R. Barnes; S. Kent; E. Rescorla. Further Key Words for Use in RFCs to Indicate Requirement Levels. 1 April 2013. Experimental. URL: https://tools.ietf.org/html/rfc6919
[UPGRADE-INSECURE-REQUESTS]
Mike West. Upgrade Insecure Requests. 8 October 2015. CR. URL: https://www.w3.org/TR/upgrade-insecure-requests/
[WSC-UI]
Thomas Roessler; Anil Saldhana. Web Security Context: User Interface Guidelines. 12 August 2010. REC. URL: https://www.w3.org/TR/wsc-ui/
[XML]
Tim Bray; et al. Extensible Markup Language (XML) 1.0 (Fifth Edition). 26 November 2008. REC. URL: https://www.w3.org/TR/xml/