docs(trustless-gateway): Add p2p usage section #520
Conversation
| Gateways serving data to non-LAN peers SHOULD support HTTPS and HTTP/2 or greater.
| Similarly, it is RECOMMENDED that clients restrict to HTTPS and HTTP/2 or greater.
Note: Technically #519 asks for "why is HTTP/2 recommended" rather than just documenting that it's the case. IIUC we generally don't specify rationale within the spec document, but if it makes sense we can.
As I was quoted in the above issue, my understanding of how we ended up with HTTP/2 as required within boxo (and recommended in general) is:
- HTTP/1.1 does not have multiplexing, which means that if clients want to send many download requests (e.g. for various resources, to handle parallel block requests, optimistic queries to avoid having to round-trip through the content routing system, etc.) this will be quite painful
- Browsers in particular have a limited number of concurrent requests they can make at a time with HTTP/1.1 to a given origin (6-8 IIRC). This means that supporting HTTP/1.1 would mean users would sometimes see reasonable performance but frequently get pretty bad performance
- The result is that it seemed better not to support HTTP/1.1 at all than to put users in the patchwork "but it works on my small test, oh well I guess ipfs is just slow" regime
Note: "Recommended" is present in the spec because sometimes the developer understands the caveats and decides to support HTTP/1.1 anyway. For example, within browsers there's no control over which HTTP versions are used, so the client will support them all anyway even if a server using HTTP/1.1 ultimately performs poorly. Similarly, in LAN environments getting a TLS certificate set up with which to use HTTPS may be painful, h2c may not be easily accessible across platforms / languages, and performance criteria are more controllable, which makes the downsides more manageable.
Thanks for adding this. Rationale sgtm, but it should be in the document imo. I've pushed b155d57, which expands the P2P usage section with detailed rationale for HTTP/2 and TLS requirements. Since this is in "Appendix: Notes for implementers", I think it is fine to provide extra context (as :::note) explaining WHY these are SHOULD requirements:
- HTTP/2: explain head-of-line blocking, connection overhead, and impact on DAG fetching with many block requests
- TLS: clarify privacy vs integrity (multihash provides integrity, TLS provides confidentiality and is required for HTTP/2 in browsers)
- Add security considerations for both gateway operators and clients
- Include practical defaults (30s timeout, 2MiB blocks)
- Reference RFC 9113 for HTTP/2 specifications
If not for humans, this "WHY" will be all the more important for LLMs that will generate client/server code from this spec.
@aschmahmann @BigLep thoughts?
Clarify caveats around utilizing trustless gateways within p2p networks
ab86146 to f7daba8
…nale
Expand the P2P usage section with detailed rationale for HTTP/2 and TLS requirements. Since this is in "Appendix: Notes for implementers", we provide extra context explaining why these are SHOULD requirements:
- HTTP/2: explain head-of-line blocking, connection overhead, and impact on DAG fetching with many block requests
- TLS: clarify privacy vs integrity (multihash provides integrity, TLS provides confidentiality and is required for HTTP/2 in browsers)
- Add security considerations for both gateway operators and clients
- Include practical defaults (30s timeout, 2MiB blocks)
- Reference RFC 9113 for HTTP/2 specifications
Addresses #519 request to document why HTTP/2 is recommended rather than just stating the requirement.
| Clients SHOULD NOT download unbounded amounts of data before being able to validate that data.
| Clients SHOULD limit the maximum block size to 2MiB. This value aligns with the maximum block size used in UnixFS chunking and provides a reasonable balance between transfer efficiency and resource constraints.
How about something like this?
| Clients SHOULD limit the maximum block size to 2MiB. This value aligns with the maximum block size used in UnixFS chunking and provides a reasonable balance between transfer efficiency and resource constraints.
| Clients SHOULD limit the maximum block size to 2MiB. This value aligns with the maximum block size used in Bitswap, and throughout much of the ecosystem.
UnixFS really has nothing to do with this; the 2MiB limit isn't in the UnixFS spec (implementations like kubo won't even let you get that high when creating a UnixFS DAG). This is just where much of the ecosystem has set its limits so that individual storage providers, pinning services, etc. don't ingest blocks they can't serve back to clients, because either the clients will reject the blocks as too big or the protocols won't support them. See the similar comment in the Bitswap spec
Lines 67 to 71 in 110bf46
| ## Block Sizes
| Bitswap implementations MUST support sending and receiving individual blocks of
| sizes less than or equal to 2MiB. Handling blocks larger than 2MiB is not recommended
| so as to keep compatibility with implementations which only support up to 2MiB.
If you want, we can use language closer to this.
Additionally, should the "This value aligns with the..." section move into the note below?
| :::note
| Blocks larger than 2MiB can cause memory pressure on resource-constrained clients and increase the window for incomplete transfers. Since blocks must be validated as a unit, smaller blocks allow for more granular verification and easier retries on failure.
As alluded to above, there are a whole host of issues beyond just these, some of which are bigger deals. Notably, working with larger blocks is in general, unfortunately, not yet ecosystem-safe. If you don't know what you're doing (i.e. you fall under the category of overriding the RECOMMENDED / SHOULD), you're likely to cause yourself problems by building tooling that doesn't work with much else in the ecosystem.
Also, the DoS risks and the latency / excessive-bandwidth-consumption tradeoffs generally become more difficult to manage.
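A hedged sketch of how a client might enforce this in practice: cap the raw-block response at 2MiB before buffering it, and only hand the bytes onward after re-hashing them against the requested CID. The use of go-cid, the helper name `fetchBlock`, and the exact 2MiB constant are illustrative choices based on this thread, not normative spec requirements.

```go
// Sketch only: bounded download followed by hash verification.
package p2pclient

import (
	"fmt"
	"io"
	"net/http"

	"github.com/ipfs/go-cid"
)

const maxBlockSize = 2 << 20 // 2 MiB, the limit discussed in this thread

// fetchBlock downloads a single raw block and refuses to buffer more than
// maxBlockSize bytes before hash verification.
func fetchBlock(client *http.Client, gatewayURL string, c cid.Cid) ([]byte, error) {
	resp, err := client.Get(fmt.Sprintf("%s/ipfs/%s?format=raw", gatewayURL, c))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	// Read at most maxBlockSize+1 bytes so an oversized block is detected
	// without downloading an unbounded amount of data.
	data, err := io.ReadAll(io.LimitReader(resp.Body, maxBlockSize+1))
	if err != nil {
		return nil, err
	}
	if len(data) > maxBlockSize {
		return nil, fmt.Errorf("block %s exceeds %d bytes", c, maxBlockSize)
	}

	// Blocks must be validated as a unit: re-hash with the CID's own prefix
	// and compare before handing the bytes to the caller.
	got, err := c.Prefix().Sum(data)
	if err != nil {
		return nil, err
	}
	if !got.Equals(c) {
		return nil, fmt.Errorf("block %s failed hash verification", c)
	}
	return data, nil
}
```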
| Trustless Gateways serve two primary deployment models:
| 1. **Verifiable bridges**: Gateways that provide trustless access from HTTP clients into IPFS networks, where the gateway operator is distinct from content providers
| 2. **P2P retrieval endpoints**: Gateways embedded within P2P networks where they serve as HTTP interfaces to peer-operated block stores
nit: I get we're stuck with this phrasing to some extent, but IMO if we can shy away from referring to implementations of the non-recursive trustless IPFS HTTP gateway API used in a p2p network as gateways, that'd be great. They're not really gateways into the network so much as part of it 😅.
| To work around this limitation, clients must open multiple parallel TCP connections to achieve concurrent requests. However, each additional connection incurs significant overhead: TCP handshake latency, memory buffers, bandwidth competition, and increased implementation complexity. Browsers limit concurrent connections per origin (typically 6-8) to manage these costs, but this limitation affects all HTTP/1.1 clients, not just browsers, as the overhead of maintaining many connections becomes prohibitive.
| When fetching a DAG that requires many block requests, HTTP/1.1's lack of multiplexing creates a critical bottleneck. Clients face a difficult trade-off: either serialize requests (severely limiting throughput) or maintain many parallel connections (incurring substantial overhead). Users may experience acceptable performance with small test cases, but real-world IPFS content with deep DAG structures will encounter significant slowdowns. HTTP/2's stream multiplexing (:cite[rfc9113]) eliminates this bottleneck by allowing many concurrent requests over a single connection without head-of-line blocking at the application layer.
This is true, but even if not relying on block requests (e.g. CARs with queries sufficient to describe what the client needs), there are other constraints within p2p networks that can cause issues here too (e.g. optimistic queries, which also eat up requests).
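Purely illustrative: once HTTP/2 multiplexing is available, a client can fan out many block (or optimistic) requests over a single shared connection with a simple in-flight limit, instead of juggling 6-8 HTTP/1.1 connections per origin. `fetchBlock` is the hypothetical helper from the previous sketch; the concurrency cap and result shape are assumptions, not spec requirements.

```go
// Sketch only: fan out block requests over one shared HTTP/2 client.
package p2pclient

import (
	"net/http"
	"sync"

	"github.com/ipfs/go-cid"
)

// fetchMany issues up to maxInFlight concurrent block requests. With HTTP/2
// these are multiplexed as streams on one connection instead of competing
// for a handful of HTTP/1.1 connections.
func fetchMany(client *http.Client, gatewayURL string, cids []cid.Cid, maxInFlight int) map[cid.Cid][]byte {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		results = make(map[cid.Cid][]byte, len(cids))
		sem     = make(chan struct{}, maxInFlight)
	)
	for _, c := range cids {
		wg.Add(1)
		sem <- struct{}{} // block when maxInFlight requests are outstanding
		go func(c cid.Cid) {
			defer wg.Done()
			defer func() { <-sem }()
			data, err := fetchBlock(client, gatewayURL, c)
			if err != nil {
				return // a real client would retry or fall back to another provider
			}
			mu.Lock()
			results[c] = data
			mu.Unlock()
		}(c)
	}
	wg.Wait()
	return results
}
```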
| Trustless Gateways operating in P2P contexts SHOULD NOT recursively search for content.
| In P2P networks, gateways typically serve as block stores for specific peers or content, rather than attempting to locate content across the entire network. Recursive content discovery is handled by the P2P layer (e.g., Amino DHT, IPFS routing), not by individual HTTP gateways.
Recursive content discovery
This isn't quite right. Recursion implies it happens again (e.g. a gateway that returns data itself doing data lookups, or a routing endpoint itself doing routing lookups); the Amino DHT, etc. don't do that. Additionally, as this is a longer justification, it should probably move into the note section as well.
Does this sentence need to exist, given that the note below seems to cover the same content anyway?
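A hedged server-side sketch of the non-recursive behavior described above: the handler answers only from its local block store and returns 404 on a miss, leaving content discovery to the P2P routing layer. The `BlockStore` interface and handler shape are assumptions for illustration, not an API defined by this spec.

```go
// Sketch only: a non-recursive raw block endpoint. BlockStore is a
// hypothetical local-store interface; real implementations (e.g. boxo) differ.
package p2pgateway

import (
	"net/http"
	"strings"
)

type BlockStore interface {
	// Get returns the raw block bytes for a CID string, or ok=false if the
	// block is not held locally.
	Get(cidStr string) (data []byte, ok bool)
}

func rawBlockHandler(store BlockStore) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		cidStr := strings.TrimPrefix(r.URL.Path, "/ipfs/")
		data, ok := store.Get(cidStr)
		if !ok {
			// Do NOT go looking for the block elsewhere: in a p2p deployment,
			// content discovery belongs to the routing layer, not this endpoint.
			http.Error(w, "block not found", http.StatusNotFound)
			return
		}
		w.Header().Set("Content-Type", "application/vnd.ipld.raw")
		w.Write(data)
	}
}
```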
Clarify caveats around utilizing trustless gateways within p2p networks.
Closes #519
cc @lidel