Description
wasi-http should probably specify exactly when scheme and authority won't be present.
Conceptually, a request always has a scheme and authority, but annoyingly it's expressed in multiple ways:
- Requests that were originally HTTP/2 or HTTP/3 or HTTP/1.1 in absolute-form (rare but possible) will always have a scheme and authority, expressed as either native HTTP/2 or HTTP/3 pseudo-headers or as an HTTP/1.1 absolute-form request-line. The Host header is not authoritative, and there are normative requirements to ignore it and replace it with the authority (and failing to do so can lead to request smuggling vulnerabilities).
- Requests that were originally HTTP/1.1 with a request-line in origin-form or asterisk-form have authoritative Host headers (and no scheme or authority in the request-line). If they're transformed to HTTP/2 or HTTP/3, the :authority pseudo-header will not be filled in with the Host by default. :scheme, however, is not optional, but figuring out the scheme of an origin-form HTTP/1.1 request is annoying and guessworky and relies on knowledge of how the system is configured.

CONNECT and OPTIONS requests are special and weird but not in any way that fundamentally complicates things.
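To make the cases above concrete, here is a minimal sketch of how an HTTP/1.1 request-target could be classified and what control data it yields. The names `ControlData` and `classify_request_target` are hypothetical and not part of wasi-http or any implementation; the rules follow my reading of the request-target forms in RFC 9112, and validation is deliberately shallow.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class ControlData(NamedTuple):
    """Hypothetical container for what the request-line alone tells us."""
    form: str
    scheme: Optional[str]
    authority: Optional[str]
    host_is_authoritative: bool


def classify_request_target(method: str, target: str) -> ControlData:
    # authority-form is only used with CONNECT: the target is "host:port".
    if method == "CONNECT":
        return ControlData("authority-form", None, target, False)
    # asterisk-form is only valid for a server-wide OPTIONS request.
    if target == "*":
        if method != "OPTIONS":
            raise ValueError("asterisk-form is only valid with OPTIONS")
        return ControlData("asterisk-form", None, None, True)
    # origin-form: path (and optional query) only, so the Host header
    # is the only source of authority and the scheme is unknown.
    if target.startswith("/"):
        return ControlData("origin-form", None, None, True)
    # absolute-form: scheme and authority come from the request-line,
    # and the Host header must not be trusted over them.
    parts = urlsplit(target)
    if parts.scheme and parts.netloc:
        return ControlData("absolute-form", parts.scheme, parts.netloc, False)
    raise ValueError(f"unrecognized request-target: {target!r}")
```

For example, `classify_request_target("GET", "/index.html")` reports an authoritative Host header and no scheme, while `classify_request_target("GET", "http://example.com/a")` reports scheme `http` and authority `example.com`.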
The reason for this weirdness is to allow exact reconstruction of an original HTTP/1.1 request-line even after it's gone through transformations (specifically, if an :authority pseudo-header is present, it was absolute-form, otherwise it was origin-form unless it was CONNECT or OPTIONS).
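The reconstruction rule can be sketched roughly as follows. `reconstruct_request_target` is a hypothetical helper, its parameters mirror the HTTP/2 pseudo-headers, and the CONNECT and OPTIONS branches are simplified assumptions rather than a full treatment of those methods.

```python
from typing import Optional


def reconstruct_request_target(
    method: str,
    scheme: Optional[str],
    authority: Optional[str],
    path: Optional[str],
) -> str:
    """Rebuild an HTTP/1.1 request-target from HTTP/2-style control data."""
    # CONNECT carries only an authority; it maps to authority-form.
    if method == "CONNECT":
        if authority is None:
            raise ValueError("CONNECT requires an authority")
        return authority
    # Simplification: a server-wide OPTIONS maps back to asterisk-form.
    if method == "OPTIONS" and path == "*":
        return "*"
    if path is None:
        raise ValueError("non-CONNECT requests require a path")
    # An authority being present signals the original was absolute-form.
    if authority is not None:
        if scheme is None:
            raise ValueError("absolute-form requires a scheme")
        return f"{scheme}://{authority}{path}"
    # No authority: the original was origin-form and Host was authoritative.
    return path
```

Under these assumptions, the presence or absence of `authority` is the only signal that distinguishes `http://example.com/a` from `/a`, which is why filling it in from Host loses information.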
Is this something wasi-http wants to support? Spiritually, wasi-http implementations are proxies and the abstract interface is quite similar to HTTP/2 and HTTP/3 in being another way of representing abstract HTTP in ~full generality. There are some applications that do want to allow exact reconstruction, but I strongly suspect most applications not only don't care but are actively harmed by having to deal with special cases here (or, more likely, failing to deal with them and being subtly wrong).
Whatever the answer here is, it should be documented, along with guidance to implementations about how much they should tinker with request control data, because I think some are getting this wrong independent of this decision. For example, wasmtime right now fills in missing authority data from Host headers but does not scrub Host headers if authority data was present. It also assumes that it's served over an insecure connection, so the scheme is http if missing (reasonable), but it additionally overrides schemes in absolute-form requests (incorrect; bytecodealliance/wasmtime#11571).
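As one possible shape for such guidance, the sketch below normalizes control data without the two problems described above: it replaces Host when an authority is present and never overrides a scheme that was supplied. This is not wasmtime's code or a proposed wasi-http API; `normalize_control_data`, the lower-cased header dictionary, and the `default_scheme` parameter are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple


def normalize_control_data(
    scheme: Optional[str],
    authority: Optional[str],
    headers: Dict[str, str],
    default_scheme: str = "http",
) -> Tuple[str, Optional[str], Dict[str, str]]:
    """Return (scheme, authority, headers) with Host and authority agreeing.

    Assumes header names are already lower-cased and single-valued.
    """
    headers = dict(headers)  # do not mutate the caller's mapping
    if authority is not None:
        # Authority from the request-line or pseudo-header wins;
        # overwrite Host so later consumers cannot be misled by it.
        headers["host"] = authority
    elif "host" in headers:
        # No authority in the control data: Host is authoritative.
        authority = headers["host"]
    # Only guess a scheme when none was supplied; never override one.
    if scheme is None:
        scheme = default_scheme
    return scheme, authority, headers
```

A policy like this gives up exact reconstruction of the original request-line, since the authority is always filled in when Host is available.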
Spec references: