Conversation

@karenetheridge
Member

Not intended for merging in this form.

Notes on my initial thoughts on how each uri-reference in the OAD should resolve.

See #4925 (reply in thread).

@karenetheridge karenetheridge changed the base branch from main to v3.2-dev September 7, 2025 03:40
@karenetheridge karenetheridge force-pushed the ether/v3.2-uri-resolution-notes branch from 599193e to 93c4121 Compare September 7, 2025 03:43
@karenetheridge karenetheridge force-pushed the ether/v3.2-uri-resolution-notes branch from 93c4121 to 43316da Compare September 7, 2025 03:43
@karenetheridge karenetheridge changed the title from "Ether/v3.2 uri resolution notes" to "uri resolution notes" Sep 18, 2025
@baywet
Member

baywet commented Nov 6, 2025

Sharing some notes from the meeting: This is a working document from @karenetheridge with her understanding of the spec today. If some of that is "wrong" it probably means that we need to make the specification clearer. @handrews to provide some feedback to help get to a common understanding and identify the potential for improvements in the specification.

pattern: '^3\.2\.\d+(-.+)?$'
$self:
type: string
$comment: resolved against the retrieval uri
Member

This is one of several spots where I would not use "retrieval URI." I tried to make this more clear in 3.2, so it would be good to understand what is not coming across. I would probably use a phrase like "against the document's base URI," although I don't think that that phrase is used prominently in 3.2. Maybe that is part of the lack of clarity?

Here's the wording for $self:

This string MUST be in the form of a URI reference as defined by [RFC3986] Section 4.1. The $self field provides the self-assigned URI of this document, which also serves as its base URI in accordance with [RFC3986] Section 5.1.1. Implementations MUST support identifying the targets of API description URIs using the URI defined by this field when it is present. See Establishing the Base URI for the base URI behavior when $self is absent or relative, and see Appendix F for examples of using $self to resolve references.

Perhaps we could change "See Establishing the Base URI for the base URI behavior when $self is absent or relative" to "When $self is relative, it is resolved against the document's base URI"?

In section 4.1.2.2.1 Establishing the Base URI, the third paragraph starts:

The most common base URI source that is used in the event of a missing or relative $self (in the OpenAPI Object) and (for Schema Object) $id is the retrieval URI.

My intention with that phrasing was both to note the common behavior and to emphasize that the base URI is not always the retrieval URI. Does it seem clear enough if the $self wording is changed, or is it still unclear?

Member Author

the base URI is not always the retrieval URI

I'm not clear on the difference. Isn't the retrieval URI already defined as "the initial URI we assign to the document when we start reading from it"? I recall we used the retrieval URI phrasing even if the document is never actually reachable as a URL on the network, but just a thing in application state.

It sounds like we need to come up with a clearer name for whatever this thing is, and use it throughout.

Member

@karenetheridge the clear name is "base URI." The less-clear part is that there are several possible base URIs that are searched in order, and if any of them are relative, then they are resolved against the next possible base URI. Since $self is the first place to search, it both is a base URI, and potentially needs to be resolved against the next base URI in the list.

But all of this is about base URIs. It is not about retrieval URLs/URIs, except by coincidence. They happen to be one of four possible base URI sources (and one of three after $self). But it's always about base URIs, and we should always talk in terms of base URIs.

This is for OAD URIs, though, not API URLs, where the situation is different.
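The search order described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `establish_base_uri`, the example URIs, and the ordering of the list (`$self` first, then the remaining RFC 3986 Section 5.1 sources, with `None` meaning "absent") are hypothetical and not taken from the specification text.

```python
from urllib.parse import urljoin, urlsplit


def establish_base_uri(sources):
    """Return the base URI from an ordered list of candidate sources.

    `sources` is ordered from highest to lowest priority; None marks an
    absent source. A relative candidate is resolved against the base
    established by the sources that follow it.
    """
    for index, candidate in enumerate(sources):
        if candidate is None:
            continue
        if urlsplit(candidate).scheme:
            # Absolute: this is the base URI, no further lookup needed.
            return candidate
        # Relative: resolve against the next available base URI source.
        return urljoin(establish_base_uri(sources[index + 1:]), candidate)
    raise ValueError("no base URI source available")


# Hypothetical inputs: a relative $self, no encapsulating entity,
# a retrieval URI, and an application default.
base = establish_base_uri([
    "shared/pets",                                      # $self (relative)
    None,                                               # encapsulating entity
    "https://example.com/descriptions/openapi.yaml",    # retrieval URI
    "app://default/",                                   # application default
])

# References in the document then resolve against `base`, not against
# the retrieval URI directly.
target = urljoin(base, "schemas.yaml#/components/schemas/Pet")
```

In this sketch the retrieval URI only contributes because `$self` is relative; with an absolute `$self` it would never be consulted.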

format: uri-reference
default: 'https://spec.openapis.org/oas/3.2/dialect/WORK-IN-PROGRESS'
servers:
$comment: server urls are resolved against the HTTP request uri itself (template matching ideally happens first, because of url encoding, but matching the template parts separately may produce ambiguous results, so in my implementation I concatenate the server url template with the path template and then match that against the entire uri)
Member

Is this "HTTP request uri" the request URI of the OAD document? This is a place where our URI/URL distinction can get a bit muddled.

In this case, calling this a URL matters because the document's identity (URI) might be set (by $self) to something other than the document's location (which is generally its retrieval URL).

The tricky part here is that if we are simulating the document's retrieval URL (e.g. because you are testing things and don't want to actually deploy to production, but do want to test as if it were deployed), then it's arguably being a bit more URI-ish than URL-ish? But I still lean towards calling that a URL because the whole point is that you are simulating the location, and want it to behave as a location. You are not simulating it to separate identity from location (which is what $self does). You are simulating it to get location-based (not identity-based) behavior.

This simulation use case is discussed in the third paragraph of 4.1.2.2.1 Establishing the Base URI, although that's not linked from the Server Object so maybe that's a concern.

Does all of this make sense, and if so do you have any ideas on how we can clarify it? I moved the "Relative References in API URLs" section under the Server Object, but might have accidentally dropped explanatory text elsewhere in the process.

Member Author

Is this "HTTP request uri" the request URI of the OAD document?

No, whenever I refer to an HTTP request it is in the server context -- it is a request for one of the APIs described by the document (or perhaps not matching any API, but there is still an attempt to match one of the path-items to it).

When I try to match server urls to the HTTP request, I resolve the url against the retrieval uri (not $self), and then again against the HTTP request's URI, if the retrieval URI was relative. (We have to have a URI with host and scheme in order to match it against the HTTP request URI, and having a relative retrieval URI would prevent that.)

I believe this is correct?
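For readers following along, the two-step procedure described in this comment might look roughly like the sketch below. It illustrates the approach as described here, not what the specification requires; the function name `resolve_server_url` and all URLs are hypothetical.

```python
from urllib.parse import urljoin, urlsplit


def resolve_server_url(server_url, retrieval_uri, request_uri):
    """Resolve a (possibly relative) server url for request matching.

    Step 1: resolve against the document's retrieval URI (not $self).
    Step 2: if the result still has no scheme, because the retrieval URI
    was itself relative, resolve again against the incoming request URI.
    """
    resolved = urljoin(retrieval_uri, server_url)
    if not urlsplit(resolved).scheme:
        resolved = urljoin(request_uri, resolved)
    return resolved


# Absolute retrieval URI: the request URI is never consulted.
from_absolute = resolve_server_url(
    "/api",
    "https://example.com/descriptions/openapi.yaml",
    "https://other.example/pets/42",
)

# Relative retrieval URI: host and scheme come from the request.
from_relative = resolve_server_url(
    "/api",
    "descriptions/openapi.yaml",
    "https://example.com/pets/42",
)
```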

Member

@karenetheridge I think I am confused. How can you resolve a relative server URL against an API request, when you need the server URL to make an API request? That sounds like server-side matching which is, to me, a different concern from URL construction, although they are definitely related.

I'm glad you are calling attention to server matching. It had never occurred to me to think through that process, so please bear with me as I try to wrap my head around it.

By URI/URL construction, I mean "I'm parsing this OAD and I found some part of an OAD URI or API URL and need to construct the full URI/URL before I can use it."

By server-side matching, I mean "I am handling an API request and I am trying to figure out which possible URL construction (as in the previous definition) matches this URL, after which I need to split it back into parts to extract server variables, path template parameters, and handle the query string."

Most of your comments here seem to be about construction, which is why I am confused. I think it is important to get the construction parts clear and separate from the matching guidance. Because if we aren't constructing correctly, we'll end up attempting to match to an incorrect set of potential URLs.
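The distinction between construction and matching can be made concrete with a small sketch. Everything here is an assumption for illustration: the function names, the example templates, and the simplification that a template variable matches one or more characters other than `/`, `?`, and `#` (real server variables and percent-encoding need more care, and variable names must be valid Python identifiers for the named groups to compile).

```python
import re


def construct_url(server_url, path_template, values):
    """Construction: fill in the variables to produce a concrete URL."""
    url = server_url + path_template
    for name, value in values.items():
        url = url.replace("{" + name + "}", value)
    return url


def template_to_regex(template):
    """Turn a {variable} template into an anchored regular expression."""
    # re.split with a capture group alternates literal text and names.
    parts = re.split(r"\{([^{}]+)\}", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)           # literal text
        else:
            pattern += f"(?P<{part}>[^/?#]+)"    # template variable
    return re.compile("^" + pattern + "$")


def match_request(server_url, path_template, request_url):
    """Matching: split a request URL back into its template variables."""
    # Concatenate first, then match the whole URL in one pass.
    found = template_to_regex(server_url + path_template).match(request_url)
    return found.groupdict() if found else None


server = "https://{region}.example.com/api"
path = "/pets/{petId}"

constructed = construct_url(server, path, {"region": "eu", "petId": "42"})
matched = match_request(server, path, constructed)
unmatched = match_request(server, path, "https://eu.example.com/pets/42")
```

Matching against the concatenated template, as in `match_request`, mirrors the approach described in the `servers` `$comment` above; if construction produces the wrong set of URLs, this step has nothing correct to match against.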

description:
type: string
termsOfService:
$comment: resolved against the resolved version of $self
Member

FWIW, I had to think about whether Info/License/Contact should be OAD URI-based or API URL-based. Taking a look at these fields, we do say "URI", and that's what I was expecting to find. I wonder if we should make "URI" a link to 4.1.2.2 Relative References in API Description URIs for added clarity?

flows:
$ref: '#/$defs/oauth-flows'
oauth2MetadataUrl:
$comment: resolved against the matching and de-templated server url
Member

There's an interesting wrinkle here that this is a URL (including possibly a relative URL-reference), but the resolved server URL is expected to be used as a prefix (with resolved Path Templates) rather than a normal base URL. I'm not sure how much this matters as both behaviors are well-defined, but it means that, given a server URL of https://example.com/api, a Path Template of /foo, and a security URL of /foo, the resulting API endpoint is https://example.com/api/foo but the resulting security URL is https://example.com/foo (no /api).

Which leads to another observation: The Paths Object requires Path Templates to start with /, but the Server Object does not forbid URLs from ending with /. I'm not quite sure what to do with that 😵‍💫
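The difference between prefix concatenation and ordinary reference resolution in the example above can be reproduced with Python's standard library. The variable names are illustrative only; the URLs are the ones from the comment.

```python
from urllib.parse import urljoin

server_url = "https://example.com/api"
path_template = "/foo"
security_url = "/foo"

# Path Templates are appended to the server URL, which acts as a prefix.
api_endpoint = server_url + path_template

# The security URL is resolved as a URL reference, with the server URL
# as its base. An absolute-path reference replaces the base's whole path.
resolved_security_url = urljoin(server_url, security_url)

# A relative-path reference replaces only the last segment of the base.
resolved_relative = urljoin(server_url, "token")
```

Here `api_endpoint` keeps the `/api` prefix while `resolved_security_url` loses it, which is the wrinkle being pointed out.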

Member Author

... but the resulting security URL is https://example.com/foo (no /api)

Yes, that was my observation as well. I assume that this is the best we can do, and it doesn't really make sense for the api portion to appear in the resolved form of the server url we use for matching.

It's even more clear when we change the retrieval URL to ".../api.json". That's clearly a filename, not a directory, and we don't expect to see that in any request URLs that we're trying to match.

The Paths Object requires Path Templates to start with /, but the Server Object does not forbid URLs from ending with /. I'm not quite sure what to do with that.

Yes. I played around with trying to resolve the path template against the server url, instead of appending, but the results were less favourable.

We could say something like "the templated URI resulting from the concatenation of the server url and path template portions should then be normalized", which would also allow for constructs like ../ to be removed, but I feel like there would be a lot of edge cases and surprises here in practice.

The alternative is: you end up with a uri template that contains //, which is likely not ever going to match. So don't do that.
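A minimal sketch of how the `//` arises, and of a check a tool could run to flag it. The helper names `concatenate` and `has_empty_segment` are hypothetical, and the check is a simplification that ignores everything except the `://` after the scheme.

```python
def concatenate(server_url, path_template):
    """Append the path template to the server url, as a plain prefix."""
    return server_url + path_template


def has_empty_segment(url_template):
    """Report whether '//' appears anywhere after the scheme separator."""
    rest = url_template.split("://", 1)[-1]
    return "//" in rest


with_trailing_slash = concatenate("https://example.com/api/", "/foo")
without_trailing_slash = concatenate("https://example.com/api", "/foo")

needs_attention = has_empty_segment(with_trailing_slash)
looks_fine = has_empty_segment(without_trailing_slash)
```

Under plain string comparison, `with_trailing_slash` would not equal a request for `https://example.com/api/foo`, which is the "not ever going to match" outcome described above.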

Member

it doesn't really make sense for the api portion to appear in the resolved form of the server url we use for matching.

In my example, the resolved API endpoint does include /api so I am confused as to this statement. Only the auth endpoint does not include it. This might be more of me not quite wrapping my head around the server matching use cases.

We could say something like "the templated URI resulting from the concatenation of the server url and path template portions should then be normalized", which would also allow for constructs like ../ to be removed, but I feel like there would be a lot of edge cases and surprises here in practice.

My sense is that URL normalization is something that does or doesn't happen independently. I would tend to normalize URLs as much as possible before attempting to compare, although as you note that could get surprising. (For those unfamiliar, normalization and comparison is a tricky topic addressed in-depth by RFC3986 §6.)

The alternative is: you end up with a uri template that contains //, which is likely not ever going to match. So don't do that.

I want to say that // anywhere other than as the authority prefix (as in https://example.com) gets normalized to /, but I can't find a reference for that so it might just be a thing that web servers tend to do? I'm really not sure.
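One way to probe this question is to write out the dot-segment removal step from RFC 3986 Section 5.2.4 and see what it does to an empty segment. The sketch below is a hand-written reading of that algorithm, not a library API, and it covers only the path component. In this sketch `../` is removed, but `//` passes through unchanged, which suggests that collapsing empty segments is not part of this particular normalization step.

```python
def remove_dot_segments(path):
    """Remove '.' and '..' segments from a path (RFC 3986, 5.2.4)."""
    output = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]                  # "/./" becomes "/"
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]                  # "/../" becomes "/"
            if output:
                output.pop()                 # drop the previous segment
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if any) to
            # the output, up to but not including the next "/".
            start = 1 if path.startswith("/") else 0
            end = path.find("/", start)
            if end == -1:
                output.append(path)
                path = ""
            else:
                output.append(path[:end])
                path = path[end:]
    return "".join(output)


dots_removed = remove_dot_segments("/api/../foo")
double_slash_kept = remove_dot_segments("/api//foo")
```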

3 participants