Commit d0792a9

docs: clarify URN required format with regex
The required regex is: ^extension:[^:]+:[^:]+$
1 parent 1f7aa85 commit d0792a9
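
For readers who want to check identifiers against the rule stated in the commit message, the following is a minimal Python sketch. The regex is copied verbatim from the commit; the function names `is_valid_extension_urn` and `split_extension_urn` are illustrative assumptions and are not part of the Substrait codebase.

```python
import re

# Regex copied verbatim from the commit message.
EXTENSION_URN_RE = re.compile(r"^extension:[^:]+:[^:]+$")


def is_valid_extension_urn(urn: str) -> bool:
    """Return True if `urn` has the form extension:<OWNER>:<ID>."""
    return EXTENSION_URN_RE.match(urn) is not None


def split_extension_urn(urn: str) -> tuple[str, str]:
    """Return (owner, id) for a valid extension URN; raise ValueError otherwise."""
    if not is_valid_extension_urn(urn):
        raise ValueError(f"not a valid extension URN: {urn!r}")
    # A valid URN contains exactly two colons, so this yields three parts.
    _, owner, ext_id = urn.split(":")
    return owner, ext_id


if __name__ == "__main__":
    print(split_extension_urn("extension:io.substrait:functions_arithmetic"))
```

Note that the regex only requires the literal `extension` prefix and two non-empty, colon-free parts. It does not enforce the reverse domain name convention recommended for `OWNER`, so a stricter check would have to be layered on top.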

3 files changed, with 4 additions and 4 deletions

proto/substrait/extensions/extensions.proto

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ message SimpleExtensionURN {

 // The extension URN that uniquely identifies this extension. This must follow the
 // format extension:<OWNER>:<ID> and serves as the "namespace" of this extension.
-// The URN must be valid RFC 8141 format without the urn: prefix.
+// This must conform to the following regex: ^extension:[^:]+:[^:]+$
 string urn = 2;
 }

site/docs/extensions/index.md

Lines changed: 2 additions & 2 deletions
@@ -13,14 +13,14 @@ Some kinds of primitives are so frequently extended that Substrait defines a sta
 * Window Functions
 * Table Functions

-To extend these items, developers can create one or more YAML files that describe the properties of each of these extensions. Each YAML file must include a required `urn` field that uniquely identifies the extension. While these identifiers are URN-like but not technically URNs (they lack the `urn:` prefix), they will be referred to as `extension URNs` for clarity. These URNs must be valid [RFC 8141](https://www.rfc-editor.org/rfc/rfc8141.html) format without the `urn:` prefix.
+To extend these items, developers can create one or more YAML files that describe the properties of each of these extensions. Each YAML file must include a required `urn` field that uniquely identifies the extension. While these identifiers are URN-like but not technically URNs (they lack the `urn:` prefix), they will be referred to as `extension URNs` for clarity.

 This extension URN uses the format `extension:<OWNER>:<ID>`, where:

 - `OWNER` represents the organization or entity providing the extension and should follow [reverse domain name convention](https://en.wikipedia.org/wiki/Reverse_domain_name_notation) (e.g., `io.substrait`, `com.example`, `org.apache.arrow`) to prevent name collisions
 - `ID` is the specific identifier for the extension (e.g., `functions_arithmetic`, `custom_types`)

-Thus, if `extension:<OWNER>:<ID>` is our URN, then `urn:extension:<OWNER>:<ID>` must be a valid [RFC 8141 URN](https://www.rfc-editor.org/rfc/rfc8141.html).
+These URNs must match the regex `^extension:[^:]+:[^:]+$`.

 The YAML file is constructed according to the [YAML Schema](https://github.com/substrait-io/substrait/blob/main/text/simple_extensions_schema.yaml). Each definition in the file corresponds to the YAML-based serialization of the relevant data structure. If a user only wants to extend one of these types of objects (e.g. types), a developer does not have to provide definitions for the other extension points.

site/docs/serialization/binary_serialization.md

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ For simple extensions, a plan references the extension URNs associated with the

 Simple extensions within a plan are split into three components: an extension URN, an extension declaration and a number of references.

-* **Extension URN**: A unique identifier for the extension following the format `extension:<OWNER>:<ID>` that identifies a YAML document specifying one or more specific extensions. Declares an anchor that can be used in extension declarations. The URN with the `urn:` prefix added must conform to [RFC 8141](https://www.rfc-editor.org/rfc/rfc8141.html).
+* **Extension URN**: A unique identifier for the extension following the format `extension:<OWNER>:<ID>` that identifies a YAML document specifying one or more specific extensions. Declares an anchor that can be used in extension declarations. The extension URN must conform to the regex `^extension:[^:]+:[^:]+$`.
 * **Extension Declaration**: A specific extension within a single YAML document. The declaration combines a reference to the associated extension URN along with a unique key identifying the specific item within that YAML document (see [Function Signature](../extensions/index.md#function-signature)). It also defines a declaration anchor. The anchor is a plan-specific unique value that the producer creates as a key to be referenced elsewhere.
 * **Extension Reference**: A specific instance or use of an extension declaration within the plan body.

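The three components described in the `binary_serialization.md` text above (extension URN, declaration, reference) can be pictured as two levels of anchor lookup. The sketch below is only an illustration of that idea in Python: the class `ExtensionDeclaration`, the function `resolve`, the anchor values, and the sample key are all hypothetical and do not mirror the actual Substrait protobuf messages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtensionDeclaration:
    # All field names here are hypothetical, chosen for illustration.
    urn_anchor: int          # points at an extension URN entry
    declaration_anchor: int  # plan-specific key chosen by the producer
    key: str                 # identifies the item within the YAML document


def resolve(
    reference: int,
    urns: dict[int, str],
    declarations: list[ExtensionDeclaration],
) -> tuple[str, str]:
    """Map a reference in the plan body to (extension URN, item key)."""
    by_anchor = {d.declaration_anchor: d for d in declarations}
    decl = by_anchor[reference]            # reference -> declaration
    return urns[decl.urn_anchor], decl.key  # declaration -> URN


# Hypothetical plan fragment: one URN, one declaration, one reference.
urns = {1: "extension:io.substrait:functions_arithmetic"}
declarations = [ExtensionDeclaration(urn_anchor=1, declaration_anchor=10, key="add:i32_i32")]
print(resolve(10, urns, declarations))
```

The point of the indirection is that the plan body only carries small anchor values, while the full URN string appears once per plan.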