diff --git a/website/docs/spec/index.md b/website/docs/spec/index.md index ef09665375..05c169a1b5 100644 --- a/website/docs/spec/index.md +++ b/website/docs/spec/index.md @@ -111,12 +111,14 @@ All MCAP records are serialized as follows: Record type is a single byte opcode, and record content length is a uint64 value. -Records may be extended by adding new fields at the end of existing fields. Readers should ignore any unknown fields. +Future changes to this specification may extend records by adding new fields at the end of existing fields. Readers should ignore any unknown fields. > The Footer and Message records will not be extended, since their formats do not allow for backward-compatible size changes. Each record definition below contains a `Type` column. See the [Serialization](#serialization) section on how to serialize each type. +Record content may end before all known fields have been serialized. Readers should treat a missing field as having that field type's [zero value](#zero-values). + ### Header (op=0x01) | Bytes | Name | Type | Description | @@ -155,13 +157,17 @@ A Channel record defines an encoded stream of messages on a topic. Channel records are uniquely identified within a file by their channel ID. A Channel record must occur at least once in the file prior to any message referring to its channel ID. Any two channel records sharing a common ID must be identical. -| Bytes | Name | Type | Description | -| ----- | ---------------- | --------------------- | ----------------------------------------------------------------------------------------------------------- | -| 2 | id | uint16 | A unique identifier for this channel within the file. | -| 2 | schema_id | uint16 | The schema for messages on this channel. A schema_id of 0 indicates there is no schema for this channel. | -| 4 + N | topic | String | The channel topic. | -| 4 + N | message_encoding | String | Encoding for messages on this channel. The [well-known message encodings][message_encodings] are preferred. | -| 4 + N | metadata | `Map` | Metadata about this channel | +The `uuid` field may be used to associate channels between files. For example, an application that merges MCAP files +may merge channels with the same non-nil `uuid` into the same output channel. + +| Bytes | Name | Type | Description | +| ----- | ---------------- | --------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | +| 2 | id | uint16 | A unique identifier for this channel within the file. | +| 2 | schema_id | uint16 | The schema for messages on this channel. A schema_id of 0 indicates there is no schema for this channel. | +| 4 + N | topic | String | The channel topic. | +| 4 + N | message_encoding | String | Encoding for messages on this channel. The [well-known message encodings][message_encodings] are preferred. | +| 4 + N | metadata | `Map` | Metadata about this channel | +| 16 | uuid | uuid | A globally unique identifier for this channel. A [nil UUID](https://datatracker.ietf.org/doc/html/rfc4122#section-4.1.7) indicates no globally unique identifier is available. | Channel records may be duplicated in the summary section. @@ -319,22 +325,33 @@ A Summary Offset record contains the location of records within the summary sect ## Serialization +### Zero Values + +Record field types have a defined _zero value_. This is the value that readers should use if that field +is missing from the serialized MCAP record. + ### Fixed-width types Multi-byte integers (`uint16`, `uint32`, `uint64`) are serialized using [little-endian byte order](https://en.wikipedia.org/wiki/Endianness). +The [zero value](#zero-values) of all integer types is 0. + ### String Strings are serialized using a `uint32` byte length followed by the string data, which should be valid [UTF-8](https://en.wikipedia.org/wiki/UTF-8). +The [zero value](#zero-values) of a string is the empty string. + ### Bytes Bytes is sequence of bytes with no additional requirements. +The [zero value](#zero-values) of a byte sequence is the empty sequence. + ### Tuple Tuple represents a pair of values. The first value has type first_type and the second has type second_type. @@ -353,6 +370,8 @@ Example `Tuple`: +The [zero value](#zero-values) of a tuple is the first value type's zero value followed by the second value type's zero value. + ### Array Arrays are serialized using a `uint32` byte length followed by the serialized array elements. @@ -365,6 +384,8 @@ An array of uint64 is specified as `Array` and serialized as: > Since arrays use a `uint32` byte length prefix, the maximum size of the serialized array elements cannot exceed 4,294,967,295 bytes. +The [zero value](#zero-values) of an array is the empty array. + ### Timestamp `uint64` nanoseconds since a user-understood epoch (i.e unix epoch, robot boot time, etc.) @@ -383,6 +404,14 @@ A `Map` would be serialized as: A serialization which has duplicate keys may cause indeterminate decoding. +The [zero value](#zero-values) of a map is an empty map. + +### UUID + +An UUID is a 128-bit identifier described in [RFC-4122](https://datatracker.ietf.org/doc/html/rfc4122). + +The [zero value](#zero-values) of an UUID is the [nil UUID](https://datatracker.ietf.org/doc/html/rfc4122#section-4.1.7). + ## Diagrams The following diagrams demonstrate various valid MCAP files.