Recap Type Spec

  • Version 0.3.0
  1. Introduction
  2. Types
    1. null
    2. bool
    3. int
    4. float
    5. string
    6. bytes
    7. list
    8. map
    9. struct
    10. enum
    11. union
  3. Documentation
  4. Defaults
  5. Optional Types
  6. Logical Types
    1. build.recap.Date
    2. build.recap.Decimal
    3. build.recap.Duration
    4. build.recap.Interval
    5. build.recap.Time
    6. build.recap.Timestamp
    7. build.recap.UUID
  7. Aliases
    1. Examples
    2. Built-in Aliases
  8. Changelog


Introduction

This document defines Recap’s types. It is intended to be the authoritative specification. Recap type implementations must adhere to this document.

This spec uses YAML to provide examples, but Recap’s types are agnostic to the serialized format. Recap types may be defined in YAML, TOML, JSON, XML, or any other compatible language.

Recap’s type spec is also available as a JSON schema, which you can use to validate your type definitions.

What is Recap’s Type Spec?

Recap’s type spec describes a data model that can represent relational database schemas and RPC IDLs with minimal type coercion. It’s similar to Apache Arrow’s Schema.fbs or Apache Kafka’s

Why Does Recap Type Spec Exist?

Data passes through web services, databases, message brokers, and object stores. Each system describes its data differently. Developers have historically written conversion logic and tooling for each system. Repeatedly building custom logic and tooling is inefficient and error-prone. Recap’s type system describes schemas in a standard data model so a single set of converters and tools can be built to deal with data as it moves between systems.


Types

Recap supports the following types:

  • null
  • bool
  • int
  • float
  • string
  • bytes
  • list
  • map
  • struct
  • enum
  • union

This section defines each type.

NOTE: Attribute types in the spec define what type a Recap implementation should use when storing the attribute. Attribute types are not the same as Recap types unless noted.


null

A null value.

bool

A boolean value.


int

An integer value with a fixed bit length.

Attributes

Name | Description | Type | Required | Default
bits | Number of bits. | 32-bit signed integer | YES |
signed | False means the integer is unsigned. | Boolean | NO | true

Examples

# A 32-bit signed integer
type: int
bits: 32
signed: true


float

An IEEE 754 encoded floating point value with a fixed bit length.

Attributes

Name | Description | Type | Required | Default
bits | Number of bits. | 32-bit signed integer | YES |

Examples

# A 32-bit IEEE 754 encoded float
type: float
bits: 32


string

A UTF-8 encoded Unicode string with an optional maximum byte length.

Attributes

Name | Description | Type | Required | Default
bytes | The maximum number of bytes to store the string. Must be >= 1 if set. If unset, the string length is unbounded. | 64-bit signed integer or null | NO | null
variable | If true, the string is variable-length (<= bytes). If false, bytes must be set. | Boolean | NO | true

Examples

# A VARCHAR(255)
type: string
bytes: 255
variable: true
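
The bytes and variable attributes also combine to describe fixed-length and unbounded strings. The sketch below is illustrative only; the SQL type names in the comments are the nearest analogues, not mappings defined by this spec.

```yaml
# A fixed-length string of 10 bytes, comparable to CHAR(10)
type: string
bytes: 10
variable: false
---
# An unbounded variable-length string, comparable to TEXT.
# bytes is left unset, so it takes its default of null.
type: string
```

The same two attributes behave the same way on the bytes type.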


bytes

A byte array value.

Attributes

Name | Description | Type | Required | Default
bytes | The maximum number of bytes that can be stored in the byte array. Must be >= 1 if set. If unset, the byte length is unbounded. | 64-bit signed integer or null | NO | null
variable | If true, the byte array is variable-length (<= bytes). If false, bytes must be set. | Boolean | NO | true

Examples

type: bytes
bytes: 255
variable: true


list

A list of values all sharing the same type.

Attributes

Name | Description | Type | Required | Default
values | Type for all items in the list. | Recap type object | YES |
length | The maximum length of the list. Must be >= 1 if set. If unset, the list size is unbounded. | 64-bit signed integer or null | NO | null
variable | If true, the list is variable-length (<= length if length is set). If false, length must be set. | Boolean | NO | true

Examples

# A list of unsigned 64-bit integers
type: list
values:
  type: int
  bits: 64
  signed: false
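
The length and variable attributes are not exercised by the example above. A minimal sketch of a fixed-length list, assuming a three-element vector as the use case:

```yaml
# A fixed-length list of exactly three 32-bit floats (for example, an x/y/z vector)
type: list
values:
  type: float
  bits: 32
length: 3
variable: false
```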


map

A map of key/value pairs where each key is the same type and each value is the same type.

Attributes

Name | Description | Type | Required | Default
keys | Type for all items in the key set. | Recap type object | YES |
values | Type for all value items. | Recap type object | YES |

Examples

# A map from 32-bit strings to boolean values
type: map
keys:
  type: string
  bytes: 2_147_483_647
values:
  type: bool


struct

An ordered collection of Recap types. Table schemas are typically represented as Recap structs, though the two are not the same. Recap structs support additional features like nested structs (like a Protobuf Message) and unnamed fields (like a CSV file with no header).

struct Attributes

Name | Description | Type | Required | Default
name | The struct’s name. | String | NO |
fields | An ordered list of Recap types. | List of Recap type objects | NO | []

field Attributes

Recap types in the fields attribute can have an extra name attribute to set a field’s name.

Name | Description | Type | Required | Default
name | The field’s name. | String | NO |

Examples

# A struct with a required signed 32-bit integer field called "id"
# and a string field called "email"
type: struct
fields:
  - name: id
    type: int
    bits: 32
  - name: email
    type: string
    bytes: 255
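
Nested structs and unnamed fields are mentioned above but not shown. The following sketch illustrates both; the struct and field names (user, address, city) are hypothetical.

```yaml
# A struct with an unnamed field and a nested struct
type: struct
name: user
fields:
  # An unnamed field, like a column in a CSV file with no header
  - type: int
    bits: 64
  # A nested struct field, like a Protobuf message inside a message
  - name: address
    type: struct
    fields:
      - name: city
        type: string
        bytes: 255
```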


enum

An enumeration of string symbols.

Attributes

Name | Description | Type | Required | Default
symbols | An ordered list of string symbols. | List of strings | YES |

Examples

# An enum with RGB symbols
type: enum
symbols: ["RED", "GREEN", "BLUE"]


union

A value that can be one of several types. It is acceptable for a value to be more than one of the types in the union.

Attributes

Name | Description | Type | Required | Default
types | A list of types the value can be. | List of Recap type objects | YES |

Examples

# A union type of null or a 32-bit signed int
type: union
types:
  - type: "null"
  - type: int
    bits: 32

Unions can also be defined as a list of types:

# A union type of null or a boolean
type: ["null", "bool"]


Documentation

All types support a doc attribute, which allows developers to document types.

Name | Description | Type | Required | Default
doc | A documentation string for the type. | String or null | NO | null

type: union
doc: A union type of null or a 32-bit signed int
types:
  - type: "null"
  - type: int
    bits: 32


Defaults

Recap types can have a default attribute with a literal value. The value is used when the field is not set in a struct.

Name | Description | Type | Required | Default
default | The default value for a reader if the field is not set in the struct. | Literal of any type | NO |

# A struct with a field called "id" that defaults to 0
type: struct
fields:
  - name: id
    type: int
    bits: 32
    default: 0

A missing default is differentiated from a default with a null value. An unset default is treated as “no default”, while a default that’s been set to null is treated as a null default.
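
The distinction between a missing default and a null default can be sketched as follows. The field names are hypothetical.

```yaml
type: struct
fields:
  # No default attribute: the field has no default
  - name: id
    type: int
    bits: 32
  # default explicitly set to null: the default value is null
  - name: nickname
    type: union
    types:
      - type: "null"
      - type: string
        bytes: 255
    default: null
```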

NOTE: Database defaults often appear as strings like nextval('\"public\".some_id_seq'::regclass) or 'USD'::character varying. Such defaults are left to the developer to interpret based on the database they’re using.

Optional Types

Optional types are expressed as a union with a null type and a null default (similar to Avro’s fields).

# A struct with an optional string field called "secondary_phone"
type: struct
fields:
  - name: secondary_phone
    type: union
    types: ["null", "string32"]
    default: null

Unions can be cumbersome, so Recap supports a shorthand syntax using the optional attribute:

# A struct with an optional string field called "secondary_phone"
type: struct
fields:
  - name: secondary_phone
    type: string32
    optional: true

The above shorthand is equivalent to the union example.

Optional types also work for unions:

# An optional union of a 32-bit signed int or a 32-bit float
type: union
types:
  - type: int32
  - type: float32
optional: true

Optionals also work with aliases:

# A struct with an optional string field called "secondary_phone"
type: struct
fields:
  - name: phone
    alias: phone_type
    type: string32
  - name: secondary_phone
    type: phone_type
    optional: true

Optionality is not inherited in aliases:

type: struct
fields:
  - name: phone
    alias: phone_type
    type: string32
    optional: true
  - name: secondary_phone
    type: phone_type
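
One way to read the struct above once the alias and the optional shorthand are expanded is sketched below. This is an interpretation of the intended semantics, not normative text: only phone becomes a union with null, while secondary_phone resolves to the plain aliased type.

```yaml
type: struct
fields:
  # optional: true expands to a union with null and a null default
  - name: phone
    type: union
    types: ["null", "string32"]
    default: null
  # The alias carries the string32 type only, so this field is not optional
  - name: secondary_phone
    type: string32
```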

Logical Types

Logical types annotate one of the 11 Recap types listed in the Types section at the top of the spec. Logical types add context to Recap types that helps converters. For example, a decimal logical type can be defined as:

type: bytes
logical: build.recap.Decimal
precision: 6
scale: 3
bytes: 16
variable: false

Developers may define their own logical types.

Logical types may require additional attributes such as the precision and scale attributes shown above. Recap converters may use logical annotations to convert schemas to more accurate types such as SQL’s DECIMAL type (rather than BYTES).

Logical type names are globally unique, so they must include a unique dotted namespace prefix.
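
A developer-defined logical type might look like the sketch below. The name com.mycorp.logical.CurrencyCode and its meaning are assumptions for illustration; Recap does not define this type.

```yaml
# A hypothetical user-defined logical type for three-letter currency codes.
# The dotted namespace prefix keeps the name globally unique.
type: string
logical: com.mycorp.logical.CurrencyCode
bytes: 3
variable: false
```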

Built-in Logical Types

Recap comes with a collection of built-in logical types:

  • build.recap.Date: An integer representing a length of time since the UNIX epoch without timezones and leap seconds.
  • build.recap.Decimal: A byte array representing an arbitrary-precision decimal number.
  • build.recap.Duration: An integer representing a length of time and time unit without timezones and leap seconds.
  • build.recap.Interval: A byte array representing an interval of time on a calendar measured in months, days, and a duration of intra-day time with a time unit.
  • build.recap.Time: An integer representing a length of time since midnight without timezones and leap seconds.
  • build.recap.Timestamp: An integer representing a length of time (in a time unit) since a specific epoch.
  • build.recap.UUID: A string representing a UUID in 8-4-4-4-12 format as defined in RFC 4122.


build.recap.Date

Elapsed time since the UNIX epoch without timezones and leap seconds.

build.recap.Date logical types must annotate int types.

Attributes

Name | Description | Type | Required | Default
unit | A string time unit. | String literal of year, month, day, hour, minute, second, millisecond, microsecond, nanosecond, or picosecond. | YES |

Examples

type: int
logical: build.recap.Date
bits: 32
unit: day


build.recap.Decimal

An arbitrary-precision decimal number. This type is the same as Avro’s Decimal.

build.recap.Decimal logical types must annotate bytes types.

Attributes

Name | Description | Type | Required | Default
precision | Total number of digits. 123.456 has a precision of 6. | 32-bit signed integer | YES |
scale | Digits to the right of the decimal point. 123.456 has a scale of 3. | 32-bit signed integer | YES |

Examples

type: bytes
logical: build.recap.Decimal
precision: 6
scale: 3
bytes: 16
variable: false


build.recap.Duration

A length of time without timezones and leap seconds. This type is the same as Arrow’s Duration but with a superset of time units.

build.recap.Duration logical types must annotate int types.

Attributes

Name | Description | Type | Required | Default
unit | A string time unit. | String literal of year, month, day, hour, minute, second, millisecond, microsecond, nanosecond, or picosecond. | YES |

Examples

type: int
logical: build.recap.Duration
bits: 64
unit: millisecond


build.recap.Interval

An interval of time on a calendar. This measurement allows you to measure time without worrying about leap seconds, leap years, and time changes. Years, quarters, hours, and minutes can be expressed using this type.

Intervals are measured in months, days, and an intra-day time measurement. Months and days are each 32-bit signed integers. The remainder is a 64-bit signed integer measured in a certain time unit. Leap seconds are ignored.

build.recap.Interval logical types must annotate bytes types with the variable attribute set to false and the bytes attribute set to 16.

build.recap.Interval is the same as Avro’s Duration but with a superset of time units and 16 bytes instead of 12 bytes.

Attributes

Name | Description | Type | Required | Default
unit | A string time unit. | String literal of year, month, day, hour, minute, second, millisecond, microsecond, nanosecond, or picosecond. | YES |

Examples

type: bytes
logical: build.recap.Interval
bytes: 16
variable: false
unit: millisecond
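
The months, days, and remainder components (4 + 4 + 8 bytes, 16 in total) can express common calendar spans as shown in the sketch below. These are illustrative values, not Recap type definitions; the keys months, days, and remainder are labels chosen for this example, and the byte-level serialization is not shown.

```yaml
# Illustrative interval values, assuming unit: millisecond for the remainder
one_year:       {months: 12, days: 0, remainder: 0}
one_quarter:    {months: 3, days: 0, remainder: 0}
one_week:       {months: 0, days: 7, remainder: 0}
# 90 minutes * 60 seconds * 1000 milliseconds = 5_400_000
ninety_minutes: {months: 0, days: 0, remainder: 5_400_000}
```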


build.recap.Time

Elapsed time since midnight without timezones and leap seconds.

build.recap.Time logical types must annotate int types.

Attributes

Name | Description | Type | Required | Default
unit | A string time unit. | String literal of year, month, day, hour, minute, second, millisecond, microsecond, nanosecond, or picosecond. | YES |

Examples

type: int
logical: build.recap.Time
bits: 32
unit: millisecond


build.recap.Timestamp

Time elapsed since a specific epoch.

A timestamp with no timezone is a DATETIME in database parlance: a date and time as you would see it on a wristwatch.

A timestamp with a timezone represents the amount of time elapsed since the 1970-01-01 00:00:00 epoch in the UTC time zone (regardless of the timezone that’s specified). Readers must translate the UTC timestamp to a timestamp value for the specified timezone. See Apache Arrow’s Schema.fbs documentation for more details.

This type is the same as Arrow’s timestamp but with a superset of time units and an arbitrary bit length.

build.recap.Timestamp logical types must annotate int types.

Attributes

Name | Description | Type | Required | Default
unit | A string time unit. | String literal of year, month, day, hour, minute, second, millisecond, microsecond, nanosecond, or picosecond. | YES |
timezone | An optional Olson timezone database string. | String or null | NO | null

Examples

type: int
logical: build.recap.Timestamp
bits: 64
unit: millisecond
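
The example above has no timezone. A sketch of a timestamp with the timezone attribute set, using America/Los_Angeles as an arbitrary Olson database name:

```yaml
# A 64-bit timestamp in milliseconds with a timezone.
# The stored integer still counts from the epoch in UTC; readers
# translate it to a value in the America/Los_Angeles timezone.
type: int
logical: build.recap.Timestamp
bits: 64
unit: millisecond
timezone: America/Los_Angeles
```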


build.recap.UUID

A string representing a UUID in 8-4-4-4-12 format as defined in RFC 4122.

build.recap.UUID logical types must annotate string types with a bytes attribute greater than or equal to 36.

Attributes

build.recap.UUID logical types have no additional attributes.

Examples

type: string
logical: build.recap.UUID
bytes: 36
variable: false


Aliases

All types support an alias attribute. An alias must reference one of the 11 Recap types defined above. Aliases are useful in larger data structures where complex types are repeatedly used.

Aliases are globally unique, so they must include a unique dotted namespace prefix. Naked aliases (aliases with no dotted namespace) are reserved for Recap’s built-in aliases (see the next section).

Attributes

Name | Description | Type | Required | Default
alias | An alias for the type. | String or null | NO | null

Examples

Aliases are referenced using the type field:

type: struct
doc: A book with pages
fields:
  - name: previous
    alias: com.mycorp.models.Page
    type: int
    bits: 32
    signed: false
  - name: next
    type: com.mycorp.models.Page

Recap will treat this struct the same as:

type: struct
doc: A book with pages
fields:
  - name: previous
    alias: com.mycorp.models.Page
    type: int
    bits: 32
    signed: false
  # The `next` field has the same type and attributes
  # as the `previous` one; they're both pages.
  - name: next
    type: int
    bits: 32
    signed: false

Recap also allows cyclic references:

alias: com.mycorp.models.LinkedListUint32
type: struct
doc: A linked list of unsigned 32-bit integers
fields:
  - name: value
    type: int
    bits: 32
    signed: false
  - name: next
    type: com.mycorp.models.LinkedListUint32

And attribute overrides:

type: struct
fields:
  - name: id
    alias: com.mycorp.models.Uint24
    type: int
    bits: 24
    signed: false
  - name: signed_id
    type: com.mycorp.models.Uint24
    # Let's make this a signed int24
    signed: true
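
Following the same resolution shown for com.mycorp.models.Page, the override example above would presumably be treated the same as the struct below. This is a sketch of the expected result rather than text from the spec.

```yaml
type: struct
fields:
  - name: id
    alias: com.mycorp.models.Uint24
    type: int
    bits: 24
    signed: false
  - name: signed_id
    # bits is taken from the alias; signed is overridden
    type: int
    bits: 24
    signed: true
```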

But alias inheritance is not allowed:

type: struct
doc: All fields have the same type
fields:
  - name: field1
    alias: "com.mycorp.models.Field"
    type: int
    bits: 32
    signed: false
  - name: field2
    type: com.mycorp.models.Field
    # An alias of an alias isn't allowed.
    alias: com.mycorp.models.FieldAlias
  - name: field3
    type: com.mycorp.models.FieldAlias

Built-in Aliases

Recap comes with a collection of built-in aliases:

  • int8: An 8-bit signed integer.
  • uint8: An 8-bit unsigned integer.
  • int16: A 16-bit signed integer.
  • uint16: A 16-bit unsigned integer.
  • int32: A 32-bit signed integer.
  • uint32: A 32-bit unsigned integer.
  • int64: A 64-bit signed integer.
  • uint64: A 64-bit unsigned integer.
  • float16: A 16-bit IEEE 754 encoded floating point number.
  • float32: A 32-bit IEEE 754 encoded floating point number.
  • float64: A 64-bit IEEE 754 encoded floating point number.
  • string32: A variable-length UTF-8 encoded unicode string with a maximum length of 2_147_483_647 bytes.
  • string64: A variable-length UTF-8 encoded unicode string with a maximum length of 9_223_372_036_854_775_807.
  • bytes32: A variable-length byte array with a maximum length of 2_147_483_647 bytes.
  • bytes64: A variable-length byte array with a maximum length of 9_223_372_036_854_775_807.
  • uuid: A fixed-length 36 byte string in UUID 8-4-4-4-12 string format as defined in RFC 4122.
  • decimal128: An arbitrary-precision decimal number stored in a fixed-length 128-bit byte array.
  • decimal256: An arbitrary-precision decimal number stored in a fixed-length 256-bit byte array.
  • duration64: A length of time and time unit without timezones and leap seconds.
  • interval128: An interval of time on a calendar measured in months, days, and a duration of intra-day time with a time unit.
  • time32: Time since midnight without timezones and leap seconds in a 32-bit signed integer.
  • time64: Time since midnight without timezones and leap seconds in a 64-bit signed integer.
  • timestamp64: Time elapsed (in a time unit) since a specific epoch.
  • date32: Date since the UNIX epoch without timezones and leap seconds in a 32-bit integer.
  • date64: Date since the UNIX epoch without timezones and leap seconds in a 64-bit integer.
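
As a sketch of how built-in aliases shorten definitions, the two documents below describe the same struct. The field names are hypothetical, and the expansions are inferred from the alias descriptions above and the build.recap.UUID example, so treat them as illustrative.

```yaml
# A struct that uses built-in aliases
type: struct
fields:
  - name: id
    type: uint64
  - name: external_id
    type: uuid
---
# The same struct with the aliases written out
type: struct
fields:
  - name: id
    type: int
    bits: 64
    signed: false
  - name: external_id
    type: string
    logical: build.recap.UUID
    bytes: 36
    variable: false
```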


Changelog

  • 0.3.0
    • Allow strings and bytes with undefined byte lengths. (#413)
  • 0.2.0
    • Add optional attribute. (#335)
  • 0.1.1
    • Make the bytes attribute optional for string and bytes types. (#284 #292 #293)
  • 0.1.0
    • Initial release.