Module Wire

Binary wire format descriptions.

A wire format is a sequence of typed Fields -- integers, bitfields, enumerations, byte arrays -- laid out at fixed bit offsets in a buffer. A Codec binds those fields to an OCaml record, giving you:

open Wire

type header = { version : int; length : int }

let f_version = Field.v "Version" (bits ~width:4 U8)
let f_length = Field.v "Length" uint16be
let bf_version = Codec.(f_version $ fun h -> h.version)
let bf_length = Codec.(f_length $ fun h -> h.length)

let codec =
  Codec.v "Header"
    (fun version length -> { version; length })
    [ bf_version; bf_length ]

(* Staged zero-copy access *)
let get_version = Staged.unstage (Codec.get codec bf_version)
let v = get_version buf 0

(* Full-record round-trip *)
let () = Codec.encode codec { version = 1; length = 42 } buf 0
let h = Codec.decode codec buf 0

The same description can be projected to an EverParse 3D schema via Everparse, for verified C parser generation.
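The "staged" access in the example above can be pictured with a small standalone sketch. It does not use Wire or its Staged module; stage_getter and its parameters are hypothetical names, and the layout (a 4-bit field in the high nibble of byte 0) is an assumption made for illustration. The point is only that the layout is resolved once, and the returned closure does the minimum work per read.

```ocaml
(* Hypothetical staging sketch; not Wire's implementation. *)

(* Stage 1 runs once per field: turn the layout into constants. *)
let stage_getter ~byte_off ~shift ~width =
  let mask = (1 lsl width) - 1 in
  (* Stage 2 runs per read: one load, one shift, one mask. *)
  fun buf base -> (Bytes.get_uint8 buf (base + byte_off) lsr shift) land mask

(* Assumed layout: a 4-bit field in the high nibble of the first byte. *)
let get_version = stage_getter ~byte_off:0 ~shift:4 ~width:4

let buf = Bytes.of_string "\x45\x00\x2a"
let v = get_version buf 0 (* high nibble of 0x45 *)
```

Calling get_version many times reuses the precomputed mask and offsets, which is the usual motivation for staging an accessor.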

module Staged : sig ... end

Expressions

Expressions describe sizes, constraints, and dependencies between fields. They are part of the wire description itself: no evaluation happens at interface construction time.

They are used whenever the layout depends on previously decoded data: array lengths, byte-slice sizes, field constraints, and similar dependent structure.
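As a rough illustration of "description first, evaluation later", here is a minimal standalone sketch of a typed expression tree. It is not Wire's 'a expr, which is abstract on this page; the constructors Int, Ref, Add and Le and the eval function are hypothetical names chosen for the example.

```ocaml
(* Hypothetical miniature of a typed expression language; not Wire's
   actual ['a expr]. *)
type _ expr =
  | Int : int -> int expr (* constant *)
  | Ref : string -> int expr (* value of an earlier field, by name *)
  | Add : int expr * int expr -> int expr
  | Le : int expr * int expr -> bool expr

(* Building an expression only allocates a tree; nothing is computed. *)
let payload_size = Add (Ref "Length", Int (-2))
let length_ok = Le (Int 2, Ref "Length")

(* Evaluation happens later, once earlier fields have been decoded and
   can be looked up through [env]. *)
let rec eval : type a. (string -> int) -> a expr -> a =
 fun env e ->
  match e with
  | Int n -> n
  | Ref name -> env name
  | Add (a, b) -> eval env a + eval env b
  | Le (a, b) -> eval env a <= eval env b

(* A decoding context in which the field "Length" was read as 42. *)
let env = function "Length" -> 42 | name -> invalid_arg name
let size_now = eval env payload_size
```

The same tree can be evaluated against different decoded inputs, or, in principle, printed into another schema language instead of being evaluated.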

type 'a expr
type bitfield =
  | U8
  | U16
  | U16be
  | U32
  | U32be
type bit_order =
  | Msb_first
  | Lsb_first
    (*

    Which end of a packed base word the first declared bitfield occupies.

    • Msb_first (default): the first declared field lands at the most significant bit of the base word, matching how RFC, CCSDS, and IETF specs draw their bit diagrams. Copy-pasting a spec into field declarations just works.
    • Lsb_first: the first declared field lands at bit 0 of the base word, matching MSVC's C bit-field packing. Useful when mirroring a C struct.

    Bit order is independent of byte order: any combination of base word and bit order is a valid wire description, and the EverParse 3D projection reverses declaration order within a bit group when necessary so that every pairing emits a valid 3D schema with identical byte layout.

    *)
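The two bit orders can be made concrete with a small standalone sketch. The helpers below are hypothetical and are not Wire functions; they only show which bits of a base word a field lands on under each convention, given the total width of the fields declared before it.

```ocaml
(* Hypothetical helpers illustrating the two conventions; not part of Wire. *)
let mask width = (1 lsl width) - 1

(* [offset] is the total width of the fields declared before this one. *)
let extract_msb_first ~word_bits ~offset ~width word =
  (word lsr (word_bits - offset - width)) land mask width

let extract_lsb_first ~offset ~width word = (word lsr offset) land mask width

(* 0x45 = 0b0100_0101, for instance the first byte of an IPv4 header. *)
let word = 0x45

(* Two 4-bit fields declared in order inside one 8-bit base word. *)
let first_msb = extract_msb_first ~word_bits:8 ~offset:0 ~width:4 word
let second_msb = extract_msb_first ~word_bits:8 ~offset:4 ~width:4 word
let first_lsb = extract_lsb_first ~offset:0 ~width:4 word
let second_lsb = extract_lsb_first ~offset:4 ~width:4 word
```

With Msb_first the first declared field reads the high nibble (4) and the second the low nibble (5); with Lsb_first the two are swapped.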
type endian =
  | Little
  | Big
    (*

    Byte order for multi-byte integers.

    *)
type 'a typ
type param

Untyped formal parameter declaration. Create via Param.v.

module Param : sig ... end

Formal parameters for codecs.

module Action : sig ... end

Small imperative language for side-effects during validation.

val int : int -> int expr

Constant integer expression.

val int64 : int64 -> int64 expr

Constant 64-bit integer expression.

val sizeof : 'a typ -> int expr

Size of a fixed-size wire description.

val sizeof_this : int expr

Number of bytes already consumed in the enclosing sequential description.

This is meaningful only while interpreting a larger description, typically a struct or record-shaped layout. It is used in dependent sizes and constraints for later fields.

val field_pos : int expr

Zero-based index of the current field in the enclosing sequential description.

Like sizeof_this, this is context-dependent and is mainly used inside dependent field constraints and projections.

module Expr : sig ... end

Arithmetic, bitwise, and comparison operators on expressions.

Fields

A field is a named, typed piece of a wire layout -- the building block for everything else.

Define each field once with Field.v, then reuse it everywhere: bind it into a Codec for zero-copy access and full-record round-trips, reference it from dependent expressions with Field.ref, or project it to EverParse 3D via Everparse.schema.

let f_version = Field.v "Version" (bits ~width:4 U8)
let f_length = Field.v "Length" uint16be
let f_data = Field.v "Data" (byte_slice ~size:(Field.ref f_length))

Fields carry no buffer position -- that comes from the Codec they are bound into. The same field can appear in multiple codecs.

module Field : sig ... end

Type Descriptions

The primitive constructors describe immediate wire values. The combinators build larger descriptions out of them.

In ordinary use, one starts from primitive integer or byte descriptions and combines them with bits, array, byte_slice, where, enum, and related combinators.

val uint8 : int typ

Unsigned 8-bit integer.

val uint16 : int typ

Unsigned 16-bit little-endian integer.

val uint16be : int typ

Unsigned 16-bit big-endian integer.

val uint32 : int typ

Unsigned 32-bit little-endian integer.

val uint32be : int typ

Unsigned 32-bit big-endian integer.

val uint63 : int typ

Unsigned 63-bit little-endian integer carried on 8 bytes.

val uint63be : int typ

Unsigned 63-bit big-endian integer carried on 8 bytes.

val uint64 : int64 typ

uint64 is an unsigned 64-bit little-endian integer represented as int64.

val uint64be : int64 typ

uint64be is an unsigned 64-bit big-endian integer represented as int64.
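For readers less used to byte order, the following standalone sketch shows what the little-endian and big-endian variants mean for the same bytes. It uses only the OCaml standard library (Bytes, Int32, Int64, Printf) rather than Wire, and assumes a 64-bit OCaml so that 32-bit values fit in int.

```ocaml
let buf = Bytes.of_string "\x12\x34\x56\x78"

(* 16-bit reads of the first two bytes. *)
let le16 = Bytes.get_uint16_le buf 0 (* 0x3412 *)
let be16 = Bytes.get_uint16_be buf 0 (* 0x1234 *)

(* 32-bit reads; the mask keeps the result non-negative when the top bit
   of the word is set (assumes a 64-bit [int]). *)
let le32 = Int32.to_int (Bytes.get_int32_le buf 0) land 0xFFFF_FFFF
let be32 = Int32.to_int (Bytes.get_int32_be buf 0) land 0xFFFF_FFFF

(* An unsigned 64-bit value carried in [int64]: the largest value has
   the bit pattern of -1L, so use the unsigned helpers to handle it. *)
let big = Bytes.get_int64_be (Bytes.make 8 '\xff') 0
let big_is_large = Int64.unsigned_compare big 1L > 0
let big_text = Printf.sprintf "%Lu" big
```

The last three bindings illustrate why an int64 carrying an unsigned value should be compared with Int64.unsigned_compare and printed with an unsigned format rather than treated as signed.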

val uint : ?endian:endian -> int expr -> int typ

uint size is an unsigned integer of size bytes (1-7) with the given byte order (default Big). The size may be a dynamic expression for parameter-driven widths.
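A variable-width unsigned read of this kind can be sketched as follows. This is a standalone illustration, not Wire's implementation: read_uint is a hypothetical helper, the local endian type merely mirrors the one documented above, and the size is a plain int here rather than an int expr.

```ocaml
(* Local mirror of the documented [endian] type, for a self-contained
   example. *)
type endian = Little | Big

(* Hypothetical helper: read [size] bytes (1-7) at [off] as an unsigned
   integer. Seven bytes is 56 bits, which always fits a 63-bit [int]. *)
let read_uint ?(endian = Big) ~size buf off =
  if size < 1 || size > 7 then invalid_arg "read_uint: size must be in 1..7";
  let rec go acc i =
    if i = size then acc
    else
      (* Always consume the most significant byte first. *)
      let idx =
        match endian with
        | Big -> off + i
        | Little -> off + size - 1 - i
      in
      go ((acc lsl 8) lor Bytes.get_uint8 buf idx) (i + 1)
  in
  go 0 0

let buf = Bytes.of_string "\x01\x02\x03"
let be24 = read_uint ~size:3 buf 0 (* 0x010203 *)
let le24 = read_uint ~endian:Little ~size:3 buf 0 (* 0x030201 *)
```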

val bits : ?bit_order:bit_order -> width:int -> bitfield -> int typ

bits ~width base declares a bitfield of width bits inside base.

~bit_order selects which end of the base word the first declared bitfield occupies. It defaults to Msb_first, which makes the DSL match how protocol specifications draw their bit diagrams. Pass ~bit_order:Lsb_first when mirroring MSVC-style C bit-fields.

val map : decode:('w -> 'a) -> encode:('a -> 'w) -> 'w typ -> 'a typ

map ~decode ~encode t views a wire value through conversion functions.

val bool : bool -> bool expr

Constant boolean expression.

val bit : int typ -> bool typ

bit t views an integer wire value as a boolean. Zero is false, non-zero is true.

val lookup : 'a list -> int typ -> 'a typ

lookup table t decodes an integer as a zero-based index into a finite table.

The decoded integer selects the corresponding element from the list. An out-of-range index produces an Invalid_tag parse error (reported via result in decode / decode_string / decode_bytes). Encoding raises Invalid_argument if the value is not in the table.
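The asymmetry described here (a result on decode, an exception on encode) can be sketched in isolation. The functions lookup_decode and lookup_encode below are hypothetical stand-ins written against the standard library only, and the local parse_error type reproduces just the one case needed.

```ocaml
(* Only the case needed for this sketch; the full type is documented
   under Parsing Errors. *)
type parse_error = Invalid_tag of int

(* Decoding: an out-of-range index is reported as a value, because the
   index comes from untrusted input. *)
let lookup_decode table i =
  if i < 0 then Error (Invalid_tag i)
  else
    match List.nth_opt table i with
    | Some v -> Ok v
    | None -> Error (Invalid_tag i)

(* Encoding: a value missing from the table is a programmer error. *)
let lookup_encode table v =
  let rec find i = function
    | [] -> invalid_arg "lookup: value not in table"
    | x :: _ when x = v -> i
    | _ :: rest -> find (i + 1) rest
  in
  find 0 table

let table = [ "idle"; "active"; "closed" ]
let decoded = lookup_decode table 1
let rejected = lookup_decode table 5
let encoded = lookup_encode table "closed"
```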

val empty : unit typ

empty is a description carrying no bytes and producing ().

val all_bytes : string typ

All remaining bytes of the enclosing sequential description as a string.

This is mainly useful as the final field of a struct or record-shaped layout.

val all_zeros : string typ

All remaining bytes of the enclosing sequential description, requiring each of them to be zero.

This is mainly useful as the final field of a struct or record-shaped layout.

val where : bool expr -> 'a typ -> 'a typ

Refine a description with a boolean constraint.

type ('elt, 'seq) seq_map =
  | Seq_map : {
      empty : 'b;
      add : 'b -> 'elt -> 'b;
      finish : 'b -> 'seq;
      iter : ('elt -> unit) -> 'seq -> unit;
    } -> ('elt, 'seq) seq_map
    (*

    Builder for sequence accumulation (Jsont-style).

    *)
val seq_list : ('a, 'a list) seq_map

Default builder: accumulate into a list.
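To show how such a builder might be written and consumed, the sketch below redeclares the seq_map type locally so that it compiles without Wire. The list builder is one plausible way to satisfy the signature of seq_list, not necessarily how Wire defines it; seq_array and build are hypothetical names.

```ocaml
(* Local copy of the documented type, so the example is self-contained. *)
type ('elt, 'seq) seq_map =
  | Seq_map : {
      empty : 'b;
      add : 'b -> 'elt -> 'b;
      finish : 'b -> 'seq;
      iter : ('elt -> unit) -> 'seq -> unit;
    }
      -> ('elt, 'seq) seq_map

(* One plausible list builder: accumulate in reverse, flip at the end. *)
let seq_list : ('a, 'a list) seq_map =
  Seq_map
    {
      empty = [];
      add = (fun acc x -> x :: acc);
      finish = List.rev;
      iter = List.iter;
    }

(* A hypothetical custom builder producing an array instead of a list. *)
let seq_array : ('a, 'a array) seq_map =
  Seq_map
    {
      empty = [];
      add = (fun acc x -> x :: acc);
      finish = (fun acc -> Array.of_list (List.rev acc));
      iter = Array.iter;
    }

(* How a decoder could drive a builder: the accumulator type ['b] stays
   hidden behind the constructor. *)
let build (type elt seq) (m : (elt, seq) seq_map) (xs : elt list) : seq =
  match m with
  | Seq_map r -> r.finish (List.fold_left r.add r.empty xs)

let as_list = build seq_list [ 1; 2; 3 ]
let as_array = build seq_array [ 1; 2; 3 ]
```

Because the accumulator type is existential, a builder is free to use a Buffer, a map, or any other intermediate structure without changing the decoder that drives it.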

val array : len:int expr -> 'a typ -> 'a list typ

Repetition of a description, with length computed from an expression.

val array_seq : ('a, 'seq) seq_map -> len:int expr -> 'a typ -> 'seq typ

Repetition with custom builder.

val byte_array : size:int expr -> string typ

Fixed-size byte sequence copied as a string.

val byte_slice : size:int expr -> Bytesrw.Bytes.Slice.t typ

Fixed-size byte sequence exposed as a zero-copy slice.

val nested : size:int expr -> 'a typ -> 'a typ

nested ~size t parses one value of type t inside a length-prefixed region of size bytes.

This is for layouts where a length expression denotes the size of a region, but that region is known to contain exactly one value, such as a single nested message.

val nested_at_most : size:int expr -> 'a typ -> 'a typ

nested_at_most ~size t is like nested, but treats size as an upper bound rather than an exact size.

This is for length-prefixed regions where the one logical element may consume fewer bytes than the available space.

val enum : string -> (string * int) list -> int typ -> int typ

enum name cases base validates that the decoded integer is one of the named values. The result is still int -- use variants instead if you want to decode to proper OCaml values. enum is mainly useful for 3D projection where the name and cases appear in the generated .3d file.

val variants : string -> (string * 'a) list -> int typ -> 'a typ

variants name cases base maps integer values to OCaml values via a named enumeration. Unlike enum, this converts to proper OCaml values.

type 'a case_def
val case : ?index:int -> 'w typ -> inject:('w -> 'a) -> project:('a -> 'w option) -> 'a case_def

Tagged branch of a casetype.

val default : 'w typ -> inject:('w -> 'a) -> project:('a -> 'w option) -> 'a case_def

Default branch of a casetype.

val casetype : ?first:int -> ?step:int -> string -> int typ -> 'a case_def list -> 'a typ

Tag-dispatched choice between several descriptions.

val size : 'a typ -> int option

size t is the fixed wire size of a description, if known statically.

Parsing Errors

Direct decoding reports failures as values of type parse_error. The cases distinguish structural failure on input, such as unexpected end of input or a constraint violation, from semantic failure such as an invalid enum or tag.

type parse_error =
  | Unexpected_eof of {
      expected : int;
      got : int;
    }
  | Constraint_failed of string
  | Invalid_enum of {
      value : int;
      valid : int list;
    }
  | Invalid_tag of int
  | All_zeros_failed of {
      offset : int;
    }
exception Validation_error of parse_error

Raised by Codec.validate on constraint or where-clause failure.

val pp_parse_error : Format.formatter -> parse_error -> unit

Pretty-printer for decode errors.
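When a caller wants to branch on the kind of failure rather than just print it, a pattern match over the cases is the natural tool. The sketch below redeclares parse_error locally so it compiles on its own; describe and is_structural are hypothetical helpers, and the split between structural and semantic cases follows the wording of this section rather than any function exported by Wire. Treating All_zeros_failed as semantic is an assumption.

```ocaml
(* Local copy of the documented type, so the example is self-contained. *)
type parse_error =
  | Unexpected_eof of { expected : int; got : int }
  | Constraint_failed of string
  | Invalid_enum of { value : int; valid : int list }
  | Invalid_tag of int
  | All_zeros_failed of { offset : int }

(* Hypothetical message builder, shown only to exercise every case. *)
let describe = function
  | Unexpected_eof { expected; got } ->
      Printf.sprintf "unexpected end of input: expected %d, got %d" expected got
  | Constraint_failed what -> Printf.sprintf "constraint failed: %s" what
  | Invalid_enum { value; valid } ->
      Printf.sprintf "invalid enum value %d (valid: %s)" value
        (String.concat ", " (List.map string_of_int valid))
  | Invalid_tag tag -> Printf.sprintf "invalid tag %d" tag
  | All_zeros_failed { offset } ->
      Printf.sprintf "non-zero byte at offset %d" offset

(* Structural failures versus semantic ones, following the prose above. *)
let is_structural = function
  | Unexpected_eof _ | Constraint_failed _ -> true
  | Invalid_enum _ | Invalid_tag _ | All_zeros_failed _ -> false

let msg = describe (Invalid_enum { value = 7; valid = [ 0; 1; 2 ] })
```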

Direct Decoding and Encoding

These functions interpret a 'a typ directly and exchange ordinary OCaml values with bytes.

Use them when you want a value now: one-shot decoding, streaming code built around Bytesrw, tests, small tools, and formats that are naturally consumed as values.

Use Codec instead when the format is record-shaped and the main goal is repeated access to individual fields in an existing buffer, without allocating an OCaml record for each read.

decode decodes one value from the current reader position.

If the description references parameters, bind them with Param.bind before calling decode. Output parameters are updated during decoding; read them back with Param.get.

For the zero-copy codec path, prefer Codec.decode_with which takes an explicit Param.env.

Decoding is prefix-based: success does not imply that the reader is exhausted afterwards.

val decode_string : 'a typ -> string -> ('a, parse_error) result

Decodes one value from the start of the string.

Trailing bytes, if any, are left uninterpreted.

val decode_bytes : 'a typ -> bytes -> ('a, parse_error) result

Decodes one value from the start of the byte sequence.

Trailing bytes, if any, are left uninterpreted.

Direct Encoding

Encoding follows the same description language as decoding. The functions in this section are the direct counterparts of decode, decode_string, and decode_bytes: they work with whole OCaml values rather than field-level accessors.

Unlike decoding, encoding is exception-based rather than result-based. Decoding fails on untrusted input (truncated data, constraint violations), so callers need structured errors. Encoding fails only on programmer errors (wrong description for the value, unsupported form), which are not data-dependent and should not be silently ignored.

val encode : 'a typ -> 'a -> Bytesrw.Bytes.Writer.t -> unit

Encodes one value to a Bytesrw.Bytes.Writer.t.

This function is exception-based. Unsupported description forms, such as unresolved type references, raise an exception rather than returning an error value.

val encode_to_bytes : 'a typ -> 'a -> bytes

Encodes one value to freshly allocated bytes.

val encode_to_string : 'a typ -> 'a -> string

Encodes one value to a freshly allocated string.

Codecs

A codec is the primary way to work with a wire format. It binds Fields to an OCaml record type and provides:

All three modes derive from the same definition, so the layout is always consistent. Everparse.schema projects the same codec to a verified C parser -- no separate description to maintain.

module Codec : sig ... end

Nested Codec Combinators

These combinators extend the type language with structured sub-codecs, optional fields, and repeated elements -- for protocols like CCSDS TM frames where the layout depends on mission configuration.

val codec : 'r Codec.t -> 'r typ

codec c embeds a sub-codec as a field type. The sub-codec's decode and encode functions are called at the appropriate offset.

val optional : bool expr -> 'a typ -> 'a option typ

optional present t is a field that is present when present evaluates to true, absent otherwise. Absent fields decode as None and consume zero bytes.

val optional_or : bool expr -> default:'a -> 'a typ -> 'a typ

optional_or present ~default t is a field that decodes to the inner value when present is true, or returns default when absent. No option wrapper -- zero allocation for the absent case.
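The decoding behaviour of the two optional forms can be sketched without Wire. In this standalone illustration a decoder is simply a function from a buffer and an offset to a value and the next offset; decode_optional, decode_optional_or and u8 are hypothetical names, and present is an already evaluated bool rather than a bool expr.

```ocaml
(* A toy decoder: returns the value and the offset after it. *)
let u8 buf off = (Bytes.get_uint8 buf off, off + 1)

(* Present: decode and wrap in [Some]. Absent: [None], zero bytes used. *)
let decode_optional ~present decode buf off =
  if present then
    let v, next = decode buf off in
    (Some v, next)
  else (None, off)

(* Same shape, but the absent case returns [default] with no wrapper. *)
let decode_optional_or ~present ~default decode buf off =
  if present then decode buf off else (default, off)

let buf = Bytes.of_string "\x2a\x07"
let some_v = decode_optional ~present:true u8 buf 0
let none_v = decode_optional ~present:false u8 buf 0
let dflt_v = decode_optional_or ~present:false ~default:0 u8 buf 0
```

In both absent cases the returned offset is unchanged, so the following field is decoded from the same position.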

val repeat : size:int expr -> 'a typ -> 'a list typ

repeat ~size t parses elements of type t repeatedly until size bytes have been consumed.

val repeat_seq : ('a, 'seq) seq_map -> size:int expr -> 'a typ -> 'seq typ

Repeat with custom builder.
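The difference between array (bounded by an element count) and repeat (bounded by a byte budget) is easy to miss, so here is a standalone sketch of both loops. It uses the same toy decoder shape as a function from buffer and offset to value and next offset; decode_array, decode_repeat and u16be are hypothetical names, and the sketch deliberately does not handle an element that overruns the byte budget.

```ocaml
(* A toy decoder: returns the value and the offset after it. *)
let u16be buf off = (Bytes.get_uint16_be buf off, off + 2)

(* array-style: stop after [len] elements, however many bytes they use. *)
let decode_array ~len decode buf off =
  let rec go acc n off =
    if n = 0 then (List.rev acc, off)
    else
      let v, next = decode buf off in
      go (v :: acc) (n - 1) next
  in
  go [] len off

(* repeat-style: stop once [size] bytes have been consumed, however many
   elements that yields. Overrun handling is left out of this sketch. *)
let decode_repeat ~size decode buf off =
  let stop = off + size in
  let rec go acc off =
    if off >= stop then (List.rev acc, off)
    else
      let v, next = decode buf off in
      go (v :: acc) next
  in
  go [] off

let buf = Bytes.of_string "\x00\x01\x00\x02\x00\x03"
let two_elements = decode_array ~len:2 u16be buf 0
let six_bytes = decode_repeat ~size:6 u16be buf 0
```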

Export

Everparse is the pure export layer. The normal workflow is:

For unusual EverParse constructs that have no codec equivalent yet, see the explicit escape hatch Everparse.Raw.

module Everparse : sig ... end

ASCII Bit Diagrams

RFC-style 32-bit-wide ASCII bit layout diagrams, following the conventions of RFC 791 and similar documents.

module Ascii : sig ... end