Module Kernel.CoverageSource

This module implements coverability analysis of the LR automaton and a corresponding matching machine to detect uncovered failing configurations.

Design overview:

Implementation details:

To compute coverage, one has to track many information at each step: possible stack prefixes (current LR states), on-going reductions, unaccepted lookahead symbols. Doing so all at once is expensive (the state space is huge), and unnecessary, as relevant information can be recovered later.

The current implementation is the result of dozens of experimentations to find a balance between efficiency, precision and ease of use.

The Andor module is non-deterministic in reductions: a node tracks a single reduction, together with the precise LR state and set of lookahead symbols.

The Deter module determinizes Andor but ignores the set of lookahead symbols. (That is, the lookahead symbols of the different Andor nodes in the kernel of a Deter node are unrelated; one might accept a symbol, another reject it, and a third one just asks for more reduction to decide what to do with lookahead). Determinizing with respect to lookahead would cause a combinatorial explosion without providing more actionable information.

The Enum module refines Deter with the unaccepted lookahead symbols of each node. Since lookaheads are not part of the Deter kernel, this is path-dependent: the shortest paths witnessing that a given lookahead is unaccepted are remembered. What matters is that there is at least one way to reach a given node for a given lookahead. Since all reductions applicable to a given configuration are tracked simultaneously, this approach also works with GLR automata: we know that the lookahead has not been accepted by any of the possible reductions

Sourceval string_of_items_for_filter : 'a Kernel__Info.grammar -> 'a Info.Lr0.n Fix.Indexing.index -> string list

Convert LR(0) items to a filtered string representation for display. Two optimizations are applied:

  • Skip trivial items of the form symbol: symbol . ... where the LHS equals the first RHS symbol.
  • Group items sharing the same LHS and prefix but differing in the post-dot suffix, collapsing them with a wildcard (_*).
Sourceval print_pattern : 'a Kernel__Info.grammar -> 'a Info.Lr0.n Fix.Indexing.index -> string list

Format LR(0) items into a multi-line visual pattern with indentation, used for coverage filter display. The incoming symbol of the LR(0) state determines the prefix style.

Sourcemodule Andor : sig ... end
Sourcemodule Deter : sig ... end
Sourceval dyn_array : unit -> 'a list array Stdlib.ref * (int -> 'a -> unit)

Create a dynamically growing array backed by a reference. Returns the reference and a setter that automatically resizes the array (doubling as needed) when an out-of-bounds index is set. Values are accumulated as lists at each index.

Sourcemodule Enum : sig ... end
Sourcemodule Cover : sig ... end
Sourcemodule Extract : sig ... end

Extracts witness paths from coverage analysis graphs. Propagates rejectable lookaheads backward from sink nodes through predecessor links, identifying maximal nodes where user-reached goals intersect with unaccepted lookaheads.

Sourcemodule Report : sig ... end

Formats and emits coverage results for user-facing output. Provides local (per-state) and global (cross-state) reporting modes, with sentence deduplication and cost-based ordering.