123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558559560561562563564565566567568569570571572573574575576577578579580581582583584585586587588589590591592593594595596597598599600601602603604605606607608609610611612613614615616617618619620621622623624625626627628629630631632633634635636637638639640641642643644645646647648649650651652653654655656657658659660661662663664665666667668669670671672673674675676677678679680681682683684685686687688689690691692693694695696697698699700701702703704705706707708709710711712713714715716717718719720721722723724725726727728729730731732733734735736737738739740741742743744745746747748749750751752753754755756757758759760761762763764765766767768769770771772773774775776777778779780781782783784785786787788789790791792793794795796797798799800801802803804805806807808809810811812813814815816817818819820821822823824825826827828829830831832833834835836837838839840841842843844845846847848849850851852853854855856857858859860861862863864865866867868869870871872873874875876877878879880881882883884885886887888889890891892893894895896897898899900901902903904905906907908909910911912913914915916917918919920921922923924925926927928929930931932933934935936937938939940941942943944945946947948949950951952953954955956957958959960961962963964965966967968969970971972973974975976977978979980981982983(* MIT License
*
* Copyright (c) 2026 Frédéric Bour
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in all
* copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*)(** This module implements coverability analysis of the LR automaton and a
corresponding matching machine to detect uncovered failing configurations.
Design overview:
- Andor module: builds an AND-OR graph where OR-nodes represent
non-deterministic configurations (multiple possible reductions) and
AND-nodes represent deterministic consumption of LR state (popping and
branching from the top of the LR stack).
- Deter module: constructs a deterministic automaton from the AND-OR graph
by merging all OR-nodes and grouping AND-nodes branching on the same LR
state. In other words, an edge reaching an OR-node is an ϵ-transition and
an edge reaching an AND-node is labelled by an LR(1) state.
- Enum module: Augment the deterministic graph by computing the lookahead
symbols not yet accepted when reaching a node. The unaccepted symbols of
the sinks (nodes without successor) describe all the possible failures,
which we can illustrate with counter-examples by following the
predecessors back to the initial node.
- Cover module: Determine the coverage of an error matching machine by
computing a synchronized product of the machine with the enum graph.
- Extract module: extracts counter-examples for fallible reductions from the
enumeration and coverage graphs.
- Report module: formats and presents enumeration and coverage results to
users.
Implementation details:
To compute coverage, one has to track many information at each step:
possible stack prefixes (current LR states), on-going reductions, unaccepted
lookahead symbols. Doing so all at once is expensive (the state space is
huge), and unnecessary, as relevant information can be recovered later.
The current implementation is the result of dozens of experimentations to
find a balance between efficiency, precision and ease of use.
The Andor module is non-deterministic in reductions: a node tracks a single
reduction, together with the precise LR state and set of lookahead symbols.
The Deter module determinizes Andor but ignores the set of lookahead symbols.
(That is, the lookahead symbols of the different Andor nodes in the kernel
of a Deter node are unrelated; one might accept a symbol, another reject it,
and a third one just asks for more reduction to decide what to do with
lookahead).
Determinizing with respect to lookahead would cause a combinatorial
explosion without providing more actionable information.
The Enum module refines Deter with the unaccepted lookahead symbols of
each node. Since lookaheads are not part of the Deter kernel, this is
path-dependent: the shortest paths witnessing that a given lookahead is
unaccepted are remembered.
What matters is that there is at least one way to reach a given node for a
given lookahead.
Since all reductions applicable to a given configuration are tracked
simultaneously, this approach also works with GLR automata: we know that
the lookahead has not been accepted by any of the possible reductions
*)openUtilsopenMiscopenFix.IndexingopenInfo(** Convert LR(0) items to a filtered string representation for display.
Two optimizations are applied:
- Skip trivial items of the form [symbol: symbol . ...] where the LHS
equals the first RHS symbol.
- Group items sharing the same LHS and prefix but differing in the
post-dot suffix, collapsing them with a wildcard ([_*]). *)letstring_of_items_for_filterglr0=letdecomposeitem=letprod,pos=Item.descgiteminletrhs=Production.rhsgprodin(Production.lhsgprod,Array.subrhs0pos,Array.subrhspos(Array.lengthrhs-pos))inletlines=ref[]inletappend(lhs,pre,post)=matchprewith(* Optimization 1: skip items of the form symbol: symbol . ... *)|[|first|]whenIndex.equal(Symbol.inj_nglhs)first->()|_->(* Optimization 2: group items of the form
sym: α . x . β₁, sym: α . x.β₂, ...
as sym: α . x _* *)match!lineswith|(lhs',pre',post')::restwhenIndex.equallhslhs'&&array_equalIndex.equalprepre'->beginmatchpost',postwith|`Suffix[||],_|_,[||]->pushlines(lhs,pre,`Suffixpost)|`Suffixpost',postwhenIndex.equalpost'.(0)post.(0)->lines:=(lhs',pre',`Wildpost.(0))::rest|`Wildpost0,postwhenIndex.equalpost0post.(0)->()|_->pushlines(lhs,pre,`Suffixpost)end|_->pushlines(lhs,pre,`Suffixpost)inIndexSet.rev_map_elements(Lr0.itemsglr0)decompose|>List.sortcompare|>List.iterappend;letprint_item(lhs,pre,post)=letsymssyms=Array.to_list(Array.map(Symbol.to_stringg)syms)inString.concat" "@@(Nonterminal.to_stringglhs^":")::symspre@"."::matchpostwith|`Suffixpost->symspost|`Wildsym->[Symbol.to_stringgsym;"_*"]inList.rev_mapprint_item!lines(** Format LR(0) items into a multi-line visual pattern with indentation,
used for coverage filter display. The incoming symbol of the LR(0) state
determines the prefix style. *)letprint_patternglr0=letfirst,other,suffix=matchLr0.incomingglr0with|SomesymwhenSymbol.is_nonterminalgsym->"| [_* /"," /","]"|Some_|None->"| /"," /",""inletrecpreparepad=function|[]->assertfalse|[x]->[pad^x^suffix]|x::xs->(pad^x)::prepareotherxsinpreparefirst(string_of_items_for_filterglr0)moduleAndor=struct(** Non-deterministic AND-OR graph for coverability analysis.
OR nodes represent choice points where multiple reductions may fire.
AND nodes represent deterministic stack consumption during a reduction. *)type('g,'lrc,'n)node={lrc:'lrcindex;rpos:'gRedpos.tOpt.nindex;active:'gterminalindexset;mutablesuccessors:('gterminalindexset*'nindex)array;}type('g,'lrc,'n)_graph={initials:('lrc,'gterminalindexset*'nindex)indexmap;nodes:('n,('g,'lrc,'n)node)vector;}type('g,'lrc)graph=Graph:('g,'lrc,'n)_graph->('g,'lrc)graph(** Returns true when the node is an OR node, i.e., at a reduction choice
point: either between reductions ([rpos = None]) or at the start of a
reduction ([rpos] at position zero). *)letis_or_nodertablenode=matchOpt.prjnode.rposwith|None->true|Somerpos->Redpos.is_zerortablerpos(** Returns the top LR(1) state on the stack for a given node.
For OR nodes ([rpos = None]), this is the label of the LRC state.
For AND nodes mid-reduction on a nonterminal, this is the goto target
from the LRC's label state. *)lettop_stateg(stacks:_Automata.stacks)rtablenode=matchOpt.prjnode.rposwith|None->stacks.labelnode.lrc|Somerpos->matchRedpos.previousrtablerposwith|Either.Right_->assertfalse|Either.Leftnt->Transition.find_goto_targetg(stacks.labelnode.lrc)nt(** Build the AND-OR graph by fixpoint iteration.
Starting from each LRC state with all regular terminals as active
lookahead, the graph explores reductions and stack pops. OR nodes
branch into all applicable reductions; AND nodes chain through
right-to-left consumption of production RHS symbols. *)letmake(typeglrc)(g:ggrammar)(rcs:(glr1,gRedgraph.reduction_closure)vector)(stacks:(g,lrc)Automata.stacks)(rtable:gRedpos.table)=letopenIndexBufferinletmoduleMap=Map.Make(structtypet=gRedpos.tOpt.nindex*gterminalindexsetletcompare(i1,s1)(i2,s2)=letc=Index.comparei1i2inifc<>0thencelseIndexSet.compares1s2end)inletmoduleNodes=Gen.Make()inletnodes=Nodes.get_generator()inlettable=Vector.makestacks.domainMap.emptyinlettodo=ref[]inletget_statelrcrposactive=letmap=table.:(lrc)inletaccept=matchOpt.prjrposwith|None->rcs.:(stacks.labellrc).accepting|Somerpos->matchRedpos.previousrtablerposwith|Either.Right_->IndexSet.empty|Either.Leftnt->rcs.:(Transition.find_goto_targetg(stacks.labellrc)nt).acceptinginletaccept=IndexSet.interactiveacceptinletactive=IndexSet.diffactiveacceptinletkey=(rpos,active)inmatchMap.find_optkeymapwith|Someix->(accept,ix)|None->letix=Gen.addnodes{lrc;rpos;active;successors=[||]}intable.:(lrc)<-Map.addkeyixmap;pushtodoix;(accept,ix)inletinitials=letregular=Terminal.regularginIndexMap.inflate(funlrc->get_statelrcOpt.noneregular)stacks.topsinbeginletpropagations=ref0inletpropagatesource=incrpropagations;letsnode=Gen.getnodessourceinassert(Array.lengthsnode.successors=0);letlrc=snode.lrcinmatchOpt.prjsnode.rposwith|None->letedges=ref[]inList.iteribeginfunints->edges:=IndexMap.foldbeginfunntlabelacc->letrpos=Opt.some(Redpos.injrtablent(i+1))inget_statelrcrposlabel::accendnts!edgesendrcs.:(stacks.labellrc).all_reductions;snode.successors<-Array.of_list!edges|Somerpos->beginmatchRedpos.previousrtablerposwith|Either.Rightrpos'->snode.successors<-IndexSet.map_to_array(stacks.prevlrc)(funlrc->get_statelrc(Opt.somerpos')snode.active)|Either.Leftnt->letlr1=stacks.labellrcinletgoto=Transition.find_goto_targetglr1ntinletedges=ref[]inList.iteribeginfunints->edges:=IndexMap.foldbeginfunntlabelacc->letlabel=IndexSet.intersnode.activelabelinifIndexSet.is_not_emptylabelthenget_statelrc(Opt.some(Redpos.injrtablenti))label::accelseaccendnts!edgesendrcs.:(goto).all_reductions;snode.successors<-Array.of_list!edgesendinletcounter=ref0infixpoint~counter~propagatetodo;stopwatch1"Andor construction: %d iterations, %d propagations, %d nodes"!counter!propagations(cardinalNodes.n);end;letnodes=Gen.freezenodesinGraph{initials;nodes}endmoduleDeter=struct(** Deterministic automaton obtained by merging Andor OR nodes into single
states and grouping AND nodes that branch on the same LR state.
Lookahead sets are intentionally excluded from the DFA kernel to avoid
combinatorial explosion. *)type('g,'n,'m)node={index:'mindex;ker:'nindexset;mutabletop:'glr1indexset;mutableaccept:'gterminalindexset;mutablesuccessors:('gterminalindexset*('g,'n,'m)node)array;}type('g,'lrc,'n,'m)_graph={initials:('lrc,'gterminalindexset*'mindex)indexmap;nodes:('m,('g,'n,'m)node)vector;}type('g,'lrc,'n)graph=Graph:('g,'lrc,'n,'m)_graph->('g,'lrc,'n)graph(** Returns the LRC state shared by all Andor nodes in the kernel of a
Deter node. All Andor nodes in a Deter kernel are guaranteed to have
the same LRC state. *)letget_lrcagrnode=agr.Andor.nodes.:(IndexSet.choosenode.ker).lrc(** Build the deterministic graph from an Andor graph.
OR nodes are collapsed via epsilon transitions; AND nodes are grouped
by their branching LR state. The resulting DFA tracks which Andor
nodes form each kernel, along with accumulated accept sets and top
LR(1) states. *)letmake(typeglrcn)(g:ggrammar)(_rcs:(glr1,gRedgraph.reduction_closure)vector)(stacks:(g,lrc)Automata.stacks)(rtable:gRedpos.table)(gr:(g,lrc,n)Andor._graph):(g,lrc,n)graph=letopenIndexBufferinletmoduleNodes=Gen.Make()inletnodes=Nodes.get_generator()inlettable=Vector.makestacks.domainIndexSet.Map.emptyinlettodo=ref[]inletget_stateker=letn=IndexSet.choosekerinletlrc=gr.nodes.:(n).lrcinassert(IndexSet.for_all(funn'->Index.equallrcgr.nodes.:(n').lrc)ker);letmap=table.:(lrc)inmatchIndexSet.Map.find_optkermapwith|Someix->ix|None->letr=Gen.reservenodesinletindex=Gen.indexrinletnode={index;ker;successors=[||];top=IndexSet.empty;accept=IndexSet.empty}inGen.commitnodesrnode;table.:(lrc)<-IndexSet.Map.addkernodemap;pushtodonode;nodeinletinitials=IndexMap.map(fun(accept,ix)->(accept,(get_state(IndexSet.singletonix)).index))gr.initialsinletpropagatenode=lettop=refIndexSet.emptyinletaccept0=refIndexSet.emptyinletrecaccumulate(accept,ker)acc=letnode=gr.nodes.:(ker)inifAndor.is_or_nodertablenodethen(top:=IndexSet.add(Andor.top_stategstacksrtablenode)!top;accept0:=IndexSet.unionaccept!accept0;Array.fold_rightaccumulatenode.successorsacc)elsematchaccwith|None->letaccepts=Array.map(fun(accept',_)->IndexSet.unionacceptaccept')node.successorsinletsuccs=Array.map(fun(_,succ)->IndexSet.singletonsucc)node.successorsinSome(accepts,succs)|Some(accepts,succs)->Array.iteri(funi(accept',succ)->accepts.(i)<-IndexSet.union(IndexSet.unionacceptaccept')accepts.(i);succs.(i)<-IndexSet.addsuccsuccs.(i))node.successors;accinletaccumulation=IndexSet.fold_right(funaccx->accumulate(IndexSet.empty,x)acc)Nonenode.kerinnode.top<-!top;node.accept<-!accept0;matchaccumulationwith|None->()|Some(accepts,succs)->node.successors<-Array.map2(funab->(a,get_stateb))acceptssuccsinletcounter=ref0infixpoint~counter~propagatetodo;stopwatch1"Deter construction: %d iterations, %d initials, %d nodes"!counter(IndexMap.cardinalinitials)(cardinalNodes.n);letnodes=Gen.freezenodesinGraph{initials;nodes}(** Compute the set of LR(0) states reachable from a Deter node.
Traverses the Andor graph from the node's kernel, following OR nodes
down through reduction closure trees, and collecting the LR(0) base
state at each leaf. Used for mapping coverage results back to
human-readable grammar states. *)letget_lr0(typeglrcnm)(g:ggrammar)(rcs:(glr1,gRedgraph.reduction_closure)vector)(stacks:(g,lrc)Automata.stacks)(rtable:gRedpos.table)(agr:(g,lrc,n)Andor._graph)(dgr:(g,lrc,n,m)_graph)ix=letstates=refIndexSet.emptyinletrecvisit_stacksstacktree=matchtree.Redgraph.nextwith|[]->states:=IndexSet.add(Lr1.to_lr0g(List.hdstack))!states|xs->List.iter(fun(stack,_,tree)->visit_stacksstacktree)xsinletrecvisit_leaves(_,ker)=letnode=agr.nodes.:(ker)inletsucc=node.successorsinifArray.exists(fun(_,ker)->Andor.is_or_nodertableagr.nodes.:(ker))succthenArray.itervisit_leavessuccelseifAndor.is_or_nodertablenodethenletlr1=Andor.top_stategstacksrtablenodeinvisit_stacks[lr1]rcs.:(lr1).stacksinIndexSet.iterbeginfunker->ifAndor.is_or_nodertableagr.nodes.:(ker)thenvisit_leaves(IndexSet.empty,ker)enddgr.nodes.:(ix).ker;!statesend(** Create a dynamically growing array backed by a reference.
Returns the reference and a setter that automatically resizes the
array (doubling as needed) when an out-of-bounds index is set.
Values are accumulated as lists at each index. *)letdyn_array()=letr=ref(Array.make16[])inletsetindexcell=ifArray.length!r<=indexthen(letlength=ref(Array.length!r)inwhile!length<=indexdolength:=!length*2;done;letmaximals'=Array.make!length[]inArray.blit!r0maximals'0(Array.length!r);r:=maximals');(!r).(index)<-cell::(!r).(index)in(r,set)moduleEnum=struct(** Augments the deterministic graph with unaccepted lookahead tracking.
For each node, computes which terminal symbols have not been accepted
by any applicable reduction. Sink nodes with non-empty unaccepted sets
represent coverage gaps. Predecessor links enable path reconstruction. *)type('n,'term)_graph={domain:'ncardinal;predecessors:'nindex->('nindex*int*'termindexset)list;unaccepted:'nindex->'termindexset;}(** Build the enumeration graph and compute maximal sink groups.
Propagates unaccepted lookaheads from initial states through the
deterministic graph, subtracting each node's accept set along edges.
Returns the graph with predecessor/unaccepted accessors, plus an
array of sink groups indexed by stack prefix length. *)letprepare(typeglrcnm)(g:ggrammar)(agr:(g,lrc,n)Andor._graph)(dgr:(g,lrc,n,m)Deter._graph)(some_prefix:lrcindex->int*lrcindexlist):(m,gterminal)_graph*(mindexlist*gterminalindexset)listarray=letunaccepted=Vector.make_associatedgr.nodesIndexSet.emptyinletpredecessors=Vector.make_associatedgr.nodes[]inletcounter=ref0inlettodo=ref[]inletdelta=Vector.make_associatedgr.nodesIndexSet.emptyinletupdatesourcetargetset=letset=IndexSet.diffsetdgr.nodes.:(target).acceptinletset=IndexSet.diffsetunaccepted.:(target)inifIndexSet.is_not_emptysetthen(unaccepted.@(target)<-IndexSet.unionset;ifIndexSet.is_emptydelta.:(target)thenpushtodotarget;delta.@(target)<-IndexSet.unionset;matchsourcewith|None->()|Somesource->predecessors.@(target)<-List.cons(source,!counter,set))inletpropagateix=lettodo=delta.:(ix)indelta.:(ix)<-IndexSet.empty;letsource=SomeixinArray.iterbeginfun(accept,target)->lettodo=IndexSet.difftodoacceptinifIndexSet.is_not_emptytodothenupdatesourcetarget.Deter.indextodoenddgr.nodes.:(ix).successorsinletregular=Terminal.regularginIndexMap.iterbeginfun_(accept,target)->updateNonetarget(IndexSet.diffregularaccept)enddgr.initials;fixpoint~counter~propagatetodo;letpfx_min=refmax_intandpfx_max=ref0inVector.iterbeginfunnode->ifArray.lengthnode.Deter.successors=0then(letlength,_=some_prefix(Deter.get_lrcagrnode)inpfx_min:=Int.min!pfx_minlength;pfx_max:=Int.max!pfx_maxlength;)enddgr.nodes;letmaximals=Array.make(if!pfx_min=max_intthen0else!pfx_max-!pfx_min+1)[]inVector.iteribeginfunixnode->ifArray.lengthnode.Deter.successors=0then(letlength,_=some_prefix(Deter.get_lrcagrnode)inletindex=length-!pfx_mininletunaccepted=unaccepted.:(ix)inmaximals.(index)<-([ix],unaccepted)::maximals.(index))enddgr.nodes;letpredecessors=Vector.getpredecessorsinletunaccepted=Vector.getunacceptedin({domain=Vector.lengthdgr.nodes;predecessors;unaccepted},maximals)endmoduleCover=struct(** Computes coverage by constructing a synchronized product of the user's
error matching machine with the enumeration graph. States where the
machine cannot handle the unaccepted lookaheads of the enumeration
graph are identified as uncovered. *)type('g,'lrc,'enu)graph=Graph:{enum:('n,'gterminal)Enum._graph;position:'nindex->('lrc,'enu)Sum.nindex;unaccepted:'nindex->'gterminalindexset;sinks:'nindexlist;}->('g,'lrc,'enu)graph(** Compute the coverage product graph.
Each product state pairs a machine state with a position in the
enumeration graph (either an LRC state or an Enum node). Unaccepted
lookaheads are propagated through the product, filtered by the
machine's accepting transitions and branch lookahead sets. Sink
states with remaining unaccepted lookaheads are coverage gaps. *)letcoverage(typegrsttrlrcaoen)(g:ggrammar)(branches:(g,r)Spec.branches)(machine:(g,r,st,tr)Automata.Machine.t)(stacks:(g,lrc)Automata.stacks)(agr:(g,lrc,ao)Andor._graph)(dgr:(g,lrc,ao,en)Deter._graph)=letopenIndexBufferinletmoduleKer=Gen.Make()inletker=Ker.get_generator()inletunaccepted=Dyn.makeIndexSet.emptyinletdelta=Dyn.makeIndexSet.emptyinletpredecessors=Dyn.make[]inletsuccessors=Dyn.make[||]inlettable=Vector.make(Sum.cardinalstacks.domain(Vector.lengthdgr.nodes))IndexMap.emptyinletmaco=Opt.cardinal(Vector.lengthmachine.outgoing)inletgetmacpos=letmap=table.:(pos)inmatchIndexMap.find_optmacmapwith|Someindex->index|None->letindex=Gen.addker(Prod.injmacomacpos)intable.:(pos)<-IndexMap.addmacindexmap;indexinletfind_targettrslrc=letlr1=stacks.labellrcinletcheck_trtr=ifIndexSet.memlr1machine.label.:(tr).filterthenSometrelseNoneinmatchIndexSet.find_mapcheck_trtrswith|None->Opt.none|Sometr->Opt.somemachine.target.:(tr)inletuncovered=ref[]inletinitialize_lrcnodetrslrc=letlrcs=stacks.prevlrcinifIndexSet.is_emptytrs||IndexSet.is_emptylrcsthenpushuncoverednodeelseDyn.setsuccessorsnode@@IndexSet.map_to_arraylrcs(funlrc->IndexSet.empty,get(find_targettrslrc)(Sum.inj_llrc))inletinitializenode=letmac,pos=Prod.prjmaco(Gen.getkernode)inlettrs=matchOpt.prjmacwith|None->IndexSet.empty|Somemac->machine.outgoing.:(mac)inmatchSum.prjstacks.domainposwith|Llrc->initialize_lrcnodetrslrc|Renu->letenu=dgr.nodes.:(enu)inmatchenu.successorswith|[||]->initialize_lrcnodetrs(Deter.get_lrcagrenu)|succ->Dyn.setsuccessorsnode@@Array.mapbeginfun(accept,enu')->letmac=find_targettrs(Deter.get_lrcagrenu')inletpos=Sum.inj_rstacks.domainenu'.indexin(accept,getmacpos)endsuccinlettodo=ref[]inletupdatetargetset=letmac,pos=Prod.prjmaco(Gen.getkertarget)inletset=matchSum.prjstacks.domainposwith|L_->set|Ren->IndexSet.diffsetdgr.nodes.:(en).acceptinletset=matchOpt.prjmacwith|None->set|Somemac->List.fold_leftbeginfunla(br,_,_)->ifBoolvector.testbranches.is_partialbrthenlaelse(* FIXME: check for unreachable clauses *)matchbranches.lookaheads.:(br)with|None->IndexSet.empty|Somela'->IndexSet.difflala'endsetmachine.accepting.:(mac)inletset=IndexSet.diffset(Dyn.getunacceptedtarget)inletdelta0=Dyn.getdeltatargetinletset=IndexSet.diffsetdelta0inifIndexSet.is_not_emptysetthen(ifIndexSet.is_emptydelta0thenpushtodotarget;Dyn.setdeltatarget(IndexSet.unionsetdelta0));setinletcounter=ref0inletpropagations=ref0inletpropagatenode=incrpropagations;ifIndexSet.is_empty(Dyn.getunacceptednode)theninitializenode;lettodo=Dyn.getdeltanodeinDyn.setdeltanodeIndexSet.empty;Dyn.setunacceptednode(IndexSet.uniontodo(Dyn.getunacceptednode));Array.iterbeginfun(accept,target)->lettodo=IndexSet.difftodoacceptinifIndexSet.is_not_emptytodothenletdelta=updatetargettodoinifIndexSet.is_not_emptydeltathenDyn.setpredecessorstarget((node,!counter,delta)::Dyn.getpredecessorstarget)end(Dyn.getsuccessorsnode)inletregular=Terminal.regularginlettrs=Option.fold~none:IndexSet.empty~some:(Vector.getmachine.outgoing)machine.initialinIndexMap.iterbeginfunlrc(accept,enu)->letst=get(find_targettrslrc)(Sum.inj_rstacks.domainenu)inignore(updatest(IndexSet.diffregularaccept))enddgr.initials;fixpoint~counter~propagatetodo;stopwatch1"Cover (%d iterations, %d propagations, %d uncovered states)"!counter!propagations(List.length!uncovered);letenum={Enum.domain=Ker.n;predecessors=Dyn.getpredecessors;unaccepted=Dyn.getunaccepted;}inletpositionix=let_,pos=Prod.prjmaco(Gen.getkerix)inposinGraph{enum;position;sinks=!uncovered;unaccepted=Dyn.getunaccepted}endmoduleExtract=struct(** Extracts witness paths from coverage analysis graphs.
Propagates rejectable lookaheads backward from sink nodes through
predecessor links, identifying maximal nodes where user-reached
goals intersect with unaccepted lookaheads. *)(** Compute maximal prefixes: paths from sink nodes back to nodes where
goals have been reached. Rejectable lookaheads are propagated backward
through predecessors, intersected with each predecessor's unaccepted
set. Results are grouped by propagation depth. *)letcompute_maximal_prefixes(typengoalterm)~(graph:(n,term)Enum._graph)~(iter_sinks:(nindex->int->unit)->unit)~(reached:nindex->goalindexset)=letsink_min=refmax_intandsink_max=ref0initer_sinks(fun_cost->sink_min:=Int.mincost!sink_min;sink_max:=Int.maxcost!sink_max;);letsink_min=!sink_mininletsinks=Array.make(ifsink_min=max_intthen0else!sink_max-sink_min+1)[]initer_sinksbeginfunsinkcost->letindex=cost-sink_mininsinks.(index)<-([sink],graph.unacceptedsink)::sinks.(index)end;(* We have the sinks, propagate to get the maximal nodes *)letrejectable=Vector.makegraph.domainIndexSet.emptyinletcounter=ref0inlettodo=ref[]inletmaximals,add_maximal=dyn_array()inletfirst_counter=ref(-1)inletpropagate(path,reachable)=letnode=List.hdpathinletrejectable0=rejectable.:(node)inletrejectable'=IndexSet.unionreachablerejectable0inifrejectable'!=rejectable0then((* There are some lookaheads not yet known to be rejectable *)rejectable.:(node)<-rejectable';ifIndexSet.is_empty(reachednode)then((* It is an intermediate node *)List.iterbeginfun(node',_,_)->letreachable=IndexSet.interreachable(graph.unacceptednode')inifIndexSet.is_not_emptyreachablethenpushtodo(node'::path,reachable)end(graph.predecessorsnode))else((* It is a maximal node *)if!first_counter=-1thenfirst_counter:=!counter;add_maximal(!counter-!first_counter)(path,reachable)))inArray.iterbeginfunsinks->lettodo'=!todointodo:=[];List.iterpropagatesinks;List.iterpropagatetodo';incrcounter;endsinks;fixpoint~counter~propagatetodo;!maximals(** Compute global prefixes: propagate rejectable lookaheads from maximal
nodes further backward, collecting all nodes where goals were reached.
Unlike [compute_maximal_prefixes], which stops at the first goal node,
this continues propagation to find all goal intersections along each
path. Results are grouped by propagation depth. *)letcompute_global_prefixes(typengoalterm)~(graph:(n,term)Enum._graph)~(maximals:(nindexlist*termindexset)listarray)~(reached:nindex->goalindexset):(nindexlist*termindexset)listarray=letcounter=ref0inlettodo=ref[]inletglobals,add_global=dyn_array()inletfirst_counter=ref(-1)inletrejectable=Vector.makegraph.domainIndexSet.emptyinletpropagate(path,reachable)=letnode=List.hdpathinletrejectable0=rejectable.:(node)inletrejectable'=IndexSet.unionreachablerejectable0inifrejectable'!=rejectable0then((* There are some lookaheads not yet known to be rejectable *)rejectable.:(node)<-rejectable';ifIndexSet.is_not_empty(reachednode)then(if!first_counter=-1thenfirst_counter:=!counter;add_global(!counter-!first_counter)(path,reachable););(* It is an intermediate node *)List.iterbeginfun(node',_,_)->letreachable=IndexSet.interreachable(graph.unacceptednode')inifIndexSet.is_not_emptyreachablethenpushtodo(node'::path,reachable)end(graph.predecessorsnode))inArray.iterbeginfunmaximals->lettodo'=!todointodo:=[];List.iterpropagatemaximals;List.iterpropagatetodo';incrcounter;endmaximals;fixpoint~counter~propagatetodo;!globalsendmoduleReport=struct(** Formats and emits coverage results for user-facing output.
Provides local (per-state) and global (cross-state) reporting modes,
with sentence deduplication and cost-based ordering. *)(** Deduplicate and filter sentences for a single LR(0) state.
Sentences are sorted by cost, and lookaheads already covered by
earlier sentences are removed. Only sentences contributing new
uncovered lookaheads are retained. *)letcleanup_sentences(goal:'termindexset)lr0(sentences:('tactic*int*'termindexset)list)=letcovered=refIndexSet.emptyinsentences|>List.sort(fun(_t1,c1,_l1)(_t2,c2,_l2)->Int.comparec1c2)|>list_rev_filter_map(fun(t,c,l)->letl=IndexSet.intergoal(IndexSet.diffl!covered)incovered:=IndexSet.unionl!covered;ifIndexSet.is_emptylthenNoneelseSome(t,c,l,lr0))(** Emit local (per-state) coverage results.
For each LR(0) state with uncovered lookaheads, returns a sequence
of sentences sorted by cost. States with no remaining goals are
omitted. *)letemit_local(typelr0termtactic)(goals:(lr0,termindexset)vector)(sentences:lr0index->(tactic*int*termindexset)list):(lr0index*(tactic*termindexset)Seq.t)Seq.t=Vector.to_seqigoals|>Seq.filter_mapbeginfun(lr0,goal)->ifIndexSet.is_emptygoalthenNoneelsematchcleanup_sentencesgoallr0(sentenceslr0)with|[]->None|sentences->letsentences=List.to_seqsentences|>Seq.filter_mapbeginfun(t,_c,l,_)->letl=IndexSet.intergoals.:(lr0)linifIndexSet.is_emptylthenNoneelseSome(t,l)endinSome(lr0,sentences)end(** Emit global coverage results using a priority queue.
Sentences from all LR(0) states are merged into a single cost-ordered
sequence, so the most relevant counterexamples appear first regardless
of which state they belong to. *)letemit_global(typelr0termtactic)(goals:(lr0,termindexset)vector)(sentences:lr0index->(tactic*int*termindexset)list):(lr0index*tactic*termindexset)Seq.t=letmoduleH=Heap.Intinletadd_sentencesheap=function|[]->heap|(tactic,c,la,lr0)::sentences->H.insertc(tactic,la,lr0,sentences)heapinletall_sentences=Vector.fold_leftibeginfunheaplr0goal->ifIndexSet.is_emptygoalthenheapelseadd_sentencesheap(cleanup_sentencesgoallr0(sentenceslr0))endH.emptygoalsinletrecloopheap()=matchH.popheapwith|None->Seq.Nil|Some(_cost,(tactic,la,lr0,rest),heap)->letheap=add_sentencesheaprestinletla=IndexSet.intergoals.:(lr0)lainifIndexSet.is_emptylathenloopheap()elseSeq.Cons((lr0,tactic,la),loopheap)inloopall_sentences(** Emit both local and global coverage results.
Populates goal-to-sentence mappings from maximal and global prefix
arrays, then expands paths by following predecessor links and
committing coverage decisions (removing lookaheads already accounted
for by earlier sentences). Returns a pair of (local, global) result
sequences. *)letemit_all(typengoalterm)~goals:(goal_domain:goalcardinal)~(graph:(n,term)Enum._graph)~(maximals:(nindexlist*termindexset)listarray)~(globals:(nindexlist*termindexset)listarray)~(reached:nindex->goalindexset):(goalindex*(nindexlist*termindexset)Seq.t)Seq.t*(goalindex*(nindexlist*termindexset)Seq.t)Seq.t=letgoals=Vector.makegoal_domainIndexSet.emptyinletpopulateoffsetarr=letsentences=Vector.makegoal_domain[]inArray.iteribeginfunicandidates->List.iterbeginfun(path,la)->IndexSet.iterbeginfungoal->goals.@(goal)<-IndexSet.unionla;sentences.@(goal)<-List.cons(path,offset+i,la)end(reached(List.hdpath))endcandidatesendarr;sentencesinletmax_sentences=populate0maximalsinletglob_sentences=populate(Array.lengthmaximals)globalsinletreccommit_untilprefixla=function|[]->()|ixswhenixs==prefix->()|ix::ixs->IndexSet.iter(fung->goals.:(g)<-IndexSet.diffgoals.:(g)la)(reachedix);commit_untilprefixlaixsinletexpand_and_commit(prefix0,la)=commit_until[]laprefix0;letrecloop_successorprefixlaacc(node,goal)=letla=IndexSet.inter(graph.unacceptednode)lainloop_successors(node::prefix)laaccgoalandloop_successorsprefixlaaccgoal=matchgraph.predecessors(List.hdprefix)with|[]->commit_untilprefix0laprefix;(prefix,la)::acc|predecessors->letmatch_goal(next,_cost,goal')=letgoal=IndexSet.intergoal'goalinifIndexSet.is_emptygoalthenNoneelseSome(next,goal)inmatchList.filter_mapmatch_goalpredecessorswith|[]->(* FIXME: no predecessor has overlapping unaccepted lookaheads with goal — can this happen? *)assertfalse|[next]->loop_successorprefixlaaccnext|nexts->List.fold_left(loop_successorprefixla)accnextsinList.to_seq(loop_successorsprefix0la[]la)inletis_nonempty_seqseq=matchseq()with|Seq.Nil->None|Seq.Cons(x,xs)->Some(fun()->Seq.Cons(x,xs))inletlocals=emit_localgoals(Vector.getmax_sentences)|>Seq.filter_mapbeginfun(lr0,sentences)->matchis_nonempty_seq(Seq.concat_mapexpand_and_commitsentences)with|None->None|Somesentences->Some(lr0,sentences)endinletglobals=emit_globalgoals(Vector.getglob_sentences)|>Seq.filter_mapbeginfun(lr0,path,la)->matchis_nonempty_seq(expand_and_commit(path,la))with|None->None|Somesentences->Some(lr0,sentences)endin(locals,globals)end