Kernel.Transl
Translation from syntax to regular expressions
This module handles the translation of pattern syntax (from the grammar) into regular expression structures that can be processed by the compilation pipeline.
Main components:
- by_incoming_symbol: Maps each symbol to the set of LR states that have a transition on that symbol.
- prod_by_lhs: Maps each nonterminal to the set of productions that have it as their left-hand side (LHS).
- by_items: Maps each LR(0) item to the set of LR(1) states that recognize it.
- Glob patterns: a filter such as `foo _* bar` matches any sequence that starts with `foo` and ends with `bar`, with any symbols in between.
- struct_filter: Parses a glob pattern into components.
- normalize_filter: Normalizes the parsed components.
- extract: Given a right-hand side, finds the positions where the pattern matches.
- transl_filter: Translates filter patterns into sets of LR states that satisfy the filter.
- transl: The main translation function. It converts a regular expression from the syntax tree into an Expr.t regular expression structure.

Tricky implementation details:
- The match_skip and extract_skip functions use backtracking to find all matching positions.
- The compile_reduce_expr function uses the Redgraph.target_trie to find all states where a reduction can occur. It tracks both immediate reductions (which can happen now) and deferred reductions (which require following transitions first).
- The Usage system tracks which parser constructs are actually used, enabling dead-code analysis.

val string_of_goto :
'a Kernel__Info.grammar ->
'a Kernel__Info.goto_transition Fix.Indexing.index ->
string

val transl_filter :
'g Info.grammar ->
'g Indices.t ->
Stdlib.Lexing.position ->
lhs:Syntax.symbol option ->
rhs:(Syntax.filter_symbol * Stdlib.Lexing.position) list ->
'g Info.lr1 Utils.IndexSet.t

val compile_reduce_expr :
'g Info.grammar ->
'g Redgraph.graph ->
'g Redgraph.target_trie ->
'g Kernel__Regexp.Expr.t ->
'g Redgraph.target Utils.IndexSet.t * 'g Info.lr1 Utils.IndexSet.t

val transl :
'g Info.grammar ->
'g Redgraph.graph ->
'g Indices.t ->
'g Redgraph.target_trie ->
capture:
(Syntax.capture_kind ->
string ->
Kernel__Regexp.Capture.n Utils.IndexSet.element) ->
Syntax.regular_expr ->
Kernel__Regexp.Capture.n Utils.IndexSet.t * 'g Regexp.Expr.t
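
The `foo _* bar` filter and the backtracking attributed to match_skip and extract_skip can be illustrated with a minimal, self-contained sketch. It is not the module's implementation. The `atom` type and the `matches` and `prefix_ends` functions are hypothetical names introduced for illustration, and symbols are plain strings rather than indexed grammar symbols.

```ocaml
(* Hypothetical, simplified filter atoms. The real Syntax.filter_symbol
   type is not shown in this page and is likely richer. *)
type atom =
  | Sym of string (* a named symbol, e.g. foo *)
  | Any           (* _  : exactly one arbitrary symbol *)
  | Skip          (* _* : any number of symbols, possibly zero *)

(* [matches pattern rhs] is true when [pattern] matches the whole of [rhs].
   Skip is handled by backtracking: first try to consume nothing, then
   retry after consuming one more symbol. *)
let rec matches (pattern : atom list) (rhs : string list) : bool =
  match pattern, rhs with
  | [], [] -> true
  | [], _ :: _ -> false
  | Skip :: p', _ ->
    matches p' rhs
    || (match rhs with
        | [] -> false
        | _ :: rhs' -> matches pattern rhs')
  | Sym s :: p', x :: rhs' -> String.equal s x && matches p' rhs'
  | Any :: p', _ :: rhs' -> matches p' rhs'
  | (Sym _ | Any) :: _, [] -> false

(* [prefix_ends pattern rhs] lists every n such that [pattern] matches the
   first n symbols of [rhs], sorted and without duplicates. Every branch
   of the Skip choice is explored, so no match is missed. *)
let prefix_ends (pattern : atom list) (rhs : string list) : int list =
  let rec go pattern rhs pos acc =
    match pattern, rhs with
    | [], _ -> pos :: acc
    | Skip :: p', [] -> go p' [] pos acc
    | Skip :: p', _ :: rhs' ->
      let acc = go p' rhs pos acc in   (* Skip consumes nothing more *)
      go pattern rhs' (pos + 1) acc    (* Skip consumes one more symbol *)
    | Sym s :: p', x :: rhs' ->
      if String.equal s x then go p' rhs' (pos + 1) acc else acc
    | Any :: p', _ :: rhs' -> go p' rhs' (pos + 1) acc
    | (Sym _ | Any) :: _, [] -> acc
  in
  List.sort_uniq compare (go pattern rhs 0 [])
```

With the pattern `[Sym "foo"; Skip; Sym "bar"]`, `matches` accepts `["foo"; "bar"]` and `["foo"; "a"; "b"; "bar"]` but rejects `["foo"; "bar"; "baz"]`, and `prefix_ends` applied to `["foo"; "x"; "bar"; "y"; "bar"]` returns `[3; 5]`. Exploring both branches at each `Skip` keeps the sketch short, at the cost of exponential behaviour on patterns with many skips.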