A Rust macro for generating PEG parsers.
- Literals
  - strs: `"Hello"`
  - chars: `'c'`
  - char ranges: `'0'..='9'`
- Span: `$e`
- Atomic: `@e`
- Named: `name: e`
- Lookahead
  - Positive lookahead: `&e`
  - Negative lookahead: `!e`
- Sequence: `e e`
- Alternation: `e | e`
- Number
  - Many 0 (zero or more): `e*`
  - Many 1 (one or more): `e+`
  - Optional (zero or one): `e?`
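
To make the operator list concrete, here is a minimal hand-written sketch of what a few of these operators mean, assuming a matcher is simply a function from the input to the remaining input. The function names (`lit`, `range`, `positive_lookahead`, `negative_lookahead`, `many0`) are hypothetical and are not part of the macro; they only illustrate that lookahead consumes nothing while the other operators advance.

```rust
use std::ops::RangeInclusive;

/// Literal str, e.g. "Hello": consumes `s` if the input starts with it.
fn lit<'a>(input: &'a str, s: &str) -> Option<&'a str> {
    input.strip_prefix(s)
}

/// Char range, e.g. '0'..='9': consumes one character inside the range.
fn range(input: &str, r: RangeInclusive<char>) -> Option<&str> {
    let c = input.chars().next()?;
    if r.contains(&c) {
        Some(&input[c.len_utf8()..])
    } else {
        None
    }
}

/// Positive lookahead `&e`: succeeds when `e` matches, but consumes nothing.
fn positive_lookahead<'a>(
    input: &'a str,
    e: impl Fn(&'a str) -> Option<&'a str>,
) -> Option<&'a str> {
    e(input).map(|_| input)
}

/// Negative lookahead `!e`: succeeds when `e` fails, and consumes nothing.
fn negative_lookahead<'a>(
    input: &'a str,
    e: impl Fn(&'a str) -> Option<&'a str>,
) -> Option<&'a str> {
    match e(input) {
        Some(_) => None,
        None => Some(input),
    }
}

/// Many 0 `e*`: applies `e` until it fails; never fails itself.
fn many0<'a>(mut input: &'a str, e: impl Fn(&'a str) -> Option<&'a str>) -> &'a str {
    while let Some(rest) = e(input) {
        if rest.len() == input.len() {
            break; // guard against a matcher that consumes nothing
        }
        input = rest;
    }
    input
}
```

For example, `many0("123abc", |s| range(s, '0'..='9'))` leaves `"abc"`, while `positive_lookahead("abc", |s| lit(s, "a"))` succeeds and still leaves `"abc"`.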
any = // Matches any single character
digit = '0'..='9'
alpha = insensitive['a'..='z']
alnum = alpha | digit
comment = !any
space = (whitespace | comment)*
rule = name '=' alt
alt = cat ('|' cat)* action?
cat = (named | unary)+
named = name ':' unary
unary = prefix | postfix | atom
prefix_op
= '$' // Span
| '@' // Atomic
| '!' // Negative lookahead
| '&' // Positive lookahead
postfix_op
= '*' // Many0
| '+' // Many1
| '?' // Optional
prefix = prefix_op atom
postfix = atom postfix_op
atom = name | literal | parenthesized
parenthesized = '(' alt ')'
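
One possible in-memory shape for this grammar is sketched below. All type and variant names are hypothetical and chosen only to mirror the rule names above; the macro's real internal representation is not described here.

```rust
/// Prefix operators from `prefix_op`.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PrefixOp {
    Span,   // '$'
    Atomic, // '@'
    Not,    // '!' negative lookahead
    And,    // '&' positive lookahead
}

/// Postfix operators from `postfix_op`.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PostfixOp {
    Many0,    // '*'
    Many1,    // '+'
    Optional, // '?'
}

/// Illustrative expression tree for `alt`, `cat`, `named`, `unary` and `atom`.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Name(String),
    Literal(String),
    Prefix(PrefixOp, Box<Expr>),
    Postfix(Box<Expr>, PostfixOp),
    Named(String, Box<Expr>),
    Cat(Vec<Expr>),
    Alt(Vec<Expr>),
}

/// Maps a prefix operator character to its meaning.
fn prefix_op(c: char) -> Option<PrefixOp> {
    match c {
        '$' => Some(PrefixOp::Span),
        '@' => Some(PrefixOp::Atomic),
        '!' => Some(PrefixOp::Not),
        '&' => Some(PrefixOp::And),
        _ => None,
    }
}

/// Maps a postfix operator character to its meaning.
fn postfix_op(c: char) -> Option<PostfixOp> {
    match c {
        '*' => Some(PostfixOp::Many0),
        '+' => Some(PostfixOp::Many1),
        '?' => Some(PostfixOp::Optional),
        _ => None,
    }
}
```

With these types, `alnum = alpha | digit` would be represented as `Expr::Alt(vec![Expr::Name("alpha".into()), Expr::Name("digit".into())])`.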
For a left-recursive rule, try the rule repeatedly with the recursion depth limited to 1, 2, 3, and so on, until the parse fails. The latest cached result is then the answer.
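
The idea can be sketched by hand for the left-recursive rule `expr = expr '-' num | num`. This is a minimal illustration, not the macro's generated code: the names `expr`, `expr_body`, `num` and `Memo` are hypothetical, and the stopping test (stop once an attempt fails or no longer consumes more input than the cached one) is an assumption of this sketch.

```rust
use std::collections::HashMap;

/// Cache: start position -> latest result (value, end position) for `expr`.
type Memo = HashMap<usize, Option<(i64, usize)>>;

/// num = '0'..='9'
fn num(input: &[u8], pos: usize) -> Option<(i64, usize)> {
    let b = *input.get(pos)?;
    if b.is_ascii_digit() {
        Some(((b - b'0') as i64, pos + 1))
    } else {
        None
    }
}

/// One attempt at `expr = expr '-' num | num`.
/// The inner `expr` call is answered from the cache.
fn expr_body(input: &[u8], pos: usize, memo: &mut Memo) -> Option<(i64, usize)> {
    if let Some((lhs, p)) = expr(input, pos, memo) {
        if input.get(p) == Some(&b'-') {
            if let Some((rhs, end)) = num(input, p + 1) {
                return Some((lhs - rhs, end));
            }
        }
    }
    num(input, pos)
}

/// Tries the rule over and over, each time one recursion level deeper.
fn expr(input: &[u8], pos: usize, memo: &mut Memo) -> Option<(i64, usize)> {
    if let Some(&cached) = memo.get(&pos) {
        return cached; // recursive call: reuse the previous attempt's result
    }
    memo.insert(pos, None); // depth 0: the recursive call fails
    loop {
        let attempt = expr_body(input, pos, memo);
        let prev_end = memo[&pos].map(|(_, end)| end);
        match attempt {
            // The attempt got further than the cached one: cache it and retry.
            Some((_, end)) if prev_end.map_or(true, |p| end > p) => {
                memo.insert(pos, attempt);
            }
            // Otherwise the latest cached result is the answer.
            _ => return memo[&pos],
        }
    }
}
```

On the input `9-3-2` this yields the left-associative value `(9 - 3) - 2 = 4`. A real implementation would also have to deal with cached results of other rules that depend on the growing one, which this sketch leaves out.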
- Parse the syntax
- Replace all rule names with indices
- Detect overwriting of built-in rules
- Mark left-recursive rules
- Report refutable bindings
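
A simplified sketch of three of these steps follows. It assumes the syntax has already been parsed into a flat `RawRule` form in which every alternative is just a sequence of rule names; `RawRule`, `Rule`, `compile` and the `BUILTINS` list are all hypothetical. References to built-in rules, nullable prefixes and refutable-binding checks are left out to keep it short.

```rust
use std::collections::HashMap;

/// Hypothetical built-in rule names; the real set may differ.
const BUILTINS: &[&str] = &["any", "whitespace"];

/// A rule as parsed: alternatives are sequences of rule names.
struct RawRule<'a> {
    name: &'a str,
    alts: Vec<Vec<&'a str>>,
}

/// A resolved rule: names replaced with indices into the rule table.
#[derive(Debug, PartialEq)]
struct Rule {
    alts: Vec<Vec<usize>>,
    left_recursive: bool,
}

fn compile(raw: &[RawRule<'_>]) -> Result<Vec<Rule>, String> {
    // Detect overwriting of built-in rules.
    for r in raw {
        if BUILTINS.contains(&r.name) {
            return Err(format!("rule `{}` overwrites a built-in rule", r.name));
        }
    }

    // Replace all rule names with indices.
    let index: HashMap<&str, usize> =
        raw.iter().enumerate().map(|(i, r)| (r.name, i)).collect();
    let mut rules = Vec::new();
    for r in raw {
        let mut alts = Vec::new();
        for alt in &r.alts {
            let mut seq = Vec::new();
            for name in alt {
                let i = index
                    .get(name)
                    .ok_or_else(|| format!("unknown rule `{}`", name))?;
                seq.push(*i);
            }
            alts.push(seq);
        }
        rules.push(Rule { alts, left_recursive: false });
    }

    // Mark left-recursive rules: a rule is marked when it can reach itself
    // by following only the first symbol of each alternative.
    for start in 0..rules.len() {
        let mut seen = vec![false; rules.len()];
        let mut stack = vec![start];
        let mut found = false;
        while let Some(i) = stack.pop() {
            for alt in &rules[i].alts {
                if let Some(&first) = alt.first() {
                    if first == start {
                        found = true;
                    }
                    if !seen[first] {
                        seen[first] = true;
                        stack.push(first);
                    }
                }
            }
        }
        rules[start].left_recursive = found;
    }
    Ok(rules)
}
```

Working on indices rather than names keeps later passes free of string lookups, and the left-recursion flag tells the parser which rules need the repeated, depth-limited evaluation.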
For delimiters, we may want to select for
- 0 or more occurrences: `[]`
- 1 or more occurrences: `[1]`
- 1 or more occurrences, but only when there is a trailing comma: `[1,]`

When `x` has return type `X` and `y` has return type `Y`, `x ^ y` should have return type `(Vec<X>, Vec<Y>)`.

A tuple can be made as follows:
tuple = '(' item: item ',' items: item ^ ',' ')' {
    let mut items = items.0;
    items.insert(0, item);
    items
}
Alternatively (this isn't possible yet)
tuple = '(' items: item ^ ',' ')' {?
    // At least one comma
    match items.1.len() {
        0 => None,
        _ => Some(items.0),
    }
}
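
The intended behaviour of `x ^ y` and of the second `tuple` rule can be modelled in plain Rust. This is only a sketch under assumptions: `separated`, `digit`, `comma` and `tuple` are hypothetical helper names, items are single digits, and a trailing separator is accepted (the zero-or-more variant).

```rust
/// Models `x ^ y`: items `x` separated by `y`.
/// Returns every item and every separator that matched, plus the rest of the input.
fn separated<'a, X, Y>(
    mut input: &'a str,
    x: impl Fn(&'a str) -> Option<(X, &'a str)>,
    y: impl Fn(&'a str) -> Option<(Y, &'a str)>,
) -> ((Vec<X>, Vec<Y>), &'a str) {
    let (mut xs, mut ys) = (Vec::new(), Vec::new());
    while let Some((item, rest)) = x(input) {
        xs.push(item);
        input = rest;
        match y(input) {
            Some((sep, rest)) => {
                ys.push(sep);
                input = rest;
            }
            None => break,
        }
    }
    ((xs, ys), input)
}

/// item: a single decimal digit (an assumption of this sketch).
fn digit(input: &str) -> Option<(u32, &str)> {
    let c = input.chars().next()?;
    let d = c.to_digit(10)?;
    Some((d, &input[c.len_utf8()..]))
}

/// The ',' separator.
fn comma(input: &str) -> Option<(char, &str)> {
    input.strip_prefix(',').map(|rest| (',', rest))
}

/// tuple = '(' items: item ^ ',' ')' with "at least one comma" as the action.
fn tuple(input: &str) -> Option<Vec<u32>> {
    let input = input.strip_prefix('(')?;
    let (items, rest) = separated(input, digit, comma);
    rest.strip_prefix(')')?;
    match items.1.len() {
        0 => None,
        _ => Some(items.0),
    }
}
```

Returning the separators alongside the items is what lets the action distinguish `(1)`, which has no comma and is rejected, from `(1,)`, which is a one-element tuple.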