Basic Lean parser infrastructure #
The Lean parser was developed with the following primary goals in mind:
- flexibility: Lean's grammar is complex and includes indentation and other whitespace sensitivity. It should be possible to introduce such custom "tweaks" locally without having to adjust the fundamental parsing approach.
- extensibility: Lean's grammar can be extended on the fly within a Lean file, and with Lean 4 we want to extend this to cover embedding domain-specific languages that may look nothing like Lean, down to using a separate set of tokens.
- losslessness: The parser should produce a concrete syntax tree that preserves all whitespace and other "sub-token" information for the use in tooling.
- performance: The overhead of the parser building blocks, and the overall parser performance on average-complexity input, should be comparable with that of the previous parser hand-written in C++. No fancy optimizations should be necessary for this.
Given these constraints, we decided to implement a combinatoric, non-monadic, lexer-less, memoizing recursive-descent
parser. Using combinators instead of some more formal and introspectible grammar representation ensures ultimate
flexibility as well as efficient extensibility: there is (almost) no pre-processing necessary when extending the grammar
with a new parser. However, because all the results the combinators produce are of the homogeneous Syntax
type, the
basic parser type is not actually a monad but a monomorphic linear function ParserState → ParserState
, avoiding
constructing and deconstructing countless monadic return values. Instead of explicitly returning syntax objects, parsers
push (zero or more of) them onto a syntax stack inside the linear state. Chaining parsers via >>
accumulates their
output on the stack. Combinators such as node
then pop off all syntax objects produced during their invocation and
wrap them in a single Syntax.node
object that is again pushed on this stack. Instead of calling node
directly, we
usually use the macro leading_parser p
, which unfolds to node k p
where the new syntax node kind k
is the name of the
declaration being defined.
The lack of a dedicated lexer ensures we can modify and replace the lexical grammar at any point, and simplifies
detecting and propagating whitespace. The parser still has a concept of "tokens", however, and caches the most recent
one for performance: when tokenFn
is called twice at the same position in the input, it will reuse the result of the
first call. tokenFn
recognizes some built-in variable-length tokens such as identifiers as well as any fixed token in
the ParserContext
's TokenTable
(a trie); however, the same cache field and strategy could be reused by custom token
parsers. Tokens also play a central role in the prattParser
combinator, which selects a leading parser followed by
zero or more trailing parsers based on the current token (via peekToken
); see the documentation of prattParser
for more details. Tokens are specified via the symbol
parser, or with symbolNoWs
for tokens that should not be preceded by whitespace.
The Parser
type is extended with additional metadata over the mere parsing function to propagate token information:
collectTokens
collects all tokens within a parser for registering. firstTokens
holds information about the "FIRST"
token set used to speed up parser selection in prattParser
. This approach of combining static and dynamic information
in the parser type is inspired by the paper "Deterministic, Error-Correcting Combinator Parsers" by Swierstra and Duponcheel.
If multiple parsers accept the same current token, prattParser
tries all of them using the backtracking longestMatchFn
combinator.
This is the only case where standard parsers might execute arbitrary backtracking. Repeated invocations of the same category or concrete
parser at the same position are cached where possible; see withCache
.
Finally, error reporting follows the standard combinatoric approach of collecting a single unexpected token/... and zero
or more expected tokens (see Error
below). Expected tokens are e.g. set by symbol
and merged by <|>
. Combinators
running multiple parsers should check if an error message is set in the parser state (hasError
) and act accordingly.
Error recovery is left to the designer of the specific language; for example, Lean's top-level parseCommand
loop skips
tokens until the next command keyword on error.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Instances For
Equations
- Lean.Parser.epsilonInfo = { collectTokens := id, collectKinds := id, firstTokens := Lean.Parser.FirstTokens.epsilon }
Instances For
Equations
- Lean.Parser.checkStackTopFn p msg x s = if p s.stxStack.back = true then s else s.mkUnexpectedError msg
Instances For
Equations
- Lean.Parser.checkStackTop p msg = { info := Lean.Parser.epsilonInfo, fn := Lean.Parser.checkStackTopFn p msg }
Instances For
Equations
- Lean.Parser.andthenFn p q c s = if (p c s).hasError = true then p c s else q c (p c s)
Instances For
Equations
- Lean.Parser.andthenInfo p q = { collectTokens := p.collectTokens ∘ q.collectTokens, collectKinds := p.collectKinds ∘ q.collectKinds, firstTokens := p.firstTokens.seq q.firstTokens }
Instances For
The andthen(p, q)
combinator, usually written as adjacency in syntax declarations (p q
),
parses p
followed by q
.
The arity of this parser is the sum of the arities of p
and q
:
that is, it accumulates all the nodes produced by p
followed by the nodes from q
into the list
of arguments to the surrounding parse node.
Equations
- Lean.Parser.andthen p q = { info := Lean.Parser.andthenInfo p.info q.info, fn := Lean.Parser.andthenFn p.fn q.fn }
Instances For
Equations
- Lean.Parser.instAndThenParser = { andThen := fun (a : Lean.Parser.Parser) (b : Unit → Lean.Parser.Parser) => Lean.Parser.andthen a (b ()) }
Equations
- Lean.Parser.nodeFn n p c s = (p c s).mkNode n s.stackSize
Instances For
Equations
- Lean.Parser.trailingNodeFn n p c s = (p c s).mkTrailingNode n s.stackSize
Instances For
Equations
- Lean.Parser.nodeInfo n p = { collectTokens := p.collectTokens, collectKinds := fun (s : Lean.Parser.SyntaxNodeKindSet) => (p.collectKinds s).insert n, firstTokens := p.firstTokens }
Instances For
Equations
- Lean.Parser.node n p = { info := Lean.Parser.nodeInfo n p.info, fn := Lean.Parser.nodeFn n p.fn }
Instances For
Equations
- Lean.Parser.errorFn msg x s = s.mkUnexpectedError msg
Instances For
Equations
- Lean.Parser.error msg = { info := Lean.Parser.epsilonInfo, fn := Lean.Parser.errorFn msg }
Instances For
Equations
- Lean.Parser.errorAtSavedPosFn msg delta c s = match c.savedPos? with | none => s | some pos => let pos := if delta = true then c.input.next pos else pos; s.mkUnexpectedErrorAt msg pos
Instances For
Generate an error at the position saved with the withPosition
combinator.
If delta == true
, then it reports at saved position+1.
This useful to make sure a parser consumed at least one character.
Equations
- Lean.Parser.errorAtSavedPos msg delta = { info := { collectTokens := id, collectKinds := id, firstTokens := Lean.Parser.FirstTokens.unknown }, fn := Lean.Parser.errorAtSavedPosFn msg delta }
Instances For
Succeeds if c.prec <= prec
Equations
- Lean.Parser.checkPrecFn prec c s = if c.prec ≤ prec then s else s.mkUnexpectedError "unexpected token at this precedence level; consider parenthesizing the term"
Instances For
Equations
- Lean.Parser.checkPrec prec = { info := Lean.Parser.epsilonInfo, fn := Lean.Parser.checkPrecFn prec }
Instances For
Succeeds if c.lhsPrec >= prec
Equations
- Lean.Parser.checkLhsPrecFn prec x s = if s.lhsPrec ≥ prec then s else s.mkUnexpectedError "unexpected token at this precedence level; consider parenthesizing the term"
Instances For
Equations
- Lean.Parser.checkLhsPrec prec = { info := Lean.Parser.epsilonInfo, fn := Lean.Parser.checkLhsPrecFn prec }
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Lean.Parser.setLhsPrec prec = { info := Lean.Parser.epsilonInfo, fn := Lean.Parser.setLhsPrecFn prec }
Instances For
Equations
Instances For
Equations
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Lean.Parser.leadingNode n prec p = HAndThen.hAndThen (Lean.Parser.checkPrec prec) fun (x : Unit) => HAndThen.hAndThen (Lean.Parser.node n p) fun (x : Unit) => Lean.Parser.setLhsPrec prec
Instances For
Equations
- Lean.Parser.trailingNodeAux n p = { info := Lean.Parser.nodeInfo n p.info, fn := Lean.Parser.trailingNodeFn n p.fn }
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
- Lean.Parser.mergeOrElseErrors s error1 iniPos mergeErrors = s
Instances For
- acceptLhs: Lean.Parser.OrElseOnAntiquotBehavior
- takeLongest: Lean.Parser.OrElseOnAntiquotBehavior
- merge: Lean.Parser.OrElseOnAntiquotBehavior
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Instances For
Equations
- Lean.Parser.orelseInfo p q = { collectTokens := p.collectTokens ∘ q.collectTokens, collectKinds := p.collectKinds ∘ q.collectKinds, firstTokens := p.firstTokens.merge q.firstTokens }
Instances For
Run p
, falling back to q
if p
failed without consuming any input.
NOTE: In order for the pretty printer to retrace an orelse
, p
must be a call to node
or some other parser
producing a single node kind. Nested orelse
calls are flattened for this, i.e. (node k1 p1 <|> node k2 p2) <|> ...
is fine as well.
Equations
- Lean.Parser.orelse p q = { info := Lean.Parser.orelseInfo p.info q.info, fn := Lean.Parser.orelseFn p.fn q.fn }
Instances For
Equations
- Lean.Parser.instOrElseParser = { orElse := fun (a : Lean.Parser.Parser) (b : Unit → Lean.Parser.Parser) => Lean.Parser.orelse a (b ()) }
Equations
- Lean.Parser.noFirstTokenInfo info = { collectTokens := info.collectTokens, collectKinds := info.collectKinds, firstTokens := Lean.Parser.FirstTokens.unknown }
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
The atomic(p)
parser parses p
, returns the same result as p
and fails iff p
fails,
but if p
fails after consuming some tokens atomic(p)
will fail without consuming tokens.
This is important for the p <|> q
combinator, because it is not backtracking, and will fail if
p
fails after consuming some tokens. To get backtracking behavior, use atomic(p) <|> q
instead.
This parser has the same arity as p
- it produces the same result as p
.
Instances For
Information about the state of the parse prior to the failing parser's execution
- initialPos : String.Pos
The position prior to the failing parser
- initialSize : Nat
The syntax stack height prior to the failing parser's execution
Instances For
Equations
Equations
- Lean.Parser.instReprRecoveryContext = { reprPrec := Lean.Parser.reprRecoveryContext✝ }
Recover from errors in p
using recover
to consume input until a known-good state has appeared.
If recover
fails itself, then no recovery is performed.
recover
is provided with information about the failing parser's effects , and it is run in the
state immediately after the failure.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Recover from errors in parser
using handler
to consume input until a known-good state has appeared.
If handler
fails itself, then no recovery is performed.
handler
is provided with information about the failing parser's effects , and it is run in the
state immediately after the failure.
The interactions between <|> and recover'
are subtle, especially for syntactic
categories that admit user extension. Consider avoiding it in these cases.
Equations
- Lean.Parser.recover' parser handler = { info := parser.info, fn := Lean.Parser.recoverFn parser.fn fun (s : Lean.Parser.RecoveryContext) => (handler s).fn }
Instances For
Recover from errors in parser
using handler
to consume input until a known-good state has appeared.
If handler
fails itself, then no recovery is performed.
handler
is run in the state immediately after the failure.
The interactions between <|> and recover
are subtle, especially for syntactic
categories that admit user extension. Consider avoiding it in these cases.
Equations
- Lean.Parser.recover parser handler = Lean.Parser.recover' parser fun (x : Lean.Parser.RecoveryContext) => handler
Instances For
Equations
- Lean.Parser.optionalFn p c s = (if ((p c s).hasError && (p c s).pos == s.pos) = true then (p c s).restore s.stackSize s.pos else p c s).mkNode Lean.nullKind s.stackSize
Instances For
Equations
- Lean.Parser.optionaInfo p = { collectTokens := p.collectTokens, collectKinds := p.collectKinds, firstTokens := p.firstTokens.toOptional }
Instances For
Equations
- Lean.Parser.optionalNoAntiquot p = { info := Lean.Parser.optionaInfo p.info, fn := Lean.Parser.optionalFn p.fn }
Instances For
Equations
- Lean.Parser.lookaheadFn p c s = if (p c s).hasError = true then p c s else (p c s).restore s.stackSize s.pos
Instances For
lookahead(p)
runs p
and fails if p
does, but it produces no parse nodes and rewinds the
position to the original state on success. So for example lookahead("=>")
will ensure that the
next token is "=>"
, without actually consuming this token.
This parser has arity 0 - it does not capture anything.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
notFollowedBy(p, "foo")
succeeds iff p
fails;
if p
succeeds then it fails with the message "unexpected foo"
.
This parser has arity 0 - it does not capture anything.
Equations
- Lean.Parser.notFollowedBy p msg = { info := { collectTokens := id, collectKinds := id, firstTokens := Lean.Parser.FirstTokens.unknown }, fn := Lean.Parser.notFollowedByFn p.fn msg }
Instances For
Equations
- Lean.Parser.manyFn p c s = (Lean.Parser.manyAux p c s).mkNode Lean.nullKind s.stackSize
Instances For
Equations
- Lean.Parser.manyNoAntiquot p = { info := Lean.Parser.noFirstTokenInfo p.info, fn := Lean.Parser.manyFn p.fn }
Instances For
Equations
- Lean.Parser.many1Fn p c s = (Lean.Parser.andthenFn p (Lean.Parser.manyAux p) c s).mkNode Lean.nullKind s.stackSize
Instances For
Instances For
Equations
- Lean.Parser.sepByFn allowTrailingSep p sep c s = Lean.Parser.sepByFnAux p sep allowTrailingSep s.stackSize true c s
Instances For
Equations
- Lean.Parser.sepBy1Fn allowTrailingSep p sep c s = Lean.Parser.sepByFnAux p sep allowTrailingSep s.stackSize false c s
Instances For
Equations
- Lean.Parser.sepByInfo p sep = { collectTokens := p.collectTokens ∘ sep.collectTokens, collectKinds := p.collectKinds ∘ sep.collectKinds, firstTokens := Lean.Parser.FirstTokens.unknown }
Instances For
Equations
- Lean.Parser.sepBy1Info p sep = { collectTokens := p.collectTokens ∘ sep.collectTokens, collectKinds := p.collectKinds ∘ sep.collectKinds, firstTokens := p.firstTokens }
Instances For
Equations
- Lean.Parser.sepByNoAntiquot p sep allowTrailingSep = { info := Lean.Parser.sepByInfo p.info sep.info, fn := Lean.Parser.sepByFn allowTrailingSep p.fn sep.fn }
Instances For
Equations
- Lean.Parser.sepBy1NoAntiquot p sep allowTrailingSep = { info := Lean.Parser.sepBy1Info p.info sep.info, fn := Lean.Parser.sepBy1Fn allowTrailingSep p.fn sep.fn }
Instances For
Apply f
to the syntax object produced by p
Equations
- Lean.Parser.withResultOfFn p f c s = if (p c s).hasError = true then p c s else let stx := (p c s).stxStack.back; (p c s).popSyntax.pushSyntax (f stx)
Instances For
Equations
- Lean.Parser.withResultOfInfo p = { collectTokens := p.collectTokens, collectKinds := p.collectKinds, firstTokens := Lean.Parser.FirstTokens.unknown }
Instances For
Equations
- Lean.Parser.withResultOf p f = { info := Lean.Parser.withResultOfInfo p.info, fn := Lean.Parser.withResultOfFn p.fn f }
Instances For
Equations
- Lean.Parser.many1Unbox p = Lean.Parser.withResultOf (Lean.Parser.many1NoAntiquot p) fun (stx : Lean.Syntax) => if (stx.getNumArgs == 1) = true then stx.getArg 0 else stx
Instances For
Equations
Instances For
Equations
- Lean.Parser.takeWhileFn p = Lean.Parser.takeUntilFn fun (c : Char) => !p c
Instances For
Equations
- Lean.Parser.takeWhile1Fn p errorMsg = Lean.Parser.andthenFn (Lean.Parser.satisfyFn p errorMsg) (Lean.Parser.takeWhileFn p)
Instances For
Equations
- Lean.Parser.finishCommentBlock.eoi pushMissingOnError s = s.mkUnexpectedError "unterminated comment" [] pushMissingOnError
Instances For
Equations
- Lean.Parser.mkEmptySubstringAt s p = { str := s, startPos := p, stopPos := p }
Instances For
Match an arbitrary Parser and return the consumed String in a Syntax.atom
.
Equations
- Lean.Parser.rawFn p trailingWs c s = if (p c s).hasError = true then p c s else Lean.Parser.rawAux s.pos trailingWs c (p c s)
Instances For
Equations
- Lean.Parser.chFn c trailingWs = Lean.Parser.rawFn (Lean.Parser.satisfyFn (fun (d : Char) => c == d) ("'" ++ toString c ++ "'")) trailingWs
Instances For
Equations
- Lean.Parser.rawCh c trailingWs = { info := { collectTokens := id, collectKinds := id, firstTokens := Lean.Parser.FirstTokens.unknown }, fn := Lean.Parser.chFn c trailingWs }
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Parses the whitespace after the \
when there is a string gap.
Raises an error if the whitespace does not contain exactly one newline character.
Parses a string quotation after a \
.
isQuotable
determines which characters are valid escapesinString
enables features that are only valid within strings, in particular"\" newline whitespace*
gaps.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Like quotedCharFn
but enables escapes that are only valid inside strings.
In particular, string gaps ("\" newline whitespace*
).
Equations
Instances For
Push (Syntax.node tk <new-atom>)
onto syntax stack if parse was successful.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Raw strings have the syntax r##...#"..."#...##
with zero or more #
's.
If we are looking at a valid start to a raw string (r##...#"
),
returns true.
We assume i
begins at the position immediately after r
.
Parses a raw string literal assuming isRawStrLitStart
has returned true.
The startPos
is the start of the raw string (at the r
).
The parser state is assumed to be immediately after the r
.
Equations
- Lean.Parser.rawStrLitFnAux startPos = Lean.Parser.rawStrLitFnAux.initState startPos 0
Instances For
Gives the "unterminated raw string literal" error.
Equations
- Lean.Parser.rawStrLitFnAux.errorUnterminated startPos s = s.mkUnexpectedErrorAt "unterminated raw string literal" startPos
Instances For
Parses the #
's and "
at the beginning of the raw string.
The num
variable counts the number of #
s after the r
.
Parses characters after the first "
. If we need to start counting #
's to decide if we are closing
the raw string literal, we switch to closingState
.
Parses #
characters immediately after a "
.
The closingNum
variable counts the number of #
s seen after the "
.
Note: num > 0
since the num = 0
case is entirely handled by normalState
.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Lean.Parser.binNumberFn startPos c s = Lean.Parser.mkNodeToken Lean.numLitKind startPos c (Lean.Parser.takeWhile1Fn (fun (c : Char) => c == '0' || c == '1') "binary number" c s)
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
- Lean.Parser.mkTokenAndFixPos startPos none c s = s.mkErrorAt "token" startPos
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Lean.Parser.identFnAux startPos tk r = Lean.Parser.identFnAux.parse startPos tk r
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Lean.Parser.peekToken c s = if (s.cache.tokenCache.startPos == s.pos) = true then (s, Except.ok s.cache.tokenCache.token) else Lean.Parser.peekTokenAux c s
Instances For
Treat keywords as identifiers.
Equations
- Lean.Parser.rawIdentFn c s = if c.input.atEnd s.pos = true then s.mkEOIError else Lean.Parser.identFnAux s.pos none Lean.Name.anonymous c s
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Lean.Parser.symbolFnAux sym errorMsg = Lean.Parser.satisfySymbolFn (fun (s : String) => s == sym) [errorMsg]
Instances For
Equations
- Lean.Parser.symbolInfo sym = { collectTokens := fun (tks : List Lean.Parser.Token) => sym :: tks, collectKinds := id, firstTokens := Lean.Parser.FirstTokens.tokens [sym] }
Instances For
Equations
- Lean.Parser.symbolFn sym = Lean.Parser.symbolFnAux sym ("'" ++ sym ++ "'")
Instances For
Equations
- Lean.Parser.symbolNoAntiquot sym = { info := Lean.Parser.symbolInfo sym.trim, fn := Lean.Parser.symbolFn sym.trim }
Instances For
Equations
- Lean.Parser.checkTailNoWs prev = match prev.getTailInfo with | Lean.SourceInfo.original leading pos trailing endPos => trailing.stopPos == trailing.startPos | x => false
Instances For
Check if the following token is the symbol or identifier sym
. Useful for
parsing local tokens that have not been added to the token table (but may have
been so by some unrelated code).
For example, the universe max
Function is parsed using this combinator so that
it can still be used as an identifier outside of universe (but registering it
as a token in a Term Syntax would not break the universe Parser).
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Lean.Parser.nonReservedSymbolFn sym = Lean.Parser.nonReservedSymbolFnAux sym ("'" ++ sym ++ "'")
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Lean.Parser.nonReservedSymbolNoAntiquot sym includeIdent = { info := Lean.Parser.nonReservedSymbolInfo sym.trim includeIdent, fn := Lean.Parser.nonReservedSymbolFn sym.trim }
Instances For
Equations
- Lean.Parser.strAux sym errorMsg j = Lean.Parser.strAux.parse sym errorMsg j
Instances For
Equations
- Lean.Parser.checkTailWs prev = match prev.getTailInfo with | Lean.SourceInfo.original leading pos trailing endPos => decide (trailing.stopPos > trailing.startPos) | x => false
Instances For
Equations
- Lean.Parser.checkWsBeforeFn errorMsg x s = if Lean.Parser.checkTailWs s.stxStack.back = true then s else s.mkError errorMsg
Instances For
The ws
parser requires that there is some whitespace at this location.
For example, the parser "foo" ws "+"
parses foo +
or foo/- -/+
but not foo+
.
This parser has arity 0 - it does not capture anything.
Equations
- Lean.Parser.checkWsBefore errorMsg = { info := Lean.Parser.epsilonInfo, fn := Lean.Parser.checkWsBeforeFn errorMsg }
Instances For
Equations
- Lean.Parser.checkTailLinebreak prev = match prev.getTailInfo with | Lean.SourceInfo.original leading pos trailing endPos => trailing.contains '\n' | x => false
Instances For
Equations
- Lean.Parser.checkLinebreakBeforeFn errorMsg x s = if Lean.Parser.checkTailLinebreak s.stxStack.back = true then s else s.mkError errorMsg
Instances For
The linebreak
parser requires that there is at least one line break at this location.
(The line break may be inside a comment.)
This parser has arity 0 - it does not capture anything.
Equations
- Lean.Parser.checkLinebreakBefore errorMsg = { info := Lean.Parser.epsilonInfo, fn := Lean.Parser.checkLinebreakBeforeFn errorMsg }
Instances For
Equations
- Lean.Parser.checkNoWsBeforeFn errorMsg x s = if Lean.Parser.checkTailNoWs (Lean.Parser.pickNonNone s.stxStack) = true then s else s.mkError errorMsg
Instances For
The noWs
parser requires that there is no whitespace between the preceding and following
parsers. For example, the parser "foo" noWs "+"
parses foo+
but not foo +
.
This is almost the same as "foo+"
, but using this parser will make foo+
a token, which may cause
problems for the use of "foo"
and "+"
as separate tokens in other parsers.
This parser has arity 0 - it does not capture anything.
Equations
- Lean.Parser.checkNoWsBefore errorMsg = { info := Lean.Parser.epsilonInfo, fn := Lean.Parser.checkNoWsBeforeFn errorMsg }
Instances For
Equations
- Lean.Parser.unicodeSymbolFnAux sym asciiSym expected = Lean.Parser.satisfySymbolFn (fun (s : String) => s == sym || s == asciiSym) expected
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Lean.Parser.unicodeSymbolFn sym asciiSym = Lean.Parser.unicodeSymbolFnAux sym asciiSym ["'" ++ sym ++ "', '" ++ asciiSym ++ "'"]
Instances For
Equations
- Lean.Parser.unicodeSymbolNoAntiquot sym asciiSym = { info := Lean.Parser.unicodeSymbolInfo sym.trim asciiSym.trim, fn := Lean.Parser.unicodeSymbolFn sym.trim asciiSym.trim }
Instances For
Equations
- Lean.Parser.mkAtomicInfo k = { collectTokens := id, collectKinds := id, firstTokens := Lean.Parser.FirstTokens.tokens [k] }
Instances For
Parses a token and asserts the result is of the given kind.
desc
is used in error messages as the token kind.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Instances For
Equations
- Lean.Parser.numLitNoAntiquot = { info := Lean.Parser.mkAtomicInfo "num", fn := Lean.Parser.numLitFn }
Instances For
Equations
- Lean.Parser.scientificLitFn = Lean.Parser.expectTokenFn Lean.scientificLitKind "scientific number"
Instances For
Equations
- Lean.Parser.scientificLitNoAntiquot = { info := Lean.Parser.mkAtomicInfo "scientific", fn := Lean.Parser.scientificLitFn }
Instances For
Equations
- Lean.Parser.strLitFn = Lean.Parser.expectTokenFn Lean.strLitKind "string literal"
Instances For
Equations
- Lean.Parser.strLitNoAntiquot = { info := Lean.Parser.mkAtomicInfo "str", fn := Lean.Parser.strLitFn }
Instances For
Equations
- Lean.Parser.charLitFn = Lean.Parser.expectTokenFn Lean.charLitKind "character literal"
Instances For
Equations
- Lean.Parser.charLitNoAntiquot = { info := Lean.Parser.mkAtomicInfo "char", fn := Lean.Parser.charLitFn }
Instances For
Equations
- Lean.Parser.nameLitFn = Lean.Parser.expectTokenFn Lean.nameLitKind "Name literal"
Instances For
Equations
- Lean.Parser.nameLitNoAntiquot = { info := Lean.Parser.mkAtomicInfo "name", fn := Lean.Parser.nameLitFn }
Instances For
Equations
- Lean.Parser.identFn = Lean.Parser.expectTokenFn Lean.identKind "identifier"
Instances For
Equations
- Lean.Parser.identNoAntiquot = { info := Lean.Parser.mkAtomicInfo "ident", fn := Lean.Parser.identFn }
Instances For
Equations
- Lean.Parser.rawIdentNoAntiquot = { info := { collectTokens := id, collectKinds := id, firstTokens := Lean.Parser.FirstTokens.unknown }, fn := Lean.Parser.rawIdentFn }
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Lean.Parser.identEq id = { info := Lean.Parser.mkAtomicInfo "ident", fn := Lean.Parser.identEqFn id }
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Instances For
Equations
- Lean.Parser.ParserState.keepTop s startStackSize = (s.shrink startStackSize).push s.back
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
- s.mergeErrors oldStackSize oldError = s
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- s.replaceLongest startStackSize = s.keepLatest startStackSize
Instances For
Equations
- Lean.Parser.invalidLongestMatchParser s = s.mkError "longestMatch parsers must generate exactly one Syntax node"
Instances For
Auxiliary function used to execute parsers provided to longestMatchFn
.
Push left?
into the stack if it is not none
, and execute p
.
Remark: p
must produce exactly one syntax node.
Remark: the left?
is not none when we are processing trailing parsers.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Lean.Parser.longestMatchMkResult startSize s = if s.stackSize > startSize + 1 then s.mkNode Lean.choiceKind startSize else s
Instances For
Equations
- Lean.Parser.longestMatchFnAux left? startSize startLhsPrec startPos prevPrio ps = Lean.Parser.longestMatchFnAux.parse left? startSize startLhsPrec startPos prevPrio ps
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
- Lean.Parser.longestMatchFn left? [] = fun (x : Lean.Parser.ParserContext) (s : Lean.Parser.ParserState) => s.mkError "longestMatch: empty list"
- Lean.Parser.longestMatchFn left? [p] = fun (c : Lean.Parser.ParserContext) (s : Lean.Parser.ParserState) => Lean.Parser.runLongestMatchParser left? s.lhsPrec p.fst.fn c s
Instances For
Equations
- Lean.Parser.anyOfFn [] x✝ x = x.mkError "anyOf: empty list"
- Lean.Parser.anyOfFn [p] x✝ x = p.fn x✝ x
- Lean.Parser.anyOfFn (p :: ps) x✝ x = Lean.Parser.orelseFn p.fn (Lean.Parser.anyOfFn ps) x✝ x
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
The colEq
parser ensures that the next token starts at exactly the column of the saved
position (see withPosition
). This can be used to do whitespace sensitive syntax like
a by
block or do
block, where all the lines have to line up.
This parser has arity 0 - it does not capture anything.
Equations
- Lean.Parser.checkColEq errorMsg = { info := { collectTokens := id, collectKinds := id, firstTokens := Lean.Parser.FirstTokens.unknown }, fn := Lean.Parser.checkColEqFn errorMsg }
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
The colGe
parser requires that the next token starts from at least the column of the saved
position (see withPosition
), but allows it to be more indented.
This can be used for whitespace sensitive syntax to ensure that a block does not go outside a
certain indentation scope. For example it is used in the lean grammar for else if
, to ensure
that the else
is not less indented than the if
it matches with.
This parser has arity 0 - it does not capture anything.
Equations
- Lean.Parser.checkColGe errorMsg = { info := { collectTokens := id, collectKinds := id, firstTokens := Lean.Parser.FirstTokens.unknown }, fn := Lean.Parser.checkColGeFn errorMsg }
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
The colGt
parser requires that the next token starts a strictly greater column than the saved
position (see withPosition
). This can be used for whitespace sensitive syntax for the arguments
to a tactic, to ensure that the following tactic is not interpreted as an argument.
example (x : False) : False := by
revert x
exact id
Here, the revert
tactic is followed by a list of colGt ident
, because otherwise it would
interpret exact
as an identifier and try to revert a variable named exact
.
This parser has arity 0 - it does not capture anything.
Equations
- Lean.Parser.checkColGt errorMsg = { info := { collectTokens := id, collectKinds := id, firstTokens := Lean.Parser.FirstTokens.unknown }, fn := Lean.Parser.checkColGtFn errorMsg }
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
The lineEq
parser requires that the current token is on the same line as the saved position
(see withPosition
). This can be used to ensure that composite tokens are not "broken up" across
different lines. For example, else if
is parsed using lineEq
to ensure that the two tokens
are on the same line.
This parser has arity 0 - it does not capture anything.
Equations
- Lean.Parser.checkLineEq errorMsg = { info := { collectTokens := id, collectKinds := id, firstTokens := Lean.Parser.FirstTokens.unknown }, fn := Lean.Parser.checkLineEqFn errorMsg }
Instances For
withPosition(p)
runs p
while setting the "saved position" to the current position.
This has no effect on its own, but various other parsers access this position to achieve some
composite effect:
colGt
,colGe
,colEq
compare the column of the saved position to the current position, used to implement Python-style indentation sensitive blockslineEq
ensures that the current position is still on the same line as the saved position, used to implement composite tokens
The saved position is only available in the read-only state, which is why this is a scoping parser:
after the withPosition(..)
block the saved position will be restored to its original value.
This parser has the same arity as p
- it just forwards the results of p
.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
withoutPosition(p)
runs p
without the saved position, meaning that position-checking
parsers like colGt
will have no effect. This is usually used by bracketing constructs like
(...)
so that the user can locally override whitespace sensitivity.
This parser has the same arity as p
- it just forwards the results of p
.
Equations
- One or more equations did not get rendered due to their size.
Instances For
withForbidden tk p
runs p
with tk
as a "forbidden token". This means that if the token
appears anywhere in p
(unless it is nested in withoutForbidden
), parsing will immediately
stop there, making tk
effectively a lowest-precedence operator. This is used for parsers like
for x in arr do ...
: arr
is parsed as withForbidden "do" term
because otherwise arr do ...
would be treated as an application.
This parser has the same arity as p
- it just forwards the results of p
.
Equations
- One or more equations did not get rendered due to their size.
Instances For
withoutForbidden(p)
runs p
disabling the "forbidden token" (see withForbidden
), if any.
This is usually used by bracketing constructs like (...)
because there is no parsing ambiguity
inside these nested constructs.
This parser has the same arity as p
- it just forwards the results of p
.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Lean.Parser.eoiFn c s = if c.input.atEnd s.pos = true then s else s.mkError "expected end of file"
Instances For
Equations
- Lean.Parser.eoi = { info := { collectTokens := id, collectKinds := id, firstTokens := Lean.Parser.FirstTokens.unknown }, fn := Lean.Parser.eoiFn }
Instances For
A multimap indexed by tokens. Used for indexing parsers by their leading token.
Equations
Instances For
Equations
- map.insert k v = match Lean.RBMap.find? map k with | none => Lean.RBMap.insert map k [v] | some vs => Lean.RBMap.insert map k (v :: vs)
Instances For
Equations
- Lean.Parser.TokenMap.instInhabited = { default := Lean.RBMap.empty }
Equations
- Lean.Parser.TokenMap.instEmptyCollection = { emptyCollection := Lean.RBMap.empty }
Equations
- Lean.Parser.TokenMap.instForInProdNameList = inferInstanceAs (ForIn m (Lean.RBMap Lake.Name (List α) Lean.Name.quickCmp) (Lake.Name × List α))
- leadingTable : Lean.Parser.TokenMap (Lean.Parser.Parser × Nat)
- leadingParsers : List (Lean.Parser.Parser × Nat)
- trailingTable : Lean.Parser.TokenMap (Lean.Parser.Parser × Nat)
- trailingParsers : List (Lean.Parser.Parser × Nat)
Instances For
Equations
- Lean.Parser.instInhabitedPrattParsingTables = { default := { leadingTable := ∅, leadingParsers := [], trailingTable := ∅, trailingParsers := [] } }
The type LeadingIdentBehavior
specifies how the parsing table
lookup function behaves for identifiers. The function prattParser
uses two tables leadingTable
and trailingTable
. They map tokens
to parsers.
We use LeadingIdentBehavior.symbol
and LeadingIdentBehavior.both
and nonReservedSymbol
parser to implement the tactic
parsers.
The idea is to avoid creating a reserved symbol for each
builtin tactic (e.g., apply
, assumption
, etc.). That is, users
may still use these symbols as identifiers (e.g., naming a
function).
- default: Lean.Parser.LeadingIdentBehavior
LeadingIdentBehavior.default
: if the leading token is an identifier, thenprattParser
just executes the parsers associated with the auxiliary token "ident". - symbol: Lean.Parser.LeadingIdentBehavior
LeadingIdentBehavior.symbol
: if the leading token is an identifier<foo>
, and there are parsersP
associated with the token<foo>
, then it executesP
. Otherwise, it executes only the parsers associated with the auxiliary token "ident". - both: Lean.Parser.LeadingIdentBehavior
LeadingIdentBehavior.both
: if the leading token an identifier<foo>
, the it executes the parsers associated with token<foo>
and parsers associated with the auxiliary token "ident".
Instances For
Each parser category is implemented using a Pratt's parser.
The system comes equipped with the following categories: level
, term
, tactic
, and command
.
Users and plugins may define extra categories.
The method
categoryParser `term prec
executes the Pratt's parser for category term
with precedence prec
.
That is, only parsers with precedence at least prec
are considered.
The method termParser prec
is equivalent to the method above.
- declName : Lake.Name
The name of a declaration which will be used as the target of go-to-definition queries and from which doc strings will be extracted. This is a dummy declaration of type
Lean.Parser.Category
created bydeclare_syntax_cat
, but for builtin categories the declaration is made manually and passed toregisterBuiltinParserAttribute
. - kinds : Lean.Parser.SyntaxNodeKindSet
The list of syntax nodes that can parse into this category. This can be used to list all syntaxes in the category.
- tables : Lean.Parser.PrattParsingTables
The parsing tables, which consist of a dynamic set of parser functions based on the syntaxes that have been declared so far.
- behavior : Lean.Parser.LeadingIdentBehavior
The
LeadingIdentBehavior
, which specifies how the parsing table lookup function behaves for the first identifier to be parsed. This is used by thetactic
parser to avoid creating a reserved symbol for each builtin tactic (e.g.,apply
,assumption
, etc.).
Instances For
Equations
- Lean.Parser.instInhabitedParserCategory = { default := { declName := default, kinds := default, tables := default, behavior := default } }
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Instances For
Equations
- Lean.Parser.categoryParserFn catName ctx s = Lean.Parser.categoryParserFnExtension.getState ctx.env catName ctx s
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Lean.Parser.termParser prec = Lean.Parser.categoryParser `term prec
Instances For
Antiquotations #
Fail if previous token is immediately followed by ':'.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Lean.Parser.setExpected expected = Lean.Parser.withFn (Lean.Parser.setExpectedFn expected)
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Lean.Parser.antiquotExpr = HOrElse.hOrElse Lean.Parser.identNoAntiquot fun (x : Unit) => HOrElse.hOrElse (Lean.Parser.symbolNoAntiquot "_") fun (x : Unit) => Lean.Parser.antiquotNestedExpr
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Instances For
Equations
- Lean.Parser.instCoeStringParser = { coe := Lean.Parser.symbol }
Equations
- Lean.Parser.nonReservedSymbol sym includeIdent = Lean.Parser.tokenWithAntiquot (Lean.Parser.nonReservedSymbolNoAntiquot sym includeIdent)
Instances For
Equations
- Lean.Parser.unicodeSymbol sym asciiSym = Lean.Parser.tokenWithAntiquot (Lean.Parser.unicodeSymbolNoAntiquot sym asciiSym)
Instances For
Define parser for $e
(if anonymous == true
) and $e:name
.
kind
is embedded in the antiquotation's kind, and checked at syntax match
unless isPseudoKind
is true.
Antiquotations can be escaped as in $$e
, which produces the syntax tree for $e
.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Optimized version of mkAntiquot ... <|> p
.
Equations
- Lean.Parser.withAntiquot antiquotP p = { info := Lean.Parser.orelseInfo antiquotP.info p.info, fn := Lean.Parser.withAntiquotFn antiquotP.fn p.fn }
Instances For
Equations
- Lean.Parser.withoutInfo p = { info := { collectTokens := id, collectKinds := id, firstTokens := Lean.Parser.FirstTokens.unknown }, fn := p.fn }
Instances For
Parse $[p]suffix
, e.g. $[p],*
.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Parse suffix
after an antiquotation, e.g. $x,*
, and put both into a new node.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Lean.Parser.nodeWithAntiquot name kind p anonymous = Lean.Parser.withAntiquot (Lean.Parser.mkAntiquot name kind anonymous) (Lean.Parser.node kind p)
Instances For
End of Antiquotations #
Equations
- Lean.Parser.sepByElemParser p sep = Lean.Parser.withAntiquotSpliceAndSuffix `sepBy p (Lean.Parser.symbol (sep.trim ++ "*"))
Instances For
Equations
- Lean.Parser.sepBy p sep psep allowTrailingSep = Lean.Parser.sepByNoAntiquot (Lean.Parser.sepByElemParser p sep) psep allowTrailingSep
Instances For
Equations
- Lean.Parser.sepBy1 p sep psep allowTrailingSep = Lean.Parser.sepBy1NoAntiquot (Lean.Parser.sepByElemParser p sep) psep allowTrailingSep
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Lean.Parser.leadingParser kind tables behavior antiquotParser = Lean.Parser.withAntiquotFn antiquotParser (Lean.Parser.leadingParserAux kind tables behavior) true
Instances For
Equations
- Lean.Parser.trailingLoopStep tables left ps c s = Lean.Parser.longestMatchFn (some left) (ps ++ tables.trailingParsers) c s
Instances For
Implements a variant of Pratt's algorithm. In Pratt's algorithms tokens have a right and left binding power.
In our implementation, parsers have precedence instead. This method selects a parser (or more, via
longestMatchFn
) from leadingTable
based on the current token. Note that the unindexed leadingParsers
parsers
are also tried. We have the unidexed leadingParsers
because some parsers do not have a "first token". Example:
syntax term:51 "≤" ident "<" term "|" term : index
Example, in principle, the set of first tokens for this parser is any token that can start a term, but this set
is always changing. Thus, this parsing rule is stored as an unindexed leading parser at leadingParsers
.
After processing the leading parser, we chain with parsers from trailingTable
/trailingParsers
that have precedence
at least c.prec
where c
is the ParsingContext
. Recall that c.prec
is set by categoryParser
.
Note that in the original Pratt's algorithm, precedences are only checked before calling trailing parsers. In our implementation, leading and trailing parsers check the precedence. We claim our algorithm is more flexible, modular and easier to understand.
antiquotParser
should be a mkAntiquot
parser (or always fail) and is tried before all other parsers.
It should not be added to the regular leading parsers because it would heavily
overlap with antiquotation parsers nested inside them.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Lean.Parser.fieldIdx = Lean.Parser.withAntiquot (Lean.Parser.mkAntiquot "fieldIdx" `fieldIdx) { info := Lean.Parser.mkAtomicInfo "fieldIdx", fn := Lean.Parser.fieldIdxFn }
Instances For
Equations
- Lean.Parser.skip = { info := Lean.Parser.epsilonInfo, fn := fun (x : Lean.Parser.ParserContext) (s : Lean.Parser.ParserState) => s }
Instances For
Equations
- s.foldArgsM f b = Array.foldlM (flip f) b s.getArgs
Instances For
Equations
- s.foldArgs f b = (s.foldArgsM f b).run
Instances For
Equations
- s.forArgsM f = s.foldArgsM (fun (s : Lean.Syntax) (x : Unit) => f s) ()