ANTLR predicatesEdit
Redirected from Syntactic predicate
This page provides an overview of the different types of predicate available in ANTLR:
Syntactic predicates
- Specifies the syntactic validity of applying an alternative
 
- Indicate syntactic context that must be satisfied in order for an alternative to match
 
- Automatically look ahead at future input to help make decisions (arbitrary lookahead)
 
- Used to order a rule’s alternatives (specify precedence among rules that would otherwise be ambiguous); this would suggest that it makes no sense to apply syntactic predicates in a rule with only one alternative and experience seems to confirm this (see PEG-style predicates in ANTLR for example)
 
- First alternative that matches wins
 
- Put another way, syntactic predicates indicate where backtracking may occur
 
- Take the form: 
(...)=> where ... is a grammar fragment 
- May only appear on the extreme left edge of an alternative
 
- In a rule with many alternatives, the last rule does not need an explicit predicate (it is assumed to be the default if nothing else matches)
 
- Similarly, in a rule with many alternatives, if some of the rules are not mutually ambiguous then they don’t need explicit predicates either
 
- The 
backtrack = true option may be used to have ANTLR automatically insert syntactic predicates on alternatives that do not already have a user-specified predicate 
- The overhead of backtracking may be amortized by turning on memoizing, preferably on a per-rule basis (the 
memoize = true option) 
- Actions are not executed during backtracking
 
- Under the covers use a "pushdown machine" rather than a DFA (as in normal LL(*) parsing), so are more powerful recognizers (they can recognize context-free structures, unlike DFAs which cannot recognize recursive structures and are equivalent in power to regular expressions)
 
- Explicitly specified syntactic predicates are implemented in ANTLR using gated semantic predicates behind the scenes
 
- Implicitly added syntactic predicates (added for auto-backtracking) are implemented using normal disambiguating semantic predicates and are therefore only evaluated when LL(*) fails to predict an alternative
 
- Documented in Chapter 14 of the ANTLR book (starting on page 331)
 
Example
(...)=> where ... is a grammar fragmentbacktrack = true option may be used to have ANTLR automatically insert syntactic predicates on alternatives that do not already have a user-specified predicatememoize = true option)Here is an example lexer rule that uses a syntactic predicate to order two ambiguous alternatives:
CRLF
  : ('\r'? '\n')=> '\r'? '\n'
  | '\r'
  ;
This will match \r\n, \r or \n, even in the same file (even if a file has mixed line endings). Without the syntactic predicate ANTLR doesn’t know what to do when faced with input such as \r\n (is it two CRLF tokens or one?); with the predicate ANTLR doesn’t issue an ambiguity warning and always chooses the longest match (\r\n) if possible.
Semantic predicates
- Specifies the semantic validity of applying an alternative
 
- Are boolean expressions in the target language that are evaluated at runtime
 
- Used to guide recognition
 
- Enforce non-syntactic rules (rules for which mere syntax is not enough to describe a valid sentence)
 
- Help to recognize context-sensitive language constructs
 
- Are "hoisted" up into other rules when necessary to inform the decision making process
 
- Must be free of side effects
- Must be possible to repeatedly evaluate them and get the same result
 
- Evaluation order must not matter
 
 
- Should not reference local variables or parameters (because this would break hoisting)
 
- Documented in Chapter 13 of the ANTLR book (starting on page 317)
 
Validating semantic predicates
- Must be possible to repeatedly evaluate them and get the same result
 - Evaluation order must not matter
 
- Look like normal actions but are followed immediately by a question mark 
{..}? ...is a boolean expression written in the target language; if it evaluates to false then aFailedPredicateExceptionis thrown
Gated semantic predicates
- Precede alternatives and are written as 
{...}?=> where ... is a boolean expression written in the target language 
- When evaluating to false, effectively disable the alternative, making it invisible to the recognizer (no exception is thrown)
 
- Put another way, they dynamically "turn on" or "turn off" portions of a grammar
 
- Always hoisted, even when decisions are deterministic
 
- Useful when you want to distinguish between alternatives that are not syntactically ambiguous
 
- May only appear in rules that actually have multiple alternatives
 
- May be used in lexer and parser rules
 
Disambiguating semantic predicates
- Precede alternatives and are written as 
{...}? 
- Used in making prediction decisions
 
- Are only used (evaluated) when syntax alone is insufficient to distinguish between alternatives
 
- Put another way, they disambiguate syntactically identical alternatives
 
- Unlike gated semantic predicates may be used in rules which have only one alternative
 
- Are hoisted into rules higher up in the decision chain when LL(*) lookahead alone is not sufficient to distinguish between alternatives
 
- Unlike gated semantic predicates, are not hoisted for deterministic decisions
 
- Only predicates reachable from the left edge without consuming an input symbol are hoisted
 
- To fully resolve a non-determinism, all alternatives must be covered by a disambiguating predicate
 
- As a special case, if the last (and only the last) conflicting alternative is not covered then ANLTR implicitly covers it as a "default"
 
- Predicated alternatives specified earlier have precedence over those specified later; that is, the semantic predicates are evaluated in the order specified in the alternatives in the grammar
 
- Multiple predicates may be specified for a single alternative:
 
rule
  : {pred1}? {pred2}? TOK // both pred1 and pred2 must evaluate to true
  | {pred3}? TOK
  ;
See also
{...}?=> where ... is a boolean expression written in the target language- Precede alternatives and are written as 
{...}? - Used in making prediction decisions
 - Are only used (evaluated) when syntax alone is insufficient to distinguish between alternatives
 - Put another way, they disambiguate syntactically identical alternatives
 - Unlike gated semantic predicates may be used in rules which have only one alternative
 - Are hoisted into rules higher up in the decision chain when LL(*) lookahead alone is not sufficient to distinguish between alternatives
 - Unlike gated semantic predicates, are not hoisted for deterministic decisions
 - Only predicates reachable from the left edge without consuming an input symbol are hoisted
 - To fully resolve a non-determinism, all alternatives must be covered by a disambiguating predicate
 - As a special case, if the last (and only the last) conflicting alternative is not covered then ANLTR implicitly covers it as a "default"
 - Predicated alternatives specified earlier have precedence over those specified later; that is, the semantic predicates are evaluated in the order specified in the alternatives in the grammar
 - Multiple predicates may be specified for a single alternative:
 
rule
  : {pred1}? {pred2}? TOK // both pred1 and pred2 must evaluate to true
  | {pred3}? TOK
  ;