Follow Declaration

rascal-0.34.0

Synopsis

A conditional Symbol, constraining the characters that can immediately follow a symbol in the input source text.

Syntax

  • Symbol >> constraint
  • Symbol !>> constraint

where a constraint is any character class, a literal or a keyword non-terminal Symbol.

Description

Using !>>, the parser will not accept the Symbol if it is immediately followed by the constraint in the input string. Using >>, conversely, the Symbol is only accepted if it is immediately followed by the constraint. If the end of the symbol coincides with end-of-file, a !>> constraint always succeeds and the symbol is accepted.
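
As a rough illustration of the two forms listed under Syntax, the sketch below declares one lexical per kind of constraint. The module name and the names Assign and Label are hypothetical and chosen only for this sketch; the reading of >> as the positive counterpart of !>> is an assumption stated in the comments.

```rascal
module FollowForms
// FollowForms is a hypothetical module name used only for this sketch.

// Follow restriction with a character class constraint:
// an Id is rejected when the next character is another letter,
// which amounts to longest match.
lexical Id = [a-z]+ !>> [a-z];

// Follow restriction with a literal constraint:
// "=" is rejected when directly followed by another "=",
// so the first character of "==" is never read as a lone Assign.
lexical Assign = "=" !>> "=";

// Follow requirement (the >> form): assuming >> is the positive
// counterpart of !>>, a Label is only accepted when a colon comes next.
lexical Label = [a-z]+ >> [:];
```

The constraint only inspects the input to the right of the symbol; it does not consume it, so the colon after a Label still has to be matched by whatever symbol follows in the enclosing production.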

Examples

First match is often important in the context of nullables (such as empty lists). Consider the following:

syntax Program = "{" "statement"* "}";
layout Spaces = [\ ]*;

Even an input with a single space is ambiguous, because the space can go either before or after the empty list of statements:

rascal>import ParseTree;
ok
rascal>parse(#Program, "{ }");
|TODO:///|: Ambiguity(
|unknown:///|(0,3,<1,0>,<1,3>),
"Program",
"{ }")
ok

To fix this we add a follow restriction that will not accept an empty Spaces if we have a space to the right of it, and so it must "eat" the space:

syntax Program = "{" "statement"* "}";
layout Spaces = [\ ]* !>> [\ ];

rascal>import ParseTree;
ok
rascal>parse(#Program, "{ }");
Program: (Program) `{ }`

Another example is typical for optional whitespace:

rascal>lexical Id = [a-z]+;
ok
rascal>syntax Program = Id+;
ok
rascal>layout Spaces = [\ ]*;
ok

Here we see that even a two-character input such as ab is ambiguous: it could be one longer Id ab, or two shorter Ids a and b:

rascal>import ParseTree;
ok
rascal>parse(#Program, "ab");
|TODO:///|: Ambiguity(
|unknown:///|,
"Id+",
"ab")
ok

We solve this by declaring longest match for Id:

rascal>lexical Id = [a-z]+ !>> [a-z];
ok
rascal>syntax Program = Id+;
ok
rascal>layout Spaces = [\ ]*;
ok

And the ambiguity is gone:

rascal>import ParseTree;
ok
rascal>parse(#Program, "ab");
Program: (Program) `ab`

Benefits

  • Longest and first match are not implicit heuristics but declarative disambiguation filters

Pitfalls

  • Follow restrictions can filter all of the trees and leave you with a parse error
  • Follow restrictions bring us outside of the realm of context-free grammars. So when constructing new parse tree values by visiting and substitution (rewriting), it is possible to construct trees that are not strictly in the language. The type system of Rascal for substitution in parse trees only covers non-terminal substitutability and not the additional constraints of the disambiguation filters.
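
The first pitfall can be sketched with a restriction that is deliberately too broad. In this hypothetical module (the names FollowPitfall, Num, Pair and demo are assumptions made for the sketch), an Id may not be followed by a digit, yet the only production that uses Id needs a Num after it. For an input such as a1, every candidate tree should therefore be filtered and the parser is expected to report a parse error rather than an ambiguity.

```rascal
module FollowPitfall
// Hypothetical module: all names below are chosen for illustration only.

import ParseTree;
import Exception;
import IO;

// Deliberately too broad: the restriction also forbids a digit after an Id.
lexical Id = [a-z]+ !>> [a-z 0-9];
lexical Num = [0-9]+ !>> [0-9];

layout Spaces = [\ ]* !>> [\ ];

syntax Pair = Id Num;

void demo() {
  try {
    // "a" is directly followed by "1", which the restriction on Id forbids,
    // so no tree for Pair is expected to survive.
    parse(#Pair, "a1");
    println("parsed");
  }
  catch ParseError(loc l): {
    println("parse error at <l>");
  }
}
```

Narrowing the restriction on Id back to [a-z] should make a1 acceptable again, since the digit no longer blocks the Id. Separating the two tokens with a space, as in a 1, should also satisfy the broad restriction.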