The Sireum Parser Generator (PG) for Slang is an LL(k), PEG/Packrat (without syntactic predicates), or “mixed” (LL k-lookahead with backtracking) parser generator. PG generates a readable and easy-to-debug parser/lexer [1] over a Unicode codepoint stream in Slang, which can be compiled further to native code (via Graal) or JavaScript (via Scala.js). PG’s input grammar is a small subset of ANTLR3’s grammar (e.g., without syntactic predicates and semantic actions); thus, AntlrWorks can be used as its grammar IDE to ease development.

As an example, PG uses its own generated parser/lexer to parse its own input grammar. PG’s generated parsers build general parse trees that come with: (1) a generic tree rewriting algorithm for binary operators with runtime-configurable precedence/associativity rules after parsing [2], and (2) GraphViz’s dot input generation to help visualize parse trees (note that AntlrWorks’s grammar interpreter/debugger also provides an excellent parse tree visualization tool).
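To make the two parse tree facilities above more concrete, here is a minimal sketch in Python. It is not PG's implementation: it uses precedence climbing, a standard technique, to rebuild a flat operand/operator sequence into a binary tree from an operator table supplied at runtime, and then emits GraphViz dot text for the result. The names Binary, rewrite, show, and to_dot are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass(frozen=True)
class Binary:
    op: str
    left: "Tree"
    right: "Tree"


Tree = Union[str, Binary]  # a leaf is simply its token text

# Hypothetical runtime table: operator -> (precedence, is_right_associative)
OpTable = Dict[str, Tuple[int, bool]]


def rewrite(flat: List[str], ops: OpTable) -> Tree:
    """Rebuild a flat [operand, op, operand, ...] list into a binary tree."""
    pos = 0

    def climb(min_prec: int) -> Tree:
        nonlocal pos
        left: Tree = flat[pos]
        pos += 1
        while pos < len(flat) and ops[flat[pos]][0] >= min_prec:
            op = flat[pos]
            prec, right_assoc = ops[op]
            pos += 1
            # A left-associative operator only accepts strictly
            # tighter-binding operators on its right-hand side.
            right = climb(prec if right_assoc else prec + 1)
            left = Binary(op, left, right)
        return left

    return climb(0)


def show(tree: Tree) -> str:
    """Render a tree fully parenthesized so its grouping is visible."""
    if isinstance(tree, Binary):
        return f"({show(tree.left)} {tree.op} {show(tree.right)})"
    return tree


def to_dot(tree: Tree) -> str:
    """Emit GraphViz dot text (labels are assumed to contain no quotes)."""
    lines = ["digraph ParseTree {"]
    counter = 0

    def visit(node: Tree) -> int:
        nonlocal counter
        node_id = counter
        counter += 1
        label = node.op if isinstance(node, Binary) else node
        lines.append(f'  n{node_id} [label="{label}"];')
        if isinstance(node, Binary):
            for child in (node.left, node.right):
                child_id = visit(child)
                lines.append(f"  n{node_id} -> n{child_id};")
        return node_id

    visit(tree)
    lines.append("}")
    return "\n".join(lines)
```

Because the operator table is an ordinary value, the same flat sequence can be regrouped under different rules without touching the grammar: with "*" above "+", show(rewrite(["a", "+", "b", "*", "c"], ops)) yields "(a + (b * c))", and swapping the two precedences yields "((a + b) * c)".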

Running PG

Below are the command lines for generating a Slang parser/lexer using PG, first for macOS/Linux and then for Windows (use --help to see more PG options):

$SIREUM_HOME/bin/sireum parser gen -l <license.txt> -p <package-name> -m slang -n <generated-class-name> -o <output-dir> <grammar.g> 
%SIREUM_HOME%\bin\sireum parser gen -l <license.txt> -p <package-name> -m slang -n <generated-class-name> -o <output-dir> <grammar.g>

For example, below is the command line to regenerate PG’s parser/lexer using PG itself in the Sireum source distribution:

$SIREUM_HOME/bin/sireum parser gen -l $SIREUM_HOME/license.txt -p org.sireum.parser -m slang -n SireumGrammar --no-backtracking -o $SIREUM_HOME/parser/shared/src/main/scala/org/sireum/parser $SIREUM_HOME/parser/jvm/src/main/resources/SireumAntlr3.g 
%SIREUM_HOME%\bin\sireum parser gen -l %SIREUM_HOME%\license.txt -p org.sireum.parser -m slang -n SireumGrammar --no-backtracking -o %SIREUM_HOME%\parser\shared\src\main\scala\org\sireum\parser %SIREUM_HOME%\parser\jvm\src\main\resources\SireumAntlr3.g
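When PG is invoked from a build script, it can help to assemble the command line programmatically instead of maintaining two shell variants. The following is a small sketch under that assumption: pg_command is a hypothetical helper, not part of Sireum, and it uses only the options that appear in the command lines above.

```python
import ntpath
import posixpath
from typing import List


def pg_command(sireum_home: str, license_file: str, package: str,
               class_name: str, output_dir: str, grammar: str,
               no_backtracking: bool = False,
               windows: bool = False) -> List[str]:
    """Build the argument list for `sireum parser gen` (hypothetical helper)."""
    # Use the path flavor of the target platform, not of the host.
    path = ntpath if windows else posixpath
    cmd = [
        path.join(sireum_home, "bin", "sireum"), "parser", "gen",
        "-l", license_file,
        "-p", package,
        "-m", "slang",
        "-n", class_name,
    ]
    if no_backtracking:
        cmd.append("--no-backtracking")
    cmd += ["-o", output_dir, grammar]
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True) from Python's standard library; the helper itself only builds the arguments and does not run anything.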


AntlrWorks is ANTLR3’s recommended IDE. The Sireum AntlrWorks fork can be installed and launched as follows (on macOS, Linux, and Windows, respectively):

${SIREUM_HOME}/bin/mac/java/bin/java -jar ${SIREUM_HOME}/bin/antlrworks.jar 
${SIREUM_HOME}/bin/linux/java/bin/java -jar ${SIREUM_HOME}/bin/antlrworks.jar 
%SIREUM_HOME%\bin\win\java\bin\java -jar %SIREUM_HOME%\bin\antlrworks.jar 
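The three launch commands differ only in the platform directory under bin, so a launcher script could select it automatically. Here is a brief sketch assuming the directory layout shown in the commands above; antlrworks_command and its platform mapping are hypothetical and not part of Sireum.

```python
import ntpath
import posixpath
import sys
from typing import List

# Maps Python's sys.platform values to the directory names used above.
_PLATFORM_DIRS = {"darwin": "mac", "linux": "linux", "win32": "win"}


def antlrworks_command(sireum_home: str,
                       platform: str = sys.platform) -> List[str]:
    """Build the AntlrWorks launch command for the given platform."""
    if platform not in _PLATFORM_DIRS:
        raise ValueError(f"unsupported platform: {platform}")
    path = ntpath if platform == "win32" else posixpath
    bin_dir = path.join(sireum_home, "bin")
    java = path.join(bin_dir, _PLATFORM_DIRS[platform], "java", "bin", "java")
    return [java, "-jar", path.join(bin_dir, "antlrworks.jar")]
```

As with any argument list, the result can be handed to subprocess.run to start the IDE.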

  1. The generated parser/lexer code is essentially a set of finite state machine (FSM) encodings of the input grammar rules, with some “cookie crumbs” for readability’s sake (e.g., token string comments on FSM transitions).

  2. Useful, for example, to avoid grammar refactoring to enforce precedence/associativity rules, thus resulting in a more readable grammar, or to dynamically adapt precedence/associativity rules depending on a certain context.