EBNF & Syntax Graphs

2020-07-23 |

EBNF

EBNF is used to make a formal description of a formal language such as a computer programming language. They are extensions of the basic Backus–Naur form (BNF) metasyntax notation. [wiki]

Optional Example

<func_decl> -> tDEF tID tLPAR [<id_list>] tRPAR
                                  |___optional
<if_stmnt>  ->
  • [] denotes in EBNF for optional parts in our productions.

List Example

Multiple identifier in id_list

<id_list> -> tID
           | tID, <id_list>

```xmlExample: -> tID { , tID} |_ zero or more occurence

* `{}` denoted in EBNF which used for list tyoes


### OR Expression
```xml
<for_stat> -> for tID := <expr> to <epxr>
            |     tID := <expr> down <epxr>
<for_stat> -> for tID := <expr> (to|down) <epxr>

Complete Example

Example: Turn this BNF notation to EBNF notation.

<expr> -> <expr> + <term>
        | <expr> - <term>
        | <term>

<term> -> <term> * <factor>
        | <term> / <factor>
        | <factor>

To achieve this first we need to derive expr notation. So we need to make observation from derivation.

<expr> =>* <term> + or - <term> ... 
       |__ derivation

<term> =>* <factor> * or / <factor> ...

Result is:

<expr> -> <term> {(+|-) <term>}  
<term> -> <factor> {(*|/) <factor>}  

Syntax Graphs

  • We need a graph for each non-terminals
  • Rectangle ==> terminal/token
  • Ellipse ==> non-terminals
<if_stat> -> if <bool> then <stats> {<elseif>} [else<stats>]
<elseif> -> elseif <bool> then <stats>

In this example we will examine <if_stat> and <elseif>

IF

Elseif

Parsing

  • Scanner = generate tokens
  • Parser = check if the program is correct with token sequence
  • Parse trees can be constructed with: bottom-up or top-dwon
int x; ==> tINT tID tSEMI; // We don't know if its correct 

Bottom-up Parsing

Now we will start to implement our parser.

Parser Generators

(yacc, bison,etc):

  • yacc = yet another compiler compiler
  • in this course we will use bison.
sudo apt-get install bison

Parser Generator

Postfix Language Example

Let's write a parser for postfix.

1 
1 2 +
1 2 + 3 *
  • 1 Non-Terminal
  • 3 Token

Bison File

postfix.y

  • Now we need to write our non-terminal in this notation.
  • We need to say yo bison tNUM, tADD is a token.
%token tNUM tADD tMULT
%%
postfix : tNUM
        | postfix postfix tADD
        | postfix postfix tMULT
;
%%

Also we need to create a flex file to scan our program.

%%
[0-9]+  return tNUM; // ==> these are numbers 
\+      return tADD;
\*      return tMULT;
%%
// We don't need to write anything in here.

When we call %token in bison it will generate a unique number for each tokens. So we can return these numbers from tokens.

%token tNUM tADD tMULT

We need to add this file to get assignment for each token.

%{
#include "postfix.tab.h"
}%
%left '+'   ==> + is left associative operator
%left '*'   ==> * is left associative operator

'+' has higher priority than '*' because it's on the top. if we want to say they're in the same priority level we can indicate like that:

%left '+','*'

Relative priority

Attribute Grammers

Attribute grammers ==> based on context free grammer

  • State additional grammers for context free grammers
  • Attribute Grammer = CFG + attributes (piece of informations) + semantics actions (actions of functions, how the values are produced) + predicates (boolean value functions)
  • Detect variable not declared yet
  • Example
func A

endfunc A

Attributes

  • Synthesized: Synthesized attributes of a symbol depend on the attributes of its children.
  • Inherited: Depends on the attributes of its parent and its siblings.

We will try to describe this rule but context free grammer is not enough.

Static Semantics

  • Related to syntax of program
  • Described by CFG

Dynamic Statics

  • Run-time
  • Described by natural language

CFG example:

<exp> -> <num> + <num>
<num> -> tNUM

This was the end of the blog post. You can reach me via email umusasadik at gmail com