It determines whether the string of input tokens form valid sentences.
At this stage the structure of the source program is analysed to see if it conforms to the context-free grammar for the particular language is being compiled.
This stage includes
finding out if the number of brackets is correct. (stack may be used, why?)
determining the arithmetical operators used within an expression.
Complex forms may be broken down into simpler equivalents and more manageable form.
The primary formal methods of describing the syntax of programming languages are context-free grammars a formalism that is also known as Backus Naur form and syntax diagram.
The syntax of a program language the form of its expressions, statements and program units.
The semantic of a program language the meaning of those expression, statements and program units.
Backus-Naur Form
It was presented by Backus in 1959 and Naur in 1960.
The BNF is a metalanguage for program languages. A metalanguage is a language that is used to describe another language.
It was abstractions for syntactic structures. A Pascal assignment statement, for example, might be represented by the abstraction . The actual definition of may be given by
::= :=
The text to the right of ‘::=’ is the definition of the symbol on the left side. The definition is called a rule, or production.
BNF is a generative tool for defining language. The sentences of the language are generated through repeated application of the rules, and such generation is called a derivation.
The above small language has only one statement form, assignment, of which the right hand side allows either a single variable, or two variables and either a + or - operator. The only allowable variable names are A, B and C. Here is a sample program:
begin
A := B + C;
B := C
end
A derivation of this program in this language follows:
begin end
begin ; end
begin := + ; end
begin A := + ; end
begin A := B + ; end
begin A := B + C; end
begin A := B + C; end
begin A := B + C; := end
begin A := B + C; B := end
begin A := B + C; B := C end
Example 2. A grammar for simple assignment statements.
::= :=
::= A | B | C
::= +
| *
| ( )
|
<id>
:=
A
*
B
(
)
+
A
C
Figure 5. A parse tree for a simple assignment.
Remember one of the most attractive features of grammars is that they naturally describe the hierarchical syntactic structure of the sentences of the languages they defined. Such hierarchical structures are called parse tree.
Thus the statement:
A:= B * (A + C)
can be generated by the derivation and form the corresponding parse tree.
| Figure 6 Two distinct parse trees for the same grammar.
viii. Example 3 : An unambiguous grammar for expression ::= :=
::= A | B | C
::= +
|
::= *
|
::= ( )
| Figure 7 The unique parse tree using an
unambiguous grammar
:=
A
+
*
A
B
C
This grammar generates the same language as the BNF example 2, but it indicates the proper procedure order of multiply and add operators. A derivation of the sentence A := B + C * A will form a unique parse tree.
Assoicativity of Operators
The assignment A := B + C + A should form a parse tree as follows:
Thus B + C is calculated first rather than C + A. Such is called left associativity.
When a BNF rule has its LHS also appears the beginning of its RHS, the rule is said to be left recursive, which specifies left associativity.
<id>
:=
A
+
+
A
C
B
c) When a BNF rule has its LHS also appears the end of its RHS, the rule is said to be right recursive, which specifies right associativity.
d) Rules such as
**
|
( )
|
could be used to describe exponentiation as a right associative operator.