There is no universal method of describing semantics.
Three methods: Operational, Axiomatic and Denotational.
Operational semantics
To use operational semantics to describe the semantics of a programming language requires the construction of two components.
transfer to convert statement to a close low-level language for a virtual machine.
the virtual machine itself.
e.g. Describing Pascal for...do loop
Pascal statement Operational semantics
for I := first to last do I := first
begin loop: if I > last goto out
. .
. .
end I := I + 1
goto loop
out: ...
Evaluation. It provides an effective means of describing semantics for language users and language implementers, as long as the descriptions are kept as simple and informal as possible.
Precondition A predicate, or an assertion, immediately before a statement describes the constraints on the program variable.
Postcondition An assertion immediately following a statement describe the new constraints on those variables.
e.g. if postcondition { sum > 11 } follows the statement sum := 2 * x + 1, then one possible precondition is { x > 10}. i.e. { x > 10 } sum := 2 * x + 1 { sum > 11 }
The weakest precondition is the least restriction that will guarantee the validity of the associated postcondition. For the above example, the preconditional {x > 10}, {x > 2000} and {x>15.5} are all valid, but weakest one should be {x > 5}.
e.g.
The postcondition of the statement a := b/2 -1 is {a < 10}. The weakest precondition is {b<22}. Thus {b < 22} a := b/2 -1 {a < 10}.
In general, {PxE} x := E {P} where x E means substituting E for every occurrence of x in the postcondition.
There is a wp transformer function used as follows
wp( x := E, P) = PxE
Sequence
If {P1} S1 {P2} and
{P2} S2 {P3}
we get {P1} S1, S2 {P3}. If
S1 is x1 := E1 and
S2 is x2 := E2
then we get
{P3x2E2} x2 := E2 {P3}
{(P3x2x2)x1E1} x1 := E1 {P3x2E2}
For while loop
while y <> x do y := Y + 1 { y = x}
For 0 iteration, the weakest precondition is { y = x }
For 1 iteration, wp(y := y+1, {y = x}) = {y=x-1}
For 2 iteration, wp(y := y+2, {y = x}) = {y=x-2}
For 3 iteration, wp(y := y+3, {y = x}) = {y=x-3}
If the postcondition of the loop is loop termination. The weakest precondition is
{y x }
Evaluation
A powerful tool for research into program correctness proofs.
No general methods of creating the predicate transformers function, thus the usefulness is limited.
Denotational Semantics
It defines both a mathematical object for each language entity and a function that maps instances of that entity onto instance of the mathematics object.
e.g. BNF of binary number
0
| 1
| 0
| 1
The semantics function N maps the abstract syntax to the objects in N is as follows:
N[[ 0 ]] = 0
N[[ 1 ]] = 1
N[[ 0 ]] = 2 * N[[ ]]
N[[ 1 ]] = 2 * N[[ ]] + 1
Evaluation.
In a similar but complex way objects and functions can be defined for the other syntactic entities of programming languages. This provides a framework for thinking in a highly rigorous way about programming, as well as a method of proving the correctness of programs.
It can be used as an aid to language design.
Attribute Grammars
An attribute grammar is a grammar with the following additions:
Associated with each grammar symbol X is a set of attributes A(X). The set consists of two disjoint sets, synthesized attributes and inherited attributes.
Associated with each grammar rule is a set of semantic functions and a possibly empty set of predicate functions over the attributes of the symbols in the grammar rule.
For a rule X0 X1...Xn,
The synthesized attributes of X0 are computed with a semantic function of the form
S(X0) = f(A(X1), ... , A(Xn))
meaning that their values depend only on the attribute values of their parent nodes.
The inherited attributes of Xj, 1 j n, are computed with a semantic function of the form
I(Xj) = f(A(X0))
meaning that their values depend only on the attribute values of their parent nodes.
Intrinsic Attributes. They are synthesized attributes of leaf nodes, where values are determined outside the parse tree.
e.g. The data type of a variable in program could come from a table a symbol table.
Example: An attribute grammar for simple assignment statement.
1. Syntax rule: :=
Semantic rule: .env .env
.env .env
.lhs_type .actual_type
.expected_type .lhs_type
2. Syntax rule: [2] + [3]
Semantic rule: [2].env .env
[3].env .env
.actual_type
if ([2].actual_type = int_type)
and ([3].actual_type = int_type)
then int_type
else real_type
end if
Predicate: .actual_type = .expected_type
3. Syntax rule:
Semantic rule: .actual_type .actual_type
.env .env
Predicate: .actual_type = .expected_type
4. Syntax rule: A | B | C
Semantic rule: .actual_type look-up (RHS, .env)
actual_type. It is associated with the terminals and . It is used to stores either int_type or real_type. In case of a variable, the actual type is intrinsic.
expected_type. i. An inherited attribute associated with the non-terminal . It is used to stores either int_type or real_type.
ii. It is determined by the type of the variable on the left side of the assignment statement.
lhs_type. A synthesized attribute associated with . It is used to move the value of the synthesized actual)type of the LHS of an assignment statement to the inherited attribute expected for the .
env. An inherited attribute associated with the non-terminals , and . It carries the reference to the correct symbol table entries to the instances of variables.
env
<assign>
lhs_type
expected_type env
actual_type
env actual_type env actual_type env
actual_type
A := A + B
Figure 11 The flow of attributes in the tree
Figure 12. A fully attributed env=table_1
parse tree
env=table_1
lhs_type = real_type
env=table_1
expected_type=real_type
actual_type = real_type
env=table_1 env=table_1
actual_type = env=table_1 actual_type =
real_type actual_type = int_type
real_type
A := A + B
Evaluation.
It provides a complete description of the syntax and static semantics of program language; they have been used as the formal definition of language that can be input to a compiler generation system.
Difficulties. Its size and complexity; a large parse tree which is costly to be evaluated.
Code Generation
The code specific to the target machine is generated.
As the code is machine code then it is usual for several machine code instructions to be generated for each high level language instruction.
ii. search for the symbol table to see the locations A, B and C.
iii. generate the necessary machine code.
It should be reminded that parse trees may often be built before this phase, they can be used in the generation.
Routines from the system library may often have to be called up, e.g. write procedure of Pascal.
Optimisation. Often the code produced by such methods is not the best that could be obtained. It is possible to make more efficient machine code by carrying out a process which is called optimisation.
Reverse Polish notation (Postfix)
** The reverse Polish notation is used to parse and represent arithmetic expressions in compiler.
Polish notation is also known as prefix notation because each operator precedes its operand.
A ‘Normal’ arithmetic expression is as follows
(3+5)x(9-7)
This is called infix notation because all the operators are inside the expression. The Polish notation of it will be as follows
x+35-97
The Polish notation has the advantage that there can be no ambiguity in the way that an arithmetic expression can be worked out. It also needs no parentheses to separate the different parts.
Another notation is the reverse Polish (or postfix) notation which is very similar in principle and also forms a parentheses-free notation. However, this time reverse Polish notation is particularly suited to computerised methods because of the ability to deal with such expression easily by using a stack.
The reverse Polish notation of the above expression is as follows
35+97-x
This leads to the following very simple rules for evaluating such expressions::
The next symbol encountered must be loaded on to the stack if it is an operand, i.e. a number or variable which is to be operated upon.
If the next symbol to be encountered is an operator, i.e. +, /, -, etc. then carry out the required operation on the top two items in the stack. The result of this operation must be left on the top of the stack.
To convert an infix string of arithmetic expression to postfix one, a stack and a table of order of precedence should be used.
Assume the following rules of precedence are used:
Operator
Precedence
() & ^
3
* & /
2
+ & -
1
=
0
The algorithm of the conversion is shown in the following flowchart: