There is no universal method of describing semantics.

Three methods: Operational, Axiomatic and Denotational.

Operational semantics

To use operational semantics to describe the semantics of a programming language requires the construction of two components.

transfer to convert statement to a close low-level language for a virtual machine.

the virtual machine itself.

e.g. Describing Pascal for...do loop

Pascal statement Operational semantics

for I := first to last do I := first

begin loop: if I > last goto out

. .

. .

end I := I + 1

goto loop

out: ...

Evaluation. It provides an effective means of describing semantics for language users and language implementers, as long as the descriptions are kept as simple and informal as possible.

Precondition A predicate, or an assertion, immediately before a statement describes the constraints on the program variable.

Postcondition An assertion immediately following a statement describe the new constraints on those variables.

e.g. if postcondition { sum > 11 } follows the statement sum := 2 * x + 1, then one possible precondition is { x > 10}. i.e. { x > 10 } sum := 2 * x + 1 { sum > 11 }

The weakest precondition is the least restriction that will guarantee the validity of the associated postcondition. For the above example, the preconditional {x > 10}, {x > 2000} and {x>15.5} are all valid, but weakest one should be {x > 5}.

e.g.

The postcondition of the statement a := b/2 -1 is {a < 10}. The weakest precondition is {b<22}. Thus {b < 22} a := b/2 -1 {a < 10}.

In general, {P_{x}_{}_{E}} x := E {P} where x E means substituting E for every occurrence of x in the postcondition.

There is a wp transformer function used as follows

For 0 iteration, the weakest precondition is { y = x }

For 1 iteration, wp(y := y+1, {y = x}) = {y=x-1}

For 2 iteration, wp(y := y+2, {y = x}) = {y=x-2}

For 3 iteration, wp(y := y+3, {y = x}) = {y=x-3}

If the postcondition of the loop is loop termination. The weakest precondition is

{y x }

Evaluation

A powerful tool for research into program correctness proofs.

No general methods of creating the predicate transformers function, thus the usefulness is limited.

Denotational Semantics

It defines both a mathematical object for each language entity and a function that maps instances of that entity onto instance of the mathematics object.

e.g. BNF of binary number

0

| 1

| 0

| 1

The semantics function N maps the abstract syntax to the objects in N is as follows:

N[[ 0 ]] = 0

N[[ 1 ]] = 1

N[[ 0 ]] = 2 * N[[ ]]

N[[ 1 ]] = 2 * N[[ ]] + 1

Evaluation.

In a similar but complex way objects and functions can be defined for the other syntactic entities of programming languages. This provides a framework for thinking in a highly rigorous way about programming, as well as a method of proving the correctness of programs.

It can be used as an aid to language design.

Attribute Grammars

An attribute grammar is a grammar with the following additions:

Associated with each grammar symbol X is a set of attributes A(X). The set consists of two disjoint sets, synthesized attributes and inherited attributes.

Associated with each grammar rule is a set of semantic functions and a possibly empty set of predicate functions over the attributes of the symbols in the grammar rule.

meaning that their values depend only on the attribute values of their parent nodes.

The inherited attributes of X_{j}, 1 j n, are computed with a semantic function of the form

I(X_{j}) = f(A(X_{0}))

meaning that their values depend only on the attribute values of their parent nodes.

Intrinsic Attributes. They are synthesized attributes of leaf nodes, where values are determined outside the parse tree.

e.g. The data type of a variable in program could come from a table a symbol table.

Example: An attribute grammar for simple assignment statement.

1. Syntax rule: :=

Semantic rule: .env .env

.env .env

.lhs_type .actual_type

.expected_type .lhs_type

2. Syntax rule: [2] + [3]

Semantic rule: [2].env .env

[3].env .env

.actual_type

if ([2].actual_type = int_type)

and ([3].actual_type = int_type)

then int_type

else real_type

end if

Predicate: .actual_type = .expected_type

3. Syntax rule:

Semantic rule: .actual_type .actual_type

.env .env

Predicate: .actual_type = .expected_type

4. Syntax rule: A | B | C

Semantic rule: .actual_type look-up (RHS, .env)

actual_type. It is associated with the terminals and . It is used to stores either int_type or real_type. In case of a variable, the actual type is intrinsic.

expected_type. i. An inherited attribute associated with the non-terminal . It is used to stores either int_type or real_type.

ii. It is determined by the type of the variable on the left side of the assignment statement.

lhs_type. A synthesized attribute associated with . It is used to move the value of the synthesized actual)type of the LHS of an assignment statement to the inherited attribute expected for the .

env. An inherited attribute associated with the non-terminals , and . It carries the reference to the correct symbol table entries to the instances of variables.
env

It provides a complete description of the syntax and static semantics of program language; they have been used as the formal definition of language that can be input to a compiler generation system.

Difficulties. Its size and complexity; a large parse tree which is costly to be evaluated.

Code Generation

The code specific to the target machine is generated.

As the code is machine code then it is usual for several machine code instructions to be generated for each high level language instruction.

ii. search for the symbol table to see the locations A, B and C.

iii. generate the necessary machine code.

It should be reminded that parse trees may often be built before this phase, they can be used in the generation.

Routines from the system library may often have to be called up, e.g. write procedure of Pascal.

Optimisation. Often the code produced by such methods is not the best that could be obtained. It is possible to make more efficient machine code by carrying out a process which is called optimisation.

Reverse Polish notation (Postfix)

** The reverse Polish notation is used to parse and represent arithmetic expressions in compiler.

Polish notation is also known as prefix notation because each operator precedes its operand.

A ‘Normal’ arithmetic expression is as follows

(3+5)x(9-7)

This is called infix notation because all the operators are inside the expression. The Polish notation of it will be as follows

x+35-97

The Polish notation has the advantage that there can be no ambiguity in the way that an arithmetic expression can be worked out. It also needs no parentheses to separate the different parts.

Another notation is the reverse Polish (or postfix) notation which is very similar in principle and also forms a parentheses-free notation. However, this time reverse Polish notation is particularly suited to computerised methods because of the ability to deal with such expression easily by using a stack.

The reverse Polish notation of the above expression is as follows

35+97-x

This leads to the following very simple rules for evaluating such expressions::

The next symbol encountered must be loaded on to the stack if it is an operand, i.e. a number or variable which is to be operated upon.

If the next symbol to be encountered is an operator, i.e. +, /, -, etc. then carry out the required operation on the top two items in the stack. The result of this operation must be left on the top of the stack.

To convert an infix string of arithmetic expression to postfix one, a stack and a table of order of precedence should be used.

Assume the following rules of precedence are used:

Operator

Precedence

() & ^

3

* & /

2

+ & -

1

=

0

The algorithm of the conversion is shown in the following flowchart: