Semantic Analysis (reference only)

 Page 5/5 Date 28.01.2017 Size 181.29 Kb. #10755
1   2   3   4   5

Semantic Analysis (reference only)

1. There is no universal method of describing semantics.

2. Three methods: Operational, Axiomatic and Denotational.

3. Operational semantics

1. To use operational semantics to describe the semantics of a programming language requires the construction of two components.

1. transfer to convert statement to a close low-level language for a virtual machine.

2. the virtual machine itself.

2. e.g. Describing Pascal for...do loop

Pascal statement Operational semantics

for I := first to last do I := first

begin loop: if I > last goto out

. .

. .

end I := I + 1

goto loop

out: ...

1. Evaluation. It provides an effective means of describing semantics for language users and language implementers, as long as the descriptions are kept as simple and informal as possible.

1. Axiomatic Semantics

1. It is based on the mathematical logic.

2. Precondition  A predicate, or an assertion, immediately before a statement describes the constraints on the program variable.

3. Postcondition  An assertion immediately following a statement describe the new constraints on those variables.

4. e.g. if postcondition { sum > 11 } follows the statement sum := 2 * x + 1, then one possible precondition is { x > 10}. i.e. { x > 10 } sum := 2 * x + 1 { sum > 11 }

5. The weakest precondition is the least restriction that will guarantee the validity of the associated postcondition. For the above example, the preconditional {x > 10}, {x > 2000} and {x>15.5} are all valid, but weakest one should be {x > 5}.

6. e.g.

1. The postcondition of the statement a := b/2 -1 is {a < 10}. The weakest precondition is {b<22}. Thus {b < 22} a := b/2 -1 {a < 10}.

2. In general, {PxE} x := E {P} where x  E means substituting E for every occurrence of x in the postcondition.

3. There is a wp transformer function used as follows

wp( x := E, P) = PxE

1. Sequence

If {P1} S1 {P2} and

{P2} S2 {P3}

we get {P1} S1, S2 {P3}. If

S1 is x1 := E1 and

S2 is x2 := E2

then we get

{P3x2E2} x2 := E2 {P3}

{(P3x2x2)x1E1} x1 := E1 {P3x2E2}

1. For while loop

while y <> x do y := Y + 1 { y = x}

For 0 iteration, the weakest precondition is { y = x }

For 1 iteration, wp(y := y+1, {y = x}) = {y=x-1}

For 2 iteration, wp(y := y+2, {y = x}) = {y=x-2}

For 3 iteration, wp(y := y+3, {y = x}) = {y=x-3}

 If the postcondition of the loop is loop termination. The weakest precondition is

{y  x }

1. Evaluation

1. A powerful tool for research into program correctness proofs.

2. No general methods of creating the predicate transformers function, thus the usefulness is limited.

1. Denotational Semantics

1. It defines both a mathematical object for each language entity and a function that maps instances of that entity onto instance of the mathematics object.

2. e.g. BNF of binary number

 0

| 1

| 0

| 1

1. The semantics function N maps the abstract syntax to the objects in N is as follows:

N[[ 0 ]] = 0

N[[ 1 ]] = 1

N[[ 0 ]] = 2 * N[[ ]]

N[[ 1 ]] = 2 * N[[ ]] + 1

1. Evaluation.

1. In a similar but complex way objects and functions can be defined for the other syntactic entities of programming languages. This provides a framework for thinking in a highly rigorous way about programming, as well as a method of proving the correctness of programs.

2. It can be used as an aid to language design.

Attribute Grammars

1. An attribute grammar is a grammar with the following additions:

1. Associated with each grammar symbol X is a set of attributes A(X). The set consists of two disjoint sets, synthesized attributes and inherited attributes.

2. Associated with each grammar rule is a set of semantic functions and a possibly empty set of predicate functions over the attributes of the symbols in the grammar rule.

3. For a rule X0  X1...Xn,

1. The synthesized attributes of X0 are computed with a semantic function of the form

S(X0) = f(A(X1), ... , A(Xn))

meaning that their values depend only on the attribute values of their parent nodes.

1. The inherited attributes of Xj, 1  j  n, are computed with a semantic function of the form

I(Xj) = f(A(X0))

meaning that their values depend only on the attribute values of their parent nodes.

1. Intrinsic Attributes. They are synthesized attributes of leaf nodes, where values are determined outside the parse tree.

e.g. The data type of a variable in program could come from a table  a symbol table.

1. Example: An attribute grammar for simple assignment statement.

1. Syntax rule: :=

Semantic rule: .env  .env

.env  .env

.lhs_type  .actual_type

.expected_type  .lhs_type

2. Syntax rule: [2] + [3]

Semantic rule: [2].env  .env

[3].env  .env

.actual_type 

if ([2].actual_type = int_type)

and ([3].actual_type = int_type)

then int_type

else real_type

end if

Predicate: .actual_type = .expected_type

3. Syntax rule:

Semantic rule: .actual_type  .actual_type

.env  .env

Predicate: .actual_type = .expected_type

4. Syntax rule:  A | B | C

Semantic rule: .actual_type  look-up (RHS, .env)

actual_type. It is associated with the terminals and . It is used to stores either int_type or real_type. In case of a variable, the actual type is intrinsic.

expected_type. i. An inherited attribute associated with the non-terminal . It is used to stores either int_type or real_type.

ii. It is determined by the type of the variable on the left side of the assignment statement.

lhs_type. A synthesized attribute associated with . It is used to move the value of the synthesized actual)type of the LHS of an assignment statement to the inherited attribute expected for the .

env. An inherited attribute associated with the non-terminals , and . It carries the reference to the correct symbol table entries to the instances of variables.
env

<assign>

lhs_type

expected_type env

actual_type

env actual_type env actual_type env

actual_type

A := A + B

Figure 11 The flow of attributes in the tree

Figure 12. A fully attributed env=table_1

parse tree

env=table_1

lhs_type = real_type

env=table_1

expected_type=real_type

actual_type = real_type
env=table_1 env=table_1

actual_type = env=table_1 actual_type =

real_type actual_type = int_type

real_type

A := A + B

1. Evaluation.

1. It provides a complete description of the syntax and static semantics of program language; they have been used as the formal definition of language that can be input to a compiler generation system.

2. Difficulties. Its size and complexity; a large parse tree which is costly to be evaluated.

Code Generation

1. The code specific to the target machine is generated.

2. As the code is machine code then it is usual for several machine code instructions to be generated for each high level language instruction.

3. e.g. LET A = B + C in Basic.

In Code Generation,

i. remove the redundant word LET.

ii. search for the symbol table to see the locations A, B and C.

iii. generate the necessary machine code.

1. It should be reminded that parse trees may often be built before this phase, they can be used in the generation.

2. Routines from the system library may often have to be called up, e.g. write procedure of Pascal.

3. Optimisation. Often the code produced by such methods is not the best that could be obtained. It is possible to make more efficient machine code by carrying out a process which is called optimisation.

Reverse Polish notation (Postfix)

** The reverse Polish notation is used to parse and represent arithmetic expressions in compiler.

1. Polish notation is also known as prefix notation because each operator precedes its operand.

2. A ‘Normal’ arithmetic expression is as follows

(3+5)x(9-7)

This is called infix notation because all the operators are inside the expression. The Polish notation of it will be as follows

x+35-97

1. The Polish notation has the advantage that there can be no ambiguity in the way that an arithmetic expression can be worked out. It also needs no parentheses to separate the different parts.

2. Another notation is the reverse Polish (or postfix) notation which is very similar in principle and also forms a parentheses-free notation. However, this time reverse Polish notation is particularly suited to computerised methods because of the ability to deal with such expression easily by using a stack.

3. The reverse Polish notation of the above expression is as follows

35+97-x

1. This leads to the following very simple rules for evaluating such expressions::

1. The next symbol encountered must be loaded on to the stack if it is an operand, i.e. a number or variable which is to be operated upon.

2. If the next symbol to be encountered is an operator, i.e. +, /, -, etc. then carry out the required operation on the top two items in the stack. The result of this operation must be left on the top of the stack.

2. To convert an infix string of arithmetic expression to postfix one, a stack and a table of order of precedence should be used.

3. Assume the following rules of precedence are used:
 Operator Precedence () & ^ 3 * & / 2 + & - 1 = 0

1. The algorithm of the conversion is shown in the following flowchart:

Start

Stop

Put on ( Other

stack Test Error report

Operand

Output to postfix string

Remove top of stack

Look Empty

) at top of

Test stack

Operand

Output top of stack to postfix string

Yes

Space Stack No Is (

Test empty on top of

? stack?

Operation Yes No

Is Stop Output top of stack

operation of to postfix string

Yes higher precedence No

than that on

stack or stack

empty? Output top of stack

` to postfix string

Figure 13

1. To convert the expression V+W^X*Y/(Z-1):
 Symbol being considered Output Postfix String Stack ( bottom) V V + V + W VW + ^ VW ^+ X VWX ^+ * VWX^ *+ Y VWX^Y *+ / VWX^Y* /+ ( VWX^Y* (/+ Z VWX^Y*Z (/+ - VWX^Y*Z -(/+ 1 VWX^Y*Z1 -(/+ ) VWX^Y*Z1- /+ end of string VWX^Y*Z1-/+ stack empty

Programming Language page