Slide 1
Fall 2014-2015
Compiler Principles
Lecture 8: Intermediate Representation
Roman Manevich
Ben-Gurion University
Slide 2
Tentative syllabus
[Figure: course syllabus timeline; only the label "mid-term exam" is recoverable]
Slide 3
Previously
  Why compilers need Intermediate Representations (IR)
  Translating from abstract syntax (AST) to IR
  Three-Address Code
  Local optimizations
  Sethi-Ullman algorithm for efficient code generation
Slide 4
cgen for basic expressions
cgen(k) = { // k is a constant
  Choose a new temporary t
  Emit( t := k )
  Return t
}
cgen(id) = { // id is an identifier
  Choose a new temporary t
  Emit( t := id )
  Return t
}
Slide 5
Naive cgen for binary expressions
Maintain a counter for temporaries in c
Initially: c = 0
cgen(e1 op e2) = {
  Let A = cgen(e1)
  c = c + 1
  Let B = cgen(e2)
  c = c + 1
  Emit( _tc := A op B; )
  Return _tc
}
The translation emits code to evaluate e1 before e2. Why is that?
Slide 6
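A minimal Python sketch of the naive cgen scheme (leaves plus binary expressions, with the counter c), run on (a*b)-d. The tuple encoding of the AST, the `state` dictionary, and the name `naive_cgen` are assumptions made for this illustration and do not come from the lecture.

```python
def naive_cgen(expr, state):
    """Naive translation: one fresh temporary per sub-expression.

    expr is a leaf (identifier or constant) or a tuple (op, left, right).
    state holds the counter "c" and the emitted instructions "code".
    """
    if not isinstance(expr, tuple):
        # Leaf: copy the constant or identifier into the temporary _tc.
        t = f"_t{state['c']}"
        state["code"].append(f"{t} := {expr};")
        return t
    op, left, right = expr
    a = naive_cgen(left, state)   # code for e1 is emitted first
    state["c"] += 1
    b = naive_cgen(right, state)  # then code for e2
    state["c"] += 1
    t = f"_t{state['c']}"
    state["code"].append(f"{t} := {a} {op} {b};")
    return t


state = {"c": 0, "code": []}
result = naive_cgen(("-", ("*", "a", "b"), "d"), state)
print("\n".join(state["code"]))
```

Under these assumptions the sketch uses five temporaries and five instructions for an expression with three leaves and two operators.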
Example: cgen for binary expressions (animation, slides 6 to 16)
c = 0
cgen( (a*b)-d ) = {
  Let A = {
    Let A = { Emit( _tc := a; ), Return _tc }
    c = c + 1
    Let B = { Emit( _tc := b; ), Return _tc }
    c = c + 1
    Emit( _tc := A * B; )
    Return _tc
  }
  c = c + 1
  Let B = { Emit( _tc := d; ), Return _tc }
  c = c + 1
  Emit( _tc := A - B; )
  Return _tc
}
Code (one instruction is emitted per animation step)
  _t0 := a;           here (inner) A = _t0
  _t1 := b;
  _t2 := _t0 * _t1;   here (outer) A = _t2
  _t3 := d;
  _t4 := _t2 - _t3;
Slide 17
Naïve translation
The cgen translation shown so far is very inefficient
  Generates (too) many temporaries: one per sub-expression
  Generates many instructions: at least one per sub-expression
Expensive in terms of running time and space
  Code bloat
We can do much better
Slide 18
Agenda
  Lowering optimizations
  Sethi-Ullman algorithm for efficient code generation
Slide 19
Lowering optimization
Slide 20
Improving cgen for expressions
  Improve translation for leaves
  Reuse temporaries
  Weighted register allocation for expression trees
  Reorder computations
Slide 21
Optimization 1: leaves
Slide 22
Improving cgen for leaves cases
Observation: naïve translation needlessly generates temporaries for leaf expressions
Solution: treat leaves as special cases by returning them immediately without storing them in temporaries
Slide 23
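The leaf optimization (return a constant or identifier directly instead of copying it into a temporary) can be sketched in Python as follows. The tuple AST encoding, the `state` dictionary, the name `cgen_leaf_opt`, and the way the counter is advanced after each result are assumptions of this sketch, not the lecture's definition.

```python
def cgen_leaf_opt(expr, state):
    """cgen where leaves are returned as operands, not stored in temporaries."""
    if not isinstance(expr, tuple):
        # Leaf: no instruction is emitted, the name or constant is used as is.
        return str(expr)
    op, left, right = expr
    a = cgen_leaf_opt(left, state)
    b = cgen_leaf_opt(right, state)
    t = f"_t{state['c']}"
    state["c"] += 1  # a fresh temporary only for each operator node
    state["code"].append(f"{t} := {a} {op} {b};")
    return t


state = {"c": 0, "code": []}
result = cgen_leaf_opt(("-", ("*", "a", "b"), "d"), state)
print("\n".join(state["code"]))
```

For (a*b)-d this emits two instructions instead of the five produced by the naive scheme, because only the two operator nodes need temporaries.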
Optimization 2: reusing temporaries
Slide 24
cgen with temporaries recycling
Observation: temporaries are used exactly once
  Once a temporary has been read it can be reused for another sub-expression
Want to implement cgen(x := e1 op e2) with a small number of temporaries
cgen(x := e1 op e2) = {
  c = 0
  Emit( x := cgen(e1 op e2) )
}
How can we make this more efficient?
Slide 25
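Improved cgen
c = 0
cgen(e1 op e2) = {
  cgen(e1)
  c = c + 1
  cgen(e2)
  c = c - 1
  Emit( _tc := _tc op _t(c+1); )
}
Here _t(c+1) denotes the temporary whose index is c+1. After cgen(e) returns, the value of e is in _tc and c has its original value.
Slide 26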
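The improved cgen with temporary recycling can be sketched in Python like this. The tuple AST encoding, the `state` dictionary, and the name `cgen_reuse` are assumptions made for illustration; as in the slides, leaves are still loaded into temporaries.

```python
def cgen_reuse(expr, state):
    """Emit code that leaves the value of expr in _tc, where c = state["c"].

    The counter is restored before returning, so the temporary above _tc
    is free to be reused by the next sub-expression.
    """
    c = state["c"]
    if not isinstance(expr, tuple):
        state["code"].append(f"_t{c} := {expr};")
        return
    op, left, right = expr
    cgen_reuse(left, state)    # value of e1 ends up in _tc
    state["c"] = c + 1
    cgen_reuse(right, state)   # value of e2 ends up in _t(c+1)
    state["c"] = c
    state["code"].append(f"_t{c} := _t{c} {op} _t{c + 1};")


state = {"c": 0, "code": []}
cgen_reuse(("-", ("*", "a", "b"), "d"), state)
print("\n".join(state["code"]))
```

Only _t0 and _t1 appear in the output for (a*b)-d, compared with _t0 through _t4 in the naive scheme.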
Example (animation, slides 26 to 34)
c = 0
cgen( (a*b)-d ) = {
  Let _tc = {
    Let _tc = { Emit( _tc := a; ), Return _tc }
    c = c + 1
    Let _tc = { Emit( _tc := b; ), Return _tc }
    c = c - 1
    Emit( _tc := _tc * _t(c+1); )
    Return _tc
  }
  c = c + 1
  Let _tc = { Emit( _tc := d; ), Return _tc }
  c = c - 1
  Emit( _tc := _tc - _t(c+1); )
  Return _tc
}
Code (with the value of c when each instruction is emitted)
  _t0 := a;           c = 0, then c = c + 1
  _t1 := b;           c = 1, then c = c - 1
  _t0 := _t0 * _t1;   c = 0, then c = c + 1
  _t1 := d;           c = 1, then c = c - 1
  _t0 := _t0 - _t1;   c = 0
Slide 35
Example 2
_t0 := cgen( ((c*d)-(e*f))+(a*b) )
(The counter c below is distinct from the program variable c in the expression.)
c = 0
_t0 := cgen( (c*d)-(e*f) )
  _t0 := c*d
  c = c + 1
  _t1 := e*f
  c = c - 1
  _t0 := _t0 - _t1
c = c + 1
_t1 := a*b
c = c - 1
_t0 := _t0 + _t1
Slide 36
Optimization 3: weighted register allocation
Slide 37
Sethi-Ullman translation
Algorithm by Ravi Sethi and Jeffrey D. Ullman to emit optimal TAC
  Minimizes number of temporaries
  (Minimizes number of instructions by reducing spills)
A kind of local register allocation (within single expressions)
Uses two ideas
  Reusing temporaries
  Choosing order of translation
Slide 38
Weighted register allocation
Suppose we have an expression e1 op e2
  e1, e2 without side-effects
  That is, no function calls, memory accesses, ++x
Does the order of translation matter? … Yes
Sethi & Ullman's algorithm translates the heavier sub-tree first
  Optimal local (per-statement) allocation for side-effect-free statements
Slide 39
Example 1 (assuming leaves require temporaries)
_t0 := cgen( a+(b+(c*d)) )
[Figure: two copies of the expression tree for a+(b+(c*d)), each node annotated with the temporary that holds its value]
  Left child first: 4 temporaries (_t0 to _t3)
  Right child first: 2 temporaries (_t0 and _t1)
Slide 40
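The temporary counts for a+(b+(c*d)) under the two fixed evaluation orders can be checked with a short Python sketch. It assumes the recycling scheme in which every leaf is loaded into a temporary; the tuple AST encoding and the name `temps_needed` are hypothetical.

```python
def temps_needed(expr, right_first):
    """Temporaries used when sub-trees are always visited in one fixed order."""
    if not isinstance(expr, tuple):
        return 1  # a leaf is loaded into one temporary
    _, left, right = expr
    first, second = (right, left) if right_first else (left, right)
    # The result of the first operand occupies one temporary
    # for as long as the second operand is being evaluated.
    return max(temps_needed(first, right_first),
               1 + temps_needed(second, right_first))


expr = ("+", "a", ("+", "b", ("*", "c", "d")))
left_first = temps_needed(expr, right_first=False)
right_first = temps_needed(expr, right_first=True)
print(left_first, right_first)
```

The deep sub-tree sits on the right, so visiting it first lets its result collapse into a single temporary before the shallow operands are loaded.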
Sethi-Ullman: weighted register allocation
Can save registers by re-ordering subtree computations
Phase 1: check that no side-effects exist; otherwise revert to translation following the pre-defined order of computation
Phase 2: assign a weight to each node
  Weight = number of registers needed
  Leaf weight is known
  Internal node weight:
    if w(left) > w(right) then w = w(left)
    if w(right) > w(left) then w = w(right)
    if w(right) = w(left) then w = w(left) + 1
Phase 3: translate by following the heavier child first
Slide 41
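Phases 2 and 3 of the weighted allocation can be sketched in Python as below. The sketch assumes Phase 1 has already established that the expression is side-effect-free, gives every leaf weight 1 (leaves require temporaries), and breaks ties by taking the left child first. The tuple AST encoding and the names `weight` and `su_cgen` are assumptions of this illustration.

```python
def weight(expr):
    """Phase 2: number of temporaries needed for expr."""
    if not isinstance(expr, tuple):
        return 1
    _, left, right = expr
    wl, wr = weight(left), weight(right)
    return wl + 1 if wl == wr else max(wl, wr)


def su_cgen(expr, c, code):
    """Phase 3: leave the value of expr in _tc, heavier child first."""
    if not isinstance(expr, tuple):
        code.append(f"_t{c} := {expr};")
        return
    op, left, right = expr
    if weight(right) > weight(left):
        su_cgen(right, c, code)      # heavier right child goes first
        su_cgen(left, c + 1, code)
        code.append(f"_t{c} := _t{c + 1} {op} _t{c};")
    else:
        su_cgen(left, c, code)
        su_cgen(right, c + 1, code)
        code.append(f"_t{c} := _t{c} {op} _t{c + 1};")


expr = ("+", ("+", "a", "b"), ("+", ("+", "c", "d"), ("+", "e", "f")))
code = []
su_cgen(expr, 0, code)
code.append("g := _t0;")
print(weight(expr))
print("\n".join(code))
```

Recomputing `weight` at every node keeps the sketch short but is quadratic in the worst case; caching the weights on the tree in a separate pass would avoid that.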
Sethi-Ullman algorithm example
cgen( g := (a+b)+((c+d)+(e+f)) )
[Figure: expression tree annotated with weights]
  a, b, c, d, e, f: w=1
  a+b: w=2
  c+d: w=2
  e+f: w=2
  (c+d)+(e+f): w=3
  (a+b)+((c+d)+(e+f)): w=3
Code
  _t0 := c;
  _t1 := d;
  _t0 := _t0 + _t1;
  _t1 := e;
  _t2 := f;
  _t1 := _t1 + _t2;
  _t0 := _t0 + _t1;
  _t1 := a;
  _t2 := b;
  _t1 := _t1 + _t2;
  _t0 := _t1 + _t0;
  g := _t0
Slide 42
Example with side effects
_t0 := cgen( a+b[5*c] )
Phase 1: check absence of side-effects in expression tree
Phase 2: assign weight to each AST node
[Figure: expression tree annotated with weights]
  a: w=1
  b (base): w=1
  5: w=1
  c: w=1
  5*c (index): w=2
  array access b[5*c]: w=2
  a+b[5*c]: w=2
Slide 43
Example with side effects
_t0 := cgen( a+b[5*c] )
Phase 3: use weights to decide on order of translation
[Figure: the weighted expression tree from the previous slide, with each node annotated with its temporary; the array access and its index 5*c are marked "Heavier sub-tree"]
Code
  _t0 := c
  _t1 := 5
  _t0 := _t1 * _t0
  _t1 := b
  _t0 := _t1[_t0]
  _t1 := a
  _t0 := _t1 + _t0
Slide 44
Note on weighted register allocation
The algorithm is local: it works for a single expression
Must reset the temporaries counter after every statement:
  x := y;
  y := z;
should not be translated to
  _t0 := y;
  x := _t0;
  _t1 := z;
  y := _t1;
But rather to
  _t0 := y;
  x := _t0;
  # Finished translating statement. Set c=0
  _t0 := z;
  y := _t0;
Slide 45
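Resetting the temporaries counter after every statement can be sketched in Python as below, restricted to simple copy statements like the ones in the note. The list-of-pairs statement encoding and the name `translate_statements` are assumptions made for illustration.

```python
def translate_statements(stmts):
    """Translate copies "target := source", restarting temporaries at _t0."""
    code = []
    for target, source in stmts:
        c = 0  # finished the previous statement: reset the counter
        code.append(f"_t{c} := {source};")
        code.append(f"{target} := _t{c};")
    return code


code = translate_statements([("x", "y"), ("y", "z")])
print("\n".join(code))
```

Because no temporary is live across statement boundaries in this scheme, reusing _t0 for the second statement is safe.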
Going beyond the basic SU algorithm
There exist advanced extensions of SU
  Take into account semantics of operators
  Take into account latency
Slide 46
Can we do better than basic SU here?
cgen( g := (a+b)+((c+d)+(e+f)) )
[Figure: expression tree annotated with weights]
  a, b, c, d, e, f: w=1
  a+b: w=2
  c+d: w=2
  e+f: w=2
  (c+d)+(e+f): w=3
  (a+b)+((c+d)+(e+f)): w=3
Code
  _t0 := c;
  _t1 := d;
  _t0 := _t0 + _t1;
  _t1 := e;
  _t2 := f;
  _t1 := _t1 + _t2;
  _t0 := _t0 + _t1;
  _t1 := a;
  _t2 := b;
  _t1 := _t1 + _t2;
  _t0 := _t1 + _t0;
  g := _t0
What if we are allowed to rewrite expressions?
Slide 47
Can we do better than basic SU here?
cgen( g := (a+b)+((c+d)+(e+f)) ) =
cgen( g := a + (b + (c + (d + (e + f)))) )
[Figure: right-leaning expression tree annotated with weights]
  a, b, c, d, e, f: w=1
  e+f: w=2
  d+(e+f): w=2
  c+(d+(e+f)): w=2
  b+(c+(d+(e+f))): w=2
  a+(b+(c+(d+(e+f)))): w=2
Code
  _t0 := e;
  _t1 := f;
  _t0 := _t0 + _t1;
  _t1 := d;
  _t0 := _t0 + _t1;
  _t1 := c;
  _t0 := _t0 + _t1;
  _t1 := b;
  _t0 := _t0 + _t1;
  _t1 := a;
  _t0 := _t0 + _t1;
  g := _t0
Can rewrite since + is associative and commutative
Slide 48
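The rewrite of (a+b)+((c+d)+(e+f)) into a right-leaning chain can be sketched in Python as follows. It assumes the operator really is associative for the values involved (floating-point addition, for example, is not), and the tuple AST encoding and the names `weight`, `operands`, and `right_chain` are hypothetical.

```python
def weight(expr):
    """Sethi-Ullman weight with every leaf counted as 1."""
    if not isinstance(expr, tuple):
        return 1
    _, left, right = expr
    wl, wr = weight(left), weight(right)
    return wl + 1 if wl == wr else max(wl, wr)


def operands(expr, op):
    """Flatten nested applications of op into a left-to-right operand list."""
    if isinstance(expr, tuple) and expr[0] == op:
        return operands(expr[1], op) + operands(expr[2], op)
    return [expr]


def right_chain(expr, op="+"):
    """Rebuild expr as x1 op (x2 op (... op xn)); valid only if op is associative."""
    items = operands(expr, op)
    result = items[-1]
    for item in reversed(items[:-1]):
        result = (op, item, result)
    return result


original = ("+", ("+", "a", "b"), ("+", ("+", "c", "d"), ("+", "e", "f")))
rewritten = right_chain(original)
print(weight(original), weight(rewritten))
```

Associativity justifies the regrouping; commutativity is what allows the emitted code to compute each step as _t0 := _t0 + _t1 with the accumulated sum on the left.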
Next lecture: Local Optimizations