CS421 COMPILERS AND INTERPRETERS
Copyright 1994 - 2010 Zhong Shao, Yale University

Code Optimizations

source program -> compiler frontend -> intermediate code -> code optimizer -> better intermediate code -> compiler backend -> target machine code

• The intermediate code (e.g., IR tree) generated by the frontend is often not efficient.
• The code optimizer reads IR, emits better IR; almost all optimizations done here are machine-independent. Machine-dependent optimizations are done in the backend.
• Main techniques used: graph algorithms, control- and data-flow analysis

Code Optimizations (cont’d)

• A code optimizer is often organized as follows:
  intermediate code -> control-flow analysis -> inter. code with control-flow info -> data-flow analysis -> inter. code with control-flow info & data-flow info -> many code transformations -> improved inter. code
• Control-Flow Analysis: divide the IR into basic blocks, build the control-flow graph (CFG)
• Data-Flow Analysis: gather data-flow information (e.g., the set of live variables).
• Code Transformations: the actual optimizations

Code Optimizations (cont’d)
• Optimizations that are restricted to one basic block are called local optimizations; otherwise, they are called global optimizations.
• Here is a partial list of well-known compiler optimizations:
  algebraic optimizations (strength reduction, constant folding)
  common-subexpression elimination
  copy propagation and constant propagation
  dead-code elimination
  code motion (i.e., lifting loop invariants)
  induction-variable elimination; strength reduction for loops

Examples: Source Code

• C code for quicksort (also in ASU page 588):

void quicksort(m, n)
int m, n;
{
    int i, j, v, x;
    if (n <= m) return;
    i = m-1; j = n; v = a[n];
    while (1) {
        do i = i+1; while (a[i] < v);
        do j = j-1; while (a[j] > v);
        if (i >= j) break;
        x = a[i]; a[i] = a[j]; a[j] = x;
    }
    x = a[i]; a[i] = a[n]; a[n] = x;
    quicksort(m,j); quicksort(i+1,n);
}

Example: Intermediate Code
• Intermediate code for the shaded fragments of previous example:

(01) i := m - 1
(02) j := n
(03) t1 := 4 * n
(04) v := a[t1]
(05) i := i + 1
(06) t2 := 4 * i
(07) t3 := a[t2]
(08) if t3 < v goto (5)
(09) j := j - 1
(10) t4 := 4 * j
(11) t5 := a[t4]
(12) if t5 > v goto (9)
(13) if i >= j goto (23)
(14) t6 := 4 * i
(15) x := a[t6]
(16) t7 := 4 * i
(17) t8 := 4 * j
(18) t9 := a[t8]
(19) a[t7] := t9
(20) t10 := 4 * j
(21) a[t10] := x
(22) goto (5)
(23) t11 := 4 * i
(24) x := a[t11]
(25) t12 := 4 * i
(26) t13 := 4 * n
(27) t14 := a[t13]
(28) a[t12] := t14
(29) t15 := 4 * n
(30) a[t15] := x

Control-Flow Analysis

• How to build the Control-Flow Graph (CFG)?
  each basic block as a node, each jump statement (or fall-through) as an edge.
  there is always a root: the “initial” node or the entry point
• How to identify loops? And how to identify nested loops?
  1. build the dominator tree from the CFG
  2. find all the back edges; each back edge defines a natural loop
  3. keep finding the innermost loop and reduce it to a single node.
• Given a CFG G with the initial node (root) r, we say node d dominates node n if every path from root r to n goes through d.
• The dominator tree is used to characterize the “dominate” relation: r is the root, and the parent of a node is its immediate dominator. (see ASU page 602-608 for more details)

Data-Flow Analysis

• Data-Flow Analysis refers to a process in which the optimizer collects data-flow information at all the program points.
• Examples of interesting data-flow information:
  reaching definitions: the set of definitions reaching a program point
  available expressions: the set of expressions available at a point
  live variables: the set of variables that are live at a point
  ..........
• Program points: within each basic block, the point between two adjacent statements, or the point before the first statement and after the last.
  A path from point p1 to pn is a sequence of points p1, ..., pn such that pi and pi+1 are “adjacent” for all i = 1, ..., n-1.

Data-Flow Analysis (cont’d)

• For each statement S, we associate it with four sets:
  in[S] : the set of data-flow info. associated with the point before S
  out[S] : the set of data-flow info. associated with the point after S
  gen[S] : the set of data-flow info. generated by S
  kill[S] : the set of data-flow info. destroyed by S
  Naturally, if S1 and S2 are two “adjacent” statements within a basic block, say, S2 immediately follows S1, then in[S2] = out[S1]
• We can define these four sets for each basic block B in the same way. The gen and kill sets of a basic block can be calculated from the corresponding values for each statement of that basic block.
• Forward-Data-Flow-Problem: the data-flow info. is calculated along the direction of control flow; Backward-Data-Flow-Problem: the data-flow info. is calculated opposite to the direction of control flow.

Example: Reaching Definitions
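A minimal Python sketch of reaching definitions as a forward data-flow problem. It first folds the per-statement information into gen[B] and kill[B] for each block, then iterates out[B] = gen[B] ∪ (in[B] - kill[B]) until nothing changes. The four-block CFG, the definition labels d1..d5, and the function names are assumptions made up for illustration; they are not taken from the quicksort example.

```python
def gen_kill(block, defs_of):
    """Fold per-statement gen/kill into gen[B] and kill[B] for one block."""
    gen, kill = set(), set()
    for d, var in block:
        others = defs_of[var] - {d}   # d kills every other definition of var
        gen = (gen - others) | {d}    # earlier defs of var in this block do not survive
        kill |= others
    return gen, kill


def reaching_definitions(blocks, preds):
    """Return (in, out) maps from block name to the set of reaching definitions."""
    defs_of = {}
    for block in blocks.values():
        for d, var in block:
            defs_of.setdefault(var, set()).add(d)

    gen, kill = {}, {}
    for name, block in blocks.items():
        gen[name], kill[name] = gen_kill(block, defs_of)

    in_sets = {b: set() for b in blocks}
    out_sets = {b: set(gen[b]) for b in blocks}   # out[B] assuming in[B] is empty
    change = True
    while change:
        change = False
        for b in blocks:
            in_sets[b] = set().union(*(out_sets[p] for p in preds[b]))
            new_out = gen[b] | (in_sets[b] - kill[b])
            if new_out != out_sets[b]:
                out_sets[b] = new_out
                change = True
    return in_sets, out_sets


# Hypothetical CFG: B1 -> B2, B2 -> B3, B3 -> B2 (a loop), B2 -> B4.
# Each statement is (definition label, variable assigned).
blocks = {
    "B1": [("d1", "i"), ("d2", "j")],
    "B2": [("d3", "i")],
    "B3": [("d4", "j")],
    "B4": [("d5", "i")],
}
preds = {"B1": [], "B2": ["B1", "B3"], "B3": ["B2"], "B4": ["B2"]}

IN, OUT = reaching_definitions(blocks, preds)
for b in blocks:
    print(b, "in =", sorted(IN[b]), "out =", sorted(OUT[b]))
```

Sets are used directly because the transfer function is just union and difference; a production optimizer would more likely use bit vectors indexed by definition number.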
• A definition d reaches a point p if there is a path from the point immediately following d to p, such that d is not “killed” along that path.
• A definition of a variable v is “killed” between two points if there is an input statement that reads a value into v, or an assignment to v, in between.
• Goal: given a program point p, find out the set of definitions that might reach point p. This is a forward data-flow problem:

    /* initialize out[B] assuming in[B] = ∅ for all B */
    for each block B do out[B] := gen[B];
    change := true;
    while change do begin
        change := false;
        for each block B do begin
            in[B] := union of out[P] for all predecessors P of B;
            oldout := out[B];
            out[B] := gen[B] ∪ (in[B] - kill[B]);
            if out[B] <> oldout then change := true
        end
    end

Other Data-Flow Problems

• Use-Definition Chains: for each use of a variable v, find out all the definitions that reach that use. (directly from reaching definitions info.)
• Available Expressions: an expression x + y is available at a point p if every path from the initial node to p evaluates x + y, and after the last such evaluation prior to reaching p, there are no subsequent assignments to x or y. (this is a forward data-flow problem)
• Live-Variable Analysis: a variable x is live at point p if the value of x at p may be used along some path starting at p. (this is a backward data-flow problem)
• Definition-Use Chains: for each program point p, compute the set of uses s of a variable x such that there is a path from p to s that does not redefine x. (backward data-flow problem)
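Live-variable analysis runs the same kind of iteration in the opposite direction. The sketch below assumes the usual formulation in[B] = use[B] ∪ (out[B] - def[B]), with out[B] the union of in[S] over the successors S of B; here use[B] plays the role of gen and def[B] the role of kill. The three-block CFG and all names are hypothetical.

```python
def live_variables(use, defs, succs):
    """Backward data-flow: return (live_in, live_out) for every block."""
    live_in = {b: set() for b in use}
    live_out = {b: set() for b in use}
    change = True
    while change:
        change = False
        for b in use:
            # Information flows against control flow: out[B] comes from successors.
            live_out[b] = set().union(*(live_in[s] for s in succs[b]))
            new_in = use[b] | (live_out[b] - defs[b])
            if new_in != live_in[b]:
                live_in[b] = new_in
                change = True
    return live_in, live_out


# Hypothetical CFG: B1 -> B2, B2 -> B2 (self loop), B2 -> B3.
#   B1: i := m - 1; j := n
#   B2: i := i + 1; t := a[i]; if t < v goto B2
#   B3: x := i + j
# use[B] holds variables read before being written in B (array a is ignored).
use = {"B1": {"m", "n"}, "B2": {"i", "v"}, "B3": {"i", "j"}}
defs = {"B1": {"i", "j"}, "B2": {"i", "t"}, "B3": {"x"}}
succs = {"B1": ["B2"], "B2": ["B2", "B3"], "B3": []}

LIVE_IN, LIVE_OUT = live_variables(use, defs, succs)
for b in use:
    print(b, "in =", sorted(LIVE_IN[b]), "out =", sorted(LIVE_OUT[b]))
```

Visiting blocks in reverse order would typically converge in fewer passes for a backward problem; the plain order is kept here to mirror the forward loop.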
Using Data-Flow Info.

• Common Subexpression Eliminations: a flow graph with available expression information. (ASU page 634)
  For every statement s of the form x := y + z such that y+z is available at the beginning of s’s block, and neither y nor z is defined prior to s in that block:
  1. discover all the last evaluations of y+z that reach s’s block
  2. create a new variable u.
  3. replace each statement w := y+z found in (1) by
       u := y + z
       w := u
  4. replace statement s by x := u

Using Data-Flow Info. (cont’d)

• Copy Propagations: a flow graph plus the ud-chains and du-chains information, and also some copy-statement info. (see ASU page 638)
  for each copy s : x := y, determine all the uses of x that are reached by this definition of x; then for each such use of x, determine whether s is the only definition that reaches this use; if so (and y is not reassigned on any path from s to that use), replace the use of x with y.
• Loop Invariants: a flow graph plus the ud-chains information
  a statement is a loop invariant if its operands are all constants, or its reaching definitions are loop invariants or from outside the loop.
• For more examples, see the ASU section 10.7.
• Challenges: what if there are procedure calls, pointer dereferencing ...? also, how to make these algorithms more efficient?

Static-Single Assignment

• Motivation: how to make data-flow analysis more efficient & powerful?
• Static-Single Assignment (SSA) form: an extension of CFG:
• Main idea #1: each assignment to a variable is given a unique name, and all of the uses reached by that assignment are renamed to match the assignment’s new name.

  original code        after the SSA transformation
  v := 4               v1 := 4
  z := v + 5           z := v1 + 5
  v := 6               v2 := 6
  y := v + 7           y := v2 + 7

Static-Single Assignment (cont’d)

• Main idea #2: after each branch-join node, a special form of assignment called a φ-function is inserted.

  original code        after the SSA transformation
  if P                 if P
  then v := 4          then v3 := 4
  else v := 6          else v4 := 6
  u = v + y            v5 = φ(v3,v4)
                       u = v5 + y

  φ(v1,v2,...,vn) means that if the runtime execution comes from the ith predecessor, then the above function returns the value of vi.
• Code Size: the φ-function inserted in SSA can increase the code size, but only linearly; in practice, the ratio of SSA over OLD is 0.6 - 2.4.
• Why is SSA good? SSA significantly simplifies the representation of many kinds of data-flow information; data-flow algorithms built on def-use chains, etc. gain asymptotic efficiency.
  In SSA, each use is reached by a unique def, so the size of def-use chains is linear to the number of edges in the CFG.
  In non-SSA, the def-use chains are much bigger.

SSA Construction [Cytron91]

• Turn every “preserving” def into a “killing” def, by copying potentially unmodified values (at subscripted defs, call sites, aliased defs, etc.)
• Every ordinary definition of v defines a new name.
• At each node in the flow graph where multiple definitions of v meet, a φ-function is introduced to represent yet another new name of v.
• Uses are renamed by their dominating definitions (where uses at a φ-function are regarded as belonging to the appropriate predecessor node of the φ-function).
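A minimal sketch of main idea #1 for straight-line code, plus a toy model of what a φ-function returns at run time. It is not the [Cytron91] construction: there is no control flow, no dominance computation, and no φ placement. The tuple encoding of statements and the names rename_straight_line and phi are assumptions for illustration; unlike the slide, which subscripts only v, the sketch numbers every assigned variable.

```python
def rename_straight_line(stmts):
    """Rename a straight-line sequence so each assignment gets a unique name.

    stmts is a list of (dest, operands); operands mix variable names,
    operator strings, and integer constants.
    """
    version = {}   # variable -> number of its most recent definition
    result = []
    for dest, operands in stmts:
        # A use is renamed to the latest definition that reaches it.
        new_ops = [f"{op}{version[op]}" if op in version else op
                   for op in operands]
        version[dest] = version.get(dest, 0) + 1
        result.append((f"{dest}{version[dest]}", new_ops))
    return result


def phi(i, *values):
    """Toy run-time meaning of a phi-function: control arrived from the
    i-th predecessor (1-based), so the i-th argument is selected."""
    return values[i - 1]


# The straight-line example from the slide: v := 4; z := v+5; v := 6; y := v+7
program = [
    ("v", [4]),
    ("z", ["v", "+", 5]),
    ("v", [6]),
    ("y", ["v", "+", 7]),
]
ssa = rename_straight_line(program)
for dest, ops in ssa:
    print(dest, ":=", " ".join(str(op) for op in ops))

# v5 = phi(v3, v4) with v3 = 4 and v4 = 6, for each incoming branch.
print(phi(1, 4, 6), phi(2, 4, 6))
```

Because every use in straight-line code is reached by exactly one definition, a single left-to-right pass with a version counter is enough; joins are where φ-functions and dominance information become necessary.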
 Fall '09
 Zhong Shao, Code Optimizations
