Synthesizing parametric specifications of dynamic memory utilization in object-oriented programs
Víctor Braberman: DC, FCEN, UBA, Argentina
Diego Garbervetsky: DC, FCEN, UBA, Argentina
Sergio Yovine: Verimag, France
Dependable Software Research Group (DEPENDEX)
Synthesis of parametric specifications of dynamic memory utilization. BGY. FTFJP'05

Motivation
void m1(int k) {
  for (int i = 1; i <= k; i++) {
    A a = new A();
    m2(i);
  }
}
void m2(int n) {
  for (int j = 1; j <= n; j++) {
    B b = new B();
  }
}

How much dynamic memory is allocated when method m1 is invoked?
k times new A() and (1+2+…+k) times new B(), so
memAlloc(m1) = size(A) * k + size(B) * (½ k² + ½ k)
Not a trivial task!

Context
Problem undecidable in general
Several techniques for functional languages
Impossible to find an exact expression of dynamic memory allocation even knowing program inputs
Usually linear upper bounds
Less explored for object-oriented programs

Our work
A general technique to find nonlinear parametric upper bounds of dynamic memory utilization
Given a method m(p1,…,pn)
memAlloc(m): symbolic expression (a polynomial) in terms of p1,…,pn over-approximating the amount of dynamic memory allocated by any run starting at m

Key idea: Counting visits to statements that allocate memory
for(i=0;i<n;i++)
  for(j=0;j<i;j++)
    • new C()
φ ≡ {0 ≤ i < n, 0 ≤ j < i}: a set of constraints describing an iteration space
Dynamic memory allocations
≅ number of visits to new statements
≅ number of possible variable assignments at the statement’s control location
≅ number of integer solutions of a predicate constraining variable assignments at its control location (i.e. an invariant)
For linear invariants, # of integer solutions = # of integer points = Ehrhart polynomial (here size(C) * (½ n² − ½ n))
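
The correspondence between visits and integer points can be checked by brute force on the small loop nest above. The sketch below is only an illustration, not part of the authors' tool; the class and method names are hypothetical. It enumerates the points of φ ≡ {0 ≤ i < n, 0 ≤ j < i} and compares the count with the closed form n(n − 1)/2, which is what this particular φ yields.

```java
public class IterationSpaceCount {
    // Brute-force count of integer points (i, j) with 0 <= i < n and 0 <= j < i.
    static long countPoints(int n) {
        long count = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < i; j++) {
                count++; // one visit to "new C()"
            }
        }
        return count;
    }

    // Closed form of the same count for n >= 0: n(n-1)/2.
    static long closedForm(int n) {
        return (long) n * (n - 1) / 2;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 5; n++) {
            System.out.println(n + ": " + countPoints(n) + " vs " + closedForm(n));
        }
    }
}
```

Multiplying the count by size(C) gives the memory attributed to this allocation site.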

Our approach
1. Identify every allocation site (new statement) reachable from the method under analysis (MUA)
2. Generate invariants describing the possible variable assignments at each allocation site (the “iteration space”)
3. Count the number of solutions of the invariant in terms of the MUA parameters (# of visits to the allocation site)
4. Adapt those expressions to take into account the size of the objects allocated (their types)
5. Sum up the resulting expressions for each allocation site

Running Example
void m0(int mc) {
1:  m1(mc);
2:  B[] m2Arr = m2(2 * mc);
}
void m1(int k) {
3:  for (int i = 1; i <= k; i++) {
4:    A a = new A();
5:    B[] dummyArr = m2(i);
    }
}
B[] m2(int n) {
6:  B[] arrB = new B[n];
7:  for (int j = 1; j <= n; j++) {
8:    B b = new B();
    }
9:  return arrB;
}

Call graph:
m0
  1: m1(mc)
  2: m2(2*mc)
m1
  4: new A
  5: m2(i) (i times)
m2
  6: new B
  8: new B

Step 1: Identifying allocation sites
[Figure: call graph of the running example]
m2 is called at least twice ⇒ 2 static traces for 6: new B and 8: new B
Distinguish program locations not only by a “method-local” control location but also by a call chain
Creation site (cs = π.l) = a path π from the MUA to a new statement at l.
Denotes a statement and a call stack.
Example: m0.2.m2.6, cs for statement new B with stack (m0.2).
Creation sites reachable from m0:
CSm0 = {m0.1.m1.4, m0.1.m1.5.m2.6, m0.1.m1.5.m2.8, m0.2.m2.6, m0.2.m2.8}
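
A minimal sketch of how creation sites could be enumerated for the running example. It is not the authors' Creation Sites Finder: the program model (the `Stmt` record and the `PROGRAM` map) is hand-written and hypothetical, and the walk assumes a call graph without recursion.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CreationSites {
    // A labelled statement: a call to `callee`, or an allocation when callee is null.
    record Stmt(int label, String callee) {}

    // Hypothetical hand-written model of the running example.
    static final Map<String, List<Stmt>> PROGRAM = new LinkedHashMap<>();
    static {
        PROGRAM.put("m0", List.of(new Stmt(1, "m1"), new Stmt(2, "m2")));
        PROGRAM.put("m1", List.of(new Stmt(4, null), new Stmt(5, "m2")));
        PROGRAM.put("m2", List.of(new Stmt(6, null), new Stmt(8, null)));
    }

    // Returns every creation site reachable from `method`, in program order.
    static List<String> find(String method) {
        List<String> out = new ArrayList<>();
        walk(method, method, out);
        return out;
    }

    // Depth-first walk; `prefix` is the call chain accumulated so far.
    private static void walk(String method, String prefix, List<String> out) {
        for (Stmt s : PROGRAM.get(method)) {
            String here = prefix + "." + s.label();
            if (s.callee() == null) {
                out.add(here); // allocation: the chain plus its label is a creation site
            } else {
                walk(s.callee(), here + "." + s.callee(), out);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(find("m0"));
    }
}
```

Because m2 is reached through two call chains, its two allocation statements each appear twice in the result, once per chain.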

Step 2: Finding invariants for creation sites
We need invariants involving variables in a path through several methods (appearing in the creation site)
Creation site invariants can be generated using local invariants and binding the calls
For the running example (methods m0, m1 and m2):
Im0(m0.1.m1.4) ≡ {k = mc ∧ 1 ≤ i ≤ k}
Im0(m0.1.m1.5.m2.6) ≡ {k = mc ∧ 1 ≤ i ≤ k ∧ n = i}
Im0(m0.1.m1.5.m2.8) ≡ {k = mc ∧ 1 ≤ i ≤ k ∧ n = i ∧ 1 ≤ j ≤ n}
Im0(m0.2.m2.6) ≡ {n = 2*mc}
Im0(m0.2.m2.8) ≡ {n = 2*mc ∧ 1 ≤ j ≤ n}

Step 3: Counting the number of solutions (in terms of MUA parameters)
Example: # of visits (in terms of m0 parameters) to m2.8 for the stack configuration [m0.1.m1.5]?
Recall: Im0(m0.1.m1.5.m2.8) ≡ {k = mc ∧ 1 ≤ i ≤ k ∧ n = i ∧ 1 ≤ j ≤ n}
Then # of visits in terms of mc (method m0 parameter)
= #{(k,i,j,n) | k = mc ∧ 1 ≤ i ≤ k ∧ n = i ∧ 1 ≤ j ≤ n}
= ½ mc² + ½ mc
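
The count for this creation site can be cross-checked by exhaustive enumeration. The sketch below is an illustration only (hypothetical class and method names); a symbolic polyhedral calculator would produce the polynomial directly rather than enumerate. Searching each variable over [0, mc] is enough because the invariant bounds all of them by mc.

```java
public class VisitCount {
    // Counts tuples (k, i, j, n) satisfying
    // k = mc, 1 <= i <= k, n = i, 1 <= j <= n.
    static long countSolutions(int mc) {
        long count = 0;
        for (int k = 0; k <= mc; k++)
            for (int i = 0; i <= mc; i++)
                for (int n = 0; n <= mc; n++)
                    for (int j = 0; j <= mc; j++)
                        if (k == mc && 1 <= i && i <= k && n == i && 1 <= j && j <= n)
                            count++;
        return count;
    }

    // The polynomial from the slide: mc^2/2 + mc/2.
    static long polynomial(int mc) {
        return ((long) mc * mc + mc) / 2;
    }

    public static void main(String[] args) {
        for (int mc = 0; mc <= 6; mc++) {
            System.out.println(mc + ": " + countSolutions(mc) + " vs " + polynomial(mc));
        }
    }
}
```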

Step 4: Transforming number of visits into memory consumption
We know how to approximate the number of visits to a creation site, but not dynamic memory allocations
Example: How much memory (in terms of m0 parameters) is allocated by m2.8 for the stack configuration [m0.1.m1.5]?
Recall: # of visits in terms of mc (method m0 parameter) = ½ mc² + ½ mc
Then memory allocated is size(B) * (½ mc² + ½ mc)
S(m,cs): computes an upper bound of the amount of memory allocated by one creation site, in terms of the parameters of m
Transforms # of visits into estimations of memory consumption
Special treatment for array allocations (new T[e1]..[en])
Treated as n nested loops: for(t1=0;t1<e1;t1++)…for(tn=0;tn<en;tn++) new RefT

Step 5: Summing up expressions
To predict the amount of memory allocated by a method m:
memAlloc(m) = computeAlloc(m, CSm)
For every creation site: get an invariant, compute the S function, and sum them up
memAlloc(m0) = S(m0,m0.1.m1.4) + S(m0,m0.1.m1.5.m2.6) + S(m0,m0.1.m1.5.m2.8) + S(m0,m0.2.m2.6) + S(m0,m0.2.m2.8)
= size(B) * (1/2 * mc² + 5/2 * mc) + size(B) * (1/2 * mc² + 5/2 * mc) + size(A) * mc
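
The coefficients of memAlloc(m0) can be sanity-checked by instrumenting the running example with counters instead of real allocations. This is a hypothetical test harness, not the authors' prototype. It assumes that `new B[n]` is charged as n element slots, following the nested-loop treatment of arrays in Step 4, and it reads the two size(B) terms of the result as the B objects and the B array slots respectively; that reading is an interpretation, not something the slide states.

```java
public class MemAllocCheck {
    // Visit counters, one per kind of allocation.
    static long newA, newB, newBSlots;

    static void m0(int mc) {
        m1(mc);
        m2(2 * mc);
    }

    static void m1(int k) {
        for (int i = 1; i <= k; i++) {
            newA++;   // 4: new A()
            m2(i);    // 5: m2(i)
        }
    }

    static void m2(int n) {
        newBSlots += n;                 // 6: new B[n], counted as n slots
        for (int j = 1; j <= n; j++) {
            newB++;                     // 8: new B()
        }
    }

    // Runs m0(mc) and returns {#A objects, #B objects, #B array slots}.
    static long[] run(int mc) {
        newA = newB = newBSlots = 0;
        m0(mc);
        return new long[] { newA, newB, newBSlots };
    }

    // mc^2/2 + 5*mc/2, the coefficient of both size(B) terms.
    static long poly(int mc) {
        return ((long) mc * mc + 5L * mc) / 2;
    }

    public static void main(String[] args) {
        long[] counts = run(10);
        System.out.println(counts[0] + " " + counts[1] + " " + counts[2] + " / " + poly(10));
    }
}
```

For this example the bound is exact because every invariant describes its iteration space precisely; in general the polynomial is only an upper bound.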

Experiments
We tested our prototype with some JOlden and JavaGrande benchmarks.

Example (Class.Method)                                  Param.      # Objs    Estimation   Err%
em3d: Bigraph.create(nN, nD)                            (10, 5)     348       348          0
                                                        (20, 6)     808       808          0
                                                        (100, 7)    4608      4608         0
                                                        (1000, 8)   52008     52008        0
(*) health (rec): Village.createVillage(l, lab, b, s)   2           55
                                                        4           935
                                                        8           240295
fft: FFT.test(n)                                        8           38        40           5
                                                        32          134       136          1.47
                                                        1024        4102      4104         0.05

The memAlloc column (the synthesized expressions) is not legible in this copy, except for the health entry: "Obtained by hand".

In general, when the amount of memory allocated is polynomial, we obtained accurate upper bounds
The main issue is finding good invariants…

Scoped-memory Management
Leveraging escape analysis, we can compute upper bounds of the memory escaping from and captured by a method (assuming a region per method)
memEscapes(m) = computeAlloc(m, escapes(m))
memCaptured(m) = computeAlloc(m, capture(m))
Useful for RTSJ
Predicting region sizes
Predicting how much of the memory allocated by the MUA will remain uncollected after its execution

Prototype Tool
[Figure: tool architecture. Input: Java application (bytecode or source). Components: Creation Sites Finder, Escape Analysis, Local Invariant Generator (Daikon, dynamic; JInvariant, static), Inductive Variables Analysis, Path Invariant Generator, Symbolic Polyhedral Calculator, Polynomials Evaluator. Intermediate data: creation sites, local invariants, path invariants, creation sites’ polynomials.]

Conclusions
A technique that computes nonlinear parametric upper bounds of dynamic memory allocation
An application to scoped-memory management
Use for estimating region size in RTSJ
Useful for embedded systems
Benchmark results are promising…
But many challenges remain…

Current and future work
Find a symbolic upper bound of the memory required to run a method (assuming scoped-memory management)
Improving precision of upper bounds under weaker invariants
We need to solve an optimization problem (symbolically)
[if (cond) then B1 else B2] statements, not capturing cond
The same for polymorphism
Dealing with recursion
Automated code generation for RTSJ
Using the memCaptured estimator to determine a region’s size

Extra Material
How we compute the path invariants
Memory required to run a method
Improving method precision
Counting (more formally)
Definition of function S()

On computing invariants
We need linear invariants involving variables in a path through several methods
Our technique can deal with some patterns of iteration beyond integer-counter-based ones
Strategy: we compute or annotate local invariants and bind them
For iterations over collections we introduce a virtual counter bounded by the collection size (i.e. {0 ≤ i ≤ c.size()})
We try to obtain invariants that only predicate over an inductive set of variables (roughly speaking, a subset of variables which is enough to count the number of visits of a given statement)
Currently we approximate inductive variable sets by combining a field-sensitive live variables analysis and manual adjustments

Step 2: Finding invariants for creation sites
We need linear invariants involving variables in a path through several methods
We compute or annotate local invariants and bind them
[Figure: call graph of the running example, annotated with the invariants below]
Example for cs m0.1.m1.5.m2.8:
Local invariants:
I(m0.1) ≡ {}
I(m1.5) ≡ {1 ≤ i ≤ k}
I(m2.8) ≡ {1 ≤ j ≤ n}
Bindings:
I(m0.1.m1) ≡ {k = mc}
I(m1.5.m2) ≡ {n = i}
Resulting creation site invariant:
Im0(m0.1.m1.5.m2.8) ≡ {k = mc ∧ 1 ≤ i ≤ k ∧ n = i ∧ 1 ≤ j ≤ n}
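
The binding step can be pictured as a conjunction built while walking the creation site from left to right. The sketch below is a string-level illustration with hypothetical names and hand-written tables, not the Path Invariant Generator itself; it assumes every call on the path has an entry in the bindings table and uses ASCII operators (`<=`, `&&`) in place of ≤ and ∧.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class PathInvariant {
    // Local invariants keyed by "method.label" (hand-written for the example).
    static final Map<String, String> LOCAL = Map.of(
        "m0.1", "",
        "m1.5", "1<=i<=k",
        "m2.8", "1<=j<=n");

    // Parameter bindings keyed by "caller.label.callee".
    static final Map<String, String> BINDING = Map.of(
        "m0.1.m1", "k=mc",
        "m1.5.m2", "n=i");

    // cs alternates method, label, method, label, ..., e.g. "m0.1.m1.5.m2.8".
    static String of(String cs) {
        String[] p = cs.split("\\.");
        List<String> conj = new ArrayList<>();
        for (int x = 0; x + 1 < p.length; x += 2) {
            String local = LOCAL.getOrDefault(p[x] + "." + p[x + 1], "");
            if (!local.isEmpty()) {
                conj.add(local);
            }
            if (x + 2 < p.length) {
                // A call: add the binding between actual and formal parameters.
                conj.add(BINDING.get(p[x] + "." + p[x + 1] + "." + p[x + 2]));
            }
        }
        return String.join(" && ", conj);
    }

    public static void main(String[] args) {
        System.out.println(of("m0.1.m1.5.m2.8"));
    }
}
```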

Computing invariants using Daikon

Memory required to run a method
Knowing the amount of memory captured by a method is not enough
We must consider the regions of the methods it calls
1. They are not in terms of MUA parameters
2. A method could be called several times with different arguments
[Figure: call graph of the running example]

Two maximization problems
In any run only one stack (path) configuration will be active (single-threading)
required(m0)(mc) = max(rsize(m0.1.m1.5,mc) + rsize(m0.1.m1.5.m2,mc), rsize(m0.2.m2,mc))
In one path a region can be created several times and have different sizes
memCaptured(m2) may vary depending on i in the path m0.1.m1.5.m2
For every path, we need an expression in terms of MUA parameters that maximizes the size of every region in the path
[Figure: call graph of the running example]

Maximizing a path
rsize(π.m, pmr) = Maximize memCaptured(m) subject to Imr(π)[P/pmr]
That is, find an expression in terms of the parameters of method mr that represents the maximum region size for method m, knowing that m will be called with stack π and that the variables in the call stack are constrained by the invariant Imr(π)
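
As a concrete reading of the maximization, consider the region of m2 on the path m0.1.m1.5.m2. The sketch below assumes, purely for illustration, that m2 captures its n B objects, so memCaptured(m2)(n) = size(B) * n; the slides do not state this expression. It then maximizes over the values allowed by {k = mc ∧ 1 ≤ i ≤ k ∧ n = i} by brute force, whereas the technique described here looks for a symbolic answer.

```java
public class RegionSize {
    // Assumed for illustration only: m2 captures n objects of size sizeB.
    static long memCapturedM2(long n, long sizeB) {
        return sizeB * n;
    }

    // Largest m2 region on path m0.1.m1.5.m2, in terms of mc,
    // subject to k = mc, 1 <= i <= k, n = i.
    static long rsize(int mc, long sizeB) {
        long best = 0;
        int k = mc;                  // binding k = mc
        for (int i = 1; i <= k; i++) {
            int n = i;               // binding n = i
            best = Math.max(best, memCapturedM2(n, sizeB));
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(rsize(10, 16));
    }
}
```

Under that assumption the maximum is reached at i = mc, so the symbolic result would be size(B) * mc.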

Improving technique precision
computeAlloc relies on having good invariants capturing “control-flow” decisions
Consider this example:
1: for(int i=1;i<=n;i++)
2:   if(t(i))
3:     a[i] = new Integer[2*i];   {1≤i≤n ∧ t(i)}
4:   else
5:     a[i] = new Integer[10];    {1≤i≤n ∧ ¬t(i)}
What happens if t(i) cannot be captured by the invariants?
The statements 3: and 5: will have the same invariant…
And the technique will sum their upper bounds, ignoring the impossibility of visiting both statements 3: and 5: in the same iteration!

Improving precision (cont…)
How do we cope with this problem?
Find a condition that maximizes the amount of memory allocated by the statements, knowing that they cannot be executed together
In the example we can add a new restriction over i
3: {1≤i≤n ∧ i>5}
5: {1≤i≤n ∧ i≤5}
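
The effect of the added restriction can be seen numerically. The sketch below uses hypothetical names and counts array elements rather than bytes. It compares the bound obtained by charging both branches in every iteration with the refined bound that charges only the larger branch, which is 2*i exactly when i > 5, and with the amount actually allocated for one arbitrary choice of t.

```java
import java.util.function.IntPredicate;

public class BranchBound {
    // Imprecise bound: both branches are charged in every iteration.
    static long summed(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += 2L * i + 10;
        }
        return total;
    }

    // Refined bound: charge only the larger branch (2*i when i > 5, else 10).
    static long maximized(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (i > 5) ? 2L * i : 10;
        }
        return total;
    }

    // Elements actually allocated for one concrete, hypothetical predicate t.
    static long actual(int n, IntPredicate t) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += t.test(i) ? 2L * i : 10;
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 8;
        System.out.println(actual(n, i -> i % 2 == 0) + " <= " + maximized(n) + " <= " + summed(n));
    }
}
```

Whatever t is, the refined bound stays above the actual allocation and never exceeds the summed one.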

Counting the number of solutions (more formally)
Given an invariant and a set of selected variables (parameters) we can get an expression in terms of those parameters
It represents the number of solutions to the invariant, fixing the values of those parameters
Counting the number of solutions for an invariant for a creation site cs = π.l over-approximates the number of visits of the new statement when the program stack is π
Example: Im0(m0.1.m1.5.m2.8) ≡ {k = mc ∧ 1 ≤ i ≤ k ∧ n = i ∧ 1 ≤ j ≤ n}
C(Im0(m0.1.m1.5.m2.8), {k,i,j,n}, {mc})(mc)
= #{(k,i,j,n) | k = mc ∧ 1 ≤ i ≤ k ∧ n = i ∧ 1 ≤ j ≤ n}
= ½ mc² + ½ mc
Theoretical framework:
Given a set of constraints φ such that var(φ) = P ∪ W, the number of solutions for φ fixing the values of P, C(φ,W,P)(p) = #{w | φ[W/w,P/p]}, is a function in terms of P.
For polytopes, # of integer solutions = # of integer points = Ehrhart polynomial

Function S (more formally)
C(Ics,W,P) approximates the number of visits of a creation site
S(I,P,cs): computes an upper bound of the amount of memory allocated by a creation site, in terms of P, using C(Ics,W,P)
Example for creation site m0.1.m1.5.m2.8 (new B):
Im0(m0.1.m1.5.m2.8) ≡ {k = mc ∧ 1 ≤ i ≤ k ∧ n = i ∧ 1 ≤ j ≤ n}
C(Im0(m0.1.m1.5.m2.8), {n,i,k,j}, {mc}) = ½ mc² + ½ mc
S(Im0(m0.1.m1.5.m2.8), {mc}, m0.1.m1.5.m2.8) = size(B) * C(Im0(m0.1.m1.5.m2.8), {n,i,k,j}, {mc}) = size(B) * (½ mc² + ½ mc)
Adaptations performed by S(I,P,cs)
new T(): Size(T) * C(I,W,P)
new T[e1]..[en]: Size(T) * C(I ∪ {0≤t1<e1} ∪ … ∪ {0≤tn<en}, W, P)
Simulating n nested loops: for(t1=0;t1<e1;t1++)…for(tn=0;tn<en;tn++) new T
This note was uploaded on 02/24/2012 for the course CSE 503 taught by Professor Davidnotikin during the Spring '11 term at University of Washington.