AICslides_ppt2010

AICslides_ppt2010 - Abstraction, Decomposition, Relevance...

Info iconThis preview shows page 1. Sign up to view the full content.

View Full Document Right Arrow Icon
This is the end of the preview. Sign up to access the rest of the document.

Unformatted text preview: Abstraction, Decomposition, Relevance Coming to Grips with Complexity in Verification Ken McMillan Microsoft Research Need for Formal Methods that Scale We design complex computing systems by debugging Design something approximately correct Fix it where it breaks (repeat) Verification consumes majority of resources in chip design Cost of small errors is huge ($500M for one error in 1990's) Security vulnerabilities have enormous economic cost Correct design is one of the grand challenges of computing Constructing proofs of systems of realistic scale is an overwhelming task Automation is clearly needed As a result, the primary task of design is actually verification The ugly truth: we don't know how to design correct systems Verification by logical proof seems a natural candidate, but... Model Checking Logical Specification G(p ) F q) yes! Model Checker p System Model p q q Counterexample A great advantage of model checking is the ability to produce behavioral counterexamples to explain what is going wrong. no! Temporal logic (LTL) ... Types of temporal properties We will focus on safety properties. Safety and reachability Transitions = state(s) steps Counterexample! Initial execution Bad state(s) States =Breadth-first searchvariables valuations of state I F Reachable state set Fixed point = reachable state set Breadth-first search Remove the "bug" I Safety property verified! Model checking is a little more complex than this, but reachability captures the essence for our purposes. Model checking can find very subtle bugs in circuits and protocols, but suffers from state explosion. F Symbolic Model Checking Avoid building state graph by using succinct representation for large sets Binary Decision Diagrams (Bryant) a 0 c 1 d 0 d c 0 0 0 d c 1 d b 0 1 0 d 0 1 0 1 d b 1 0 d c 1 1 d 0 0 0 1 0 0 0 1 0 0 0 1 1 1 1 1 Symbolic Model Checking Avoid building state graph by using succinct representation for large sets Multiprocessor Cache Coherence Protocol Abstract model host protocol S/F network other hosts Symbolic Model Checking detected very subtle bugs Allowed scalable verification, avoiding state explosion The Real World How do we cope with the complexity of real systems? Must deal with order 100K state holding elements (registers) State space is exponential in the number of registers Software complexity is greater To make model checking a useful tool for engineers, we had to find ways to cut this problem down to size. To do this, we apply three key concepts: decomposition, abstraction and refinement. Deep v. Shallow Properties A property is shallow if, in some sense, you don't have to know very much information about the system to prove it. Shallow property: Bus bridge never drops transactions Deep property: System implements x86 Our first job is to reduce a deep property to a multitude of shallow properties that we can handle by abstraction. Functional Decomposition Abstract model host protocol S/F network other hosts Shallow properties track individual transactions though RTL... ~30K lines of verilog CAM TABLES Abstraction Problem: verify a shallow property of a very large system Solution: Abstraction Extract just the facts about the system state that are relevant to the proving the shallow property. An abstraction is a restricted deduction system that focuses our reasoning on relevant facts, and thus makes proof easier. Relevance and refinement Problem: how do we decide what deductions are relevant? Is relevance even a well defined notion? A relevant deduction is one that is used in a simple proof of the desired property. Deductions used in the proof of special cases tend to be relevant to the overall proof. Relevance: Generalization principle: Proofs A proof is a series of deductions, from premises to conclusions Each deduction is an instance of an inference rule Usually, we represent a proof as a tree... Premises P5 P1 P2 C Conclusion C P1 P2 P3 P4 If the conclusion is "false", the proof is a refutation Inference rules The inference rules depend on the theory we are reasoning in Boolean logic Resolution rule: Linear arithmetic Sum rule: x1 y1 x2 y2 _ x1+x2 y1+y2 Inductive invariants Forms a barrier the state space intostates and statestates A Boolean-valued formula over the system bad Partitions between the initial two regions I : No transitions cross this way F Reachable states: complex Inductive invariant: simple! Invariants and relevance A predicate is relevant if it is used in a simple inductive invariant l1: l2: l3: l4: l5: l6: x = y = 0; while(*) x++, y++; while(x != 0) x--, y--; assert (y == 0); state variables: pc, x, y property: pc = l6 ) y = 0 inductive invariant = property + pc = l1 x = y Relevant predicates: pc = l1 and x = y Irrelevant (but provable) predicate: x 0 Three ideas to take away An abstraction is a restricted deduction system. A proof decomposition divides a proof into shallow lemmas, where shallow means "can be proved in a simple abstraction" Relevant abstractions are discovered by generalizing from particular cases. These lectures are divided into three parts, covering these three ideas. Abstraction What is Abstraction By abstraction, we mean something like "reasoning with limited information". The purpose of abstraction is to let us ignore irrelevant details, and thus simplify our reasoning. In abstract interpretation, we think of an abstraction as a restricted domain of information about the state of a system. Here, we will take a slightly broader view: An abstraction is a restricted deduction system We can think of an abstraction as a language for expressing facts, and a set of deduction rules for inferring conclusions in that language. The function of abstraction The function of abstraction is to reduce the cost of proof search by reducing the space of proofs. Abstraction Rich Deduction System Automated tool can search this space for a proof. An abstraction is a way to express our knowledge of what deductions may be relevant to proving a particular fact. Symbolic transition systems Proof by Inductive Invariant Many different choices have been made in practice. We will discuss a few... Abstraction languages Difference bounds Affine equalities Abstraction languages Example Let's try some abstraction languages on an example... l1: l2: l3: l4: l5: l6: x = y = 0; while(*) x++, y++; while(x != 0) x--, y--; assert (y == 0); Difference bounds Affine equalities Another example Let's try an even simpler example... l1: l2: l3: l4: l5: l6: x = 0; if(*) x++; else x--; assert (x != 0); Difference bounds Affine equalities Deduction systems Up to now, we have implicitly assumed we have an oracle that can prove any valid formulas of the forms: Localization abstraction Boolean Programs A Boolean program is defined by a set of such facts. Example l1: int x = *; In practice, we may add some disjunctions to our set l2: if(x > 0){ of allowed deductions, to avoid adding more predicates. l3: x--; l4: assert(x >= 0); l5: } Proof search In general, making the space of proofs smaller will make the proof search easier. Relevance and abstraction The key to proving a property with abstraction is to choose a small space of deductions that are relevant to the property. How do we choose... Predicates for predicate abstraction? System components for localization? Disjunctions for Boolean programs? In the section on relevance, we will observe that deductions that are relevant to particular cases tend to be relevant in general. This gives us a methodology of abstraction refinement. Next section: how to decompose big verification problems into small problems that can be proved with simple abstractions. Decomposition Proof decomposition Our goal in proof decomposition is to reduce proof of a deep property of a complex system to proofs of shallow lemmas that can be proved with simple abstractions. We will consider some basic strategies for decomposing a proof, and consider how they might affect the abstractions we need. We consider two basic categories of decomposition: Non-temporal: reasoning about system states Temporal: reasoning about sequences of states As we go along, we'll look at a system called Cadence SMV that implements these proof decompositions, and corresponding abstractions. Cadence SMV basics Type declarations typedef MyType 0..2; typedef MyArray array MyType of {0,1}; Variables and assignments v : MyType; init(v) := 0; next(v) := 1 - v; v = 0,1,0,1,0,... Temporal assertions p : assert G (v < 2); SMV can automatically verify this assertion by model checking. Case splitting The simplest way to breakdown a proof is by cases: Here is a temporal version of case splitting: p q :p q p q :p q :p q p q p q Temporal case splitting Here is a more general version of temporal case splitting: p1 p2 p3 p4 p5 ... v1 : I'm O.K. at time t. Temporal case splitting in SMV v : T; s : assert G p ; forall (i in T) subcase c[i] of s for v = i; ) p) ; c[1] : assert G (v=1 ) p) ; ... c[0] : assert G (v=0 Invariant decomposition In a proof using an inductive invariant, we often decompose the invariant into a conjunction of many smaller invariants that are mutually inductive: 1 2 T ) '1 1 2 T ) '2 1 2 T ) '1 '2 {1 2} s {1} {1 2} s {2} {1 2} s {1 2} To prove each conjunct inductive, we might use a different abstraction. Often we need to strengthen an invariance property with many additional invariants to make it inductive. Temporal Invariant Decomposition To prove a property holds at time t, we can assume that other properties hold at times less than t. The properties then hold by mutual induction. We can express this idea using the releases operator: "p fails before q fails" If no property is the first to fail, then all properties are always true. These premises can be checked with a model checker. Invariant decomposition in SMV This argument: can be expressed in SMV like this: p : assert G ...; q : assert G ...; using (p) prove q; using (q) prove p; Combine with case splitting p1 p2 p3 p4 p5 ... v1 : I'm O.K. at time t. Combining in SMV This argument: Can be expressed like this in SMV: w : T; p : assert G ...; forall(i in T) subcase c[i] of p for w = i; forall(in in T) using (p) prove c[i]; Abstractions Having decomposed a property into a collection of simpler properties, we need an abstraction to prove each property. Recall, an abstraction is just a restricted proof system. SMV uses a very simple form of predicate abstraction called a data type reduction. Data type abstraction Deduction rules Data type reductions in SMV typedef T 0..999; forall(i in T) p[i] : assert G ...; forall(i in T) using T -> {i} prove p[i]; A simple example An array of processes with one state variable each and a one shared variable. At each time, the scheduled process swaps its own variable with the shared variable. typedef T 0..999; typedef Q 0..2; v : Q a : array T of Q; sched : T; init(v) := {0,1}; forall(i in T) a[i] := {0,1}; next(a[sched]) := v; next(v) := a[sched]; A simple example We want to prove the shared variable always less than 2: p : assert G (v < 2); Split cases on most recent writer of shared variable: w : T; next(w) := sched; forall(i in T) subcase c[i] of p for w = i; Use mutual induction to prove the cases, with a data type reduction: forall(i) using p, T->{i} prove c[i]; Functional decompositions This combination of temporal case splitting and invariant decomposition can support a general approach to decomposing proofs of complex systems. Use case splitting to divide the proof into "units of work" or "transactions". For a CPU, this might be instructions, loads, stores, etc... For a router, units of work might be packets. Each transaction can assume all earlier transactions are correct. Since each unit of work uses only a small collection of system resources, a simple abstraction will prove each. Example : packet router input buffers output buffers Switch fabric Unit of work is a packet Packets don't interact Each packet uses finite resources allows abstraction to finite state Illustration: Tomasulo's algorithm Execute instructions in data flow order REG FILE VAL/TAG VAL/TAG VAL/TAG VAL/TAG TAGGED RESULTS OP,DST opra oprb OPS opra oprb OP,DST OP,DST EU EU EU opra oprb Data types in Tomasulo The following data types are used in Tomasulo REG TAG EU WORD (register file indices) (reservation station indices) (execution unit indices) (data words) Specification via reference model Reference model Reference model describes simple in-order instruction execution. Specifications Invariant properties specify values in the out-oforder system relative to the reference model. System Invariant decomposition Decompose into two lemmas Lemma 2: Correct results REG FILE VAL/TAG VAL/TAG VAL/TAG VAL/TAG TAGGED RESULTS OP,DST opra oprb OPS opra oprb OP,DST EU EU Lemma 1: Correct operands OP,DST EU opra oprb "Correct" means same value as reference model computes. Lemmas in SMV Lemma 1: The A operand in reservation station k is correct: forall (k in TAG) lemma1[k] : assert G rs[k].valid & rs[k].opra.valid -> rs[k].opra.val = aux[k].opra; Lemma 2: Values on result bus with tag i are correct: forall (i in TAG) lemma2[i] : assert G rb.tag = i & rb.valid -> rb.val = aux[i].res; Note: only two system signals specified in proof decomposition Case splitting in Tomasulo For each operand, split cases on the tag of the operand. VAL/TAG VAL/TAG VAL/TAG VAL/TAG REG FILE TAGGED RESULTS OP,DST opra oprb OPS opra oprb OP,DST OP,DST EU EU EU opra oprb Proving Lemma 1 To prove correctness of operands, split cases on tag and reg: forall (i in TAG; j in REG; k in TAG; d in WORD) subcase lemma1c[i][j][k][d] of lemma1[i] for rs[i].opra.tag = j & rs[i].tag = j & aux[i].opra = d; Then assume all results of earlier instructions are correct and reduce data types to just relevant values: forall (i in TAG; j in REG; k in TAG; d in WORD) using (lemma2), TAG->{i,k}, REG->{j}, WORD->{d}, EU->{} prove lemma1c[i][j][k][d]; Uninterpreted functions Verify Tomasulo for arbitrary EU function f(a,b). SPEC RESULTS f(a,b) REG FILE VAL/TAG VAL/TAG VAL/TAG VAL/TAG TAGGED RESULTS OP,DST opra oprb OPS opra oprb OP,DST OP,DST f(a,b) f(a,b) opra oprb (related: Burch, Dill, Jones, etc...) Case splitting for lemma 2 REG FILE VAL/TAG VAL/TAG VAL/TAG VAL/TAG k OP,DST i opra j OPS oprb f(a,b) f(a,b) f(a,b) OP,DST OP,DST opra oprb Result SMV can reduce the verification of the lemmas to finite-state model checking Max 25 state bits to represent abstract values Total verification time less than 4 seconds Arbitrary number of registers, reservation stations Arbitrary data word size and EU function (unbounded EU's requires one more lemma) Tomasulo implementation proved for Note the strategy we applied: 1) Case split into "units of work" (operand fetch, result comp) 2) Specify units of work relative to reference model 3) Choose abstraction for each unit of work. A more complex example REG VAL/TAG VAL/TAG FILE VAL/TAG VAL/TAG OP,DST RETIRED RESULTS PM d e c branch results opraoprb OPS opraoprb OP,DST OP,DST EU EURES EU BUF BUF BUF PC branch predictor opraoprb LSQ DM Unit of work = instruction Scaling problem Must consider up to three instructions: instruction we want to verify up to two previous instructions Resulting abstractions too complex Soln: break instruction execution into smaller units of work write more intermediate specifications manual invariant proof approx. 2MB (!) temporal decomposition and abstraction proof approx. 20 KB Compared to similar proof using manual inductive invariants... Cache coherence (Eiriksson 98) P P M INTF to net IO Nondeterministic abstract model Atomic actions Single address abstraction Verified coherence, etc... host Distributed cache coherence protocol host protocol host protocol S/F network 64 Mapping Protocol to RTL Abstract model host protocol S/F network other hosts Shallow properties track individual transactions though RTL... ~30K lines of verilog CAM TABLES Conclusions Proof decomposition means breaking down a proof into lemmas that can be proved in simpler deduction systems (abstractions). A functional decomposition approach divides the proof based on "units of work" or "transactions". This can be accomplished by two basic decomposition steps: Temporal case splitting Temporal invariant decomposition Since each unit of work uses few resources, this style of decomposition lends itself to proof with fairly primitive abstractions, such as data type reductions. Next section: more sophisticated abstractions and how we discover them. Relevance Relevance and Refinement Having decomposed a verification problem into shallow temporal lemmas, we need to choose an abstraction to prove each lemma. That is, we are looking for a small space of relevant deductions in which to search for a proof of a property. In this section, we will focus on the question of how we determine what is relevant and on how we apply this notion to the problem of abstraction refinement. Refinement is the process of choosing the deduction system that defines our abstraction. This is usually, but not always does as a process of gradual refinement of the abstraction, adding information until the property is proved. Basic framework Abstraction and refinement are proof systems spaces of possible proofs that we search prog. pf. of special case Abstractor pf. General proof system Incomplete special case Refiner cex. Specialized proof system Complete Refinement = augmenting abstractor's proof system to replicate proof of special abstractor's proof space to relevant facts. Narrow the case generated by refiner. Background Simple program statements (and their Hoare axioms) { ) } {[e/x]} {8 x } x := e havoc x {} {} {} A compound stmt is a sequence simple statements 1;...; k A CFG (program) is an NFA whose alphabet is compound statements. The accepting states represent safety failures. x = 0; while(*) x++; assert x >= 0; x := x +1 x := 0 [x<0] Hoare logic proofs Write H(L) for the Hoare logic over logical language L. A proof of program C in H(L) maps vertices of C to L such that: the initial vertex is labeled True the accepting vertices are labeled False every edge is a valid Hoare triple. x := x +1 x := 0 {True} {x 0} [x<0] {False} This proves the failure vertex not reachable, or equivalently, no accepting path can be executed. Path reductiveness An abstraction is path-reductive if, whenever it fails to prove program C, it also fails to prove some (finite) path of program C. Example, H(L) is path-reductive if L is finite L closed under disjunction/conjunction Path reductiveness allows refinement by proof of paths. In place of "path", we could use other program fragments, including restricted paths (with extra guards), paths with loops, procedure calls... We will focus on paths for simplicity. Example x = y = 0; while(*) x++; y++; while(x != 0) x--; y--; assert (y == 0); x:=0;y:=0 x:=x+1; y:=y+1 [x 0 ]; := -1 ; := -1 [x=0]; [y 0 ] Try to prove with predicate abstraction, with predicates {x=0,y=0} Predicate abstraction with P is Hoare logic over the Boolean combinations of P Unprovable path {True} {True} x = y = 0; x++; y++; x++; y++; [x!=0]; x--; y--; [x!=0]; x--; y--; [x == 0] [y != 0] {x=0 = y= 0} {x x= {x = y y} 0} {x={x y} 0 } {x 0y = y 0 } x {x = y} {True} {x = y} Augment P with new predicate x=y. PA can replicate proof. Ask refiner to prove it! Cannot prove with PA({x=0,y=0}) {x = y} {True} {x = y} {True} {x = y} {x = y} {False} {False} {True} Abstraction refinement: Path unprovable to abstraction Refiner proves Abstraction replicates proof Path reductiveness Path reductive abstractions can be characterized by the path proofs they can replicate Predicate abstraction over P replicates all the path proofs over Boolean combinations of P. The Boolean program abstraction replicates all the path proofs over the cubes of P. For these cases, it is easy to find an augmentation that replicates a proof (if the proof is QF). In general, finding the least augmentation might be hard... But where do the path proofs come from? Refinement methods Strongest postcondition (SLAM1) Weakest precondition (Magic,FSoft,Yogi) Interpolant methods Feasible interpolation (BLAST, IMPACT) Bounded provers (SATABS) Constraint-based (ARMC) Local proof Interpolation Lemma [Craig,57] If A B = false, there exists an interpolant A' for (A,B) such that: A A' A' ^B = false A' 2 L(A) \ L(B) Example: A = p q, B = q r, A' = q In many logics, an interpolant can be derived in linear time from a refutaion proofs of A ^B. Interpolants as Floyd-Hoare proofs True {True} xx=y; x1= y = y0 1. Each formula implies the next 2. Each is over common symbols of prefix and suffix 3. Begins with true, ends with false x1=y0 {x=y} y1=y0+1 y++; y++ x1= 1 [x[x=y] == y] y1>x1 {y>x} False {False} Proving in-line programs SSA sequence Prover proof Hoare Proof Interpolation ) ) ) Local proofs and interpolants TRUE x1=y0 x=y; x1 y0 y++; y1=y0+1 y0 x1 x1 y y0+1 y1 x1+1 y1 y1x1 [y x] 10 FALSE FALSE y1 y0+ 1 y1 x1+ 1 x1+1 y1 This is an example of a local proof... Definition of local proof {x1,y0} y0 x1 {x1,y0,y1} y1 {x1,y1} y1x1 x1=y0 x1 y0 y1=y0+1 y0+1 y1 x1+1 y1 deduction "in scope" here Local proof: Every deduction written in vocabulary of some frame. vocabulary of framerange of variables occurs in scope of variable = = set frames it "in scope" Forward local proof TRUE {x1,x0} x1=y0 x1 y0 x1 y y1=y0+1 {x1,y0,y1} y0+1 y1 x1+1 y1 x1+1 y1 {x1,y1} y1x1 10 FALSE FALSE For a forward local proof, the (conjunction of) assertionsframe Forward local proof: each deduction can be assigned a crossing frame boundary is arrows go forward. such that all the deduction an interpolant. Reverse local proof FALSE TRUE {x1,x0} x1=y0 10 x1 y0 : y0+1 x1 y1=y0+1 {x1,y0,y1} y0+1 y1 : y1 x1 {x1,y1} y1x1 FALSE For a reverse local proof, the negation of assertions a frame Reverse local proof: each deduction can be assigned crossing frame boundary is arrows go backward. such that all the deduction an interpolant. y0+1 x1 General local proof TRUE {x1,y0} x1=3y0 x1 0 x12 ) x10 x1 2 {x1} x1 0 {x1} 1 x1 10 FALSE FALSE General local proof:proof, the interpolants contain implications. For a general local each deduction can be assigned a frame, but deduction arrows can go either way. Refinement methods Strongest postcondition (SLAM1) Weakest precondition (Magic,FSoft,Yogi) Interpolant methods Feasible interpolation (BLAST, IMPACT) Bounded provers (SATABS) Constraint-based (ARMC) Local proof Refinement with SP The strongest post-condition of w.r.t. progam , written SP(,), is the strongest such that {} {}. The SP exactly characterizes the states reachable via . Refinement with SP: True {True} xx=y; x1= y = y0 x1=y0 {x=y} y1=y0+1 y++; y++ y1>x1 {y=x+1} x1= 1 [x=y] [yx] False {False} This is viewed as symbolic execution, but there is a simpler view. {} havoc x Syntactic SP computation: {} {} x := e { } {9 v [v/x] x = e[v/x]} {9 x } SP as local proof Order the variables by their creation in SSA form: Refinement with SP corresponds to local deduction with these rules: = [/] FALSE x0 y0 x1 y1 x max. in . We encode havoc specially in the SSA: where havoc x x = Think of the i's as implicitly existentially quantified SP example TRUE {x1,y0} y0 = 1 x1=y0 y1=y0+1 {x1,y0,y1} 91 (x1=1 y1 = 1+1) y1 = x0 + 1 y1x1 {x1,y1} y1 1 1 + 1 1 FALSE FALSE Orderinguserewrites ensures forward local proof. We can of quantifier elimination if our logic supports it. The (conjunction of) assertions crossing frame boundary is an interpolant with i's existentially quantifed. x1 = 1 91= y0 x1 (x1=1 y0 = 1) y1 = 1 + 1 Witnessing quantifiers What happens if we can't eliminate the quantifiers? We can witness them by adding auxiliary variables to the program. Refinement with SP: havoc yy havoc xx=y; x1= y 1 = y = y0 x=y y1=y0+1 y++; y++ {x=1(x=1 y = 1+1)} {91 y = 1+1} x1= x] [x=y] [y 1 False {False} Will the auxiliary variables get out of control? True {True} {x=1(x=1 y = 1)} {91 y = 1} Predicate abstraction can't reproduce this proof! Proof reduction By dropping unneeded inferences, we can weaken the interpolant and eliminate irrelevant predicates. TRUE {x1,y0} y0 = 1 x1=y0+1 z1=x1+1 {x1,y0,z1} x1y0 y0 z1 x1 1 1 + 1 1 FALSE FALSE Newton does this to eliminate irrelevant predicates. 91 (x1=1 y1 = 1+1 1+1) z1=1+1) 1 1+2 x1 = 1 + 1 91 (x1=1+1 y0 = 1) z1 = 1 + 2 {x1,y0,z1} Refinement methods Strongest postcondition (SLAM1) Weakest precondition (Magic,FSoft,Yogi) Interpolant methods Feasible interpolation (BLAST, IMPACT) Bounded provers (SATABS) Constraint-based (ARMC) Local proof Refinement with WP The weakest (liberal) pre-condition of w.r.t. progam , written WP(,), is the weakest such that {} {}. The WP characterizes the states may not reach :. Refinement with WP: True {True} xx=y; x1= y = y0 x1=y0 {x < y+1} y1=y0+1 y++; y++ y1>x1 {x<y} x1= 1 [x=y] [yx] False {False} This can also be viewed as local proof. Syntactic WP computation: { ) } {[e/x]} {8 x } x := e havoc x {} {} {} WP as local proof Order the variables by their creation in SSA form: Refinement with WP corresponds to local deduction with these rules: = [/] FALSE x0 y0 x1 y1 x min. in . We encode havoc specially in the SSA: where havoc x x = Think of the i's as implicitly existentially quantified WP example FALSE TRUE {x1,x0} y0 = 1 x1=y0 y1=y0+1 {x1,y0,y1} y1x1 {x1,y1} FALSE No need of quantifier elimination in frame proof. Ordering for rewrites ensures reversethis example. The negation of assertions crossing local boundary (with i's existentially quantified) is an interpolant. 1 + 1 1 y0+1 y0 : y0+1 x1 y0+1 x1 : y1 x1 Observations WP allows proof reductions, just like SP We are allowed to mix forward and backward rewriting (SP and WP) Result is a general local proof, which we can interpolate. However, forward rewriting may have advantages for Boolean programs, since it always produces conjunctions. Abstracting paths Removing irrelevant assignments and constraints can prevent SP and WP from introducing irrelevant predicates. Proof using SP... havoc b; c := b; {True} After quantifier elimination... 1 = b; {b 1 {b == c} c = 1} havoc b; a := 3c + a; 2 = a; [a < b]; [c < a] {b{b = c} c = 1 a = 41} {b= 1 c = 1 a = 2} = 1 {41< 1 = 1 a = 2} 1 {a {2 <<c} c c = 1 a = 41} irrelevant! {False} Abstracting paths very important to keep SP and WP simple Quantifier divergence SP and WP introduce quantifiers Quantifiers can diverge as we consider longer paths through loops Example program: a = 1; b = 0; while (*) { a : = 3a^3 b; if (a > 0) b = b + a; } assert b >= 0; (Complicated, but irrelevant) Quantifier divergence Proof using SP... a:= 1; b := 0; a :=havoc a; 3a3 - b; [a > 0]; [a > 0]; b := b + a; b := b + a; a :=havoc a; 3a3 - b; [a > 0]; [a > 0]; b := b + a; b := b + a; [b < 0] {True} After quantifier elimination... {b = 0} {a = 1 b = 0} irrelevant! This predicate is sufficient for PA. {1 1} b = 1} {b > 0 irrelevant! {1 > 0 2 > 0 b = 1 + 2} {b 2} Skolem constants diverging! {False} QE is difficult, but necessary for loops with SP and WP. Refinement quality Refinement with SP and WP is incomplete May exists a refinement that proves program but we never find one These are weak proof systems that tend to yield low-quality proofs Example program: x = y = 0; while(*) x++; y++; while(x != 0) x--; y--; assert (y == 0); invariant: {x == y} Execute the loops twice {True} x = y = 0; x++; y++; x++; y++; [x!=0]; x--; y--; [x!=0]; x--; y--; [x == 0] [y != 0] {y = 0} {x = y} {x = y} {y 1} {x = y} {y 2} This simple proof contains invariants for both loops Same result with WP! Refine with SP (and proof reduction) {x = y} {y 1} {y = 0} {x = y} {False} Predicates diverge as we unwind A practical method must somehow prevent this kind of divergence! We need refinement methods that can generate simple proofs! Refinement methods Strongest postcondition (SLAM1) Weakest precondition (Magic,FSoft,Yogi) Interpolant methods Feasible interpolation (BLAST, IMPACT) Bounded provers (SATABS) Constraint-based (ARMC) Local proof Bounded Provers [SATABS] Define a (local) proof system Can contain whatever proof rules you want For example, number of distinct predicates after dropping subscripts May restrict to forward or reverse proofs Define a cost metric for proofs Exhaustive search for lowest cost proof = [/] x max. in FALSE . Allow simple arithmetic rewriting. Loop example cost: 2N TRUE x0 = 0 y0 = 0 x1=x0+1 y1=y0+1 x2=x1+1 y2=y1+1 ... x0 = y0 x0= 0 y0 = 0 x1 = = y0+1 x1 1 y1 = 1 x1 = y1 x2 = = y1+1 x2 2 y2 = 2 x2 = y2 ... ... x1=1 y1 = 1 x1= y1 x0 = y0 cost: 2 TRUE x2=2 y2 = 2 ... x2= y2 ... Lowest cost proof is simpler, avoids divergence. Lowest-cost proofs Lowest-cost proof strongly depends on choice of proof rules This is a heuristic choice Rules might include bit vector arithmetic, arrays, etc... May contain SP or WP (so complete for refuting program paths) Hope is that lowest-cost proof is short Require fixed truth value for all atoms (refines restricted case) However, heuristically, will diverge less often than SP or WP. Search for lowest cost proof may be expensive! Divergence is still possible when a terminating refinement exists Refinement completeness Refinement completeness: if, within the abstraction framework, an abstraction exists that proves a given program safe, then refinement eventually produces such an abstraction. Example: predicate abstraction over LRA. If there exists an inductive invariant proving safety in QFLRA, then the predicate set eventually contains the atomic predicates of such an invariant. For a stratified language {Li}, when the Li-bounded local proof system is complete for consequence generation in Li. Under certain conditions, for bounded local saturation provers, including first-order superposition calculus provers. So we know that local provers can avoid divergence. The key question is whether the cost of finding the best proofs is justified in practice. Some kinds of bounded provers can achieve refinement completeness: Refinement methods Strongest postcondition (SLAM1) Weakest precondition (Magic,FSoft,Yogi) Interpolant methods Feasible interpolation (BLAST, IMPACT) Bounded provers (SATABS) Constraint-based (ARMC) Local proof Constraint-based interpolants Farkas' lemma: If a system of linear inequalities is UNSAT, there is a refutation proof by summing the inequalities with non-neg. coefficients. Farkas' lemma proofs are local proofs! 00 x0 0 0 y0 x1x0+ 1 z1x1-1 y0+1y1 y1+ 1x1 1 (x0 0) 1 (0 y0) x0 y0 1 (x1x0+ 1) 0 (z1x1-1) x1 y0 1 (y0+1y1) 1 (y1+1x1) 10 x0 y0 Intermediate sums are the interpolants! Coefficients can be found by solving an LP. Interpolants can be controlled with additional constraints. . x1 y0 10 Refinement methods Strongest postcondition (SLAM1) Weakest precondition (Magic,FSoft,Yogi) Interpolant methods Feasible interpolation (BLAST, IMPACT) Bounded provers (SATABS) Constraint-based (ARMC) Local proof Interpolation of non-local proofs In some logics, we can translate a non-local proof into interpolants. propositional logic linear arithmetic (integer or real) equality, function symbols, arrays In most case, QF formulas yield QF interpolants, solvingthe quantifier divergence problem. use of the array theory is limited can be accomplished with standard decision procedures This is an advantage, since searching for a non-local proof is easier Non-local to local We can think of interpolation as translating a non-local proof into a local proof. 00 x0 y0 x2 y0-2 x1x0-1 x1 y0-1 x2 x0-2 x2 y0-2 0-2 0 -2 0 -2 x1 y0-1 Non-local! x0 y0 Interpolation re-orders the sum to make the proof local. x2 x1-1 y0x2 Interpolation makes proof search easier, but can substantially reduce the cost of the proof, possibly leading to divergence, Refinement methods Strongest postcondition (SLAM1) Weakest precondition (Magic,FSoft,Yogi) Interpolant methods Feasible interpolation (BLAST, IMPACT) Bounded provers (SATABS) Constraint-based (ARMC) Local proof These methods can be viewed as different strategies to search for a local proof, trading off the cost of the search and the quality of the interpolants. Basic Framework Abstraction and refinement are proof systems spaces of possible proofs that we search prog. pf. of special case Abstractor pf. General proof system Incomplete special case Refiner cex. Specialized proof system Complete Degree of specialization can strongly affect refinement quality Predicate abstraction In predicate abstraction, we typically build a graph in which the vertices are labeled with minterms over P (abstract states). The proof is complete when it folds into a Hoare logic proof of C. An unprovable path looks like this: 1 1 2 2 3 3 4 4 5 5 no individual transition refutable To refine, translate to restricted program path: [1 ];1 [2 ];2 [3 ];3 [4 ];4 [5 ];5 Any proof of this restricted path rules out the original, but... Overspecialization Restricting paths can affect the quality of the refinement. [x=0] x=0 [x=0] x++ [x=1] x++ [x=2] x++ [x 0 ,1 ,2 ] [x < 0] {0 x} Restricting paths can make the refiner's job easier. However, it also skews the proof cost metric. This can cause the refiner to miss globally optimal proofs, leading to divergence. {True} {0 x} Restricted path, from PA({x=0,x=1,x=2}) Lowest-cost proof leads to divergence! Lowest-cost proof without restriction. {x=3} {0 x} {False} {False} Synergy algorithm The Synergy algorithm produces a very local refinement by strongly restricting the refinement path. Shortest infeasible prefix 1 1 2 2 3 3 4 4 5 Restrict to concrete states. 5 {} 4 3: Refinement only here! ...splits just one state! 3 Synergy produces small incremental refinements at low cost. However, extreme specialization can reduce quality of refinements leading to divergence for loops. Summary Abstraction and refinement can be thought of as two proof systems: Abstractor is general, but incomplete Refiner is specialized, but complete. Refiner generates path proof Abstractor replicates proof Quality of proof depends on proof system, search strategy Low refinement quality leads to divergence Different refines represent different cost/quality trade-offs Specialization reduces cost, but also refinement quality. In general, the more the refiner sees, the better the refinement Abstraction is path-reductive is, when it fails, it fails for one path. Existing refiners can be viewed as local proof systems Abstractors vary in the refinement proof goals generated Three ideas to take away An abstraction is a restricted deduction system. A proof decomposition divides a proof into shallow lemmas, where shallow means "can be proved in a simple abstraction" Relevant abstractions are discovered by generalizing from particular cases. By applying these three ideas, we can increase the degree of automation in proofs of complex systems. ...
View Full Document

Ask a homework question - tutors are online