umi-umd-5736

ABSTRACT

Title of dissertation: Language-based Enforcement of User-defined Security Policies As Applied to Multi-tier Web Programs

Nikhil Swamy, Doctor of Philosophy, 2008

Directed by: Professor Michael Hicks, Department of Computer Science

Over the last 35 years, researchers have proposed many different forms of security policies to control how information is managed by software, e.g., multi-level information flow policies, role-based or history-based access control, data provenance management, etc. A large body of work in programming language design and analysis has aimed to ensure that particular kinds of security policies are properly enforced by an application. However, these approaches typically fix the style of security policy and overall security goal, e.g., information flow policies with a goal of noninterference. This limits the programmer's ability to combine policy styles and to apply customized enforcement techniques while still being assured the system is secure.

This dissertation presents a series of programming-language calculi each intended to verify the enforcement of a range of user-defined security policies. Rather than bake in the semantics of a particular model of security policy, our languages are parameterized by a programmer-provided specification of the policy and enforcement mechanism (in the form of code). Our approach relies on a novel combination of dependent types to correctly associate security policies with the objects they govern, and affine types to account for policy or program operations that include side effects. We have shown that our type systems are expressive enough to verify the enforcement of various forms of access control, provenance, information flow, and automata-based policies. Additionally, our approach facilitates straightforward proofs that programs implementing a particular policy achieve their high-level security goals.
We have proved our languages sound and we have proved relevant security properties for each of the policies we have explored. To our knowledge, no prior framework enables the enforcement of such a wide variety of security policies with an equally high level of assurance.

To evaluate the practicality of our solution, we have implemented one of our type systems as part of the LINKS web-programming language; we call the resulting language SELINKS. We report on our experience using SELINKS to build two substantial applications, a wiki and an on-line store, equipped with a combination of access control and provenance policies. In general, we have found the mechanisms SELINKS provides to be both sufficient and relatively easy to use for many common policies, and that the modular separation of user-defined policy code permitted some reuse between the two applications.

Language-based Enforcement of User-defined Security Policies As Applied to Multi-tier Web Programs

NIKHIL SWAMY

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy, 2008

Advisory Committee:
Professor Michael W. Hicks, Chair/Advisor
Professor Samrat Bhattacharjee
Professor Jeffrey S. Foster
Professor Jeffrey W. Herrmann, Dean's representative
Professor Jonathan Katz

© Copyright Nikhil Swamy 2008

Acknowledgments

This dissertation would have been impossible, and my experience of graduate school would have been much diminished, were it not for the help, guidance, support, and friendship of several people. I am most indebted to my advisor, Mike Hicks. Despite several false starts, and through the bleak middle years of graduate school that saw several research projects fizzle out for one reason or another, Mike's constant encouragement was a source of much-needed confidence.
Watching Mike devote himself to all he does, with what appears to always be the most natural ease, has been nothing short of inspirational. Mike has been a teacher, a role model, a confidant, an indulgent sparring partner in numerous arguments, and, above all else, a very good friend.

I am also grateful to all the members of PLUM, the programming languages research group at the University of Maryland. To Jeff Foster (and Mike again), for making PLUM a creative, productive, supportive, and, basically, just a really fun research group. To Brian Corcoran, whose help with SELINKS has been invaluable. And, for way too many things to mention, to Iulian Neamtiu, Polyvios Pratikakis, Saurabh Srivastava, Mike Furr, Pavlos Papageorge, Nick Petroni, David Greenfieldboyce, Martin Ma, Chris Hayden, Khoo Yit Phang, Elnatan Reisner, David An, and Eric Hardisty.

Finally, I am deeply obliged to my family. They have patiently tolerated the garbled explanations of my work that I have grudgingly offered from time to time, and have always responded with quiet encouragement. Their unquestioning support has been, for me, an ever-present exhortation to try to do what's right, never fearing for the outcome, or being mindful of the fruits, should there be any, of my work.

Table of Contents

List of Figures
1 Introduction
  1.1 Overview of our Approach
    1.1.1 A Brief Primer on Security Typing
    1.1.2 Enforcing User-defined Security Policies
    1.1.3 Building Secure Web Applications
  1.2 Summary of Contributions
2 Enforcing Purely Functional Policies
  2.1 FABLE: System F with Labels
    2.1.1 Syntax
    2.1.2 Example: A Simple Access Control Policy
    2.1.3 Typing
    2.1.4 Operational Semantics
    2.1.5 Soundness
  2.2 Example Policies in FABLE
    2.2.1 Access Control Policies
    2.2.2 Dynamic Provenance Tracking
    2.2.3 Static Information Flow
    2.2.4 Dynamic Information Flow
  2.3 Composition of Security Policies
  2.4 Concluding Remarks
3 Enforcing Stateful Policies for Functional Programs
  3.1 Overview
  3.2 AIR: Automata for Information Release
    3.2.1 Syntax of AIR, by Example
    3.2.2 A Simple Stateful Policy in AIR
  3.3 A Programming Model for AIR
  3.4 Syntax and Semantics of λAIR
    3.4.1 Syntax
    3.4.2 Static Semantics
    3.4.3 Dynamic Semantics
  3.5 Translating AIR to λAIR
    3.5.1 Representing AIR Primitives
    3.5.2 Translating Rules in an AIR Class
    3.5.3 Programming with the AIR API
    3.5.4 Correctness of Policy Enforcement
  3.6 Encoding FABLE in λAIR
    3.6.1 S_FABLE: A λAIR Signature for FABLE
  3.7 Concluding Remarks
4 Enforcing Policies for Stateful Programs
  4.1 FLAIR: Extending λAIR with References
  4.2 A Reference Specification of Information Flow
    4.2.1 Information Flow for Core-ML
  4.3 Tracking Indirect Flows in FLAIR using Program Counter Tokens
    4.3.1 S^0_Flow: A Sketch of a Solution
    4.3.2 Example Programs that use S^0_Flow
  4.4 Enforcing Static Information Flow in FLAIR
    4.4.1 S_Flow: A Signature for Static Information Flow
    4.4.2 Simple Examples using S_Flow
    4.4.3 Examples with Higher-order Programs
    4.4.4 Security Theorem
  4.5 Concluding Remarks
5 Enhancing LINKS with Security Typing
  5.1 Overview
  5.2 An Introduction to LINKS
    5.2.1 Programming in LINKS
    5.2.2 Fine-grained Security with Links
  5.3 SELINKS Basics: Enforcing Policies with Static Security Labels
    5.3.1 Defining a Language of Security Labels
    5.3.2 Protecting Resources with Labels
    5.3.3 Interpreting Labels via the Enforcement Policy
  5.4 Enforcing Policies with Dynamic Labels
    5.4.1 Dependently Typed Functions
    5.4.2 Dependently Typed Records
  5.5 Refining Polymorphism in SELINKS
    5.5.1 Phantom Variables: Polymorphism over Type-level Terms
    5.5.2 Restricting Polymorphism by Stratifying Types into Kinds
  5.6 Expressiveness of Policy Enforcement in SELINKS
    5.6.1 Type-level Computation
    5.6.2 Refining Types with Runtime Checks
  5.7 Concluding Remarks
6 Building Secure Multi-tier Applications in SELINKS
  6.1 Application Experience with SELINKS
    6.1.1 SEWiki
    6.1.2 SEWineStore
  6.2 Efficient Cross-tier Enforcement of Policies
  6.3 Implementation of Cross-tier Enforcement in SELINKS
    6.3.1 User-defined Type Extensions in PostgreSQL
    6.3.2 Compilation of SELinks to PL/pgSQL
    6.3.3 Invoking UDFs in Queries
  6.4 Experimental Results
    6.4.1 Configuration
    6.4.2 Results
  6.5 Concluding Remarks
7 Related Work
  7.1 Security-typed Languages
    7.1.1 FlowCaml
    7.1.2 Jif
  7.2 Extensible Programming Languages
    7.2.1 Classic Work on Extensible Programming Languages
    7.2.2 Extensible Type Systems
    7.2.3 Extensions Based on Haskell's Type System
  7.3 Dependent Typing
    7.3.1 Dependently Typed Proof Systems
    7.3.2 Dependently Typed Programming Languages
    7.3.3 Dependent Types for Security
  7.4 Security Policies
    7.4.1 Security Automata
    7.4.2 Declassification Policies
    7.4.3 Data Provenance Tracking
  7.5 Web Programming
    7.5.1 Label-based Database Security
  7.6 Other Technical Machinery
8 Looking Ahead
  8.1 Assessment of Limitations
  8.2 Automated Enforcement of Policies
    8.2.1 Transforming Programs to Insert Policy Checks
    8.2.2 Inferring and Propagating Label Annotations
    8.2.3 Semi-Automated Proofs of Policy Correctness
  8.3 Enhancements to Support Large-scale Policies
    8.3.1 Interfacing with Trust Management Frameworks
    8.3.2 Administrative Models for Policy Updates
    8.3.3 Administrative Models for Policy Composition
  8.4 Defending Against Emerging Threats to Web Security
  8.5 Concluding Remarks
9 Conclusions
A Proofs of Theorems Related to FABLE
  A.1 Soundness of FABLE
  A.2 Correctness of the Access Control Policy
  A.3 Dynamic Provenance Tracking
  A.4 Correctness of the Static Information-flow Policy
  A.5 Completeness of the Static Information-flow Policy
B Proofs of Theorems Related to λAIR
  B.1 Soundness of λAIR
  B.2 Proof of Correct API Usage
C Proofs of Theorems Related to FLAIR
  C.1 Soundness of FLAIR
  C.2 Correctness of Static Information Flow
    C.2.1 Affinity of Program Counters and Capabilities
    C.2.2 Proving Noninterference using FLAIR2
Bibliography

List of Figures

2.1 Syntax of FABLE
2.2 Syntactic shorthands
2.3 Enforcing a simple access control policy
2.4 Static semantics of FABLE
2.5 Dynamic semantics of FABLE
2.6 Similarity of expressions under the access control policy
2.7 Enforcing a dynamic provenance-tracking policy
2.8 A logical relation that relates terms of similar provenance (selected rules)
2.9 Enforcing an information flow policy
2.10 A dynamic information flow policy and a client that uses it
2.11 A type-based composability criterion
3.1 Syntax of AIR
3.2 A stateful information release policy in AIR
3.3 Programming with an AIR policy
3.4 Syntax of λAIR
3.5 Static semantics of λAIR (Selected rules)
3.6 Dynamic semantics of λAIR
3.7 Translating an AIR rule to a base-term function in a λAIR signature
3.8 A λAIR program that performs a secure information release
3.9 S_FABLE: An embedding of FABLE in λAIR
4.1 Syntax and semantics of FLAIR (Extends λAIR with references)
4.2 Core-ML syntax and typing
4.3 S^0_Flow: An attempt to statically enforce information flow in FLAIR
4.4 Attempting to track effects in some simple example programs
4.5 S_Flow: A FLAIR signature to statically enforce an information flow policy
4.6 Translating a simple Core-ML program to FLAIR
4.7 Tracking effects using S_Flow
4.8 Higher-order programs that contain secure indirect flows
4.9 Higher-order programs with insecure indirect flows
5.1 An overview of the execution model of LINKS
5.2 A LINKS program that renders the contents of an employee database in a web browser
5.3 Enforcing a fine-grained access control policy in LINKS
5.4 An example illustrating the syntax of singleton label types in SELINKS
5.5 Protecting a socket interface with simple security labels
5.6 An enforcement policy to restrict data sent on a socket
5.7 An enforcement policy for sockets using dependently typed functions
5.8 An enforcement policy for sockets using dependently typed records
5.9 A policy to protect salary data in an employee database
5.10 A policy to construct unforgeable user credentials
5.11 An example program that enforces a policy in a database query
5.12 A lattice-based policy for integer addition
5.13 A lattice-based policy for integer addition, with phantoms
5.14 Extending FABLE with phantom variables
5.15 Example illustrating how client code can violate its abstractions
5.16 Refining a type based on the result of a runtime check
6.1 The representation of security labels in SEWIKI
6.2 A document model and enforcement policy for SEWIKI
6.3 A function that performs a keyword search on the document database
6.4 Cross-tier Policy Enforcement in SELINKS
6.5 PostgreSQL User-Defined Types
6.6 Generated PL/pgSQL code for access
6.7 SQL query generated for getSearchResults
6.8 Test platform summary
6.9 Throughput of SELINKS queries under various configurations
8.1 A cross-site scripting attack on SEWIKI
A.1 Enforcing a simple access control policy
A.2 Similarity of expressions under the access control policy
A.3 Enforcing a dynamic provenance-tracking policy
A.4 A logical relation for dynamic provenance tracking (Part 1)
A.5 A logical relation for dynamic provenance tracking (Part 2)
A.6 Enforcing a static information flow policy
A.7 Semantics of FABLE2
A.8 Core-ML syntax and typing (Functional fragment)
A.9 Translation from a Core-ML derivation D to FABLE
B.1 Static semantics of λAIR (Typing judgment)
B.2 Static semantics of λAIR (Type equivalence and kinding judgment)
B.3 Dynamic semantics of λAIR
B.4 Translating an AIR policy to a λAIR signature (Part 1)
B.5 Translating an AIR policy to a λAIR signature (Part 2)
B.6 Trace acceptance condition defined by an AIR class
C.1 Dynamic semantics of FLAIR, revises semantics of λAIR in Figure B.3
C.2 Instrumenting FLAIR to track affine capabilities and program counter tokens
C.3 Semantics of FLAIR2
C.4 Dynamic semantics of FLAIR2

1. Introduction

The 9/11 Commission Report, in an attempt to explain the failure of the United States government to prevent the September 11, 2001, terrorist attacks, included the following statement among its general findings:

Action officers should have been able to draw on all available knowledge about al Qaeda in the government. Management should have ensured that information was shared and duties were clearly assigned across agencies, and across the foreign-domestic divide. [...] The U.S.
government did not find a way of pooling intelligence and using it to guide the planning and assignment of responsibilities for joint operations ... [92]

In response to these findings, and driven in part by the success of web sites like Wikipedia, YouTube, Flickr, Facebook, and MySpace (wikipedia.org, youtube.com, flickr.com, facebook.com, myspace.com), the U.S. government has begun using web applications to disseminate critical information in a timely manner across its various divisions. Examples include SKIWEB [20], used for strategic knowledge integration in the U.S. military, and Intellipedia [108], a set of web-based document management systems used throughout the sixteen agencies that comprise the U.S. intelligence community. While many of the details about these applications are classified, a recent press release by the Central Intelligence Agency (CIA) about Intellipedia includes the following statement:

... the CIA now has users on its top secret, secret and sensitive unclassified networks reading and editing a central wiki that has been enhanced with a YouTube-like video channel, a Flickr-like photo-sharing feature, content tagging, blogs and RSS feeds. [34]

Of course, sharing sensitive information via the web is not limited to the U.S. government. For example, in the United Kingdom, the National Health Service's Spine is a web-based application intended to provide health-care providers with convenient access to a patient's records [93]. Even within smaller organizations, web-based information sharing is common, e.g., web applications like Continue [74] and EasyChair [134] are frequently used to manage academic conferences. While there are substantial efficiencies to be had from information sharing, clearly, there can also be significant consequences (loss of life, health-based discrimination, identity theft, etc.) should sensitive information not be properly protected.
Networked information-sharing applications must therefore balance two competing ends: to maximize the sharing of information while mitigating, to the greatest extent possible, the risk due to unauthorized release of, or tampering with, sensitive information.

Take the case of Intellipedia: to protect against improper usage of sensitive intelligence documents, it should address a number of security concerns. At the most basic level, it should control which users can access content by enforcing forms of multi-level security policies. Intellipedia may also need to track provenance information [22], such as revision history and data sourcing, on documents to reason about information integrity and to support auditing. To improve information availability, a policy may release content to a certain user with some particularly sensitive information withheld, either by redaction or by some other form of downgrading. Computations over the document databases, like PageRank-style algorithms [98], should be careful to respect these security concerns so as not to inadvertently leak sensitive data in response to search queries.

Without recourse to formal verification to ensure that all these security needs have been met, Intellipedia currently heads off the threat of information misuse by placing sharp limits on the sharing of information. For example, confidentiality is achieved by falling back on the security of the underlying computer networks of the U.S. Department of Defense. The DoD manages several computer networks, each cleared to handle data at specific classification levels: NIPRNet may only handle sensitive but unclassified data, SIPRNet is cleared for secret data, and JWICS for top-secret data [129]. Versions of Intellipedia that are accessible on each of these networks are kept physically separate, and all access controls are applied at the network level.
But, by resorting to an air gap to secure sensitive data, Intellipedia surrenders much of the benefit that may be had from sharing information at a fine granularity. (Admittedly, the enforcement of data confidentiality on U.S. DoD systems is strongly influenced by well-entrenched institutional practices. As for other concerns, like reliable tracking of data provenance, it appears highly unlikely that Intellipedia provides any formal guarantees about these.) For example, a document that contains a fragment of top-secret information must be placed in JWICS, even though much of its content may be of relevance to users with a lower clearance. To work around these limitations, such a document may have to be downgraded and copied into one of the other networks. But, if the downgrading is not performed properly, top-secret data may have been leaked inadvertently. Additionally, as documents are edited, it is easy for the original document and the downgraded version to become inconsistent, compromising information integrity.

Instead, we would like to allow Intellipedia to share information at a fine granularity while verifying that the software meets all the security requirements. In pursuit of this goal, one must, of course, begin by formalizing the security requirements. Policy languages that can articulate such requirements have been the focus of a number of research efforts, e.g., UNIX-style access control lists, RBAC [107], XACML [143], Ponder [38], and trust management frameworks like RT [76]. Evidently, the raison d'être of each of these languages is to express enforceable specifications of the permissible behaviors of a system. This latter point is the focus of our work: given one of these security policies, how do we assure that a software system enforces it properly?

A simple way to think about this question is as complete mediation: are all security-sensitive operations properly mediated by queries to the security policy?
Researchers have proposed using static analysis of software source code to check this condition. For example, Zhang et al. [148] used CQual to check that SELinux operations on sensitive objects are always preceded by policy checks; Fraser [53] did the same for Minix. However, these systems only ensure that some policy function is called before data is accessed. Calls to the wrong policy function or incorrect calls to the right one (e.g., with incorrect arguments) are not prevented.

Security-typed programming languages, like Jif [31] or FlowCaml [104], aim to allay this weakness. Through the use of novel type systems, these languages are able to show that well-typed programs enjoy useful extensional security properties as a consequence of complete mediation. These security properties are typically based on forms of noninterference for lattice-based information flow policies [42], which roughly means that high-security data cannot be inferred via a low-security channel.

But, security-typed languages have problems of their own. First, noninterference is often too strong: some protected information inevitably must be released, either via downgrading or even according to simpler access control policies. Moreover, security-typed languages usually fix the mechanisms by which a policy is specified. For example, policies in Jif are specified using the decentralized label model [88], which may not always be suitable for certain applications. We have observed, for instance, that a role-based label model may be better when policies are expected to change at runtime [125]. Finally, information flow policies provide no clear way to express and provably enforce data provenance or other styles of policy, such as security automata [113] and history- or stack-based access control [1, 51].
All told, despite their promise, security-typed programming languages are not yet flexible enough to guarantee the enforcement of the variety of security policies that often must be applied to real-world applications. This dissertation sets out to demonstrate that security typing can be made flexible enough to be applied to a broad range of policies. In particular, our thesis is the following:

A language-based framework, while being practical enough to construct real applications, can be used to verify that a range of user-defined security policies are correctly enforced, and that, as a consequence, programs enjoy useful extensional security properties.

1.1 Overview of our Approach

The main contribution of this dissertation is a generalization of security typing that does not bake in a particular model of security as primitive. Instead, using a novel combination of several standard (but advanced) type-theoretic constructs, we are able to enforce various forms of access control, provenance, information flow, and automata-based policies. While our encodings make specific design choices, our solution is flexible enough that programmers can control all the low-level details of policy specification and enforcement. For example, programmers are free to develop custom label models when enforcing an information flow policy; or, they may implement an access control policy using capabilities, access control lists, or some combination of the two. Nevertheless, we retain the benefit of traditional security-typed languages by being able to show that type-correct programs enjoy useful security properties. We have implemented our ideas in a programming language that we call SELINKS and have validated its practicality by building SEWIKI, a web-based document management system inspired by Intellipedia. In this section, we present a summary of the main elements of our approach.

1.1.1 A Brief Primer on Security Typing

Volpano et al.
[132] were the first to propose using a type system to certify the enforcement of a security policy. In particular, they address the enforcement of information flow security policies specified in Denning's lattice model [42]. In this model, a policy is specified as a lattice (L, ⊑), where L is a finite set of security labels partially ordered by the relation ⊑. For example, L may identify secrecy classes like High and Low, with Low ⊑ High, where these classes are used to categorize the secrecy of objects in a system. Informally, the intention of such a policy is to ensure that no information about an object of the High security class flows to an object of the Low security class.

The key insight of Volpano et al. was to refine the types of a programming language to include a security label. For example, the type intHigh represents the set of all High security integers. Volpano et al. define a type system that tracks the flow of information through the various constructs of a programming language. For instance, for a program statement h := l, where h has the type intHigh and l has the type intLow, the typing judgment records a flow from the Low security class to the High class. Such a flow is accepted as secure for the lattice Low ⊑ High. However, an assignment in the opposite direction (l := h) is judged by the type rules to cause a flow from High to Low and is deemed insecure for the example lattice. A program containing such an insecure assignment is rejected as type-incorrect.

A large body of work [111] has extended these basic ideas of security typing to accommodate variations on the lattice model and to incorporate information flow analyses of the programming constructs of real languages (e.g., exceptions, higher-order functions, objects, etc.). Jif [31], an extension of Java, and FlowCaml [104], an extension of Caml, are two noteworthy language implementations that utilize security typing to guarantee the correct enforcement of information flow policies.
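The assignment rule described in this primer can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Label`, `flows_to`, and `check_assignment` are hypothetical, the lattice is the two-point Low/High example from the text, and the check runs dynamically, whereas the type systems discussed here reject insecure programs statically.

```python
from enum import IntEnum


class Label(IntEnum):
    """Two-point secrecy lattice; the integer order encodes Low ⊑ High."""
    LOW = 0
    HIGH = 1


def flows_to(source: Label, target: Label) -> bool:
    """Return True when information may flow from `source` to `target`."""
    return source <= target


def check_assignment(target: Label, source: Label) -> None:
    """Reject `target := source` when the flow violates the lattice order."""
    if not flows_to(source, target):
        raise TypeError(f"insecure flow from {source.name} to {target.name}")


# h := l, with h labeled High and l labeled Low, is accepted.
check_assignment(target=Label.HIGH, source=Label.LOW)

# l := h would move High data into a Low variable, so it is rejected.
try:
    check_assignment(target=Label.LOW, source=Label.HIGH)
except TypeError as err:
    print(err)
```

A real security type system would also have to track indirect flows (for example, through branches on High data), which this sketch does not attempt.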
1.1.2 Enforcing User-defined Security Policies

Our work begins with the observation that many security policies are enforceable by associating labels with data in the types (as in Jif and FlowCaml), where the label expresses the security policy for that data. What varies among policies is the specification and interpretation of labels, in terms of the actions that are permitted or denied. By allowing the syntax and semantics of labels to be user-defined, we stand to benefit from the high degree of assurance provided by security typing while still retaining the flexibility we desire to enforce a range of policies.

In subsequent chapters, we develop FABLE, λAIR, and FLAIR, a succession of programming-language calculi, each building upon the previous, which embody this observation in two respects. First, a policy designer can define custom security labels and associate them with the data they protect using dependent types [4]. Next, rather than hard-code their semantics, policy enforcement is parameterized by a programmer-provided interpretation of labels, specified in a privileged part of the program. The type system forbids application programs from manipulating data with a labeled type directly. Instead, in order to use labeled data, the application must call the appropriate privileged functions that interpret the labels. By verifying the interpretation of labels, and relying on the soundness of the type system, policy implementers can prove that type-correct programs enjoy relevant security properties.

In our first language, FABLE, a programmer could define a label High, and give a high-security integer value a type that mentions this label, int{High}. As another example, the programmer could define a label ACL(Alice, Bob) to stand for an access control list and give an integer a type such as int{ACL(Alice, Bob)}. Programmers define the interpretation of labels in an enforcement policy, a set of privileged functions distinguished from the rest of the program.
Thus, in order to capture the intuition that an integer with the type int{ACL(Alice, Bob)} is only to be accessed by Alice or Bob, one writes an enforcement policy function like the following:

    policy access_simple (acl:lab, x:int{acl}) = if (member user acl) then {}x else 1

Here, access_simple takes a label acl as its first argument (like ACL(Alice, Bob)), and an integer protected by that label as its second argument. If the current user (represented by the variable user) is a member of x's access control list acl (according to some function member, not shown), then x is returned with its label removed, expressed by the syntax {}x, which coerces x's type to int so that it can be accessed by the main program. If the membership test fails, it returns 1, and x's value is not released. By preventing the main program (i.e., the non-policy part) from directly examining data with a labeled type, we can ensure that all its operations on data with a type like int{ACL(Alice, Bob)} are preceded by a call to the access_simple policy function, which performs the necessary access control check.

In Chapter 2, we show that FABLE is powerful enough to encode the enforcement of various styles of access control, data provenance, and information flow policies. In each case, we state and prove useful security properties for well-typed programs. However, FABLE is limited in that it applies only to purely functional programs. While the purely functional setting is both useful and illustrative, complex real-world policies often rely on some mutable state to make authorization decisions.

In Chapter 3, we define λAIR, a calculus that can enforce stateful policies in addition to the policies enforceable in FABLE. We present λAIR by first picking a specific model for stateful policies. In particular, we propose AIR, a novel model for specifying high-level information release protocols (a kind of declassification policy [112]) in terms of security automata [113].
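Returning to the access_simple policy function shown earlier in this section, its behavior can be approximated in Python. This is a rough sketch under assumptions, not FABLE itself: the Labeled class is hypothetical, the current user is passed explicitly rather than read from a global variable, and nothing stops application code from reading the protected field, whereas FABLE's type system rules that out statically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Labeled:
    """An integer paired with its access control list (its label)."""
    acl: frozenset
    _value: int  # by convention only, policy code alone reads this field


def access_simple(user: str, x: Labeled) -> int:
    """Policy function: release x's value only to members of its ACL."""
    if user in x.acl:
        return x._value  # plays the role of the unlabeling operation {}x
    return 1             # fallback value, as in the policy shown in the text


secret = Labeled(frozenset({"Alice", "Bob"}), 42)
for_alice = access_simple("Alice", secret)
for_eve = access_simple("Eve", secret)
```

In this sketch the label travels inside the Labeled value; in FABLE the label is a separate argument and the dependent type int{acl} is what ties it to the integer it protects.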
AIR is of independent interest in that, to our knowledge, it is the first time security automata have been used to specify information release protocols. To enforce AIR policies, sensitive data in λAIR are labeled with states from an AIR security automaton. Prior to using labeled data, a λAIR program must call functions (analogous to enforcement policy functions in FABLE) that consult the state of the automaton mentioned in the label, and the usage is permitted only if it is authorized by the automaton. Since the state of the automaton changes as the program executes, the type system of λAIR has to be careful to ensure that stale automaton states are never used in authorization decisions. The technical machinery that accomplishes this is the use of affine types [140]. We prove that an AIR policy is correctly enforced in λAIR by showing that type-correct λAIR programs (that use type signatures corresponding to a specific AIR policy) produce execution traces that are strings in the language accepted by the AIR automaton.

Of course, useful real-world programs include side effects, e.g., some output is printed to the terminal, or a message is sent over the network. In λAIR, we model state updates in a purely functional way. While this is sufficient if programs are always written in a monadic style, we show in Chapter 4 that the combination of affine and dependent types in λAIR is powerful enough to enforce policies for programs that may cause side effects directly. The main contribution of Chapter 4 is our final calculus FLAIR, which extends λAIR with mutable references to memory. We develop an encoding of a canonical information flow policy in FLAIR and show that type-correct FLAIR programs using this encoding enjoy a standard noninterference property.

The FLAIR calculus is our main evidence in support of the claim that a language-based framework can be expressive enough to support the enforcement of a broad range of policies.
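The concern about stale automaton states can be made concrete with a small Python sketch. Everything here is an assumption made for illustration: the two-step automaton (audit, then release) is invented, the names State, step, and TRANSITIONS are hypothetical, and single use is checked at runtime, whereas the affine types described above establish it at compile time.

```python
class StaleState(Exception):
    """Raised when an automaton state token is used more than once."""


class State:
    """A single-use token naming the automaton's current state."""

    def __init__(self, name: str):
        self.name = name
        self._spent = False

    def consume(self) -> str:
        """Return the state name, marking the token as used."""
        if self._spent:
            raise StaleState(self.name)
        self._spent = True
        return self.name


# Hypothetical release automaton: data may be released once, after an audit.
TRANSITIONS = {("Start", "audit"): "Audited", ("Audited", "release"): "Done"}


def step(state: State, action: str) -> State:
    """Consume the current state and return a fresh successor state."""
    current = state.consume()
    if (current, action) not in TRANSITIONS:
        raise PermissionError(f"{action} not permitted in {current}")
    return State(TRANSITIONS[(current, action)])


s0 = State("Start")
s1 = step(s0, "audit")
s2 = step(s1, "release")

try:
    step(s0, "audit")  # s0 was already consumed above, so it is stale
    reuse_rejected = False
except StaleState:
    reuse_rejected = True
```

The trace audit, release is accepted by this automaton, while the attempt to authorize a second action from the already-used state s0 is refused.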
Chapter 3 shows that FLAIR can be used to enforce security automata policies. As a consequence of prior work on the expressiveness of security automata [113, 59, 13, 79], we get a useful lower bound on the class of properties enforceable in FLAIR: informally, a broad class of safety properties. Additionally, FLAIR can enforce noninterference, which has been categorized variously as a 2-safety property [126] and, more recently, as a kind of hyperproperty [33]. We make no attempt to enforce liveness properties in FLAIR.

Unlike traditional security type systems, which guarantee that type-correct programs enjoy strong security properties like noninterference, our method does not, in and of itself, guarantee any such property. Clearly, by allowing the semantics of policy enforcement to be user-defined we open the possibility of a programmer constructing policies that are patently insecure; we make no attempt to prevent this. However, the design of each of our type systems facilitates (and, indeed, greatly simplifies) proofs that the enforcement of a specific policy entails a corresponding security property of type-correct programs; we have conducted these proofs for each of the policies explored in this dissertation. This stands as evidence for the claim that our approach admits proofs of extensional security properties for programs.

1.1.3 Building Secure Web Applications

Given the increasing demand for web-based information availability, the construction of secure web applications is a useful point of reference when evaluating the practicality of a new approach to security. As such, we have used FABLE in the design of a new programming language called SELINKS. The final claim of our thesis is that SELINKS is practical enough to be used in the construction of realistic secure web applications.

Web applications are often distributed over several tiers.
In a typical configuration, such an application is comprised of a client tier, where much of the user-interface logic runs in a web browser (as JavaScript); a server tier, where the bulk of the application logic is executed (in a language like Java or PHP); and finally, a database tier (executing SQL code) that serves as a high-efficiency persistent store.

Our effort to construct secure web applications begins with the LINKS programming language [35]. LINKS is designed to make web programming easier. Rather than programming each tier in a separate language, in LINKS, a programmer writes an entire multi-tier web application as a single program. The compiler splits that program into components to run on the client (as JavaScript), server (as a local fragment of LINKS code), and database (as SQL). From our perspective LINKS is also useful in that it makes it easier to reason about the security behavior of all three tiers of an application by analyzing a single source program.

In Chapter 5, we describe our extension to LINKS called Security-Enhanced LINKS, or SELINKS. Our extensions consist of two main components. The first is a new type system for LINKS based on FABLE-style security typing. Next, in order to efficiently enforce security policies in data-access code, we have designed and implemented a novel compilation procedure that translates SELINKS enforcement policy code to user-defined functions that can be run in the database. Our experiments show that this compilation strategy can improve the throughput of a database query by as much as an order of magnitude.

To evaluate the practicality of SELINKS (and, by extension, FABLE), we have constructed two medium-sized web applications that enforce custom security policies using SELINKS. Describing these applications is the main focus of Chapter 6.
The larger of these two applications is SEWIKI, a web-based document management system inspired by Intellipedia, that enforces a custom combination of a fine-grained access control policy and a provenance-tracking policy on HTML documents. The second application, SEWINESTORE, is an e-commerce application distributed with LINKS that we have extended with an access control policy. In general, we have found that SELINKS's label-based security policies are sufficient to enforce many interesting policies and are relatively easy to use. Additionally, the modular specification of the enforcement policy permits some reuse of policy code between the two applications.

A limitation of our evaluation is that we have only implemented the FABLE system for SELINKS. We have yet to evaluate the practicality of the full generality of FLAIR for enforcing user-defined policies. As it stands, we conjecture that FLAIR may be more suitable as the basis of an intermediate language, rather than as a type system for a source-level language like SELINKS. A detailed discussion of this and other limitations, along with steps we might take to overcome them, can be found in Chapter 8.

1.2 Summary of Contributions

In summary, this dissertation makes the following contributions.

1. We define FABLE, a core calculus for the enforcement of purely functional user-defined policies. We have proved FABLE sound. We provide encodings in FABLE of the following security policies and prove each of them correct:

   - Two styles of access control, one based on inlined policy checks and another based on capabilities. We formulate an extensional correctness property for access control called non-observability and prove that our encoding satisfies this property.

   - A data provenance tracking policy augmented with an access control policy to protect the provenance metadata itself. We prove that our encodings satisfy dependency correctness, a standard property for data provenance [26].
   - Two versions of a lattice-based information flow policy, one with static labels and fully static enforcement, and another with dynamic labels. We prove a standard noninterference property for the static information flow policy.

2. We propose AIR, a novel policy language for expressing high-level information release protocols based on security automata. AIR is intended to promote reasoning about the declassification behavior of a system independently from the system's implementation.

3. We define λAIR, a calculus that extends dependent typing in FABLE with support for affine types. We have proved λAIR sound. We show that λAIR can be used to enforce stateful security policies by developing an enforcement mechanism for automata-based policies expressed in the AIR language. We prove that type-correct λAIR programs enjoy a trace-based correctness property (standard for automata-based policies [139]).

4. We extend λAIR with mutable references to produce the calculus FLAIR. We have proved FLAIR sound. We show how FLAIR can be used to enforce a static information flow policy while accounting for information leaks due to side effects on memory. We prove that type-correct FLAIR programs that use our encoding enjoy a standard noninterference property. Additionally, we show how the basic FABLE type system can be embedded in FLAIR and argue, as a consequence, that all the security policies explored in this dissertation can be enforced using FLAIR.

5. We implement SELINKS, an extension of the LINKS web-programming language with support for enforcing user-defined security policies, in the style of FABLE. SELINKS also includes a novel compilation strategy for enforcement policy functions that enables security policies to be seamlessly and efficiently enforced for code spanning the server and database tiers.

6. We demonstrate the practicality of SELINKS by building two substantial multi-tier web applications.
The first, SEWIKI, is a web-based document management system that enforces a combination of fine-grained access controls and provenance tracking on HTML documents. The second, SEWINESTORE, is an e-commerce application retrofitted with an access control policy.

2. Enforcing Purely Functional Policies

This chapter presents FABLE, a core formalism for a programming language in which programmers may specify security policies and reason that these policies are properly enforced. We focus here on purely functional policies applied to purely functional programs. To illustrate FABLE's flexibility we show how to use it to encode a range of policies, including access control, static [111] and dynamic information flow [149], and provenance tracking [26].

In our experience, the soundness of FABLE makes proofs of security properties no more difficult (and arguably simpler) than proofs of similar properties in specialized languages [104, 127, 132]. To demonstrate this fact we present proofs of correctness for our access control, provenance, and static information flow policies (Appendix A), using three substantially different proof techniques. While precisely stating correctness properties for each of these policies required some careful construction, we found it relatively easy to discharge proofs of these properties by relying on various lemmas from the metatheory of FABLE. This experience indicates that with the accumulation of a set of broadly applicable lemmas about FABLE, many of our security proofs could be partially automated; however, we leave exploration of this issue to future work.

    Expressions        e     ::= n | x | λx:t.e | e1 e2 | fix x:t.v | Λα.e | e [t]
      (Fable-specific)         | C(e⃗) | match e with p_i ⇒ e_i | ([e]) | {}e | {e'}e
    Types              t     ::= int | α | ∀α.t2 | (x:t1) → t2
      (Fable-specific)         | lab | lab e | t{e}
    Patterns           p     ::= x | C(p⃗)
    Pre-values         u     ::= n | C(u⃗) | λx:t.e | Λα.e
    App. values        v_app ::= u | ([{e}v_pol])
    Policy values      v_pol ::= u | {e}v_pol

    Figure 2.1: Syntax of FABLE

2.1 FABLE: System F with Labels

This section presents the syntax, static semantics, and operational semantics of FABLE. The next section illustrates FABLE's flexibility by presenting example policies along with proofs of their attendant security properties.

2.1.1 Syntax

Figure 2.1 defines FABLE's syntax. Throughout, we use the notation a⃗ to stand for a list of elements of the form a1, ..., an. Where the context is clear, we will also treat a⃗ as the set of elements {a1, ..., an}.

Expressions e extend a standard polymorphic λ-calculus, System F [55]. Standard forms include integer values n, variables x, abstractions λx:t.e, term application e1 e2, the fixpoint combinator fix x:t.v, type abstraction Λα.e, and type application e [t]. We exclude mutable references from the language to simplify the presentation. Subsequent chapters extend the language with references and consider their effect on various policies, e.g., information flows through side effects.

The syntactic constructs specific to FABLE are distinguished in Figure 2.1. The expression C(e⃗) is a label, where C represents an arbitrary constructor and each e_i ∈ e⃗ must itself be a label. For example, in ACL(Alice, Bob), ACL is a 2-ary label constructor and Alice and Bob are 0-ary label constructors. Labels can be examined by pattern matching. For example, the expression match z with ACL(x,y) ⇒ x would evaluate to Alice if z's runtime value were ACL(Alice, Bob).

As explained earlier, FABLE introduces the notion of an enforcement policy that is a separate part of the program authorized to manipulate the labels on a type. Following Grossman et al. [58] we use bracketed expressions ([e]) to delimit policy code e from the main program.
In practice, one could use code signing as in Java [56] to ensure that untrusted policy code cannot be injected into a program. As mentioned earlier, the expression {}e removes a label from e's type, while {e'}e adds one. Labeling and unlabeling operations may only occur within policy code; we discuss these operations in detail below.

Standard types t include int, type variables α, and universally quantified types ∀α.t. Functions have dependent type (x:t1) → t2 where x names the argument and may be used in t2. We illustrate the usage of these types shortly. Labels can be given either type lab or the singleton type lab e, which describes label expressions equivalent to e. For example, the label constructor High can be given the type lab and the type lab High. Singleton types are useful for constraining the form of label arguments to enforcement policy functions. For example, we could write a specialized form of our previous access_simple function:

    policy access_pub (acl:lab ACL(World), x:int{acl}) = {}x

The FABLE type checker ensures this function is called only with expressions that evaluate to the label ACL(World), i.e., the call access_pub(ACL(Alice,Bob),e) will be rejected. In effect, the type checker is performing access control at compile time according to the constraint embodied in the type. We will show in Section 2.2.3 that these constraints are powerful enough to encode an information flow policy that can be checked entirely at compile time.

The dependent type t{e} describes a term of type t that is associated with a label e. Such an association is made using the syntax {e}e'. For example, {High}1 is an expression of type int{High}. Conversely, this association can be broken using the syntax {}e. For example, {}({High}1) has type int. Now we illustrate how dependent function types (x:t1) → t2 can be used.
The function access_simple can be given the type (acl:lab) → (x:int{acl}) → int, which indicates that the first argument acl serves as the label for the second argument x. Instead of writing (x:t1) → t2 when x does not appear in t2, we simply omit it. Thus access_simple's type could be written (acl:lab) → int{acl} → int.

The operational semantics of Section 2.1.4 must distinguish between application and policy values to ensure that policy code does not inadvertently grant undue privilege to application functions. Application values v_app consist of either pre-values u (integers n, labels containing values, type and term abstractions) or labeled policy values wrapped with ([ ]) brackets. Values within policy code are pre-values preceded by zero or more relabeling operations.

Encodings. To make our examples more readable, we use the syntactic shorthands shown in Figure 2.2. The first three shorthands are mostly standard. We use the policy keyword to designate policy code instead of using brackets ([ ]). A dependent pair (e, e') of type x:t × t' allows x, the name for the first element, to be bound in t', the type of the second element. For example, the first two arguments to the access_pub function above could be packaged into a dependent pair of type (acl:lab ACL(World) × int{acl}), which is inhabited by terms such as (ACL(World),{ACL(World)}1). Dependent pairs can be encoded using dependently typed functions.

    type abbreviation                 typename N = t in e2
    let binding, for some t           let x = e1 in e2             ≡  (λx:t.e2) e1
    polymorphic function definition,  let f (x:t) = e1 in e2       ≡  let f = fix f:t'.Λα.λx:t.e1 in e2
      for some t'
    policy function def, for some t'  policy f (x:t) = e1 in e2    ≡  let f = fix f:t'.Λα.λx:t.([e1]) in e2
    dependent tuple type              x:t × t'                     ≡  ∀α.((x:t) → t' → α) → α
    dependent tuple introduction,     (e, e')                      ≡  Λα.λf:((x:t) → t' → α). f e e'
      for some t, t'
    dependent tuple projection,       let x,y = f in e             ≡  f [te](λx:t.λy:t'.e)
      for some t, t', and te

    Figure 2.2: Syntactic shorthands
We extend the shorthand for function application, policy function definitions, type abbreviations, and tuples to multiple type and term arguments in the obvious way. We also write _ as a wildcard ("don't care") pattern variable.

Phantom label variables. We extend the notation for polymorphic functions in a way that permits quantification over the expressions that appear in a type. Consider the example below:

    policy add⟨l⟩(x:int{l}, y:int{l}) = {l}({}x + {}y)

This policy function takes two like-labeled integers x and y as arguments, unlabels them and adds them together, and finally relabels the result, having type int{l}. This function is unusual because the label l is not a normal term argument, but is being quantified: any label l would do. The reason this makes sense is that in FABLE, (un)labeling operations are merely hints to the type checker to (dis)associate a label term and a type. These operations, along with all types, can be erased at runtime without affecting the result of a computation. After erasing types, our example would become policy add (x, y) = x + y, which is clearly only a function of x and y, with no mention of l. For this reason, we can treat add as polymorphic in the labels of x and y: it can be called with any pair of integers that have the same label, irrespective of what label that might be.

We express this kind of polymorphism by writing the phantom label variable l, together with any other normal type variables like α, β, ..., in a list that follows the function name. In the example above, the phantom variables of add are listed as ⟨l⟩. Of course, not all label arguments are phantom. For instance, in the access_simple function of Section 1.1.2, the acl is a label argument that is passed at runtime. For simplicity, we do not formalize phantom variable polymorphism here. Chapter 5 shows the key judgments related to phantom variable polymorphism; a related technical report [123] contains a proof of soundness.
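The erasure argument for the add policy function can be illustrated with a toy interpreter in Python. This is a sketch under assumptions rather than FABLE's semantics: expressions are modeled as nested tuples with hypothetical tags, and only integers, variables, addition, unlabeling, and relabeling are covered.

```python
# Expressions are nested tuples; the tags are illustrative assumptions:
#   ("int", n) | ("var", name) | ("add", e1, e2)
#   ("unlabel", e) | ("relabel", label, e)


def erase(e):
    """Drop unlabel/relabel nodes, keeping the underlying computation."""
    tag = e[0]
    if tag == "unlabel":
        return erase(e[1])
    if tag == "relabel":
        return erase(e[2])
    if tag == "add":
        return ("add", erase(e[1]), erase(e[2]))
    return e  # integers and variables are unchanged


def evaluate(e, env):
    """Evaluate an expression; labeling nodes do not change the value."""
    tag = e[0]
    if tag == "int":
        return e[1]
    if tag == "var":
        return env[e[1]]
    if tag == "add":
        return evaluate(e[1], env) + evaluate(e[2], env)
    if tag == "unlabel":
        return evaluate(e[1], env)
    if tag == "relabel":
        return evaluate(e[2], env)
    raise ValueError(f"unknown expression tag: {tag}")


# Body of the add policy function: {l}({}x + {}y)
add_body = ("relabel", "l",
            ("add", ("unlabel", ("var", "x")), ("unlabel", ("var", "y"))))
erased = erase(add_body)
env = {"x": 2, "y": 3}
```

The erased body is simply x + y, and both versions evaluate to the same integer in any environment, which is why the label l never needs to be supplied at runtime.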
2.1.2 Example: A Simple Access Control Policy

Figure 2.3 illustrates a simple but complete enforcement policy for access control. Protected data is given a label listing those users authorized to access the data. In particular, such data has type t{acl}, where acl encodes the ACL as a label.

    policy login(user:string, pw:string) =
      let token = match checkpw user pw with
                      USER(k) ⇒ USER(k)
                    | _ ⇒ FAILED
      in (token, {token}0)

    let member(u:lab, a:lab) =
      match a with
          ACL(u, i) ⇒ TRUE
        | ACL(j, tl) ⇒ member u tl
        | _ ⇒ FALSE

    policy access⟨k,α⟩(u:lab USER(k), cap:int{u}, acl:lab, data:α{acl}) =
      match member u acl with
          TRUE ⇒ {}data
        | _ ⇒ halt

    Figure 2.3: Enforcing a simple access control policy

The policy's login function calls an external function checkpw to authenticate a user by checking a password. If authentication succeeds (the first pattern), checkpw returns a label USER(k) where k is some unique identifier for the user. The login function returns a pair consisting of this label and an integer labeled with it; this pair serves as our runtime representation of a principal. The access function takes the two elements of this pair as its first two arguments. Since FABLE enforces that only policies can produce labeled values, we are assured that the term with type int{USER(k)} can only have been produced by login. The access function's last two arguments consist of the protected data's label, acl, and the data itself, data. The access function calls the member function to see whether the user token u is present in the ACL. If successful, the label TRUE is returned, in which case access returns the data with its acl label removed.

2.1.3 Typing

Figure 2.4 defines the typing rules for FABLE. The main judgment Γ ⊢c e : t types expressions. The index c indicates whether e is part of the policy or the application. Only policy terms are permitted to use the unlabeling and relabeling operators.
Γ records three kinds of information: x:t maps variables to types, α records a bound type variable, and e ≻ p records the assumption that e matches pattern p, used when checking the branches of a pattern match.

The rules (T-INT), (T-VAR), (T-FIX), (T-TAB) and (T-TAP) are standard for polymorphic lambda calculi. (T-ABS) and (T-APP) are standard for a dependently typed language. (T-ABS) introduces a dependent function type of the form (x:t1) → t2. (T-APP) types an application of a (dependently typed) function. As usual, we require the type t1 of the argument to match the type of the formal parameter to the function. However, since x may occur in the return type t2, the type of the application must substitute the actual argument e2 for x in t2. As an example, consider an application of the access_simple function, having type (acl:lab) → int{acl} → int, to the term ACL(Alice, Bob). According to (T-APP) the resulting expression is a function with type int{ACL(Alice,Bob)} → int, which indicates that the function can be applied only to an integer labeled with precisely ACL(Alice,Bob). This is the key feature of dependent typing: the type system ensures that associations between labels and the terms they protect cannot be forged or broken.

Rule (T-LAB) gives a label term C(e⃗) a singleton label type lab C(e⃗) as long as each component e_i ∈ e⃗ has type lab. According to this rule ACL(Alice,Bob) can be given the type lab ACL(Alice,Bob).
For that matter, the expression ((λx:lab.x) High) can be given the type lab ((λx:lab.x) High); there is no requirement that e be a value. The rule (T-HIDE) allows a singleton label type like this one to be subsumed to the type of all labels, lab. Rule (T-SHOW) does the converse, allowing the type of a label to be made more precise.

    Figure 2.4: Static semantics of FABLE

    Environments         Γ ::= · | x:t | α | e ≻ p | Γ1, Γ2
    Substitutions        σ ::= · | (x ↦ e) | (α ↦ t) | σ1, σ2
    Colors               c ::= pol | app
    Type contexts        T ::= • | •{e} | (x:•) → t | (x:t) → • | ∀α.•
    Term label contexts  L ::= lab • | t{•}

    Γ ⊢c e : t    Expression e has type t in environment Γ under color c
                  Rules: (T-INT), (T-VAR), (T-FIX), (T-TAB), (T-TAP), (T-ABS), (T-APP), (T-LAB),
                  (T-HIDE), (T-SHOW), (T-MATCH), (T-UNLAB), (T-RELAB), (T-POL), (T-CONV)
    Γ ⊢ t ≅ t'    Types t and t' are convertible
                  Rules: (TE-ID), (TE-SYM), (TE-CTX), (TE-REFINE), (TE-REDUCE)
    Γ ⊢ t         Type t is well-formed in environment Γ
                  Rules: (K-INT), (K-TVAR), (K-LAB), (K-SLAB), (K-LABT), (K-FUN), (K-ALL)

Rule (T-MATCH) checks pattern matching. The first premise confirms that expression e being matched is a label. The second line of premises describes how to check each branch of the match. Our patterns differ from patterns in ML in two respects.
First, the second premise on the second line requires Γ, x⃗_i : lab ⊢c p_i : lab, indicating that patterns in FABLE are allowed to contain variables that are defined in the context Γ. Second, pattern variables may occur more than once in a pattern. Both of these features make it convenient to use pattern matching to check for term equality. For example, in the expression let y = Alice in match x with ACL(y,y) ⇒ e, the branch e is evaluated only if the runtime value for the label variable x is ACL(Alice, Alice).

A key feature of (T-MATCH) is the final premise on the second line, which states that the body of each branch expression e_i should be checked in a context including the assumption e ≻ p_i, which states that e matches pattern p_i. This assumption can be used to refine type information during checking (similar to typecase [60]) using the rule (T-CONV), which we illustrate shortly. (T-MATCH) also requires that variables bound by patterns do not escape their scope by appearing in the final type of the match; this is ensured by the second premise, Γ ⊢ t, which confirms t is well formed in the top-level environment (i.e., one not including pattern-bound variables). For simplicity we require a default case in pattern-matching expressions: the third premise requires the last pattern to be a single variable x that does not occur in Γ.

Rule (T-UNLAB) types an unlabeling operation. Given an expression e with type t{e'}, the unlabeling of e strips off the label on the type to produce an expression with type t. Conversely, (T-RELAB) adds a label e' to the type of e. The pol-index on these rules indicates that both operations are only admissible in policy terms. This index is introduced by (T-POL) when checking the body of a bracketed term ([e]). For example, given the expression e ≡ λx:int{Public}.([{}x]), we have ⊢app e : int{Public} → int, since {}x will be typed with index pol by (T-POL).

Rule (T-CONV) allows e to be given type t' assuming it can be given type t, where t and t' are convertible, written Γ ⊢ t ≅ t'. Rules (TE-ID) and (TE-SYM) define convertibility to be reflexive and symmetric. Rule (TE-CTX) structurally extends convertibility using type contexts T. The syntax T t denotes the application of context T to a type t, which defines the type that results from replacing the occurrence of the hole • in T with t. For example, if T is the context •{C}, then T int is the type int{C}. (Of course, rule (TE-CTX) can be applied several times to relate larger types.)

The most interesting rules are (TE-REFINE) and (TE-REDUCE), which consider types that contain labels (constructed by applying context L to an expression e). Rule (TE-REFINE) allows two structurally similar types to be considered equal if their embedded expressions e and p have been equated by pattern matching, recorded as the constraint e ≻ p by (T-MATCH). To see how this would be used, consider the following example:

    let tok,cap = login "Joe" "xyz" in
    match tok with
        USER(k) ⇒ access tok cap
      | _ ⇒ halt

We give the login function the type string → string → (l:lab × int{l}). The type of access (defined in Figure 2.3) is (u:lab USER(k)) → int{u} → t. We type check access tok using rule (T-APP), which requires that the function's formal parameter and its actual argument have the same type t. However, here tok has type lab while access expects type lab USER(k). Since the call to access occurs in the first branch of the match, the context includes the refinement tok ≻ USER(k) due to (T-MATCH). From (T-SHOW) we can give tok type lab tok, and by applying (TE-REFINE) we have lab tok ≅ lab USER(k), and so tok can be given type lab USER(k) as required. Similarly, for access tok cap, we can check that the type int{tok} of cap is convertible with int{USER(k)} in the presence of the same assumption.

Rule (TE-REDUCE) allows FABLE types to be considered convertible if the expression component of one is reducible to the expression component of the other [4]; reduction e →c e' is defined shortly in Figure 2.5.
For example, we have Γ ⊢ int{(λx:lab.x) Low} ≅ int{Low} since (λx:lab.x) Low →c Low. One complication is that type-level expressions may contain free variables. For example, suppose we wanted to show

    y : lab ⊢ int{(λx:lab.x) y} ≅ int{y}

It seems intuitive that these types should be considered convertible, but we do not have that (λx:lab.x) y →c y because y is not a value. To handle this case, the rule permits two types to be convertible if, for every well-typed substitution σ of the free variables of e1, σ(e1) →c σ(e2). This captures the idea that the precise value of y is immaterial: all reductions on well-typed substitutions of y would reduce to the value that was substituted for y.

Satisfying this obligation by exhaustively considering all possible substitutions is obviously intractable. Additionally, we have no guarantee that an expression appearing in a type will converge to a value. Thus, type checking in FABLE, as presented here, is undecidable. This is not uncommon in a dependent type system; e.g., type checking in Cayenne is undecidable [6]. However, other dependently typed systems impose restrictions on the usage of recursion in type-level expressions to ensure that type-level terms always terminate [17]. Additionally, there are several possible decision procedures that can be used to partially decide type convertibility. One simplification would be to attempt to show convertibility for closed types only, i.e., no free variables.

In SELINKS, our implementation of FABLE, we use a combination of three techniques. First, we use type information. If l is free in a type, and the declared type of l is lab e, then we can use this information to substitute e for l. Similarly, if the type context includes an assumption of the form l ≻ e (when checking the branch of a pattern), we can substitute l with e.
Finally, since type-level expressions typically manipulate labels by pattern matching, we use a simple heuristic to determine which branch to take when pattern matching expressions with free variables. These techniques suffice for all the examples in this chapter and both our SEWIKI and SEWINESTORE applications. A related technical report [123] discusses these decision procedures in greater detail and proves them sound.

Finally, the judgment $\Gamma \vdash t$ states that $t$ is well-formed in $\Gamma$. Rules (K-INT), (K-TVAR), and (K-LAB) are standard, (K-FUN) defines the standard scoping rules for names in dependent function types, and (K-ALL) defines the standard scoping rule for universally quantified type variables. (K-SLAB) and (K-LABT) ensure that all expressions $e$ that appear in types can be given lab-type. Notice that type-level expressions are typed in pol-context. Because FABLE enjoys a type-erasure property, any (un)labeling operations appearing in types pose no security risk. We use this feature to good effect in Section 2.2.2 to protect sensitive information that may appear in labels.

[Figure 2.5: Dynamic semantics of FABLE. The figure defines the evaluation contexts $E_c$; the small-step chromatic reduction rules $e \stackrel{c}{\leadsto} e'$, namely (E-CTX), (E-POL), (E-APP), (E-TAP), (E-FIX), (E-MATCH), (E-BLAB), (E-BINT), (E-BABS), (E-BTAB), (E-NEST), and (E-UNLAB); and the judgment $e \succ p : \sigma$ (expression $e$ matches pattern $p$ under substitution $\sigma$), given by (U-PATID), (U-VAR), and (U-CON).]

2.1.4 Operational Semantics

Figure 2.5 defines FABLE's operational semantics.
We define a pair of small-step reduction relations $e \stackrel{app}{\leadsto} e'$ and $e \stackrel{pol}{\leadsto} e'$ for application and policy expressions, respectively. Rules of the form $e \stackrel{c}{\leadsto} e'$ are polychromatic: they apply both to policy and application expressions. Since the values for each kind of expression are different, we also parameterize the evaluation contexts $E_c$ by the color of the expression, i.e., the context, either app or pol, in which the expression is to be reduced. Rule (E-CTX) uses these evaluation contexts $E_c$, similar to the type contexts used above, to enforce a left-to-right evaluation order for a call-by-value semantics. (In the context of FABLE, which is purely functional, the call-by-value restriction is unnecessary. However, in subsequent chapters, a call-by-value semantics is important.) Policy expression reduction $e \stackrel{pol}{\leadsto} e'$ takes place within brackets according to (E-POL).

The rules (E-APP), (E-TAP), and (E-FIX) define function application, type application, and fixed-point expansion, respectively, in terms of substitutions; all of these are standard. Rule (E-MATCH) relies on a standard pattern-matching judgment $v \succ p : \sigma$, also defined in Figure 2.5, which is true when the label value matches the pattern such that $v = \sigma(p)$. (E-MATCH) determines the first pattern $p_j$ that matches the expression $v$ and reduces the match expression to the matched branch's body after applying the substitution.

The (U-CON) rule in the pattern-matching judgment $v \succ p : \sigma$ is the only non-trivial rule. As explained in Section 2.1.3, since pattern variables may occur more than once in a pattern, (U-CON) must propagate the result of matching earlier sub-expressions when matching subsequent sub-expressions. For example, pattern matching should fail when attempting to match ACL(Alice, Bob) with ACL(x, x). This is achieved in (U-CON) because, after matching $Alice \succ x : x \mapsto Alice$ using (U-VAR), we must try to match Bob with $(x \mapsto Alice)x$, which is impossible.
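The way (U-VAR) and (U-CON) treat repeated pattern variables can be sketched in Python. This is only an illustration under stated assumptions, not FABLE's definition: the tuple encoding of labels, the `Var` class, and the functions `substitute` and `match_label` are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Var:
    """A pattern variable such as x (hypothetical encoding)."""
    name: str


# Assumed encoding: a label C(e1, ..., en) is the tuple ("C", e1, ..., en),
# so the 0-ary constant Alice is the tuple ("Alice",).

def substitute(pattern, subst):
    """Apply a substitution to a pattern, replacing already-bound variables."""
    if isinstance(pattern, Var):
        return subst.get(pattern.name, pattern)
    head, *args = pattern
    return (head, *(substitute(a, subst) for a in args))


def match_label(value, pattern, subst=None) -> Optional[dict]:
    """Return a substitution s with s(pattern) == value, or None on failure."""
    subst = dict(subst or {})
    pattern = substitute(pattern, subst)      # propagate earlier bindings
    if isinstance(pattern, Var):              # in the spirit of (U-VAR)
        subst[pattern.name] = value
        return subst
    if value[0] != pattern[0] or len(value) != len(pattern):
        return None
    for v, p in zip(value[1:], pattern[1:]):  # in the spirit of (U-CON)
        subst = match_label(v, p, subst)
        if subst is None:
            return None
    return subst


alice, bob = ("Alice",), ("Bob",)
acl_xx = ("ACL", Var("x"), Var("x"))
failed = match_label(("ACL", alice, bob), acl_xx)
succeeded = match_label(("ACL", alice, alice), acl_xx)
```

With this encoding, matching ACL(Alice, Bob) against ACL(x, x) fails because the second occurrence of x has already been replaced by Alice, while matching ACL(Alice, Alice) binds x to Alice.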
An applied policy function will eventually reduce to a bracketed policy value $([v_{pol}])$. When this value has the form $([u])$, the brackets may be removed so that the value $u$ can be used by application code. (E-BLAB) and (E-BINT) handle label expressions $([C(u)])$ and integers $([n])$, respectively. To maintain the invariant that (un)labeling operators only appear in policy code, rules (E-BABS) and (E-BTAB) extrude only the $\lambda$ and $\Lambda$ binders, respectively, from bracketed abstractions, allowing them to be reduced according to (E-APP) or (E-TAP). Brackets cannot be removed from labeled values $([\{e\}u])$ by application code, to preserve the labeling invariant. On the other hand, brackets can be removed from any expression by policy code, according to (E-NEST). This is useful when reducing expressions such as $([\lambda x{:}t.\,x])\ ([v])$, which produces $([([v])])$ after two steps; (E-NEST) (in combination with (E-POL)) can then remove the inner brackets. Finally, (E-UNLAB) allows an unlabeling operation to annihilate the top-most relabeling operation. Notice that the expressions within a relabeling operation are never evaluated at runtime; relabelings only affect the types and are purely compile-time entities. The types that appear elsewhere, such as in (E-TAP), are also erasable, as is usual for System F.

2.1.5 Soundness

We state the standard type soundness theorem for FABLE here. In addition to ensuring that well-typed programs never go wrong or get stuck, we have put this soundness result to good use in proving that security policies encoded in FABLE satisfy desirable security properties. We discuss this further in the next section. Appendix A contains a full statement and proof of this theorem.

Theorem 1 (Type soundness). If $\cdot \vdash_c e : t$, then either $\exists e'.\ e \stackrel{c}{\leadsto} e'$ or $\exists v_c.\ e = v_c$. Furthermore, if $e \stackrel{c}{\leadsto} e'$, then $\cdot \vdash_c e' : t$.

2.2 Example Policies in FABLE

This section uses FABLE to encode several security policies.
We prove that any well-typed program using one of these policies enjoys relevant security properties, i.e., the program is sure to enforce the policy correctly. We focus on four kinds of policies: access control, provenance, static information flow, and dynamic information flow.

As mentioned in the introduction, FABLE does not, in and of itself, guarantee that well-typed programs implement a particular security policy's semantics correctly. That said, FABLE has been designed to facilitate proofs of such theorems. To illustrate how, we chose to use a different proof technique for each of the three correctness results reported here. We conclude from our experience that the metatheory of FABLE provides a useful repository of lemmas that can naturally be applied in showing the correctness of various policy encodings. As such, we believe the task of constructing a correctness proof for a FABLE policy to be no more onerous, and possibly considerably simpler, than the corresponding task for a special-purpose calculus that bakes in the enforcement of a single security policy.

2.2.1 Access Control Policies

Access control policies govern how programs release information but, once the information is released, do not control how it is used. To prove that an access control policy is implemented correctly, we must show that programs not authorized to access some information cannot learn the information in any way, e.g., by bypassing a policy check (something not uncommon in production systems [114]) or by exploiting leaks due to control-flow or timing channels. We call this security condition non-observability. Intuitively, we can state non-observability as follows.
If some program $P$ is not allowed to access a resource $v_1$ having a label $l$, then a program $P'$ that is identical to $P$ except that $v_1$ has been replaced with some other resource $v_2$ (having the same type and label as $v_1$) should evaluate in the same way as $P$: it should produce the same result and take the same steps along the way toward producing that result. If this were not true then, assuming $P$'s reduction is deterministic, $P$ must be inferring information about the protected resource.

To make this intuition formal, we will show that the evaluations of programs $P$ and $P'$ are bisimilar, where the only difference between them is the value of the protected resource. To express this, first we define an equivalence relation called similarity up to $l$ (analogous to definitions of low equivalence [111, 26]), which holds for two terms $e$ and $e'$ if they only differ in sub-terms that are labeled with $l$, with the intention that $l$ is the label of restricted resources.

Definition 2 (Similarity up to $l$). Expressions $e$ and $e'$, identified up to $\alpha$-renaming, are similar up to label $l$ according to the relation $e_1 \sim_l e_2$ shown in Figure 2.6.

The most important rule in Figure 2.6 is (SIM-L), which states that arbitrary expressions $e$ and $e'$ are considered similar at label $l$ when both are labeled with $l$. Other parts of the program must be structurally identical, as stated by the remaining congruence rules. We extend similarity to a bisimulation as follows: two similar terms are bisimilar if they always reduce to similar subterms, and do so indefinitely or until no further reduction is possible.
This notion of bisimulation is the basis of our access control security theorem; it is both timing and termination sensitive.

[Figure 2.6: Similarity of expressions under the access control policy. The figure defines the relation $e \sim_l e'$ through the rules (SIM-ID), (SIM-L), (SIM-L2), (SIM-ABS), (SIM-MATCH), (SIM-APP), (SIM-FIX), (SIM-TAB), (SIM-TAP), (SIM-LAB), and (SIM-POL).]

Definition 3 (Bisimulation). Expressions $e_1$ and $e_2$ are bisimilar at label $l$, written $e_1 \approx_l e_2$, if and only if $e_1 \sim_l e_2$ and, for $\{i, j\} = \{1, 2\}$, $e_i \stackrel{c}{\leadsto} e_i' \Rightarrow e_j \stackrel{c}{\leadsto} e_j'$ and $e_1' \approx_l e_2'$.

Theorem (Non-observability). Given all of the following:
1. A $([\cdot])$-free expression $e$.
2. $a{:}t_a,\ m{:}t_m,\ cap{:}int\{user\},\ x{:}t\{acl\} \vdash_{app} e : t_e$, where acl and user are label constants.
3. A type-respecting substitution $\sigma = (a \mapsto access,\ m \mapsto member,\ cap \mapsto ([\{user\}0]))$.
4. Type-respecting substitutions $\sigma_i = \sigma, x \mapsto v_i$, where $\vdash_{app} v_i : t\{acl\}$ for $i = 1, 2$.
Then, we have $(member\ user\ acl \stackrel{c}{\leadsto^{*}} False) \Rightarrow \sigma_1(e) \approx_{acl} \sigma_2(e)$.

This theorem is concerned with a program $e$ that contains no policy-bracketed terms (it is just application code) but, via the substitution $\sigma$, may refer to our access control functions access and member (defined in Figure 2.3) through the free variables $a$ and $m$. Additionally, the program is granted a single user capability $([\{user\}0])$ through the free variable cap, which gives the program the authority of user user. The program may also refer to some protected resource $x$ whose label is acl, but the authority of user is insufficient to access $x$ according to the access control policy because $member\ user\ acl \stackrel{c}{\leadsto^{*}} False$. Under these conditions, we can show that for any two (well-typed) $v_i$ we substitute for $x$ according to substitution $\sigma_i$, the resulting programs are bisimilar: their reduction is independent of the choice of $v_i$.
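The intuition behind the theorem can be illustrated with a small Python sketch, under loud assumptions. Python does not enforce the abstraction that FABLE's type system guarantees, so the sketch relies on application code never reading the private field by convention. The names `Labeled`, `member`, `access`, and `program` are hypothetical stand-ins and simplify the definitions of Figure 2.3 (in particular, the capability argument is omitted).

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Labeled:
    """A value tagged with a label; application code treats it as opaque."""
    label: tuple
    _value: Any  # by convention, only policy code reads this field


def member(user: str, acl: tuple) -> bool:
    """Policy code: is the user named in the access control list?"""
    return user in acl


def access(user: str, resource: Labeled) -> Optional[Any]:
    """Policy code: release the value only after a successful check."""
    return resource._value if member(user, resource.label) else None


def program(user: str, x: Labeled) -> str:
    """Application code: it can reach x only through access."""
    released = access(user, x)
    return "denied" if released is None else f"got {released}"


acl = ("alice", "bob")
v1 = Labeled(acl, 41)
v2 = Labeled(acl, 42)

# carol is not in acl, so the outcome cannot depend on the protected value.
unauthorized_runs_agree = program("carol", v1) == program("carol", v2)
# alice is authorized, so the two runs are allowed to differ.
authorized_runs_differ = program("alice", v1) != program("alice", v2)
```

The sketch only compares final results; the theorem in the text is stronger, since bisimilarity also requires the two runs to take matching steps.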
In all our proofs, two key features of FABLE play a central role. First, dependent typing in FABLE allows a policy analyst to assume that all policy checks are performed correctly. For instance, when calling the access function to access a value $v$ of type t{acl}, the label expressing $v$'s security policy must be acl, and no other. The type system ensures that the application program cannot construct a label, say ACL(Public), and trick the policy into believing that this label, and not acl, protects $v$, i.e., dependent typing rules out confused deputies [18]. Second, the restriction that application code cannot directly inspect labeled resources ensures that a policy function must mediate every access of a protected resource. Assuring complete mediation is not unique to FABLE; Zhang et al. [148] used CQual to check that SELinux operations on sensitive objects are always preceded by policy checks, and Fraser [53] did the same for Minix. However, the analysis in both these instances only ensures that some policy check has taken place, not necessarily the correct one. As such, these other techniques are vulnerable to flaws due to confused deputies.

When combined with these two insights, our proof of non-observability for the access control policy is particularly simple. In essence, the FABLE system ensures that a value with labeled type must be treated abstractly by the application program. With this observation, the proof proceeds in a manner very similar to a proof of value abstraction [58]. This is a general semantic property for languages like FABLE that support parametric polymorphism or abstract types. Indeed, the policy as presented in Figure 2.3 could have been implemented in a language like ML, which also has these features. For instance, an integer labeled with an access control list could be represented in ML as a pair consisting of an access control list and an integer, with type (string list * int).
A policy module could export this pair as an abstract type, preventing application code from ever inspecting the value directly, and provide a function to expose the concrete type only after a successful policy check.

While such an encoding using ML's module system would suffice for the simple policy of Figure 2.3, it would not work for more sophisticated models of access control. For example, a form of access control using capabilities can be easily encoded in FABLE. Such a model could provide access to more than one resource with a single membership test, as in the following code:

    policy access_cap⟨k⟩ (u:lab∼USER(k), cred:int{u}, acl:lab) =
      match member u acl with
        True ⇒ Λα.λx:α{acl}.{◦}x
      | _ ⇒ #fail

Here the caller presents a user credential and an access control label acl but no resource labeled with that label. If the membership check succeeds, a function with type $\forall\alpha.\ \alpha\{acl\} \to \alpha$ is returned. This function can be used to immediately unlabel any resource with the authorized label, i.e., the function is a kind of key that can be used to gain access to a protected resource. This is useful when policy queries are expensive. It is also useful for encoding a form of delegation; rather than releasing his user credential, a user could release a function that uses that credential to a limited effect. Of course, this may be undesirable if the policy is known to change frequently, but even this could be accommodated. Variations that combine static and dynamic checks are also possible.

Finally, notice that this theorem is indifferent to the actual implementation of the acl label and the member function. Thus, while our example policy is fairly simplistic, a far more sophisticated model could be used. For instance, we could have chosen labels to stand for RBAC- or RT-style roles [76], and member could invoke a decision procedure for determining role membership.
Likewise, the theorem is not concerned with the origin of the user authentication token; a function more sophisticated than login (e.g., one that relied on cryptography) could have been used. The important point is that FABLE ensures the second component of the user credential $(l{:}lab \sim USER(k) \times int\{l\})$ is unforgeable by application code.

2.2.2 Dynamic Provenance Tracking

Provenance is information recording the source, derivation, or history of some information [26]. Provenance is relevant to computer security for at least two reasons. First, provenance is useful for auditing, e.g., to discover whether some data was inappropriately released or modified. Second, provenance can be used to establish data integrity, e.g., by carefully accounting for a document's sources. This section describes a label-based provenance tracking policy we constructed in FABLE. To prove that this policy is implemented correctly we show that all programs that use it will accurately capture the dependences (in the sense of information flow) on a value produced by a computation.

Figure 2.7 presents the provenance policy. We define the type $Prov\ \alpha$ to describe a pair in which the first component is a label $l$ that records the provenance of the second component. The policy is agnostic to the actual form of $l$. Provenance labels could represent things like authorship, ownership, the channel on which information was received, etc. An interesting aspect of Prov is that the provenance label is itself labeled with the 0-ary label constant Auditors. This represents the fact that provenance information is subject to security concerns like confidentiality and integrity. Intuitively, one can think of data labeled with the Auditors label as only accessible to members of a group called Auditors, e.g., as mediated by the access control policy of Figure 2.3; of course, a more complex policy could be used.
Finally, note that because the provenance label $l$ is itself labeled (having type lab{Auditors}), it would be incorrect to write $\alpha\{l\}$ as the second component of the type, since this requires that $l$ have type lab. Therefore we unlabel $l$ when it appears in the type of the second component. As explained in Section 2.1.3, unlabeling operations in types pose no security risk since the types are erased at runtime.

    typename Prov α = (l:lab{Auditors} × α{{◦}l})

    policy apply⟨α,β⟩ (lf:Prov (α → β), mx:Prov α) =
      let l,f = lf in
      let m,x = mx in
      let y = ({◦}f) ({◦}x) in
      let lm = Union({◦}l, {◦}m) in
      ({Auditors}lm, {lm}y)

    policy flatten⟨α⟩ (x:Prov (Prov α)) =
      let l,inner = x in
      let m,a = inner in
      let lm = Union({◦}l, {◦}m) in
      ({Auditors}lm, {lm}a)

Figure 2.7: Enforcing a dynamic provenance-tracking policy

The policy function apply is a wrapper for tracking dependences through function applications. In an idealized language like FABLE it is sufficient to limit our attention to function application, but a policy for a full language would define wrappers for other constructs as well. The first argument of apply is a provenance-labeled function lf to be called on the second argument mx. The body of apply first decomposes the pair lf into its label l and the function f itself and does likewise for the argument mx. Then it applies the function, stripping the label from both it and its argument first. The provenance of the result is a combination of the provenance of the function and its argument. We write this as the label pair Union(l, m), which is then associated with the final result. Notice that we strip the Auditors labels from labels l and m before combining them, and then relabel the combined result.

The policy also defines a function flatten to convert a value of type $Prov\ (Prov\ \alpha)$ to one of type $Prov\ \alpha$ by extracting the nested labels (the first two lines) and then collapsing them into a Union (third line) that is associated with the inner pair's labeled component (fourth line).
An example client program that uses this provenance policy is the following:

    let client⟨α,β,γ⟩ (f : Prov (α → β → γ), x : Prov α, y : Prov β) =
      apply [β][γ] (apply [α][β → γ] f x) y

This function takes a labeled two-argument function f as its argument and the two arguments x and y. It calls apply twice to get a result of type $Prov\ \gamma$. This will be a tuple in which the first component is a labeled provenance label of the form Union(Union(lf,lx), ly) and the second component is a value labeled with that provenance label. In the label, we will have that lf is the provenance label of the function argument f, and lx and ly are the provenance labels of the arguments x and y, respectively. Note that a caller of client can instantiate the type variable $\gamma$ to be a type like $Prov\ int$. In this case, the type of the returned value will be $Prov\ (Prov\ int)$, which can be flattened if necessary.

[Figure 2.8: A logical relation that relates terms of similar provenance (selected rules). The figure gives the interpretation of labels as sets, $[[C]] = \{C\}$ and $[[Union(l_1, l_2)]] = [[l_1]] \cup [[l_2]]$; a prefixing relation on types; and the relation $e \sim_l e' : t, t'$ ($e$ and $e'$ are related by provenance $l$ at types $t$ and $t'$), with rules (R-EQUIVP), (R-INT), (R-EXPR), and (R-ABS).]

We can prove that provenance information is tracked correctly following Cheney et al. [26]. The intention is that if a value $x$ of type $Prov\ \alpha$ influences the computation of some other value $y$, then $y$ must have type $Prov\ \beta$ (for some $\beta$) and its provenance label must mention the provenance label of $x$. If provenance is tracked correctly, a change to $x$ will only affect values like $y$; other values in the program will be unchanged.
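The behaviour of apply, flatten, and the two-argument client can be sketched in Python. This is an illustrative model under stated assumptions rather than the FABLE policy itself: provenance labels are represented directly as sets of source names (in the spirit of the set interpretation of Union), the Auditors protection on labels is omitted, and the class and function names below are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, FrozenSet


@dataclass(frozen=True)
class Prov:
    """A provenance label (modelled as a set of sources) paired with a value."""
    label: FrozenSet[str]
    value: Any


def apply(lf: Prov, mx: Prov) -> Prov:
    """Call the tracked function on the tracked argument and union the labels."""
    return Prov(lf.label | mx.label, lf.value(mx.value))


def flatten(x: Prov) -> Prov:
    """Collapse a nested Prov into a single Prov by unioning both labels."""
    inner = x.value
    return Prov(x.label | inner.label, inner.value)


def client(f: Prov, x: Prov, y: Prov) -> Prov:
    """Apply a curried two-argument function, tracking all three sources."""
    return apply(apply(f, x), y)


add = Prov(frozenset({"lf"}), lambda a: lambda b: a + b)
x = Prov(frozenset({"lx"}), 1)
y = Prov(frozenset({"ly"}), 2)
result = client(add, x, y)

nested = Prov(frozenset({"outer"}), Prov(frozenset({"inner"}), 5))
flat = flatten(nested)
```

Here `result` carries the sources of the function and of both arguments, so any value that x influenced mentions x's label, which is the dependency property the text describes.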
The essence of this correctness condition is much like the similarity relation $v_1 \sim_l v_2$ defined for the non-observability property in Section 2.2.1. However, there is a difference between provenance tracking and access control that complicates the statement of the correctness condition for provenance. When a computation $e$ that depends on the value of some variable $x$ is reduced in two different contexts for $x$, the intermediate terms that are produced in one context can be entirely different from the terms that are produced in the other context. That is, if we have $e(x \mapsto v_1) \stackrel{c}{\leadsto} e_1 \stackrel{c}{\leadsto} \ldots$ and $e(x \mapsto v_2) \stackrel{c}{\leadsto} e_2 \stackrel{c}{\leadsto} \ldots$, then the terms $e_1$ and $e_2$ may not even have the same shape, which makes it difficult to state a purely syntactic similarity condition between the terms. However, the provenance tracking policy ensures that if both reduction sequences terminate with a value, then the corresponding values are labeled with the appropriate provenance label. In contrast, although non-observability for access control also applies to programs $e$ that are reduced in different contexts, we can assure that at each step the terms that are produced are identical, except for holes in the terms that contain the access-protected values of $x$.

Our formulation of dependency correctness follows a technique due to Tse and Zdancewic [127] (although Tse and Zdancewic use this technique to show noninterference in the presence of a form of dynamic labeling). This approach involves defining a logical relation [85] that relates terms whose sets of provenance labels include the same label $l$. Figure 2.8 shows a selection of rules in this relation. (The full relation can be found in Section A.3.) The top of the figure begins by giving a semantics for label values in terms of sets, $[[e]]$. The figure then defines a prefixing relation on types, which is convenient for constraining the shape of types in the main relation $e_1 \sim_l e_2 : t_1, t_2$.
This latter relation states that expressions $e_1$ and $e_2$ are related by provenance label $l$, and can be given types $t_1$ and $t_2$, respectively.

The key rule in the relation is (R-EQUIVP). It states that two arbitrary values $v_1$ and $v_2$ are related at the label $l$ if they both have labeled types $t_1$ and $t_2$ (the third premise) and if these types share a common prefix $t$ (the fourth premise). The constraints on the labels on these types require that related values be labeled with the provenance label $l$. In the fifth premise, we require that some label $e_i$ on each type reduce to a label value $v^{lab}_i$; since we are only concerned with terminating computations, we can safely ignore the case where the label expression diverges. The last premise is a disjunct in which the first clause requires the provenance label $l$ to be mentioned in the labels of both expressions. Notice that the labels do not have to be identical; the sets represented by each label just have to contain $l$. The second clause in the disjunct handles an important corner case. Since our encoding uses dynamic provenance labels that are themselves always protected with an access control policy, and because the way in which these labels are constructed can depend on the other values in the program, we treat all provenance labels (those terms that are protected by the label Auditors) as being related.

The remaining rules in Figure 2.8 are standard and give a flavor of the elided rules. (R-INT) states that identical integers are related. (R-EXPR) states that expressions $e_1$ and $e_2$ are related if their normal forms $v_1$ and $v_2$ (if these exist) are related. (R-ABS) relates function-typed values if these functions reduce to related values when they are applied to related arguments.

Theorem (Dependency correctness). Given all of the following:
(A1) A $([\cdot])$-free expression $e$ such that $a{:}t_a,\ f{:}t_f,\ x{:}Prov\ t \vdash_{app} e : t'$,
(A2) A type-respecting substitution $\sigma = (a \mapsto apply,\ f \mapsto flatten)$.
(A3) $\vdash_{app} v_i : Prov\ t$, for $i = 1, 2$, and $v_1 \sim_l v_2 : Prov\ t, Prov\ t$
(A4) For $i \in \{1, 2\}$, $\sigma_i = \sigma, x \mapsto v_i$
Then, $(\sigma_1(e) \stackrel{app}{\leadsto^{*}} v_1' \wedge \sigma_2(e) \stackrel{app}{\leadsto^{*}} v_2') \Rightarrow v_1' \sim_l v_2' : \sigma_1 t', \sigma_2 t'$.

Intuitively, this theorem states that an application program $e$ that is compiled with the policy of Figure 2.7 and is executed in contexts that differ only in the choice of a tracked value of label $l$ will compute results that differ only in sub-terms that are also colored using $l$. The crux of this proof involves showing that the logical relation is preserved under substitution, i.e., a form of substitution lemma for the logical relation. While constructing the infrastructure to define the logical relation requires some work, strategic applications of the standard substitution lemma for FABLE can be used to discharge the proof without much difficulty.

2.2.3 Static Information Flow

Both policies discussed so far rely on runtime checks. This section illustrates how FABLE can be used to encode static lattice-based information flow policies that require no runtime checks. In a static information flow type system (as found in FlowCaml [111]) labels $l$ have no run-time witness; they only appear in types t{l}. Labels are ordered by a relation $\sqsubseteq$ that typically forms a lattice. This ordering is lifted to a subtyping relation on labeled types such that $l_1 \sqsubseteq l_2 \Rightarrow t\{l_1\} <: t\{l_2\}$. Assuming the lattice ordering is fixed during execution, well-typed programs can be proven to adhere to the policy defined by the initial label assignment appearing in the types.

    policy lub(x:lab, y:lab) =
      match x,y with _, HIGH | HIGH, _ ⇒ HIGH | _, _ ⇒ LOW
    policy join⟨α,l,m⟩ (x:α{l}{m}) = ({lub l m}{◦}{◦}x)
    policy sub⟨α,l⟩ (x:α{l}, m:lab) = ({lub l m}{◦}x)
    policy apply⟨α,β,l,m⟩ (f:(α → β){l}, x:α) = {l}(({◦}f) x)

    let client (f:(int{HIGH} → int{HIGH}){LOW}, x:int{LOW}) =
      let x = (sub [int] x HIGH) in
      join [int] (apply [int{HIGH}][int{HIGH}] f x)

Figure 2.9: Enforcing an information flow policy

Figure 2.9 illustrates the policy functions, along with a small sample program.
In our encoding we define a two-point security lattice with atomic labels HIGH and LOW, and protected expressions will have labeled types like t{HIGH}. The ordering $LOW \sqsubseteq HIGH$ is exemplified by the lub (least upper bound) operation for the lattice. The join function (similar to the flatten function from Figure 2.7) combines multiple labels on a type into a single label. The interesting thing here is that the label attached to x is a label expression lub l m, rather than a label value like HIGH. The type rule (T-CONV) presented in Figure 2.4 can be used to show that a term with type int{lub HIGH LOW} can be given type int{HIGH} (since $lub\ HIGH\ LOW \stackrel{c}{\leadsto} HIGH$). This is critical to being able to type programs that use this policy. The policy includes a subsumption function sub, which takes as arguments a term x with type $\alpha\{l\}$ and a label m and allows x to be used at the type $\alpha\{lub\ l\ m\}$. This is a restatement of the subsumption rule above, as $l \sqsubseteq m$ implies $l \sqcup m = m$. (Once types are erased, join and sub are both essentially the identity function and could be optimized away.) Finally, the policy function apply unlabels the function f in order to call it, and then adds f's label on the computed result.

Consider the client program at the bottom of Figure 2.9 as an example usage of the static information flow policy. The function client calls the function f with x, where f expects a parameter of type int{HIGH} while x has type int{LOW}. For the call to type check, the program uses sub to coerce x's type to int{lub LOW HIGH}, which is convertible to int{HIGH}. The call to apply returns a value of type int{HIGH}{LOW}. The call to join collapses the pair of labels so that client's return type is int{lub HIGH LOW}, which converts to int{HIGH}.

We have proved that FABLE programs using this policy enjoy the standard noninterference property; a statement of this theorem appears below.
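The two-point lattice and the lub, join, sub, and apply functions of Figure 2.9 can be approximated in Python as follows. This is a rough sketch with one important assumption stated up front: in FABLE the labels live only in types and are erased before execution, whereas this model carries them as run-time tags so that their flow can be observed. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Any

LOW, HIGH = "LOW", "HIGH"


def lub(x: str, y: str) -> str:
    """Least upper bound in the two-point lattice with LOW below HIGH."""
    return HIGH if HIGH in (x, y) else LOW


@dataclass(frozen=True)
class Labeled:
    """A value with a security label (a run-time tag in this sketch only)."""
    label: str
    value: Any


def join(x: Labeled) -> Labeled:
    """Collapse a doubly labeled value into a single label."""
    inner = x.value
    return Labeled(lub(inner.label, x.label), inner.value)


def sub(x: Labeled, m: str) -> Labeled:
    """Subsumption: allow x to be used at the label lub(l, m)."""
    return Labeled(lub(x.label, m), x.value)


def apply(f: Labeled, x: Any) -> Labeled:
    """Unlabel f, call it, and put f's label on the computed result."""
    return Labeled(f.label, f.value(x))


def client(f: Labeled, x: Labeled) -> Labeled:
    """Raise x to HIGH, call the LOW-labeled function, then join the labels."""
    x_high = sub(x, HIGH)
    return join(apply(f, x_high))


increment = Labeled(LOW, lambda v: Labeled(HIGH, v.value + 1))
out = client(increment, Labeled(LOW, 1))
```

In this run the result of apply is labeled HIGH on the inside and LOW on the outside, and join collapses the two labels into lub(HIGH, LOW), mirroring the type int{lub HIGH LOW} discussed in the text.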
We have also shown that a FABLE static information flow policy is at least as permissive as the information flow policy implemented by the functional subset of Core-ML, the formal language of FlowCaml [104]. Both proofs may be found in Appendix A.

Theorem (Noninterference). Given $p : t_p,\ x : t\{HIGH\} \vdash_c e : t'\{LOW\}$, where $e$ is $([\cdot])$-free and $t'$ is not a labeled type; and, for $i = 1, 2$, $\vdash_c v_i : t\{HIGH\}$. Then, for type-respecting substitutions $\sigma_i$ that map $p$ to the policy of Figure 2.9 and $x$ to $v_i$, $\sigma_1(e) \stackrel{c}{\leadsto^{*}} v_1' \wedge \sigma_2(e) \stackrel{c}{\leadsto^{*}} v_2' \Rightarrow v_1' = v_2'$.

While it would be possible to reuse our infrastructure for the dependency correctness proof to show the noninterference result for the static information flow policy (as in Tse and Zdancewic), we choose instead to use another technique, due to Pottier and Simonet [104]. This technique involves representing a pair of executions of a FABLE program within the syntax of a single program and showing that a subject reduction property holds. As with the logical relations proof, once we had constructed the infrastructure to use this technique, the proof was an easy consequence of FABLE's preservation theorem.

2.2.4 Dynamic Information Flow

Realistic information flow policies are rarely as simple as that of Section 2.2.3. For example, the security label of some data may not be known until run-time, and the label itself may be more complex than a simple atom, e.g., it might be drawn from the DLM [89] or some other higher-level policy language, such as RT [125].

Figure 2.10 shows how dynamic security labels can be associated with the data and an information flow policy enforced using a combination of static and dynamic checks [149]. The label lattice is defined by the external oracle function. The enforcement policy interfaces with the oracle through the function flow, which expects two labels src and dest as arguments and determines whether the oracle permits information to flow from src to dest.
The representation of these labels is abstract in the policy and depends on the implementation of the oracle. The flow function is given the type $(src{:}lab) \to (dest{:}lab) \to (l{:}lab \times unit\{l\})$. If the oracle permits the flow, the flow function returns a capability similar to that provided by the login function of Figure 2.3. The sub function takes this capability as its first argument as proof that type $\alpha\{src\}$ may be coerced to type $\alpha\{dest\}$. The low function must appeal to the oracle to acquire the bottom label in the lattice. The app function is analogous to the apply function in the static information flow policy. It takes a labeled function f and its argument x as parameters. In the body, it unlabels f and applies it to x (after unlabeling x also). Since the returned value depends both on the function and the argument, we label it with the labels of both f and x.

    policy flow(src:lab, dest:lab) =
      let f = if oracle src dest then FLOW(src,dest) else NOFLOW in
      (f, {f}())
    policy low⟨α⟩ (x:α) = let l = oracle_low() in (l, {l}x)
    policy sub⟨α,src,dest⟩ (cap:unit{FLOW(src,dest)}, x:α{src}) = {dest}x
    policy app⟨α,β,l,m⟩ (f:(α → β){l}, x:α{m}) = {JOIN(l, m)} ({◦}f) ({◦}x)

    let client⟨α⟩ (lb:lab, b:bool{lb}, lx:lab, x:α{lx}, ly:lab, y:α{ly}) =
      let lxy = JOIN(lx,ly) in
      let fx,capx = flow lx lxy in
      let fy,capy = flow ly lxy in
      match fx,fy with
        FLOW(lx,lxy), FLOW(ly,lxy) ⇒
          let x = sub [α] capx x in
          let y = sub [α] capy y in
          let tmp = app [α] [α → α] (b[α]) x in
          app [α] [α] tmp y
      | _,_ ⇒ ...  #flow must be allowed if oracle is a lattice

Figure 2.10: A dynamic information flow policy and a client that uses it

The bottom part of Figure 2.10 shows a client program that illustrates a usage of this policy. This client program has the same high-level behavior as the example program we showed for the static information flow policy (it branches on a boolean and returns either x or y), but here the security labels of the arguments are not statically known.
Instead, the argument lb is a label term that specifies the security level of b, and similarly lx for x and ly for y. As previously, our encoding of booleans requires each branch to have the same type, including the security label. In this case, the program arranges the branches to have a type labeled JOIN(lx,ly). The first three lines of the main expression use the flow function to attempt to obtain capabilities that witness the flow from lx and ly to JOIN(lx,ly). The match inspects the labels that are returned by flow and, in the case where they are actually FLOW(...), the final premise of (T-MATCH) permits the type of fx to be refined from lab ∼fx to lab ∼FLOW(lx,lxy) and the type of capx to be refined to unit{FLOW(lx,lxy)}, and similarly for capy. The remainder of the program is similar to the static case, but requires more uses of subsumption since less is known statically about the labels. The type of this program is:

    ∀α. (lb:lab) → bool{lb} → (lx:lab) → α{lx} → (ly:lab) → α{ly} → α{JOIN(JOIN(lx,ly), lb)}

We have not explicitly proved a noninterference property for this policy. However, a proof would essentially combine the proof of dependency correctness for the provenance tracking policy and the proof of noninterference for the static information flow policy.

2.3 Composition of Security Policies

All our correctness theorems impose the condition that an application program be ([·])-free. That is, these theorems apply only to situations where a single policy is in effect within a program. However, in practice, multiple policies may be used in conjunction and we would like to reason that interactions between the policies do not result in violations of the intended security properties.
To characterize the conditions under which a policy can definitely be composed with another, we define a simple type-based criterion which, when satisfied by two (or more) policies P and Q, implies that neither policy will interfere with the functioning of the other policy when applied in tandem to the same program.

    Composes(P,t): A type t is wrapped within the label namespace P

    $$\frac{\mathit{Composes}(P,t) \qquad t\{e\} = t\{P(e')\}}{\mathit{Composes}(P,\, t\{e\})} \qquad \frac{\mathit{Composes}(P,t)}{\mathit{Composes}(P,\, \forall\alpha{::}\kappa.t)} \qquad \frac{\forall i.\ \mathit{Composes}(P,t_i)}{\mathit{Composes}(P,\, (x{:}t_1) \to t_2)}$$

Figure 2.11: A type-based composability criterion

Figure 2.11 defines a predicate Composes(P,t), which states that all the labels that appear in the type t of a policy term are enclosed within a top-level constructor P, i.e., the constructor P serves as a namespace within which all the labels are enclosed. Intuitively, a policy can be made composable by enclosing all its labels within a unique top-level label constructor that fulfills the role of a namespace. A policy that only manipulates labels and labeled terms that belong to its own namespace can be safely composed with another policy.

The main benefit of compositionality is modularity; when multiple composable policies are applied to a program, one can reason about the security of the entire system by considering each policy in isolation. Policy designers that are able to encapsulate their policies within a namespace can package their policies as libraries to be reused along with other policy libraries.

Our notion of composition is a noninterference-like property: a policy is deemed composable if it can be shown not to depend on, or influence the functioning of, another policy. The statement of this property appears below.

Theorem (Noninterference for policy composition). Given

(A1) $\vec{x} : \vec{t},\ \vec{y} : \vec{s} \vdash_{app} e : t\{P(l)\}$, such that $e$ is $([\cdot])$-free
(A2) $\{e_1, \ldots, e_n\}$ such that $\forall i.\ \vdash_{pol} e_i : t_i \;\wedge\; \mathit{Composes}(P, t_i)$
(A3) $\{f_1, \ldots, f_m\}$ and $\{g_1, \ldots, g_m\}$ such that $\forall i.\ \vdash_{pol} f_i : s_i \;\wedge\; \vdash_{pol} g_i : s_i \;\wedge\; \mathit{Composes}(Q, s_i)$, with $P \neq Q$
(A4) $\sigma_f = (\vec{x} \mapsto (([e_1]), \ldots, ([e_n])),\ \vec{y} \mapsto (([f_1]), \ldots, ([f_m])))$, and $\sigma_g = (\vec{x} \mapsto (([e_1]), \ldots, ([e_n])),\ \vec{y} \mapsto (([g_1]), \ldots, ([g_m])))$
(A5) $\sigma_f(e) \leadsto_c^{*} v_f \;\wedge\; \sigma_g(e) \leadsto_c^{*} v_g$

Then, $v_f = v_g$.

Assumption (A1) in the statement of the theorem above posits a well-typed application program e that refers to two sets of policy terms $\vec{x}$ and $\vec{y}$. Additionally, (A1) requires e to have a labeled type, where the label P(l) is drawn from the namespace P. Assumption (A2) posits the existence of well-typed terms $\vec{e}$ that inhabit the types $\vec{t}$ of $\vec{x}$, where each type $t_i$ is drawn from the P-namespace. Similarly, assumption (A3) posits two sets of terms $\vec{f}$ and $\vec{g}$, both of which inhabit the types $\vec{s}$ of $\vec{y}$, where each type $s_i$ is drawn from a different namespace Q. The remaining hypotheses and conclusion of this theorem state that if e is linked with $\vec{e}$ and, in one case, with $\vec{f}$ and in another case with $\vec{g}$, then the values $v_f$ and $v_g$ produced by an evaluation of e in each case are identical. That is, when the type of e indicates that it should produce a value protected by the policy in the P-namespace, then the specific implementation of the policy in the Q-namespace is insignificant. Or, somewhat more intuitively, this theorem states that the choice of the Q-policy cannot influence the behavior of the P-policy. The proof of this theorem is a corollary of the noninterference result for the static information flow policy using a degenerate lattice where P and Q are incomparable.

This notion of security policy composition generalizes the results of all the security theorems shown here. The Composes(P,t) predicate provides a recipe by which each of our policy encodings can be adapted so that they compose well with all other policies that have also been so adapted. However, the model of composition proposed here is fairly simple: it essentially allows no interaction between policies.
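The Composes(P,t) predicate of Figure 2.11 is a structural check on types, which the following Python sketch approximates over a toy type representation. The classes `Base`, `LabeledT`, `Forall`, `Fun`, and `Ctor` are hypothetical stand-ins for FABLE's types and label terms, and treating unlabeled base types as trivially composable is an assumption of this example rather than something the figure states.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Ctor:
    """A label term built from a constructor, e.g. P(HIGH)."""
    name: str
    args: Tuple = ()


@dataclass(frozen=True)
class Base:
    """An unlabeled base type or type variable, e.g. int."""
    name: str


@dataclass(frozen=True)
class LabeledT:
    """A labeled type t{e}."""
    inner: object
    label: Ctor


@dataclass(frozen=True)
class Forall:
    """A universally quantified type."""
    var: str
    body: object


@dataclass(frozen=True)
class Fun:
    """A (dependent) function type (x:t1) -> t2."""
    arg: object
    res: object


def composes(p: str, t: object) -> bool:
    """True if every label in t is wrapped in the top-level constructor p."""
    if isinstance(t, Base):
        return True  # assumption: no labels, so nothing to check
    if isinstance(t, LabeledT):
        return t.label.name == p and composes(p, t.inner)
    if isinstance(t, Forall):
        return composes(p, t.body)
    if isinstance(t, Fun):
        return composes(p, t.arg) and composes(p, t.res)
    raise TypeError(f"unknown type form: {t!r}")
```

Under this reading, a policy function whose type mentions a label from a second namespace Q fails the check for P, which is the situation the composition theorem rules out.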
As with noninterference properties in other contexts, this is often too restrictive for many realistic examples in which policies, by design, must interact with each other. We find that policies that do not compose according to this definition perform a kind of declassification (or endorsement) by allowing labeled terms to exit (or unlabeled terms to enter) the policy's namespace. We conjecture that the vast body of research into declassification [112] can be brought to bear here in order to recover a degree of modularity for interacting policies.

Aside from generalization via composition, we could also imagine generalizing our security theorems in more ad hoc ways. For example, one could try to prove that nonobservability holds in the presence of multiple user credentials, or with multiple protected objects. In the case of access control, it appears straightforward to prove that such a generalization holds. However, it seems unlikely that such extensions could be proved secure without a policy-specific analysis. For example, in the case of access control with multiple user credentials, one would need to show that a policy implementation does not mistakenly confuse credentials and grant improper access to an unauthorized user. The idea of an enforcement policy in FABLE speaks directly to this concern: it allows all the details of policy enforcement to be defined precisely so that a security analysis can be conducted.

2.4 Concluding Remarks

This chapter has presented FABLE, a core formalism for a programming language in which programmers may specify security policies and reason that these policies are properly enforced. We have shown that FABLE is flexible enough to implement a wide variety of security policies, including access control, provenance, and static information flow, among other policies. We have defined extensional correctness properties for each of our policies and proved that type-correct programs using our policy encodings exhibit these properties.
In discussing the structure of the proofs of each of our security theorems, we have argued that FABLE's design greatly simplifies these proofs. In particular, FABLE's metatheory provides a useful repository of lemmas that can be used to discharge many important proof obligations. Finally, we have proposed a method by which composite policies can be applied to a program while still preserving the security properties of each component.

Our focus here has been on enforcing security policies in a purely functional setting. While this has helped keep the presentation simple, in practice, security policies are often stateful and must be applied to programs that may themselves manipulate mutable state. In the following chapters, we show that the basic approach of FABLE can be extended to enforce policies that account for state modification.

3. Enforcing Stateful Policies for Functional Programs

Security policies frequently make authorization decisions based on events that may have occurred during a program's execution. For example, various models of stack- and history-based access control have been proposed to modulate the privileges of a piece of code depending on what code has already been executed in a program [51, 1]. The access rights of principals can also change during a program's execution. For instance, with long-running operating systems, network servers, and database systems, new principals may enter the system, while existing principals may leave or change duties.

Changes to a principal's privileges may also be transient. In a role-based policy [107], in adherence to the principle of least privilege, users are required to activate a role before requesting access to a resource. Once the access is complete, the user deactivates the role. Such a policy can be implemented in terms of a security automaton [113], where each state records a set of valid facts (e.g., the rights of principals) and security-sensitive events (e.g., role activations) trigger state transitions.
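To make the role-activation example concrete, here is a minimal Python sketch of such a security automaton. The event encoding, the `RoleAutomaton` class, and the `accepts` helper are hypothetical names chosen for this illustration; the state is simply the set of currently active (user, role) pairs, and an event that the automaton rejects halts the run, as an execution monitor would.

```python
from typing import Dict, Iterable, Set, Tuple

# An event is (kind, user, role), with kind in {"activate", "deactivate", "access"}.
Event = Tuple[str, str, str]


class RoleAutomaton:
    """Security automaton whose state is the set of active (user, role) facts."""

    def __init__(self, assigned: Dict[str, Set[str]]):
        self.assigned = assigned                  # roles each user may activate
        self.active: Set[Tuple[str, str]] = set()  # current automaton state

    def step(self, event: Event) -> bool:
        """Process one event; return False if the policy rejects it."""
        kind, user, role = event
        if kind == "activate":
            if role not in self.assigned.get(user, set()):
                return False
            self.active.add((user, role))
            return True
        if kind == "deactivate":
            self.active.discard((user, role))
            return True
        if kind == "access":
            # Access is permitted only while the role is active.
            return (user, role) in self.active
        return False


def accepts(automaton: RoleAutomaton, events: Iterable[Event]) -> bool:
    """Run the automaton, stopping at the first rejected event."""
    return all(automaton.step(e) for e in events)
```

A trace that activates a role, accesses the resource, and then deactivates the role is accepted; an access attempted after deactivation is rejected because the corresponding fact is no longer part of the state.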
Even when not concerned with the dynamic changes to access rights, many common policies are naturally phrased in terms of mutable state (in contrast to the purely functional policies of the previous chapter). For example, in an effort to ensure separation of duties, a company's policy may permit a payment to be released only after it has been authorized by two different managers [15]. One could implement this policy using an automaton which is in the initial state when a payment is requested. Each time an authorization is submitted by a manager, the automaton transitions to a new state. An accepting state is reached when two different authorizations have been received, and only then is the permission to make the payment granted.

Security automata policies have been studied extensively, and are particularly important because they are known to precisely characterize the set of safety properties that can be enforced by an execution monitor. Prior work on enforcing automata-based policies has, for the most part, relied on transforming programs to insert inlined reference monitors [45]. However, this approach has a large trusted computing base in that the compiler that does the transformation has to be trusted to correctly insert code to intercept all security-relevant program actions. We would prefer to have a way of verifying that the code produced by one of these transformations correctly implements the automaton policy; this would remove the complicated compiler from the trusted computing base.

In this context, a type-based approach to verifying the correct enforcement of an automaton-based policy can be particularly useful. Type checking is generally a fairly lightweight syntactic procedure, likely to scale to large programs. In this chapter, we describe an extension to FABLE that, in addition to all the policies explored in Chapter 2, can be used to verify the enforcement of automata-based policies.

3.1 Overview

Our approach has two parts.
We begin by introducing a concrete instance of a stateful policy intended to control the terms under which information in the possession of one principal can be released to another, i.e., an information release policy. Our model is based on AIR (Automata for Information Release), a formal language we developed for defining information release policies. AIR's design follows from the observation that information release policies can be naturally expressed as automata. As obligations mentioned in the policy are fulfilled by the program, the state of the automaton advances towards an accepting state; a release is authorized only in the accepting state. AIR policies are able to address a number of concerns, including, to varying degrees, each of the four dimensions of declassification [112]. As such, AIR is of independent interest insofar as, to our knowledge, no other language allows a high-level information release policy (of comparable expressiveness) to be specified separately from the program that is to be secured.

Second, we define λAIR (pronounced "lair"), a language related to FABLE in which type-correct programs can be shown to correctly enforce an AIR policy. Although λAIR extends FABLE with singleton and affine types [121, 140] and uses a more general language of type constructors, the means by which a policy is enforced in λAIR follows the same pattern as in FABLE. The first step is to protect sensitive data in the program using a security labeling. For example, an object x representing the state of a security automaton is given the affine type ¡Instance^N, where N is a type-level name unique to x. (Affine types in λAIR are written ¡t, to contrast with the "of course" modality in linear logic, which is typically denoted using !.) Then, an integer i protected by x would be given type Protected Int N, which is analogous to a labeled type in FABLE.
When the automaton transitions to a new state y, because x has an affine type, we are able to consume the old state x and ensure that the new state y is used in subsequent authorization decisions.

Operations in the AIR policy that correspond to policy state transitions are represented by privileged policy functions in λAIR. In order to manipulate data with a Protected type, a λAIR program is required to call these policy functions, just as with enforcement policy functions in FABLE. Policy functions in λAIR take arguments that express release obligations. These obligations are given dependent types, where an object having that type serves as a proof that the obligation has been fulfilled. For example, data could be released to a principal p only if p acts for some principal q (where p and q are program variables that store public keys). A proof of this fact could be represented by an object with type ActsFor p q, where ActsFor is a programmer-defined dependent type constructor. Generally speaking, proof objects represent certificates which are used to produce a certified evaluation of stateful policy logic: every authorization decision is accompanied by a proof that all obligations mandated by the high-level policy have been met.

To focus on the new elements of the type system, our presentation of λAIR takes a more abstract view (relative to FABLE) of the policy functions. Rather than include concrete enforcement policy functions, in λAIR we allow the policy designer to provide just a type signature for these functions. For example, to interpret access control lists in λAIR, we might include a type signature that gives access_simple the type (acl:Lab) → (Protected Int acl) → Int. This type states that access_simple is a function that takes a label acl as its first argument; an integer protected by this label as the second argument; and returns an unlabeled integer.
In FABLE, the runtime behavior of access_simple is implemented in the language itself (in bracketed code) and includes a specific membership test and an unlabeling operation. Here, the semantics of this function is specified outside of λAIR, using an abstract model. Additionally, λAIR's more general language of type constructors allows us to give more convenient types to capabilities and certificates. For example, instead of representing a flow capability using a value of type unit{FLOW(src, dst)}, as we did with FABLE in Section 2.2.4, we can simply define a dependent type constructor Flow, and give the capability a type Flow src dst. In Section 3.6 we show how the FABLE type system can be embedded in λAIR.

3.2 AIR: Automata for Information Release

Many organizations, including financial institutions, health-care providers, the military, and even the organizers of academic conferences, wish to specify the terms under which sensitive information in their possession can be released to their partners, clients, or the public. Such a specification constitutes an information release policy. These policies are often quite complex. For example, consider the policy that regulates the disclosure of military information to foreign governments as defined by the United States Department of Defense [128]. This policy includes the following provisions: a release must be authorized by an official with disclosure authority who represents the DoD Component that originated the information; the system must edit or rewrite data packages to exclude information that is beyond that which has been authorized for disclosure; a disclosure shall not occur until the foreign government has submitted a security assurance [. . .] on the individuals who are to receive the information; and the release must take place in the Foreign Disclosure and Technical Information System, in which both approvals and denials of a release request must be logged.
We would like to ensure that software systems that handle sensitive data (including military systems, but also programs like medical-record databases, online auction software, and network appliances) correctly enforce such a high-level policy. As a concrete example, consider a specific kind of application called a cross-domain guard. These are programs, like network firewalls, that mediate the transfer of information between organizations at different trust levels. Commercial guards, e.g., the Data Sync guard produced by BAE [48], do not enforce high-level policies but rather implement low-level "dirty keyword" filters.

The research community has only recently begun to consider the verified enforcement of release policies. For instance, FlowWall [62] is arguably the research counterpart of a system like the Data Sync guard. By virtue of its being built with the Jif programming language [31], FlowWall is sure to enforce a low-level filtering policy, but it does not appeal to high-level information release criteria. Augmenting information flow policies with high-level conditions that control information release has been proposed by Chong and Myers [30] and, more recently, by Banerjee and Naumann [9]. However, in both these cases, reasoning separately about high-level release decisions is difficult since the release policy is embedded within the program.

To fill this gap, we define AIR, a formal language for defining information release policies separately from the program that is to be secured. AIR's design follows from the observation that an information release policy is a kind of stateful authorization policy naturally expressed as an automaton. Satisfaction of a release obligation advances the state of the automaton, and once all obligations have been fulfilled, the automaton reaches the accepting state and the protected information can be released. AIR allows one to express such automata in a natural way.
In subsequent sections, we show how an AIR policy can be compiled to an API in λAIR, where each API function corresponds to an automaton transition such that the type of that function precisely expresses the evidence necessary for a transition to succeed. These API functions represent the enforcement policy that ties an AIR policy to a λAIR program. The type system of λAIR ensures that programs use the compiled AIR API correctly and, as a consequence, meet the specifications of the high-level policy. More precisely, we prove that the sequence of events produced by a program's execution is a word in the language accepted by the AIR automaton.

Using our techniques, one could build a cross-domain guard that adheres to high-level policy prescriptions; e.g., it would release information only after confirming that appropriate security assurances have been received, that to-be-released data packages have been rewritten appropriately, and that audit logs have been updated.

Our use of AIR policies for information release departs from prior work on declassification policies in that we do not focus on establishing a noninterference-like property for programs. However, our work complements noninterference-oriented interpretations of information release. In particular, by showing how to embed FABLE in λAIR (Section 3.6), we argue that high-level AIR policies can be enforced in conjunction with information flow in λAIR. For example, we could ensure that an adversary can never influence a program to cause information to be released, and furthermore, when it is released, it always follows the prescription of the high-level AIR policy.

3.2.1 Syntax of AIR, by Example

An AIR policy consists of one or more class declarations. A program will contain instances of a class, where each instance protects some sensitive data via a labeling. Protected data can be accessed in two ways.
First, each class C has an owning principal P such that P and all who act for P may access data protected by an instance of C. Second, each class defines a release policy by which its protected data can be released to an instance of a different class. The release policy is expressed using rules that define a security automaton, which is a potentially infinite state machine in which states represent security-relevant configurations. In the case of AIR, the security automaton defines conditions that must hold before data can be released. Each class instance consists of its current state, and each condition that is satisfied transitions the automaton to the next state. These transitions ultimately end in a release rule that allows data to be released to a different class instance, potentially in a modified form. Because sensitive data is associated with instances rather than classes, multiple resources may be governed by the same policy template (i.e., the automaton defined by the class) but release decisions are made independently for each resource. Dually, related resources can be protected by the same instance, thereby allowing release decisions made with respect to one resource to affect the others.

The formal syntax of AIR policies is presented in Figure 3.1. We explain the syntax of AIR while stepping through a running example, shown in Figure 3.2. A class declaration consists of a class identifier, an identifier for the owning principal, a list of automaton states, and a sequence of rules that define the automaton transitions.

    Metavariables
      id        class and rule ids
      P         principals
      C         state constructors
      n, i, j   integers
      x, y, z   variables

    Core language
      Declarations  D ::= class id = (principal:P; states: $\vec{S}$; $\vec{R}$)
      States        S ::= C | C of $\vec{t}$
      Rules         R ::= id : R | id : T
      Release       R ::= When G release e with next state A
      Transition    T ::= When G do e with next state A
      Guards        G ::= x requested for use at y and $\exists\vec{x{:}t}.\vec{C}$
      Conditions    C ::= A1 IsClass A2 | A1 InState A2 | A1 ActsFor A2 | A1 ≤ A2
      Atoms         A ::= n | x | id | P | C($\vec{A}$) | A1 + A2 | Self | Class(A) | Principal(A)

    e is an expression and t is a type in λAIR. (cf. Figure 3.4)

Figure 3.1: Syntax of AIR

Our example declares a single class US_Army_Confidential, owned by the principal US_Army, that defines the policy for confidential data owned by the U.S. Army. For simplicity, our examples use a flat namespace for class identifiers, and abstract names for principals.

Automaton states are represented by terms constructed from an algebraic datatype. The example has two kinds of states. The nullary constructor Init represents the initial state of the automaton; all classes must have this state. The other kind of state is an application of the unary constructor Debt to an argument of type Int. Constructors of the form C of $\vec{t}$ may carry data as indicated by the types $\vec{t}$. Types t (such as Int) are drawn from the programming language λAIR in which programs using AIR policies are written; λAIR is discussed in the next section.

Each rule in an AIR class is given a name, and is either a release rule or a transition rule. Each rule begins with a clause When x requested for use at d, which serves to bind variables x and d in the remainder of the rule. Here, x names the information protected by an instance of this class, requested for release to some other instance d (usually of another class). This clause is followed by a conjunction of conditions that restrict the applicability of a rule; we discuss these in more detail below.
Following these conditions, the rule specifies a λAIR expression e that can either release information (perhaps after downgrading it by filtering or encryption) or do some other action (like logging), depending on whether the rule is a release rule or a transition rule. A rule concludes with the next state of the automaton.

The first rule in the US_Army_Confidential class is a release rule called Conf_secret. This rule is qualified by a condition expression Class(d) IsClass US_Army_Secret stating that the rule applies when releasing x to an instance d of a class named US_Army_Secret. If applicable, this rule allows x to be released without modification: the release expression is simply x, and not some function that downgrades x. After the release, the automaton remains in its current state, i.e., the state Self.

We use a small ontology for conditions based on integers, principals, classes and their instances; IsClass, mentioned above, is one such condition. We expect this ontology to be extended as needed. Generally speaking, condition expressions C are typed binary predicates over atoms A. For example, A1 ActsFor A2 is defined for Principal-typed atoms A1 and A2, and asserts that A1 acts for A2 according to some acts-for hierarchy among principals (not explicitly modeled here). Atoms include integers n, variables x, identifiers id, principal constants P, state literals constructed from an application of a state constructor C to a list of atoms, addition of integers, and the implicit variable Self.
We also include two operators: Class(z), which is the class of the argument z, a class instance; and Principal(z), which is the principal that owns the class z. Finally, we permit a condition C to be prefixed by one or more existentially quantified variables, i.e., in ∃x1:t1.C1, . . . , ∃xn:tn.Cn, each xi is a variable of type ti and is in scope as far to the right as possible, until the end of the rule. We omit the quantifier prefix when no such variables exist.

    class US_Army_Confidential =
      principal : US_Army;
      states : Init, Debt of Int;
      Conf_secret :
        When x requested for use at d and
          Class(d) IsClass US_Army_Secret
        release x
        with next state Self
      Conf_init :
        When x requested for use at d and
          Self InState Init
        do
        with next state Debt(0)
      Conf_coalition :
        When x requested for use at d and
          Principal(Class(d)) ActsFor Coalition,
          ∃count:Int. Self InState Debt(count),
          count ≤ 10
        release (log(...x...d); encrypt (pubkey (principal (class d))) x)
        with next state Debt(count + 1)

Figure 3.2: A stateful information release policy in AIR

3.2.2 A Simple Stateful Policy in AIR

Taken as a whole, the class US_Army_Confidential can be thought of as implementing a simple kind of risk-adaptive access control [27], in which information is released according to a risk budget, with the intention of quantifying the risks vs. the benefits of releasing sensitive information. This class maintains a current risk debt, as reflected in the state Debt of Int. Each time the class authorizes an information release, we add an estimate of the risk associated with that release to the debt. When the accumulated risk debt exceeds a threshold, releases outside the U.S. Army are no longer permitted.

The other two rules in the policy, Conf_init and Conf_coalition, implement this behavior. The Conf_init transition rule applies when processing a release to an instance d and when the automaton is in the Init state. The do expression initializes the risk debt to 0 by transitioning the automaton to the Debt(0) state.
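The three rules of Figure 3.2, including the Conf_coalition rule discussed next, can be read operationally as functions from the current automaton state to the next one. The Python sketch below is one such reading, not the λAIR compilation described later: the `Instance` class, the `ACTS_FOR` table, the `Allied_Partner` principal, the `AUDIT_LOG` list, and the placeholder `encrypt_for` function are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Init:
    """Initial automaton state."""


@dataclass(frozen=True)
class Debt:
    """State carrying the accumulated risk debt."""
    count: int


State = Union[Init, Debt]


@dataclass(frozen=True)
class Instance:
    """An instance of an AIR class: class name, owning principal, state."""
    class_name: str
    owner: str
    state: State = Init()


ACTS_FOR = {("Allied_Partner", "Coalition")}  # hypothetical acts-for hierarchy
RISK_BUDGET = 10
AUDIT_LOG: List[Tuple[str, str]] = []


def acts_for(p: str, q: str) -> bool:
    return p == q or (p, q) in ACTS_FOR


def encrypt_for(principal: str, x: str) -> str:
    """Placeholder for encryption under the principal's public key."""
    return f"enc[{principal}]({x})"


def conf_secret(self_: Instance, x: str, d: Instance) -> Tuple[Instance, str]:
    """Release x unchanged to US_Army_Secret; the state stays Self."""
    if d.class_name != "US_Army_Secret":
        raise PermissionError("Conf_secret does not apply")
    return self_, x


def conf_init(self_: Instance, d: Instance) -> Instance:
    """Transition from Init to Debt(0)."""
    if not isinstance(self_.state, Init):
        raise PermissionError("Conf_init does not apply")
    return Instance(self_.class_name, self_.owner, Debt(0))


def conf_coalition(self_: Instance, x: str, d: Instance) -> Tuple[Instance, str]:
    """Release an encrypted x to a coalition partner and charge the debt."""
    s = self_.state
    if not (acts_for(d.owner, "Coalition")
            and isinstance(s, Debt) and s.count <= RISK_BUDGET):
        raise PermissionError("Conf_coalition does not apply")
    AUDIT_LOG.append((x, d.class_name))
    nxt = Instance(self_.class_name, self_.owner, Debt(s.count + 1))
    return nxt, encrypt_for(d.owner, x)
```

With the guard `count <= 10`, an instance initialized to Debt(0) permits eleven coalition releases (for debts 0 through 10) before the rule stops applying.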
The Conf_coalition rule allows information to be released to a coalition partner. In particular, if the release target class is owned by a principal that acts for the Coalition (expressed by Principal(Class(d)) ActsFor Coalition), then information can be released only if the current risk debt has not exceeded the budget, as expressed in the latter two conditions. The first of these requires the current state of the automaton to be Debt(count), where count is a variable with type Int which holds the current risk debt. The last condition requires that count is not above the preallocated risk budget of 10. With these conditions satisfied, Conf_coalition logs the fact that a release has been authorized and permits release of the data after it has been downgraded using an encryption function. In this case, the downgrading expression encrypts x with the public key of the principal that owns the class of the instance d. Unlike releases to US_Army_Secret, which do not alter the risk debt, Conf_coalition increments the risk debt by transitioning to the Debt(count + 1) state, indicating that releases to the Coalition are more risky than upgrading to a higher classification level of the same organization (via rule Conf_secret).

AIR as presented here is particularly simple. We anticipate extending AIR with support for more expressive condition ontologies and release rules. For instance, instead of a fixed set of ontologies, we could embed a stateful authorization logic (say, in the style of SMP [15]) to allow custom ontologies and release rules to be programmed within an AIR class. We could also introduce a set of downgrading and logging primitives to completely separate AIR from λAIR. Additionally, AIR's object-oriented design is intended to support extensions like inheritance and overloading that are likely to help with the modular construction and management of large policies.

3.3 A Programming Model for AIR

Given a particular AIR policy, we would like to do two things.
First, we must have a way of reflecting an AIR policy in a program by protecting sensitive resources with instances of an AIR class. Second, we must ensure that all uses of protected data adhere to the prescriptions of the AIR policy. Taken together, we can then claim that an AIR policy is correctly enforced by a program.

To achieve these goals, we have defined a formal model for a language called λAIR in which one writes programs that use AIR policies. λAIR's type system ensures that these policies are used correctly. The rest of this section defines the programming model for this language and the next two sections flesh out its syntax and semantics. Section 3.5.4 proves that type-correct programs act only in accordance with their AIR policies.

The programming model for using AIR policies has two elements. First, programmers tie an AIR policy to data in the program by constructing instances of AIR classes and labeling one or more pieces of data with these instances. This association defines (1) the set of principals that may view the data (in particular, the principal P that owns the class, and any principals that may act for P), and (2) the rules that allow the data to be released. As in other security-typed languages, the labeling specification (expressed using type annotations) is part of the trusted computing base.

Second, programmers manipulate data protected by an AIR class instance through a class-specific API that is generated by compiling each AIR class definition to a series of program-level definitions. For example, each AIR class's release and transition rules are compiled to functions that can be used to release protected data. The types given to these functions ensure that a caller of the function must always provide evidence that the necessary conditions to release protected data have been met.

Figure 3.3 illustrates a program using the AIR policy of Figure 3.2, written using an ML-like notation.
(Significantly, our examples omit type annotations where they do not help clarify the exposition. λAIR does not support type inference at all.) At a high level, this program processes requests to release information from a secret file. The files are stored on the file system together with a policy label that represents a particular AIR class instance. Before disclosing the information, the program must make sure that the automaton that protects the data is in a state that permits the release.

     1  let x_a1, a1 = get_secret_file_and_policy () in
     2  let a2, channel = get_request () in
     3  (* generating evidence of policy compliance *)
     4  let a2, a2_class = get_class a2 in
     5  let ev1 = acts_for (principal a2_class) Coalition in
     6  let a1, Debt(debt), ev2 = get_current_state a1 in
     7  let ev3 = leq debt 10 in
     8  (* supplying evidence to policy API and releasing data *)
     9  let a1, a2, x_a2 = Conf_coalition a1 x_a1 a2 ev1 debt ev2 ev3 in
    10  send channel x_a2

Figure 3.3: Programming with an AIR policy

The first two lines set up the scenario. At line 1, we read the contents of a secret file into the variable x_a1 and the automaton that protects this file into the variable a1. Initially, only the principals that act for the owner of the class of a1 can view these secrets. At line 2, the program blocks until a request is received. The request consists of an output channel and another automaton instance a2 that represents the policy under which the requested information will be protected after the release. In effect, the information, once released, will be under the protection of the principal that owns the class of a2.

Prior to responding to the request, on lines 4-7 we must establish that a1 is in a state that permits the release. At line 4, we extract the class of the instance a2. At line 5, we check that the owner of a2's class acts for the Coalition principal and, if this check succeeds, we obtain a certificate ev1 as evidence of this fact.
At line 6, we extract the current state of the automaton a1, use pattern matching to check that it is of the form Debt(debt) (for some value of debt), and receive an evidence object ev2 that attests to the fact that a1 is currently in this state. At line 7, we check that the total debt associated with the current state of the automaton is not greater than 10 and obtain ev3 as evidence if the check succeeds.

At line 9 we call Conf_coalition, a function produced by compiling the AIR policy. We pass in the automaton a1 and the secret data x_a1; the automaton a2 to which x_a1 is to be released; and the certificates that serve as evidence for the release conditions. Conf_coalition returns a1', which represents the next state of the automaton (presumably in the Debt(debt+1) state); a2, the unchanged destination automaton; and finally, x_a2, which contains the suitably downgraded secret value. On the last line, we send the released information on the channel received with the request.

For programs like our example, we would like to verify that all releases of information are mediated by calls to the appropriate transition and release rules as defined by the AIR policy (functions like Conf_coalition). Additionally, we would like to verify that a program satisfies the mandates of an AIR policy rule by presenting evidence that justifies the appropriate release conditions. This evidence-passing style supports our goal of certifying the evaluation of all authorization decisions, while being flexible about the mechanism by which an obligation is fulfilled. To return to the DoD example from the introduction, this design gives us the flexibility to allow release authorizations to be obtained in one part of the system and security assurances from the recipient to be handled in another; the cross-domain guard must simply collect evidence from the other components rather than performing these operations itself.
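The evidence-passing style described above can be approximated at run time in an ordinary language. The following Python sketch is only an illustration under stated assumptions, not part of the dissertation's formal development: the names `Evidence`, `check_acts_for`, `check_leq`, and `conf_coalition` are hypothetical, and the equality tests inside the guard stand in for what the type system would establish statically.

```python
class Evidence:
    """A certificate for a proposition, minted only by the trusted checks below."""

    # Private capability (assumption: illustrative only; Python cannot truly hide it).
    _MINT = object()

    def __init__(self, prop, mint=None):
        if mint is not Evidence._MINT:
            raise PermissionError("evidence must come from a trusted check")
        self.prop = prop


def check_acts_for(principal, target, delegations):
    """Authorization component: certifies that `principal` acts for `target`."""
    if target not in delegations.get(principal, set()):
        raise PermissionError(f"{principal} does not act for {target}")
    return Evidence(("ActsFor", principal, target), Evidence._MINT)


def check_leq(n, m):
    """Budget component: certifies that n <= m."""
    if n > m:
        raise PermissionError(f"{n} <= {m} does not hold")
    return Evidence(("LEQ", n, m), Evidence._MINT)


def conf_coalition(debt, secret, recipient, ev_acts, ev_leq):
    """Guard: makes no policy decision itself, it only matches the evidence."""
    required = (
        (ev_acts, ("ActsFor", recipient, "Coalition")),
        (ev_leq, ("LEQ", debt, 10)),
    )
    for ev, prop in required:
        if not isinstance(ev, Evidence) or ev.prop != prop:
            raise PermissionError(f"missing evidence for {prop}")
    return debt + 1, secret  # next automaton state, released value
```

The guard never evaluates the acts-for relation or the inequality; it accepts a release only when each certificate states exactly the proposition required for the arguments it was given, which mirrors the division of labor between the components that obtain authorizations and the cross-domain guard.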
λAIR's type system is designed so that type correctness ensures these goals are satisfied, i.e., a type-correct program uses its AIR policy correctly. The type system has three key elements:

Singleton types. First, in order to ensure complete mediation, we must be able to correctly associate data with the class instance that protects it. For example, Conf_coalition expects its first argument to be an automaton and its second argument to be data protected by that automaton. In an ML-like type system, this function's type might have the form $\forall\alpha.\ \mathsf{Instance} \to \alpha \to t$. But such a type is not sufficiently precise since it does not prescribe any relationship between the first and second argument, e.g., allowing the programmer to erroneously pass in a2 as the first argument, rather than a1. To remedy this problem, we can give Conf_coalition a type like the following (as a first approximation):

$$\forall N, \alpha.\ \mathsf{Instance}^{N} \to \mathsf{Protected}\ \alpha\ N \to \ldots$$

Here, N is a unique type-level name for the class instance provided in the first argument. The second argument's type $\mathsf{Protected}\ \alpha\ N$ indicates that it is an $\alpha$ value protected by the instance N, making clear the association between policy and data. We can ensure that values of type $\mathsf{Protected}\ \alpha\ N$ may only be accessed by principals P that act for the owner of the class instantiated by the instance named N. This approach is more flexible than implicitly pairing each protected object with its own (hidden) automaton. For example, with our approach one can encode policies like secret sharing, in which a set of related documents are all protected by the same automaton instance. Each document's type would refer to the same automaton, e.g., Protected Doc N. Information released about one document updates the state of the automaton named N and can limit releases of the other documents.

Dependent types. Arguments 4-7 of Conf_coalition represent evidence (proof certificates) that the owner of class instance a2 acts for Coalition, and that a1 is in a state authorized to release the given data.
The types we give to these arguments reflect the propositions that the arguments are supposed to witness. For example, we give the seventh argument (ev3) to Conf_coalition the type LEQ debt 10, where LEQ is a dependent type constructor applied to two expressions, debt and 10, which themselves have type Int. Data with type LEQ n m represents a certificate that proves $n \le m$. If we allow such certificate values to be constructed only by trusted functions that are known to correctly implement the semantics of integer inequality, then we can be sure that functions like Conf_coalition are only called with valid certificates, i.e., type correctness guarantees that all certificates are valid proofs of the propositions represented by their types, and there is no need to inspect these certificates at run time. If we interface with other programs, we can check the validity of proof certificates at run time before allowing a call to proceed. Either way, the type system supports an architecture that enables certified evaluation of an AIR policy.

Affine types. The final piece of our type system is designed to cope with the stateful nature of an AIR policy. The main problem caused by a state change is illustrated by the value returned by the Conf_coalition function. In our example, a1' represents the state of the policy automaton that protects x_a1 after a release has been authorized. Thus, we need a way to break the association between x_a1 and the old, stale automaton state a1. We achieve this in two steps. First, even though our type system supports dependent types, as shown earlier, we use singleton types to give x_a1 the type $\mathsf{Protected}\ \alpha\ N$, where N is a unique type name for a1 (rather than giving x_a1 a more direct dependent type of the form $\mathsf{Protected}\ \alpha\ \mathit{a1}$). The second step is to use affine types (values with an affine type can never be used more than once) to consume stale automaton values, so that at any program point, there is only one usable automaton value that has the type name N.
Thus, we give both a1 and a1' the type $¡\mathsf{Instance}^{N}$, where $¡t$ denotes an affinely qualified type t. Once a1 is passed as an argument to Conf_coalition (constituting a use) it can no longer be used in the rest of the program; a1' is the only automaton that can be used in subsequent authorization checks for x_a1. Thus, a combination of singleton and affine types transparently takes care of relabeling data with new automaton instances. (One might also wonder how we deal with proof certificates that can become stale because of the changing automaton state; we discuss this issue in detail in Section 3.5.1.)

To illustrate how singleton, dependent, and affine types interact, we show the (slightly simplified) type of Conf_coalition below. The full type is discussed in Section 3.5.2.

$$
\begin{array}{l}
\forall N, M, \alpha.\ ¡\mathsf{Instance}^{N} \to \mathsf{Protected}\ \alpha\ N \to ¡\mathsf{Instance}^{M} \to \ldots \\
\quad \to (\mathit{debt} : \mathsf{Int}) \to \ldots \to (\mathsf{LEQ}\ \mathit{debt}\ 10) \to \\
\quad (¡\mathsf{Instance}^{N} \times ¡\mathsf{Instance}^{M} \times \mathsf{Protected}\ \alpha\ M)
\end{array}
$$

The first three arguments are the affine source automaton (a1), the data it protects (x_a1), and the affine destination automaton (a2). On the next line, we show the dependent type given to the evidence that the current debt of the automaton is not greater than 10. Finally, consider the return type of Conf_coalition. The first component of this three-tuple is a class instance with the same name N as the first argument. This returned value is the new state of the automaton named N; it protects all existing data of type $\mathsf{Protected}\ \alpha\ N$ (such as x_a1). The second component of the three-tuple is the unchanged target automaton. The third component contains the data ready to be released; its type, $\mathsf{Protected}\ \alpha\ M$, indicates that it is now protected by the target automaton instance M. In effect, λAIR models state modifications by requiring automata states to be manipulated in a store-passing style, reminiscent of a monadic treatment of side effects in a purely functional language [69].
However, by imposing the additional discipline of affine types, we are able to ensure that the program always has a consistent view of an automaton's state, while still retaining the benefits of a well-understood and relatively simple functional semantics.

The reader may be concerned about the difficulty of programming with affine types in λAIR. We put forth an argument in two parts in order to quell this concern. First, when enforcing purely functional FABLE-style policies in λAIR, affine types need not be used at all. More subtly, even when affine types are used to enforce stateful policies, we conjecture that λAIR's type system may actually simplify the programming task rather than complicate it. Admittedly, prior work on adding affine types to a programming language flies in the face of this conjecture. For example, affine types have been used in Cyclone to prevent the creation of pointer aliases [124]. When the referent of an alias-free pointer is deallocated, it is easy to show that no dangling pointers remain. In our experience, using affine types to control pointer aliasing makes programming in Cyclone considerably harder. A main difficulty is that non-affine aliasing is not always a symptom of a programming error, e.g., pointers may be freely aliased so long as no pointer in an alias set is dereferenced after the referent is deallocated.

In contrast, affine types in λAIR are used to restrict the use of stale policy states rather than to control pointer aliasing. We are optimistic about the usability of affine types in λAIR because these types appear to capture very naturally the only correct usage mode of a stateful policy: any use of a stale policy state in an authorization decision violates the consistency of the policy. Thus, we conjecture that any correct implementation of a stateful policy must adhere to an affine discipline on policy states.
λAIR's type system may actually simplify this task, since it can detect common programming errors that cause the required affine discipline to be violated. Nevertheless, we acknowledge that adhering to the constraints of λAIR's type system is surely more burdensome than using a more traditional programming language. Thus λAIR may be most appropriate for the security-critical kernel of an application, or even as the (certifiable) target language of a program transformation for inline reference monitoring. We leave to future work support for improving λAIR's usability, such as type inference.

Metavariables: B (base term functions), T (type constructors), D (data constructors), $\alpha, \beta, \gamma$ (type variables)

$$
\begin{array}{llcl}
\text{Terms} & e & ::= & x \mid \lambda x{:}t.e \mid e\ e \mid \Lambda\alpha{::}k.e \mid e\ [t] \mid B \mid D \\
 & & \mid & \mathsf{case}\ e\ \mathsf{of}\ x{:}t.e : e\ \mathsf{else}\ e \mid \bot \mid \mathsf{new}\ e \\
\text{Types} & t & ::= & (x{:}t) \to t \mid \alpha \mid \forall\alpha{::}k \xrightarrow{\varepsilon} t \mid T \mid q\,t \mid t\ t \mid t\ e \mid t^{\alpha} \\
\text{Affinity} & q & ::= & ¡ \mid \cdot \\
\text{Simple kinds} & k & ::= & U \mid A \mid N \\
\text{Kinds} & K & ::= & k \mid k \to K \mid t \to K \\
\text{Phase index} & \varphi & ::= & \mathsf{term} \mid \mathsf{type} \\
\text{Signatures} & S & ::= & (B{:}t) \mid (D{:}t) \mid (T{::}K) \mid S, S \\
\text{Type env.} & \Gamma & ::= & \Gamma, x{:}t \mid \Gamma, \alpha{::}k \mid S \\
\text{Affine env.} & A & ::= & x \mid A, A
\end{array}
$$

The figure also gives productions for type names and for name constraints $\varepsilon$.

Figure 3.4: Syntax of λAIR

3.4 Syntax and Semantics of λAIR

λAIR extends a core System F [85] with support for singleton, dependent, and affine types. λAIR is parameterized by a signature S that defines base term functions B, data constructors D, and type constructors T; each AIR class declaration D is compiled to a signature $S_D$ that acts as the API for programs that use D. All AIR classes share some elements in common, like integers, which appear in a prelude signature $S_0$. We explain the core of λAIR using examples from the prelude. The next section describes the remainder of the prelude and shows how our example AIR policy is compiled.

3.4.1 Syntax

Figure 3.4 shows the syntax of λAIR.
The core language expressions e are mostly standard, including variables x, lambda abstractions $\lambda x{:}t.e$, application $e\ e'$, type abstraction $\Lambda\alpha{::}k.e$, and type application $e\ [t]$. Functions have dependent type $(x{:}t) \to t'$ where x names the argument and may be bound in $t'$. Type variables are $\alpha$. A type t universally quantified over all types $\alpha$ of kind k is denoted $\forall\alpha{::}k \xrightarrow{\varepsilon} t$. Here, $\varepsilon$ is a name constraint that records the type names given to automaton instances in the body of the abstraction; we discuss these in detail later. When the constraint is empty we write a universally quantified type as $\forall\alpha{::}k.t$. The signature S defines the legal base terms, B and D, and type constructors T, mapping them to their types t and kinds K, respectively. (We distinguish between base term functions and data constructors syntactically since, as illustrated in Section 3.4.3, they have different operational semantics.)

The prelude $S_0$ defines several standard terms and types which we use to illustrate some of λAIR's main features. The type constructor Int represents the type of integers, and is given kind U in the prelude (written Int::U). Kind U is one of three simple kinds k. A type t with simple kind A is affine in that the typing rules permit terms of type t to be used at most once. $¡t$ is an instance of the form $q\,t$ where $q = ¡$. Terms whose types have kind U are unrestricted in their use (explaining the choice of U as the name of this kind). We explain kind N, the kind of type names, shortly.

The prelude also defines two base data constructors for constructing integers: Zero : Int represents the integer 0, while $\mathsf{Succ} : \mathsf{Int} \to \mathsf{Int}$ is a unary data constructor that produces an Int given an Int. Data constructor application is written $e\ (e')$; thus the integer 1 is represented Succ (Zero) (but we write 0, 1, 2 etc. for brevity). Programs can pattern match data constructor applications using the expression form case e of x:t.e : e else e. This is mostly standard; details are in Appendix B.
In addition to simple kinds k, kinds K more generally can classify functional type constructors, using the forms $k \to K$ and $t \to K$. A type constructor $t_1$ having the first form can be applied to another type (as $t_1\ t_2$) to produce a (standard) type, while one of the second form can be applied to a term (as $t\ e$) to produce a dependent type. As an example of the first case, the prelude defines a type constructor $\times{::}U \to U \to U$ to model pairs; $\times\ \mathsf{Int}\ \mathsf{Int}$ is the type of a pair of integers (for clarity, from here on we will use infix notation and write a pair type as $t \times t'$). The prelude also defines a base-term constructor Pair which has a polymorphic type $\forall\alpha, \beta{::}U.\ \alpha \to \beta \to \alpha \times \beta$ for constructing pair values.

Evidence for condition expressions in an AIR policy is given dependent types. For example, the prelude provides means to test inequalities $A_1 \le A_2$ that appear in a policy and generate certificates that witness an inequality:

$$(\mathsf{LEQ}{::}\mathsf{Int} \to \mathsf{Int} \to U),\quad (\mathsf{leq}{:}(x{:}\mathsf{Int}) \to (y{:}\mathsf{Int}) \to \mathsf{LEQ}\ x\ y)$$

LEQ is a dependent-type constructor that takes two expressions of type Int as arguments and produces a type having kind U. This type is used to classify certificates that witness the inequality between the term arguments. These certificates are generated by leq, which is a base term function with a dependent type: the labels x and y on the first two arguments appear in the returned type. Thus the call leq 3 4 would return a certificate of type LEQ 3 4 because 3 is indeed less than 4. An attempt to construct a certificate LEQ 4 3 by calling leq 4 3 would fail at run time, returning $\bot$ (an unrecoverable failure) in our semantics; we could use option types to handle failures more gracefully. The signature does not include a data constructor for the LEQ type, so its values cannot be constructed directly by programs; the only way is by calling the leq function.

We discuss the remaining constructs, including name constraints $\varepsilon$, named types $t^{\alpha}$, and the new e construct, in conjunction with the type rules next.
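As a rough run-time analogue of the LEQ/leq pair, the Python sketch below keeps the certificate constructor private, lets `leq` fail with an exception that plays the role of the unrecoverable failure, and adds an option-style variant of the kind mentioned above. It is a sketch under stated assumptions: the names `Bottom`, `leq_opt`, and `needs_leq_10` are hypothetical, and the index comparison in `needs_leq_10` is a dynamic stand-in for the dependent type (debt:Int) → LEQ debt 10 → t, which λAIR checks statically.

```python
_KEY = object()  # held only by leq and leq_opt (illustrative; not real protection)


class LEQ:
    """Run-time stand-in for a certificate of type LEQ n m."""

    def __init__(self, n, m, key=None):
        if key is not _KEY:
            raise TypeError("LEQ has no public data constructor; call leq")
        self.n, self.m = n, m


class Bottom(Exception):
    """Models the unrecoverable failure produced by a failed check."""


def leq(x, y):
    """Returns a certificate for x <= y, or fails."""
    if x <= y:
        return LEQ(x, y, _KEY)
    raise Bottom(f"leq {x} {y}")


def leq_opt(x, y):
    """Option-style variant: None instead of an unrecoverable failure."""
    return LEQ(x, y, _KEY) if x <= y else None


def needs_leq_10(debt, ev):
    """Accepts only a certificate whose indices are exactly (debt, 10)."""
    if not isinstance(ev, LEQ) or (ev.n, ev.m) != (debt, 10):
        raise TypeError("certificate does not witness debt <= 10")
    return "released"
```

A certificate obtained for one value of debt is rejected when presented for another, which is the property that the labels x and y in the return type of leq provide at compile time.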
3.4.2 Static Semantics

Figure 3.5 shows the main rules from the static semantics of λAIR, which consists of two judgments. The full semantics can be found in Appendix B. The typing judgment is parameterized by a phase index $\varphi$, which indicates whether the judgment applies to a term- or type-level expression. (Note that, though seemingly related, the phase index is not to be confused with the color index in FABLE. Colors distinguish application from enforcement policy code. The phase index distinguishes expressions that appear at the type level from those that appear within terms.) The judgment giving an expression e a type t is written $\Gamma; A \vdash_{\varphi} e : t; \varepsilon$, where $\Gamma$ is the standard typing environment augmented with the signature S (used to type base terms and type constructors), A is a list of affine assumptions, and $\varepsilon$ is a name constraint that records the set of fresh type names assigned to automata instances in e. The second judgment, $\Gamma \vdash t :: K$, states that a type t has kind K in the environment $\Gamma$.

[Figure 3.5: Static semantics of λAIR (selected rules). Typing judgment $\Gamma; A \vdash_{\varphi} e : t; \varepsilon$ (a $\varphi$-level expression e has type t and uses names $\varepsilon$): rules (T-X), (T-XA), (T-X-type), (T-NC-type), (T-WKN), (T-NEW), (T-DROP), (T-TAB), (T-ABS), (T-TAP), (T-APP), with auxiliary functions a(x, k) and p(A, $\varepsilon$). Kinding judgment $\Gamma \vdash t :: K$ (a type t has kind K in environment $\Gamma$): rules (K-A), (K-N), (K-AFN), (K-FUN), (K-UNIV), (K-DEP).]

Recall that the type system must address three main concerns. First, we must correctly assign unique type names to automata instances and then associate these names with protected data. Next, for certified evaluation, we must be able to accurately type evidence using dependent types. Finally, to cope with automaton state changes, we must (via affine types) prevent stale automaton instances from being reused. We consider each of these aspects of the system in turn, first in the typing judgment and then in the kinding judgment.

Assigning unique names to automata. We construct new automata using new e. (T-NEW) assigns the name $\alpha$ to the type in the conclusion, ensuring (via $\varepsilon$) that $\alpha$ is distinct from all other names that have been assigned to other automata. We require $\alpha$ to be in the initial environment $\Gamma$, or to be introduced into the context by a type abstraction. Recall from Section 3.3 that protected values will refer to this name in their types (e.g., $\mathsf{Protected}\ \mathsf{Int}\ \alpha$). The resulting type $¡t^{\alpha}$ is also affinely qualified; we discuss this shortly.
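The bookkeeping behind unique names can be pictured as follows. This Python sketch is an informal illustration under stated assumptions, not the rule (T-NEW) itself: `type_new` and `combine` are hypothetical helpers, name constraints are modeled as frozensets of strings, and the printed type is only a label.

```python
def type_new(name, names_in_scope):
    """Typing `new e`: the instance receives `name`, which must already be in scope."""
    if name not in names_in_scope:
        raise NameError(f"type name {name} is not in scope")
    # The resulting type is affinely qualified and carries the name; the
    # name constraint of the expression records that `name` has been used.
    return f"¡Instance^{name}", frozenset({name})


def combine(eps1, eps2):
    """Name constraint of a compound expression: a disjoint union of the parts."""
    clash = eps1 & eps2
    if clash:
        raise ValueError(f"name(s) assigned to more than one automaton: {sorted(clash)}")
    return eps1 | eps2
```

Typing two occurrences of `new` at the same name and then combining their constraints fails, which corresponds to the requirement that a name assigned to one automaton is distinct from the names assigned to all others.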
(T-DROP) allows the unique name $\alpha$ associated with a type $t^{\alpha}$ to be replaced with a distinguished constant name. This is sound because although the name of a type $t^{\alpha}$ can be hidden, $\alpha$ cannot be reused as the type-level name of any other automaton (i.e., $\varepsilon$ is unaffected). This form of subtyping is convenient for giving types to proof objects that witness properties of the state of an automaton, while keeping our language of kinds for type constructors relatively simple. Section 3.5.1 illustrates an example use of (T-DROP).

(T-TAB) is used to check type abstractions. The first premise checks the body of the abstraction e in a context that includes the abstracted type variable $\alpha$. Since we treat type names and types uniformly, functions polymorphic in a type name can be written by quantifying over $\alpha{::}N$; the interesting elements of this rule have to do with managing these names. If the body of the abstraction e constructs a new automaton assigned the name $\alpha$ in (T-NEW), then $\alpha$ will be recorded in $\varepsilon$, the name constraints of e. In this case, $\alpha$ is separated from the other names used in the typing derivation of e and decorates the universally quantified type in the conclusion, signifying that the abstracted name is used in e; otherwise the decoration is empty.

Type abstractions are destructed according to (T-TAP). In the premises we require the kind of the argument to match the kind of the formal type parameter. In the conclusion, we must instantiate all the abstracted names used in the body e and ensure that these are disjoint from all other names used in the body.

Two additional points are worth noting. First, universally quantified types can be decorated with arbitrary name constraints $\varepsilon$ (rather than just singleton names $\alpha$). We expect this to be useful when enforcing composite policies. The name instantiation constraint can ensure that a function always constructs automata that belong to a specific set of classes in a large policy.
Second, we could support recursion by following an approach taken by Pratikakis et al. [105]. This requires using existential quantification to abstract names in recursive data structures and including a means to forget names assigned to automata that go out of scope (e.g., in each iteration of a loop).

Dependently typed functions and evidence. (T-ABS) gives functions a dependent type, $(x{:}t) \to t'$. Here, x names the formal parameter and is bound in $t'$. When a function is applied, (T-APP) substitutes the actual argument $e'$ for x in the return type. Thus, given a function f that has type $(\mathit{debt} : \mathsf{Int}) \to (\mathsf{LEQ}\ \mathit{debt}\ 10) \to t$, the application (f 11) is given the type $(\mathsf{LEQ}\ 11\ 10) \to t$. That is, the type of the second argument of f depends on the term passed as the first argument. Note that although λAIR permits arbitrary expressions to appear in types, type checking the enforcement of an AIR policy is decidable because we never have to reduce expressions that appear in types. However, in order to enforce policies like the static information flow policy of Chapter 4, reduction of type-level expressions is, as in FABLE, critical.

Affine types for consistent state updates. Finally, we consider how the type system enforces the "use at most once" property of affine types. First, (T-NEW) introduces affine types by giving new automaton instances the type $¡t^{\alpha}$. Values of affine type can be destructed in the same way as values of unrestricted type. For example, (T-APP) and (T-TAP) allow e to be applied irrespective of the affinity qualifier on e's type. However, we must make sure that variables that can be bound to affinely typed values are not used more than once. This is prevented by the type rules through the use of affine assumptions A, which lists the subset of variables with affine type in $\Gamma$ which have not already been used. The use of an affine variable is expressed in the rule (T-XA), which types a variable x in the context of the single affine assumption x.
To prevent variables from being used more than once, other rules, such as (T-APP), are forced to split the affine assumptions between their subexpressions. Affine assumptions are added to A by (T-ABS) using the function a(x, k), where x is the argument to the function and k is the kind of its type. If the argument x's type has kind A then it is added to the assumptions, otherwise it is not. We include a weakening rule (T-WKN) that allows affine assumptions to be forgotten (and for additional names to be consumed). Finally, the function p(A, $\varepsilon$) is used to determine the affinity qualifier of an abstraction. If no affine assumptions from the environment are used in the body of the abstraction (A is empty) and if no new automata are constructed in the body ($\varepsilon$ is empty), then it is unrestricted. Otherwise, it has captured an assumption from the environment or encloses an affinely tracked automaton and should be called at most once.

Kinding judgment. In $\Gamma \vdash t :: K$, the rule (K-A) is standard. (K-N) allows a name $\alpha$ to be associated with any affine type t. (K-AFN) checks an affinely qualified type: types such as $¡¡t$ are not well-formed. (K-FUN) is standard for a dependent type system; it illustrates that x is bound in the return type $t'$. (K-UNIV) is mostly standard, except that we must also check that the constraint $\varepsilon$ only contains names that are in scope. (K-DEP) checks the application of a dependent-type constructor. Here, we have to ensure that the type of the argument e matches the type of the formal. However, since e is a type-level expression, we check it in a context with the phase index $\varphi = \mathsf{type}$. Since types are erased at run time, type-level expressions are permitted, via (T-X-type), to treat affine assumptions intuitionistically. Erasure of types also allows us to lift the name constraints for type-level expressions e: (T-NC-type) allows any subset of the names used in e to be forgotten.
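The splitting of affine assumptions can be illustrated with a small checker for an untyped lambda-calculus fragment. This Python sketch is a simplification under stated assumptions, not the λAIR type system: the term classes `Var`, `Lam`, and `App` and the function `used_affine` are hypothetical, each binder is marked affine or unrestricted by a boolean flag instead of a kind, and name constraints are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    affine: bool  # True if the parameter's type would have kind A
    body: object


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


class AffineError(Exception):
    """Raised when an affine variable is used more than once."""


def used_affine(term, affine_in_scope=frozenset()):
    """Returns the affine variables from scope that `term` consumes."""
    if isinstance(term, Var):
        # Using an affine variable consumes exactly that assumption.
        return frozenset({term.name}) if term.name in affine_in_scope else frozenset()
    if isinstance(term, Lam):
        # Add the parameter only if it is affine; an unrestricted parameter
        # shadows any affine variable of the same name.
        if term.affine:
            scope = affine_in_scope | {term.param}
        else:
            scope = affine_in_scope - {term.param}
        return used_affine(term.body, scope) - {term.param}
    if isinstance(term, App):
        # The assumptions must split: no variable may be consumed on both sides.
        left = used_affine(term.fn, affine_in_scope)
        right = used_affine(term.arg, affine_in_scope)
        if left & right:
            raise AffineError(f"used more than once: {sorted(left & right)}")
        return left | right
    raise TypeError(f"unknown term: {term!r}")
```

Leaving an affine variable unused is accepted, in the spirit of weakening. A non-empty result for an abstraction means that it has captured an affine assumption from its environment, which is the case in which the abstraction itself should be called at most once.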
3.4.3 Dynamic Semantics

Figure 3.6 defines the dynamic semantics of λAIR as a call-by-value, small-step reduction relation, using a left-to-right evaluation order. The form of the relation is $M \vdash e \xrightarrow{l} e'$.

[Figure 3.6: Dynamic semantics of λAIR. The figure gives the syntax of equations, models, and certificates; values and evaluation contexts; the reduction rules (E-CTX), (E-BOT), (E-APP), (E-TAP), (E-INF), (E-CASE), (E-DELTA), and (E-B1) through (E-B4) for the judgment $M \vdash e \xrightarrow{l} e'$ (an expression e reduces to $e'$ recording l in the trace); and the rules (U-ID), (U-VAR), (U-CON) for pattern matching data constructors.]

This judgment claims that a term e reduces in a single step to $e'$ in the presence of a model M that interprets the base terms in a signature. The security-relevant reduction steps are annotated with a trace element l, which is useful for stating our security theorem.

Following a standard approach for interpreting constants in a signature [85], we define a model M by axiomatizing the reductions of base-term function applications. The syntax of the model M is shown at the top of Figure 3.6. A model M contains equations $B : D \mapsto e$, where D is a sequence of types and values. A simple example of an equation is $\mathsf{plus} : 1, 2 \mapsto 3$, indicating that an application (plus 1 2) reduces to 3 at run time.

Base term functions in λAIR are also used to perform run-time checks that provide evidence for the release conditions in an AIR policy.
For example, a λAIR program can perform a test (leq x y) to attempt to construct evidence of type LEQ x y. The model equations for leq need to generate valid proof certificates for tests that succeed and throw run-time errors for those tests that fail. Handling failures is relatively straightforward: we simply include equations of the form $\mathsf{leq} : 4, 3 \mapsto \bot$, indicating that the expression (leq 4 3) reduces to $\bot$, i.e., a run-time error. However, we also need a way to construct proof certificate values that inhabit types like LEQ 3 4. Our solution is to introduce special values $[\![B]\!]_{D}$ to represent these certificates. For example, $[\![\mathsf{LEQ}]\!]_{3,4}$ will be the representation of a value that inhabits the type LEQ 3 4, and model equations of the form $\mathsf{leq} : 3, 4 \mapsto [\![\mathsf{LEQ}]\!]_{3,4}$ will serve to indicate that the application (leq 3 4) reduces at run time to a valid proof certificate. In practice, if we are in a purely type-safe setting, we could choose an arbitrary value (like unit) to represent a proof certificate. However, a concrete run-time representation for proof certificates can be of practical use if proofs need to be checked at run time, e.g., when interfacing with type-unsafe code.

The values and evaluation contexts in λAIR are also defined in Figure 3.6. Note that the base-term data constructors D are treated as values; e.g., constructors Succ and Zero are both treated as values. Constructor applications like Succ (Zero) are also values. The other values include certificates, abstractions, and new automaton instances.

The rules in the reduction relation $M \vdash e \xrightarrow{l} e'$ from (E-CTX) to (E-CASE) are entirely standard. The pattern matching judgment follows a similar judgment in FABLE. The remaining rules manipulate base-term functions. In (E-DELTA), we reduce a base term B to a certificate $[\![B]\!]$ that serves as a proof that a term inhabits the type given to B in the signature.
In (E-B1), we show how the application of a base term B to a sequence of types and terms D, v is reduced using an equation that appears in the model. The security-relevant actions in a program execution are the reduction steps that correspond to automaton state changes. As indicated earlier, each transition and release rule in a policy will be translated to a function-typed base term like Conf_coalition. Thus, every time we reduce an expression e using a base-term equation $B : D \mapsto e'$, we record l = B : D in the trace: i.e., $M \vdash e \xrightarrow{B:D} e'$. (E-B3) handles the application of a type to a polymorphic base-term function and is identical in structure to (E-B1). One point to note about (E-B3): although we allow an equation in M to depend on the type argument t (i.e., it is free to perform an intensional analysis on t, potentially violating parametricity), our soundness theorem places constraints on the form of the model equations to ensure type safety. Finally, the rules (E-B2) and (E-B4) handle partial applications of base terms. For example, the partial application leq 3 reduces in several steps as shown below:

$$M \vdash \mathsf{leq}\ 3 \longrightarrow [\![\mathsf{leq}]\!]\ 3 \longrightarrow [\![\mathsf{leq}]\!]_{3}$$

An implementation of λAIR would, of course, take a less abstract approach to the semantics of base terms. For example, we could use enforcement policy functions in FABLE to produce unforgeable certificates for run-time tests (using FABLE's labeling and unlabeling constructs). However, the abstract presentation here both keeps the presentation simple and allows us to prove a standard type-soundness theorem.

Theorem (Type soundness). Given an environment $\Gamma = S, \alpha_1{::}N, \ldots, \alpha_n{::}N$ that only binds type names, such that $\Gamma; \cdot \vdash_{\mathsf{term}} e : t; \varepsilon$, and an interpretation M such that M and S are type-consistent, then $\exists e'.\ M \vdash e \xrightarrow{l} e'$ or e is a value. Moreover, if $M \vdash e \xrightarrow{l} e'$ then $\Gamma; \cdot \vdash_{\mathsf{term}} e' : t; \varepsilon$.

Notice that the statement of this theorem relies on a hypothesis that the model M and the signature S are type-consistent. That is, patently type-unsafe model equations like $\mathsf{leq} : 3, 4 \mapsto 17$ are ruled out. Furthermore, in order to prove this theorem, we need a way to type certificates. That is, we need additional type rules that allow certificates like $[\![\mathsf{LEQ}]\!]_{3,4}$ to be typed as LEQ 3 4; this is easily done. Appendix B defines these additional rules and contains a detailed proof sketch of this type-soundness theorem.

We have also mechanized the proof of soundness for λAIR using the Coq proof assistant [17]. Our formalization adapts a proof technique recently proposed by Aydemir et al. [7]. In particular, we use a locally nameless approach for representing both term- and type-level bindings and rely on cofinite quantification to introduce fresh names. We rely on a set of libraries distributed by Aydemir et al. that provide basic support for working with environments and finite sets. Our Coq proof is complete, modulo a collection of identities about finite sets and context splitting. The proofs of these identities are beyond the capabilities of the decision procedures in the finite set libraries that we use and, without automation, we have found proofs of these identities in Coq to be tedious and time consuming. However, we expect it will be possible to devise specialized decision procedures to automatically discharge the proofs of these identities. Our development of λAIR in Coq can be obtained from http://www.cs.umd.edu/projects/PL/selinks.

3.5 Translating AIR to λAIR

In this section, we show how we translate an AIR class to a λAIR API, describe how that API is to be used, and state our main security theorem.

3.5.1 Representing AIR Primitives

In order to enforce an AIR policy we must first provide a way to tie the policy to the program by protecting data with AIR automata. We must also provide a concrete representation for automata instances and a means to generate certificates that attest to the various release conditions that appear in the policy.
These constructs are common to all λAIR programs and appear in the standard prelude $S_0$, along with the integers and pairs discussed in Section 3.4.1.

Protecting data. As indicated in Section 3.3, we include the following type constructor to associate an automaton with some data: $(\mathsf{Protected}{::}U \to N \to U)$. A term with type $\mathsf{Protected}\ t\ \alpha$ is governed by the policy defined by an automaton instance with type-level name $\alpha$. We would like to ensure that all operations on protected data are mediated by functions that correspond to AIR policy rules. For this reason, we do not provide an explicit data constructor for values of this type, ensuring that they cannot be destructed directly, say, via pattern matching. Values of this type are introduced only by assigning the appropriate types to functions that retrieve sensitive data. For instance, library functions that read secret files from the disk can be annotated so that they return values with a protected type. In addition to functions corresponding to AIR class rules, we can provide functions that allow a program to perform computations over protected values while respecting their security policies. We have explored such functions in Chapter 2 and showed that computations that respect a variety of policies (ranging from access control to information flow) can be encoded; we do not consider these further here.

Next, we discuss our representation of an AIR automaton; this includes representations of the class that the automaton instantiates and the principal that owns the class.

Principals. The nullary constructor Prin is used to type principal constants P, i.e., (Prin::U), (P:Prin). As with integers, we need a way to test and generate evidence for acts-for relationships between principals. We include the dependent-type constructor and run-time check shown below.

$$(\mathsf{ActsFor}{::}\mathsf{Prin} \to \mathsf{Prin} \to U),\quad (\mathsf{acts\_for}{:}(x{:}\mathsf{Prin}) \to (y{:}\mathsf{Prin}) \to \mathsf{ActsFor}\ x\ y)$$

AIR classes. A class consists of a class identifier id and a principal P that owns the class.
The type constructors (Id::U), (Class::U) are used to type identifiers and classes. Classes are constructed using the data constructor (Class:Id → Prin → Class). The translation of an AIR class introduces nullary data constructors like US_Army_Confidential:Id and US_Army:Prin, from which we can construct the class

USAC = Class (US_Army_Confidential) (US_Army)

Finally, we use a dependent-type constructor and run-time check to generate evidence that two classes are equal.

(IsClass::Class → Class → U), (is_class:(x:Class) → (y:Class) → IsClass x y)

Class instances. Instances are typed using the Instance::U type constructor. Each instance must identify the class it instantiates and the current state of its automaton. For each state in a class declaration, we generate a data constructor in the signature that constructs an Instance from a Class and any state-specific arguments. For example, we have:

Init:Class → Instance, Debt:Class → Int → Instance

Thus the expression new Init (USAC) constructs a new instance of a class. According to (T-NEW), this expression has the affine type ¡Instance^α, where the unique type-level name α allows us to protect some data with this automaton.

Since we wish to allow data to be protected by automata that instantiate arbitrary AIR classes, we give all instances, regardless of their class, a type like ¡Instance^α, for some α. This has the benefit of flexibility: we can easily give types to library functions that can return data (like file system objects) protected by automata of different classes. However, we must rely on a run-time check to examine the class of an instance since it is not evident from the type. The prelude includes the following two elements to construct and type evidence about the class of an automaton instance:

ClassOf::N → Class → U
class_of_inst:∀α::N.(x:¡Instance^α) → (¡Instance^α × c:Class × ClassOf α c)

The function class_of_inst extracts a Class value c from an instance named α and produces evidence (of type ClassOf α c) that α is an instance of c.
The return type of this function is interesting for two reasons. First, because the returned value relates the class object in the second component of the tuple to the evidence object in the third component, we give the returned value the type of a dependently typed tuple, designated by the symbol ×. Although we do not directly support these tuples, they can be easily encoded using dependently typed functions, as shown in Figure 2.2. Second, notice that even though class_of_inst does not cause a state transition, the first component of the tuple it returns contains an automaton instance with the same type as the argument x. This is a common idiom when programming with affine types; since the automaton instance is affine and can only be used once, functions like class_of_inst simply return the affine argument x to the caller for further use.

The following constructs in the prelude allow a program to inspect the current state of an automaton instance.

InState::¡Instance → Instance → U
state_of_inst:∀α::N.(x:¡Instance^α) → (z:¡Instance^α × y:Instance × InState z y)

These constructs are similar to the forms shown for examining the class of an instance, but with one important difference. Since the state of an automaton is transient (it can change as transition rules are applied), we must be careful when producing evidence about the current state. This is in contrast to the class of an automaton, which never changes despite changes to the current state. Thus, we must ensure that stale evidence about an old state of the automaton can never be presented as valid evidence about the current state. The distinction between evidence about the class of an automaton and evidence about its current state is highlighted by the first argument to the type constructor InState.
Unlike the first argument of the ClassOf constructor (which can be some type-level name α::N), the first argument of InState is an expression with an affine type ¡Instance (introduced via subsumption in (T-DROP)) that stands for an automaton instance that has been assigned some name. Using this form of subtyping allows us to use InState to type evidence about the current state of any automaton. An alternative would be to enhance the kind language by allowing type constructors to have polymorphic kinds; we chose this form of subtyping to keep the presentation simpler.

[Figure 3.7: Translating an AIR rule to a base-term function in a λAIR signature. The figure gives the rules (TR-COND), (R-BODY), and (T-BODY).]

As described further in the next subsection, functions that correspond to AIR rules take an automaton instance a1 (say, in state Init) as an argument, and produce a new instance a1′ as a result (say, in state Debt(0)). Importantly, both a1 and a1′ are given the type ¡Instance^α, i.e., the association between the type-level name α and the automaton instance is fixed and is invariant with respect to state transitions. Since the class of an automaton never changes (both a1 and a1′ are instances of USAC), it is safe to give evidence about the class of an instance the type ClassOf α USAC, i.e., evidence about the class of an automaton can never become stale. On the other hand, evidence about the current state of the automaton can become stale. If we were to type this evidence using types of the form InStateBad α Init, then this evidence may be true of a1 but it is not true of a1′. Therefore, we make InState a dependent-type constructor to be applied to an automaton instance rather than a type-level name.
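The difference between the two kinds of evidence can be mimicked with ordinary run-time objects. The Python sketch below is only an illustration under stated assumptions: the names Instance, class_evidence_valid, state_evidence_valid, and to_debt are hypothetical, and object identity stands in for the role that the affine instance value plays in the type InState. Class evidence is keyed on the type-level name and survives a transition; state evidence is keyed on one particular instance value and goes stale once a new value is produced.

```python
from dataclasses import dataclass


@dataclass(frozen=True, eq=False)
class Instance:
    """An automaton instance; `name` mimics the type-level name."""
    name: str
    cls: str
    state: tuple


@dataclass(frozen=True)
class ClassOf:
    """Evidence indexed by a name: stays true across transitions."""
    name: str
    cls: str


@dataclass(frozen=True, eq=False)
class InState:
    """Evidence indexed by one specific instance value."""
    inst: Instance
    state: tuple


def class_of_inst(x):
    # Hand the instance back to the caller, as the affine idiom requires.
    return x, x.cls, ClassOf(x.name, x.cls)


def state_of_inst(x):
    return x, x.state, InState(x, x.state)


def class_evidence_valid(ev, x):
    return ev.name == x.name and ev.cls == x.cls


def state_evidence_valid(ev, x):
    # Identity check: evidence about an older value does not transfer.
    return ev.inst is x


def to_debt(x, amount):
    """Hypothetical transition: same name and class, new state, new value."""
    return Instance(x.name, x.cls, ("Debt", amount))


a1 = Instance("src", "USAC", ("Init",))
a1, cls, cls_ev = class_of_inst(a1)
a1, st, st_ev = state_of_inst(a1)
a1_next = to_debt(a1, 0)
```

After the transition, class_evidence_valid(cls_ev, a1_next) still holds, while state_evidence_valid(st_ev, a1_next) does not. In λAIR this distinction is enforced statically by the types rather than by a run-time identity check.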
3.5.2 Translating Rules in an AIR Class

Appendix B defines a translation procedure from an AIR class to a λAIR signature. The key judgment in this translation is shown in Figure 3.7. In this section, we discuss the form of this judgment and describe its behavior by focusing on the translation of the rules in the example policy of Figure 3.2.

Each rule r in an AIR class is translated to a function-typed constant f_r in the signature. Each condition in a rule is represented as an argument to the function f_r; the translation of these conditions is the same for both release and transition rules. Where the translation of release and transition rules differs is in the construction of the final return type of the function f_r.

The translation judgment shown in Figure 3.7 uses a more compact notation for an AIR policy than the syntax of Figure 3.1. In particular, we treat both release and transition rules as a λAIR expression e prefixed by a list of binders and conditions x:t.C. The judgment src; dst; Γ ⊨ x:t.C; e : t′ states that in a context where src and dst are the type-level names of the source and destination automata, and where Γ is a standard λAIR typing environment, the rule x:t.C; e is translated to a base-term function with the type t′. The index that appears on the turnstile differentiates transition rules (index t) from release rules (index r).

The rule (TR-COND) shows how a condition is translated. Its index indicates that it applies to both release and transition rules. (TR-COND) peels off a single condition x:t.C from the list of conditions associated with a rule. The first premise checks that the type t is well-formed. The second premise translates the condition C to the type t′ that stands for the evidence of the condition. The third premise recurses through the rest of the release conditions. In the conclusion, we have the type t of the bound variable and the evidence type t′ shown as arguments to a function whose return type t″ is the type produced by the recursive call.
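The recursive shape of (TR-COND) can be sketched in a few lines of Python. This is an illustration rather than the translation of Appendix B: it assumes each condition arrives as a tuple of a bound variable, its type, a name for the evidence argument, and the evidence type, and it builds the resulting function type as a string. The function name translate_rule and the tuple layout are hypothetical.

```python
def translate_rule(conditions, body_type):
    """Build a curried function type from a list of conditions.

    Each condition is (var, var_type, ev_name, ev_type). One condition is
    peeled off per call; the rest are handled by the recursive call, whose
    result becomes the return type of the function being built.
    """
    if not conditions:
        # No conditions left: the body rule supplies the final return type.
        return body_type
    (var, var_type, ev_name, ev_type), rest = conditions[0], conditions[1:]
    return_type = translate_rule(rest, body_type)
    return f"({var}:{var_type}) → ({ev_name}:{ev_type}) → {return_type}"


example = translate_rule(
    [
        ("cd", "Class", "e2", "ClassOf dst cd"),
        ("debt", "Int", "e5", "LEQ debt 10"),
    ],
    "R",
)
```

Here example is the string "(cd:Class) → (e2:ClassOf dst cd) → (debt:Int) → (e5:LEQ debt 10) → R", where R stands for whatever return type the body rule produces.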
The rules (R-BODY) and (T-BODY) translate release and transition rules respectively. We turn to the concrete example of the policy of Figure 3.2 to illustrate the behavior of these rules in detail.

Release rules. At a high level, release rules have the following form. In response to a request to release data x, protected by instance a1, to an instance a2, the programmer must provide evidence for each of the conditions in the rule r. If such evidence can be produced, then f_r returns a new automaton state a1′, downgrades x as specified in the policy, and returns x under the protection of a2. As an example, consider the full type of the Conf_coalition rule shown below.

Conf_coalition :
1  ∀src::N, dst::N, α::U.
2  (a1:¡Instance^src) → (x:Protected α src) → (a2:¡Instance^dst) →
3  (e1:ClassOf src USAC) → (cd:Class) → (e2:ClassOf dst cd) →
4  (e3:ActsFor (principal cd) Coalition) →
5  (debt:Int) → (e4:InState a1 (Debt (USAC) (debt))) → (e5:LEQ debt 10) →
6  (¡Instance^src × ¡Instance^dst × Protected α dst)

The first two lines of this type were shown previously: x is the data to be released from the protection of automaton a1 (with type-level name src) to the automaton a2 (with type-level name dst). Since the argument a1 is affine, we require every function type to the right of a1 to also be affine, since they represent closures that capture the affine value a1. At line 3, the argument e1 is evidence that shows that the source automaton is an instance of the USAC class; cd is another class object, and e2 is evidence that the class of the destination automaton is indeed cd. At line 4, e3 stands for evidence of the first condition expression, which requires that the owning principal of the destination automaton acts for the Coalition principal. Line 5 contains evidence e4 that a1 is in some state Debt(debt), where, from e5, debt ≤ 10.
The return type, as discussed before, contains the new state of the source automaton, the destination automaton a2 threaded through from the argument, and the data value x, downgraded according to the policy and with a type showing that it is protected by the dst automaton.

Transition rules. Each transition rule r in a class declaration is also translated to a function-typed constant f_r in the signature. However, instead of downgrading and coercing the type of some datum x, a transition function only returns the new state of the source automaton and an unchanged destination automaton. That is, instead of returning a three-tuple like Conf_coalition, a transition rule like Conf_init returns a pair (¡Instance^src × ¡Instance^dst), where the first component is the new state of the source automaton and the second component is the unchanged destination automaton threaded through from the argument. The full type of Conf_init is shown below.

Conf_init :
  ∀src::N, dst::N, α::U.
  (a1:¡Instance^src) → (x:Protected α src) → (a2:¡Instance^dst) →
  (e1:ClassOf src USAC) → (cd:Class) → (e2:ClassOf dst cd) →
  (e4:InState a1 Init) → (¡Instance^src × ¡Instance^dst)

A final point about the translation of an AIR class: it is also possible to translate an AIR class D to a model that captures the runtime behavior of each rule in the class. We focus on the signature S_D alone as this suffices for type checking. However, in order to state our security theorem, we require constraining possible models M_D of S_D, so that M_D is consistent with the AIR rules. For example, equations in M_D that represent transition rules must return automaton states that correspond to the next states specified in the AIR rules. Appendix B defines the consistency of a model M_D with an AIR policy precisely.

3.5.3 Programming with the AIR API

The program in Figure 3.8, a revision of the program in Figure 3.3, illustrates how a client program interacts with the API generated for an AIR policy.
 1  let x_a1, a1:¡Instance^src = get_usac_file_and_policy () in
 2  let a2:¡Instance^dst, channel = get_request () in
 3  let a1, USAC, ca1_ev = class_of_inst [src] a1 in
 4  let a2, ca2, ca2_ev = class_of_inst [dst] a2 in
 5  let actsfor_ev = acts_for (principal ca2) Coalition in
 6  let a1, Debt{USAC}{debt}, a1_state_ev = state_of_inst [src] a1 in
 7  let debt_ev = leq debt 10 in
 8  let a1, a2, x_a2 = Conf_coalition [src][dst][Int] a1 x_a1 a2
 9                       ca1_ev ca2 ca2_ev actsfor_ev
10                       debt a1_state_ev debt_ev in
11  send [Int] [dst] channel x_a2

Figure 3.8: A λAIR program that performs a secure information release

As previously, the first two lines represent boilerplate code, where we read a file and its automaton policy and then block waiting for a release request. At line 3, we generate evidence ca1_ev that a1 is an instance of the USAC class, and at line 4 we retrieve a2's class ca2 and evidence ca2_ev that witnesses the relationship between ca2 and a2. At line 5, we check that the destination automaton is owned by a principal acting for the Coalition. At lines 6 and 7 we check that a1 is in the state Debt{USAC}{debt}, for some value of debt ≤ 10. If all the run-time checks (i.e., calls to functions like leq) succeed, then we call Conf_coalition, instantiating the type variables, passing in the automata, the data to be downgraded, and evidence for all the release conditions. We get back the new state of the src automaton a1, the unchanged a2, and x_a2, which has type Protected Int dst.

We can give the channel a type such as Channel Int dst, indicating that it can be used to send integers to the principal that owns the automaton dst. The send function can be given the type shown below:

send:∀α::U, β::N. Channel α β → Protected α β → Unit

This ensures that x_a1 cannot be sent on the channel. If the call to Conf_coalition succeeds, then the downgraded x_a2 has type Protected Int dst, which allows it to be sent.
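The effect of the type of send can be approximated dynamically. The Python sketch below is an assumption-laden illustration, not part of the λAIR API: Protected, Channel, and send are hypothetical classes and functions, and the name carried by each value stands in for the type-level name that λAIR tracks statically. A value is accepted only if its name matches the name of the channel.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Protected:
    """A value together with the name of the automaton protecting it."""
    value: object
    name: str


@dataclass
class Channel:
    """A channel for values of `elem_type` protected by automaton `name`."""
    elem_type: type
    name: str
    sent: list = field(default_factory=list)


def send(channel, item):
    # Mirrors Channel α β → Protected α β → Unit: both indices must agree.
    if not isinstance(item.value, channel.elem_type):
        raise TypeError("payload type does not match the channel")
    if item.name != channel.name:
        raise PermissionError("value is protected by a different automaton")
    channel.sent.append(item.value)


channel = Channel(int, "dst")
x_a1 = Protected(42, "src")   # still under the source automaton
x_a2 = Protected(42, "dst")   # what a successful release would return

send(channel, x_a2)           # accepted: names agree
try:
    send(channel, x_a1)       # rejected: src is not dst
    rejected = False
except PermissionError:
    rejected = True
```

In λAIR the second call is ruled out at type-checking time, so no run-time check of this kind is needed; the sketch only makes the name-matching requirement concrete.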
3.5.4 Correctness of Policy Enforcement

In this section, we present a condensed version of our main security theorem and discuss its implications. The full statement and proof can be found in Appendix B.

Theorem (Security). Given all of the following: (1) an AIR declaration D of a class with identifier C owned by principal P, and its translation to a signature S_D; (2) a model M_D consistent with S_D; (3) Γ = S_D, src::N, dst::N, s:¡Instance^src; (4) Γ; s ⊢term e : t; ε, where src ∉ ε; and (5) M_D ⊢ ((s ↦ v)e) →_l1 e1 ... →_ln en, where v = new Init (Class (C) (P)). Then the string l1, ..., ln is accepted by the automaton defined by D.

The first condition relies on our translation judgment (discussed in Section 3.5.2) that produces a signature S_D from a class declaration D. The second condition is necessary for type soundness. Conditions (3) and (4) state that e is a well-typed expression in a context with a single free automaton s:¡Instance^src and two type name constants src and dst. By requiring that src ∉ ε we ensure that e does not give the name src to any other automaton instance. This theorem asserts that when e is reduced in a context where s is bound to an instance of the C class in the Init state, then the trace l1, ..., ln of the reduction sequence is a word in the language accepted by the automaton of D.

The trace acceptance judgment has the form A; D ⊨ l1, ..., ln; A′, which informally states that an automaton defined by the class D, in initial state A, accepts the trace l1, ..., ln and transitions to the state A′. Recall that the trace elements li record base terms B that stand for security-relevant actions and sets of values that certify that the action is permissible. The trace acceptance judgment allows a transition from A to A′ only if each transition is justified by all the evidence required by the rules in the class. This condition is similar to the one used by Walker [139].
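The informal reading of trace acceptance can be illustrated with a small checker. The Python sketch below is not the judgment of Appendix B. It assumes a made-up two-rule automaton in which Conf_init moves Init to Debt(0) and Conf_coalition increments the debt (the increment is an assumption, since the policy of Figure 3.2 is not reproduced here), and it treats certificates as unforgeable strings. The names RULES and accepts are hypothetical.

```python
# Each rule maps (state name, action) to a pair:
#   required(state) -> set of certificates that must accompany the action
#   step(state)     -> the next automaton state
RULES = {
    ("Init", "Conf_init"): (
        lambda s: {"InState Init"},
        lambda s: ("Debt", 0),
    ),
    ("Debt", "Conf_coalition"): (
        lambda s: {f"InState Debt {s[1]}", f"LEQ {s[1]} 10"},
        lambda s: ("Debt", s[1] + 1),  # assumed next state
    ),
}


def accepts(rules, state, trace):
    """Return the final state if every action is justified, else None."""
    for action, evidence in trace:
        rule = rules.get((state[0], action))
        if rule is None:
            return None              # no rule applies in this state
        required, step = rule
        if not required(state) <= set(evidence):
            return None              # some required certificate is missing
        state = step(state)
    return state


good_trace = [
    ("Conf_init", {"InState Init"}),
    ("Conf_coalition", {"InState Debt 0", "LEQ 0 10"}),
]
bad_trace = [
    ("Conf_init", {"InState Init"}),
    ("Conf_coalition", {"InState Debt 0"}),  # LEQ certificate missing
]

final_state = accepts(RULES, ("Init",), good_trace)
rejected = accepts(RULES, ("Init",), bad_trace)
```

With these definitions final_state is ("Debt", 1) and rejected is None: the second trace fails because the release is not backed by all the evidence its rule requires.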
3.6 Encoding FABLE in λAIR

The power of dependent types to express a kind of customized type system is well known [144, 17, 49]. The dependent typing features of λAIR and FABLE are no exception; they can also be used to customize a type system. In fact, one view of the policy functions in FABLE (as proposed in Chapter 2) is that they provide a means of introducing axioms into the type system to which an application program can appeal in order to prove the validity of certain type-level invariants. For example, the sub function in the policy of Figure 2.9 effectively introduces a subsumption rule into the type system. By calling this function, an application program can prove that a term that inhabits the type t{l} also inhabits the type t{lub l m}.

λAIR provides a similar ability, while using a slightly different mechanism. Instead of using explicit policy functions, λAIR programs are checked in the presence of a signature S that asserts the existence of terms of a particular type; i.e., S includes elements of the form B:t for some base term B and type t. Whereas in FABLE we chose to pay attention to the particular implementation of these terms, in λAIR we take a more abstract view in that the semantics of these base terms is simply axiomatized in terms of some model. For instance, the same sub policy function in FABLE can be represented in λAIR as a base term with the appropriate type. λAIR programs can appeal to this base term (or axiom) in order to prove the same subsumption relation that can be proved using FABLE. The operational behavior of sub is not specified within the language; instead we just model sub at runtime using a model that axiomatizes the set of possible reductions that result from an application of the sub function.

By directly modeling the implementation of these policy functions, FABLE provided us with a way to reason concretely about the correctness of specific policies. In this respect, FABLE is more expressive than λAIR.
However, for situations in which we do not necessarily care about specific policy implementations, the more abstract approach in λAIR suffices.

Additionally, λAIR exceeds the expressiveness of FABLE in two important ways. First, the signature in a λAIR program is not limited to defining base terms only. This makes the type language used in a λAIR program customizable via the signature. In particular, the signature S also includes elements of the form T::K, which binds the type constructor T to a kind K. Thus, while FABLE bakes in specific type constructors like int, lab, and the labeling construct t{e}, such constructs can be introduced into λAIR by simply plugging in the appropriate bindings in the signature. The second way in which λAIR exceeds FABLE is in its use of affine types (and type-level names).

In the remainder of this section, we develop a λAIR signature, S_FABLE, that will allow us to embed FABLE in λAIR. We will argue informally that this embedding is faithful in the sense that all programs typeable in FABLE can be translated to equivalent programs in λAIR. However, this translation does not come for free. In particular, several of the typing rules in FABLE will be translated to base terms in S_FABLE. For instance, the rule that allows a term with type lab e to be subsumed to the type lab will be represented by the base term thide in S_FABLE. Whereas the FABLE type system is able to apply this subsumption nondeterministically, in λAIR, we will expect programs to include explicit invocations of the thide base term whenever subsumption is necessary.

Type constructors
  (Int::U), (Lab::U), (SLab::Lab → U), (Labeled::U → Lab → U)
Base terms
  (0:Int), (S:Int → Int), (Low:Lab), (High:Lab), (Tuple:Lab → Lab → Lab), ...
  (tshow:(e:Lab) → SLab e), (thide:(e′:Lab) → (e:SLab e′) → Lab)

Figure 3.9: S_FABLE: An embedding of FABLE in λAIR
Thus, much as our encoding of an information flow type system in FABLE places a greater burden on the application programmer compared to programming in a special-purpose system (they must insert the appropriate calls to policy functions), emulating FABLE within λAIR is also somewhat more burdensome than programming in FABLE directly.

3.6.1 S_FABLE: A λAIR Signature for FABLE

Figure 3.9 shows S_FABLE, the signature that embeds FABLE in λAIR. The standard base types in FABLE include the type int of integers. This is easily mapped to λAIR by including the corresponding nullary type constructor Int with the kind U, i.e., integer variables can be used an arbitrary number of times. The term representation of integers follows the standard Peano construction, i.e., a nullary constructor 0 to represent zero and a constructor S, which when applied as S n denotes the successor of its Int-typed argument n.

The type of labels in FABLE is represented next. The nullary constructor Lab stands for the FABLE type lab. In FABLE we represented label terms by arbitrary applications of constructors C, where C is a meta-variable ranging over all possible constructors. In λAIR our approach is to define a set of constructors for the Lab type, i.e., an approach that closely resembles the definition of variant types in a language like ML. Thus, we include nullary term constructors like Low and High to represent the labels from a standard two-point lattice. We can also represent more complex label constructors using variant types in λAIR. For instance, we include the Tuple constructor, which can be applied to two Lab-typed values to produce a label, e.g., Tuple Low High can be given the type Lab.

FABLE also includes the type lab e, a singleton type inhabited only by the value v which is a normal form of e. This type is represented in λAIR using the dependent-type constructor SLab. This constructor can be applied to any term argument that has the type Lab and produces a type of unrestricted kind.
For example, the type lab High in FABLE is represented by the type-constructor application SLab High.

We also need a way to represent the three typing rules in FABLE that handle labels. These rules are reproduced below.

  Γ ⊢c ei : lab
  ------------------------- (T-LAB)
  Γ ⊢c C(e⃗) : lab C(e⃗)

  Γ ⊢c e : lab
  ------------------------- (T-SHOW)
  Γ ⊢c e : lab e

  Γ ⊢c e : lab e′
  ------------------------- (T-HIDE)
  Γ ⊢c e : lab

The (T-LAB) rule, the introduction form for labels, is handled in λAIR via each of the term constructors for labels. That is, the types given to constructors like Low, High and Tuple directly encode the (T-LAB) rule, e.g., Tuple requires that each of its arguments is itself a Lab-typed value. However, one difference is that whereas (T-LAB) introduces a term with a singleton type lab e, the constructors in S_FABLE introduce terms with a less precise type Lab. This precision is easily recovered via an application of the base terms that correspond to the (T-SHOW) and (T-HIDE) rules.

The (T-SHOW) rule is represented by the base term tshow, a dependently typed function. This function represents a subsumption rule that allows any term e of type Lab to be used at the type SLab e, which is identical to the form of the (T-SHOW) rule. The thide dependently typed function models the (T-HIDE) type rule. Given a term e of type SLab e′ (for any e′ specified in the first argument), thide allows e to be used at the type Lab. One point to note here is that thide expects e′ as its first term argument, even though the runtime behavior of thide is independent of e′. It is exactly in this kind of situation that phantom variables (introduced in Chapter 2) are useful. An extension of λAIR with phantoms would permit us to express that thide is parametric in the type index e′.

Finally, we use the Labeled constructor to represent FABLE types of the form t{e}. This constructor is analogous to the Protected constructor introduced earlier in this chapter, except that here, sensitive resources will be protected by labels e rather than the type-level name N of some AIR automaton.
For example, the FABLE type int{High} is represented in λAIR using the type-constructor application Labeled Int High.

The remaining type rules in FABLE can be represented by the built-in type rules of λAIR. For instance, the (T-CONV) rule in FABLE that allows type-level terms to be reduced corresponds to the (T-CONV) rule that also appears in λAIR. (See the full static semantics of λAIR in Figures B.1 and B.2 for a definition of the (T-CONV) rule in λAIR.) Similarly, the rules for abstractions, applications, and pattern matching are subsumed by the corresponding rules in λAIR.

The FABLE constructs that we do not model are the unlabeling and relabeling operators. Recall that these operators can only be used within FABLE policy functions. In λAIR we will represent policy functions using base terms in the signature, for which we only give an abstract specification in terms of types; we simply axiomatize the operational behavior of these base terms in the language, which eliminates the need for labeling operators.

3.7 Concluding Remarks

This chapter has presented AIR, a simple policy language for expressing stateful information release policies. We have defined a core formalism for a programming language called λAIR, in which stateful authorization policies like AIR can be certifiably enforced. Additionally, we have shown how the type system of FABLE can be embedded within λAIR. As a result, λAIR can also be used to enforce the access control, provenance and information flow policies that were developed in Chapter 2.

A limitation of λAIR is that its programming model is still purely functional. Although we show how to model purely functional state updates in λAIR using affine types, we would also like to be able to work with a richer programming model that allows the direct manipulation of mutable state. In the next chapter, we extend λAIR with mutable references and show how the resulting calculus, FLAIR, can be used to enforce information flow policies while accounting for side effects.
4. Enforcing Policies for Stateful Programs

The preceding chapters have demonstrated that a simple, lightweight form of dependent typing can be used to specify and implement the enforcement of a range of security policies. We have illustrated the expressiveness of our approach by showing that several special-purpose security type systems can be encoded within FABLE and its relative, λAIR. We have argued that our approach has several benefits, prominent among which are flexibility and customizability. In particular, FABLE makes it possible to enforce a range of security policies in a manner best suited to the specific requirements at hand. However, our results so far apply only to purely functional programs, undermining the claim that our approach promises to be widely applicable to realistic programs.

Our focus on purely functional programs reveals a fundamental conflict. Being a function, a purely functional program is not permitted to have any side effect. But, to write programs that are useful, one needs to cause some form of side effect, e.g., some output has to be printed to the screen. To omit side effects from our model of security is to ignore the obvious: clearly we wish to control what messages a program is permitted to send over the network.

In this chapter, we seek to redress this deficiency by demonstrating how a version of the λAIR language can be used to track and secure a canonical form of side effect. In particular, we focus on equipping a core functional calculus with references to a mutable store and enforcing an information flow policy for programs in this language. Such a policy partitions the set of memory locations into public and secret locations. Intuitively, the correct enforcement of this policy must ensure that no information about secret values is leaked into public locations.
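As a rough illustration of such a policy, the Python sketch below implements a small run-time monitor. It is not the static, type-based enforcement developed in this chapter, and every name in it (Monitor, assign, branch, the pc field) is hypothetical. Each location has a fixed level, and an assignment is allowed only if neither the written value nor the guards under which the assignment executes are more secret than the target location. Tracking the guards matters because a secret can influence a public location through control flow alone.

```python
LOW, HIGH = 0, 1


class Monitor:
    """A store whose locations carry fixed secrecy levels."""

    def __init__(self, levels, store):
        self.levels = dict(levels)   # location -> LOW or HIGH
        self.store = dict(store)     # location -> current value
        self.pc = LOW                # level of the guards currently in force

    def assign(self, loc, value, value_level):
        # The information written is as secret as the value or the guards.
        if max(value_level, self.pc) > self.levels[loc]:
            raise PermissionError(f"flow into {loc} is not permitted")
        self.store[loc] = value

    def branch(self, guard, guard_level, then_fn, else_fn):
        saved = self.pc
        self.pc = max(self.pc, guard_level)   # the guard taints both arms
        try:
            (then_fn if guard else else_fn)()
        finally:
            self.pc = saved


m = Monitor({"lloc": LOW, "hloc": HIGH}, {"lloc": False, "hloc": False})
h = True                                   # a secret boolean

m.assign("hloc", h, HIGH)                  # secret into secret: allowed

try:
    m.assign("lloc", h, HIGH)              # direct leak: rejected
    direct_leak = True
except PermissionError:
    direct_leak = False

try:
    # No secret value is written, yet the branch taken reveals h.
    m.branch(h, HIGH,
             lambda: m.assign("lloc", True, LOW),
             lambda: m.assign("lloc", False, LOW))
    indirect_leak = True
except PermissionError:
    indirect_leak = False
```

Both attempts to write into lloc are rejected, so lloc keeps its initial value False while hloc holds the secret.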
Whereas the FABLE approach can easily be adapted to ensure that secret information is not leaked into a public memory location via direct assignment, this is not sufficient to establish correct enforcement of an information flow policy. Unfortunately, dependences on secret data can be revealed without requiring a direct assignment of secrets to a public location; the leak can take place via a so-called implicit or indirect flow. The classic example of such a leak is illustrated by the example program below, where h is a high-security boolean and lloc is a low-security location.

if (h) then lloc := true else lloc := false

Although the secret value h is never directly assigned to the public location lloc, clearly this program successfully copies h to lloc. Plugging this form of leak will be the main challenge faced by our enforcement mechanism. The key idea will be to use affine types in λAIR to provide evidence that the set of implicit dependences at an assignment to a location l is not more secret than the location itself.

This chapter makes the following contributions:

• Section 4.1 defines FLAIR, a straightforward extension of λAIR with references and the final formal calculus of this dissertation. We state a type-soundness theorem for FLAIR. The proof of this theorem, an extension of the corresponding proof for λAIR, is in Appendix C.

• Accounting for side effects when enforcing an information flow policy involves a number of subtle issues. In Section 4.2, we present Core-ML, a simplification of a calculus of the same name proposed by Pottier and Simonet [104], as a model for the purely static enforcement of information flow controls in an ML-like, mostly functional language. Core-ML serves as a high-level specification of correct enforcement of information flow, in the familiar language of natural deduction.

• The main result of this chapter (Section 4.4) is an encoding of the Core-ML type system in FLAIR. Our security theorem asserts that every type-correct FLAIR program using our encoding enjoys a noninterference property analogous to the corresponding property for Core-ML programs.

Although this result demonstrates that FLAIR is expressive enough to enforce policies for stateful programs, our encoding is relatively complex. The difficulty of programming at the source level in FLAIR while enforcing such a complex policy is substantial. Thus, our contention in Chapter 3 that λAIR is likely to be more useful as an intermediate language is only more pronounced with the static enforcement of information flow in FLAIR.

4.1 FLAIR: Extending λAIR with References

In this section, we define FLAIR, an extension of λAIR with references. The extension is mostly standard and is shown in Figure 4.1.

[Figure 4.1: Syntax and semantics of FLAIR (extends λAIR with references). The figure gives the syntax extensions, the typing rules (T-LOC), (T-DREF), and (T-ASN), the kinding rules (K-UNIT) and (K-REF), and the reduction rules (E-PURE-CTX), (E-PURE-BOT), (E-CTX), (E-BOT), (E-DEREF), and (E-ASN).]

The extensions to the syntax of λAIR include a Unit type and the corresponding value (); a value form ℓ which stands for a memory reference literal; a dereferencing operation !e; and an assignment operation e1 := e2. References to a value of type t are given the type ref t.
For simplicity we do not include a dynamic allocation construct. While adding dynamic allocation to FLAIR is straightforward, developing a policy to control allocation effects is somewhat tedious (although it is still possible). Rather than further complicate the information flow policy of Figure 4.5, we just omit dynamic allocation.

The typing judgment Γ; A ⊢ e : t; ε extends the judgment of the same form found in Figure 3.5. In this case, the environment Γ is extended to include bindings of memory locations to their reference types. Since we do not support dynamic allocation, we expect all memory locations to be bound to their types in the initial typing environment. Recall that the phase index on the turnstile distinguishes the typing of term-level from type-level expressions and that ε, a constraint that manages the usage of type-level names, plays no significant role in this extension to λAIR.

The typing judgments are mostly standard. Our one concern is to ensure that effectful expressions (i.e., expressions that use references to manipulate the store) do not appear in type-level expressions. This is standard in a dependent type system. (Hoare type theory [91], a dependent type system with stateful higher-order functions, is a notable exception.) To enforce the effect-free restriction on type-level expressions, (T-LOC), (T-DREF) and (T-ASN) are all applicable only when typing a term-level expression. (T-LOC) simply looks up the type of the location in the environment. (T-DREF) gives !e the type of the referent of e. (T-ASN) requires the value being written to have the same type as the contents of the location and gives the result of an assignment the Unit type. It also propagates the constraints on names ε1 and ε2 as enforced in the other rules of the system.

The extensions to the kinding judgment are next. (K-UNIT) is straightforward. In (K-REF) we impose a restriction that values in the store be given an unrestricted type.
Allowing affine values to escape into the store is common (e.g., affine types in Cyclone [124] are used primarily to track the usage of heap-directed pointers). Although standard, the machinery required to support this feature is somewhat cumbersome, e.g., we would need to ensure that there are no unrestricted references to affine values in the store. To keep the formalism simple, we exclude this feature. We would expect a practical implementation of FLAIR to allow affine values to be stored in the heap.

Figure 4.1 concludes with an extension to the dynamic semantics of AIR. In Chapter 3, we defined the operational semantics of (purely functional) AIR as a reduction relation from an expression e to an expression e', parameterized by a model M that defines the reduction of base-term applications. Here, we extend the relation to define a reduction of configurations, each consisting of a store and an expression e. The rules (E-PURE-CTX) and (E-PURE-BOT) are congruences that reduce the stateful reduction relation to the purely functional relation for sub-terms that do not require the store. (E-CTX) and (E-BOT) are the stateful versions of the first two congruences. The only new base rules in the dynamic semantics are (E-DEREF), which simply looks up the value v stored at a location when that location is dereferenced, and (E-ASN), which updates the store at a location with the new value v and reduces the expression to the unit value.

Appendix C extends the soundness result of AIR to include the store manipulation constructs of FLAIR. The statement of the theorem appears below.

Theorem (Type soundness). Given all of the following:
1. A well-formed environment consisting of a signature S, type names of kind N, and memory locations bound to their types t1, ..., tm.
2. A type-correct expression e that, in this environment, is typed at the term level with some type t.
3. An interpretation M such that M and S are type consistent.
4. A store that is well typed with respect to the environment.
Then either e is a value or, under M, the configuration consisting of the store and e reduces to a configuration consisting of a new store and an expression e'. Moreover, whenever such a reduction step is taken, e' is typed at the term level with type t in the same environment, and the new store is well typed with respect to the environment.
The type soundness theorem for FLAIR is nearly identical to the soundness theorem of AIR. The only difference is that in FLAIR, we guarantee that the store remains well typed as a term evaluates.

4.2 A Reference Specification of Information Flow

With the ability to write programs with mutable state in FLAIR, we can begin to address the problem of constructing policies that control the information flows that can occur via side effects. But first, we make precise the semantics of static information flow by presenting a simple information flow type system for Core-ML, a core subset of an ML-like language proposed by Pottier and Simonet [104].

4.2.1 Information Flow for Core-ML

Figure 4.2 defines the syntax and static semantics of Core-ML, a minimal core of an ML-like language that enforces static information flow controls. This system (the purely functional fragment of which appears in Appendix A) is a simplification of a type system proposed by Pottier and Simonet [104], and is established as correctly enforcing a standard noninterference property for mostly functional programs that can manipulate a mutable store.

The syntax of Core-ML appears at the top of Figure 4.2. Expressions include variables x and the value forms for units, booleans, memory locations, and abstractions.
Figure 4.2: Core-ML syntax and typing. The figure gives the Core-ML syntax (expressions, types, labels, and environments), the guards relation, the well-formedness rules for types, the typing rules (ML-UNIT), (ML-VAR), (ML-LOC), (ML-T), (ML-F), (ML-IF), (ML-UPD), (ML-DREF), (ML-ABS), and (ML-APP), and the subtyping relation. Types are unit, bool_l, (t →^pc t')^l, and ref_l t, with labels l and pc drawn from L and H.

Notice that we do not decorate bound variables with their types; the static semantics of Core-ML guesses these types. The non-value forms include a conditional statement, function application, assignment, and dereference. Although not strictly necessary, we include booleans and conditionals since they are convenient for illustrating indirect flows. For simplicity, as in FLAIR, we do not support dynamic allocation of memory and assume instead that all memory locations are statically known.

Core-ML types may be decorated with labels l drawn from the two-point lattice L ⊑ H. Unit values carry no sensitive information, and so these types are unlabeled. The type bool_H classifies high-confidentiality boolean values. Function types are (t →^pc t')^l. The outer-most label l represents the confidentiality of the function, e.g., if a function literal is constructed in each branch of a conditional expression, then each literal must be at least as confidential as the guard expression.
The annotation pc that appears above the function arrow represents a lower bound on the confidentiality of the memory locations that this function may update when it is applied. For example, (bool_L →^H bool_H)^H is the type of a high-confidentiality function from low- to high-confidentiality booleans. Additionally, the pc annotation H indicates that this function does not mutate any memory locations that are public, i.e., have confidentiality level L. The type given to a memory location is of the form ref_l t. (Note that although the typical notation for references in the ML family of languages is t ref, we adopt ref t for consistency with our notation in FLAIR.)

The static semantics begins by defining the security semantics of each type. The relation l ◁ t (read: l guards t) requires t to have security level l or greater. It is used to record a potential information flow, e.g., we will use this relation to ensure that the values returned by the branches of a conditional have a security level no less than the level of the guard.

Since unit values carry no information, they are guarded by all labels. Booleans and function types are standard: their security level is just defined by the outer-most security label. Reference types are somewhat more subtle. In a language with dynamic allocation, both the address and the contents of the location can carry sensitive information. For instance, if allocation occurs conditional on a high-security boolean value, the address chosen by the allocator can reveal information about the boolean. But, in our setting, since we make the simplifying assumption that all memory locations are known statically, we can adopt a simpler security model for reference types. We will interpret a reference type like ref_H bool_H as the type of a memory location only visible to a principal with privilege to view high-security values. Under this interpretation, a type like ref_L bool_H stands for a public memory location that holds a secret boolean value.
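The two-point lattice and the guards relation described above can be sketched in a few lines. This is a minimal illustration, not the dissertation's definition: the tuple encoding of types and the function names leq, lub, and guards are hypothetical, and treating reference types by their outer-most label is an assumption made here for the sketch.

```python
L, H = 0, 1                       # two-point lattice: L is below H

def leq(a, b):
    """The partial order on labels."""
    return a <= b

def lub(a, b):
    """Least upper bound of two labels."""
    return max(a, b)

# Hypothetical tuple encoding of Core-ML types:
#   ("unit",)   ("bool", l)   ("fun", l, pc, arg, res)   ("ref", l, contents)
def guards(l, t):
    """l guards t: the type t has security level l or greater."""
    if t[0] == "unit":
        return True               # unit carries no information
    return leq(l, t[1])           # outer-most label (assumed for ref as well)

UNIT = ("unit",)
bool_L, bool_H = ("bool", L), ("bool", H)
high_fun = ("fun", H, H, bool_L, bool_H)   # (bool_L ->^H bool_H)^H
```

For example, a conditional whose guard is High may return a value of type bool_H, since H guards bool_H, but not a value of type bool_L.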
We treat a type such as ref_L bool_H as ill-formed, where well-formedness of types is defined by the predicate t ok. The interesting case is the last one, for reference types, which rules out a type like ref_L bool_H, since H ◁ bool_H and H ⋢ L. (One might wonder why we even require a top-level label for a reference type, i.e., why not just have types like ref bool_H? The reason, which will become clear in the next section, is that this model illustrates a particularly close correspondence with the type of labeled references in our FLAIR encoding.)

In the typing judgment Γ; pc ⊢_ML e : t, the environment Γ binds variables and memory locations to their types, as usual. The pc element is a label representing the confidentiality of the context in which e occurs. The judgment states that an expression e has the type t in an environment Γ, and that it does not affect memory at a confidentiality level lower than pc.

The first five rules in the judgment, (ML-UNIT), (ML-VAR), (ML-LOC), (ML-T), and (ML-F), are standard. The first interesting rule is (ML-IF), which type checks a conditional expression. When the guard e has confidentiality level l, the branches e1 and e2 execute in a context that carries information about the guard e. Thus, (ML-IF) checks each of these in a context where the program counter is elevated to pc ⊔ l (where ⊔ is the least upper bound operator on the L ⊑ H lattice), i.e., the program counter is at least as secret as the guard. The other rules will ensure that the side effects of e1 and e2 are only to locations at least as secret as l. Since the value returned from the branches also depends on the guard expression, the final premise l ◁ t requires that the type of the entire expression also be at least as secure as the guard.

(ML-UPD) type checks assignments through a reference. The first two premises ensure that the type of e2 matches the type of the values that can be stored in the location e1.
The third premise of (ML-UPD), pc ⊑ l, enforces the program counter constraint; it ensures that the location being written to is at least as confidential as the context. (ML-DREF) allows any location to be dereferenced, irrespective of the confidentiality of the context; the well-formedness of reference types ensures that the dereferenced value is at least as secret as l.

Finally, we turn to the rules for functions. (ML-ABS) guesses a type t for the abstracted variable x and type checks the body of the abstraction, e, in a context that includes an assumption for x. Additionally, a label pc' is chosen for the confidentiality of the program counter. This label serves as the lower bound for the memory effects of the body when the abstraction is applied, and this bound is recorded above the arrow in the resulting type. Finally, we pick a confidentiality label l for the function itself. As an example, consider the program below, typed in a context Γ = secret:bool_H, hloc:ref_H bool_H.

    if (secret) then λx. hloc := true else λx. hloc := false

One legal type for this program is (unit →^H unit)^H. Since each branch carries information about the secret boolean, we must ensure that the values returned from the branches are at least as secure as secret. This explains the outermost label H, reflecting the confidentiality of the function itself. The pc annotation on the function arrow is H, indicating that the function writes only to high-confidentiality memory locations. This is desirable since, when the function returned by this program is applied, the value of secret is written into the memory location hloc. Thus, when secret:bool_H, this program is secure only when hloc is also a high-confidentiality location.

The final typing rule, (ML-APP), handles function application. The first two premises are straightforward: they simply require the type of the formal parameter to match the type of the actual argument.
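The premises of (ML-IF), (ML-UPD), (ML-DREF), and (ML-ABS) can be exercised on the example above with a small checker. The sketch below makes several assumptions: terms and types are encoded as hypothetical tagged tuples, the checker is given the candidate type instead of guessing it as Core-ML does, and subtyping, well-formedness, and application are left out.

```python
L, H = 0, 1                                   # two-point lattice, L below H
UNIT = ("unit",)
bool_L, bool_H = ("bool", L), ("bool", H)

def lub(a, b):
    return max(a, b)

def guards(l, t):
    # Types: ("unit",) ("bool", l) ("fun", l, pc, arg, res) ("ref", l, contents)
    return t[0] == "unit" or l <= t[1]

def infer(env, e):
    """Synthesize a type for variables, locations, and dereferences."""
    if e[0] in ("var", "loc"):
        return env[e[1]]
    if e[0] == "deref":                       # (ML-DREF): allowed at any pc
        t = infer(env, e[1])
        return t[2] if t is not None and t[0] == "ref" else None
    return None

def check(env, pc, e, t):
    """Check expression e against type t in context pc."""
    tag = e[0]
    if tag == "unit":
        return t == UNIT
    if tag in ("true", "false"):
        return t[0] == "bool"
    if tag in ("var", "loc", "deref"):
        return infer(env, e) == t
    if tag == "lam":                          # (ML-ABS): body checked at pc'
        if t[0] != "fun":
            return False
        _, _label, fpc, arg, res = t
        return check({**env, e[1]: arg}, fpc, e[2], res)
    if tag == "if":                           # (ML-IF)
        gt = infer(env, e[1])
        if gt is None or gt[0] != "bool":
            return False
        l = gt[1]
        inner = lub(pc, l)                    # elevated program counter
        return (check(env, inner, e[2], t) and check(env, inner, e[3], t)
                and guards(l, t))
    if tag == "assign":                       # (ML-UPD)
        rt = infer(env, e[1])
        if rt is None or rt[0] != "ref":
            return False
        return check(env, pc, e[2], rt[2]) and pc <= rt[1] and t == UNIT
    return False

env_h = {"secret": bool_H, "hloc": ("ref", H, bool_H)}
env_l = {"secret": bool_H, "lloc": ("ref", L, bool_L)}

def write_fun(loc):
    """if (secret) then (fun x -> loc := true) else (fun x -> loc := false)"""
    def lam(v):
        return ("lam", "x", ("assign", ("loc", loc), (v,)))
    return ("if", ("var", "secret"), lam("true"), lam("false"))

direct_leak = ("if", ("var", "secret"),
               ("assign", ("loc", "lloc"), ("true",)),
               ("assign", ("loc", "lloc"), ("false",)))
```

Under these assumptions, `check(env_h, L, write_fun("hloc"), ("fun", H, H, UNIT, UNIT))` accepts the example, while giving the function the outer label L is rejected by the guards premise of (ML-IF). The direct variant `direct_leak`, which assigns to a Low location under a High guard, is rejected by the third premise of (ML-UPD).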
Since the value returned from an application carries information about the identity of the function that was applied, the third premise of (ML-APP) requires the returned value to be at least as secure as the function itself. Finally, the last premise ensures that the memory effects that occur in the function's body do not leak information about the program context (pc ⊑ pc') or about the identity of the function itself (l ⊑ pc').

To illustrate how function applications are typed, we revisit our example program, this time typed in a context Γ = secret:bool_H, lloc:ref_L bool_L.

    if (secret) then λx. lloc := true else λx. lloc := false

Since the function bodies in the branches update low-security locations, the type of this program must be of the form (unit →^L unit)^H, i.e., the pc annotation on the function arrow is L, reflecting that this function (call it f), when applied, can affect a low-security location. However, if we try to apply this function as f (), we find that it fails to type check. The final premise of (ML-APP) requires the pc annotation on f's type to be at least as high as the confidentiality of f itself. This is a good thing: we expect to reject an application of f as insecure because, when f is applied, the value of the secret boolean is copied into the low-security location lloc.

The last section of Figure 4.2 shows a simple subtyping relation, which extends the partial order over labels L ⊑ H to types. The notation, due to Pottier and Simonet, is a compact form for defining covariance and invariance. For instance, we have that bool_L is a subtype of bool_H. For simplicity, we do not permit contravariance of function arguments and covariance of function return types. Contravariance of the program counter annotation is also customary. Handling each of these features poses no significant challenge; Appendix A shows how to handle these constructs in the context of FABLE. Finally, one might have expected covariant subtyping on the label associated with a reference type.
However, since our attacker model interprets a location of type t1 = ref_L bool_L to be readable by the attacker, it is not safe to treat this as a subtype of t2 = ref_H bool_L. Again, for simplicity, we limit ourselves to invariance on the labels of reference types.

4.3 Tracking Indirect Flows in FLAIR using Program Counter Tokens

In this section, we describe an encoding of a policy in FLAIR that attempts to prevent illegal information flows through side effects. We present a signature S^0_Flow which illustrates the basic idea behind our encoding, i.e., we use a special runtime value to represent the confidentiality of the program counter. Policy functions receive these program counter values in their arguments, and, by giving these arguments appropriate types, we are able to place constraints on the indirect flows that occur in the program. In Section 4.4, we elaborate upon this basic idea to develop a complete signature S_Flow, which we then prove to successfully enforce a noninterference property for FLAIR programs.

    Type constructors
        Bool       :: U
        LabeledRef :: U → Lab → U
        PC         :: Lab → U

    Base terms
        True   : Bool
        False  : Bool
        lub    : (x:Lab) → (y:Lab) → Lab
        update : ∀α::U.(l:Lab) → PC l → (x:LabeledRef (ref α) l) → (y:α) → Unit
        branch : ∀α::U.(l:Lab) → PC l → (m:Lab) → (b:Labeled Bool m)
                 → (t:PC (lub l m) → α) → (f:PC (lub l m) → α) → (Labeled α m)

Figure 4.3: S^0_Flow: An attempt to statically enforce information flow in FLAIR

4.3.1 S^0_Flow: A Sketch of a Solution

Figure 4.3 shows S^0_Flow, to be read as an extension of S_FABLE, which appears in Figure 3.9. The signature begins with a type constructor for booleans and the corresponding term constructors. We also include a lub function that computes the least upper bound of two labels. We then include a type constructor LabeledRef; we use this constructor to protect references by giving them types like LabeledRef (ref t) High. As usual, application programs cannot manipulate labeled references directly.
Instead, they must call base-term functions that mediate all operations on these references.

The update function controls assignments through references. To prevent illegal direct flows, we simply require that the type of the referent of x is the same as the type of the new value y to be stored in that location; this corresponds to the constraints specified in the first two premises of (ML-UPD) in Figure 4.2. The constraints of the third premise of (ML-UPD) (which protect against indirect flows) are captured by the first two arguments of update. We require the application to pass in a label l and a program counter token of type PC l. This token is a runtime value that is to be used as a proof that the program counter (at the point where update is called) is only as confidential as l. If we can ensure that such proofs are always constructed properly, we can be sure that a low-security reference is never updated in a high-security context.

The branch function mediates all conditional expressions where the guard is a labeled boolean value. The first two arguments of branch show a program counter token of type PC l, indicating that the program counter just before the conditional is executed is only as confidential as l. The next two arguments show the boolean guard b, which is labeled m; this much captures the first premise of (ML-IF). The arguments t and f represent the true and false branches, respectively. Since FLAIR is a call-by-value language, we need to suspend the execution of t and f when passing them as arguments to branch; this explains why both t and f are given function types. To ensure that the effects of the branches do not leak information about the guard, the third premise of (ML-IF) checks each branch in a context where the program counter is at least as secure as the guard. The constraint is captured in branch by the arguments of t and f, i.e., a program counter token of type PC (lub l m). The idea is that given a token of type PC (lub l m), t and f can never modify a location that is less secret than b since they cannot pass in the appropriate token as an argument to update. Finally, the value returned by branch is the value of type α returned by either branch. This value is labeled m to reflect the dependence on the guard b, i.e., the return type captures the last premise of (ML-IF).

    Γ = S^0_Flow, initpc:PC Low, h:Labeled Bool High, l:Labeled Bool Low,
        hloc:LabeledRef (ref Bool) High, lloc:(LabeledRef (ref Bool) Low)

    Core-ML (left):
        if (h) then hloc := true else hloc := false
    FLAIR (right):
        let tbranch (pc:PC High) = update pc hloc true
        let fbranch (pc:PC High) = update pc hloc false
        branch initpc h tbranch fbranch

    Core-ML (left):
        if (h) then lloc := true else lloc := false
    FLAIR (right):
        let tbranch (pc:PC High) = update initpc lloc true
        let fbranch (pc:PC High) = update initpc lloc false
        branch initpc h tbranch fbranch

    Core-ML (left):
        if (l) then lloc := true else hloc := false
    FLAIR (right):
        let tbranch (pc:PC Low) = update pc lloc true
        let fbranch (pc:PC Low) = update pc hloc false
        branch initpc l tbranch fbranch

Figure 4.4: Attempting to track effects in some simple example programs

4.3.2 Example Programs that use S^0_Flow

To illustrate how S^0_Flow works, we consider three example programs in Figure 4.4. These programs are checked in the context Γ, where initpc is an initial program counter token of type PC Low, h is a High boolean value, l is a Low boolean value, and hloc and lloc are High and Low locations, respectively. To the left, we show (in pseudo-code resembling Core-ML) programs that manipulate these variables, and at the right, we show equivalent programs in FLAIR. Throughout the remainder of this chapter, our examples will use a more readable ML-like syntax rather than the primitive notation of FLAIR.

In the top-most section of the figure we have a secure program that examines the h value and, based on the result, writes to the secret location.
The FLAIR program at the right, starting from the bottom, calls the branch function, passing in initpc as a token representing the initial program counter. Next, we pass in the boolean guard h, and two functions, tbranch and fbranch, representing each branch of the conditional. Both tbranch and fbranch expect program counter arguments of type PC High, indicating that they execute in a context control dependent on a High-confidentiality value. In the bodies of these functions, we pass the High program counter, the location to be updated (hloc), and the value to be stored to the update function. The entire program can be given the type Labeled Unit High, which reflects that the value computed by this program depends on the boolean expression h, which is labeled High.

Although S^0_Flow illustrates our basic strategy of tracking indirect flows by passing program counter tokens between the policy and application, it is flawed in two important ways. First, an application program can spoof the policy by re-using a stale program counter token, making information flow tracking unsound. Second, in some contexts, S^0_Flow prevents a program from causing side effects to high-security locations, even though such effects are always secure, i.e., in this respect, the policy enforced by S^0_Flow is more restrictive than a system like Core-ML. The next two examples in Figure 4.4 illustrate these two problems and hint at possible solutions.

The program in the middle part of Figure 4.4 is insecure because it updates lloc, a public location, in a context depending on the value of h, thereby exhibiting an indirect flow from High to Low. However, the types in S^0_Flow are not precise enough to reject a transcription of this program to FLAIR (at the right) as type-incorrect. The trouble is that in tbranch (and fbranch) the application presents initpc as a program counter token to update. The initpc token represents the confidentiality of the program counter only at the start of the program.
In the context of tbranch, the initpc token is no longer valid. We need a way to ensure that functions like tbranch only use the program counter tokens they receive as arguments, rather than re-using stale tokens. In Section 4.4 we show how affine types in FLAIR can be used to accomplish exactly this.

The final program in Figure 4.4 illustrates the second problem. At the left, we have a secure program that updates either lloc or hloc based on the value of l. However, using S^0_Flow, it is not possible to translate this to a type-correct FLAIR program. In fbranch, we have a function that expects a PC Low token as an argument, reflecting that it is control dependent only on the Low-security boolean value b. In the body of fbranch, we need to update hloc; but the call to update fails to type check because the label of the program counter does not exactly match the label of hloc. To solve this problem, in Section 4.4, we will develop an encoding that allows an application program to use a program counter token of type PC l to produce capabilities that allow it to update any location that is labeled l or higher.

4.4 Enforcing Static Information Flow in FLAIR

In this section, we develop S_Flow, a signature that incorporates affine types and capabilities into S^0_Flow to accurately enforce an information flow policy in the presence of side effects. We begin by providing a brief overview of our solution, which consists of the following main elements.

Representing a program counter with an affinely typed value. The main problem with our attempt to track indirect flows in S^0_Flow was that the program counter token could be freely duplicated by the application program. This meant that the policy could not correctly ensure that the program always presented a program counter token that represented the sensitivity of the context in which the update policy function was called. This problem can be overcome by using affine types.
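Both flaws of S^0_Flow can be reproduced in a small dynamic model. The sketch below is an illustration under stated assumptions: FLAIR enforces these constraints statically through types, whereas here the label match is checked at run time, and the classes PC and LabeledRef and the functions leaky and too_strict are hypothetical names introduced for the example.

```python
from dataclasses import dataclass

LOW, HIGH = 0, 1

@dataclass(frozen=True)
class PC:
    """Unrestricted program counter token: nothing prevents its reuse."""
    label: int

@dataclass
class LabeledRef:
    label: int
    value: object

def update(pc, ref, value):
    # Mirrors update in S^0_Flow: the token label must equal the reference label.
    if pc.label != ref.label:
        raise TypeError("token label does not match reference label")
    ref.value = value

def branch(pc, guard_label, guard, t, f):
    inner = PC(max(pc.label, guard_label))    # token of type PC (lub l m)
    return t(inner) if guard else f(inner)

def leaky(h):
    """Middle program of Figure 4.4: both branches reuse the stale initpc."""
    initpc = PC(LOW)
    lloc = LabeledRef(LOW, None)
    branch(initpc, HIGH, h,
           lambda pc: update(initpc, lloc, True),
           lambda pc: update(initpc, lloc, False))
    return lloc.value                         # public location now reveals h

def too_strict(l):
    """Final program of Figure 4.4: secure, but fbranch is rejected."""
    initpc = PC(LOW)
    lloc, hloc = LabeledRef(LOW, None), LabeledRef(HIGH, None)
    branch(initpc, LOW, l,
           lambda pc: update(pc, lloc, True),
           lambda pc: update(pc, hloc, False))  # Low token, High location
```

In this model, `leaky` passes every check and yet copies the High guard into the Low location, while `too_strict(False)` is rejected even though writing to hloc under a Low program counter is harmless.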
Our encoding gives the PC type constructor the kind Lab → A (rather than Lab → U, as was the case in S^0_Flow). So, a value of the affine type PC e will represent a proof that the program counter is no more secret than the label e. Since this value is affine, the type system will ensure that, at any given program point, there is only a single value that represents the current state of the program counter.

Generating capabilities from program counter tokens. In order to modify a location with label l, an application program must present a capability that demonstrates that the program counter at that point is not greater than l. We will use a value of the type Cap l m as this capability: a Cap l m value represents a proof that the label l is not less than m, the program counter label at that program point. In order to construct such capabilities, we provide a function pc2cap which, when given a PC m value, produces a capability of type Cap (lub l m) m, for some label l, while consuming the PC m value. Conversely, the program can call cap2pc to consume a capability Cap l m and retrieve a program counter token of type PC m.

Representing the pc bound on function types with an extra parameter. In a Core-ML function type (bool_L →^H bool_H)^L, the H pc-annotation over the arrow is a static guarantee that the function's effects are limited to the fragment of memory with security level at least H. So, our representation in S_Flow of this Core-ML function type will be a function with a formal parameter of type (PC High × Labeled Bool Low). That is, the argument of the function is a pair that includes the PC High value, from which it can generate capabilities that authorize it to update only High-confidentiality memory locations. However, since PC e is an affine type, when a value of type PC High is passed as an argument to a function, the caller can no longer use this value to generate capabilities to modify some other memory location (or even to call some other function).
The solution to the problem of the caller losing its token is a standard idiom for programming with affine types: every function that receives a PC e value as an argument also returns this value back to the caller for further use. (This style of threading affine values through a function was introduced in Chapter 3.) Thus, the FLAIR representation of our example Core-ML function type is

    Labeled ((PC High × Labeled Bool Low) → (PC High × Labeled Bool High)) Low

4.4.1 S_Flow: A Signature for Static Information Flow

Figure 4.5 defines the signature for a policy that can statically control information flows in a FLAIR program. It is intended to be read as an extension of the S_FABLE signature (which encodes some FABLE primitives) presented in Figure 3.9. This signature, in effect, defines an interface that any correct implementation of a policy must satisfy. However, many possible implementations exist, of which only some are correct. In Section 4.4.4 we discuss the form of specific policy implementations (in terms of a model M for a FLAIR program) that satisfy this signature. This is in contrast to the approach in FABLE, where we focused directly on providing a concrete semantics for an enforcement policy.

    Type abbreviation
        Boxed l α ≡ (PC l × α)

    Type constructors
        Bool       :: U
        LabeledRef :: U → Lab → U
        ×          :: A → U → A
        PC         :: Lab → A
        Cap        :: Lab → Lab → A

    Base terms
        True, False : Bool
        Low, High   : Lab
        lub         : (x:Lab) → (y:Lab) → Lab
        Pair, join, sub, initpc, pc2cap, cap2pc, update, deref, branch, apply

Figure 4.5: S_Flow: A FLAIR signature to statically enforce an information flow policy

Our encoding will make repeated use of packaging a value of type t along with a program counter value that, in the case of a function's argument, will represent a lower bound on the function's side effects. Functions will also package their result with a program counter and return the pair to the caller. Throughout the remainder of this chapter, we will use the type abbreviation Boxed e t to stand for the tuple type (PC e × t). Since our policy encoding provides a purely static guarantee, an optimizing compiler can choose to erase the program counters and choose a runtime representation for values of type Boxed e t that is the same as the representation chosen for values of type t.

The binary type constructor × is used to give a type to a tuple consisting of an affine value and an unrestricted value and will be used to package a program counter with some other value. The corresponding term constructor is Pair.
(However, we will use the more intuitive notation (e1, e2) to construct pairs, and functions like fst and snd to destruct pairs.) Notice that Pair is polymorphic in the types given to each component of the tuple that it constructs. Since the first component of the tuple is affine, we require the tuple itself to be affine to ensure that the first component is not projected from it repeatedly.

The type constructor LabeledRef is used to protect memory locations; Labeled is used to protect all other values. The kinds of both constructors are identical. By distinguishing protected locations from other protected values, we will be able to define subsumption rules that apply only to labeled values, not to labeled references. (Recall that in Core-ML, labeled reference types were invariant in their labels, whereas covariant subtyping on the labels was permissible on other labeled values.) The terms join and sub define these subsumption rules. The join function allows a type with multiple labels l and m to be coerced to a type with a single label lub l m. As previously, sub is intended to encode a subsumption relation which takes as arguments a term x whose type is labeled l and a label m, and allows x to be used at the same type labeled lub l m. This is a restatement of the covariant subsumption rule, as l ⊑ m implies l ⊔ m = m. (Of course, as in Chapter 2, to simplify the construction of source programs we could use phantom label variables in the types of functions like join and sub. We omit phantom variables here for simplicity.)

Unlike S^0_Flow, in S_Flow we type the program counter token using the dependent-type constructor PC, which constructs an affine type (a type with kind A) from a label. Rather than provide data constructors for the program counter token, the signature includes a function initpc that an application program can call to construct an initial program counter token. The type of initpc is affinely qualified.
This ensures that the application program can construct an initial program counter token only once. For example, the application can call initpc Low to create a token of type PC Low. Recall that the program counter serves as a lower bound on the memory effects of a program. Thus, the application is free to initialize the program counter as initpc High, as this only further restricts the set of permissible memory effects of the program.

The type of a capability that authorizes a program to update a memory location will be formed from the binary dependent-type constructor Cap. Specifically, Cap High Low is a capability that states that the current program counter is Low and the program is authorized to modify a location with the label High. A program can trade in a program counter value PC l for a capability Cap (lub l m) l using the pc2cap function. Notice that lub l m is never less than l, ensuring that the only capabilities that can be generated at a program point are to modify memory at least as high as the current program counter. Importantly, the Cap m l type is also affine, ensuring that a program cannot duplicate capabilities and use them when they are no longer consistent with the program counter. In order to retrieve a program counter from a capability, the program can call the function cap2pc.

    Γ = S_Flow, hloc:LabeledRef Bool High, lloc:LabeledRef Bool Low

    Core-ML (left):
        hloc := true; lloc := false
    FLAIR (right):
        let pc = initpc Low in
        let hcap = pc2cap Low High pc in
        let pc1,() = update High Low hcap hloc True in
        let lcap = pc2cap Low Low pc1 in
        update Low Low lcap lloc False
    Type of the FLAIR program:
        Boxed Low Unit

Figure 4.6: Translating a simple Core-ML program to FLAIR

The update base term refines the function of the same name in S^0_Flow to account for the affine tokens and capabilities. As before, the arguments x and y represent the reference and the value to be stored therein, respectively.
However, instead of preventing indirect flows by requiring an argument of type PC l, the first three arguments of update show a capability Cap l m. This capability proves that the label l of the reference is not less than the confidentiality m of the current program counter. Since the program counter token and capability cap are affine, update must return a token to the caller to allow it to make subsequent calls to the policy. So, the return type of update includes a value of type PC m (packaged as a Boxed m Unit). One remaining point to note about the type of update: as with our translation of AIR release rules in Chapter 3, since the argument cap is affine, we require every function type to the right of cap to also be affine, since they represent closures that capture an affine value.

Figure 4.6 contains a simple example program that illustrates how program counters, capabilities, and the update policy term interact. The top of the figure shows the environment Γ that records the types of hloc and lloc as High and Low locations, respectively. Next, we show a Core-ML program (at left) and its corresponding FLAIR program (at right). In FLAIR, we obtain the initial program counter token of type PC Low by calling the initpc function. Since this function is affine, it can never be called again in the rest of the program. In order to update hloc, we must construct a capability of type Cap High m (for some m). So, we call pc2cap to produce a value hcap of type Cap High Low from the initial program counter. We then pass hcap to the update function, along with the reference hloc and the value to be stored. The update function updates the location hloc, consumes the capability hcap, and returns a program counter value pc1 of type PC Low back to the caller, along with the unit value that results from the assignment. Finally, in order to update lloc, we must present a capability of type Cap Low Low to update.
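The interplay of program counter tokens, capabilities, and update in Figure 4.6 can be mimicked with values that may be consumed at most once. The sketch below is an approximation under stated assumptions: FLAIR rules out token reuse statically through affine types, whereas this model detects reuse at run time, and the names Affine, require, make_initpc, figure_4_6, and reuse_stale_token are hypothetical.

```python
LOW, HIGH = 0, 1

def require(cond, msg):
    if not cond:
        raise TypeError(msg)

class Affine:
    """A value that may be used at most once (checked at run time here)."""
    def __init__(self):
        self._used = False

    def consume(self):
        if self._used:
            raise RuntimeError(f"{type(self).__name__} used more than once")
        self._used = True

class PC(Affine):
    def __init__(self, label):
        super().__init__()
        self.label = label

class Cap(Affine):
    def __init__(self, target, pc_label):
        super().__init__()
        self.target, self.pc_label = target, pc_label

class LabeledRef:
    def __init__(self, label, value=None):
        self.label, self.value = label, value

def make_initpc():
    """Return an initpc function that can be called only once."""
    called = []
    def initpc(label):
        require(not called, "initpc may be called only once")
        called.append(True)
        return PC(label)
    return initpc

def pc2cap(l, m, pc):
    """Trade a PC l token for a capability Cap (lub l m) l."""
    require(pc.label == l, "token label mismatch")
    pc.consume()
    return Cap(max(l, m), l)

def cap2pc(l, m, cap):
    """Trade a capability Cap l m back for a PC m token."""
    require((cap.target, cap.pc_label) == (l, m), "capability mismatch")
    cap.consume()
    return PC(m)

def update(l, m, cap, ref, value):
    """Consume Cap l m, write the reference, and return a boxed unit."""
    require((cap.target, cap.pc_label) == (l, m), "capability mismatch")
    require(ref.label == l, "reference label mismatch")
    cap.consume()
    ref.value = value
    return PC(m), ()

def figure_4_6():
    initpc = make_initpc()
    hloc, lloc = LabeledRef(HIGH), LabeledRef(LOW)
    pc = initpc(LOW)
    hcap = pc2cap(LOW, HIGH, pc)
    pc1, _ = update(HIGH, LOW, hcap, hloc, True)
    lcap = pc2cap(LOW, LOW, pc1)
    pc2, _ = update(LOW, LOW, lcap, lloc, False)
    return hloc.value, lloc.value, pc2.label

def reuse_stale_token():
    pc = make_initpc()(LOW)
    pc2cap(LOW, HIGH, pc)
    pc2cap(LOW, LOW, pc)          # second use of the same token is rejected
```

Threading the token returned by each update into the next pc2cap call is what lets the program perform a sequence of assignments with a single live token at every point.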
We construct such a value by applying pc2cap to pc1 and then passing the result to update as before. The type of the entire program is shown in the box at the bottom, i.e., a pair consisting of a PC Low token and a Unit value.

Returning to the signature SFlow of Figure 4.5, we have the term deref, which mediates access to the dereferencing operation. A location can be dereferenced at any point in the program, irrespective of the program counter. However, we must be careful to ensure that we do not allow the program to read out of a secret (High) location and write the contents to a public (Low) location. The deref function ensures this by labeling the value read out of the location with the same label as the location itself. The result is that the value is at least as secret as the location in which it was stored.

We turn now to the branch function, which corresponds to the (ML-IF) rule in the Core-ML semantics. As in SFlow_0, the arguments are a program counter token at level l, the boolean guard labeled m, and the thunkified branches t and f. The branches receive as arguments program counter tokens at level lub l m. Since the tokens are now affine, the types guarantee that the branches only use this token of type PC (lub l m) in their bodies (and not some other stale token in scope, like initpc). As with all other functions, the branches thread the tokens back to their callers along with the value of type $\alpha$ that they compute. Finally, as before, the return type of branch labels the result of type $\alpha$ with the label m of the guard. In addition, branch includes the program counter token in the boxed type that it returns.

The apply function corresponds to the type rule (ML-APP) in Core-ML and is similar in structure to branch. It allows a labeled function f to be applied to an argument x.
The type of f ensures that its body executes in a context where the program counter is labeled with the confidentiality of f itself (which corresponds to the final premise of (ML-APP), $pc \sqcup l \sqsubseteq pc'$). Additionally, the return type of apply ensures that the value returned from the function is also as secret as the function itself (corresponding to the third premise of (ML-APP)).

4.4.2 Simple Examples using SFlow

In this section, we revisit the example programs of Figure 4.4 and show how they can be checked in FLAIR using SFlow. The top-most part of Figure 4.7 shows a secure Core-ML program that updates a High location based on a High guard. In the FLAIR program at the right, as before, we have a call to the branch function, passing in the guard h and the branches.

    Γ = SFlow, h:Labeled Bool High, l:Labeled Bool Low,
        hloc:LabeledRef (ref Bool) High, lloc:(LabeledRef (ref Bool) Low)

    (1) Core-ML:
        if (h) then hloc := true else hloc := false

        FLAIR:
        let tbranch (x:Boxed High Unit) =
          let hcap = pc2cap High High (fst x) in
          update High High hcap hloc True
        let fbranch (x:Boxed High Unit) =
          let hcap = pc2cap High High (fst x) in
          update High High hcap hloc False
        branch Low (initpc Low) High h tbranch fbranch

    (2) Core-ML:
        if (h) then lloc := true else lloc := false

        FLAIR:
        let tbranch (x:Boxed High Unit) =
          let cap = pc2cap High Low (fst x) in
          update Low High cap lloc True            #require cap:Cap Low Low
        let fbranch (x:Boxed Low Unit) =
          let lcap = pc2cap Low Low (fst x) in
          update Low Low lcap lloc False
        branch Low (initpc Low) High secret tbranch fbranch
                                                   #1st arg of fbranch must be Boxed High Unit

    (3) Core-ML:
        if (l) then lloc := true else hloc := false

        FLAIR:
        let tbranch (x:Boxed Low Unit) =
          let cap = pc2cap Low Low (fst x) in
          update Low Low cap lloc True
        let fbranch (x:Boxed Low Unit) =
          let hcap = pc2cap Low High (fst x) in
          update High Low hcap hloc False
        branch Low (initpc Low) Low l tbranch fbranch

    Figure 4.7: Tracking effects using SFlow
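The token discipline described above is enforced statically by FLAIR's affine and dependent types. As a rough illustration only, the sketch below emulates the same discipline with runtime checks in Python: tokens and capabilities become objects that raise an error if consumed twice, and update checks that the capability names the label of the location. The names FlowPolicy, AffineError, Ref and the dynamic checks are assumptions introduced for this sketch; they are not part of FLAIR or SFlow, and the label arguments that FLAIR passes explicitly are carried inside the tokens here.

```python
from enum import IntEnum


class Label(IntEnum):
    LOW = 0
    HIGH = 1


def lub(l, m):
    """Least upper bound in the two-point lattice LOW <= HIGH."""
    return max(l, m)


class AffineError(Exception):
    """A token or capability was used more than once."""


class _Affine:
    """Base class for values that may be consumed at most once."""

    def __init__(self):
        self._spent = False

    def consume(self):
        if self._spent:
            raise AffineError(f"{type(self).__name__} already consumed")
        self._spent = True


class PC(_Affine):
    """Program counter token at confidentiality `level`."""

    def __init__(self, level):
        super().__init__()
        self.level = level


class Cap(_Affine):
    """Authority to write a location labeled `loc_label` while the pc is `pc_level`."""

    def __init__(self, loc_label, pc_level):
        super().__init__()
        self.loc_label = loc_label
        self.pc_level = pc_level


class Ref:
    """A mutable location carrying a confidentiality label."""

    def __init__(self, label, value):
        self.label = label
        self.value = value


class FlowPolicy:
    """Hypothetical runtime stand-in for the initpc, pc2cap and update base terms."""

    def __init__(self):
        self._pc_issued = False

    def initpc(self, level):
        if self._pc_issued:
            raise AffineError("initpc may be called only once")
        self._pc_issued = True
        return PC(level)

    def pc2cap(self, m, pc):
        # Trade PC l for Cap (lub l m) l; the pc token is consumed.
        pc.consume()
        return Cap(lub(pc.level, m), pc.level)

    def update(self, cap, ref, value):
        # The capability must name exactly the label of the reference.
        if cap.loc_label != ref.label:
            raise PermissionError("capability does not match the location's label")
        cap.consume()
        ref.value = value
        return PC(cap.pc_level)  # thread a fresh pc token back to the caller


# Replaying the FLAIR program of Figure 4.6.
policy = FlowPolicy()
hloc, lloc = Ref(Label.HIGH, False), Ref(Label.LOW, True)
pc = policy.initpc(Label.LOW)
hcap = policy.pc2cap(Label.HIGH, pc)
pc1 = policy.update(hcap, hloc, True)
lcap = policy.pc2cap(Label.LOW, pc1)
pc2 = policy.update(lcap, lloc, False)
```

Starting from a High program counter, pc2cap with Low yields a capability for High locations only, so an attempt to update lloc is refused, which mirrors the first typing error in the middle program of Figure 4.7. The important difference is that FLAIR rejects such programs at compile time, whereas this sketch only detects the violation when the offending call executes.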
However, this time we call the initpc function to construct an initial program counter; since initpc is affine in SFlow, it cannot be called elsewhere in the program. In tbranch (and in fbranch) we receive a token of type PC High as an argument (reflecting the dependence on the guard h). Before updating hloc, we construct a capability hcap of type Cap High High by calling pc2cap. We then pass this capability to update along with hloc and the value to be stored. Both branches return values of type Boxed High Unit, and the branch itself returns Boxed Low Unit.

In the middle part of Figure 4.7 we have a Core-ML program that is insecure (and untypable) because it has an indirect flow from High to Low. We might try to write a similar program in FLAIR; the right of the figure shows one such attempt with two typing errors. For instance, if we were to try to update the location lloc in the body of tbranch, we must pass in evidence that the program counter is not more secret than the contents of lloc. We try to construct such a capability by calling pc2cap High Low, but we get back a value of type Cap (lub High Low) High, which is equivalent to Cap High High. Thus the type checker rejects the call to update as incorrect, since in order to update lloc, update requires a Cap Low Low capability. In fbranch, the program counter argument is given a type that allows the body of the function to type check. However, fbranch cannot be passed to branch as an argument, because the type of branch dictates that the first argument of both branches include program counters that are at least as secret as the guard secret; in this case, at least PC High.

The final program in Figure 4.7 shows how capabilities can be used to modify any location at least as secret as the current program counter. At the left, in the else-branch, we update hloc in a context that is dependent on l. At the right, fbranch receives a token of type PC Low as an argument.
To update hloc, we can construct a capability of type Cap High Low and pass this to update. In contrast, when using SFlow_0 in Figure 4.4, we could only update Low locations in contexts that were dependent on Low-security values.

4.4.3 Examples with Higher-order Programs

We now turn to some examples of higher-order functions and show how they can be checked in FLAIR using SFlow. The top-most part of Figure 4.8 shows a Core-ML program to the left, where, instead of modifying a location in each branch depending on the value of a boolean, we construct closures that modify the locations when they are applied. In FLAIR, the top level is as in Figure 4.7: we call branch, passing in the true and false branches, tbranch and fbranch respectively. Each branch returns a pair where the first component is just the program counter token received in the argument and the second component is the closure. In each case, the closure itself takes an argument that includes a program counter token of type PC High, indicating statically that this function's effects are only to the High fragment of memory. In the body of the closure, we project out the program counter token from the argument y, generate a capability hcap, and call the update function. We call the whole program on the right p2 and can give it the type:

    Boxed Low (Labeled (Boxed High Unit → Boxed High Unit) Low)

That is, a pair consisting of a Low program counter and a Low-security function from Unit to Unit which is guaranteed to only have an effect (if at all) on the High fragment of memory.

The next part of Figure 4.8 shows how the function p2 can be applied. Since this is a boxed value, we first project out each component: the program counter token pc and the labeled function f. We then call the apply function, passing in the program counter, the function f and the argument (). However, our construction of p2 requires that the function be called with a program counter that has the type PC High.
But, at the call site, the program counter pc has type PC Low and the function f is labeled Low. In this context, the type of apply requires f to be a function whose argument and result are both boxed at level lub Low Low, whereas our function f has the underlying type Boxed High Unit → Boxed High Unit, and so cannot be passed as is to apply.

    Γ = SFlow, l:Labeled Bool Low, hloc:LabeledRef Bool High

    (1) Core-ML:
        let p2 = if (l) then λx. hloc := true else λx. hloc := false

        FLAIR:
        let tbranch (x:Boxed Low Unit) =
          (fst x, λy:Boxed High Unit.
                    let hcap = pc2cap High High (fst y) in
                    update High High hcap hloc True)
        let fbranch (x:Boxed Low Unit) = ...
        branch Low (initpc Low) Low l tbranch fbranch

    Γ = ..., p2:Boxed Low (Labeled (Boxed High Unit → Boxed High Unit) Low)

    (2) Core-ML:
        let p3 = p2 ()

        FLAIR:
        let pc, f = p2 in
        apply Low pc High (sub Low High f) ()

    Γ = ..., p3:Boxed Low (Labeled Unit High)

    (3) Core-ML:
        let p2 = if (l) then λx. hloc := true; true
                 else λx. hloc := false; false
        in p2 ()

        FLAIR:
        let tbranch (x:Boxed Low Unit) =
          (fst x, λy:Boxed Low Unit.
                    let hcap = pc2cap Low High (fst y) in
                    let pc, () = update High High hcap hloc True in
                    (pc, true))
        let fbranch (x:Boxed Low Unit) = ..
        let pc = initpc Low in
        let pc1, f = branch Low pc Low l tbranch fbranch in
        apply Low Low pc f ()

    Γ = ..., p4:Boxed Low (Labeled Bool Low)

    Figure 4.8: Higher-order programs that contain secure indirect flows

One way to allow this application to proceed is to use subtyping to coerce the outermost label of f from Low to High, which is what the call to sub Low High f achieves. Now, the type that apply requires for the underlying function matches the type we have for f: the arguments of both are boxed at level lub Low High. This approach illustrates a way in which subsumption can be used. But, this approach has the unfortunate consequence that the returned value is also labeled High confidentiality (since it must be as confidential as the function itself). In this case, since the returned value is just (), the fact that it is High confidentiality is insignificant.
However, if the value is, say, a boolean, spuriously treating the result as High security is undesirable. The final part of Figure 4.8 shows an alternative translation that fixes this problem. Here the closure in tbranch is a function that takes a PC Low token as an argument, which statically only guarantees that its memory effects are to the Low (or higher) fragment of memory. In the body of the function, we use pc2cap to generate a capability to modify the High security location hloc and then package the boolean to be returned along with the program counter of type PC Low. This time, the function p2 has the type

    Boxed Low (Labeled (Boxed Low Unit → Boxed Low Bool) Low)

In order to call this function, we just have to unbox it and pass the components to the apply function. The resulting value has the type Boxed Low (Labeled Bool Low), as desired.

We turn next to the programs in Figure 4.9, which display insecure indirect flows. In the first section of the figure, we have a program fragment that type checks in both Core-ML and in FLAIR. This is a program that, based on a secret value h, constructs a closure that, only when applied, leaks the value of h into the public location lloc. This program p2 has the type shown in the box, reproduced below:

    p2:Boxed Low (Labeled (Boxed Low Unit → Boxed Low Unit) High)

    Γ = h:Labeled Bool High, lloc:LabeledRef Bool Low

    (1) Core-ML:
        let p2 = if (h) then λx. lloc := true else λx. lloc := false

        FLAIR:
        let tbranch (x:Boxed High Unit) =
          (fst x, λy:Boxed Low Unit.
                    let lcap = pc2cap Low Low (fst y) in
                    update Low Low lcap lloc True)
        let fbranch (x:Boxed High Unit) =
          (fst x, λy:Boxed Low Unit.
                    let lcap = pc2cap Low Low (fst y) in
                    update Low Low lcap lloc False)
        branch Low (initpc Low) High h tbranch fbranch

    Γ = ..., p2:Boxed Low (Labeled (Boxed Low Unit → Boxed Low Unit) High)

    (2) Core-ML:
        p2 ()

        FLAIR:
        let pc, f = p2 in
        apply Low pc High f ()        #f's argument must be (Boxed High Unit)

    Figure 4.9: Higher-order programs with insecure indirect flows

This is the type of a boxed function, where importantly, the function itself is labeled High. This reflects the fact that the value of p2 depends on a High security value h. If we try to apply this function (in the final section of the figure), we find that the application fails to type check. The reason is that apply requires the first argument of the function f to include a program counter token that proves that f's effects are only to locations that are at least as secure as f itself. In this case, f is High security, but the program counter argument of f is PC Low.

4.4.4 Security Theorem

The main security result of this chapter is a proof that FLAIR programs that are type correct with respect to SFlow, the signature of Figure 4.5, enjoy a noninterference property. However, before we can proceed, we must define a model for the base terms in SFlow. This model axiomatizes the reductions of base-term applications by associating a set of equations with each base term. (In FLAIR, each equation is optionally parameterized by a store $\sigma$.) For instance, the desired semantics of the lub function is defined by the following set of equations $E_{lub}$, where the equation $v_1, v_2 \mapsto e_3$ axiomatizes the reduction of the application lub $v_1$ $v_2$ to the expression $e_3$.

$$E_{lub}: \qquad \begin{array}{ll} \mathsf{Low}, \mathsf{Low} \mapsto \mathsf{Low} & \mathsf{High}, \mathsf{Low} \mapsto \mathsf{High} \\ \mathsf{Low}, \mathsf{High} \mapsto \mathsf{High} & \mathsf{High}, \mathsf{High} \mapsto \mathsf{High} \end{array}$$

Our security theorem will be parameterized by a model $M$ for FLAIR programs, where $M(\mathsf{lub}) = E_{lub}$, i.e., all applications of lub are defined by the above set of equations. Our proof in Appendix C provides a complete model for SFlow.
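The idea of a model that assigns a set of equations to a base term can be made concrete with a small sketch. The Python below represents the equations for lub as a finite table and the model as a dictionary from base-term names to such tables; the names MODEL, reduce_base_term and is_least_upper_bound are hypothetical helpers for this illustration, not definitions from the dissertation, and the check simply confirms that the table computes joins in the lattice Low ⊑ High.

```python
LOW, HIGH = "Low", "High"

# E_lub as a finite table: each entry (v1, v2) -> e3 says that
# the application `lub v1 v2` reduces to e3.
E_LUB = {
    (LOW, LOW): LOW,
    (LOW, HIGH): HIGH,
    (HIGH, LOW): HIGH,
    (HIGH, HIGH): HIGH,
}

# The model maps base-term names to their equations (only lub is shown).
MODEL = {"lub": E_LUB}


def reduce_base_term(model, name, *args):
    """Reduce the application `name args` by looking up the matching equation."""
    return model[name][args]


def leq(a, b):
    """The ordering Low <= High on labels."""
    return a == b or (a == LOW and b == HIGH)


def is_least_upper_bound(table):
    """Check that each result is an upper bound of its inputs, and the least one."""
    labels = (LOW, HIGH)
    for (a, b), c in table.items():
        if not (leq(a, c) and leq(b, c)):
            return False
        if any(leq(a, u) and leq(b, u) and not leq(c, u) for u in labels):
            return False
    return True
```

For example, `reduce_base_term(MODEL, "lub", LOW, HIGH)` looks up the second equation and yields High.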
However, for all the base terms other than lub, the types in FLAIR are precise enough that any set of equations that is consistent with the types in the signature is sufficient for noninterference.

To illustrate the sufficiency of type-consistency, we point out first that several of the function-typed base terms in SFlow are just type coercions: operationally, each of these coercions is the identity function (on one of its arguments). For example, the join function has the following type:

    ∀α::U. (l:Lab) → (m:Lab) → (x:Labeled (Labeled α l) m) → Labeled α (lub l m)

The only possible implementation of join that respects this type is the identity function on the argument x. The same is true of the other coercions like sub. The insight that these coercions must be identity functions on one of their arguments is a particular instance of a parametricity theorem [136].

A reading of the types in SFlow with parametricity in mind indicates that a similar theorem applies to most of the base terms. In the case of deref, any set of type-consistent equations must simply read a value out of the location received as an argument and return the result. The apply base term also has only one possible type-correct implementation: it must apply f to x and return the result. In the case of branch and update, the types in the signature admit more than one possible definition. For branch, while parametricity guarantees that an implementation must apply one of the branches, our types are not precise enough to guarantee that the branch executed correctly reflects the value of the guard b. For update, since the returned type is Unit, one possible type-correct implementation is to simply return (). However, any implementation that mutates the store must do so as intended. That is, it must assign the value y to the location x (and not to any other).

Clearly, the choice we make for defining the operational behavior of branch and update has a profound impact on the semantics of the FLAIR program that uses these terms.
However, purely from a security perspective, the specific implementation that is chosen does not matter. For instance, an ill-chosen (but type-correct) definition of branch may cause the else-branch to be executed instead of the then-branch; but, the types guarantee that even in this case, no high-confidentiality information is leaked to low-confidentiality outputs.

While we do not formalize these parametricity arguments, an interesting direction of future work would be to investigate the validation of a policy specification (in the form of a signature) with respect to the theorems that can be deduced from the types in the specification. Rather than prove a security result (as we did in FABLE) by considering specific implementations (in the source language) of a policy, reasoning with parametricity at the meta-level may lead to simpler and more abstract proofs.

Definition 4 (Low-equivalence of stores). Two stores $\sigma_1$ and $\sigma_2$ are low-equivalent with respect to an environment $\Gamma$ if and only if $\mathrm{dom}(\sigma_1) = \mathrm{dom}(\sigma_2)$ and

$$\forall \ell.\ \sigma_1(\ell) = \sigma_2(\ell) \;\lor\; \Gamma(\ell) = \mathsf{LabeledRef}\ t\ \mathsf{High}$$

Theorem (Noninterference for FLAIR, with SFlow). Suppose, for well-formed $\Gamma$, the signature SFlow, and a model $M$ type-consistent with SFlow such that $M(\mathsf{lub}) = E_{lub}$, we have $\Gamma; \mathsf{initpc} \vdash_{\mathit{term}} e : t; \cdot$. Then, for any two low-equivalent stores $\sigma$ and $\sigma'$ such that $\Gamma \models \sigma$ and $\Gamma \models \sigma'$, if we have

$$(\sigma, e) \xrightarrow{M} (\sigma_1, e_1) \xrightarrow{M} \cdots \xrightarrow{M} (\sigma_n, e_n) \qquad (\sigma', e) \xrightarrow{M} (\sigma'_1, e'_1) \xrightarrow{M} \cdots \xrightarrow{M} (\sigma'_m, e'_m)$$

then the sequences $\sigma, \sigma_1, \ldots, \sigma_n$ and $\sigma', \sigma'_1, \ldots, \sigma'_m$ are low-equivalent up to stuttering.

This timing- and termination-insensitive noninterference property is similar to an analogous property for Core-ML programs. Our proof is based on a technique due to Pottier and Simonet that allows two program executions to be embedded in the syntax of an extended calculus.
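To give some intuition for the statement above, the following Python sketch checks low-equivalence of stores and compares two store traces up to stuttering. It adopts one common reading of "up to stuttering" (project each store onto its Low locations, collapse consecutive repeats, then compare), which is an assumption of this sketch; the dissertation's formal definition may differ in detail. Stores are modeled as dictionaries from location names to values, and the environment as a dictionary from location names to labels.

```python
from itertools import groupby


def low_view(store, env):
    """Project a store onto its Low locations (env maps location -> label)."""
    return tuple(sorted((loc, v) for loc, v in store.items() if env[loc] == "Low"))


def low_equivalent(s1, s2, env):
    """Same domain, and agreement on every location that is not High."""
    return s1.keys() == s2.keys() and low_view(s1, env) == low_view(s2, env)


def destutter(trace, env):
    """Low views of a trace with consecutive duplicates collapsed."""
    return [view for view, _ in groupby(low_view(s, env) for s in trace)]


def low_equivalent_up_to_stuttering(t1, t2, env):
    return destutter(t1, env) == destutter(t2, env)


env = {"hloc": "High", "lloc": "Low"}

# Two runs of `hloc := true; lloc := true` from stores that differ only in hloc.
run1 = [
    {"hloc": False, "lloc": False},
    {"hloc": True, "lloc": False},   # the write to hloc is a Low-invisible step
    {"hloc": True, "lloc": True},
]
run2 = [
    {"hloc": True, "lloc": False},
    {"hloc": True, "lloc": True},
]

# A run of the leaky program `lloc := h` when the secret is False.
leaky = [
    {"hloc": False, "lloc": False},
    {"hloc": False, "lloc": False},
]
```

Here run1 and run2 take a different number of steps but look identical to a Low observer, whereas the leaky run never changes lloc and is therefore distinguishable from run1.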
Since SFlow embeds Core-ML in FLAIR, the term structure of a FLAIR program essentially mirrors the structure of a Core-ML typing derivation, e.g., each application of the sub function in FLAIR corresponds to an application of a subtyping judgment in the Core-ML derivation. However, FLAIR programs can make use of (first-class) polymorphism, but the Core-ML subset that we have defined is strictly monomorphic. Rather than extend Core-ML with polymorphism, we simply assume that all the polymorphism in FLAIR is removed via code replication. The blow-up in code size is quadratic: for n FLAIR functions at m call sites, we can produce $n \times m$ Core-ML function definitions.

4.5 Concluding Remarks

This chapter concludes a development in which we have shown how a general-purpose type system, as embodied by FLAIR, has an expressive power with regard to security policy enforcement that, to our knowledge, is matched by no other single programming formalism. In this chapter, we have demonstrated how an information flow policy can be enforced with purely static controls for programs that manipulate mutable references to memory. Although we have focused on memory effects, our encoding of information flow in SFlow can easily be generalized to account for other kinds of side effects, e.g., sending messages over the network, or printing output to a terminal.

We have focused so far on the expressive power of our type-based approach. We have repeatedly dismissed concerns of usability by positioning FLAIR as the kernel of an intermediate representation rather than a source-level language for use by a human programmer, particularly for the more complex policies that we have explored. In subsequent chapters we make the claim that, for many simple policies of interest, the basic idea of a customizable security label model that is interpreted by a user-defined enforcement policy is in fact practical for real-world programs.
The main evidence for this claim is SELINKS, a new programming language for building secure web applications.

5. Enhancing LINKS with Security Typing

Multi-tier web applications are becoming the de facto standard for programs that need to share sensitive information across a wide community of users. To recapitulate the discussion from Chapter 1, we would like to verify that such applications correctly enforce fine-grained security policies. For a program like Intellipedia, for instance, which makes classified documents available to the U.S. intelligence community using a Wikipedia-like interface, we would like to protect fragments of documents with access control and provenance tracking policies. Online stores, web portals, e-voting systems, and online medical record databases have similar needs. This chapter and the next set out to show that by applying FABLE to the design of a new programming language, user-defined security policies can be reliably and efficiently enforced in multi-tier web applications.

There are two main approaches to enforcing fine-grained policies in a multi-tier web application. A database-centric approach relies on native security support provided by the DBMS. For example, Oracle 10g [97] supports a simple form of row-level security in which security labels can be stored with individual rows, and the security semantics of these labels is enforced by the DBMS during database accesses. A similar approach is possible with views backed by user-defined functions [97]. A customized row-level security label is hidden by the view, and the label's semantics is transparently enforced by the DBMS via invocations of user-defined functions as part of query processing. Alternatively, a server-centric approach is to enforce application-specific policies in the server.
For our example, the programmer could define a custom format for access-control labels, store these with rows as above, and then perform access control checks explicitly in the server prior to security-sensitive operations. This is the basic approach taken by J2EE [56] and other application frameworks.

Neither approach is ideal. The database-centric approach is attractive because highly optimized policy enforcement code is written once for the database for all applications, rather than once per application, improving efficiency and trustworthiness. On the other hand, DBMS support tends to be coarse-grained and/or too specialized. For example, most DBMSs provide only simple access control policies at the table level, and Oracle's relatively sophisticated per-row labels only apply to totally ordered multi-level security policies [42]. Even customized support based on views, or further native security extensions, will only go so far: some policies simply cannot be enforced entirely within the database. For example, an end-to-end information flow policy [111] requires tracking data flows through the server to ensure, for instance, that the server does not write confidential data to a publicly viewable web server log.

The server-centric approach has the opposite characteristics: it can enforce highly expressive application-specific policies, but is potentially far less efficient and less trustworthy. In the worst case the server must load entire database tables into server memory to access and interpret the custom security labels associated with each row. And because the application performs security checks explicitly, programming errors can create security vulnerabilities.

As a remedy to this state of affairs, this chapter proposes an extension to the LINKS web-programming language [35] that can be used to build secure, multi-tier applications by combining the best features of the server-centric and database-centric enforcement strategies.
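The trade-off between the two enforcement strategies discussed above can be illustrated with a small sketch using Python's standard sqlite3 module. The table docs, the two-level labels, and the policy function may_read are all hypothetical and chosen only for illustration; this is not how Oracle, J2EE, or SELINKS implement enforcement. The server-centric variant fetches every row and filters in application code, while the database-centric variant registers the policy as a user-defined SQL function so that the check runs as part of query processing.

```python
import sqlite3

CLEARANCE = {"public": 0, "secret": 1}


def may_read(label, clearance):
    """Hypothetical policy: a row is readable if its label is at or below the clearance."""
    return int(CLEARANCE[label] <= CLEARANCE[clearance])


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (body TEXT, label TEXT)")
    conn.executemany(
        "INSERT INTO docs VALUES (?, ?)",
        [
            ("press release", "public"),
            ("field report", "secret"),
            ("cafeteria menu", "public"),
        ],
    )
    # Register the policy as a two-argument user-defined SQL function.
    conn.create_function("may_read", 2, may_read)
    return conn


def server_centric(conn, clearance):
    # Loads the whole table into server memory, then checks each label explicitly.
    rows = conn.execute("SELECT body, label FROM docs ORDER BY rowid").fetchall()
    return [body for body, label in rows if may_read(label, clearance)]


def database_centric(conn, clearance):
    # The label check is part of the query; only permitted rows are returned.
    cur = conn.execute(
        "SELECT body FROM docs WHERE may_read(label, ?) ORDER BY rowid",
        (clearance,),
    )
    return [body for (body,) in cur]
```

Both functions return the same rows, but in the server-centric variant a forgotten call to may_read silently exposes every row, and the full table crosses the database boundary on each request. Note that SQLite runs in-process, so this sketch shows only the shape of the two approaches and not their relative performance.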
Our extension is called Security-Enhanced LINKS, or SELINKS. It employs a server-centric programming model for maximum policy expressiveness, and uses compilation and verification techniques to make performance and trustworthiness competitive with the database-centric approach. We have used SELINKS to implement two applications that enforce interesting security policies. However, this chapter focuses on the features of the SELINKS language. Chapter 6 discusses our example applications as well as some aspects of the implementation of SELINKS in detail.

5.1 Overview

We begin in Section 5.2 by illustrating several of the features of LINKS via a simple multi-tier example program. We then consider the security issues that arise for such multi-tier programs and indicate how these issues might be addressed through the use of label-based security policies.

Our extensions to LINKS consist of two main components. The first is an implementation of a FABLE-like type system for LINKS, which can be used to verify that application programs correctly enforce their policies. The second component of SELINKS is a novel compilation procedure that aims to make the enforcement of policies in database code more efficient. This chapter focuses on a description of the main features of the SELINKS type system. The next chapter motivates and describes our cross-tier policy compilation strategy.

Policy enforcement in SELINKS works much as it does in FABLE. The SELINKS programmer specifies a policy by associating customizable security labels with sensitive data in the program. The usage modes of labeled data are defined via specially privileged enforcement policy functions. The type checker ensures that application programs include the appropriate calls to the enforcement policy, so that all usages of sensitive data are mediated by the policy. Section 5.3 sketches the main elements of how this works.
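The division of labor just described, with labeled data that application code treats as opaque and privileged policy functions that mediate every use, can be sketched in Python as follows. The classes Acl and Labeled and the function policy_read are hypothetical names for this illustration, not SELINKS syntax. Python cannot enforce the opacity of the payload; in SELINKS that guarantee is the job of the type checker, so the underscore convention below merely stands in for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Acl:
    """Hypothetical security label: the set of users allowed to read the data."""
    readers: frozenset


class Labeled:
    """A value protected by a label; application code should treat it as opaque."""

    def __init__(self, value, label):
        self._value = value   # by convention, touched only by policy functions
        self.label = label


def policy_read(user, data):
    """Enforcement policy function: the only sanctioned way to unlabel a value."""
    if user not in data.label.readers:
        raise PermissionError(f"{user} may not read this value")
    return data._value


# Application code: it holds labeled data but must go through the policy to use it.
record = Labeled("salary: 100", Acl(frozenset({"alice"})))
visible_to_alice = policy_read("alice", record)
```

A call such as `policy_read("bob", record)` raises PermissionError here at run time; the point of the FABLE-style type system is that a program which bypasses the policy function altogether is rejected before it runs.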
Although the basic concepts of FABLE translate directly to SELINKS, programming with label-based security policies at the source level requires several additional constructs. Many of these constructs are standard extensions, but their inclusion in SELINKS required addressing some subtle details. For example, we include built-in support for dependently typed records, rather than encoding them with higher-order functions, as we did in Chapter 2. However, adding these constructs to LINKS required adapting techniques from the theory of existential types to manage names bound within the scope of a record. Other constructs involve small theoretical advances. Notable among these is our use of phantom variable polymorphism. When coupled with inference, we have found this feature to significantly reduce the annotation burden of programming with dependent types.

Sections 5.4, 5.5 and 5.6 catalog each of our SELINKS-specific constructs in detail. Where the theory is novel, as with phantom variable polymorphism, we sketch formal definitions of the semantics. However, for the most part, we rely on informal descriptions of the implementation of these features in the current version of the SELINKS compiler. As such, one purpose of this chapter is to serve as a reference manual for the intrepid programmer intent on experimenting with the research prototype that is SELINKS.

We should note that the current version of SELINKS does not support the enforcement of policies in the style of AIR or FLAIR. As we have already observed, we expect working with FLAIR's combination of affine and dependent types at the source level to require too much effort on the part of the programmer. In Chapter 8 we suggest directions for future work that aim to address this limitation.

5.2 An Introduction to LINKS

Modern web applications are typically designed around a three-tier architecture. The part of the application related to the user interface runs in a client's web browser.
The bulk of the application logic typically runs on a web server. The server, in turn, interacts with a relational database that serves as a high-efficiency persistent store. Oftentimes, this architecture is generalized to n tiers. For instance, one might split the web server into a tier that processes HTTP requests and handles the presentation logic, and a so-called application server that runs the core application logic. Multiple web and application servers are also possible, for better load distribution.

Programming such an application can be challenging for a number of reasons. First, the programmer typically must be proficient in a number of different languages: for example, client code may be written in JavaScript; server code in a language like Java, C#, or PHP; and data-access code in SQL. Furthermore, the interfaces between the tiers are cumbersome: the data submitted by the client tier (via AJAX [54], or from an HTML form) is not always in a form most suitable for processing at the server or database. These factors are elements of the so-called impedance mismatch in web programming.

    [Figure 5.1: An overview of the execution model of LINKS. A LINKS program is processed by the LINKS compiler and interpreter: it is compiled to JavaScript for the client and to SQL for the database. The client and the web server interact via function call/return; the web server issues functional list comprehensions to the database.]

LINKS aims to reduce this impedance mismatch by making it easier to synchronize the interaction between the tiers of a web application. LINKS is a strict, typed, mostly functional language, with syntax resembling that of JavaScript and employing ideas from other languages, including XQuery [146], Erlang [44], Kleisli [142], Scheme [71], and others. The principal novelty of LINKS is that it brings together a diverse set of proposals in a single language in a manner that enables a unique execution model.
Rather than construct the multiple tiers of a web application in separate languages (and glue them together via non-standard interfaces), a LINKS programmer writes a single program that expresses the entirety of a multi-tier web application, from client to server to database. Figure 5.1 illustrates the execution model of a LINKS program graphically.

A LINKS program consists of a series of function definitions followed by some initialization code to start the application. Each function is annotated with a qualifier, either client or server, to indicate where it is supposed to run. LINKS provides a code generator that translates client-side functions to JavaScript to run in the browser; additionally, an interpreter runs server functions at the web server. Function calls may traverse the client/server gap: LINKS automatically translates such calls into synchronous remote procedure calls (RPCs) using AJAX [54]. LINKS also allows data access code to be integrated with server-side functions by representing database operations as list comprehensions in the style of Kleisli [142] and LINQ [83]. The server-side interpreter translates list comprehensions to SQL expressions and dispatches these to be run at the database. Thus programs are expressed at a fairly high level while the low-level details are handled transparently by the compiler.

The original LINKS paper [35] provides a comprehensive discussion of the various features of the language. Here, we just attempt to provide the reader with a feel for LINKS programming, with an eye towards the issues that arise when attempting to enforce fine-grained custom security policies.

5.2.1 Programming in LINKS

Figure 5.2 shows a simple, but fairly typical, LINKS program. At a high level, this program provides a web-based interface to a database of employee records. The database contains a table that associates an employee's name with her salary.
The program allows the user to enter a minimum salary; it then selects all records in the database for which the salary exceeds the minimum and renders the result in the browser as HTML.

     1  var employeeTab =
     2    table "Employee" with (name : String, salary : Int)
     3    from (database "EmpDB");
     4
     5  fun getRecords(minSalary) server {
     6    for (var row <- employeeTab)
     7    where (row.salary > minSalary)
     8      [row]
     9  }
    10
    11  fun showRecords(minSalary) client {
    12    var recs = getRecords(minSalary);
    13    var tableBody =
    14      for (var r <- recs)
    15        <tr>
    16          <td>{stringToXml(r.name)}</td>
    17          <td>{intToXml(r.salary)}</td>
    18        </tr>;
    19    <html>
    20      <body>
    21        <table>{tableBody}</table>
    22      </body>
    23    </html>
    24  }
    25
    26  fun main() client {
    27    <html>
    28      <body>
    29        <form method="POST" l:action="{showRecords(minSalary)}">
    30          Enter minimum salary:
    31          <input type="text" l:name="minSalary"/>
    32          <input type="submit" value="Get records!"/>
    33        </form>
    34      </body>
    35    </html>
    36  }
    37
    38  main()

    Figure 5.2: A LINKS program that renders the contents of an employee database in a web browser

The program begins by defining a schema for the database table that stores the employee records (lines 1-3). The remainder of the program shows the functions getRecords, showRecords, and main. Notice that each of these is annotated with a location qualifier (client or server), indicating on which tier it is intended to run. Finally (line 38), we have a call to the main function: this is code that will be run on the client in order to start the program.

The database table in this case is called Employee and is defined as a relation in the database called EmpDB. Each row in this table has two fields (columns). The first, name, stores the employee's name as a String, and the salary field is an Int (the type of integers). A handle to this table is bound to the variable employeeTab, which is in scope for the remainder of the program. All operations on this table (such as querying or updating) will be performed using this handle.
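For readers who do not know LINKS, the query at the heart of this program (select every employee whose salary exceeds a minimum) can be sketched in Python, once as a list comprehension over rows held in memory and once as the SQL a server might send to the database. The sample rows and the helper names all_rows, get_records_comprehension and get_records_sql are assumptions for this illustration; the SQL shown is one plausible translation and is not claimed to be the text that the LINKS interpreter actually generates.

```python
import sqlite3

# A stand-in for the Employee table of Figure 5.2, with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [("Alice", 120), ("Bob", 80), ("Carol", 95)],
)


def all_rows(conn):
    """View the table as a list of records, as LINKS does conceptually."""
    cur = conn.execute("SELECT name, salary FROM Employee ORDER BY rowid")
    return [{"name": n, "salary": s} for n, s in cur]


def get_records_comprehension(rows, min_salary):
    """The query written as a list comprehension over records."""
    return [row for row in rows if row["salary"] > min_salary]


def get_records_sql(conn, min_salary):
    """The same query pushed to the database as SQL."""
    cur = conn.execute(
        "SELECT name, salary FROM Employee WHERE salary > ? ORDER BY rowid",
        (min_salary,),
    )
    return [{"name": n, "salary": s} for n, s in cur]
```

The two functions produce the same list of records; the difference is where the filtering happens, which is the detail that LINKS hides from the programmer by translating the comprehension to SQL.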
LINKS does not currently allow database operations (like queries) to be performed directly from client code. Instead, an interface to these tables is exposed to client functions by server functions that encapsulate the application logic. In this case, we have a single server-side function getRecords that allows the Employee table to be queried for all records where the salary exceeds the argument minSalary.

The LINKS view of a database table is simply a list of records. Under this model, a database query is a list comprehension [135]. The body of getRecords is a single list comprehension that selects data from the Employee table. In particular, for each row in the table (the syntax for (var row <- employeeTab)) for which the where clause is true, the row is included in the final list to which the comprehension evaluates (the syntax [row]). Since the LINKS view of each row is a record, the where-clause projects out the salary field and checks if it is greater than the argument minSalary. The list computed by this comprehension is returned by the function. (LINKS functions simply return the value computed by their last expression; as in most functional languages, there is no explicit return keyword.)

A database list comprehension is checked against the type signature provided as the table schema. In this case, both String and Int are primitive types in LINKS. Therefore, comparing the Int-typed salary field against a String constant in the where-clause of the query would be flagged by LINKS as a type error. Additionally, as far as the programmer is concerned, the types given to the columns of the table are independent of the underlying representation of these types in the database: the translation between the database representation of these types and the LINKS representation is taken care of by the LINKS runtime. However, if no such translation is possible, the current implementation of LINKS will signal a runtime error.
However, it should be straightforward to parameterize the LINKS type checker with a database schema and statically check that the LINKS types given to a table's columns can always be translated to corresponding types in the database.

We now turn to the client functionality, beginning with the main function. This function constructs the initial web page of the application. Its body is an HTML page (LINKS allows XML literals to be embedded within the source) which contains a form to collect the user's input. The two input fields in the form are, first, a text field named minSalary (the name minSalary is in scope throughout the enclosing form element) and, second, a form submission button. When the user enters an integer value in the text field and presses the submit button, the l:action handler specified in the enclosing form element is called. In this case the handler is a local call to the client function showRecords, where the argument is the contents of the text field named minSalary. The LINKS runtime takes care of input validation: in case the user enters a non-integer value in the text field, the runtime will fail to parse the value and refuse to dispatch the function call. (A more graceful failure mode that, say, prompts the user to enter a different value is not yet provided.)

The showRecords function makes a remote call to the server function getRecords, passing in the user input minSalary as the argument. There is no distinction at the source level between a local and a remote call. The LINKS runtime, running in the web browser as a JavaScript library, dispatches this call to the server via a synchronous AJAX call. The returned value is a list recs of database rows that matched the query. The name recs is bound in the remainder of the function (i.e., the notation var x = e1; e2 is LINKS notation for the more familiar let x = e1 in e2).
The function showRecords then iterates through these rows (using the same list comprehension syntax), but this time constructing an HTML representation of the matched rows: each is an HTML table row (the <tr> element) with two columns (the <td> elements) containing the name and salary fields coerced to their XML (equivalently, HTML) representations. Finally, showRecords returns a new HTML page that contains a <table> element, where the body of the table is the list of XML rows constructed by the list comprehension and bound to the tableBody variable.

Finally, in order to explain the examples that appear in the rest of this chapter, it is important to note that, by default, functions are not curried in LINKS. The type of the showRecords function, for instance, is (Int) -> Xml, indicating that it is a function that takes a tuple containing a single Int-typed field as an argument and returns an Xml value. Functions that take multiple arguments usually do so by accepting multiple fields in the argument record. For example, a version of showRecords that took both a minSalary and a maxSalary as arguments is typically defined as fun showRecords(min, max) { ... } and is given the type (Int, Int) -> Xml. It is possible to explicitly define a function as being curried by using the notation fun showRecords (min) (max) { ... }. This function would be given the type (Int) -> (Int) -> Xml, the type of a function that expects a tuple with a single integer as an argument and returns a function that expects a tuple with a single integer, which in turn returns some Xml.

5.2.2 Fine-grained Security with Links

It is natural to want to enforce application-specific security policies for programs like the example of Figure 5.2. For instance, we might want to limit access to an employee's salary information to certain principals only: for instance, the employee herself, her managers, and maybe certain other privileged actors like members of an organization's human-resources team.
One way to apply such a security policy would be to partition the table into multiple tables where all rows in a given table have identical access control requirements. The database can enforce access protection at the level of the table itself, preventing a user from accessing a table when she does not have the right set of privileges. But, for a large organization with a complex managerial hierarchy, such an approach can lead to a proliferation of tables. Managing a large number of tables can easily become unwieldy. Furthermore, the privilege of creating tables and setting access controls is often restricted to users with administrative rights. This makes it difficult for ordinary users to apply discretionary controls to their data with table-level protection. Additionally, indexing data in multiple tables can be difficult or impossible, which can degrade the performance of query execution.

    var employeeTab =
      table "Employee" with (acl : String, name : String, salary : Int)
      from (database "EmpDB");

    fun getRecords(credential, minSalary) server {
      for (var row <- employeeTab)
      where (accessAllowed(credential, row.acl) && (row.salary > minSalary))
        [row]
    }

    fun selectAll() server {
      for (var row <- employeeTab)
        [row]
    }

    Figure 5.3: Enforcing a fine-grained access control policy in LINKS

An alternative approach is to associate some metadata with each row in the employee table that identifies the set of users that can access the record. Queries of the table can be expected to examine this metadata against the credentials of the user issuing the query and return the result only if the access check succeeds. For instance, one might define the Employee table as shown in Figure 5.3. Each row now contains three columns; name and salary are as before, and the new acl field holds some string metadata that represents an access control list. We can then revise our function getRecords to take two arguments, credential and minSalary.
The new argument credential is some token that represents the identity of the user on whose behalf the query is to be executed. The query itself is similar to what we had before, except now, in the where-clause, we include an access control check. This check is a call to the function accessAllowed, passing in the user's credential and the access control list on the row being examined. We only include the row in the list of results if the access control check succeeds.

Of course, we would like to ensure that LINKS programs are always correct with respect to their security policies. For instance, we would like to ensure that access control checks like accessAllowed are always present at the right places in the program. One definition of correctness might be that the program examines the salary field of a row in the database only after it has performed an access check of the corresponding acl field in the same row. Under this definition, using the getRecords function of Figure 5.2 with the table declaration of Figure 5.3 is deemed insecure, since it does an integer comparison on the salary field (thereby examining it) without checking the acl field of the row.

On the other hand, consider the function selectAll shown at the bottom of Figure 5.3. The query in this function simply selects every row in the Employee table and returns it. On its own, we might consider this program to be secure since it certainly does not inspect the salary field of any row in the table. However, clearly the list of rows returned by selectAll contains sensitive data. So, we would also like to ensure that such sensitive data does not flow to a location where it can be inspected by an unprivileged user.

These examples illustrate the two main concerns that our extensions to LINKS must address. First, we aim to ensure complete mediation of the security policy.
By augmenting the type language of LINKS with security labels in the style of FABLE, and modeling functions like accessAllowed as FABLE enforcement policy functions, we can check that the appropriate policy checks are always present in an SELINKS program. Second, we seek to ensure that all cross-tier data flows in the program are consistent with the level of trust we have in those tiers. In particular, our trust model considers code that runs in the client tier to be untrusted, since we cannot easily assure that the client runs the code sent by the LINKS compiler to the web browser. In the context of the example of Figure 5.3, this trust model means that the list of rows returned by selectAll is not allowed to flow directly to a client function: a policy check must intervene to authorize the release of this data to the client.

Additionally, we would like to ensure that database queries that contain calls to potentially complex enforcement policy functions (like accessAllowed) can still be executed efficiently within the database. We defer addressing this concern to Chapter 6, where we show how enforcement policy functions and database list comprehensions can be compiled for good performance.

5.3 SELINKS Basics: Enforcing Policies with Static Security Labels

We begin our presentation of SELINKS by considering how to enforce particularly simple security policies. For pedagogical reasons, we will begin with simple policies specified using static security labels. Subsequent sections will illustrate how to specify and enforce policies using dynamic labels in SELINKS. Whether static or dynamic, specifying and enforcing a security policy in SELINKS typically proceeds in three steps.

First, the policy designer chooses a language of security labels. For example, for the simplest form of information flow policy, we might use the labels Low and High.
Next, we identify the sensitive resources in our program and label their types with security labels that protect them from unrestricted usage by the application program. For instance, we might give sensitive values in the program, such as passwords, types such as String{High}, indicating that these will be treated as High confidentiality. Additionally, library functions that are significant from a security perspective are also given types to reflect their intended usage. For instance, a library function print that prints strings to a user's terminal might be given a type such as (String{Low}) -> (), indicating that only Low-security strings can be printed to the terminal.

Finally, we write enforcement policy functions that give an interpretation to the security labels. Without the enforcement policy, the labels that decorate types are entirely uninterpreted in the program. There is no way, for instance, to allow a Low-security integer to be treated as a High-security one. As in FABLE, the enforcement policy is granted special privileges to interpret label types by defining the conditions under which labeled data can be used, or how labeled data can be coerced from one type to another.

Under the assumption that the enforcement policy is correct, and given that we have assigned proper types to protected data and sensitive library functions, the SELINKS type checker can be used to ensure that an application program meets a set of high-level security goals. As with FABLE, the type system ensures that an application program always relies on the enforcement policy to construct and destruct protected data. Additionally, as we will see in Section 5.5, the SELINKS type system also ensures that protected data is never sent directly to the untrustworthy client tier.

In the remainder of this section, we illustrate each of these three basic steps towards security enforcement in SELINKS.
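The three steps can be approximated in Python as a rough, non-authoritative sketch. The important difference is that Python can only check labels when the code runs, whereas SELINKS checks them statically in the type checker; the names Label, Labeled, print_low and raise_to_high are hypothetical and do not come from SELINKS.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Generic, TypeVar

T = TypeVar("T")


class Label(Enum):
    """Step 1: a language of security labels."""
    LOW = "Low"
    HIGH = "High"


@dataclass(frozen=True)
class Labeled(Generic[T]):
    """Step 2: a value of underlying type T protected by a label (compare t{e})."""
    _value: T      # by convention, only policy functions below touch this field
    label: Label


def print_low(s: Labeled[str]) -> str:
    """Step 3 (policy): only Low-security strings may reach the terminal.

    Returns the raw string that would be printed; raises otherwise.
    """
    if s.label != Label.LOW:
        raise PermissionError("only Low-security strings can be printed")
    return s._value


def raise_to_high(x: Labeled[T]) -> Labeled[T]:
    """Step 3 (policy): allow Low data to be treated as High data."""
    return Labeled(x._value, Label.HIGH)
```

Without functions such as raise_to_high, nothing in this sketch relates Low to High, which corresponds to the labels being uninterpreted until the enforcement policy gives them meaning.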
5.3.1 Defining a Language of Security Labels

Specifying a security policy in SELINKS begins by choosing a language of security labels. In FABLE, we restricted terms in this language to be applications of constructors from an algebraic datatype. While this was adequate in the formal setting, for practical policies, we would like to be able to construct labels that include values other than just the data constructors of an algebraic datatype. For instance, it would be much more convenient to represent an access control list as a list of tuples, where each tuple contains a user's integer UID and the user's name (say, for pretty printing). This would allow us to manipulate access control lists using all the standard list library functions, like searching through the list for an element, folding over it, etc. For this reason, SELINKS generalizes the language of security labels to include arbitrary data values (with some caveats, discussed shortly). An example label type in SELINKS is shown below.

    typename UserRec = (username: String, uid: Int);
    typename Acl = List (UserRec) is lab;

This declaration defines a type alias called Acl, intended to represent an access control policy. This is an alias for the type of a List, where each element of the list is a record with two fields: the first, a String-typed field called username, and the second, an Int-typed field called uid. Here, List is a type constructor defined in the standard library, and the notation List(t) represents the application of this type constructor to the type t.

Notice that this type declaration concludes with the assertion is lab. This assertion serves as a type annotation which signals the programmer's intention to use values of the Acl type as security labels; we call such types label types. One way to think of the is lab annotation is that it asserts that the type Acl is a member of the lab type class [137]. In our current implementation, the semantics of this type class is utterly trivial.
We permit any type declared with the is lab assertion to be treated as a member of the lab type class; i.e., membership in this type class does not demand any particular constraint of the underlying datatype. However, we expect this to change in the near future to accommodate the two features discussed below. In the meantime, we allow the is lab annotation to be elided for convenience. We expect future versions of SELINKS to be more strict with this requirement, in order to satisfy the following two properties.

1. Ensuring the purity of type-level expressions. First, since expressions of label type can appear at the type level, we should ensure that these expressions are pure, i.e., that they have no side effects. Although LINKS is primarily a functional language (unlike a language like ML, LINKS programs cannot manipulate memory via references), programs can have side effects by altering the database. Our current prototype permits type-level expressions to include database operations, although attempting to give a reasonable semantics to such expressions at the type level appears to be unwise. One enhancement that we anticipate is to enrich the LINKS type system so that we can lock all side-effecting computation within a monad [87]. We could then ensure that the only members of the lab type class are types whose values are computed by purely functional code. Recall that we adopted a similar restriction with FLAIR in Chapter 4, where we tracked memory effects in the type system and forbade effectful expressions from appearing at the type level.

2. Ensuring the serializability of label values. Since LINKS targets multi-tier applications, data values are required to be communicated across tiers. For instance, in Section 5.4.2 (and in greater depth in Chapter 6), we will argue that reliable enforcement of security policies requires label-typed values to be stored in the database.
With this in mind, we envisage limiting membership in the lab type class to types whose values can be readily serialized for storage in the database. This would, for instance, exclude function types, since serializing code to the database is unlikely to be efficient.

    typename LatticeLab = [| Low | Med | High |];
    sig foobar: (LatticeLab is lab.High) -> ()
    fun foobar (h) { () }

    Figure 5.4: An example illustrating the syntax of singleton label types in SELINKS

In addition to the lab type, FABLE provides a precise singleton type of labels, lab ~ e. The latter type is only inhabited by the value to which e evaluates (if one exists). An SELINKS version of this construct is shown in Figure 5.4. The type alias LatticeLab stands for a variant type consisting of three constructors, Low, Med, or High. We then define a type for the function named foobar, using the sig construct from standard LINKS. The type we give to this function shows that it expects a single argument, a value of the type LatticeLab, but the label type assertion is lab.High asserts that not only is the argument to be used as a label, but additionally that it must be the value High. That is, the is lab.High assertion refines the variant type LatticeLab to just the single data constructor High. The type checker ensures that the foobar function is only ever called with the argument High. In this case, the body of foobar is trivial (it just returns the unit value), but if foobar were to perform some security-sensitive operation, we would be able to assume that its argument h is High throughout the body of the function. We permit arbitrary label-typed expressions e to be used in the lab.e construction.

5.3.2 Protecting Resources with Labels

Security labels are only useful insofar as they can be used to protect sensitive resources with a policy. This kind of security labeling is a central feature of FABLE, and it translates naturally to SELINKS.
The SELINKS type t{e} is the type of some data of underlying type t, protected by the security label in the expression e.

    sig sock_send : (Socket) -> (String) -> ()

    sig sock_send_low : (Socket{Low}) -> (String{Low}) -> ()
    fun sock_send_low (sock) (data) { ... }

    Figure 5.5: Protecting a socket interface with simple security labels

The code in Figure 5.5 illustrates a particularly simple usage of labeled types in SELINKS. This snippet begins with a type signature for sock_send, a curried function from an API that allows data to be sent on a network socket. This function takes two arguments, the socket and the string data to be sent on the socket, and returns a unit.

In the event that we wish to control what data is sent on which socket, we can protect sockets with security labels indicating the security level of data which they are allowed to carry. An instance of such a protection policy is defined in the types of the next function in the snippet, sock_send_low. In this case, the first argument is a Socket value that is protected by the static label Low, indicating that it is only cleared to carry data that is marked as being Low security. The next argument is a String, but one that is labeled Low security.

These types ensure that the security requirements of the socket interface are observed. An application program cannot call the sock_send function with a protected socket since the types do not match, and must call the sock_send_low function with a socket and data that are both tagged with the label Low.

5.3.3 Interpreting Labels via the Enforcement Policy

Enforcement policy functions in FABLE translate directly to SELINKS in that certain functions can be tagged with the policy keyword, indicating that they are privileged. These policy functions then have access to two special built-in operators, unlabel and relabel, that permit them to manipulate labeled data.
The type checker ensures that application programs (i.e., code that does not have the privilege conferred by the policy keyword) treat labeled data abstractly.

    sig sock_send : (Socket) -> (String) -> ()
    sig new_socket : (String) -> Socket{Low}

    sig sock_send_low : (Socket{Low}) -> (String{Low}) -> ()
    fun sock_send_low (sock) (data) policy {
      sock_send (unlabel (sock)) (unlabel (data))
    }

    sig concat_LH : (String{Low}) -> (String{High}) -> String{High}
    fun concat_LH (l) (h) policy {
      relabel ((unlabel (l) ++ unlabel (h)), High)
    }

    Figure 5.6: An enforcement policy to restrict data sent on a socket

The example program in Figure 5.6, adapted from our previous example, illustrates a usage of enforcement policy functions. As before, the sock_send function is from the socket API and does not pay any particular attention to the security level of sockets or the data that is allowed to be sent on a socket. The new_socket function is also a library function, which provides the only way to construct a new socket. Its type ensures that, by default, new sockets are tagged with the Low label, indicating that they are cleared only to carry Low-security data. (If we were implementing a lattice-based policy, a complete implementation would presumably also provide some way to construct sockets labeled High.)

Since the sock_send function cannot be called directly by an application program with a new socket, the program is forced to use the sock_send_low function. This time, we show how to implement this as an enforcement policy function. The policy keyword that is associated with the function definition gives the sock_send_low function the privilege to use the unlabel operation in its body. In this case, it simply unlabels the sock and data arguments (coercing their types to Socket and String, respectively) and calls the library function, sock_send.

To illustrate a usage of the relabel operator, the program in Figure 5.6 concludes with a policy function that defines how labeled strings can be concatenated.
The function concat_LH is specialized to the concatenation of a Low string with a High string, although, as subsequent examples will show, polymorphism in SELINKS can be used to avoid having to specialize policy functions in this manner. In the body of the function, we first unlabel each argument before concatenating them; the type of the ++ operator ensures that we cannot use it with labeled strings. Then, we use the relabel operator to return a value of type String{High}.

Finally, a note about the policy keyword: the attentive reader will have noticed from Section 5.2 that LINKS functions are usually tagged with qualifiers (like client or server) that indicate the tier on which they are to be executed. In SELINKS, the policy qualifier is overloaded: all policy functions are pinned to the server.

5.4 Enforcing Policies with Dynamic Labels

In this section, we show how SELINKS can be used to specify and enforce policies specified using dynamic labels [149]. We provide two mechanisms to express dynamic label relationships. First, as in FABLE, SELINKS contains dependently typed functions. Second, SELINKS provides built-in support for dependently typed tuples, rather than requiring the programmer to encode them using functions.

5.4.1 Dependently Typed Functions

Figure 5.7 shows an example of dynamic labels using a dependently typed function. The policy function sock_send_dyn is an elaboration of the simpler sock_send_low function from Figure 5.6. The sock_send_low function was specialized to controlling data sent over sockets, where both the data and the sockets were statically known to be labeled as Low. Here, we want to enforce a policy where the labels of the socket and data are represented by some program value at runtime. Prior to sending the data over the socket, we must check that the label of the data is not more secure than the label of the socket.

We want to give sock_send_dyn a type that captures the labeling relationships among its arguments.
In this case, we want to write a type for a function of four arguments, where the first argument l is a label that labels the second argument sock, and where the third argument m labels the fourth argument data. In FABLE, we would write such a type as (x:lab) -> Sock{x} -> (y:lab) -> String{y} -> unit. However, parsing conflicts with existing LINKS notation prevent us from reusing the FABLE notation in SELINKS source programs. Instead, we use the notation Pi x:t -> t'. Here, the term variable x is bound to the formal parameter of type t and is in scope all the way to the right of the arrow, in the type t'. That is, this is the SELINKS version of the FABLE type (x:t) -> t'.

To understand the type of sock_send_dyn shown on line 3, recall (from Section 5.2) that every argument of a function in LINKS is a tuple, i.e., a record where the field names are 1, 2, etc. So, in Pi x:(LatticeLab) -> (Socket{x.1}) -> ..., the name x is bound to the first formal parameter, a tuple that contains a single element of type LatticeLab. The name x is in scope all the way to the right. So, the second argument is a tuple containing a Socket labeled by the LatticeLab provided in the first argument; x.1 projects out the first component of the first formal parameter. Similarly, the third and fourth arguments show a tuple y containing a LatticeLab and a string labeled with the contents of y.

In the body of sock_send_dyn, we check that the label of the data is not greater than the label of the socket. If the check succeeds, we unlabel the socket and the data and call the sock_send library function. Otherwise, we simply return a unit.
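The run-time behavior of this check can be sketched in Python as follows. This is an illustration under stated assumptions rather than SELINKS code: the Socket class and sock_send function are hypothetical stand-ins for the socket API, the three-point ordering Low < Med < High is assumed for LatticeLab, and the curried function is written with nested closures.

```python
from enum import IntEnum
from typing import Callable, List


class LatticeLab(IntEnum):
    """Assumed total order Low < Med < High, used as run-time labels."""
    LOW = 0
    MED = 1
    HIGH = 2


def leq(a: LatticeLab, b: LatticeLab) -> bool:
    return a <= b


class Socket:
    """Hypothetical socket that just records what was sent on it."""
    def __init__(self) -> None:
        self.sent: List[str] = []


def sock_send(sock: Socket, data: str) -> None:
    sock.sent.append(data)


def sock_send_dyn(l: LatticeLab) -> Callable:
    """Curried analogue of (l) (sock) (m) (data): l labels sock, m labels data."""
    def with_sock(sock: Socket) -> Callable:
        def with_label(m: LatticeLab) -> Callable:
            def with_data(data: str) -> None:
                # Send only if the data's label is not above the socket's label.
                if leq(m, l):
                    sock_send(sock, data)
            return with_data
        return with_label
    return with_sock
```

Note that nothing in this sketch ties l to sock or m to data; a caller could pass any label alongside any socket. Recording exactly that association in the type is what the dependent function type provides.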
     1  sig sock_send : (Socket) -> (String) -> ()
     2
     3  sig sock_send_dyn: Pi x:(LatticeLab) -> (Socket{x.1}) -> Pi y:(LatticeLab) -> (String{y.1}) -> ()
     4  fun sock_send_dyn (l) (sock) (m) (data) policy {
     5    if (less_than_eq (m, l)) {
     6      sock_send (unlabel (sock)) (unlabel (data))
     7    } else {
     8      ()
     9    }
    10  }

    Figure 5.7: An enforcement policy for sockets using dependently typed functions

As another example of a dependently typed function, consider the type of the relabel operation as given in the SELINKS standard library.

    Pi x:(α, β) -> α{x.2}

This type states that relabel is a function (polymorphic in the type variables α and β) that takes a tuple of an α and a β as an argument, where α is the type of the data to be labeled and β is the type of the label to be used. In this case, we bind x to the formal parameter, a record containing the data in its first component and the label in its second component. So, the return type of this function, α{x.2}, shows that it returns a value of the same underlying type as the argument passed in, but now the value is protected by a label. In particular, the label that is used is the second component of the argument x that was passed in, i.e., x.2 projects out the second component of the input argument.

When type checking a function application, as in FABLE, we substitute the actual argument for the formal parameter in the return type. For instance, the function call relabel (uid, Grant) from our previous example in fact has the type Int{(uid, Grant).2}. Clearly, a record projection like (uid, Grant).2 is not a value. We would like this function call to have the type Int{Grant}. Happily, the type reduction relation as defined in FABLE translates naturally to SELINKS. We are able to reduce the expression (uid, Grant).2 to the value Grant, as desired. Section 5.6 describes this type reduction process in further detail.

5.4.2 Dependently Typed Records

SELINKS provides special constructs to declare and directly manipulate dependently typed records, rather than encoding them in terms of functions. In our experience, dependently typed records have been the most common way of specifying dynamic labelings in SELINKS.

    sig sock_send_dyn : (l <- LatticeLab, Socket{l}) -> (m <- LatticeLab, String{m}) -> ()
    fun sock_send_dyn (l, sock_l) (m, data_m) policy {
      if (leq (m, l)) {
        sock_send (unlabel (sock_l)) (unlabel (data_m))
      } else {
        ()
      }
    }

    sig sock_send_bad : (l <- LatticeLab, Socket{l}) -> (l <- LatticeLab, String{l}) -> ()
    fun sock_send_bad (l, sock_l) (l, data_l) policy {
      ...
    }

    Figure 5.8: An enforcement policy for sockets using dependently typed records

Figure 5.8 shows a program that uses dependently typed records. The function sock_send_dyn is a revision of the function of the same name from Figure 5.7. Instead of requiring the label and data to be passed to the policy as separate arguments, here we can package the label and data as a record and pass them together as a single argument. The signature declares sock_send_dyn to be a curried policy function whose first argument is a dependently typed tuple, containing a lattice label l and a socket sock_l that is protected by that label. The notation l <- LatticeLab is a binding construct: it binds the name l to the value stored in the first field of the tuple, and the name l is in scope for the remainder of the record declaration. To indicate that the socket is protected by the label l, we give it a dependent type Socket{l}, which makes clear the relationship between the fields of the tuple. Similarly, the next argument of sock_send_dyn is another pair, containing a label m and data_m, some data protected by m. The function definition begins on line 2, where we define patterns (l, sock_l) and (m, data_m) that match the tuples provided to the function as arguments. In the body of the policy function, we can inspect the labels and only permit the data to be sent after checking that m is less than, or equal to, l.
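The record-based formulation can be contrasted with the curried one in a small Python sketch. It is only a run-time approximation under assumptions: Packed is a hypothetical wrapper standing in for a record such as (l <- LatticeLab, Socket{l}), a plain list of strings stands in for the socket, and the Low < Med < High ordering is assumed.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Generic, List, TypeVar

T = TypeVar("T")


class LatticeLab(IntEnum):
    """Assumed total order Low < Med < High."""
    LOW = 0
    MED = 1
    HIGH = 2


@dataclass(frozen=True)
class Packed(Generic[T]):
    """A label packaged together with the payload it protects."""
    label: LatticeLab
    payload: T


def sock_send_dyn(sock: Packed[List[str]], data: Packed[str]) -> bool:
    """Policy: send only if the data's label is <= the socket's label.

    The 'socket' here is a hypothetical list that records sent strings.
    Returns True if the data was sent, False otherwise.
    """
    if data.label <= sock.label:
        sock.payload.append(data.payload)
        return True
    return False
```

Because each label travels with its payload, a caller cannot accidentally supply the label of one value alongside another value, which is the practical convenience of the record form.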
Figure 5.8 concludes with a variation on sock_send_dyn that illustrates a tricky issue when programming with dependent types: shadowing of names can be problematic. The type signature of sock_send_bad shows its first argument as a dependently typed pair in which the first field is bound to the name l. As we've pointed out before, the scope of this name is the remainder of the record, i.e., it is not in scope in the second argument of the function. In the second argument, we have another dependently typed pair, where we bind the first field to the name l. This much is fine: it is clear from the scoping rules that the string in the second argument is protected by the label that it is tupled with, and there is no name l that is being hidden by the name binding in the second dependently typed pair.

The situation in the function definition is different. Here, the arguments (l, sock_l) and (l, data_l) pattern match the tuples, and in the second pattern, the name l shadows the name in the previous pattern. If we give sock_l the type Socket{l} in the body of the function, and allow l to be shadowed, then we have inadvertently severed the association between the socket and its label and mistakenly associated it with the label of the string.

There are many possible solutions to this problem. For instance, we could explicitly α-convert all terms using fresh names before type checking them. Or, we could use a nameless representation such as de Bruijn indices to represent variable bindings [21]. Or we could use some combination of the two, like the recently proposed locally nameless approach [7]. However, the easiest solution, in terms of compatibility with the implementation of LINKS, is to forbid shadowing of variables that may appear in type-level expressions. In effect, this no-shadowing approach rules out programs such as our example, complaining that the second binding of l shadows the first.

Dependently typed records in table types.
Dependently typed records are not limited to function arguments. We can also use them to give types to database tables that store secret data (among other things). Returning to the employee database example from Section 5.2.2, we would like to make explicit the relationship between the access control list and the data that it protects in each row. Given that LINKS models a database row as a record, a natural model for this relationship is a dependently typed record.

Figure 5.9 shows a small policy to protect salary data stored in our example employee database. Line 1 reproduces the type declaration for access control lists shown previously. At line 3, we use a dependently typed record to type each row in the Employee table. The first field, acl, stores data of type Acl. The notation l Acl (as in Figure 5.8) binds the name l to the value stored in the acl field of the record, and the name l is in scope for the remainder of the record declaration. The next field is the name field; we chose not to protect the name with a label. The sensitive data in each row is the salary of the employee. So, we give the salary field a dependent type Int{l}, indicating that it is protected by the label l; i.e., the contents of the acl field. The rest of the example uses the type alias EmployeeRec to stand for this record.

     1  typename Acl = List((username:String, uid:Int)) is lab ;
     2
     3  var employeeTab = table "Employee" with
     4      (acl : l Acl, name : String, salary : Int{l})
     5      from (database "EmpDB");
     6  typename EmployeeRec = (acl:l Acl, name:String, salary:Int{l});
     7  typename Maybe (a) = [|Nothing | Just:a|];
     8
     9  sig releaseSalary: (Credential, EmployeeRec) -> Maybe(Int)
    10  fun releaseSalary (u:Credential, x:EmployeeRec) policy {
    11    unpack x as (acl=m, name=_, salary=s_m);
    12    if (member(u, m)) {
    13      Just(unlabel (s_m))
    14    } else { #Authorization failure
    15      Nothing
    16    }
    17  }

Figure 5.9: A policy protecting salary data in an employee database

Explicit scopes for names using existential packages.
Before proceeding to the rest of this example, we need to clarify a subtle issue in working with dependently typed records: we need to ensure that names bound within a record never escape their scope. For instance, consider the following (incorrect) program.

    fun foo(x:EmployeeRec) { x.salary }

Here, we have a function that accepts an EmployeeRec, x, as an argument. This type is a dependently typed record, where the salary field is protected by the contents of the acl field. Now, since x is a record, in the body of the function we could try to project out the salary field from the record. While attempting to do so is certainly reasonable, giving a type to the expression x.salary is problematic. The salary field is protected by the acl field, but there is no valid name in the current scope that can be given to this label expression. Clearly, giving this program the type (EmployeeRec) -> Int{l} is nonsensical: the label variable l is free. We could try to give x.salary the type Int{x.acl} (which would be accurate), but this does not solve the problem of giving a type to the return type of the function, because the pattern variable x is not in scope in the return type.

Our solution to this problem is standard. We view dependently typed records as a kind of existential package [149, 85, 101]. Under this view, we read the EmployeeRec type as follows: EmployeeRec is the type of a record of three fields, acl, name and salary, where there exists a constant l of type Acl in the acl field, a value of type String in the name field, and an integer labeled with l in the salary field. As is standard when working with existential types, we expect these records to be manipulated using special pack and unpack operations that control the scoping of the existentially bound names.

Unpacking a dependently typed record. To illustrate the usage of the unpack construct we return to Figure 5.9. At lines 9-17, we have a policy function releaseSalary that controls access to the salary field of an employee record.
This function takes a record with two fields as an argument. The first, u, is some representation of a user credential (say, some unforgeable representation of the UID of the user currently logged in to a system). The next argument, x, is our dependently typed employee record. The goal of this policy function is to release the salary field to the caller, but only after checking that the credential u presented is mentioned in the access control list that protects the salary. However, as illustrated before, projecting the salary field out of the record x is not permissible, since the existentially bound name l escapes its scope.

The solution here is to unpack the record x to introduce the name l into the scope before using the salary field. At line 11, we use the syntax unpack x as p; e, for some record pattern p and expression e. We check that the names bound by pattern variables in p are distinct, and that they do not shadow any other names that can appear within a type-level expression. The names bound by the pattern are in scope for the expression e, and we check that no name bound in the pattern p escapes e. In this particular case, we bind the acl field to the name m, which allows us to give s_m, the salary field, the type Int{m}. In the remainder of the body, we check that the credential u is mentioned in the acl, and if it is, we unlabel the salary and expose it to the user. We package the result as an option type (Maybe(Int)), returning Nothing if the authorization check fails. Thus, the body of the unpack operation (and, as a consequence, the value returned by the function) can be given the type Maybe(Int), which does not leak the existentially bound variable m.

The SELINKS type checker ensures that fields in records whose types include existentially bound names can never be projected out of the record. They must always be accessed by unpacking the record.
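As a rough illustration of the unpack-then-check pattern in releaseSalary, the following Python sketch mimics the control flow only. It rests on assumptions: EmployeeRec and release_salary are hypothetical names, the credential is simplified to a bare user id, and the leading underscore on _salary is merely a convention, whereas SELINKS rejects direct projection at type-checking time.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class EmployeeRec:
    """Row of the employee table; the acl field protects the salary field."""
    acl: FrozenSet[int]   # user ids allowed to read the salary
    name: str             # unprotected
    _salary: int          # underscore marks "do not project directly"


def release_salary(uid: int, row: EmployeeRec) -> Optional[int]:
    """Release the salary only if uid is mentioned in the row's own acl."""
    # The "unpack" step: bind the label and the protected field together,
    # so the check below is made against the label that guards this salary.
    m, s_m = row.acl, row._salary
    if uid in m:
        return s_m        # corresponds to Just(unlabel(s_m))
    return None           # corresponds to Nothing (authorization failure)
```

Returning an optional plain integer mirrors the Maybe(Int) result type: nothing in the returned value mentions the label that was bound during unpacking.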
However, fields that do not include such names, like name, or even acl in our example, can both be projected out using the standard dot notation.

Constructing a dependently typed record with pack. The counterpart of the unpack operation (the destructor for a dependently typed record) is the pack operation (the introduction form). The example in Figure 5.10 illustrates its use. Here, we have a trusted login function that produces an unforgeable user credential for a user after checking a username and password against some password database.

     1  typename Auth = [| Grant | Deny |];
     2  typename Credential = (tag:l Auth, userid:Int{l});
     3  typename Maybe (a) = [|Nothing | Just:a|];
     4
     5  sig checkpw : (String, String) -> Maybe(Int)
     6
     7  fun login (uname, password) policy {
     8    switch(checkpw(uname, password)) {
     9      case Nothing -> error("Failed login")
    10      case Just(uid) ->
    11        var cred =
    12          pack (tag=Grant,
    13                userid=relabel (uid, Grant))
    14          as Credential;
    15        cred
    16    }
    17  }

Figure 5.10: A policy to construct unforgeable user credentials

Our representation of a credential is the type Credential, a dependently typed pair consisting of a tag of type Auth and a userid field that is an integer labeled by the value stored in the tag field. Since only policy functions can construct values with a labeled type, we can ensure that application programs cannot forge Credential values. In the body of the login function, we check the supplied username and password by calling some library function checkpw, which returns an option type Maybe(Int) containing the user id of the user if the password check succeeds. We pattern match the result using the LINKS switch construct, and if the check succeeds, we have to return a Credential value. Lines 11-15 show a use of the pack construct. The syntax in general is of the form var x = pack e as t; e', where x is some variable, e and e' are expressions and t is a type.
The semantics is for e to be a record expression that is to be packed into the existential package (equivalently, the dependently typed record) described by the type t. In our case, we have e as (tag=Grant, userid=relabel (uid, Grant)). On its own, this expression can be given the type (tag=Auth is lab .Grant, userid=Int{Grant}), which, although a valid (and extremely precise) type, fails to capture the relationship between the tag and userid fields. The type annotation t in the pack construct is a hint to the type checker to generalize the type given to e so as to introduce the relationship between the fields as prescribed by the Credential type. In this case the generalization succeeds and e is bound to the variable cred (of type Credential) in the remainder e'; in this case, cred is just returned.

Ad hoc inference for dependently typed records. The last example illustrates that the pack construct is simply an annotation that indicates how the type checker should generalize the type of a record expression. Fortunately, such a hint is only very rarely needed. Usually, the type checker is able to infer enough information from the context to decide how to generalize the type appropriately. For instance, if the programmer provided a signature for the login function, (String, String) -> Credential, then there is sufficient information for the type checker to choose the right type without the need for the pack construct. Alternatively, if the record was to be passed as an argument to a function that expected a Credential argument, the type checker would again generalize the type appropriately.

Similarly, the way in which we type check a function's arguments often allows the programmer to avoid writing explicit unpack operations. For example, in sock_send_dyn, the tuple patterns that appear in the function's declaration are type checked exactly as if they were the patterns that unpack a dependently typed record, with the scope of the unpack being the entire function body.
This syntactic sugar for a function's arguments has proved to be very helpful in keeping the notation of our larger example programs relatively lightweight.

    1  sig getRecords: (Credential, Int) -> List(String, Maybe(Int))
    2  fun getRecords(cred, minSalary) server {
    3    for (var row <- employeeTab)
    4    where (switch (releaseSalary(cred, row)) {
    5      case Just(salary) -> salary > minSalary
    6      case Nothing -> false
    7    })
    8    [(row.name, releaseSalary(cred, row))]
    9  }

Figure 5.11: An example program that enforces a policy in a database query

Securing a database query in SELINKS. We conclude this section by combining the programs of Figures 5.9 and 5.10 to apply access controls to our employee database. Figure 5.11 revises the getRecords server function first shown as a LINKS program in Section 5.2.1. Our goal remains to select only the records in the database for which the salary field exceeds the minSalary threshold. The SELINKS type checker ensures that we do the appropriate policy check before examining the salary field. In this case, the check amounts to a call to the releaseSalary function in the where-clause, passing in the user credential and a relevant row in the table. If the check succeeds, the option value returned contains the exposed salary field, which we can then test. This function returns a list of tuples, where each tuple contains the name and the salary from the rows that matched the query.

Notice that on line 8 (the expression that computes the value returned by the comprehension) we have to perform an additional authorization check by calling releaseSalary again. Clearly, this is less than optimal. However, the scoping rules of list comprehensions in LINKS prevent us from simply re-using the result of the authorization query performed in the where-clause. Another source of concern is the efficiency of the query.
If, as we have said before, policy functions like releaseSalary are pinned to the server, is it possible to compile this list comprehension to SQL in such a manner that it can still be executed efficiently (and securely) within the database? The next chapter speaks primarily to the issue of efficiently enforcing security policies that span the server and the database.

5.5 Refining Polymorphism in SELINKS

Like most strongly typed functional languages, the type system of LINKS provides for ML-style let-polymorphism. In extending LINKS with security typing, this kind of polymorphism presents us with a useful opportunity. The parametricity results of Reynolds [81] and Wadler [136] guarantee that code that is polymorphic in the type of some data must view that data abstractly. This allows us to safely pass protected data to well-typed polymorphic code and rest assured that the data remains protected.

In Section 5.5.1 we show how the power of type polymorphism can be extended to polymorphism over terms that appear at the type level. The result, phantom variable polymorphism, confers two main benefits on SELINKS programs. First, as with standard polymorphism, we can derive useful parametricity results about programs that use phantom variable polymorphism. Additionally, we show how source programs can be simplified substantially through the use of phantom variables, both through the re-use of code (by avoiding over-specialization) and by enabling a simple and tractable form of type inference.

    sig add : (l LatticeLab, Int{l}) -> (m LatticeLab, Int{m}) -> Int{lub l m}
    fun add (l, x_l) (m, y_m) policy {
      relabel ((unlabel (x_l) + unlabel (y_m)), lub l m)
    }

    fun addcaller (x:Int{High}, y:Int{Low}) {
      add(High, x)(Low, y)
    }

Figure 5.12: A lattice-based policy for integer addition

However, enhancing polymorphism in SELINKS by unleashing phantom variables is only half the story.
We must also rein in the power of standard type polymorphism to cope with the cross-tier execution model of LINKS. Since we have no way of guaranteeing that code that runs at the client respects the abstractions specified in its types, we need a way to control the degree of polymorphism that can be used in client code. In Section 5.5.2, we show how to refine polymorphism in SELINKS by stratifying the language of types into a family of kinds. This allows us to ensure that abstraction violations in client code do not compromise the security of protected data.

5.5.1 Phantom Variables: Polymorphism over Type-level Terms

To illustrate the need for phantom variables, consider the sample program in Figure 5.12. This is a policy function that defines the semantics of integer addition under a lattice-based information flow policy. As in our other examples, this policy function takes two dependently typed pairs of a label and a protected integer as arguments. In the body, we unlabel each integer, add them together, and then relabel the result with a label that is the least upper bound of the two labels.

    sig add : phantom l.(Int{l}) -> phantom m.(Int{m}) -> Int{lub l m}
    fun add (x_l) (y_m) policy {
      relabel ((unlabel (x_l) + unlabel (y_m)), lub l m)
    }

    fun addcaller (x:Int{High}, y:Int{Low}) {
      add(x)(y)
    }

Figure 5.13: A lattice-based policy for integer addition, with phantoms

Even though this function receives the labels l and m as arguments, the runtime behavior of this function is entirely independent of the concrete values chosen for the labels. To see why, recall that both unlabel and relabel operations are erased at runtime; they serve only as type coercions. After erasing these operations, we see that the body of the function is simply x_l + y_m. The only reason l and m are mentioned in the arguments is because we need to provide names for the labels of the integer arguments.
Unfortunately, just because we need place-holders for the names of the labels, we force a caller of this function to pass in concrete label terms as arguments. In the function addcaller, these labels are particularly simple, but in practice, constructing these label terms can often be cumbersome. We would much prefer a construct that allows the label names in the arguments of add to be bound, without requiring that the exact label terms be passed in as arguments.

The revised version of add in Figure 5.13 makes use of phantom label polymorphism and solves exactly this problem. The type signature of add states that the first argument is an integer labeled with a label l, for some label l. The notation phantom l. serves as a binder for l, and the name is in scope all the way to the right. Similarly, the next argument is an integer labeled m, for some label m. The return type is the same as before. In the definition of add, notice there are no explicit term arguments for the label variables l and m, which explains why we call them phantom variables. Since add does not receive the labels as concrete arguments, a result that concludes that the runtime behavior of add is parametric with regard to label values l and m is trivial; such a result is useful when reasoning about the correctness of the policy implementation.

Extensions to syntactic forms of FABLE

    Expression    e ::= ... | λ phantom ȳ. x:t.e | ...     abstraction with phantom variables
    Types         t ::= ... | ȳ.x:t1 → t2                  function type
    Environment   Γ ::= ... | x :p t | ...                 phantom variable bindings
    Phase index   φ ::= term | type                        phase distinction

Extensions to static semantics of FABLE (judgment form Γ ⊢φ e : t)

    ȳ = FV(t) \ dom(Γ)     Γ, ȳ :p lab ⊢ t     Γ, ȳ :p lab, x:t ⊢φ e : t'
    ----------------------------------------------------------------------- (T-ABS)
    Γ ⊢φ λ phantom ȳ. x:t.e : ȳ.x:t → t'

    Γ ⊢φ e1 : ȳ.x:t1 → t2     Γ ⊢φ e2 : t1'     σ(t1) = t1'     σ' = (σ, x ↦ e2)
    ----------------------------------------------------------------------- (T-APP)
    Γ ⊢φ e1 e2 : σ'(t2)

    x :p t ∈ Γ
    --------------- (T-PHANTOM)
    Γ ⊢type x : t

    x:t ∈ Γ
    --------------- (T-VAR)
    Γ ⊢φ x : t

Figure 5.14: Extending FABLE with phantom variables
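The erasure argument can be made concrete with a small Python sketch. Everything here is an assumption for illustration: Lab, lub and the three function names are hypothetical, and Python has no static label types, so the label computation that SELINKS performs in the type checker is written as an ordinary function purely to show that it is separable from the runtime addition.

```python
from enum import IntEnum


class Lab(IntEnum):
    """Hypothetical two-point lattice with Low below High."""
    LOW = 0
    HIGH = 1


def lub(l: Lab, m: Lab) -> Lab:
    """Least upper bound in a totally ordered lattice is just the maximum."""
    return max(l, m)


def add_with_witnesses(l: Lab, x: int, m: Lab, y: int):
    """Figure 5.12 style: the caller must supply label terms at runtime."""
    return lub(l, m), x + y


def add_erased(x: int, y: int) -> int:
    """Figure 5.13 style after erasure: the body is simply x + y."""
    return x + y


def add_result_label(l: Lab, m: Lab) -> Lab:
    """What the checker would compute for the return type Int{lub l m}."""
    return lub(l, m)
```

Because add_erased never sees a label, its result cannot depend on one, which is the parametricity observation the text calls trivial.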
Not having to pass in explicit term witnesses for these labels simplifies the code of the caller. For example, in addcaller (Figure 5.13), we call the add policy function with a High and a Low integer, respectively. Notice that the caller does not even have to explicitly instantiate the phantom variables l and m; the type checker is able to infer the instantiations as High and Low, respectively, and compute the return type of this function as Int{lub High Low}. In the remainder of this section, we sketch an extension to the static semantics of FABLE that supports this form of phantom variable polymorphism.

Static semantics of phantom variables in SELINKS. Figure 5.14 begins with an extension to the syntax of FABLE (which mirrors the concrete syntax for phantoms in SELINKS). Term abstraction, λ phantom ȳ. x:t.e, now binds two kinds of variables: the λ-bound variable x is standard, while the phantom-prefixed list ȳ binds phantom label variables. These represent label terms that require no run-time witness, and will be used to express the just-described flavor of polymorphism over the label expressions that appear in the first argument's type. Whereas previously the type of a function was simply (x:t) → t', we now record the list of phantom variables that can appear in the argument. In the type ȳ.x:t1 → t2, the list ȳ represents the free (phantom) variables in the formal parameter t1. As before, x names the formal parameter. Both x and ȳ are bound in t2.

Next, we extend the typing environment to include an additional form of binding for phantom variables: x :p t. Since all phantom variables are implicit parameters that have no runtime witness, we must ensure that these variables are never used in code that may be executed at runtime. Maintaining a separate binding construct in Γ will allow us to enforce this invariant. However, in order to do so, we must also parameterize our static semantics with a phase index φ that indicates whether we are type checking a type- or a term-level expression.
(We used a similar mechanism in the semantics of FLAIR to rule out side effects for type-level expressions.) Thus, our typing judgment has the form Γ ⊢φ e : t.

The new rules in the system pertain mainly to the typing of abstractions and their applications. (The original semantics of FABLE are in Figure 2.4.) In (T-ABS), the first premise ensures that the phantom variables ȳ precisely record the free variables of the formal parameter's type, t. When a function is applied we will attempt to infer instantiations for all these free variables by unification. Ensuring that exactly the free variables are mentioned in ȳ allows us to guarantee that such an instantiation, if one exists, can always be computed. The next premise ensures that the ascribed type of the formal is well formed. In particular, since the phantom variables are bound in t, we check t in a context extended with the phantoms. Importantly, the types of the phantoms show that they can only be instantiated with label-typed terms. Finally, the last premise checks the body of the abstraction e as usual, in a context extended with the formal parameter x and with the phantom variables. The rest of the type rules will ensure that the phantoms never appear within a subterm of e that has operational significance.

In (T-APP), the rule for applications, the first two premises are standard. In the third premise, σ(t1) = t1', we compute a substitution σ of the phantom variables in the formal parameter t1 that allows it to be unified with the type t1' of the actual argument. A separate technical report [123] shows that computing such a substitution is decidable, given the constraints of the first premise of (T-ABS). Finally, in the conclusion, we substitute the actual argument e2 for the formal parameter x in the return type, as is standard. However, we also instantiate all the phantom variables in t2 with their substitutions σ.

Finally, we show the rules that ensure that phantom variables are never used in runtime computations.
(T-VAR) asserts that variables in the context that are bound using normal bindings can be used in both the term and the type phase. However, according to (T-PHANTOM), phantom-bound variables can only be used in the type phase. Ensuring that these variables are only used in the type phase ensures that a policy function like add is parametric in its phantom labels l and m.

5.5.2 Restricting Polymorphism by Stratifying Types into Kinds

While we can ensure that both the server and the database run type-correct LINKS code, such an assurance is not easy to provide for the client tier. This means that we must ensure that protected data (i.e., data that is given a labeled type) is never sent directly to the client. However, a naive use of type polymorphism, as in the example of Figure 5.15, can cause this invariant to be violated.

    fun leak() server {
      var x:String{High} = read_password ();
      consume(x)
    }

    sig consume : (α) -> ()
    fun consume(x) client { () }

Figure 5.15: Example illustrating how client code can violate its abstractions

The example shows a program with a server function leak and a client function consume. In the body of leak, we read a High-security string x out of a secret password file and then pass x to the client function consume. The type of consume shows that it is parametric in the type of its argument. This ensures that consume treats its argument abstractly in its body, and indeed it does; it simply returns unit. However, the call to consume in leak is dispatched across tiers to the client's web browser. Nothing prevents the client from directly examining the secret argument x. In other words, untrusted client code (or type-incorrect code) can freely mount abstraction-violating attacks that can compromise the security of protected data.

Our solution to this problem is to stratify SELINKS types into two kinds: U-kind and M-kind. A type t that inhabits the kind U is assured to contain no labeled types; U is the unlabeled kind.
In contrast, a type t that inhabits the kind M may contain a labeled type; M is the maybe-labeled kind. We restrict client code to only manipulate data of types that reside in U-kind. Examples of types that inhabit U-kind are Int, String, (Int, String), etc. Types that reside in M-kind include Int{Low}, String{High}, (Int{Low}, String), etc. The last of these types is interesting in that although it is itself unlabeled, since it contains a labeled component, it is considered to be in M-kind. We could also permit function types that have labeled types only in a negative position to reside in U-kind. For instance, the type (Int{High}) -> () can be defined as residing in U-kind, since it expects protected data as an argument rather than producing protected data as a result. However, such a function is useless at the client, since the client has no way to manufacture such an argument. For simplicity, our current implementation deems such a function type as being in M-kind.

We also include a sub-kinding relation: every type that resides in U-kind also resides in M. Additionally, by default, every type variable is considered to be instantiable only with types residing in U-kind. An explicit annotation is required in order to introduce a type variable at M-kind (using the syntax α::M). Revisiting our example program, the type checker deems it insecure because the type variable α in consume is treated as being a U-kinded variable; i.e., α::U. Since the variable x has an M-kinded type, the call to consume in the server function leak is type incorrect, since an M-kinded type cannot be used to instantiate a U-kinded variable. An attempt to circumvent this check by explicitly declaring α to be of M-kind is shown below:

    sig consume : (α::M) -> ()
    fun consume(x) client { () }

However, this program is also flagged by SELINKS because M-kinded types are not permitted in client code.
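The kinding discipline of Section 5.5.2 can be approximated by a short Python sketch over a toy type representation. The classes Base, Labeled, Tup and Fun and the functions kind_of and can_instantiate are hypothetical and are not part of SELINKS; the sketch follows the stated simplification that any function type mentioning a labeled type is placed in M-kind.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Base:
    name: str                      # e.g. Int, String


@dataclass(frozen=True)
class Labeled:
    base: "Type"                   # e.g. String{High}
    label: str


@dataclass(frozen=True)
class Tup:
    items: Tuple["Type", ...]      # e.g. (Int{Low}, String); () is Tup(())


@dataclass(frozen=True)
class Fun:
    arg: "Type"
    ret: "Type"


Type = Union[Base, Labeled, Tup, Fun]


def kind_of(t: Type) -> str:
    """Return "U" if t contains no labeled type anywhere, otherwise "M"."""
    if isinstance(t, Base):
        return "U"
    if isinstance(t, Labeled):
        return "M"
    if isinstance(t, Tup):
        return "M" if any(kind_of(i) == "M" for i in t.items) else "U"
    if isinstance(t, Fun):
        # Simplification from the text: labeled types in any position give M.
        return "M" if "M" in (kind_of(t.arg), kind_of(t.ret)) else "U"
    raise TypeError(f"unknown type: {t!r}")


def can_instantiate(var_kind: str, t: Type) -> bool:
    """Sub-kinding U <= M: an M variable takes any type, a U variable only U types."""
    return var_kind == "M" or kind_of(t) == "U"
```

Under this model the call consume(x) in leak is rejected because can_instantiate("U", String{High}) is false, and declaring the variable at M-kind only moves the error, since client code may not mention M-kinded types at all.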
5.6 Expressiveness of Policy Enforcement in SELINKS

A central argument in favor of the FABLE style of policy enforcement is the degree of expressiveness that it affords. This flexibility in FABLE is derived from two main insights. First, by proposing the notion of an enforcement policy in order to interpret a language of security labels, FABLE can enforce highly customized security policies. We have already seen that this basic idea translates directly to SELINKS.

The second key to the expressiveness of FABLE is its use of a simple but powerful combination of refinement types within a dependent type system. A FABLE policy designer willing to write complex type-level expressions can leverage the power of type-level computation to statically enforce a policy. Where such types become unwieldy, a policy designer can discharge the burden of proof to runtime: type refinements in FABLE allow the result of runtime checks to be incorporated in a flow-sensitive manner in the types of a program. In this section, we discuss the implementation in SELINKS of these latter two features.

Our bias in SELINKS is towards policies that are specified via dynamic labelings. In such a setting, the possibility of purely static enforcement of a security policy is severely limited. With this in mind, our implementation focuses mainly on flow-sensitive type refinements based on runtime checks, leaving the implementation of type-level computation fairly rudimentary. We speculate that future implementations might benefit from type-level computation via powerful technologies like automated theorem provers.

5.6.1 Type-level Computation

The reduction of type-level expressions in a dependent type system is something of a double-edged sword. Importantly, performing computation at the type level increases the expressiveness of the type system. For example, our ability to enforce purely static information flow controls in FABLE and in FLAIR hinges crucially on the reduction of type-level expressions.
But, in a language like SELINKS (or FABLE) which includes general recursion in the form of fixed points, performing computation at the type level leads directly to the undecidability of type checking. What is more, type-level computations may involve open terms and it is not always clear how such terms are to be reduced.

In light of the difficulties due to type-level computation, one might consider forgoing the expressiveness that it offers and settling for a more tractable system in which type-level expressions never need to be reduced. However, for a dependent typing system like FABLE, such an option is not viable. We turn to Altenkirch et al. [2] for a particularly pithy explanation of why this is so:

    Let us examine the facts, beginning with the type rule for application:

        e1 : (x:t) → t'      e2 : t
        ----------------------------
        e1 e2 : t'[e2/x]

    It's clear from the premises that, as ever, to check an application we need to compare the function domain and the argument type. It's also clear from the rule's conclusion that these types may contain expressions. If computation is to preserve typings, then f (2 + 2) should have the same type as f 4, so t'[(2 + 2)/x] must be the same type as t'[4/x]. To decide typechecking, we therefore need to decide some kind of equivalence up to computation.

This argument makes it clear that in order to show that a calculus like FABLE is sound via subject reduction, we must include a type equivalence relation based on the reduction of expressions that appear in types (which is exactly the purpose of the (T-CONV) rule in the semantics of Figure 2.4). However, from the perspective of an implementation like SELINKS, ensuring that computation preserves typing is, at best, pedantic. After type checking a program and allowing it to run, we never actually re-check it after it has taken a step of reduction.
Besides, given that we have proved that FABLE is sound, we can rest in the knowledge that if we were to include type-level computations in SELINKS, we would always be able to check that the type of a program is invariant under reduction.

Our current implementation of SELINKS takes a conservative (and practical) view of type-level reduction: we only include as much as is necessary to ensure that our example applications can be type checked. In particular, we concede the full expressive power of FABLE in that we are unable to statically enforce policies like information flow (we must rely on some runtime checks). In return, we profit from the simplicity of our current implementation.

In practice, conceding the expressiveness of static enforcement is not a severe handicap. Enforcing a policy without runtime checks demands complete static knowledge of the policy. For real applications, policies are typically not discovered until runtime. For instance, in our scenario which attempts to protect salary information in an employee database, static information about the labels stored in each row is scant. Even a special-purpose security type system like Jif [31] must rely on runtime checks to enforce this policy.

Our implementation currently supports only the following forms of type-level reduction:

1. Reducing projections of fields from a record. This allows us to type examples that use the relabel operator (among others). For instance, we are able to prove that Int{(uid, Grant).2} is equivalent to Int{Grant}. We also support reductions that result from pattern matching a record, a variation on projecting a field from a record.

2. Refinement due to type information. In a computation where a variable l is free in a context where we have a precise type for l (such as the singleton type lab .High), we permit reduction to proceed by substituting for l with an expression derived from l's type.

Notably, our type equivalence relation does not extend to β-equivalence.
If the additional power of type-level computations should become necessary, extending our existing techniques to include β-equivalence is feasible. In a related technical report [123], we discuss a simple (partial) decision procedure that can prove the equivalence of type-level expressions even in the presence of free variables, and prove the procedure sound (including β-equivalence). However, if the main motivation for type-level reduction is expressive power, it is unclear that a purely syntactic equivalence algorithm, with ad hoc techniques to cope with free variables, is the way forward. A more promising approach might be to interface with more powerful formal tools (such as an automated first-order SMT solver like Z3 [40]) in order to prove β-equivalence of expressions.

5.6.2 Refining Types with Runtime Checks

In the absence of a complete type-reduction relation, the need to trade off static enforcement in favor of dynamic enforcement is critical in SELINKS. In FABLE, we supported a form of type refinement based on the results of pattern matching operations performed at runtime. The SELINKS type checker reproduces this behavior by accumulating equality constraints in each branch of a pattern matching statement. In deciding the equivalence of types, SELINKS can appeal to the set of equality constraints to show that two type-level expressions are equivalent (without needing to reduce them).

    1  sig print : (String{Low}) -> ()
    2
    3  sig dynprint: (l LatticeLab, String{l}) -> ()
    4  fun dynprint (l, x) {
    5    switch (l) {
    6      case Low -> print (x)
    7      case _ -> ()
    8    }
    9  }

Figure 5.16: Refining a type based on the result of a runtime check

An example of this behavior is shown in Figure 5.16. At line 1 of this example, we define an interface for a library function print which states that only strings labeled Low are allowed to be printed to the terminal. Next, we have a function dynprint, whose argument is a dependently typed pair consisting of some label l and a string x labeled with l.
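The constraint-based equivalence check behind Figure 5.16 can be sketched in Python as follows. This is a toy model under explicit assumptions: types are (base, label) pairs, constraints are a dictionary from label variables to labels, and the names resolve, types_equal and check_dynprint_branch are hypothetical; the real checker works over full type-level expressions.

```python
def resolve(label: str, constraints: dict) -> str:
    """Follow equality constraints (variable -> label) until no rewrite applies."""
    seen = set()
    while label in constraints and label not in seen:
        seen.add(label)               # guard against cyclic constraints
        label = constraints[label]
    return label


def types_equal(t1: tuple, t2: tuple, constraints: dict) -> bool:
    """Decide equality of labeled types (base, label) modulo the constraints."""
    (base1, lab1), (base2, lab2) = t1, t2
    return base1 == base2 and resolve(lab1, constraints) == resolve(lab2, constraints)


def check_dynprint_branch(matched_pattern) -> bool:
    """Would print(x) type check inside `case <matched_pattern>` of switch (l)?

    x has the declared type String{l}; print expects String{Low}.
    A wildcard branch (None) contributes no equality constraint.
    """
    constraints = {} if matched_pattern is None else {"l": matched_pattern}
    return types_equal(("String", "l"), ("String", "Low"), constraints)
```

Only the branch that matched Low records the constraint that equates l with Low, so only there does String{l} become interchangeable with String{Low}; no reduction of type-level expressions is involved.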
Statically, we only know that l inhabits the LatticeLab type, and thus x could be a High-security string. So, before we can print x, we must establish that l is Low. The body of dynprint does exactly this: it pattern matches l and, in the case where it is Low, calls the print function. The type checker checks the print(x) function call in a context that includes the equality constraint l = Low. Given that the declared type of x is String{l}, in the presence of the equality constraint, the type checker is able to prove that String{l} is in fact equivalent to String{Low}, which is sufficient to type check the call to print.

5.7 Concluding Remarks

This chapter has described our efforts in adapting the core formalism of FABLE to a full-fledged programming language like LINKS. The result, SELINKS, offers a variety of constructs that aim to make programming with a dependently typed security-oriented programming language practical. However, as ever, the proof of the pudding remains in the eating. We defer a verdict on the practicality of SELINKS to the next chapter, wherein we describe our experience putting SELINKS to use in the construction of two secure web applications.

6. Building Secure Multi-tier Applications in SELINKS

We have used SELINKS to implement two applications. The first is SEWIKI, an online document management system that allows sensitive documents to be shared securely across a community of users. SEWIKI implements a combination of a fine-grained access control policy and a data provenance policy [22]. We have also implemented SEWINESTORE, an e-commerce application that implements a fine-grained access control policy. We were able to reuse much of the policy code across the applications, suggesting that SELINKS promotes the modular enforcement of security policies.

Critical to ensuring reasonable performance for these applications is a novel compilation strategy for SELINKS code.
Recall that a security policy in SELINKS is enforced by requiring application programs to include calls to privileged enforcement policy functions that guard access to protected resources. The naive approach to compiling data access code that includes calls to these policy functions results in performance that is comparable to the server-centric approach to policy enforcement. Rather than insisting that policy functions execute only in the web server, our approach is to translate enforcement code to user-defined functions (UDFs) stored in the database. These functions can be called directly from queries running within the database. Performance experiments (Section 6.4) show that this cross-tier enforcement mechanism in SELINKS substantially improves application throughput when compared to server-only enforcement.

This gain in performance does not come at the expense of expressiveness. Enforcement functions can also be called as necessary within the server to enforce more expressive, end-to-end policies, e.g., for tracking information flow. Nor must we compromise on the benefit of protecting multiple applications with a common database-level policy. By associating the policy UDFs with views on database tables, multiple applications can be protected by a uniform policy. As such, cross-tier enforcement in SELINKS retains many of the best features of both the database-centric and server-centric approaches while minimizing the drawbacks of each.

Furthermore, SELINKS makes secure applications more portable. Security policy enforcement relies only on common DBMS support for user-defined functions, and not on particular security features of the DBMS. Because programmers write enforcement functions in SELINKS's high-level language, they need not write variants of their application for different UDF languages. At the moment our implementation (Section 6.3) targets only PostgreSQL, but we believe other DBMSs could be easily supported.
In summary, the core contribution of this chapter is a demonstration that SELINKS is well-suited to building multi-tier applications that enforce expressive security policies in an efficient, reliable, and portable manner.

6.1 Application Experience with SELINKS

This section illustrates that SELINKS can support applications that enforce fine-grained, custom security policies. We present two examples we have developed: a blog/wiki, SEWIKI, and an on-line store, SEWINESTORE. Demos of both applications can be found at the SELINKS web-site, http://www.cs.umd.edu/projects/PL/selinks.

6.1.1 SEWiki

Our design for SEWIKI was motivated by Intellipedia, discussed in Chapter 1. As such, we aim to satisfy two main requirements:

Requirement 1: Fine-grained secure sharing. SEWIKI aims to maximize the sharing of critical information across a broad community without compromising its security. To do this, SEWIKI enforces security policies on fragments of a document, not just on entire documents. This allows certain sections of a document to be accessible to some principals but not others. For example, the source of sensitive information may be considered to be high-security, visible to only a few, but the information itself may be made more broadly available.

Requirement 2: Information integrity assurance. More liberal and rapid information sharing increases the risk of harm. To mitigate that harm, SEWIKI aims to ensure the integrity of information, and also to track its history, from the original sources through various revisions. This permits assessments of the quality of information and audits that can assign blame when information is leaked or degraded.

As discussed in the introduction, these requirements are germane to a wide variety of information systems, such as on-line medical information systems, e-voting applications, and on-line stores.

The implementation of SEWIKI consists of approximately 3500 lines of SELINKS code.
It enforces a combined group-based access control policy and provenance policy.

    typename Group = [| Principal: Int | Auditors | Admins |];
    typename Acl = (read:List(Group), write:List(Group));
    typename Op = Create | Edit | Del | Restore | Copy | Relab
    typename Prov = List(oper:Op, user:String, time:String)
    typename DocLabel = (acl: Acl, prov: Prov)

    Figure 6.1: The representation of security labels in SEWIKI

As discussed in Chapter 5, implementing a security policy in SELINKS proceeds in three steps. First, we must define the form of security labels which are used to denote policies for the application's security-sensitive objects. Second, we must define the enforcement policy functions that implement the enforcement semantics for these labels. Finally, we must modify the application so that security-sensitive operations are prefaced with calls to the enforcement policy code. We elaborate on these three steps in the context of SEWIKI.

Security labels. Documents are protected with security labels of type DocLabel, the record type shown in Figure 6.1. It has two fields, acl and prov, representing labels from the access control and provenance tracking policies, respectively. The access control part is defined by the type Acl, which is itself a record containing two fields, read and write, that maintain the lists of groups authorized to read and modify a document, respectively. At the moment, we have three kinds of groups: Principal(uid) stands for the group that contains a single user uid; Auditors is the group of users that are authorized to audit a document; and Admins includes only the system administrators.

We also address information integrity by maintaining a precise revision history in the labels of each document node; this is a form of data provenance tracking [22]. This part of a label, having type Prov, is also shown in Figure 6.1.
A provenance label of a document node consists of a list of operations performed on that node, together with the identity of the user that authorized each operation and a time stamp. Tracked operations are of type Op and include document creation, modification, deletion and restoration (documents are never completely deleted in SEWIKI), copy-pasting from other documents, and document relabeling. For the last, authorized users are presented with an interface to alter the access control labels that protect a document. This provenance model exploits SELINKS's support for custom label formats. This policy does not directly attempt to protect the provenance data itself from insecure usage. We have shown in Chapter 2 that protecting provenance data is an important concern and is achievable in SELINKS without too much difficulty.

SEWIKI's label-based policies can be applied at a fine granularity. In what follows we discuss SEWIKI's document model and the three policy elements of a DocLabel.

Document structure. An SEWIKI document is represented as a tree, where each node represents a security-relevant section of a document at an arbitrary granularity: a paragraph, a sentence, or even a word. Security labels are associated with each node in the tree. When manipulating documents within the server, the document data structure is implemented as a variant type. To store these trees in a relational database, we define a database table documents as shown in Figure 6.2. The first column in this table, docid, is the primary key. The second column stores the row's security label, having type DocLabel. The third column's data has the labeled type String{l}, i.e., it is protected by the label in the doclab field.
    var doc_table_handle =
      table "documents" with
        (docid : Int,
         doclab : l <- DocLabel,
         text : String{l},
         ischild : Boolean,
         parentid : Int,
         sibling : Int)
      from database "docDB";

    fun access_text (cred, row) policy {
      unpack row as (doclab=dl, text=x | _);
      if (member(cred, dl.acl.read)) { Just(unlabel(x)) }
      else { Nothing }
    }

    Figure 6.2: A document model and enforcement policy for SEWIKI

The parentid field is a foreign key to the docid of the node's parent, the sibling field is an index used to display the sub-documents in sequential order, and the ischild field is used to indicate whether this node is a leaf (containing text) or a structural node (containing sub-nodes). To retrieve an entire document, we fetch the parent, look up all the immediate children (by searching for nodes with a parentid of the parent), then recursively look up all the children's children, until we retrieve all the leaf nodes. (Although other representations of n-ary trees are possible, ours is a fairly typical choice when trees have to be stored in a relational database.)

Enforcement Policy. Authorization checks in SEWIKI are implemented with an enforcement policy similar to the function access_text shown at the bottom of Figure 6.2. The first argument, cred, is the user's login credential, and has type Group; the second argument, row, is a record representing a row in the documents table. (LINKS type inference infers the types of the first two arguments.) The function returns a value of the option type Maybe(String). This function is marked with the policy qualifier to indicate that it is a part of the enforcement policy.
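To make the shape of this check concrete, the following is a minimal Python model of access_text, offered as a sketch rather than as SELINKS code. The Labeled wrapper, the dictionary layout of a row, and the use of plain strings for groups are all assumptions made for illustration; in SELINKS the separation between labeled and unlabeled data is enforced statically by the type checker rather than by a runtime wrapper.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Labeled:
    """Hypothetical stand-in for String{l}: application code is expected
    to leave the payload alone and go through the policy function."""
    _value: str

def access_text(cred: str, row: dict) -> Optional[str]:
    """Model of the policy function: release the text only if the
    credential appears in the read list of the row's label."""
    label = row["doclab"]          # plays the role of dl
    protected = row["text"]        # plays the role of x : String{dl}
    if cred in label["acl"]["read"]:
        return protected._value    # plays the role of Just(unlabel(x))
    return None                    # plays the role of Nothing

# Example row; groups are modelled as plain strings for brevity.
row = {
    "docid": 1,
    "doclab": {"acl": {"read": ["Auditors"], "write": ["Admins"]}, "prov": []},
    "text": Labeled("quarterly salary figures"),
}

print(access_text("Auditors", row))
print(access_text("Admins", row))
```

Returning an optional value rather than raising an error matches the Maybe(String) result type described above, and it lets callers treat a denied row the same way as a row that does not match.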
    1  fun getSearchResults(cred, keyword) server {
    2    for (var row <- doc_table_handle)
    3    where (var txtOpt = access_text(cred, row);
    4      switch (txtOpt) {
    5        case Just(data) -> data ~ /.*{keyword}.*/
    6        case Nothing -> false
    7      })
    8    [row]
    9  }

    Figure 6.3: A function that performs a keyword search on the document database

In the body of the function, we first unpack the dependently typed record that represents the row (Section 5.4.2 explains this construct), binding the doclab field to dl and the text field to the variable x (the syntax | _ allows the rest of the fields to be ignored). Since x has the labeled type String{dl}, prior to releasing x, access_text checks whether the user's credential is a member of dl's read access control list (using the standard member function, not shown). If access is granted, the released text is wrapped within the option-type constructor Just; otherwise, Nothing is returned.

Mediate actions. Figure 6.3 shows a function that performs text search on the document database. The getSearchResults function runs at the server (as evinced by the server annotation on the first line), and takes as arguments the user's credential cred and the search phrase keyword. The body of the function is a single list comprehension that selects data from the documents table. In particular, each row in the table for which the where-clause is true is included in the final list to which the comprehension evaluates. The where-clause is not permitted to examine the contents of row.text directly because it has the labeled type String{row.doclab}. Therefore, at line 3, we call the access_text policy function, passing in the user's credential and the row containing the security label and the protected text data. If the user is authorized to access the labeled text field of the row, then access_text reveals the data and returns it within a Maybe(String). Lines 4-7 check the form of txtOpt.
If the user has been granted access (the first case), then we check if the revealed data matches the regular expression. If the user is not granted access, the keyword search fails and the row is not included.

6.1.2 SEWineStore

We also extended the wine store e-commerce application, distributed with LINKS, with security features. We defined labels to represent users and associated these labels with orders, both in the shopping cart and in the order history. This helps ensure that an order is only accessed by the customer who created it. Order information in SEWINESTORE is represented with the following type:

    typename Order = (acl:Acl, items:List(CartItem){acl})

An order is represented by a record with two fields. The acl field stores a security label while the items field contains the items in the shopping cart. The Acl type is the same as that used in SEWIKI, and many of the enforcement policy functions are shared between the two applications.

In general, we found that access control policies were easy to define and to use, with policy code consisting of roughly 200 lines of code in total (including helper functions). Our experience also indicates that it is possible for security experts to carefully program policy code once, and for several applications to benefit from high-reliability security enforcement through policy-code reuse.

6.2 Efficient Cross-tier Enforcement of Policies

LINKS compiles list comprehensions to SQL queries. Unfortunately, for queries like getSearchResults that contain a call to a LINKS function, the compiler brings all of the relevant table rows into the server so that each can be passed to a call to the local function. (The compiler essentially translates the query to SELECT * FROM documents.) This is one of the two main drawbacks of the server-centric approach: enforcing a custom policy may require moving excessive amounts of data to the server to perform the security check there.
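The difference between the two query shapes can be sketched with Python's standard sqlite3 module, which allows a Python function to be registered as a SQL function. This is only an analogy under stated assumptions: SQLite stands in for PostgreSQL, labels are stored as JSON strings rather than as a custom database type, and because SQLite runs in-process the sketch shows how many rows each approach hands back to the application, not actual network savings. The table contents and the names make_db, search_server_side, and search_in_database are hypothetical.

```python
import json
import sqlite3

def access(cred, doclab_json, text):
    """Policy check: release text only if cred is in the label's read list."""
    label = json.loads(doclab_json)
    return text if cred in label["acl"]["read"] else None

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents (docid INTEGER PRIMARY KEY, doclab TEXT, text TEXT)")

    def lab(readers):
        return json.dumps({"acl": {"read": readers, "write": []}})

    conn.executemany("INSERT INTO documents VALUES (?, ?, ?)", [
        (1, lab(["Auditors"]), "budget report"),
        (2, lab(["Admins"]), "budget secrets"),
        (3, lab(["Auditors", "Admins"]), "holiday schedule"),
        (4, lab(["Auditors"]), "draft budget"),
    ])
    # Register the policy function so that SQL can call it (stand-in for a UDF).
    conn.create_function("access", 3, access)
    return conn

def search_server_side(conn, cred, keyword):
    """Server-centric shape: fetch every row, then check policy locally."""
    rows = conn.execute(
        "SELECT docid, doclab, text FROM documents ORDER BY docid").fetchall()
    hits = []
    for docid, doclab, text in rows:
        released = access(cred, doclab, text)
        if released is not None and keyword in released:
            hits.append(docid)
    return hits, len(rows)         # matching ids, rows moved to the server

def search_in_database(conn, cred, keyword):
    """Cross-tier shape: the policy call is part of the SQL query."""
    query = """
        SELECT docid FROM
          (SELECT docid, access(?, doclab, text) AS released FROM documents) AS T
        WHERE T.released IS NOT NULL AND instr(T.released, ?) > 0
        ORDER BY docid
    """
    return [r[0] for r in conn.execute(query, (cred, keyword))]

conn = make_db()
print(search_server_side(conn, "Auditors", "budget"))
print(search_in_database(conn, "Auditors", "budget"))
```

Both functions select the same documents; the difference is that the first one receives every row of the table before filtering, whereas the second receives only the rows that pass both the policy check and the keyword test.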
In this section, we present an overview of our cross-tier enforcement technique that seeks to remedy this shortcoming.

In order to remedy the inefficiency of pure server-side enforcement of a security policy, SELINKS compiles enforcement policy functions that appear in queries (like access_text) to user-defined functions (UDFs) that reside in the database. Queries running at the database can call out to UDFs during query processing, thus avoiding the need to bring all the data to the server. Our implementation currently uses PostgreSQL but should work just as well with other DBMSs.

We implement this approach with three extensions to the LINKS compiler (in addition to the type system changes described in Chapter 5). First, we extend it to support storing complex LINKS values (most notably, security labels like those of type DocLabel) in the database. Prior to this modification, LINKS only supported storing base types (e.g., integers, floating point numbers, strings, etc.) in database tables. Second, we extend the LINKS code generator so that enforcement policy functions can be compiled to UDFs and stored in the database. Finally, we extend the LINKS query compiler to include calls to UDF versions of enforcement policy functions in generated SQL. Each respective step is labeled (1), (2), and (3) in Figure 6.4.

    [Figure 6.4: Cross-tier Policy Enforcement in SELINKS. The diagram shows two tiers. The server holds app.links (containing getSearchResults, whose where-clause calls access_str) and policy.links (containing the Group type and access_str). The DBMS holds the documents table (columns docid, doclab, text, with a doclab value of the form { acl: read: Auditors, write: ...; prov: ...; declass: ... }), the query processing engine, and the user-defined functions produced by policy compilation (CREATE OR REPLACE FUNCTION access_str ...).]

Representing complex SELINKS data in the database. The simplest way to encode a LINKS value of complex type into a database-friendly form would be to convert it to a string.
The drawback of doing so is that UDFs would have to either directly manipulate the string encoding or else convert the string to something more usable each time the UDF was called. Therefore, we extend the LINKS compiler to construct a PostgreSQL user-defined type (UDT) for each complex LINKS type possibly referenced or stored in a UDF or table [102]. To define a UDT, the user provides a C-style struct declaration to represent the UDT's native representation, a pair of functions for converting between this representation and a string, and a series of utility functions for extracting components from a UDT and for comparing UDT values. UDT values are communicated between the server and the database as strings, but stored and manipulated on the database in the native format. In SELINKS, UDTs are produced automatically by the compiler.

At the top of the DBMS tier in Figure 6.4, we show the three columns that store SEWIKI documents. The doclab column depicts storage of a complex DocLabel record. This value is compiled to a C struct that represents this label. Section 6.3.1 discusses our custom datatype support in detail.

Compiling policy code to UDFs. So that enforcement policy functions like access_text can be called during query processing on the database, SELINKS compiles them to database-resident UDFs written in PL/pgSQL, a C-like procedural language. (Similar UDF languages are available for other DBMSs.) SELINKS extends the LINKS compiler with a code generator for PL/pgSQL that supports a fairly large subset of the SELINKS language; notably, we do not currently support higher-order functions. The generated code uses the UDT definitions produced by the compiler in the first step when producing code to access complex types. For example, LINKS operations for extracting components of a variant type by pattern matching are translated into the corresponding operations for projecting out fields from C structs. Section 6.3.2 describes the compilation process.
Figure 6.4 illustrates that UDFs are compiled from LINKS policy code in the file policy.links. We note that policy code can, if necessary, be called directly by the application program, in file app.links, running at the server.

Compiling LINKS queries to SQL. The final step is to extend the LINKS list comprehension compiler so that queries like that in getSearchResults can call policy UDFs in the database. This is fairly straightforward. Calls to UDFs that occur in comprehensions are included in the generated SQL, and any LINKS values of complex type are converted to their string representation; these representations will be converted to the native UDT representation in the DBMS. Section 6.3.3 shows the precise form of the SQL queries produced by our compiler.

    typedef struct Value {
      int32 vl_len;
      int32 type;
      union {
        Variant variant;
        Record record;
        int32 integer;
        text string;
        ...
      } value;
    } Value;

    typedef struct Variant {
      int32 vl_len;
      char label;
      Value value;
    } Variant;

    typedef struct Record {
      int32 vl_len;
      int32 num_args;
      Value value;
      Record rest;
    } Record;

    Variant variant_in(cstring);
    cstring variant_out(Variant);
    boolean variant_eq(Variant, Variant);
    Variant variant_init(text, anyelement);
    text variant_get_label(Variant);
    Record variant_get_record(Variant);
    Variant variant_get_variant(Variant);
    int32 variant_get_integer(Variant);
    text variant_get_string(Variant);

    Record record_in(cstring);
    cstring record_out(Record);
    Record record_init(anyelement);
    Record record_set(Record, int32, anyelement);
    text record_get_string(Record, int32);

    Figure 6.5: PostgreSQL User-Defined Types

6.3 Implementation of Cross-tier Enforcement in SELINKS

In this section, we present the details of the cross-tier policy-enforcement features of the compiler, overviewed in Section 6.2.
We describe our data model for storing SELINKS values in PostgreSQL using user-defined types, illustrate how we compile SELINKS functions to user-defined functions, and explain how we compile SELINKS queries to make use of these functions and manipulate complex SELINKS data.

6.3.1 User-defined Type Extensions in PostgreSQL

User-defined types (UDTs) in PostgreSQL are created by writing a shared library in C and dynamically linking it with the database. For each UDT, the library must define three things: an in-memory representation of the type, conversion routines to and from a textual representation of the type, and functions for examining UDT values. Our in-memory representation for SELINKS values is centered around the Value, Variant, and Record structures, shown in Fig. 6.5.

The Value type defines a variable-length data structure that represents all SELINKS values. The first field, vl_len (used by all the structures), stores the size (in memory words) of the represented SELINKS value. The remainder of the structure defines a tagged union: the field type is a tag denoting the specific variant of the value field that follows. All the possible forms of SELINKS values are recorded in the value union, including variants (like Group), records (like Acl), integers, and strings.

The Variant type represents an SELINKS value that inhabits a variant type. Every instance of a Variant type consists of a single constructor applied to a Value (stored in the value field of the Variant structure). For example, an SELINKS value like Principal("Alice") is represented in the database as an object of type Variant where the label field contains the zero-terminated string "Principal", and the value field is a Value whose type field indicates it is a string, with the string's value stored in the string field of the value union.

The Record type represents a record that can hold an arbitrary number of SELINKS values of different types.
In particular, it is used to store the values of multi-argument labels; for example, ActsFor(Alice,Bob) is a Variant whose value field contains a Record-typed value, (Alice, Bob). A record's field names are omitted (the name is implied by position).

Some of the functions which work on these data types are listed in Fig. 6.5. The string conversion functions end with the suffixes _in and _out. These are used internally by PostgreSQL to translate between a UDT's in-memory and string representations. Since our composite types allow embedded values, the _in functions must be able to recursively parse subexpressions (e.g., in Principal("Alice"), the "Alice" subexpression must be parsed as a string).

The variant_eq function compares two Variant values for equality; in PostgreSQL, it is called by overloading the = operator. The variant_eq function implements a special pattern matching syntax, where the value _ is treated as a wildcard and will match any subexpression. For example, Acl("Alice") = Acl(_) is true. The variant_get_label function returns the text label of a Variant, while the variant_get functions get the value of the Variant; if the type does not match, a run-time error occurs. We require a different accessor function for each type because PostgreSQL requires return variables to have a type. On the other hand, the variant_init function, which creates a new Variant value, takes an argument of type anyelement. This is a PostgreSQL pseudo-type that accepts any type of argument; the actual type can be determined dynamically. This allows us to create user-defined functions that take a polymorphic type (such as access, described in the next section).

The Record functions are similar to the Variant functions. The record_get functions take a record x and a (zero-based) integer index i as arguments and return the ith component of the record x, if such a component exists and is of the proper type. If either condition is unsatisfied, then a run-time error results.
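The wildcard comparison and the typed accessors can be modelled in a few lines of Python. This is an illustrative sketch only, not the C implementation: the Variant dataclass, the WILDCARD sentinel, and the use of Python tuples for records are assumptions, and Python exceptions stand in for the run-time errors raised by the database functions.

```python
from dataclasses import dataclass
from typing import Any

WILDCARD = object()   # plays the role of the "_" pattern

@dataclass(frozen=True)
class Variant:
    label: str        # constructor name, e.g. "Acl" or "ActsFor"
    value: Any        # a string, int, nested Variant, or tuple (record)

def variant_eq(value, pattern):
    """Structural equality in which WILDCARD matches any subexpression."""
    if pattern is WILDCARD:
        return True
    if isinstance(value, Variant) and isinstance(pattern, Variant):
        return (value.label == pattern.label
                and variant_eq(value.value, pattern.value))
    if isinstance(value, tuple) and isinstance(pattern, tuple):
        return (len(value) == len(pattern)
                and all(variant_eq(v, p) for v, p in zip(value, pattern)))
    return value == pattern

def record_get_string(record, index):
    """Zero-based, type-checked field access; errors mimic run-time errors."""
    if not 0 <= index < len(record):
        raise IndexError("record has no component %d" % index)
    if not isinstance(record[index], str):
        raise TypeError("component %d is not a string" % index)
    return record[index]

# Acl("Alice") = Acl(_) holds; a different constructor does not match.
print(variant_eq(Variant("Acl", "Alice"), Variant("Acl", WILDCARD)))
print(variant_eq(Variant("Acl", "Alice"), Variant("Group", WILDCARD)))
```

Having one accessor per result type, as in record_get_string here, reflects the constraint noted above that each database function must declare a concrete return type.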
record_init creates a new single-element record with the given value, while record_set sets a record's value, possibly extending the record by one element as a result.

In the remainder of this section we show how these types are used both within our compiled UDFs and in the body of SQL queries.

6.3.2 Compilation of SELinks to PL/pgSQL

To compile SELINKS functions to UDFs, we built a new LINKS code generator that produces code in PL/pgSQL, one of PostgreSQL's various UDF languages. Prior to our extension, the LINKS code generator could only generate JavaScript code for running on the client. PostgreSQL supports several different UDF languages, but PL/pgSQL is the most widely used. It has a C-like syntax and is fairly close to Oracle's PL/SQL. (Note that, unlike most database systems, PostgreSQL makes no distinction between stored procedures and user-defined functions.)

Code generation is straightforward, so we simply show an example. Figure 6.6 shows the (slightly simplified) code generated for an enforcement policy function called access, a generalization of the function access_text shown in Figure 6.2 that can take any type of argument (which is useful when labels annotate values of many different types, since we can write a single access function rather than one per type). A function definition in PL/pgSQL begins with a declaration of the function's name and the types of its arguments. Thus, line 1 of Figure 6.6 defines a UDF called access that takes three arguments: one of built-in type text, one of the custom type record, and one of the special built-in pseudo-type anyelement.

     1. CREATE FUNCTION access(text,record,anyelement)
     2.   RETURNS variant AS $$
     3. DECLARE
     4.   cred ALIAS FOR $1;
     5.   doclab ALIAS FOR $2;
     6.   x ALIAS FOR $3;
     7. BEGIN
     8.   IF member(cred,record_get_rec(record_get_rec(doclab, 0),0)) THEN
     9.     RETURN variant_init('Just', x);
    10.   ELSE RETURN 'Nothing';
    11.   END IF;
    12. END;
    13. $$ language plpgsql

    Figure 6.6: Generated PL/pgSQL code for access
The anyelement type allows us to (relatively faithfully) translate usages of polymorphic types (as in the argument of our generalized access function) in SELINKS to PL/pgSQL. At line 2, we define the return type of access to be variant, since it is supposed to return an option type. At lines 4, 5, and 6, we give names to the positional parameters of the function by using the ALIAS command (a peculiarity of PostgreSQL). That is, the first argument is named cred to represent the credential; the second argument is named doclab to represent the security label of DocLabel type; the final argument, x, is the protected data of any type.

In the body of the function, lines 8-12, we check if the user's credential cred is mentioned in the doclab.acl.read field. Accessing this field requires first projecting out the record doclab.acl, using record_get_rec(doclab, 0), and then the read field, using a similar construction. The authorization check at line 8 relies on another UDF (member) whose definition is not shown here.

If this authorization check succeeds, at line 9 we return a value corresponding to the SELINKS value Just(x). Notice that the unlabel operator that appears in SELINKS is simply erased in PL/pgSQL; it has no run-time significance. If the check fails, at line 10 we return the nullary constructor Nothing.

6.3.3 Invoking UDFs in Queries

The last element of our cross-tier enforcement strategy is to compile SELINKS comprehension queries to SQL queries that can include calls to the appropriate policy UDFs. This is built on infrastructure provided by Dubochet [43] (based on work in Kleisli [142]).

     1. SELECT docid, doclab, text FROM
     2. (SELECT
     3.    S.doclab as doclab, S.docid as docid,
     4.    S.text as text,
     5.    access('Alice', S.doclab, S.text) AS tmp1
     6.  FROM documents AS S
     7. ) as T
     8. WHERE
     9.  CASE
    10.   WHEN ((T.tmp1 = 'Just((_))'))
    11.   THEN (variant_get_str(T.tmp1) LIKE '%keyword%')
    12.   WHEN (true)
    13.   THEN false
    14.  END

    Figure 6.7: SQL query generated for getSearchResults
Prior to our extensions, the LINKS compiler was only capable of handling relatively simple queries. For instance, queries like our keyword search, with function calls and case-analysis constructs, were not supported.

Figure 6.7 shows the SQL generated by our compiler for the keyword search query in the body of getSearchResults. This query uses a sub-query to invoke the access policy UDF and filters the result based on the value returned by the authorization check.

We start by describing the sub-query on lines 2–5. Lines 3 and 4 select the relevant columns from the documents table; line 5 calls the policy function access, passing in as arguments the user credential (here, just the username Alice, but, in practice, an unforgeable authentication token), the document label field S.doclab, and the protected text S.text, respectively. The result of the authorization check is named tmp1 in the sub-query.

Next, we describe the structure of the where-clause in the main query, at lines 8–14. We examine the value returned by the authorization check; if we have obtained a Just(x) value, then we search x to see if it contains the keyword; otherwise the where-clause fails. Thus, at line 10, we check that T.tmp1, the result of the authorization check for this row, matches the special variant pattern Just((_)). The test on line 10 is satisfied if the value T.tmp1 is the variant constructor Just applied to any argument. If this pattern matching succeeds, at line 11 we project out the string argument of the variant constructor using the function variant_get_str. Once we have projected out the text of the document, we can test whether it contains the keyword using SQL's LIKE operator. Lines 12–13 handle the case where the authorization check fails.

Finally, we turn to line 1 of this query, which selects only a subset of the columns produced by the sub-query.
The reason is efficiency: we do not wish to pass the temporary results of the authorization checks (the T.tmp1 field) when returning a result set to the server.

Although our code generators are fairly powerful, there are some features that are not currently supported. First, our current label model requires storing a security label within the same row as the data that it protects. Next, our support for complex join queries as well as table updates is still primitive. We anticipate improving our implementations to handle these features in the near future. Finally, as mentioned earlier, we do not allow function closures to be passed from the server to the database; however, we do not foresee this being a severe restriction in the short term.

6.4 Experimental Results

In this section, we present the results of an experiment conducted to compare the efficiency of server-side versus database-side policy enforcement. We also examine other factors that come into play when running database applications, such as the number of rows being processed by the query, and the location of the database (local host or network). We also benchmark SELINKS against a simple access control program written in C.

We show that, in the case of SELINKS, running a policy on the database greatly reduces the total running time compared to running the same policy on the server when tables are large (up to a 15× speed-up). In addition, our C implementation highlights the high current overhead of SELINKS programs, while at the same time showing that our PostgreSQL implementation is comparable in speed (and shows a slight improvement when network latency is considered).

6.4.1 Configuration

Our system configuration is shown in Figure 6.8. We ran two different system configurations: a single-server mode (local), where the server and database reside on Machine A, and a networked version, where the server runs on Machine B.
                  Machine A                        Machine B
CPU:              Intel Quad Core Xeon 2.66 GHz    Intel Quad Core Xeon 2.0 GHz
RAM:              4.0 GB                           2.0 GB
HDD:              7,200 RPM SATA                   7,200 RPM EIDE
Network:          100 Mbit/s Ethernet              100 Mbit/s Ethernet
OS Kernel:        Linux 2.6.9                      Linux 2.6.9
OS Distribution:  Red Hat Enterprise Linux AS 4    Red Hat Enterprise Linux AS 4
DBMS:             PostgreSQL 8.2.1                 N/A

Figure 6.8: Test platform summary

For our test, we used the getSearchResults query presented in Fig. 6.7, which checks if a user has access to a record and, if so, returns the record if it contains a particular keyword. We generated two tables of random records (1,000 and 100,000 records), each record comprising 5–20 words selected from a standard corpus. Each record has a 10% probability of containing our keyword, and each record is labeled by a random access control label, which grants access 50% of the time. Thus, the query should return approximately 50 results and 5,000 results for the 1,000-record and 100,000-record tables, respectively.

In running our tests on SELINKS, we varied the number of records in the table (1,000 or 100,000), whether the policy was enforced on the server or the database, and the locality of the server (same or networked machine). We also created a C program that queries the database, manually performs the access control check, and searches for the keyword. The C program operates in one of three modes: no access control, server-side access control, or database-side access control, the last using the same SQL query as the one generated for the SELINKS program, including the database-level UDF. We compare this program against our SELINKS implementation for all the tests above. All running times are the mean of five runs.
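As a rough illustration of the workload just described, the following Python sketch builds a small table, registers an access-check UDF, and runs the keyword search both ways: once with the check inside the database query and once with the check applied on the "server" after fetching every row. It is only a sketch under several assumptions: it uses an in-memory SQLite database and a Python UDF rather than PostgreSQL and SELINKS, a SQL NULL stands in for the failing case of the authorization check, labels are plain user names, and the vocabulary and the helper names (make_rows, load, search_database_side, search_server_side, expected_results) are hypothetical. The SQL below follows the shape described in the text (a sub-query that names the check result tmp1, filtered by an outer where-clause); it is not the SQL of Figure 6.7.

```python
import random
import sqlite3
from typing import List, Optional, Tuple

KEYWORD = "zebra"                      # hypothetical search keyword
VOCAB = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot"]


def access(user: str, label: str, text: str) -> Optional[str]:
    """Toy policy: reveal the text only if the label names the user."""
    return text if label == user else None


def make_rows(n: int, seed: int = 0) -> List[Tuple[int, str, str]]:
    """Random records: 5-20 words, 10% contain the keyword, 50% are readable."""
    rng = random.Random(seed)
    rows = []
    for docid in range(n):
        words = [rng.choice(VOCAB) for _ in range(rng.randint(5, 20))]
        if rng.random() < 0.10:
            words[rng.randrange(len(words))] = KEYWORD
        label = "alice" if rng.random() < 0.50 else "bob"
        rows.append((docid, label, " ".join(words)))
    return rows


def load(rows: List[Tuple[int, str, str]]) -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (docid INTEGER, doclab TEXT, text TEXT)")
    conn.executemany("INSERT INTO documents VALUES (?, ?, ?)", rows)
    conn.create_function("access", 3, access)   # policy as a database UDF
    return conn


def search_database_side(conn: sqlite3.Connection, user: str, keyword: str):
    """The check runs inside the query; only matching rows leave the database."""
    sql = """
        SELECT T.docid, T.text
        FROM (SELECT S.docid AS docid, S.text AS text,
                     access(?, S.doclab, S.text) AS tmp1
              FROM documents AS S) AS T
        WHERE T.tmp1 IS NOT NULL AND T.tmp1 LIKE ?
        ORDER BY T.docid
    """
    return conn.execute(sql, (user, f"%{keyword}%")).fetchall()


def search_server_side(conn: sqlite3.Connection, user: str, keyword: str):
    """Every row is fetched; the check and the keyword test run afterwards."""
    out = []
    query = "SELECT docid, doclab, text FROM documents ORDER BY docid"
    for docid, label, text in conn.execute(query):
        visible = access(user, label, text)
        if visible is not None and keyword in visible:
            out.append((docid, visible))
    return out


def expected_results(n: int, keyword_pct: int = 10, access_pct: int = 50) -> int:
    """Expected result count: n x P(keyword) x P(access granted)."""
    return n * keyword_pct * access_pct // 10_000


if __name__ == "__main__":
    conn = load(make_rows(1_000))
    print(len(search_database_side(conn, "alice", KEYWORD)),
          "rows returned; expected about", expected_results(1_000))
```

Both functions return the same rows; the difference the experiment measures is how much data crosses the database boundary and where the policy code executes.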
[Figure 6.9: Throughput of SELINKS queries under various configurations. Two bar charts, "Keyword search on 1,000 rows" and "Keyword search on 100,000 rows". Vertical axis: time in seconds (mean of 5 runs). Horizontal axis: language/policy location (C-None, C-Database, C-Server, SELinks-Database, SELinks-Server). Bars: Local and Network.]

6.4.2 Results

The results of our experiment are summarized in Figure 6.9, which illustrates the time required to run the query on 1,000 and 100,000 records, on the left and right respectively. The horizontal axis shows the language (C or SELINKS) and policy enforcement location (None, Database, or Server) used. For each language/policy pair, we show two bars representing the local and networked database configurations, respectively.

The highlight of both figures is the significant improvement shown in running an SELINKS policy on the database rather than the server. For the 100K-record example running over the network we see a 16× improvement; for the 100K-record case with a local database the improvement is 7.5×; and for local and network queries on 1,000 records the improvement is 4×. The current incarnation of SELINKS, however, is an interpreted language with few optimizations.

Our C program results illustrate some more general results with regard to this technique. Consider the 100K-record results in Figure 6.9. First, running our C program with no policy enforcement takes a little over one second; this gives us a baseline for how long it takes to retrieve the full data set, and illustrates the time that could be saved by reducing the result set at the database. Our C implementation of server access control is much faster than our SELINKS implementation (12.5×); this illustrates the lack of optimization in SELINKS.
That said, the SELINKS database policy implementation is comparable to the C version on a single machine (only 27% slower) and is marginally faster when network transmission is taken into account. It is interesting to note that, when running the database-policy versions, the SELINKS implementation actually slightly outperforms the C implementation; this indicates that the C implementation may not be as optimized as possible.

In summary, running SELINKS policies on the database instead of the server greatly improves performance, particularly for large queries. Based on the comparison with C, we note that the SELINKS server component could benefit greatly from more optimization, while database-side enforcement is quite efficient.

6.5 Concluding Remarks

This chapter has demonstrated the feasibility of constructing realistic multi-tier web applications in SELINKS. We have implemented a wiki application that demonstrates multiple security properties, and have extended an existing LINKS e-commerce application with simple security protection. In general, we have found that SELINKS's label-based security policies are neither lacking nor burdensome, and the modular separation of the enforcement policy permitted some reuse of policy code between the two applications.

We have also argued that a multi-tier approach to security is necessary for expressing rich application policies while maintaining efficiency and trustworthiness. We have shown how SELINKS can model and enforce a variety of secure application policies, and have described how SELINKS implements such policies in the database by compiling the enforcement policy to user-defined functions in the database. Finally, we have shown that enforcing policies in the database, versus in the server, improves throughput in SELINKS by as much as an order of magnitude.

7. Related Work

This chapter describes various threads of related work. We begin with a discussion of security-typed languages.
FLAIR is distinct from existing languages primarily in that it is intended to be extensible with custom policy enforcement mechanisms. In that respect, our work follows a long tradition of extensible programming languages, and we discuss these next. More recently, researchers have noted that dependent types are useful for extensibility; so, we discuss a variety of languages that include dependent types, whether for extensibility, program verification, or security. We then turn to a discussion of the various kinds of security policies that we have explored. These include stateful authorization policies, policies for declassification, and data provenance policies. As far as the practical aspects of this work are concerned, the main related works are other projects that target multi-tier web applications; we discuss these next. The final section of this chapter ties together some loose ends and mainly places miscellaneous technical aspects of our work in context.

7.1 Security-typed Languages

Broadly speaking, security-typed programming languages augment the types of program variables and expressions with annotations that specify policies on the use of the typed data. These policies are typically enforced at compile-time by a type checker, although some reliance on runtime checks is not uncommon. As such, FABLE, AIR and FLAIR all fit this description.

The basic idea of security typing is usually attributed to Volpano, Smith and Irvine [132]. Sabelfeld and Myers [111] provide a comprehensive survey of a large body of work in this field. Much of the work on security typing has focused primarily on information flow policies. We have sought to extend security typing beyond information flow. This appears to be a trend: a number of works concurrent with ours have also begun exploring security typing for other kinds of policies, and we discuss these elsewhere.

7.1.1 FlowCaml

Pottier and Simonet's FlowCaml language [104, 118] statically enforces an information flow policy for ML programs.
Our encoding of information flow in Chapter 4 closely follows their type system; in fact, our correctness proof is via a translation to a subset of Core-ML, the underlying formal system of FlowCaml. We also give a direct proof of correctness for the purely functional information flow policy of Chapter 2; this proof relies heavily on a syntactic proof technique also due to Pottier and Simonet.

Aside, of course, from extensibility, our work is distinct from FlowCaml in two main respects. First, FlowCaml extends Hindley-Milner-style type inference to include security types. We make no attempt to infer security-type annotations. However, FlowCaml makes the simplifying assumption that security labels are always known statically. Although static labelings are permissible in our formal languages (and in SELINKS), our main focus is dynamically specified policies.

7.1.2 Jif

Jif [31], an extension of Java, is probably the most full-featured implementation of a security-typed language. Unlike FlowCaml, Jif does not support full inference of security-type annotations. However, Jif does include dynamic labels (more on this below) and is thus more expressive than FlowCaml.

Despite the lack of type inference, programming with an information flow policy in Jif is significantly easier than programming with a similar policy in SELINKS. In SELINKS, we require programmers to explicitly insert calls to enforcement policy functions to construct evidence that no illegal information leaks occur. This is the price of generality in SELINKS: Jif effectively bakes these policy checks into its type system, so programmers need not insert these checks manually.

Zheng and Myers [149] formalize the use of dynamic security labels in Jif and show how data values can be used as security labels to express information flow policies. The technical machinery for associating labels to terms in their system is similar to ours; we both use forms of dependent types. There are two main differences.
First, the security policy in Jif (an information flow policy in the decentralized label model [88]) is expressed directly in the type system, whereas in AIR both the security policy and the label model are customizable. As discussed in Section 2.2.4, dynamic labels for information flow policies can be encoded in FABLE. Second, we allow non-values to appear in types, e.g., lub l m in Figure 2.9. This is a more powerful form of dependent typing and allows us to encode a combination of static and dynamic policy checking. However, we need to take special care to ensure that type checking remains decidable; Section 7.3 contains a more detailed discussion of dependent types in FLAIR and SELINKS.

7.2 Extensible Programming Languages

Loosely, extensible languages aim to allow the user to modify the features of a language to suit his changing needs and purposes. This fits the description of FABLE and FLAIR, at least as far as security enforcement goes. In FABLE, extensions are defined using the enforcement policy and, in FLAIR, using the type signature.

7.2.1 Classic Work on Extensible Programming Languages

Research in extensible programming languages dates back nearly 50 years. Standish surveys some of the early results [120] and provides a useful taxonomy of extensions. In his terminology, an extension is a paraphrase when a new construct is exchanged for an existing definition, e.g., by macro expansion. An enforcement policy function, like the apply function of Section 2.2.3, can be thought of as a paraphrase, in that it defines the application of labeled functions (which is not possible directly in the base language) in terms of existing constructs in the language (i.e., those used in the body of apply). Extensions can also be orthophrases, where an entirely new construct is added to a language. An example of an orthophrase in our context is the inclusion of a base term constant in a FLAIR signature.
The third and final class of extensions are metaphrases, where new interpretations are given to existing constructs in the language. These are perhaps the most interesting aspect of our extensions. For example, the sub function in the policy of Figure 2.9 effectively introduces a subsumption rule into the type system. By using this function, an application program can give a new type-level interpretation to a term; i.e., a term of type t can be used at its sub-type t′, where the subtype relation is defined by the sub extension.

7.2.2 Extensible Type Systems

Researchers have explored how user-defined type systems can be supported directly via customizable type qualifiers. For example, CQual [50] is a framework that allows qualifiers to be added to the C programming language. Qualifiers in CQual are arranged in a lattice and CQual uses various dataflow analyses to enforce properties like taintedness. Unlike a language like FlowCaml, which also tracks lattice-based qualifiers, CQual only tracks direct data flows and not implicit flows of the form described in Chapter 4. By focusing on direct data flows (among other reasons), CQual has seen broad practical applicability. For example, Shankar et al. [116] have used taint tracking in CQual to detect format string vulnerabilities and buffer overruns in C programs. Zhang et al. [148] and Fraser et al. [53] have used qualifiers to check complete mediation in access control systems.

Millstein et al. [28, 3] have developed a qualifier-based approach in which programmers can indicate data invariants that custom type qualifiers are intended to signify. This is in contrast to CQual, where one does not generally attempt to prove that the user-defined qualifiers correctly establish some property of interest. In some cases, Millstein et al. are able to automatically verify that these invariants are correctly enforced by the custom type rules.
While their invariants are relatively simple, we ultimately would like to develop a framework in a similar vein, in which correctness properties for FABLE's enforcement policies can be at least partially automated. Marino et al. [82] have proposed using proof assistants for this purpose, and we have begun exploring this idea in the context of FABLE policies.

While our security labels and type qualifiers share many similarities, our approach to type extensions is substantially different. In FABLE and FLAIR, the qualifier language is the same as the term language, i.e., qualifiers are specified using dynamic labels, where the labels themselves are program expressions. In FABLE, the semantics of labels are also given using constructs that are directly in the language. In contrast, Millstein et al.'s JavaCop includes a separate domain-specific language for introducing new type rules in the system. Because dependent types in FABLE and FLAIR essentially conflate the term language and the type language, we are able to describe typing constructs directly in the types of enforcement policy terms. But the power of dependent types is not always necessary for extensibility; we discuss some alternatives next.

7.2.3 Extensions Based on Haskell's Type System

It has been said that the Haskell programming language is a laboratory and playground for advanced type hackery [70]. On occasion, Haskell's type system has been put to use to encode interesting security-related constructs.

Li and Zdancewic show how to encode information flow policies in Haskell [78]. They use type classes [137] in Haskell to define a meta-language of arrows [63] that makes the control-flow structure of a program available for inspection within the program itself. Their enforcement mechanism relies on the lazy evaluation strategy of Haskell, which allows the control flow graph to be inspected for information leaks prior to evaluation.
While their encoding permits the use of custom label models, they only show an encoding of an information flow policy. It is not clear that their system could be used to encode the range of policies discussed here. Besides, the reliance on a call-by-name evaluation scheme, with all the attendant challenges of handling side effects and sequential computation, appears to be a considerable handicap of this approach.

Kiselyov and Shan [73] use type classes and higher-rank polymorphism in Haskell to encode a form of dynamic labeling. While their focus is on easily propagating runtime configuration parameters through a program, it appears as though their techniques could also be applied to labeling data with security policies. However, in the absence of true dependent types in Haskell, purely static enforcement of security policies using their method is not possible. Furthermore, by including security labeling as a primitive (rather than a derived construct), SELINKS makes it easier to manipulate labels and labeled data.

In the same work, Kiselyov and Shan provide a mechanism for configuration data to be passed as implicit parameters to functions. This resembles our use of phantom variable polymorphism in Section 5.5.1, but with an important distinction. Phantom label variables have no term-level representation (i.e., they are phantom) and so the runtime behavior of a function is parametric with respect to its implicit phantom arguments. In contrast, Kiselyov and Shan's implicit parameters are concrete term-level arguments and can influence the runtime behavior of a function. Adding this to our SELINKS implementation might help reduce the annotation burden further.

7.3 Dependent Typing

Despite being more powerful than Zheng and Myers' dynamic labeling [149], security labels in FABLE and FLAIR still employ only a fairly weak and lightweight form of dependent typing.
Traditionally, full-blown dependent types have been used as the basis for theorem provers like Coq [17], Isabelle/HOL [94], etc., and for program verification, as in DML [145] and other dependently typed programming languages. In this section, we briefly survey these works. Aspinall and Hoffmann provide a useful introduction to dependent types [4].

7.3.1 Dependently Typed Proof Systems

Dependently typed formalisms like Pure Type Systems [10] and the Calculus of Constructions [36] have been used both as the basis of frameworks to design and formalize type systems and to build theorem provers [17, 11]. In these languages, the type and term languages overlap, allowing extremely expressive types to be used as specifications. In comparison, dependent typing in FABLE and FLAIR is much simpler. Rather than conflate the language of types and terms, our approach only uses dependent types to express security labelings.

ATS is a programming language and a proof assistant based on a form of dependent types that has extensibility as one of its primary goals [144]. ATS (and its predecessor DML [145]) differs from Pure Type Systems and FLAIR in that the languages of types and terms are completely separate. However, types can be indexed by so-called static terms, essentially a purely functional lambda calculus at the type level. This separation simplifies a number of issues in ATS; e.g., ATS by definition rules out side effects and nontermination in type-level terms, whereas in FLAIR we require a (simple) effect analysis to achieve the same property. But this separation means that indices in ATS have no runtime representation, as they are drawn from a language intentionally kept separate from program expressions. Thus, while some of the statically enforceable policies we explore can be encoded in ATS, enforcing policies based on dynamic labels appears to be difficult, since this requires indexing types with expressions drawn from the term language.
(It may be possible to encode an indirect form of dynamic labels in ATS using singleton types.) FLAIR, in contrast, is specifically designed to make it easy to express and enforce policies using dynamic labels. ATS also includes linear types, a relative of the affine types we use in FLAIR [140].

Coq, Isabelle, ATS, and similar systems certainly outstrip FABLE in generality and power of static checking. Whereas these other systems target program verification, we have focused on showing that the simple form of dependent typing in FABLE and FLAIR can be used to provide useful assurances about the enforcement of security policies. Thus, while the generality of, say, Coq allows it to be used to define one of our type systems and to construct proofs that the type system is sound (e.g., we proved AIR sound in Coq), the Calculus of Constructions does not intrinsically facilitate proving that well-typed terms enjoy relevant security properties (since it has no notion of security labels, complete mediation, etc.).

7.3.2 Dependently Typed Programming Languages

Cayenne [6] is a pure language in which the type and term languages coincide, and is possibly the first programming language to include the full power of dependent types. The resulting system is extremely powerful, though type checking can be undecidable (as it is for the formulation of FABLE in Chapter 2). Cayenne focuses on static verification, while in our languages policies are enforced using a mixture of static and dynamic checks. Cayenne also does not support side effects, as we do in FLAIR.

Epigram [2] is another pure language with dependent types. Unlike Cayenne, Epigram ensures that type-level expressions are always total functions, thereby ensuring decidability of type checking (and soundness of the underlying logic). Unlike our extremely simple proposal of ruling out recursion at the type level [123], Epigram employs much more sophisticated reasoning about structural recursion to ensure that functions are total.
As with AIR policies in Chapter 3, Epigram also makes extensive use of dependently typed evidence: types like LEQ x y can be used as propositions that stand for integer inequality. In AIR, certificates that witness these propositions were simply constructed at runtime using trusted function-typed base terms in the signature. In contrast, the Epigram programmer specifies proof rules to interpret dependently typed evidence and the compiler checks that these rules are always satisfied when certificates are constructed.

Concurrent with our own work, Nystrom et al. have developed a dependently typed extension to the X10 programming language [96]. They provide a way to associate a constraint expression (drawn from the term language) with the type of an X10 object. Although their focus is not security, it appears possible to use this feature to encode a dynamic security labeling, much like in SELINKS. However, they provide no means of defining constraints that control the side effects of a program or, for that matter, of allowing constraints themselves to be stateful (as we do in Chapters 3 and 4).

7.3.3 Dependent Types for Security

We are also not the first to use dependent types for security. Walker's type system for expressive security policies [139] is also dependently typed. Labels in Walker's language are uninterpreted predicates rather than arbitrary expressions; we are not aware of an earlier use of dependent types for security. Walker's system can enforce policies expressed as security automata; as we show in Chapter 3, this kind of policy is also enforceable in AIR. However, in Walker's system, the policy is always enforced by means of a runtime check. In order to recover some amount of static checking, Walker suggests that a user might add additional rules to the type system, though he is not specific about how this would be done. These additional rules would have to be proved correct with respect to a desired security property.
Aura is a programming language that incorporates an authorization logic in its type system using dependent types [64]. Statements from the authorization logic can refer to program values and specify constraints that must be satisfied in order for those values to be manipulated. Aura differs from FABLE and FLAIR in a number of ways. First, Aura focuses on authorization policies and on auditing. Although policies in Aura are user-defined to the extent that the authorization logic is general-purpose, the customizability does not extend to security automata, provenance, information flow and downgrading policies, as it does in FLAIR. Aura also uses dependently typed evidence [130], as we do in the enforcement of AIR policies. However, as with Epigram, Aura uses sound type rules to ensure that evidence objects are constructed properly, whereas we rely on ad hoc approaches like trusted runtime checks. However, unlike Epigram, where the rules for constructing evidence are user-defined, Aura uses a fixed set of rules that capture the requirements of the underlying authorization logic.

On a technical note, unlike in FLAIR, dependent types in Aura are not quotiented by an equivalence on type-level expressions, i.e., Aura never reduces type-level expressions using a rule like (T-CONV) in Figure 2.4. Vaughan et al. point out that this is unnecessary for authorization policies; however, we have found this to be useful for statically enforcing information flow policies. Aura is also purely functional and does not account for stateful policies.

Bengtson et al.'s RCF [16] is a language equipped with dependent and refinement types, which the authors have used to ensure the proper implementation of cryptographic protocols. Unlike dependent types in FABLE, which are just security policies, refinement types in RCF reflect underlying properties of the values that inhabit them, e.g., the type of access control lists that contain the username Alice.
These extremely precise types are useful in statically enforcing security policies, but in order to type check programs RCF must rely on an SMT solver [40]. (In Chapter 5, we speculated that using SMT solvers in order to prove type equivalences may be of use in SELINKS too.) But, because they always reflect structural properties of the underlying data, refinements in RCF make it difficult to enforce security properties that are not necessarily structural. For example, two strings in the program can be identical, yet have different provenance. In FABLE, it is possible to give these strings different types by associating different provenance labels with each. In RCF, identical data values always inhabit the same types, so distinguishing between the provenance of these items purely in the types is impossible.

7.4 Security Policies

Here we survey work related to each of the security policy models explored in this dissertation. Information flow policies were discussed in Section 7.1; Bishop is a good reference for various models of access control [18]. We do not discuss these two kinds of policies further.

7.4.1 Security Automata

Schneider proposed using security automata to characterize the class of security policies that are enforceable by execution monitors [113]. In Chapter 3, we defined AIR, a language for specifying information release policies using security automata. AIR policies are actually a more general form of security automata called edit automata [79], i.e., automata that, in the process of deciding whether an input word w is in a language L, may also transform w to some other word w′. AIR classes fit this description, since they may modify data before releasing it. To our knowledge, no prior work has used automata to specify the protection level and release conditions of sensitive data.

The canonical means of enforcing security automata policies is through the use of reference monitors.
Erlingsson and Schneider developed SASI [46] and its successor, PSLang/PoET [45], both inlined reference monitors that enforce security automata policies. Our approach is in contrast with SASI in that we support local policy state; i.e., bits of automata state are maintained in proximity to the data that they protect, and the association between the policy and data is reflected in the types. SASI, however, maintained global automata state, and this was identified by Erlingsson as a main obstacle towards making it practical: specifying global policies required cumbersome state management, and the runtime overhead of looking up the relevant part of the global state was prohibitive. PSLang/PoET does support local policy state, but unlike AIR, PSLang/PoET augments the run-time representation of protected data to include the policy. Dynamic labels in AIR are more expressive; as discussed in Section 3.3, we can easily enforce secret sharing policies on related data.

Local automaton state in AIR is also likely to be useful when applying policies to concurrent programs: enforcement code does not need to synchronize on some global policy state, thereby allowing greater parallelism. Additionally, by reflecting the association between policy and data in the types, AIR provides a way to verify that automata and protected data are always correctly manipulated. In the concurrent setting, dynamic labels in AIR also clearly identify the synchronization requirements on policy state; thus AIR's type system can improve reliability by helping prevent race conditions. As such, one could imagine putting AIR to use to certify that IRMs correctly enforce their policies.

Security automata enforcement in AIR essentially works by tracking the state of objects in types. As such, this is a form of typestate, a construct that dates back to Strom and Yemini [122].
The calculus of capabilities [37] provides a way of tracking typestate, using singleton and linear types (a variant of affine types) to account for aliasing. The Vault [41] and Cyclone [67] programming languages implement typestate checkers in a practical setting to enforce proper API usage and correct manual memory management, respectively. AIR's use of singleton and affine types is quite close to these systems. However, in these systems the state of a resource is a static type annotation, while in AIR a policy automaton is first-class, allowing its state to be unknown until run time. Walker's type system [139] (discussed in Section 7.3.3) also supports first-class automaton policy state. But, in his system, there can only be a single policy automaton, the definition of which is embedded into the type system. In contrast, our approach allows multiple automata policies to be easily defined separately.

As a consequence of dynamically defined policy state, full static verification of a security automaton policy is not possible in AIR. Instead, we propose certifying the evaluation of policy logic by statically ensuring that proofs that support every authorization decision are constructed at runtime. This form of certified evaluation of authorization decisions has been explored in a number of contexts. For instance, certified evaluation is a feature of the SD3 trust-management system proposed by Jim [65]. Jia et al.'s Aura language [64] also maintains audit logs to record evidence to justify authorization decisions made at runtime. The architecture we propose for certified evaluation in AIR is closely related to both these approaches. While more investigation is required, AIR's ability to accurately track evidence in the presence of state modifications opens the possibility of certified evaluation of a wider class of stateful authorization policies, like those expressible in SMP, a stateful authorization logic recently proposed by Becker and Nanz [15].
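To make the notions of an edit automaton and of local policy state a little more concrete, here is a small Python sketch. It is not AIR syntax and carries none of AIR's static guarantees: every name in it (ReleaseAutomaton, Protected, redact_digits, new_policy, and the states "sealed" and "approved") is hypothetical, and the policy is enforced purely by runtime checks. What it does show is an automaton that both decides whether a release is permitted and transforms the data it releases, with each protected value carrying its own automaton instance rather than consulting a global policy state.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

State = str
Event = str


@dataclass
class ReleaseAutomaton:
    """Hypothetical edit-automaton-style policy: it tracks a state and, in
    releasing states, may rewrite the data instead of passing it through."""
    state: State
    transitions: Dict[Tuple[State, Event], State]
    on_release: Dict[State, Callable[[str], str]]

    def step(self, event: Event) -> bool:
        """Advance on an event; reject (and stay put) if no transition exists."""
        target = self.transitions.get((self.state, event))
        if target is None:
            return False
        self.state = target
        return True


@dataclass
class Protected:
    """Data paired with its own automaton instance (local policy state)."""
    payload: str
    policy: ReleaseAutomaton

    def release(self) -> Optional[str]:
        """Return the (possibly transformed) payload, or None if the current
        state does not permit release."""
        transform = self.policy.on_release.get(self.policy.state)
        return None if transform is None else transform(self.payload)


def redact_digits(text: str) -> str:
    """Example transformation applied on release: mask every digit."""
    return "".join("#" if ch.isdigit() else ch for ch in text)


def new_policy() -> ReleaseAutomaton:
    """A fresh automaton: release is allowed only after an 'approve' event."""
    return ReleaseAutomaton(
        state="sealed",
        transitions={("sealed", "approve"): "approved",
                     ("approved", "revoke"): "sealed"},
        on_release={"approved": redact_digits},
    )


if __name__ == "__main__":
    a = Protected("card 4111, owner alice", new_policy())
    b = Protected("card 5500, owner bob", new_policy())
    a.policy.step("approve")      # only a's automaton changes state
    print(a.release())            # released, with digits masked
    print(b.release())            # still sealed, so nothing is released
```

Because the state lives with each value, approving one item has no effect on another, and no lookup in a shared table is needed to find the relevant state.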
7.4.2 Declassification Policies

The specification and enforcement of policies that control information release has received much recent attention. Sabelfeld and Sands [112] survey many of these efforts and provide a useful way of organizing the various approaches. AIR policies address, to varying degrees, the what, who, where and when of declassification, the four dimensions identified by Sabelfeld and Sands.

Most of this work approaches information release from the perspective of information flow policies. As such, the security properties typically used with declassification are variants of noninterference or related forms of bisimulation. By contrast, the security theorem we show for the enforcement of AIR policies states that the program's actions are in accord with a high-level policy, not that these actions enforce an extensional security property (like noninterference). We believe that the two approaches are complementary. As Chapter 4 shows, AIR is expressive enough to enforce noninterference-like properties too. By applying an AIR policy in combination with our information flow encoding, we could show a noninterference-like security theorem (e.g., noninterference until conditions [30], or robust declassification [147]) while being able to reason that a high-level protocol for releasing information is correctly followed.

AIR policies are defined separately from programs that use them, allowing them to be reasoned about in isolation. Most related work on declassification embeds the policy within the program that uses it, obscuring high-level intent. One exception is work on trusted declassifiers [61]. Here, all possible information flows are specified as part of a graph in which nodes consist of either downgrading functions or principals, and edges consist of trust relationships. Paths through the graph indicate how data may be released.
AIR classes generalize this approach in restricting which paths may occur in the graph, and in specifying release conditions in addition to downgrading functions.

Chong and Myers [30] propose declassification policies as labels consisting of sequences of atomic labels separated by conditions c. Initially, labeled data may be viewed with the privileges granted by the first atomic label, but when a condition c is satisfied, the data may be relabeled to the next label in the sequence, and viewed at its privileges. Declassification labels are thus similar to AIR classes, with the main difference that our approach is more geared toward run-time checking: we support dynamically-checked conditions (theirs must be provable statically) and run-time labels (theirs are static annotations).

7.4.3 Data Provenance Tracking

Simmhan et al. [117] define data provenance to be "information that helps determine the derivation history of a data product, starting from its original sources." They also provide a useful survey of various models of provenance and techniques used to track provenance. Their survey proposes a taxonomy based on potential uses of provenance data. Our formal encodings of provenance in Chapter 2 show how to track provenance accurately, but are somewhat agnostic as to how this data is to be used. In the implementation of SEWIKI, (according to their taxonomy) we use provenance primarily for data quality, attribution, and to construct audit trails.

Buneman et al. [22, 23] discuss various approaches to provenance, specifically in the context of database systems. They propose an alternative means of categorizing provenance in terms of the information it reflects, rather than potential usages of that information. In their terminology, where-provenance is information about the location (such as a specific database record) from which some data was retrieved. Alternatively, why-provenance refers to the source data that may have influenced the result of a computation.
In these terms, we have focused primarily on why-provenance, although tracking where-provenance does not appear to pose serious difficulties.

Our approach to provenance tracking is closely related to computing dynamic program slices [141]. Cheney has also observed this connection and discusses ways in which ideas from slicing can be used to improve provenance tracking in databases [25]. Dependency correctness, the security property we prove for provenance policies, is also due to Cheney et al. [26].

7.5 Web Programming

SELINKS expands on the original goal of LINKS [35], which is to reduce the impedance mismatch in programming multi-tier web applications. Our work aims to reduce the impedance mismatch faced when synchronizing the security mechanisms available in the various tiers of a web application. Several other languages also aim to simplify web programming by providing a unified view of the client and server tiers, e.g., Hop [115], the Google Web Toolkit (GWT) [57] and Volta [84]; we could have applied FABLE to any of these instead of LINKS. However, we found LINKS's three-tier solution (spanning client, server, and database) particularly attractive.

SIF [32] and Swift [29] are two Jif-based projects that aim to make web applications secure by construction. The former is a framework in which to build secure Java servlets. The latter, Swift, is a technique that permits a web application to be automatically split according to a policy into JavaScript code that runs on the client and Java code on the server. Being based on Jif, both these projects focus primarily on enforcing information flow policies; SELINKS aims to be more flexible by enforcing user-defined policies. Another distinction is that Swift and SIF target interactions between the client and server, while server-database interactions are the focus of our work with SELINKS. Despite these distinctions, there appear to be a number of ways in which Swift and SIF can complement SELINKS; we discuss some examples below.
In SELINKS, we expect programmers to insert annotations that partition the program into client and server components. The resulting partition is often fairly coarse grained, with more code running at the server than is strictly necessary for security. As such, the basic ideas of Swift could be applied to SELINKS to direct the partitioning of code into client and server components. A Swift-partitioned SELINKS program could have more code running at the client, potentially improving the responsiveness of the application and reducing load at the server.

In designing SIF, Chong et al. have worked out several useful idioms for enforcing information flow policies in web applications. For example, to provide a degree of protection against attacks like script injection, SIF places restrictions on the use of cascading style sheets and dynamically generated JavaScript. (Section 8.4 describes measures in SELINKS to defend against similar threats.) One could implement these behaviors in an SELINKS policy module and the type system could ensure that developers adhere to a programming discipline that has been found to be effective with SIF.

7.5.1 Label-based Database Security

To understand the benefits of our approach with SELINKS, we consider some alternative approaches to securing a document-management application.

Database-side enforcement. Some DBMSs aim to enforce fine-grained policies directly, with little or no application assistance. For example, Oracle 10g [97] has native support for schemas in which each row includes a security label that protects access to that row. In this case, the label model and the enforcement policy are provided directly by the DBMS. As a result, the application code does not need to be trusted to perform the security checks correctly since the DBMS will perform them transparently.
Application programmers need only focus on the functional requirements; i.e., they can write queries like (using LINKS syntax):

    for (var row <- doc_table_handle)
    where (row.text ~ /.*{keyword}.*/)
    [row]

Native support for authorization checks in the DBMS can be optimized.

There are two downsides to a database-only enforcement model. The first problem is the lack of customizability. Each DBMS has different security mechanisms, and these may not easily map to application concerns. For instance, Oracle's row-level security is geared primarily to a hierarchical model of security labels, in which security labels are represented by integers that denote privilege levels. A user with privilege at level $l_1$ may access a row labeled $l_2$ assuming $l_1 \ge l_2$. While useful, this native support is not sufficient to implement the label model we described above. For one, a typical encoding of access control lists in a hierarchical model requires a lattice model of security [42], rather than the total-order approach used in Oracle. Encoding principal sets in a hierarchical model is also not robust with respect to dynamic policy changes [125].

Furthermore, Oracle 10g is atypical; most DBMSs provide a far more impoverished security model. For instance, PostgreSQL [103], SQLServer [119], and MySQL [90] all provide roughly the same security model, based on discretionary role-based access control [95]. Object privileges are coarse-grained (read, write, execute etc.) and apply at the level of tables, columns, views, or stored procedures. By contrast, SELINKS labels can be defined using LINKS's rich datatype specification facility, labels can be associated with data at varying granularity (table, row, or even within a row), and these labels can be given user-defined semantics via the enforcement policy.
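The contrast between a total order of integer levels and a lattice of access control lists can be made concrete with a short sketch. It is written in Python rather than LINKS and assumes, purely for illustration, that an ACL label is the set of principals allowed to read a row; the function names are hypothetical and do not correspond to Oracle or SELINKS APIs.

```python
def total_order_allows(user_level: int, row_level: int) -> bool:
    """Hierarchical check: a user at level l1 may read a row at level l2 if l1 >= l2."""
    return user_level >= row_level


def at_most_as_restrictive(a: frozenset, b: frozenset) -> bool:
    """ACL label a is at most as restrictive as b if every reader of b may also read a."""
    return b <= a  # subset test on reader sets


def join(a: frozenset, b: frozenset) -> frozenset:
    """Least restrictive label that is at least as restrictive as both a and b."""
    return a & b


# Hypothetical reader sets used as labels.
alice_only = frozenset({"alice"})
bob_only = frozenset({"bob"})
both = frozenset({"alice", "bob"})
```

Under integer levels any two labels are comparable, but `alice_only` and `bob_only` are incomparable under `at_most_as_restrictive`: neither is a subset of the other. This incomparability is what a lattice can express and a single total order of privilege levels cannot.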
The second problem is that database-only enforcement does not solve the end-to-end security problem: while we may be confident that no data moves from the database to the server without proper access, the DBMS cannot ensure the server does not (inadvertently or maliciously) release the data inappropriately, e.g., by writing it to a publicly visible web log. By contrast, SELINKS ensures that sensitive data, whether accessed via a database query or a server action, is always mediated by a call to the enforcement policy. This provides a level of trustworthiness similar to application-transparent enforcement within the DBMS, but with greater scope. Indeed, it opens up the possibility for enforcing policies that combine information available in the database and the server.

Server-side enforcement. Another common approach is to enforce fine-grained security policies primarily in the server. This is the approach taken in web application frameworks like J2EE and ASP.NET. In J2EE [56], Entity Enterprise Java Beans (EJBs) are used to represent database rows at the server where row data is made available via user-defined methods. For our example we could define a method findByKeyword to search a document's text. Access to this method (and other operations) is controlled using the Java Authentication and Authorization Service (JAAS) to invoke user-defined functions under relevant circumstances. ASP.NET is similar to J2EE except it integrates more cleanly with authentication services provided by the Windows operating system [5]. Other lightweight approaches to web programming, like PHP [100] or Ruby On Rails [110], take a more ad hoc approach to security: a set of best practices is recommended to protect applications from common vulnerabilities like code injection attacks.

All these approaches are extremely flexible. As with SELINKS, the developer can customize the label model and its semantics.
Because policies are enforced at the server, they can consider server and database context, providing broader scope. The main drawback of the server-side approach is the performance hit that comes with moving data from the database to the server, potentially unnecessarily. As an illustration of this, Cecchet et al. [24] report that J2EE implementations based on entity beans can be up to an order of magnitude slower than those that do not. That said, Spacco and Pugh [106] report that for the same application much of the performance can be restored with some additional design and tuning, but this can be a frustrating and brittle process. The other problem with server-side enforcement is trustworthiness: the application programmer is responsible for correctly invoking security policy functions manually, so that mistakes can lead to security vulnerabilities.

Hybrid enforcement. SELINKS essentially represents a kind of hybrid enforcement strategy: it presents a server-side programming model but compiles server functions to UDFs to allow them to run on the database and thus optimize performance.

This same basic strategy could also be encoded by hand. One could define a custom notion of security label (e.g., as a certain format of strings), and then write a series of user-defined functions akin to the SELINKS enforcement policy for interpreting these strings. The application writer would then be responsible for calling these functions during database accesses to enforce security. To avoid changing the application, a popular alternative is to have the DBMS perform UDF calls transparently when accessing the database via a view [97]. For example, we could define a view of our document table as containing only the docid and text fields; when querying these fields, calls to UDFs would be made by the DBMS transparently to filter results according to the hidden doclab field.

This by-hand approach has three main drawbacks, compared to what is provided by SELINKS.
First, database-resident functions are painfully low-level, operating on application object encodings rather than, as in SELINKS, the objects themselves. Second, different DBMSs have different UDF languages, and thus a manual approach requires possibly many implementations; by contrast the SELINKS compiler can be used to target many possible UDF languages. Finally, if application programmers must construct queries with the appropriate calls to security enforcement functions there is the danger that coding errors could result in a policy being circumvented. While using views reduces the likelihood of this problem, there are still parts of the application that manage the policy, e.g., by updating the doclab portions of the objects, and these bits of code are subject to mistakes. The SELINKS type checker ensures that operations on sensitive data (whether in queries like our keyword search example or in operations such as server logging functions) respect the security policy.

Finally, an important benefit of our approach is that it enables an application to make design choices that are pertinent to security, and have them reliably enforced in the server using the abstractions that are available there, if necessary. For example, in the case of SEWIKI, the policy label of a parent node indirectly and uniformly restricts access to all its children; labels that appear at child nodes may add to this restriction. To implement this tree-based semantics literally would require recursive policy checks. Doing this in the database would be both cumbersome and inefficient because SQL is not particularly well suited to handling recursive relations. Instead, we use code running in the web server to enforce the invariant that a node's label always reflects the policies of its ancestors' labels.
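The invariant just described can be sketched in a few lines. The sketch is in Python rather than LINKS and assumes, for illustration only, that a label is the set of principals permitted to read a node; SEWIKI's actual label representation may differ, and `Node`, `restrict`, and `can_read` are hypothetical names.

```python
class Node:
    """Document-tree node whose stored label already includes its ancestors' restrictions."""

    def __init__(self, name, readers, parent=None):
        self.name = name
        self.children = []
        own = frozenset(readers)
        # Invariant: the stored label is the node's own label restricted
        # by the (already restricted) label of its parent.
        self.readers = own if parent is None else own & parent.readers
        if parent is not None:
            parent.children.append(self)

    def restrict(self, readers):
        """Tighten this node's label and re-establish the invariant below it."""
        self.readers = self.readers & frozenset(readers)
        for child in self.children:
            child.restrict(self.readers)


def can_read(user, node):
    """Access check needs no recursion: the node's label reflects all ancestors."""
    return user in node.readers
```

Because the recursion happens when labels are created or changed, a read check inspects a single label, which is the property that makes a non-recursive database query sufficient.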
Thus, even in situations where end-to-end tracking of information flow is not essential (as in our access control policy here), the flexibility afforded by server-side security enforcement in SELINKS is crucial both to reasonable efficiency and ease of use.

As a final note, a form of hybrid policy enforcement has also been recently proposed outside the context of web applications. SEPostgreSQL is an extension of PostgreSQL that aims to achieve end-to-end security through integration with the SELinux secure operating system [80]. SEPostgreSQL allows SELinux policy metadata to be associated with tables, columns, and rows in the database. Access to protected objects in SEPostgreSQL is mediated by the SELinux operating system's underlying reference monitor. This opens the possibility of enforcing a uniform policy throughout the operating system and database. However, it is unclear if the ideas behind SEPostgreSQL translate well to other DBMSs. In contrast, a key benefit of our work is portability. We rely only on widely-used features of PostgreSQL (user-defined types and functions) which are also available in most other mainstream DBMSs. The assurance, obtained using FABLE types, that security checks are not bypassed is also unique to our approach.

7.6 Other Technical Machinery

Our technique of separating the enforcement policy from the rest of the program (in FABLE) is based on Grossman et al.'s colored brackets [58]. They use these brackets to model type abstraction, whereas we use them to ensure that the privilege of unlabeling and relabeling terms is not mistakenly granted to application code. As a result, we do not need to specially designate application code that may arise within policy terms, keeping things a bit simpler. We plan to investigate the use of different colored brackets to distinguish different enforcement policies, following Grossman et al.'s support for multiple agents.

8.
Looking Ahead

This dissertation was motivated by a long-term vision of a modular, composable, and formally verifiable approach to the enforcement of security policies. In its idealized form, we conceive of a framework flexible enough to capture the idiosyncrasies of enforcement mechanisms used by real software implementations, yet precise enough to admit formal verification of high-level security goals. The enforcement mechanisms would be modular so that, once verified, they could be reused with a variety of applications with high assurance. For applications that needed to address a range of concerns, policies would be enforceable in combination. For example, some critical components of an application could be protected by strong, highly restrictive policies, while other parts could be secured by permissive, low-overhead policies. Source-level programmers would be able to use this framework to ensure that new software was secure by construction. Legacy applications could also be retrofitted with security policies customized to match specific deployment scenarios.

While this vision remains a distant (perhaps even unattainable) goal, the work in this dissertation has made significant strides towards its fulfillment. We have shown, at least in theory, that a language-based framework can be expressive enough to verify the enforcement of many interesting security policies. We have also shown that the kernel of our approach is practical, and that programming with simple user-defined policies in SELINKS is possible today. But, there is much left to do. In this chapter we acknowledge some limitations of our work and identify a number of directions in which our work might be advanced.

8.1 Assessment of Limitations

This dissertation aimed to defend the thesis that language-based enforcement of user-defined policies can be both expressive and practical.
By developing encodings of many kinds of security policies, we have demonstrated that FLAIR is more expressive than any previously known security-typed language. Additionally, by proving that programs enjoy useful extensional properties as a consequence of type correctness, we have shown that our approach can provide a level of assurance that is competitive with more traditional, specialized security-typed languages. By developing SELINKS and using it to construct realistic applications, we have also shown that our approach can be applied in practice. Nevertheless, our work suffers from a number of limitations.

First, we are unable to precisely characterize the class of security properties enforceable in FLAIR. Many of the security policies we have explored here aim to establish safety properties for programs; i.e., they proscribe certain bad events from occurring during a program execution [75]. The security automata policies of Chapter 3 are a canonical example of such safety-oriented policies. A succession of results about the expressiveness of security automata have attempted to characterize the precise class of safety properties that they are able to enforce [113, 59, 13, 79]. By virtue of our ability to enforce automata-based policies, these prior results establish a useful lower bound on the class of safety properties that can be enforced in FLAIR. But, this lower bound is not very precise. We have also demonstrated that FLAIR is powerful enough to enforce properties like noninterference, which are not safety properties. Noninterference-like properties have been variously categorized as 2-safety properties [126] and, more recently, as hyperproperties [33]. But, only a small subset of hyperproperties are within reach of our enforcement techniques. For example, hyperproperties include liveness properties, i.e., properties that ensure that a program eventually does something good. We have made no attempt to enforce liveness properties.
A more practical concern is that the generality of policy enforcement in FLAIR/SELINKS comes at a price. We require programmers to insert explicit calls to enforcement policy functions, rather than inferring the placement of the calls automatically. While the annotation burden is small for simple policies like access control, for more complex policies like information flow, as programs grow larger, the number of policy checks that must be inserted quickly becomes unacceptable. Specialized systems like Jif or FlowCaml do not suffer from this problem. Automatic insertion of policy checks is essential if SELINKS is to compete with these systems in the enforcement of information flow policies.

Another concern is the usability of our advanced typing constructs in a source language. Our experience with SELINKS is limited to FABLE-style purely functional policies. We have found that using dependent types to express a security labeling is fairly lightweight. However, the combination of dependent and affine types that we propose in FLAIR is much more burdensome for the programmer. As such, we conjecture that FLAIR may be more useful as the common intermediate language for a variety of systems that enforce special-purpose policies.

Finally, our theory of policy composition is fairly primitive. For example, although SEWIKI, our main example program, enforces a combined access control and provenance policy, our proofs of correctness apply only to each of these policies in isolation. We do propose some syntactic techniques to ensure that policies compose well, but this applies only to relatively simple modes of composition. Despite this weakness, our work is the first to provide a platform on which further work on modular proofs of correctness for composite policies may be explored.

8.2 Automated Enforcement of Policies

While SELINKS makes it easy to enforce simple policies, the difficulty of enforcing more complex policies is often unacceptable.
This has been one of the main factors in limiting our example applications to access control and simple provenance; information flow policies are too hard to enforce for large programs. The main thrust of the work proposed in this section is to make it easier to reason about and enforce some of the more complex policies.

The difficulty of enforcing a policy in FLAIR/SELINKS is due to three main concerns. We consider each of these and, in turn, propose ways in which these concerns may be addressed. The first source of difficulty is that we require application programs to include explicit calls to enforcement policy functions. For security policies like information flow, complete mediation demands that enforcement policies mediate all operations on sensitive data; e.g., function calls must be performed indirectly by calling policy functions like apply. So, we propose transforming programs to insert policy checks automatically. Second, even for programs that include all the required enforcement policy calls, type checking relies on annotations that make explicit the connection between security labels and data. So, we propose using novel forms of type inference, suitable for use with dependent types. Finally, even if a program can be shown to be type correct with respect to an enforcement policy, the security of the system relies crucially on a proof that the policy encoding correctly establishes some high-level goals. To reduce this burden, our proposal is to leverage automated theorem provers or proof assistants to mechanize much of the reasoning about policy correctness.

8.2.1 Transforming Programs to Insert Policy Checks

Currently, in order to enforce a policy, we expect a programmer to provide type annotations that protect sensitive data with their security labels. Programs that manipulate labeled data are required to include calls to policy functions; these calls must also be inserted manually into the program text.
Instead of requiring the programmer to insert checks manually, we would prefer, given a manual security labeling, to insert the appropriate policy checks automatically. This section outlines a line of research that aims to achieve this goal. We begin with an abstract statement of the problem.

Given a program $e$ and an enforcement policy $P$, where $e$ is not type-correct with respect to $P$, compute a program $e'$ (if one exists), where $e'$ is related to $e$ by the relation $R$, such that $e'$ is type-correct with respect to $P$.

An enforcement policy $P$ in FABLE (or, equivalently, a signature in FLAIR) is a purely declarative specification of the mechanism by which a policy is to be enforced. In order to automatically insert policy checks into a program, we could require the policy designer to additionally provide an algorithmic specification of $P$. Essentially, we could augment $P$ with a set of rewriting rules that describe a program transformation.

The general form of a rewriting rule could be rewrite $p_1$ as $p_2$, where $p_1$ and $p_2$ are program patterns. The intention is to apply a collection of these rules to the source program $e$, rewriting sub-expressions of $e$ that match the pattern $p_1$ according to the pattern $p_2$. An example rule is shown below:

    rewrite $(e_1 : (\tau \to \tau')\{l\} \;\; e_2 : \tau)$ as $(\mathsf{apply}\ e_1\ e_2)$

In this case, the pattern $(e_1 : (\tau \to \tau')\{l\} \;\; e_2 : \tau)$ is matched by any sub-expression of $e$ that is an application of a function $e_1$ to some argument $e_2$. The type annotations that appear in the pattern restrict $e_1$ to be typeable as a function with type $\tau \to \tau'$, labeled with the label $l$. Since labeled functions are not directly applicable, we need to wrap the application in a call to the apply enforcement policy function. The pattern $p_2$ does exactly this: it calls the apply function, passing in $e_1$ and $e_2$ as arguments.

This approach has a number of benefits. First, we aim to guarantee that the target program $e'$ is type-correct with respect to the policy.
This means that verifying the security of $e'$ only requires reasoning about the declarative policy $P$, and not the (potentially complicated) rewriting rules, i.e., despite the added complexity of the program transformation algorithm, the trusted computing base is unchanged. Furthermore, if successful, this work would not only ease the construction of new secure programs, but would also open the possibility of retrofitting existing programs with policy checks.

However, making this idea work in practice will require addressing a number of concerns. First, it may be possible to apply multiple rules to a single sub-expression. This could be either because we are attempting to enforce multiple policies simultaneously, or due to inherent ambiguities or non-determinism in the way in which a single policy is to be enforced. Thus, we would require some mechanism by which conflicts among the rules are to be resolved.

A related issue is the termination of the rewriting process. Can program fragments be re-written multiple times, and if so, can we provide any assurances that the rewriting process converges ultimately? Or, perhaps, non-termination (or, pragmatically, simply a long-running rewriting process) can be treated as a failure mode. If so, what are the ways in which this failure can be explained? That is, is the non-termination due to an ambiguity in the rewriting rules, or is it due to a badly formed source program? What about explaining other failure modes? For instance, in the simple example above, if the sub-program contains an application $e_1\ e_2$, where the type of $e_2$ does not match the type of the formal parameter of $e_1$, then it seems reasonable that the rewriting process should fail. Is there a way in which these so-called reasonable failure modes can be characterized?

Finally, the abstract statement of the problem intentionally left the notion of correctness unspecified; this is potentially the most challenging issue to address.
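To make the rewriting idea and the termination question concrete, the following sketch applies a single rule, wrapping applications of labeled functions in a call to apply, and treats failure to converge within a fixed number of passes as an error. It is written in Python over a toy expression type, not in FABLE or SELINKS. The type-directed pattern is simplified to a set of variable names assumed to denote labeled functions, and all class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    """Direct application of fn to arg."""
    fn: object
    arg: object


@dataclass(frozen=True)
class PolicyCall:
    """Call to an enforcement policy function such as apply."""
    policy_fn: str
    args: tuple


def rewrite_once(expr, labeled):
    """One bottom-up pass: rewrite (e1 e2) as (apply e1 e2) when e1 is labeled."""
    if isinstance(expr, App):
        fn = rewrite_once(expr.fn, labeled)
        arg = rewrite_once(expr.arg, labeled)
        if isinstance(fn, Var) and fn.name in labeled:
            return PolicyCall("apply", (fn, arg))
        return App(fn, arg)
    if isinstance(expr, PolicyCall):
        args = tuple(rewrite_once(a, labeled) for a in expr.args)
        return PolicyCall(expr.policy_fn, args)
    return expr


def rewrite(expr, labeled, max_passes=10):
    """Rewrite to a fixed point; non-convergence is reported as a failure."""
    for _ in range(max_passes):
        rewritten = rewrite_once(expr, labeled)
        if rewritten == expr:
            return expr
        expr = rewritten
    raise RuntimeError("rewriting did not converge")
```

For example, `rewrite(App(Var("f"), App(Var("g"), Var("x"))), {"f"})` wraps only the outer application, since only `f` is assumed to be labeled. The pass bound is one pragmatic way of turning non-termination into an explicit failure; it does not by itself explain why the rules failed to converge.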
What are reasonable ways of constraining the behavior of the rewritten program $e'$ so that it accurately reflects the intended semantics of the source program $e$, i.e., what is an appropriate definition for the relation $R$? Clearly, we would like to ensure that the programmer-specified labeling relationships in $e$ are left unchanged in $e'$. But, we might also like to go further and ensure that $e'$ preserves the semantics of $e$ in some non-trivial way, e.g., that the runtime behaviors of $e$ (under a suitable operational semantics) and $e'$ are identical, except for possibly failed policy checks in $e'$.

There are a number of related works that might provide suitable answers to each of these questions. First, work on aspect-oriented programming [72] is based on similar ideas. An aspect consists of a point-cut (a pattern that defines a set of program points of interest) and some associated advice, which defines an action to be performed at the points of interest. However, our notion of rewriting departs from aspects in two respects. First, we seek to transform a type-incorrect program by inserting advice, in the form of the appropriate policy checks. Aspects have traditionally been used to alter the runtime behavior of type-correct programs, rather than to fix type-incorrect programs. Additionally, since policy checks in our framework can often be erased entirely, rewriting may not alter the runtime behavior of a transformed program at all. Nevertheless, it seems likely that the close connection to aspect-oriented programming can be used profitably; e.g., notions of correctness associated with aspects (like harmlessness [39]) may also be applicable in our setting.

Other ideas that may also be useful include work on blame assignment for use with software contracts [47] or hybrid programming languages [138].
It might be useful to show that if it is impossible to transform a program, or if a policy check in a transformed program fails at runtime, that the blame for the failure resides with the application program and is not some undesired artifact of the policy or rewriting rules. Finally, in this context, recent work on explaining failures of program analyses may also be relevant [133].

8.2.2 Inferring and Propagating Label Annotations

As a means of documenting security assumptions, security label type annotations appear to be far from optimal. Type annotations are buried within and are interspersed throughout the program text, causing the high-level intent of the connection between policy and data to be obscured. We would prefer to have a way of specifying a mapping between protected data and their policies separate from the program. From such a high-level specification, we would like to automatically derive the labeling annotations needed to type check a program.

Our vision for this element of proposed work is to complement the enforcement policy with a notion of a labeling policy, a high-level specification of the security constraints on the data sources and sinks in the program. Ideally, this arrangement would further simplify reasoning about the security of an application: given that the type checker can ensure that an application is consistent with its policies, a security analyst can ignore the program text altogether and focus only on the enforcement and labeling policies.

A labeling policy might include constraints of the following flavors. For instance, it might state that all resources accessible on the file system via a certain path are to be considered high-security. Or, for example, that the label of every network packet can be found at a specific offset from the start of the packet. Or, even that the label of a database element can be retrieved by following a succession of foreign-key/primary-key associations between multiple relations in a database.
The labeling policy might also place constraints on data sinks like network ports or terminals, limiting the types of data that can be sent out on them.

Given a labeling policy, we would have to infer labels for objects manipulated by the program. For example, if the program opens a file by passing a string constructed by the concatenation of various constants, we would like to be able to discover the security label for that file by examining the labeling policy. Of course, this would require precise reasoning about the string constants in the program; but, it may be possible to leverage the power of dependent types in SELINKS to reason in this manner (as is done, say, in a language like Cayenne [6]). Alternatively, we could automatically give network packets the type of a dependent record, with the security label stored at the appropriate offset. We would have to develop similar methods to ensure that database queries always retrieved the appropriate labels along with the protected data.

Associating labels with the data sources and sinks is only one half of the problem. Given such an association, we would also like to use type inference to propagate label/data dependences throughout a program. As discussed in Chapter 7, this form of inference for security-type systems has been explored in the context of FlowCaml. However, the static label model of FlowCaml is a simplifying assumption that is not applicable to our setting. Extending type inference to support dynamic labels would be a useful contribution in its own right; such a procedure could also simplify programming in Jif, for example. But, the generality of dependent types in SELINKS poses an additional challenge. Recent proposals like liquid types [109] may provide the basis of an inference mechanism for SELINKS.
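A labeling policy of the kind envisioned in this section might look like the following sketch. It is written in Python and only looks labels up at run time, whereas the proposal here is to derive them statically. The path prefixes, the label names "High" and "Low", the conservative default, and the one-byte packet label are all assumptions made for illustration.

```python
import posixpath

# Hypothetical labeling policy: (path prefix, security label) pairs.
LABELING_POLICY = [
    ("/srv/secret", "High"),
    ("/srv", "Low"),
]


def label_for_path(path, policy=LABELING_POLICY, default="High"):
    """Return the label of the longest policy prefix that contains path.

    The path is normalized first so that '..' segments cannot be used to
    dodge a prefix; unknown paths get the (conservative) default label.
    """
    normalized = posixpath.normpath(path)
    best = None
    for prefix, label in policy:
        root = prefix.rstrip("/")
        if normalized == root or normalized.startswith(root + "/"):
            if best is None or len(root) > len(best[0]):
                best = (root, label)
    return best[1] if best is not None else default


def label_for_packet(packet: bytes, offset: int) -> int:
    """Read a one-byte label stored at a fixed offset from the start of a packet."""
    return packet[offset]
```

For instance, `label_for_path("/srv/secret/plan.txt")` yields `"High"` while `label_for_path("/srv/secretary/notes.txt")` yields `"Low"`, because prefixes match only on whole path components. A static analysis would need to establish the same facts about string constants at type-checking time, which is where the harder reasoning lies.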
8.2.3 Semi-Automated Proofs of Policy Correctness

A key benefit of our approach to policy enforcement is the ability to prove that programs satisfy end-to-end properties as a result of complete mediation. As such, conducting proofs of these properties is an integral part of our conception of the process of policy enforcement. That is, policy enforcement is not complete until all the policy code has been verified to correctly establish a desired security property of a program. However, at present, our proofs that a policy correctly enforces some high-level security property are entirely manual. Whereas our proposals for program rewriting and type inference sought to ease the process of constructing application programs, in this section we focus on simplifying the task of the policy designer and analyst.

In Chapter 7, we observed that our user-defined policies are a form of type-system extension. In this context, Millstein et al.'s work on semantic type qualifiers is relevant [28]. These are custom type qualifiers that a programmer can use to indicate data invariants to be enforced by the type system. Much like our policies, the specifications of these qualifier extensions have to be proved to correctly describe the high-level invariants that they are intended to establish. Marino et al. have proposed using proof assistants to partially automate these proofs of correctness [82]. Adapting this proposal to our setting is an interesting direction of future work.

Whereas Marino et al. attempt to prove relatively simple syntactic properties, proving semantic properties (like noninterference) of our policy encodings would require a substantially larger effort. However, as discussed in Chapter 2, we have noticed that, although non-trivial, our proofs of correctness are simplified considerably by the type-soundness results of the underlying calculi, whether FABLE or FLAIR.
Based on this experience, we conjecture that given a mechanized formalization of the metatheory of, say, FLAIR, a policy analyst could rely on several key lemmas in the soundness proof to discharge a proof of the desired security property. At the time of this writing, we have formalized a significant subset of the soundness proof of the AIR calculus in the Coq proof assistant [17]. But we have yet to apply this formalization to proofs of security properties.

8.3 Enhancements to Support Large-scale Policies

In the future, we would like to extend our evaluation of SELINKS by attempting to enforce large policies that address diverse security concerns. In this section, we consider how SELINKS might be scaled up to bring this goal within reach.

8.3.1 Interfacing with Trust Management Frameworks

Trust management frameworks were introduced by Blaze et al. as a means of specifying the authorization requirements of large distributed systems [19]. One basic design goal of trust management is to separate the specification of the policy from the means of its enforcement. Another goal is to formalize policy languages to the extent that a precise semantics can be given to a policy specification, both to enable the construction of interpreters that can answer policy queries and to make policies amenable to formal analysis so as to check compliance with high-level security goals [77].

Our approach to policy enforcement has hinged on the premise that reasoning end to end about the security of a system depends crucially on a precise specification of the enforcement mechanism that ties a high-level policy to a program. In FABLE, the glue that ties a high-level policy to the program is the enforcement policy. We believe that the notion of an enforcement policy is particularly applicable in the context of trust management, in that it bridges the gap between specification and mechanism intentionally kept separate by trust management systems.
By using an enforcement policy, our work admits proofs of end-to-end operational properties of programs. This complements prior work on policy analysis in trust management, which aims to guarantee that policy specifications themselves are consistent with high-level system objectives. Given a trust management policy, we could write enforcement policy glue code to ensure that application programs always include the appropriate calls to the policy interpreter. If successful, not only could we prove end-to-end properties of program executions, we could also check that these low-level security properties are consistent with the high-level objectives deduced from an analysis of the policy alone. Of course, this effort would begin with a study of existing policies formalized in trust management systems. For example, health-care policies formalized in Cassandra may be one starting point [14].

However, scaling SELINKS to be able to interface with trust management languages poses a number of interesting problems. One is finding a reliable and transparent way to tie resources manipulated by the program to the policy elements that govern the usage of those resources. Throughout this dissertation, we have achieved this by associating a security label with each sensitive resource. However, interfacing with a trust management policy would be greatly simplified (and, indeed, enhanced) if the research proposed in Section 8.2.2 is successful. That is, given a specification of resources in a language like SPKI/SDSI, we could automatically reconstruct the type-level labelings of objects in the program and propagate these throughout the program.

Interpreters for trust management languages are often stateful. For example, Becker and Nanz [15] describe the semantics of a trust management policy language using a variant of Datalog extended with state modification effects. We conjecture that the techniques we have developed with FLAIR will be useful in this context.
Extending SELINKS with FLAIR, and enhancing FLAIR with some of the ideas of the previous section to make it more usable at the source level, will also be an interesting line of research.

There will also be some interesting engineering issues. Trust management was designed with distributed systems in mind. SELINKS also targets distributed systems, but is currently limited to the three-tier topology. Extending SELINKS to handle more general topologies of distribution would take a substantial effort. This would include a more sophisticated model of trust in the various nodes/tiers of a system. Our current model trusts the server and the database implicitly, but places no trust whatsoever in the client. An extension to more general topologies would require refining this model so that finer degrees of trust can be placed in each node.

8.3.2 Administrative Models for Policy Updates

Most existing security-typed languages assume that a program's security policy does not change once the program begins its execution. This is an unrealistic assumption for realistic, long-running programs. For operating systems, network servers, and database systems, the privileges of principals are likely to change. New principals may enter the system, while existing principals may leave or change duties. On the other hand, it would be unwise to simply allow the policy to change at arbitrary program points. For example, if the program is unaware of a revocation in the security lattice it could allow a principal to view data illegally. More subtly, a combination of policy changes could violate separation of duty, inadvertently allowing flows permitted by neither the old nor the new policy.

In prior work, we proposed a security-typed language RX that permits security policies to change during program execution [125]. We equipped RX with an administrative model based on the RT role-based trust management framework [76].
In effect, elements of the policy define roles with designated owners who are responsible for administering the role's contents. Thus, only when the program is acting in a way trusted by that owner may the role be changed. We defined a type system to enforce this administrative model and, additionally, to ensure that updates do not cause undesirable information flows.

We propose adapting the ideas of RX for use in SELINKS. Since RX policies clearly rely on mutable state, once again, the ideas we developed for the enforcement of stateful policies in Chapter 3 are relevant. By including support for AIR-style tracking of policy state in SELINKS, it should become possible to enforce RX's administrative models for policy updates. Moreover, the flexibility of AIR will allow us to easily explore other administrative models as well; e.g., a recent variant of RX proposes a more flexible administrative model [8]. Additionally, we would be able to investigate questions regarding the suitability of administrative models for policies other than information flow. For example, what is a suitable administrative model for information release policies like AIR? How would those models be combined with the administration of information flow policies? This effort would also mesh well with a broader initiative that aims to integrate SELINKS with trust management.

8.3.3 Administrative Models for Policy Composition

In Chapter 2, we pointed out that our security theorems apply primarily in situations where only a single policy is in effect within a program. However, in practice, multiple policies may be used in conjunction and we would like to reason that interactions between the policies do not result in violations of the intended security properties. In its simplest form, policies can be composed in such a way that different security policies govern different parts of the same application.
For example, we may wish to track information flows on some data and just enforce access control on other data, and ensure that the two kinds of data never mix. The policy composition criteria that we defined in Chapter 2 (and Appendix A) apply to exactly this case. We show that by adhering to this simple form of compositionality, one can reason about the security of the entire system by considering each policy in isolation.

However, more interesting compositions of policies are also natural. For instance, a policy might state that data governed by Alice's access control policy is subject to a lattice-based information flow policy once it is released to Bob. While the enforcement of Alice's access control policy and the lattice-based policy may have been proved correct in isolation, it is not immediately clear that the composed policy does not violate the invariants of its components, much less that it meets some desirable composed semantics.

In the future, we could devise models to control how policies are allowed to be composed. One might attempt to wrap enforcement policy code within a module, where the module interfaces, defined by the administrator of a policy, would specify all the invariants that must be preserved for a policy to be composed with other policies. A particularly simple example would be to limit a policy to be composed only with other policies that were administered by a trusted principal. However, more complex forms of composition may be possible as well. For example, policies could be combined using boolean formulas. We could also borrow ideas like inheritance and overloading from object-oriented languages (our design of AIR already hints at some of these directions), or add support for ML-style functors, to better manage large policies.

8.4 Defending Against Emerging Threats to Web Security

In this dissertation, we have mainly considered threats due to insider attacks.
However, several recent trends, particularly with respect to the world wide web, have made it possible for outsiders to subvert application-level security controls by mounting abstraction-violating attacks. The line of research proposed in this section aims to address these web-based threats, either by extending SELINKS or by using SELINKS in conjunction with other kinds of defenses.

A classic example of a web-based threat is a cross-site scripting (XSS) attack. This can occur when a web page contains script content from a third party, such as an advertiser or other users. This script executes in the web browser with the same level of privilege as scripts that originated from the server and can steal private information from the client's web browser and possibly co-opt the client's web browser into attacking other web sites. XSS attacks were identified as the most common security vulnerability in 2007 [86].

Without specific defenses, an application like SEWIKI is also susceptible to XSS attacks. Figure 8.1 illustrates a possible attack. The upper frame depicts a wiki document stored at the server that contains a top-secret component (at the left) and a secret component (at the right). When this document is requested by a client with clearance to view only the secret component, SEWIKI prunes the top-secret component of the document tree and serves only the secret component to the client. However, the malicious client inserts a script into the secret component. This script, designed to run in the web browser of a user with top-secret clearance, fetches the top-secret component of the document and redirects the web browser to the attacker's web site, evilsite.com, passing in the top-secret data as part of the query string. The lower frame shows the altered document stored at the server along with the attack script. At some point, a top-secret user requests the document and SEWIKI serves the entire document to the user (without pruning the top-secret part).
The attack script then runs in the top-secret user's web browser and forwards the top-secret data to evilsite.com.

In designing SELINKS, we were careful to incorporate the trust assumptions of each tier in our model. However, our model only goes so far: attacks like XSS, or more recently, cross-site request forgery [12], fall outside the scope of our abstractions. Clearly, a comprehensive approach to web security demands that we pay attention to these very real threats.

[Figure 8.1: A cross-site scripting attack on SEWIKI. Upper frame: a document with a top-secret component and a secret component is stored at the server; a malicious user with only secret clearance saves changes that add the script <script> var ts = doc.getElement("topsecret"); doc.location = "http://evilsite.com/?data=" + ts </script>. Lower frame: the victim is a user with top-secret clearance, whose browser sends the data to evilsite.com.]

In prior work, we have proposed addressing XSS attacks by relying on cooperation between the server and client [68]. In our approach (called BEEP), users (like the top-secret user in our example) can run specially modified browsers that are trusted to enforce a policy that is included in each web page by the server. In this case, the trusted web browser can enforce policies that prevent potentially malicious scripts in the secret component of the document from running, or from reloading the browser, etc. Thus, like SELINKS, BEEP relies on coordination across the tiers of an application (in this case, the client and server) to reliably enforce a security policy. (In fact, our implementation of SELINKS includes BEEP policies in every web page it serves. Thus, SEWIKI users running BEEP-enabled browsers are protected from the attack described here.)

We conjecture that as Web 2.0 applications with very rich client-side features continue to gain prominence, enriching a browser's JavaScript runtime environment with the ability to enforce complex policies will become increasingly important.
Java applets, which were once the main vehicle of interactive content on web pages, have been supplanted by AJAX-enabled JavaScript. Following the example of Java, where expressive security policies ranging from sand-boxing to stack inspection were found to be necessary, one might expect that it will be useful to enforce non-trivial JavaScript security policies within the browser. As with BEEP and SELINKS, we conjecture that approaching these problems from an application-wide cross-tier perspective is likely to pay dividends. For example, servers can specify policies to be enforced at the client, or clients can request content that matches their security needs. Providing each application component with the ability to attest to its security claims and verify claims of other untrusted components will be challenging. However, recent adaptations of information flow policies to a cryptographic setting may help provide some of the answers [52, 131].

While BEEP protects clients from malicious code that might be served with a web page, a server also needs to be protected from a malicious client. For instance, web applications have complex control-flow properties that govern a client's workflow through the application. Violations of this workflow can cause the web application to enter an inconsistent state. One approach to solving this problem is to use a FABLE-style security automaton policy to ensure that each client request is consistent with the current state in the workflow of an application. We have experimented with this enforcement technique in SEWINESTORE, our e-commerce application. However, other techniques may also be applicable. For instance, ideas from system-call monitoring [66], originally developed to ensure that an operating system's integrity is not compromised when an application is attacked, can also be applied to web applications.
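The idea of checking each client request against the current workflow state can be illustrated with a small runtime monitor. The sketch below is in Python, not FABLE, and it is only an analogue: the states, the request names, and the `SessionMonitor` class are assumptions made for illustration and are not taken from SEWINESTORE.

```python
# Hypothetical workflow for a small store: (current state, request) -> next state.
TRANSITIONS = {
    ("browsing", "add_to_cart"): "cart",
    ("cart", "add_to_cart"): "cart",
    ("cart", "checkout"): "payment",
    ("payment", "pay"): "confirmed",
}


class WorkflowViolation(Exception):
    """Raised when a request is not permitted in the current workflow state."""


class SessionMonitor:
    """Tracks one client session and rejects out-of-order requests."""

    def __init__(self, start="browsing"):
        self.state = start

    def check(self, request):
        key = (self.state, request)
        if key not in TRANSITIONS:
            # The state is left unchanged, so a rejected request has no effect.
            raise WorkflowViolation(
                f"request {request!r} is not allowed in state {self.state!r}"
            )
        self.state = TRANSITIONS[key]
        return self.state
```

A server would keep one `SessionMonitor` per session and call `check` before dispatching each request; a client that sends `pay` without first going through `checkout` is rejected. Unlike a typed security automaton, nothing here guarantees that every request handler actually consults the monitor.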
A server-side monitor could intercept all client requests and ensure that they conform to some specification of the client's expected behavior. Additionally, one might be able to adapt ideas from kernel-based control-flow integrity monitors to suit web applications [99]. By analyzing the source of a web application, one could automatically extract a model of the client's behavior which could then be enforced by a server-side monitor, on a per-session basis. Such an approach could be particularly effective for a multi-tier language like SELINKS, where the entire application's source could be analyzed at once to extract a precise model of its expected control-flow behavior.

8.5 Concluding Remarks

This chapter has acknowledged a number of limitations to the work described in this dissertation. In seeking to address these limitations, we have also described a number of directions in which our work might be advanced.

9. Conclusions

This dissertation has made a number of contributions in support of the contention that the enforcement of expressive user-defined security policies can be wide-ranging, reliable, and practical. Our evidence includes the following main elements:

• A succession of programming-language calculi (FABLE, AIR, and FLAIR), which we have shown to be expressive enough to verify the enforcement of access control, provenance, information flow, downgrading, and automata-based policies, for both functional and effectful programs.

• The statement and proof of several useful end-to-end properties for programs that enforce each kind of security policy, demonstrating that our approach retains a key benefit of traditional security-typed languages, while exceeding prior approaches by being applicable to a broader class of security policies.

• An implementation of security typing in SELINKS, and the subsequent use of SELINKS in the construction of two realistic web applications, corroborating our claim of practicality.
In conducting the work described in this dissertation we have gained a number of insights. Our work arose from the observation (entirely obvious in hindsight) that to reason about the correct enforcement of a security policy in software, one has to make the enforcement mechanism precise. A lot of the prior work on security policy design (e.g., the work on trust management described in Section 8.3.1) starts from the premise that policy specification and mechanism need to be separated. While this is useful for evaluating the high-level security objectives by reasoning about the policy in isolation, it does not allow reasoning about programs that enforce a policy, because the details of enforcement matter. For example, our proof of non-observability relies on a precise definition of how membership checks on access control lists are performed. Variations on the forms of checks can have significant consequences on the kinds of properties that can be proved. For example, in Chapter 2 we showed alternative encodings of access control in which an authorization check was performed using capabilities instead of interposing a check at each request. We noted that the capability approach can more easily be used to support idioms like the delegation of access rights, but it may not be very robust against time-of-check/time-of-use bugs. By developing the concept of an enforcement policy, we were able to show that all the relevant details of the enforcement mechanism can be factored into a small set of verifiable functions.

We also find our basic approach (as exemplified by FABLE) attractive because it brings the benefits of security typing to common programming tasks. For example, the most common form of access control policy is extremely simple: a successful authorization check releases the protected data without any further constraints.
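As an illustration of how small such a policy is, the following Python sketch wraps data together with an access control list and exposes a single function that performs the membership check before releasing the data. It is a runtime analogue only: the names `Protected`, `release`, and `AccessDenied` are hypothetical, and Python does not provide the static guarantee, central to FABLE, that application code cannot reach the payload without going through the policy function.

```python
from dataclasses import dataclass
from typing import FrozenSet, Generic, TypeVar

T = TypeVar("T")


class AccessDenied(Exception):
    """Raised when the authorization check fails."""


@dataclass(frozen=True)
class Protected(Generic[T]):
    # By convention only the policy function below reads `_payload`;
    # nothing in Python enforces this, unlike a security type system.
    _payload: T
    acl: FrozenSet[str]  # the security label: principals allowed to read


def release(user: str, item: "Protected[T]") -> T:
    """Policy function: the single place where protected data is unwrapped."""
    if user not in item.acl:
        raise AccessDenied(f"{user} is not on the access control list")
    return item._payload
```

For example, with `doc = Protected("quarterly report", frozenset({"alice"}))`, the call `release("alice", doc)` returns the report, while `release("bob", doc)` raises `AccessDenied`.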
However, despite its simplicity, access control is frequently implemented incorrectly; e.g., Security Focus regularly reports vulnerabilities where access control checks are bypassed due to a software error [114]. To prevent these errors, we developed a simple encoding of access control in FABLE. Happily, we found that programming with this encoding in SELINKS was also easy. Nevertheless, the little assistance that the programmer provides in the form of label annotations was enough to prove a useful end-to-end property, i.e., non-observability.

Giving the programmer the freedom to choose the format of security labels is also extremely useful. We have argued that the specific choice of label model can have a profound impact on the kinds of security properties that can be enforced; e.g., role-based label models can be better for controlling policies that change at runtime [125]. But, on a still more practical level, this flexibility allows SELINKS to easily interface with existing systems that already use specific formats of security policies. Furthermore, allowing labels to be formed from arbitrary data (like our use of time-stamps in the provenance labels of SEWIKI) lets the programmer use types to help manage tasks that would not traditionally be within the purview of a security type system.

Dependent types have recently become a fairly popular approach to program verification. But, rather than unleash the full power of dependent types, we have taken a lightweight, pragmatic approach that we hope hits a sweet spot. We use dependent types to express a security labeling and to give precise types to evidence, but not to do full program verification (as in Epigram [2], Coq [17], Cayenne [6], etc.). Our experience with SELINKS indicates that this kind of dependent typing is relatively easy to use.
Additionally, rather than focusing on static enforcement of policies (which is usually impossible, because most interesting policies are dynamic), we permit runtime checks to be used to discharge typing obligations. By using a kind of intensional type analysis we can ensure that all the necessary runtime checks are present. This means that the programmer can quickly develop secure applications by inserting runtime checks wherever necessary to convince the type system of complete mediation. But, as an application matures, one could write down more expressive types to get stronger static guarantees and remove some runtime checks, or at least move checks outside of certain interface boundaries.

Despite being so lightweight, we found that the combination of dependent types with a little bit of verifiable enforcement policy code can be extremely expressive. Even in its simplest form (FABLE), we were able to show encodings of many interesting purely functional (or flow-insensitive) policies. But, in order to encode flow-sensitive analyses, or to account for side effects, we needed more. By adding affine types (another relatively standard, off-the-shelf construct) to the mix, we were able to overcome this restriction.

It appears as though our particular combination of a small amount of trusted code, dependent types, and affine types may be of fairly broad interest. Although we set out to design a framework for enforcing security policies, as we discussed in Section 7.2, it appears as though a language resembling FLAIR could be the basis of a more general-purpose type system that supports flow-sensitive user-defined extensions.

In conclusion, by developing FLAIR, this dissertation has contributed a new, more flexible approach to language-based security. To our knowledge, no prior framework has been able to enforce such a wide range of policies with an equally high level of assurance.

A. Proofs of Theorems Related to FABLE

A.1 Soundness of FABLE

Definition 5 (Well-formed environment).
$\Gamma$ is well-formed if and only if
(i) all names bound in $\Gamma$ are distinct;
(ii) $\Gamma = \Gamma_1, x{:}t, \Gamma_2 \Rightarrow FV(t) \subseteq \mathrm{dom}(\Gamma_1)$;
(iii) $e_1 \succ e_2 \in \Gamma \Rightarrow \Gamma \vdash_c e_1 : \mathit{lab} \wedge \Gamma \vdash_c e_2 : \mathit{lab}$.

Lemma 6 (Canonical forms). For $\Gamma$ well-formed, all of the following are true.
1. $\Gamma \vdash_c v_c : \forall\alpha.t \Rightarrow v_c = \Lambda\alpha.e$
2. $\Gamma \vdash_c v_c : (x{:}t_1) \rightarrow t_2 \Rightarrow v_c = \lambda x{:}t.e$
3. $\Gamma \vdash_{app} v_{app} : t\{l\} \Rightarrow v_{app} = ([\{l\}v'])$
4. $\Gamma \vdash_{pol} v_{pol} : t\{l\} \Rightarrow v_{pol} = \{l\}v'$

Proof. Straightforward from induction on the structure of the typing derivation. Observe that in app-context, terms such as $([\Lambda\alpha.e])$ and $([\lambda x{:}t.e])$ are not values.

Theorem 7 (Progress). Given (A1) $\cdot \vdash_c e : t$. Then either $\exists e'.\ e \rightarrow_c e'$ or $\exists v.\ e = v$.

Proof. By induction on the structure of (A1).

Case (T-INT): $n$ is a value.

Case (T-VAR): Inapplicable, since by assumption the environment is empty ($\mathrm{dom}(\cdot) = \emptyset$) and $e$ is a closed term.

Case (T-FIX): $e$ takes a step via (E-FIX).

Case (T-ABS), (T-TAB): $e$ is a value.

Case (T-TAP): We either have $e = e'\,[t]$ or $e = v_c\,[t]$. In the first case, we use the induction hypothesis and apply (E-CTX) to show that $e'\,[t] \rightarrow_c e''\,[t]$. In the second case, by the second premise of (T-TAP) we have (A1.2) $\cdot \vdash_c v_c : \forall\alpha.t$. By canonical forms (Lemma 6) on (A1.2), we get that $v_c$ must be of the form $\Lambda\alpha.e$ or $([\Lambda\alpha.e])$ (since the types of both $\{e\}u$ and $([\{e\}u])$ are of the form $t'\{e\}$) and (E-TAP) is applicable.

Case (T-APP): If $e$ is either $e_1\,e_2$ or $v_c\,e_2$, then, by applying the induction hypothesis to the third and fourth premises respectively, we get our result using (E-CTX). If $e$ is $v_c\,v_c'$ then, if $c = pol$, by canonical forms on the third premise of (T-APP), $v_c = \lambda x{:}t.e$ and (E-APP) is applicable.

Case (T-LAB): $C(\vec{v}, e, \vec{e})$ is an evaluation context of the form $E_c \cdot e$. So, via the induction hypothesis $e \rightarrow_c e'$ and by (E-CTX) the goal is established.

Case (T-HIDE), (T-SHOW): Application of the induction hypothesis to the first premise.

Case (T-MATCH): If $e$ is $\mathsf{match}\ e'\ \mathsf{with} \ldots$ then we just use the induction hypothesis on the fifth premise of (T-MATCH) and we have our result via (E-CTX).
If, however, $e$ is $\mathsf{match}\ v_c\ \mathsf{with}\ \vec{x}_i.p_i \Rightarrow e_i$, then we must show that reduction via (E-MATCH) is applicable. To establish this, note that the third premise of (T-MATCH) requires $p_n = x_{\mathit{def}}$. Thus, it suffices to show that $v_c \succ x_{\mathit{def}} : \sigma$ for all label-typed values $v_c$. But all $\mathit{lab}$-typed values are of the form $C(\vec{u})$ where each sub-term is also a $\mathit{lab}$-typed value. So, matching via (U-VAR) must succeed.

Case (T-UNLAB): In this case, $c = pol$. If $e = \{\circ\}e'$ then by using the induction hypothesis on the second premise we have our result via (E-CTX). If $e = \{\circ\}v_{pol}$, by the premise we have that $\cdot \vdash_{pol} v_{pol} : t\{e\}$. Thus, from Lemma 6, $v_{pol}$ must be of the form $\{e'\}v_{pol}'$ and reduction can proceed using (E-UNLAB).

Case (T-RELAB): In this case, $c = pol$. If $e = \{e'\}e''$ then by using the induction hypothesis on the second premise we have our result via (E-CTX). If $e = \{e'\}v_{pol}$, then, by definition, $e$ is a value.

Case (T-POL): If we have $e = ([e'])$ we can use (E-BRAC) with the induction hypothesis in the premise. If $c = pol$ we can reduce by (E-NEST). Otherwise, if $e = ([v_{pol}])$, then $v_{pol}$ must be one of the following:
Sub-case $v = n$: In which case, a reduction via (E-BINT) is possible.
Sub-case $v = C(\vec{u})$: In which case, a reduction via (E-BLAB) is possible.
Sub-case $v = \lambda x{:}t.e$: In which case, $([v])$ is reducible using (E-BABS) to a value.
Sub-case $v = \Lambda\alpha.e$: In which case, a reduction via (E-BTAB) is possible.
Sub-case $v = \{e\}u$: In which case, $([v])$ is an application value.

Case (T-CONV): Straightforward from the induction hypothesis applied to the first premise.

Proposition 8 (Well-formed sub-derivations). If $\Gamma$ is well-formed, and (A1) $\Gamma \vdash_c e : t$ contains a premise of the form $\Gamma' \vdash_c e' : t'$, then $\Gamma'$ is well-formed and $\Gamma' \vdash_c t'$. Similarly, if (A1) contains a sub-derivation of the form $\Gamma' \vdash_c t'$ then $\Gamma'$ is well-formed.

Proposition 9 (Sub-coloring of derivations). If, for well-formed $\Gamma$, $\Gamma \vdash_{app} e : t$, then $\Gamma \vdash_{pol} e : t$.

Proposition 10 (Weakening). Given $\Gamma \vdash_c e : t$ and $\Gamma, \Gamma'$ well-formed. Then, $\Gamma, \Gamma' \vdash_c e : t$.

Lemma 11 (Substitution).
Given $\Gamma_1, x{:}t_x, \Gamma_2$ well-formed and
(A1) $\Gamma_1, x{:}t_x, \Gamma_2 \vdash_c e : t$
(A2) $\Gamma_1 \vdash_c v : t_x$
Then, for $\sigma = x \mapsto v$, $\Gamma_1, \sigma(\Gamma_2) \vdash_c \sigma(e) : \sigma(t)$.

Proof. Proved by mutual induction along with Lemma 12 on the structure of assumption (A1). We assume a standard definition of capture-avoiding substitution in $\sigma(e)$ and $\sigma(t)$. Throughout, we are free to assume $\sigma(\Gamma_1) = \Gamma_1$, since $x \notin \mathrm{dom}(\Gamma_1)$.

Case (T-INT): Trivial.

Case (T-VAR): Here we have two sub-cases, depending on whether or not $x \in \mathrm{dom}(\sigma)$.
Sub-case (a): (A1) is of the form $\Gamma_1, x{:}t_x, \Gamma_2 \vdash_c y : t_2$; $y \neq x$ and thus, $\sigma(y) = y$. We have two further sub-cases:
Sub-case (a.i): $y{:}t_2 \in \Gamma_2$. In this case, $FV(t_2) \cap \mathrm{dom}(\sigma) \neq \emptyset$; thus our conclusion is of the form $\Gamma_1, \sigma(\Gamma_2) \vdash_c y : \sigma(t_2)$.
Sub-case (a.ii): $y{:}t_2 \in \Gamma_1$. From our initial remark, we know that $\sigma(\Gamma_1) = \Gamma_1$; thus, $\sigma(t_2) = t_2$. Our conclusion has the required form $\Gamma_1, \sigma(\Gamma_2) \vdash_c y : \sigma(t_2)$.
Sub-case (b): (A1) is of the form $\Gamma_1, x{:}t_x, \Gamma_2 \vdash_c x : t_x$. But $\sigma(x) = v$ and so, from (A2), $\Gamma_1 \vdash_c \sigma(x) : t_x$ is trivial. Furthermore, $\sigma(t_x) = t_x$, since $\Gamma_1 \vdash t_x$ and, by Proposition 8, $x \notin \mathrm{dom}(\Gamma_1)$. Finally, we have our conclusion from weakening, i.e., Proposition 10.

Case (T-FIX): From $\alpha$-renaming, we have $f \notin \mathrm{dom}(\sigma)$. Thus, $\sigma(\mathsf{fix}\ f.v) = \mathsf{fix}\ f.\sigma(v)$. Now, we can use the induction hypothesis on the second premise to derive $\Gamma_1, \sigma(\Gamma_2) \vdash_c \sigma(v) : \sigma(t)$. The first premise follows from the mutual induction hypothesis of Lemma 12.

Case (T-TAB): The first premise of (A1) gives us $\Gamma_1, x{:}t_x, \Gamma_2, \alpha \vdash_c e : t$. Now, since $\alpha \notin \mathrm{dom}(\sigma)$, we can use the induction hypothesis to get $\Gamma_1, \sigma(\Gamma_2), \alpha \vdash_c \sigma(e) : \sigma(t)$. The conclusion follows immediately.

Case (T-TAP): We use the mutual induction hypothesis of Lemma 12 to establish that $\Gamma_1, \sigma(\Gamma_2) \vdash \sigma(t)$. Now, we use the induction hypothesis on the second premise, and in the conclusion we have the type $(\alpha \mapsto \sigma(t))\,\sigma(t')$. Since $\alpha \notin \mathrm{dom}(\sigma)$ we can rewrite this type as $\sigma((\alpha \mapsto t)\,t')$, as required.

Case (T-ABS): Our goal is to show, via (T-ABS), $\Gamma_1, \sigma(\Gamma_2) \vdash_c \lambda y{:}\sigma(t_y).\sigma(e) : \sigma(t)$, since by $\alpha$-conversion, $\mathrm{dom}(\sigma)$ cannot mention $y$.
From inversion of (A1), we can obtain from the first premise
$\Gamma_1, x{:}t_x, \Gamma_2, y{:}t_y \vdash_c e : t$
From the induction hypothesis applied to this judgment we can obtain
(T1) $\Gamma_1, \sigma(\Gamma_2), y{:}\sigma(t_y) \vdash_c \sigma(e) : \sigma(t)$
Thus, to reach our goal, we use this last judgment (T1) in the second premise of (T-ABS). The first premise follows from the mutual induction hypothesis.

Case (T-APP): The induction hypothesis applied to the first premise gives
$\Gamma_1, \sigma(\Gamma_2) \vdash_c \sigma(e_1) : (x{:}\sigma(t_1)) \rightarrow \sigma(t_2)$
and to the second premise gives
$\Gamma_1, \sigma(\Gamma_2) \vdash_c \sigma(e_2) : \sigma(t_1)$
Thus, in the conclusion, we get $(x \mapsto \sigma(t_1))\,\sigma(t_2)$. Again, as with (T-TAP), via $\alpha$-conversion, we can ensure $x \notin \mathrm{dom}(\sigma)$ and we can rewrite this type as required to $\sigma([x \mapsto t_1]\,t_2)$.

Case (T-LAB): We use the induction hypothesis on each of the $n$ premises, obtaining $\Gamma_1, \sigma(\Gamma_2) \vdash_c \sigma(e_i) : \mathit{lab}$ for the $i$th premise. For the conclusion, we note that $\sigma(C(\vec{e})) = C(\sigma(\vec{e}))$ and obtain $\Gamma_1, \sigma(\Gamma_2) \vdash_c \sigma(C(\vec{e})) : \sigma(\mathit{lab} \sim C(\vec{e}))$, the desired result.

Case (T-HIDE), (T-SHOW): Straightforward use of the induction hypothesis on the premise.

Case (T-MATCH): Premise 1 follows from the induction hypothesis. The second premise applied to $\sigma(t)$ follows from mutual induction. Premises 3 and 4 are trivial, since $v$ is a closed term and $\mathrm{dom}(\sigma)$ does not include any of $\vec{x}_i$. For the fifth premise, for each $p_i$ we use the induction hypothesis again, and similarly for each $e_i$. The sixth and seventh premises are also established by the induction hypothesis.

Case (T-UNLAB), (T-RELAB): Induction hypothesis on the first premise.

Case (T-POL): We have $\Gamma_1, x{:}t_x, \Gamma_2 \vdash_c ([e]) : t$ with $\Gamma_1, x{:}t_x, \Gamma_2 \vdash_{pol} e : t$ in the premise. Now, if (A2) is $\Gamma_1 \vdash_{pol} v : t_x$, then we can use the induction hypothesis to establish $\Gamma_1, \sigma(\Gamma_2) \vdash_{pol} \sigma(e) : \sigma(t)$ and conclude with (T-POL). However, if (A2) is $\Gamma_1 \vdash_{app} v : t_x$, then we must first use Proposition 9 before proceeding as before.

Case (T-CONV): Applying the induction hypothesis to the first premise, we obtain $\Gamma_1, \sigma(\Gamma_2) \vdash_c \sigma(e) : \sigma(t)$. We proceed by induction on the structure of the second premise of (A1), $t \cong t'$, to show that $\sigma(t) \cong \sigma(t')$.
= = Sub-case (TE-ID): Trivial. Sub-case (TE-SYM), (TE-CTX): Induction hypothesis. Sub-case (TE-REFINE): e e e e . c Sub-case (TE-REDUCE): By construction, we have that . e e. Lemma 12 (Substitution for type well-formedness judgment). Given well-formed 1 , x:tx , 2 . If the following conditions are true: (A1) 1 , x:tx , 2 (A4) 1 c t v : tx Then, for = x v, 1 , 2 Case (K-INT): Trivial. Case (K-TVAR): = (x v); Thus, 1 , 2 . Case (K-LAB): Trivial. Case (K-SLAB): We use the mutual induction hypothesis to show 1 , 2 pol e : lab . If (A2) is of the form 1 app v : tx , then we rst use Proposition 9 before proceeding. 265 (t) Proof. By mutual induction with Lemma 11 on the structure of assumption (A1). Case (K-LABT): Induction hypothesis on the rst premise, and then simila...
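
The proof of Lemma 11 assumes a standard definition of capture-avoiding substitution without spelling it out. As a rough illustration only, the sketch below implements it for a plain untyped lambda calculus, which is much smaller than the calculus treated in these lemmas. The names `Var`, `Lam`, `App`, `free_vars`, `fresh`, and `subst` are hypothetical and are not taken from the dissertation.

```python
from dataclasses import dataclass
from itertools import count
from typing import Union


# Hypothetical term syntax: variables, abstractions, applications.
@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: "Term"


@dataclass(frozen=True)
class App:
    fn: "Term"
    arg: "Term"


Term = Union[Var, Lam, App]


def free_vars(e: Term) -> frozenset:
    """Return the set of free variable names of e."""
    if isinstance(e, Var):
        return frozenset({e.name})
    if isinstance(e, Lam):
        return free_vars(e.body) - {e.param}
    return free_vars(e.fn) | free_vars(e.arg)


def fresh(base: str, avoid: frozenset) -> str:
    """Pick a name derived from base that does not occur in avoid."""
    for i in count():
        candidate = f"{base}{i}"
        if candidate not in avoid:
            return candidate


def subst(x: str, v: Term, e: Term) -> Term:
    """Compute sigma(e) for sigma = x |-> v, renaming binders to avoid capture."""
    if isinstance(e, Var):
        return v if e.name == x else e
    if isinstance(e, App):
        return App(subst(x, v, e.fn), subst(x, v, e.arg))
    # e is an abstraction from here on.
    if e.param == x:
        return e  # x is shadowed, so sigma does not reach the body
    if e.param in free_vars(v):
        # alpha-rename the binder so that it cannot capture a free variable of v
        avoid = free_vars(v) | free_vars(e.body) | {x}
        y = fresh(e.param, avoid)
        renamed_body = subst(e.param, Var(y), e.body)
        return Lam(y, subst(x, v, renamed_body))
    return Lam(e.param, subst(x, v, e.body))


# Substituting y for x under a binder named y forces a renaming of the binder.
example = subst("x", Var("y"), Lam("y", App(Var("x"), Var("y"))))
```

The renaming branch plays the role of the $\alpha$-conversion steps invoked in cases such as (T-FIX), (T-TAB), and (T-ABS): after renaming, the bound variable cannot be mentioned by the substitution, so the substitution can be pushed under the binder.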
