
iocache

Course: CPS 110, Fall 2009
School: Duke

Caching I/O and Page Replacement

Memory/Storage Hierarchy 101

- Processor (P): very fast, 1ns clock, multiple instructions per cycle.
- Cache ($): SRAM; fast, small, expensive (cache, registers).
- Memory: DRAM; slow, big, cheaper (called physical or main memory); $1000-$2000 per GB or so; volatile.
- Magnetic, rotational storage: really slow seeks, really big, really cheap ($25 - $40 per GB); nonvolatile.
- "CPU-DRAM gap": memory system architecture (CPS 104).
- "I/O bottleneck": VM and file caching (CPS 110).

=> Cost-effective memory system (price/performance).

I/O Caching 101

Data items from secondary storage are cached in memory for faster access time.

I/O cache: a hash table with an integrated free/inactive list (i.e., an ordered list of eviction candidates).

[Figure: HASH(object) applies the hash function to select a slot in the hash bucket array; hash chains hang off the buckets; the free/inactive list runs from head to tail.]

Methods:
- object = get(tag): Locate object if in the cache, else find a free slot and bring it into the cache.
- release(object): Release cached object so its slot may be reused for some other object.

Rationale for I/O Cache Structure

Goal: maintain K slots in memory as a cache over a collection of m items on secondary storage (K << m).

1. What happens on the first access to each item?
   Fetch it into some slot of the cache, use it, and leave it there to speed up access if it is needed again later.
2. How to determine if an item is resident in the cache?
   Maintain a directory of items in the cache: a hash table. Hash on a unique identifier (tag) for the item (fully associative).
3. How to find a slot for an item fetched into the cache?
   Choose an unused slot, or select an item to replace according to some policy, and evict it from the cache, freeing its slot.

Mechanism for Cache Eviction/Replacement

Typical approach: maintain an ordered free/inactive list of slots that are candidates for reuse.

Busy items in active use are not on the list.
- E.g., some in-memory data structure holds a pointer to the item.
- E.g., an I/O operation is in progress on the item.
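The get/release interface and the free/inactive list described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the structure any particular kernel uses: the class name `IOCache`, the `fetch` callback standing in for a read from secondary storage, and the per-item busy count are all hypothetical, and the list ordering (append at the tail on release, evict from the head) is just one possible choice.

```python
from collections import OrderedDict


class IOCache:
    """Toy I/O cache: a directory (hash table) plus a free/inactive list."""

    def __init__(self, num_slots, fetch):
        self.num_slots = num_slots      # K slots held in memory
        self.fetch = fetch              # hypothetical: reads an item from secondary storage
        self.table = {}                 # directory: tag -> [data, busy count]
        self.inactive = OrderedDict()   # eviction candidates; first key is the head

    def get(self, tag):
        """Locate the item if cached, else claim a slot and bring it in."""
        entry = self.table.get(tag)
        if entry is None:
            if len(self.table) >= self.num_slots:
                if not self.inactive:
                    raise RuntimeError("all slots are busy")
                # Evict the item at the head of the inactive list.
                victim, _ = self.inactive.popitem(last=False)
                del self.table[victim]
            entry = [self.fetch(tag), 0]
            self.table[tag] = entry
        else:
            # Busy items are not on the list.
            self.inactive.pop(tag, None)
        entry[1] += 1
        return entry[0]

    def release(self, tag):
        """Drop one use; an idle item becomes an eviction candidate."""
        entry = self.table[tag]
        entry[1] -= 1
        if entry[1] == 0:
            self.inactive[tag] = None   # append at the tail of the list
```

A caller pairs every `get(tag)` with a later `release(tag)`. An item stays off the inactive list for as long as any user still holds it, which is how the sketch models "busy" items.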
The best candidates for reuse are slots that do not contain valid items. Initially all slots are free, and they may become free again as items are destroyed (e.g., as files are removed).

Other slots are listed in order of value of the items they contain. These slots contain items that are valid but inactive: they are held in memory only in the hope that they will be accessed again later.

Replacement Policy

The effectiveness of a cache is determined largely by the policy for ordering slots/items on the free/inactive list. This ordering defines the replacement policy.

A typical cache replacement policy is Least Recently Used (LRU).
- Assume hot items used recently are likely to be used again.
- Move the item to the tail of the free list on every release.
- The item at the front of the list is the coldest inactive item.

Other alternatives:
- FIFO: replace the oldest item.
- MRU/LIFO: replace the most recently used item.

Example: File Block Buffer Cache

The cache is indexed by HASH(vnode, logical block).

Most systems use a pool of buffers in kernel memory as a staging area for memory<->disk transfers.

Buffers with valid data are retained in memory in a buffer cache or file cache.
- Each item in the cache is a buffer header pointing at a buffer.
- Blocks from different files may be intermingled in the hash chains.

System data structures hold pointers to buffers only when I/O is pending or imminent.
- busy bit instead of refcount
- most buffers are "free"

Why Are File Caches Effective?

1. Locality of reference: storage accesses come in clumps.
   - spatial locality: If a process accesses data in block B, it is likely to reference other nearby data soon (e.g., the remainder of block B). Example: reading or writing a file one byte at a time.
   - temporal locality: Recently accessed data is likely to be used again.
2. Read-ahead: if we can predict what blocks will be needed soon, we can prefetch them into the cache.
   - Most files are accessed sequentially.

Handling Updates in the File Cache

1. Blocks may be modified in memory once they have been brought into the cache.
   Modified blocks are dirty and must (eventually) be written back.
2. Once a block is modified in memory, the write back to disk may not be immediate (synchronous).
   - Delayed writes absorb many small updates with one disk write. How long should the system hold dirty data in memory?
   - Asynchronous writes allow overlapping of computation and disk update activity (write-behind). Do the write call for block n+1 while transfer of block n is in progress.

Thus file caches also can improve performance for writes.

The Page Caching Problem

Each thread/process/job utters a stream of page references.
- reference string: e.g., abcabcdabce..

The OS tries to minimize the number of faults incurred.
- The set of pages (the working set) actively used by each job changes relatively slowly.
- Try to arrange for the resident set of pages for each active job to closely approximate its working set.

Replacement policy is the key.
- On each page fault, select a victim page to evict from memory; read the new page into the victim's frame.
- Most systems try to approximate an LRU policy.

VM Page Cache Internals

The cache is indexed by HASH(memory object/segment, logical block).

1. Pages in active use are mapped through the page table of one or more processes.
2. On a fault, the global object/offset hash table in the kernel finds pages brought into memory by other processes.
3. Several page queues wind through the set of active frames, keeping track of usage.
4. Pages selected for eviction are removed from all page tables first.

Managing the VM Page Cache

Managing a VM page cache is similar to a file block cache, but with some new twists.

1. Pages are typically referenced by page table (pmap) entries.
   Must pmap_page_protect to invalidate before reusing the frame.
2. Reads and writes are implicit; the TLB hides them from the OS.
   How can we tell if a page is dirty? How can we tell if a page is referenced?
3. Cache manager must run policies periodically, sampling page state.
   - Continuously push dirty pages to disk to "launder" them.
   - Continuously check references to judge how "hot" each page is.
   - Balance accuracy with sampling overhead.

The Paging Daemon

Most operating systems have one or more system processes responsible for implementing the VM page cache replacement policy.
- A daemon is an autonomous system process that periodically performs some housekeeping task.

The paging daemon prepares for page eviction before the need arises.
- Wake up when free memory becomes low.
- Clean dirty pages by pushing to backing store (prewrite or pageout).
- Maintain ordered lists of eviction candidates.
- Decide how much memory to allocate to file cache, VM, etc.

LRU Approximations for Paging

Pure LRU and LFU are prohibitively expensive to implement.
- Most references are hidden by the TLB.
- The OS typically sees less than 10% of all references.
- Can't tweak your ordered page list on every reference.

Most systems rely on an approximation to LRU for paging.
- Periodically sample the reference bit on each page:
  - visit the page and set the reference bit to zero
  - run the process for a while (the reference window)
  - come back and check the bit again
- Reorder the list of eviction candidates based on sampling.

FIFO with Second Chance

Idea: do simple FIFO replacement, but give pages a "second chance" to prove their value before they are replaced.

Every frame is on one of three FIFO lists: active, inactive and free.
- Page fault handler installs new pages on the tail of the active list.
- "Old" pages are moved to the tail of the inactive list.
  - Paging daemon moves pages from the head of the active list to the tail of the inactive list when demand for free frames is high.
  - Clear the refbit and protect the inactive page to "monitor" it.
- Pages on the inactive list get a "second chance".
  - If referenced while inactive, reactivate to the tail of the active list.

Illustrating FIFO-2C

[Figure: the active list, inactive list, and free list.]

I. Restock the inactive list by pulling pages from the head of the active list: clear the ref bit and place on the inactive list (deactivation).

II. Inactive list scan from head:
1. Page has been referenced? Return to tail of active list (reactivation).
2. Page has not been referenced? pmap_page_protect and place on tail of free list.
3. Page is dirty? Push to backing store and return it to inactive list tail (clean).

Consume frames from the head of the free list (free). If the free list shrinks below threshold, kick the paging daemon to start a scan (I, II).

Paging daemon typically scans a few times per second, even if not needed to restock the free list.

FIFO-2C in Action (FreeBSD)

[Plot not included in this excerpt.]

What Do the Pretty Colors Mean?

This is a plot of selected internal kernel events during a run of a process that randomly reads/writes its virtual memory.
- x-axis: time in milliseconds (total run is about 3 seconds)
- y-axis: each event pertains to a physical page frame, whose PFN is given on the y-axis
- The machine is an Alpha with roughly 8000 8KB pages (64MB total).
- The process uses 48MB of virtual memory, enough to force the paging daemon to do FIFO-2C bookkeeping but little actual paging.
- events: page allocate (yellow-green), page free (red), deactivation (duke blue), reactivation (lime green), page clean (carolina blue).

What to Look For

- Some low physical memory ranges are reserved to the kernel.
- Process starts and soaks up memory that was initially free.
- Paging daemon frees pages allocated to other processes, and the system reallocates them to the test process.
- After an initial flurry of demand-loading activity, things settle down after most of the process memory is resident.
- Paging daemon begins to run more frequently as memory becomes overcommitted (dark blue deactivation stripes).
- Test process touches pag...
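The FIFO-2C list manipulation (steps I and II) can be sketched in Python as follows. This is a simplified model under several assumptions, not FreeBSD's implementation: the names `Frame` and `Fifo2C` are hypothetical, the reference and dirty bits are plain fields that a test sets by hand rather than hardware bits, pushing a dirty page to backing store is modelled as completing instantly, and the scan checks the dirty bit before freeing an unreferenced page.

```python
from collections import deque


class Frame:
    """One physical page frame with software copies of its status bits."""

    def __init__(self, pfn):
        self.pfn = pfn
        self.referenced = False
        self.dirty = False


class Fifo2C:
    """Three FIFO lists: active, inactive, and free (head is the left end)."""

    def __init__(self, num_frames):
        self.active = deque()
        self.inactive = deque()
        self.free = deque(Frame(i) for i in range(num_frames))

    def allocate(self):
        """Page fault: consume a frame from the head of the free list."""
        frame = self.free.popleft()
        frame.referenced = False
        frame.dirty = False
        self.active.append(frame)          # install on tail of active list
        return frame

    def deactivate(self, count):
        """Step I: move pages from head of active to tail of inactive."""
        for _ in range(min(count, len(self.active))):
            frame = self.active.popleft()
            frame.referenced = False       # clear the ref bit to monitor the page
            self.inactive.append(frame)

    def scan_inactive(self):
        """Step II: one pass over the inactive list, starting at its head."""
        for _ in range(len(self.inactive)):
            frame = self.inactive.popleft()
            if frame.referenced:
                self.active.append(frame)      # reactivation
            elif frame.dirty:
                frame.dirty = False            # stand-in for the write to backing store
                self.inactive.append(frame)    # clean: back to inactive tail
            else:
                self.free.append(frame)        # free: tail of free list
```

In this sketch a dirty page needs two scans to reach the free list: the first scan cleans it, and the second frees it if it has not been referenced in the meantime.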
