Map-Reduce-Merge: Simplified Relational Data Processing on Large Clusters

Hung-chih Yang, Ali Dasdan
Yahoo!
Sunnyvale, CA, USA
{hcyang,dasdan}@yahoo-inc.com

Ruey-Lung Hsiao, D. Stott Parker
Computer Science Department, UCLA
Los Angeles, CA, USA
{rlhsiao,stott}@cs.ucla.edu

ABSTRACT

Map-Reduce is a programming model that enables easy development of scalable parallel applications to process vast amounts of data on large clusters of commodity machines. Through a simple interface with two functions, map and reduce, this model facilitates parallel implementation of many real-world tasks such as data processing for search engines and machine learning. However, this model does not directly support processing multiple related heterogeneous datasets. While processing relational data is a common need, this limitation causes difficulties and/or inefficiency when Map-Reduce is applied on relational operations like joins. We improve Map-Reduce into a new model called Map-Reduce-Merge. It adds to Map-Reduce a Merge phase that can efficiently merge data already partitioned and sorted (or hashed) by map and reduce modules. We also demonstrate that this new model can express relational algebra operators as well as implement several join algorithms.

Categories and Subject Descriptors
D.1.3 [Programming Techniques]: Concurrent Programming--Parallel programming; D.3.3 [Programming Languages]: Language Constructs and Features--Frameworks; H.2.4 [Database Management]: Systems--Parallel databases; Relational databases

General Terms
Design, Languages, Management, Performance, Reliability

Keywords
Cluster, Data Processing, Distributed, Join, Map-Reduce, Map-Reduce-Merge, Parallel, Relational, Search Engine

1. INTRODUCTION

Search engines process and manage a vast amount of data collected from the entire World Wide Web.
To do this task efficiently at reasonable cost, instead of relying on generic DBMS, they are usually built as customized parallel data processing systems and deployed on large clusters of shared-nothing commodity nodes.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGMOD'07, June 12-14, 2007, Beijing, China. Copyright 2007 ACM 978-1-59593-686-8/07/0006 ...$5.00.

In [3], based on his experience as Inktomi (now part of Yahoo!) co-founder, Eric Brewer advocated that building novel data-intensive systems (e.g., search engines) should "apply the principles of databases, rather than the artifacts." This is because DBMS are usually so general, with so many features, that some of those features become unnecessary overhead for specific applications such as search engines. Hence, search engine companies have developed and operated on "simplified" distributed storage and parallel programming infrastructures. These include Google's File System (GFS) [10], Map-Reduce [6], and BigTable [4]; Ask.com's Neptune (using the Data Aggregation Call (DAC) framework) [5]; and Microsoft's Dryad [13]. Yahoo! also has similar infrastructures. These infrastructures adopt only a selected subset of database principles, hence are "simplified," but they are sufficiently generic and effective that they can be easily adapted to data processing in search engines, machine learning, and bioinformatics. Following these useful but proprietary (non-publicly released) infrastructures, Hadoop [1] is an open-source implementation that is reminiscent of GFS and Map-Reduce and is released under the umbrella of the Apache Software Foundation.
Common to these infrastructures is the refactoring of data processing into two primitives: (a) a map function to process input key/value pairs and generate intermediate key/value pairs, and (b) a reduce function to merge all intermediate pairs associated with the same key and then generate outputs. The DAC framework has similar primitives, called local and reduce. These primitives allow users to develop and run parallel data processing tasks without worrying about the nuisance details of coordinating parallel sub-tasks and managing distributed file storage. This abstraction can greatly increase user productivity [6].

Though sufficiently generic to perform many real-world tasks, the Map-Reduce framework is best at handling homogeneous datasets. As indicated in [15], joining multiple heterogeneous datasets does not quite fit into the Map-Reduce framework, although it can still be done with extra Map-Reduce steps. For example, users can map and reduce one dataset and read data from other datasets on the fly. In short, processing data relationships, which is what RDBMS excel at, is perhaps not Map-Reduce's strong suit.

For a search engine, many data processing problems can be easily solved using the Map-Reduce framework, but there are some tasks that are best modeled as joins. For example, a search engine usually stores crawled URLs with their contents in a crawler database, inverted indexes in an index database, click or execution logs in a variety of log databases, and URL linkages along with miscellaneous URL properties in a webgraph database. These databases are gigantic and distributed over a large cluster of nodes. Moreover, their creation takes data from multiple sources: an index database needs both crawler and webgraph databases; a webgraph database needs both a crawler database and a previous version of the webgraph database.
To handle these tasks in the Map-Reduce framework, developers might end up writing awkward map/reduce code that processes one database while accessing others on the fly. Alternatively, they might treat these databases as homogeneous inputs to a Map-Reduce process but encode heterogeneity with an additional data-source attribute in the data and extra conditions in the code.

Processing data relationships is ubiquitous, especially in enterprise information systems. One major focus of the extremely popular relational algebra and RDBMS is to model and manage data relationships efficiently. Besides search engine tasks, another scenario for applying a join-enabled Map-Reduce framework is to join large databases across application, company, or even industry boundaries. For example, both airlines and hotel chains have huge databases. Joining these databases can permit data miners to extract more comprehensive rules than they could from either database alone. While many traditional (shared- or shared-nothing, cluster-based or massively parallel) RDBMS have been deployed in enterprise OLAP systems, a join-enabled Map-Reduce system can provide a highly parallel yet cost-effective alternative.

Based on these observations, we believe that one important improvement for the Map-Reduce framework is to include relational algebra in the subset of the database principles it upholds. That is, it should be further extended to support relational algebra primitives without sacrificing its existing generality and simplicity. The chief focus and contribution of this paper is this extension. We extend the Map-Reduce framework (shown in Fig. 1) to the Map-Reduce-Merge framework (shown in Fig. 2). This new framework introduces a naming and configuring scheme that extends Map-Reduce to processing heterogeneous datasets simultaneously. It also adds a new Merge phase that can join reduced outputs.
To recap, the contributions of this paper are as follows:

Abiding by Map-Reduce's "simplified" design philosophy, we augment the Map-Reduce framework by adding a Merge phase, so that it is more efficient and easier to process data relationships among heterogeneous datasets. Note that, while Map-Reduce tasks are usually stacked to form a linear user-managed workflow, adding a new Merge primitive can introduce a variety of hierarchical workflows for one data processing task. A Map-Reduce-Merge workflow is comparable to an RDBMS execution plan, but developers can embed programming logic in it and it is designed specifically for parallel data processing.

In a parallel setting, relational operators can be modeled using various combinations of the three functional-programming-based primitives: map, reduce, and merge. With proper configurations, these three primitives can be used to implement the parallel versions of several join algorithms: sort-merge, hash, and block nested-loop.

Figure 1: Data and control flow for Google's Map-Reduce framework. A driver program initiates a coordinator process. It remotely forks many mappers, then reducers. Each mapper reads file splits from GFS, applies user-defined logic, and creates several output partitions, one for each reducer. A reducer reads remotely from every mapper, sorts, groups the data, applies user-defined logic, and sends outputs to GFS.

In [12], Jim Gray et al. emphasized that there must be a "synthesis of database systems and file systems," as "file systems grow to petabyte-scale archives with billions of files." This vision applies not only to scientific data management, the focus of [12], but also to any data-intensive system such as a search engine. As stated in [12], Google's Map-Reduce framework not only abstracts parallel programming from data processing tasks, but it also abstracts files as just "containers for data" through its set-oriented model.
This "synthesis" vision echoes Brewer's "principle" idea, as Map-Reduce/GFS provides both views with a great example of database-oriented data processing. Jim Gray et al. also envisioned that simplified data/programming models like Google's Map-Reduce could evolve into more general ones in the coming decade. Our Map-Reduce-Merge proposal is a step towards that goal.

Figure 2: Data and control flow for the Map-Reduce-Merge framework. The coordinator manages two sets of mappers and reducers. After these tasks are done, it launches a set of mergers that read outputs from selected reducers and merge them with user-defined logic.

2. MAP-REDUCE

Google's Map-Reduce programming model and its underlying Google File System (GFS) focus mainly on supporting search-engine-related data processing. It has a simple programming interface, and, though seemingly restricted, it is actually quite versatile and generic. It can extend to data processing tasks beyond the search-engine domain. According to [6], it has also been heavily applied within Google for data-intensive applications such as machine learning.

2.1 Features and Principles

Contrary to traditional data processing and management systems, Map-Reduce and GFS are based on several unorthodox assumptions and counter-intuitive design principles:

Low-Cost Unreliable Commodity Hardware: Instead of using expensive, high-performance, and reliable symmetric multiprocessing (SMP) or massively parallel processing (MPP) machines equipped with high-end network and storage subsystems, most search engines run on large clusters of commodity hardware. This hardware is managed and powered by open-source operating systems and utilities, so that the cost is low.

Extremely Scalable RAIN Cluster: Instead of using centralized RAID-based SAN or NAS storage systems, every Map-Reduce node has its own local off-the-shelf hard drives. These nodes are loosely coupled in rackable systems connected with generic LAN switches. Loose coupling and shared-nothing architecture make Map-Reduce/GFS clusters highly scalable. These nodes can be taken out of service with almost no impact on still-running Map-Reduce jobs. These clusters are called Redundant Array of Independent (and Inexpensive) Nodes (RAIN) [18]. GFS is essentially a RAIN management system.

Fault-Tolerant yet Easy to Administer: Due to its high scalability, Map-Reduce jobs can run on clusters with thousands of nodes or even more. These nodes are not very reliable. At any point in time, a certain percentage of these commodity nodes or hard drives will be out of order. GFS and Map-Reduce are designed not to view this certain rate of failure as an anomaly; instead they use straightforward mechanisms to replicate data and launch backup tasks so as to keep still-running processes going. To handle crashed nodes, system administrators simply take crashed hardware off-line. New nodes can be plugged in at any time without much administrative hassle. There are no complicated backup, restore, and recovery configurations and/or procedures like the ones seen in many DBMS.

Simplified and Restricted yet Powerful: Map-Reduce is a restricted programming model; it provides only straightforward map and reduce interfaces. However, most search-engine (and generic) data processing tasks can be effectively implemented in this model. These tasks can immediately enjoy high parallelism with only a few lines of administration and configuration code. This "simplified" philosophy can also be seen in many GFS designs. Developers can focus on formulating their tasks to the Map-Reduce interface, without worrying about such issues as implementing memory management, file allocation, parallel, multithreaded, or network programming.

Highly Parallel yet Abstracted: The most important contribution of Map-Reduce is perhaps its automatic parallelization and execution. Even though it might not be optimized for a specific task, the productivity gain from developing an application with Map-Reduce is far higher than doing it from scratch on the same requirements. Map-Reduce allows developers to focus mainly on the problem at hand rather than worrying about the administrative details.

High Throughput: Deployed on low-cost hardware and modeled in simplified, generic frameworks, Map-Reduce systems are hardly optimized to perform like a massively parallel processing system deployed with the same number of nodes. However, these disadvantages (or advantages) allow Map-Reduce jobs to run on thousands of nodes at relatively low cost. A scheduling system places each Map and Reduce task at a near-optimal node (considering the vicinity to data and load balancing), so that many Map-Reduce tasks can share the same cluster.

High Performance by the Large: Even though Map-Reduce systems are generic, and not usually tuned to be high performance for specific tasks, they still can achieve high performance simply by being deployed on a large number of nodes. In [6], the authors mentioned a then world-record Terabyte [11] sorting benchmark achieved by using Map-Reduce on thousands of machines. In short, sheer parallelism can generate high performance, and Map-Reduce programs can take advantage of it.

Shared-Disk Storage yet Shared-Nothing Computing: In a Map-Reduce environment, every node has its own local hard drives. Mappers and reducers use these local disks to store intermediate files, and these files are read remotely by reducers, i.e., Map-Reduce is a shared-nothing architecture. However, Map-Reduce jobs read input from and write output to GFS, which is shared by every node. GFS replicates disk chunks and uses pooled disks to support ultra large files. Map-Reduce's shared-nothing architecture makes it much more scalable than one that shares disk or memory. In the meantime, Map and Reduce tasks share an integrated GFS that makes thousands of disks behave like one.

Set-Oriented Keys and Values; File Abstracted: With GFS's help, Map-Reduce can process thousands of file chunks in parallel. The volume can be far beyond the size limit set for an individual file by the underlying OS file system. Developers see data as keys and values, no longer as raw bits and bytes or file descriptors.

Functional Programming Primitives: The Map-Reduce interface is based on two functional-programming primitives [6]. Their signatures are reproduced here:

map: $(k_1, v_1) \to [(k_2, v_2)]$
reduce: $(k_2, [v_2]) \to [v_3]$

The map function applies user-defined logic on every input key/value pair and transforms it into a list of intermediate key/value pairs. The reduce function applies user-defined logic to all intermediate values associated with the same intermediate key and produces a list of output values. This simplified interface enables developers to model their specific data processing into two-phase parallel tasks. These signatures were informally defined for readability; they were not meant to be rigorous enough to pass a strongly-typed functional type checking mechanism. However, [14] pointed out that the reduce function output $[v_3]$ can be of a different type from its input $[v_2]$.

Distributed Partitioning/Sorting Framework: A Map-Reduce system also includes phases that work on the intermediate data, and users usually do not need to deal with them directly. These phases include a partitioner function that partitions mapper outputs to reducer inputs, a sort-by-key function that sorts reducer inputs based on keys, and a group-by-key function that groups sorted key/value pairs with the same key into a single key/value pair of the same key and all the values. In its pure form, the system is essentially a 2-phase parallel sorter similar to the one in NOW [2].
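The map and reduce signatures and the partition, sort-by-key, and group-by-key phases described above can be illustrated with a small single-process sketch. It is not Google's implementation: the `partition` helper (a CRC32 hash partitioner), the `run_map_reduce` driver, and the word-count functions are hypothetical names chosen for illustration, and the "cluster" is just a list of in-memory buckets.

```python
import zlib
from itertools import groupby


def partition(key, num_reducers):
    # Deterministic hash partitioner; CRC32 is an illustrative choice only.
    return zlib.crc32(repr(key).encode("utf-8")) % num_reducers


def run_map_reduce(records, map_fn, reduce_fn, num_reducers=2):
    """Map, partition, sort by key, group by key, then reduce."""
    # Map phase: (k1, v1) -> [(k2, v2)], routed to one bucket per reducer.
    buckets = [[] for _ in range(num_reducers)]
    for k1, v1 in records:
        for k2, v2 in map_fn(k1, v1):
            buckets[partition(k2, num_reducers)].append((k2, v2))

    # Reduce phase: each reducer sorts its bucket, groups equal keys,
    # and applies reduce: (k2, [v2]) -> [v3].
    outputs = []
    for bucket in buckets:
        bucket.sort(key=lambda pair: pair[0])
        for k2, group in groupby(bucket, key=lambda pair: pair[0]):
            outputs.append((k2, reduce_fn(k2, [v2 for _, v2 in group])))
    return outputs


def word_map(doc_id, text):
    # Hypothetical user-defined map logic: one (word, 1) pair per word.
    return [(word, 1) for word in text.split()]


def word_reduce(word, counts):
    # Hypothetical user-defined reduce logic: a one-element list [v3].
    return [sum(counts)]


docs = [("doc1", "map reduce merge"), ("doc2", "map reduce")]
word_counts = dict(run_map_reduce(docs, word_map, word_reduce))
print(word_counts)
```

Because every pair with the same key lands in the same bucket, each reducer sees all values for its keys, which is the property the group-by-key phase relies on.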
Designed for Search Engine Operations yet Applicable to Generic Data Processing Tasks: Map-Reduce is a generic framework, not limited to search engine operations. It can be applied to any data processing task that fits the simple map-reduce interface.

2.2 Homogenization

Despite all these advantages and design principles, Map-Reduce focuses mainly on processing homogeneous datasets. Through a process we call homogenization, Map-Reduce can be used to do equi-joins on multiple heterogeneous datasets. This homogenization process applies one map/reduce task to each dataset, inserting a data-source tag into every value. It also extracts a key attribute common to all the heterogeneous datasets. Transformed datasets now have two common attributes: key and data-source -- they are homogenized. A final map/reduce task can then be applied to all the homogenized datasets combined. Data entries from different datasets with the same key value will be grouped in the same reduce partition. User-defined logic can extract data-sources from values to identify their origins, then the entries from different sources can be merged. This procedure takes lots of extra disk space, incurs excessive map-reduce communications, and is limited only to queries that can be rendered as equi-joins. In the next section, we will discuss a general approach of extending Map-Reduce to efficiently process multiple heterogeneous datasets.

3. MAP-REDUCE-MERGE

The Map-Reduce-Merge model enables processing multiple heterogeneous datasets. The signatures of the Map-Reduce-Merge primitives are listed below, where $\alpha$, $\beta$, $\gamma$ represent dataset lineages, $k$ means keys, and $v$ stands for value entities.

map: $(k_1, v_1)_\alpha \to [(k_2, v_2)]_\alpha$
reduce: $(k_2, [v_2])_\alpha \to (k_2, [v_3])_\alpha$
merge: $((k_2, [v_3])_\alpha, (k_3, [v_4])_\beta) \to [(k_4, v_5)]_\gamma$

In this new model, the map function transforms an input key/value pair $(k_1, v_1)$ into a list of intermediate key/value pairs $[(k_2, v_2)]$. The reduce function aggregates the list of values $[v_2]$ associated with $k_2$ and produces a list of values $[v_3]$, which is also associated with $k_2$. Note that inputs and outputs of both functions belong to the same lineage, say $\alpha$. Another pair of map and reduce functions produces the intermediate output $(k_3, [v_4])$ from another lineage, say $\beta$. Based on keys $k_2$ and $k_3$, the merge function combines the two reduced outputs from different lineages into a list of key/value outputs $[(k_4, v_5)]$. This final output becomes a new lineage, say $\gamma$. If $\alpha = \beta$, then this merge function does a self-merge, similar to self-join in relational algebra.

Notice that the map and reduce signatures in the new model are almost the same as those in the original Map-Reduce. The only differences are the lineages of the datasets and the production of a key/value list from reduce instead of just values. These changes are introduced because the merge function needs input datasets organized (partitioned, then either sorted or hashed) by keys, and these keys have to be passed into the function to be merged. In Google's Map-Reduce, the reduced output is final, so users pack whatever is needed in $[v_3]$, while passing $k_2$ to the next stage is not required.

To build a merge function that reads data from both lineages in an organized manner, the design of these signatures emphasizes having the key $k_2$ passed from map to reduce, then to merge functions. This is to make sure that data is partitioned, then sorted (or hashed) on the same keys before they can be merged properly. This condition, however, is too strong. Keys can still be transformed between phases and they do not even need to be of the same type (as implied by the same type descriptor $k_2$ used in every phase) as long as records pointed to by transformed keys are still organized in the same way as those pointed to by the mapped keys represented by $k_2$. For example, 4-digit integers can be transformed into 4-byte numerical strings padded with 0s. The order of the integers and that of the transformed strings are the same, so they are compatible and replaceable between phases if compatible range partitioners are used in map functions. However, since users can already transform keys in the map function (from $k_1$ to $k_2$), there is hardly a need to transform them again in reduce and merge functions. Thus, to keep these signatures simple, we chose to have the same $k_2$ passed between phases.

As mentioned in [6], the map and reduce functions originate from functional programming. The merge function can be related to two-dimensional list comprehension, which is also popular in functional programming.

3.1 Example

In this section, we start with a simple example that will be continued in the next sections. It shows how Map, Reduce, and Merge modules work together. There are two datasets in this example: Employee and Department. Employee's "key" attribute is emp_id and the others are packed into an emp_info "value." Department's "key" is dept_id and the others are packed into a dept_info "value."

Figure 3: Example to join Employee and Department tables and compute employee bonuses (see Section 3.1).

Algorithm 1 Map function for the Employee dataset.

    map(const Key& key, /* emp_id */
        const Value& value /* emp_info */) {
      emp_id = key;
      dept_id = value.dept_id;
      /* compute bonus using emp_info */
      output_key = (dept_id, emp_id);
      output_value = (bonus);
      Emit(output_key, output_value);
    }

Algorithm 2 Map function for the Department dataset.

    map(const Key& key, /* dept_id */
        const Value& value /* dept_info */) {
      dept_id = key;
      bonus_adjustment = value.bonus_adjustment;
      Emit((dept_id), (bonus_adjustment));
    }

Algorithm 3 Reduce function for the Employee dataset.

    reduce(const Key& key, /* (dept_id, emp_id) */
           const ValueIterator& value
           /* an iterator for a bonuses collection */) {
      bonus_sum = /* sum up bonuses for each emp_id */
      Emit(key, (bonus_sum));
    }
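As a rough illustration of Alg. 1, 2, and 3, the mappers and the Employee reducer can be sketched in Python as below. Several details are assumptions made only so the sketch runs: the `sales` field and the "10% of sales" bonus rule are hypothetical (the paper only says a bonus is computed from emp_info), the sample records are invented, and `map_then_reduce` stands in for the framework's partition, sort, and group steps within a single process.

```python
from collections import defaultdict


def employee_map(emp_id, emp_info):
    # Alg. 1: re-key each entry by (dept_id, emp_id) and emit its bonus.
    bonus = emp_info["sales"] // 10  # hypothetical bonus rule
    yield (emp_info["dept_id"], emp_id), bonus


def department_map(dept_id, dept_info):
    # Alg. 2: emit the department's bonus adjustment keyed by dept_id.
    yield dept_id, dept_info["bonus_adjustment"]


def employee_reduce(key, bonuses):
    # Alg. 3: sum all bonuses that share one (dept_id, emp_id) key.
    yield key, sum(bonuses)


def map_then_reduce(records, map_fn, reduce_fn):
    # Stand-in for the framework: group mapper output by key, then call
    # the reducer on each group in sorted key order.
    grouped = defaultdict(list)
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            grouped[out_key].append(out_value)
    reduced = []
    for out_key in sorted(grouped):
        reduced.extend(reduce_fn(out_key, grouped[out_key]))
    return reduced


# Invented sample data: employee e1 has two entries.
employees = [
    ("e1", {"dept_id": "d1", "sales": 100}),
    ("e2", {"dept_id": "d2", "sales": 200}),
    ("e1", {"dept_id": "d1", "sales": 50}),
]
departments = [
    ("d2", {"bonus_adjustment": 3}),
    ("d1", {"bonus_adjustment": 2}),
]

employee_bonuses = map_then_reduce(employees, employee_map, employee_reduce)
department_adjustments = sorted(
    pair for dept_id, info in departments for pair in department_map(dept_id, info)
)
print(employee_bonuses)
print(department_adjustments)
```

Both outputs end up sorted with dept_id as the leading key component, which is the organization a later sort-merge step on dept_id would need.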
One example query is to join these two datasets and compute employee bonuses.

Before these two datasets are joined in a merger, they are first processed by a pair of mappers and reducers. A complete data flow is shown in Fig. 3. On the left-hand side, a mapper reads Employee entries and computes a bonus for each entry. A reducer then sums up these bonuses for every employee and sorts them by dept_id, then emp_id. On the right-hand side, a mapper reads Department entries and computes bonus adjustments. A reducer then sorts these department entries. At the end, a merger matches the output records from the two reducers on dept_id using the sort-merge algorithm and applies a department-based bonus adjustment to employee bonuses. Pseudocode for these mappers and reducers is shown in Alg. 1, 2, 3, and 4.

After these two pairs of Map-Reduce tasks are finished, a merger task takes their intermediate outputs and joins them on dept_id. We will describe the details of the major merge components in the following sections.

3.2 Implementation

We have implemented a Map-Reduce-Merge framework, in which Map and Reduce components are inherited from Google Map-Reduce except for minor signature changes. The new Merge module includes several new components: merge function, processor function, partition selector, and configurable iterator. We will use the employee-bonus example to explain the data and control flow of this framework and how these components collaborate.

The merge function (merger) is like map or reduce, in which developers can implement user-defined data processing logic. While a call to a map function (mapper) processes a key/value pair, and a call to a reduce function (reducer) processes a key-grouped value collection, a merger processes two key/value pairs, each of which comes from a distinguishable source.

At the Merge phase, users might want to apply different data-processing logic on data based on their sources.
An example is the build and probe phases of a hash join, where build programming logic is applied on one table then probe the other. To accommodate this pattern, a processor is a user-defined function that processes data from one source only. Users can define two processors in Merge. After map and reduce tasks are about done, a Map-ReduceMerge coordinator launches mergers on a cluster of nodes (see Fig. 2). When a merger starts up, it is assigned with a merger number. Using this number, a user-definable module called partition selector can determine from which reducers this merger retrieves its input data. Mappers and reducers are also assigned with a number. For mappers, this number represents the input file split. For reducers, this number represents an input bucket, in which mappers partition and store their output data to. For Map-Reduce users, these numbers are simply system implementation detail, but in Map-Reduce-Merge, users utilize these numbers to associate input/output between mergers and reducers in partition selectors. Like mappers and reducers, a merger can be considered as having logical iterators that read data from inputs. Each mapper and reducer have one logical iterator and it moves from the begin to the end of a data stream, which is an input file split for a mapper, or a merge-sorted stream for a reducer. A merger reads data from two sources, so it can be viewed as having two logical iterators. These iterators usually move forward as their mapper/reducer counterparts, but their relative movement against each others can be instrumented to implement a user-defined merge algorithm. Our Map-Reduce-Merge framework provides a user-configurable module (iterator-manager ) that it is called for the information that controls the movement of these configurable iterators. Later, we will describe several iteration patterns from relational join algorithms. A Merge phase driver, as shown in Alg. 
5, is needed to coordinate these 1033 Figure 4: A 2-way Map-Reduce-Merge data flow. Data is processed by a mapper, partitioner, and combiner in the Map phase. Then, it is read remotely and processed by a sorter and reducer in the Reduce phase. In the Merge phase, selected reducer outputs are processed by a matcher and merger guided by a pair of configurable iterators. Algorithm 4 Reduce function for the Department dataset. 1: reduce(const Key& key, /* (dept id) */ 2: const ValueIterator& value 3: /* an iterator on a bonus adjustments collection */) { 4: /* aggregate bonus adjustments and 5: compute a final bonus adjustment */ 6: Emit(key, (bonus adjustment)); 7: } 3.2.2 Processors Merge components and have them collaborate with each others. A processor is the place where users can define logic of processing data from an individual source. Processors can be defined if the hash join algorithm is implemented in Merge, where the first processor builds a hash table on the first source, and the second probes it while iterating through the second data source. In this case, the merger function is empty. Since we will apply the sort-merge algorithm on the bonus-computation join example, these processors stay empty. 3.2.3 3.2.1 Partition Selector In a merger, a user-defined partition selector function determines which data partitions produced by up-stream reducers should be retrieved then merged. This function is given the current merger's number and two collections of reducer numbers, one for each data source. Users define logic in the selector to remove unrelated reducers from the collections. Only the data from the reducers left in the collections will be read and merged in the merger. 
For the employee-bonus example, a simplified scenario stipulates that both sources have the same collection of reducer numbers and that the same range partitioner function is applied to the dept_id key only in both mappers, so that both reducer outputs are completely sorted and partitioned into an equal number of buckets. Notice that the employee mapper produces keys in pairs of (dept_id, emp_id); thus its reducer sorts data on this composite key, but partitioning is done on dept_id only. Based on these assumptions, a partition selector function can be defined to map reducers and mergers in a one-to-one relationship, as in Alg. 6.

Algorithm 6 One-to-one partition selector.

    #include <algorithm>
    #include <vector>
    using namespace std;

    bool select(int mergerNumber,
                vector<int>& leftReducerNumbers,
                vector<int>& rightReducerNumbers) {
      // this merger needs a same-numbered reducer on both sides
      if (find(leftReducerNumbers.begin(),
               leftReducerNumbers.end(),
               mergerNumber) == leftReducerNumbers.end()) {
        return false; }
      if (find(rightReducerNumbers.begin(),
               rightReducerNumbers.end(),
               mergerNumber) == rightReducerNumbers.end()) {
        return false; }
      // keep only the reducer that shares this merger's number
      leftReducerNumbers.clear();
      leftReducerNumbers.push_back(mergerNumber);
      rightReducerNumbers.clear();
      rightReducerNumbers.push_back(mergerNumber);
      return true;
    }

Merger. In the merge function, users can implement data processing logic on data merged from two sources where this data satisfies a merge condition. Alg. 7 shows the last step of computing employee bonuses by adjusting an employee's raw bonus with a department-based adjustment.

Algorithm 7 Merge function for the employee-department join.

    void merge(const LeftKey& leftKey,       /* (dept_id, emp_id) */
               const LeftValue& leftValue,   /* sum of bonuses */
               const RightKey& rightKey,     /* dept_id */
               const RightValue& rightValue  /* bonus-adjustment */) {
      if (leftKey.dept_id == rightKey) {
        bonus = leftValue * rightValue;
        Emit(leftKey.emp_id, bonus); }
    }

3.2.4 Configurable Iterators

As indicated, by manipulating the relative iteration of a merger's two logical iterators, users can implement different merge algorithms. For algorithms like nested-loop joins, iterators are configured to move as looping variables in a nested loop. For algorithms like sort-merge joins, iterators take turns when iterating over two sorted collections of records. For hash-join-like algorithms, these two iterators scan over their data in separate passes. The first scans its data and builds a hash table, then the second scans its data and probes the already built hash table.

Allowing users to control iterator movement increases the risk of running into a never-ending loop. This risk always exists in user-defined logic and is a great concern, especially in strictly regulated DBMS systems. For programming models like Map-Reduce and Map-Reduce-Merge, this issue is less serious because they are, after all, programming models and data processing frameworks. Still, it is a nuisance if a task never ends, so a framework should provide a mechanism to reduce the chance of it happening.

In our implementation, we use a boolean pair returned by a user-defined function to indicate whether to move an iterator to point to the next entity. This function is called after each merge operation; true indicates forward and false indicates stay. If both booleans are false, then the whole merge process is terminated.

Algorithm 5 Merge phase driver.

    PartitionSelector partitionSelector;  // user-defined logic
    LeftProcessor leftProcessor;          // user-defined logic
    RightProcessor rightProcessor;        // user-defined logic
    Merger merger;                        // user-defined logic
    IteratorManager iteratorManager;      // user-defined logic
    int mergerNumber;                     // assigned by system
    vector<int> leftReducerNumbers;       // assigned by system
    vector<int> rightReducerNumbers;      // assigned by system

    // select and filter left and right reducer outputs for this merger
    partitionSelector.select(mergerNumber,
                             leftReducerNumbers,
                             rightReducerNumbers);
    ConfigurableIterator left = /* initiated to point to entries
                                   in reduce outputs by leftReducerNumbers */;
    ConfigurableIterator right = /* initiated to point to entries
                                    in reduce outputs by rightReducerNumbers */;
    // leftkey/leftvalue and rightkey/rightvalue denote the entries
    // that the two iterators currently point to
    while (true) {
      pair<bool,bool> hasMoreTuples =
          make_pair(hasNext(left), hasNext(right));
      if (!hasMoreTuples.first && !hasMoreTuples.second) { break; }
      if (hasMoreTuples.first) {
        leftProcessor.process(leftkey, leftvalue); }
      if (hasMoreTuples.second) {
        rightProcessor.process(rightkey, rightvalue); }
      if (hasMoreTuples.first && hasMoreTuples.second) {
        merger.merge(leftkey, leftvalue,
                     rightkey, rightvalue); }
      pair<bool,bool> iteratorNextMove =
          iteratorManager.move(leftkey, rightkey, hasMoreTuples);
      if (!iteratorNextMove.first && !iteratorNextMove.second) {
        break; }
      if (iteratorNextMove.first) { left++; }
      if (iteratorNextMove.second) { right++; }
    }

Suppose reducers produce sorted outputs in ascending order; Alg. 8 then shows the programming logic of coordinating iterator movement for sort-merge-like algorithms. If both sources still have inputs, then move the iterator that points to the smaller key. If both keys are equivalent, then move the right iterator by default. If one source is exhausted (this information is stored in the input bool pair hasMoreTuples), then move the iterator for the source that still has data.

Algorithm 8 Iteration logic for sort-merge joins.

    pair<bool,bool> move(const LeftKey& leftKey,
                         const RightKey& rightKey,
                         const pair<bool,bool>& hasMoreTuples) {
      if (hasMoreTuples.first && hasMoreTuples.second) {
        if (leftKey < rightKey) {
          return make_pair(true, false); }
        // equal keys, or the right key is smaller: move right
        return make_pair(false, true); }
      return hasMoreTuples;
    }

Alg. 9 is an implementation of the nested-loop iteration pattern. In a nested loop, keys are ignored in determining how to move iterators. If both the left and right sources are exhausted, then the merge process is terminated. It is a logic error if the right source still has data when the left is exhausted. If the left source is not exhausted, then move the right iterator only. When the right source is exhausted, move the left iterator and reset the right iterator to the beginning of its data source.

To implement algorithms that follow the hash join's two-scan iteration pattern, a merger first scans one data source from the beginning to the end, then repeats the scan on the other one; e.g., see Alg. 10.

Notice that, for the employee-bonus example, implementing configurable iterators is tied to the choice of partitioners. Using the sort-merge-based configurable iterators requires a range partitioner in both mappers.

4. APPLICATIONS TO RELATIONAL DATA PROCESSING

One fundamental idea of Map-Reduce-Merge is to bring relational operations into parallel data processing at the search-engine scale. On the other hand, map, reduce, and merge can be used as standardized components in implementing a parallel OLAP DBMS. Novel data-processing applications such as search engines, together with Map-Reduce's unorthodox principles and assumptions, make it worthwhile to revisit parallel databases [7, 16].
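Before turning to individual operators, it may help to see how the merge-phase pieces compose for the employee-bonus join. The following is a minimal single-partition sketch in the spirit of Algs. 5, 7, and 8, not the framework's actual code. The record types (EmpKey, EmpRecord, DeptRecord, Output) and the function names (mergeBonus, nextMove, runMerger) are hypothetical, and the left and right processors are omitted. One deliberate deviation is also an assumption of this sketch: Alg. 8 advances the right iterator on equal keys, whereas here the left (employee) side holds many records per department, so ties advance the left iterator to let every employee meet its department record.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record types for this sketch (not defined in the paper).
struct EmpKey { int dept_id; std::string emp_id; };
using EmpRecord  = std::pair<EmpKey, double>;  // (dept_id, emp_id) -> raw bonus
using DeptRecord = std::pair<int, double>;     // dept_id -> bonus adjustment
using Output     = std::vector<std::pair<std::string, double>>;

// Merge function modeled on Alg. 7: emit an adjusted bonus on a key match.
void mergeBonus(const EmpKey& leftKey, double rawBonus,
                int rightKey, double adjustment, Output& out) {
  if (leftKey.dept_id == rightKey) {
    out.emplace_back(leftKey.emp_id, rawBonus * adjustment);
  }
}

// Iteration logic modeled on Alg. 8, except that ties advance the LEFT
// iterator (assumption of this sketch: left is the many side of the join).
std::pair<bool, bool> nextMove(const EmpKey& leftKey, int rightKey,
                               const std::pair<bool, bool>& hasMoreTuples) {
  if (hasMoreTuples.first && hasMoreTuples.second) {
    if (leftKey.dept_id <= rightKey) { return {true, false}; }
    return {false, true};
  }
  return hasMoreTuples;  // drain whichever source still has data
}

// Simplified driver loop modeled on Alg. 5, for one pair of sorted partitions.
Output runMerger(const std::vector<EmpRecord>& left,
                 const std::vector<DeptRecord>& right) {
  Output out;
  std::size_t i = 0, j = 0;
  while (true) {
    std::pair<bool, bool> hasMore(i < left.size(), j < right.size());
    if (!hasMore.first && !hasMore.second) { break; }
    // Placeholder keys stand in for an exhausted source; nextMove()
    // does not compare keys unless both sources still have data.
    EmpKey leftKey{0, ""};
    int rightKey = 0;
    if (hasMore.first) { leftKey = left[i].first; }
    if (hasMore.second) { rightKey = right[j].first; }
    if (hasMore.first && hasMore.second) {
      mergeBonus(leftKey, left[i].second, rightKey, right[j].second, out);
    }
    std::pair<bool, bool> next = nextMove(leftKey, rightKey, hasMore);
    if (!next.first && !next.second) { break; }  // both false: terminate
    if (next.first) { ++i; }
    if (next.second) { ++j; }
  }
  return out;
}
```

For instance, with employees ((1, e1), 100), ((1, e2), 200), ((2, e3), 50) and departments (1, 1.5), (2, 2.0), both sorted by dept_id, runMerger emits (e1, 150), (e2, 300), (e3, 100). Because every call to nextMove advances at least one iterator while data remains, the loop terminates once both inputs are consumed.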
4.1 Map-Reduce-Merge Implementations of Relational Operators

In our implementation, the Map-Reduce-Merge model assumes that a dataset is mapped into a relation R with an attribute set (schema) A. In map, reduce, and merge functions, users choose attributes from A to form two subsets: K and V. K represents the schema of the "key" part of a Map-Reduce-Merge record and V the "value" part. For each tuple t of R, this implies that t is the concatenation of two field sets, k and v, where K is the schema of k and V is the schema of v.

It so happens that Map-Reduce-Merge calls k the "key" and v the "value". This naming is arbitrary in the sense that their attribute sets are decided solely by the user. This "key" is used in Map-Reduce-Merge functions for partitioning, sorting, grouping, matching, and merging tuples. It by no means carries the uniqueness meaning that a key has in relational languages.

Below we describe how Map-Reduce-Merge can be used to implement primitive and some derived relational operators, showing that Map-Reduce-Merge is relationally complete while being load-balanced, scalable, and parallel.

Projection: For each tuple t = (k, v) of the input relation, users can define a mapper to transform it into a projected output tuple t' = (k', v'), where k' and v' are typed by schemas K' and V', respectively. K' and V' are subsets of A. Namely, mappers alone can implement relational algebra's projection operator.

Aggregation: At the Reduce phase, Map-Reduce (as well as Map-Reduce-Merge) performs the sort-by-key and group-by-key functions to ensure that the input to a reducer is a set of tuples t = (k, [v]) in which [v] is the collection of all the values associated with the key k. A reducer can call aggregate functions on this grouped value list. Namely, reducers can easily implement the "group by" clause and "aggregate" operators in SQL.

Generalized Selection: Mappers, reducers, and mergers can all act as filters and implement the selection operator. If a selection condition is on attributes of one data source, then it can be implemented in mappers. If a selection condition is on aggregates or a group of values from one data source, then it can be implemented in reducers. If a selection condition involves attributes or aggregates from more than one source, then it can be implemented in mergers. Straightforward filtering conditions that involve only one relation in a SQL query's "where" and "having" clauses can be implemented using mappers and reducers, respectively. Mergers can implement complicated filtering conditions involving more than one relation; however, this filtering can only be accomplished after join (or Cartesian product) operations are properly configured and executed.

Joins: Section 4.2 describes in detail how joins can be implemented using mergers with the help of mappers and reducers.

Set Union: Assume the union operation (as well as the other set operations described below) is performed over two relations. In Map-Reduce-Merge, each relation will be processed by Map-Reduce, and the sorted and grouped outputs of the reducers will be given to a merger. In each reducer, duplicated tuples from the same source can be skipped easily. The mappers for the two sources should share the same range partitioner, so that a merger can receive records within the same key range from the two reducers. The merger can then iterate on each input simultaneously and produce only one tuple if two input tuples from different sources are duplicates. Non-duplicated tuples are produced by this merger as well.

Set Intersection: First, partitioned and sorted Map-Reduce outputs are sent to mergers as described in the last item. A merger can then iterate on each input simultaneously and produce tuples that are shared by the two reducer outputs.

Set Difference: First, partitioned and sorted Map-Reduce outputs are sent to mergers as described in the last item. A merger can then iterate on each input simultaneously and produce tuples that are the difference of the two reducer outputs.

Cartesian Product: In a Map-Reduce-Merge task, the two reducer sets will produce two sets of reduced partitions. A merger is configured to receive one partition from the first reducer set (F) and the complete set of partitions from the second one (S). This merger can then form a nested loop to merge records in the sole F partition with the ones in every S partition.

Rename: It is trivial to emulate Rename in Map-Reduce-Merge, since map, reduce, and merge functions can select, rearrange, compare, and process attributes based on their indexes in the "key" and "value" subsets.

Map-Reduce-Merge is certainly more expressive than the relational algebra, since map, reduce, and merge can all contain user-defined programming logic.

Algorithm 9 Iteration logic for nested-loop joins.

    pair<bool,bool> move(const LeftKey& leftKey,
                         const RightKey& rightKey,
                         const pair<bool,bool>& hasMoreTuples) {
      if (!hasMoreTuples.first && !hasMoreTuples.second) {
        return make_pair(false, false); }
      if (!hasMoreTuples.first && hasMoreTuples.second) {
        /* throw a logical-error exception */ }
      if (hasMoreTuples.first && !hasMoreTuples.second) {
        /* reset the right iterator to the beginning */
        return make_pair(true, false); }
      // inner loop: advance the right iterator only
      return make_pair(false, true);
    }

Algorithm 10 Iteration logic for hash joins.

    pair<bool,bool> move(const LeftKey& leftKey,
                         const RightKey& rightKey,
                         const pair<bool,bool>& hasMoreTuples) {
      if (!hasMoreTuples.first && !hasMoreTuples.second) {
        return make_pair(false, false); }
      // first pass: scan the left source to the end
      if (hasMoreTuples.first) {
        return make_pair(true, false); }
      // second pass: scan the right source
      return make_pair(false, true);
    }

4.2 Map-Reduce-Merge Implementations of Relational Join Algorithms

Join is perhaps the most important relational operator. In this section, we describe how Map-Reduce-Merge can implement the three most common join algorithms.

4.2.1 Sort-Merge Join

From [6], Map-Reduce is shown to be an effective parallel sorter.
The key to sorting is to partition input records based on their actual values instead of, by Map-Reduce default, their hashed values. That is, instead of using a hash partitioner, users can configure the framework to use a range partitioner in mappers. Using this Map-Reduce-based sorter, the Map-Reduce-Merge framework can be implemented as a parallel sort-merge join operator. The programming logic for each phase is:

Map: Use a range partitioner in mappers, so that records are partitioned into ordered buckets, each covering a mutually exclusive key range and designated to one reducer.

Reduce: For each Map-Reduce lineage, a reducer reads the designated buckets from all the mappers. Data in these buckets are then merged into a sorted set. This sorting procedure can be done completely at the reducer side, if necessary through an external sort. Alternatively, mappers can sort data in each bucket before sending them to reducers. Reducers can then just do the merge part of the merge sort using a priority queue.

Merge: A merger reads from two sets of reducer outputs that cover the same key range. Since these reducer outputs are sorted already, this merger simply does the merge part of the sort-merge join.

4.2.2 Hash Join

One important issue in distributed computing and parallel databases is to keep workload and storage balanced among nodes. One strategy is to disseminate records to nodes based on their hash values. This strategy is very popular in search engines as well as in parallel databases. It is the default partitioning mechanism in Map-Reduce [6] and the only partitioning strategy in Teradata [16], a parallel RDBMS. Another approach is to run a preprocessing Map-Reduce task to scan the whole dataset and build a data density [6]. This density can be used by partitioners in later Map-Reduce tasks to ensure balanced workload among nodes.

Here we show how to implement hash join [8] using the Map-Reduce-Merge framework:

Map: Use a common hash partitioner in both mappers, so that records are partitioned into hashed buckets, each designated to one reducer.

Reduce: For each Map-Reduce lineage, a reducer reads from every mapper for one designated partition. Using the same hash function from the partitioner, records from these partitions can be grouped and aggregated using a hash table. This hash-based grouping is an alternative to the default sorting-based approach. It does not need a sorter, but requires maintaining a hash table either in memory or on disk.

Merge: A merger reads from two sets of reducer outputs that share the same hashing buckets. One is used as the build set and the other as the probe set. After the partitioning and grouping are done by mappers and reducers, the build set can be quite small, so these sets can be hash-joined in memory. Notice that the number of reduce/merge sets must be set to an optimally large number in order to support an in-memory hash join; otherwise, an external hash join is required.

4.2.3 Block Nested-Loop Join

The Map-Reduce-Merge implementation of the block nested-loop join algorithm is very similar to the one for the hash join. Instead of doing an in-memory hash, a nested loop is implemented. The partitioning and grouping done by mappers and reducers concentrate the join sets, so this parallel nested-loop join can enjoy a high selectivity in each merger.

Map: Same as the one for the hash join.

Reduce: Same as the one for the hash join.

Merge: Same as the one for the hash join, but a nested-loop join is implemented instead of a hash join.

5. OPTIMIZATIONS

Map-Reduce provides several optimization mechanisms, including locality and backup tasks [6]. In this section, we describe some strategies that can reduce the resources (e.g., the number of network connections and disk bandwidth) used in the Merge phase.

5.1 Optimal Reduce-Merge Connections

For a natural join over two datasets, A and B, suppose that for A there are M_A mappers and R_A reducers, and for B, M_B mappers and R_B reducers. Each A mapper produces R_A partitions, and each B mapper R_B. Conversely, each A reducer reads from every A mapper for the partitions designated for it. The same applies to B reducers and B mappers. To simplify the scenario, let R_A = R_B = R; then in total there would be at least R(M_A + M_B) remote reads (not counting redundant connections incurred by backup jobs) among the nodes where mappers and reducers reside. This is a lot of remote reads among nodes, but it is the price to pay to group and aggregate same-key records, as these records were originally scattered around the whole cluster.

For mergers, because data is already partitioned and even sorted after the Map and Reduce phases, they do not need to connect to every reducer in order to get their data. The selector function in mergers can choose pertinent reduced partitions for merging. For example, in a simplified scenario, if there are also R mergers, then these mergers can have a one-to-one association with A reducers and also with B reducers. A user-defined selector can be like the one shown in Alg. 6. This selector receives two collections of reducer numbers for A and B reducers. It then picks the reducers that share the same number with the merger and removes the other reducers' numbers from the collections. The merger then uses the selected reducer numbers to set up connections with, and request data from, these reducers. In the one-to-one case, the number of connections between reducers and mergers is 2R.

If one input dataset is much larger than the other, then it would be inefficient to partition both datasets into the same number of reducers.
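For the equal-partition, one-to-one case just described, the two counts R(M_A + M_B) and 2R can be checked with a short calculation. The sketch below is illustrative only: selectOneToOne follows the behavior described for Alg. 6, while mapToReduceReads, reduceToMergeConnections, and the cluster sizes used in the note afterwards are hypothetical names and values introduced here, not part of the paper.

```cpp
#include <algorithm>
#include <vector>

// One-to-one selection in the spirit of Alg. 6: on each side, keep only
// the reducer whose number equals the merger's number.
bool selectOneToOne(int mergerNumber,
                    std::vector<int>& leftReducerNumbers,
                    std::vector<int>& rightReducerNumbers) {
  auto has = [mergerNumber](const std::vector<int>& v) {
    return std::find(v.begin(), v.end(), mergerNumber) != v.end();
  };
  if (!has(leftReducerNumbers) || !has(rightReducerNumbers)) { return false; }
  leftReducerNumbers.assign(1, mergerNumber);
  rightReducerNumbers.assign(1, mergerNumber);
  return true;
}

// Remote reads between mappers and reducers when R_A = R_B = R:
// every reducer reads from every mapper of its own dataset.
long mapToReduceReads(long r, long mA, long mB) { return r * (mA + mB); }

// Count reducer-to-merger connections by running the selector for each of
// the R mergers and summing the reducers each merger still has to contact.
long reduceToMergeConnections(int r) {
  long connections = 0;
  for (int merger = 0; merger < r; ++merger) {
    std::vector<int> left, right;
    for (int n = 0; n < r; ++n) { left.push_back(n); right.push_back(n); }
    if (selectOneToOne(merger, left, right)) {
      connections += static_cast<long>(left.size() + right.size());
    }
  }
  return connections;
}
```

With hypothetical sizes R = 4, M_A = 10, and M_B = 6, mapToReduceReads(4, 10, 6) gives 64 mapper-to-reducer reads, while reduceToMergeConnections(4) gives 8 reducer-to-merger connections, that is, 2R. The gap between the two figures is what makes a selective merger cheap relative to the shuffle that precedes it.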
One can choose different numbers for R_A and R_B, but the selection logic is then more complicated. Selector logic can also be quite complicated in the case of a θ-join. However, the selector is an optimization mechanism that can help avoid excessive remote reads. A naive selection can always put only the merger number in one reducer number set and leave the other set intact (see the selection logic in Alg. 11) and still get the correct result. This is basically a Cartesian product between two reduced sets. The number of remote reads now becomes R^2 + R.

Before feeding data from selected reducer partitions to a user-defined merger function, these tuples can be compared to see whether they should be merged or not. In short, this comparison can be done in a user-defined matcher that is simply a fine-grained selector.

Algorithm 11 Cartesian-product partition selector.

    #include <algorithm>
    #include <vector>
    using namespace std;

    bool select(int mergerNumber,
                vector<int>& leftReducerNumbers,
                vector<int>& rightReducerNumbers) {
      if (find(leftReducerNumbers.begin(),
               leftReducerNumbers.end(),
               mergerNumber) == leftReducerNumbers.end()) {
        return false; }
      // keep one left reducer; leave the right set intact
      leftReducerNumbers.clear();
      leftReducerNumbers.push_back(mergerNumber);
      return true;
    }

5.2 Combining Phases

To accomplish a data processing task, it usually takes several Map-Reduce-Merge (or Map-Reduce) processes woven into a workflow, in which the output of one process becomes the input of a subsequent one. The entire workflow may constitute many disk read-write passes. For example, Fig. 6 shows a TPC-H Q2 join tree implemented with 13 Map-Reduce-Merge passes. These passes can be optimized and combined:

ReduceMap, MergeMap: Reducer and merger outputs are usually fed into a downstream mapper for a subsequent join operation. These outputs can simply be sent directly to a co-located mapper in the same process without storing them in secondary storage first.

ReduceMerge: A merger usually takes two sets of reducer partitions.
This merger can be combined with one of the reducers and get its output directly while remotely reading data from the other set of reducers.

ReduceMergeMap: A straightforward combination of ReduceMerge and MergeMap becomes ReduceMergeMap.

Another way of reducing disk accesses is to replace disk read-writes with network read-writes. This method requires connecting upstream and downstream Map-Reduce-Merge processes while they are running. This approach is arguably more complicated than saving intermediate data on local disks; thus it may not comply with the "simplified" philosophy of the Map-Reduce framework. When a process fails, ...
