HW1 Q1 [Dean04a]

b) What is the key, title, and authors of the paper?

[Dean04a] MapReduce: Simplified Data Processing on Large Clusters. Jeffrey Dean and Sanjay Ghemawat.

c) What are the key ideas or claims of the paper (1-2 sentences)?

This paper presents a programming model, MapReduce, and an implementation of it for generating and processing large datasets in parallel. It also describes a simple way to provide fault tolerance on large clusters comprising thousands of machines.

d) What is novel about the paper (1 sentence, note that this may be different than your answer to (a))?

The novelty of the paper is that it provides fault tolerance on large clusters for distributed and parallel processing, and does so simply and effectively using only two basic functions, map and reduce.

e) What methodology (or methodologies) does the paper use (1-2 sentences)?

The paper defines a map function that processes a key/value pair to generate intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. It provides a basic execution model and a method for handling faults easily. It also describes how the system scales to large clusters.

f) What is one way the paper could best be improved (1-3 sentences)?

The paper states that MapReduce can solve many real-world tasks with the same approach. It could therefore be improved by including tasks that are currently handled with other distributed and parallel programming methods and comparing MapReduce's performance against them. This would make it possible to relate MapReduce to the methods used for other parallel computing tasks, and future work could build on that to make MapReduce more general.

g) Do you consider this paper important or not, and why (1-2 sentences)?

MapReduce scales to large clusters comprising thousands of machines and also provides fault tolerance. A variety of problems, such as web search, sorting, data mining, and machine learning, can be solved easily with this simple data-processing model. These are the main reasons to consider this paper important.
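The map and reduce functions described in answer (e) can be illustrated with the word-count task. The following is only a minimal single-process sketch in Python, not the paper's distributed implementation: the names map_fn, reduce_fn, and run_mapreduce are hypothetical, the input documents are made up for illustration, and the grouping step is a stand-in for the shuffle that a real runtime performs across machines. Fault tolerance and scheduling are not modelled.

```python
from collections import defaultdict


def map_fn(doc_name, text):
    """Map: emit an intermediate (word, 1) pair for every word in one document."""
    for word in text.lower().split():
        yield word, 1


def reduce_fn(word, counts):
    """Reduce: merge all intermediate values that share the same intermediate key."""
    return word, sum(counts)


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Hypothetical in-memory driver: map every input, group by key, then reduce."""
    groups = defaultdict(list)
    for key, value in inputs:
        for ikey, ivalue in map_fn(key, value):
            groups[ikey].append(ivalue)  # stand-in for the distributed shuffle
    return dict(reduce_fn(k, v) for k, v in sorted(groups.items()))


# Made-up input: (document name, document contents) pairs.
docs = [("doc1", "the quick brown fox"), ("doc2", "the lazy dog the end")]
word_counts = run_mapreduce(docs, map_fn, reduce_fn)
print(word_counts)
```

Because the user supplies only map_fn and reduce_fn, the driver can in principle be swapped for a distributed runtime without changing the user's code, which is the separation the answers above attribute to the paper.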