
# Assignment5_Report.pdf - CS6200 Section 2 Assignment 5...


CS6200 Section 2 Assignment 5 Amogh Huilgol

Design Discussion and Pseudo-Code

For each step of the algorithm, briefly explain the main idea for how you partition the problem. Show compact pseudo-code for it. It is up to you to decide how to map the computation steps to MapReduce jobs. For example, you can merge multiple steps into a single job, or you can use multiple jobs to implement a single step. Make sure you discuss and show pseudo-code for both versions of step 3. (20 points)

Step 1: Create matrix 𝐌 from either the original input or the adjacency-list representation from a previous assignment. (But also see the discussion about sparse matrix representation below!)

We create the adjacency lists as in previous assignments, i.e. one record per page of the form PageName: Adjacency List. The reduce call performs the following functions:

a) Computes dead links
b) Computes the offset with respect to each worker
c) Builds the local pageName : Number map

The partitioning ensures that each worker gets 1/n of the dataset.

The pseudo-code for Step 1 is as follows:

    Map(key k, line l) {
        nodeId = getKey(l)               // parser given as part of the homework
        adjList = getAdjacencyList(l)
        emit(nodeId, adjList)
    }
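The map step and the reducer bookkeeping in (b) and (c) can be mimicked in memory with a short Python sketch. This is an illustration under stated assumptions, not the report's Hadoop code: `run_step1`, `partition`, and `NUM_PARTITIONS` are hypothetical names, the CRC32 partitioner stands in for whatever partitioner the job uses, and deriving each offset as a running sum of the per-task page counts is an assumption about how the offsets are obtained. Dead-link detection (a) is left out.

```python
from collections import defaultdict
from itertools import accumulate
import zlib

NUM_PARTITIONS = 2  # hypothetical number of reduce tasks


def map_record(node_id, adj_list):
    """Map: emit the page name with its (possibly partial) adjacency list."""
    yield node_id, list(adj_list)


def partition(node_id, num_partitions=NUM_PARTITIONS):
    """Stable hash partitioner (assumption: any deterministic hash would do)."""
    return zlib.crc32(node_id.encode("utf-8")) % num_partitions


def run_step1(records, num_partitions=NUM_PARTITIONS):
    # Shuffle: group adjacency lists by page name inside each partition.
    shuffled = [defaultdict(list) for _ in range(num_partitions)]
    for node_id, adj_list in records:
        for key, value in map_record(node_id, adj_list):
            shuffled[partition(key, num_partitions)][key].append(value)

    adjacency = {}
    local_maps = []
    for groups in shuffled:
        local_map = {}  # page name -> local number, one map per reduce task
        for counter, name in enumerate(sorted(groups)):
            local_map[name] = counter
            # Merge every adjacency list seen for this page.
            adjacency[name] = [link for adj in groups[name] for link in adj]
        local_maps.append(local_map)

    counts = [len(local_map) for local_map in local_maps]
    # Assumed rule: offset of task i = pages numbered by tasks 0..i-1.
    offsets = [0] + list(accumulate(counts))[:-1]
    return adjacency, local_maps, offsets
```

Only pages that appear as a key are numbered here; pages that occur solely as link targets would need separate handling.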

    Class Reduce {
        method setup() {
            counter = 0
            HashMap<String, Number> nameToNumMap    // local pageName -> number map
        }

        method reduce(node_name n, List[adjLists]) {
            nameToNumMap.put(n, counter++)
            aggrAdjLinks = []
            for each adjList in adjLists
                aggrAdjLinks += adjList
            emit(n, aggrAdjLinks)
        }

        method cleanup() {
            emit(MapId, nameToNumMap)    // local map built by this task
            emit(MapId, counter)         // number of pages this task numbered
        }
    } // end of Reduce class

Now, using the offsets and the local maps, we create a unique global map between pageName and Number. This is easily done using a map-only job. As before, the job is partitioned such that each worker computes the global map for 1/n of the inputs. However, every map task sees the complete offset data through the distributed cache. The pseudo-code for this process is as follows:

    Class Mapper {
        HashMap globalMap
        HashMap offset

        method setup() {
            offset = getOffset(pathToHdfs)    // offsets of all tasks, read from HDFS
        }

        method map(id, localMap) {
            for each element e in localMap:
                globalMap.put(e.name, e.number + offset(id))
        }

        method cleanup() {
            emit(globalMap)
            emit(inverted(globalMap))
        }
    }

After this step, we have all the files needed to construct the sparse matrix. We do this in two ways.

Method 1: Column Partition

To form a column we only need a page's adjacency list, so this is a map-only job. As before, we partition the data so that each worker gets 1/n of the data. The pseudo-code is as follows:

    Class Map {
        Map<String, Number> nameToNumberMap

        method setup() {
            nameToNumberMap = load map from HDFS
        }

        method map(String pageName, List[Adj1, Adj2, ...]) {
            rowContribution = []
            for (each adj in list) {
                rowContribution += [nameToNumberMap.get(adj)]
            }
            emit(nameToNumberMap.get(pageName), rowContribution)
            if (list.size == 0) {
                emit(nameToNumberMap.get(pageName))    // page with no outgoing links
            }
        }
    }
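The column-partition mapper above can be tried out on a toy graph with the following Python sketch. It is a minimal in-memory illustration, not the Hadoop job: `column_partition` is a hypothetical name, the page-to-number map is passed in as a plain dictionary rather than loaded from HDFS, and every link target is assumed to be present in that map. Unlike the pseudo-code, pages without links are collected in a separate list instead of being emitted twice.

```python
def column_partition(adjacency, name_to_number):
    """Build one sparse column of M per page from its adjacency list.

    Returns (columns, dangling):
      columns  maps a column index j to the row indices i with M[i][j] != 0
      dangling lists the column indices of pages without outgoing links
    """
    columns = {}
    dangling = []
    for page, links in adjacency.items():
        j = name_to_number[page]
        # Assumption: every link target has an entry in name_to_number.
        rows = [name_to_number[target] for target in links]
        if rows:
            columns[j] = rows
        else:
            dangling.append(j)
    return columns, dangling
```

For example, with `adjacency = {"A": ["B", "C"], "B": ["A"], "C": []}` and the numbering `{"A": 0, "B": 1, "C": 2}`, column 0 holds rows 1 and 2, column 1 holds row 0, and column 2 is reported as dangling. Storing only row indices is enough if each non-zero entry of column j is taken to be 1 divided by the number of links of page j, the weight the row-partition version emits explicitly.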
Method 2: Row Partition

    Class Map {
        Map NameToNumber

        setup() {
            NameToNumber = load from HDFS
        }

        method map(String pageName, List[Adj1, Adj2, ...]) {
            for (each adj in list) {
                emit(NameToNumber.get(adj), (NameToNumber.get(pageName), 1/list.size))
            }
            if (list.size == 0) {
                // write to dangling vector on HDFS
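The row-partition mapper can be sketched in memory in the same way. This is a hedged illustration: `row_partition` is a hypothetical name, the page-to-number map is a plain dictionary, and all link targets are assumed to be present in it. Grouping the emitted pairs by row index is an assumption about what happens after the map step, since the pseudo-code above stops before that point.

```python
from collections import defaultdict


def row_partition(adjacency, name_to_number):
    """Emit (row, (column, weight)) pairs and group them by row index.

    Returns (rows, dangling):
      rows     maps a row index i to a list of (j, M[i][j]) pairs
      dangling lists the indices of pages without outgoing links
    """
    rows = defaultdict(list)
    dangling = []
    for page, links in adjacency.items():
        j = name_to_number[page]
        if not links:
            # Stand-in for "write to dangling vector on HDFS".
            dangling.append(j)
            continue
        weight = 1.0 / len(links)
        for target in links:
            # Assumption: every link target has an entry in name_to_number.
            rows[name_to_number[target]].append((j, weight))
    return dict(rows), dangling
```

Each page contributes the same weight to every row it links to, so the weights emitted for one non-dangling page sum to 1.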
