Assignment5_Report.pdf

CS6200 Section 2 Assignment 5
Amogh Huilgol

Design Discussion and Pseudo-Code

For each step of the algorithm, briefly explain the main idea for how you partition the problem. Show compact pseudo-code for it. It is up to you to decide how to map the computation steps to MapReduce jobs. For example, you can merge multiple steps into a single job, or you can use multiple jobs to implement a single step. Make sure you discuss and show pseudo-code for both versions of step 3. (20 points)

Step 1: Create matrix 𝐌 from either the original input or the adjacency-list representation from a previous assignment. (But also see the discussion about sparse matrix representation below!)

We create the adjacency lists as in previous assignments, i.e. PageName: Adjacency List. The reduce call performs the following functions:
a) Compute dead links
b) Compute the offset with respect to each worker
c) Build the local pageName : Number map

The partitioning ensures that each worker gets 1/nth of the dataset. The pseudo-code for Step 1 is as follows:

    Map(key k, line l) {
        nodeId = getKey(l)               // parser given as part of the homework
        adjList = getAdjacencyList(l)
        emit(nodeId, adjList)
    }

    Class Reduce {
        Method setup() {
            Counter = 0
            HashMap<String, Number> nameToNumMap
        }
        Method reduce(node_name n, List[adjLists]) {
            nameToNumMap.put(n, Counter++)
            for each adjList in adjLists
                aggrAdjLinks += adjList
            emit(n, aggrAdjLinks)
        }
        Method cleanup() {
            emit(MapId, nameToNumMap)
            emit(MapId, Counter)
        }
    } // end of Reduce class

Now, using the offsets and the local maps, we create a unique global map between pageName and Number. This is easily done with a map-only job. As before, the job is partitioned so that each worker computes the global map for 1/nth of the input. However, every map task sees the complete offset data through the distributed cache. The pseudo-code for this process is as follows:
    Class Mapper {
        HashMap globalMap
        HashMap offset
        Method setup() {
            offset = getOffset(pathToHdfs)
        }
        Method map(id, localMap) {
            for each element e in localMap:
                globalMap.put(e.name, e.number + offset.get(id))
        }
        Method cleanup() {
            emit(globalMap)
            emit(inverted(globalMap))
        }
    }

After this step, we have all the files needed to construct the sparse matrix. We do this in two ways.

Method 1: Column Partition

To form a column we need only the adjacency list, so this is a map-only job. As before, we partition the data so that each worker gets 1/nth of it. The pseudo-code is as follows:

    Class Map {
        Method setup() {
            Map<String, Number> nameToNumberMap = load map from HDFS
        }
        Method map(String pageName, List[Adj1, Adj2 ..]) {
            for (each adj in list) {
                rowContribution += [nameToNumberMap.get(adj)]
            }
            emit(nameToNumberMap.get(pageName), rowContribution)
            if (list.size == 0) {
                emit(nameToNumberMap.get(pageName))
            }
        }
    }
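The global numbering and the column partition can be illustrated with a small in-memory sketch in plain Python rather than Hadoop. It assumes that a worker's offset is the total number of pages numbered by all earlier workers (a prefix sum over the per-worker counters), which the report implies but does not state. The names `build_global_map` and `column_partition` and the three-page graph are hypothetical, chosen only for illustration.

```python
from itertools import accumulate


def build_global_map(local_maps):
    """Combine per-worker local numberings into one global numbering.

    local_maps: list with one dict per worker, pageName -> local number
    (0-based). Assumption: worker i's offset is the number of pages
    numbered by workers 0..i-1.
    """
    counts = [len(m) for m in local_maps]
    offsets = [0] + list(accumulate(counts))[:-1]
    global_map = {}
    for local_map, offset in zip(local_maps, offsets):
        for name, local_num in local_map.items():
            global_map[name] = local_num + offset
    return global_map


def column_partition(adjacency, name_to_num):
    """Map-only step: one record per page, i.e. one matrix column.

    Returns (columns, dangling): columns maps a column index to the row
    indices of its out-links; dangling lists columns with no out-links.
    """
    columns = {}
    dangling = []
    for page, out_links in adjacency.items():
        col = name_to_num[page]
        columns[col] = [name_to_num[target] for target in out_links]
        if not out_links:
            dangling.append(col)
    return columns, dangling


# Hypothetical input: two workers, three pages.
local_maps = [{"A": 0, "B": 1}, {"C": 0}]
adjacency = {"A": ["B", "C"], "B": ["C"], "C": []}

name_to_num = build_global_map(local_maps)
columns, dangling = column_partition(adjacency, name_to_num)
```

Because each column depends only on the page's own adjacency list and the shared name-to-number map, no shuffle is needed, which matches the map-only design described above.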
Method 2: Row Partition

    Class Map {
        setup() {
            Map NameToNumber = load from HDFS
        }
        method map(String pageName, List[Adj1, Adj2 ..]) {
            for (each adj in list) {
                emit(NameToNumber.get(adj), (NameToNumber.get(pageName), 1/list.size))
            }
            if (list.size == 0) {
                // write to dangling vector on HDFS
            }
        }
    }
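The row-partition map above can be sketched in plain Python as follows. Each page emits one (column, weight) pair per out-link, keyed by the destination row, with weight 1/out-degree. Grouping the pairs by row stands in for the MapReduce shuffle. That grouping step is an assumption, since the report does not show a reducer for this job. The function name `row_partition` and the sample graph are hypothetical.

```python
from collections import defaultdict


def row_partition(adjacency, name_to_num):
    """Build the sparse matrix row by row.

    Returns (rows, dangling): rows maps a row index to a list of
    (column index, weight) pairs; dangling lists the columns of pages
    with no out-links.
    """
    rows = defaultdict(list)
    dangling = []
    for page, out_links in adjacency.items():
        col = name_to_num[page]
        if not out_links:
            # Stand-in for "write to dangling vector on HDFS".
            dangling.append(col)
            continue
        weight = 1.0 / len(out_links)
        for target in out_links:
            # Key by the destination row, as in the map's emit().
            rows[name_to_num[target]].append((col, weight))
    return dict(rows), dangling


# Hypothetical input: three pages, C has no out-links.
name_to_num = {"A": 0, "B": 1, "C": 2}
adjacency = {"A": ["B", "C"], "B": ["C"], "C": []}

rows, dangling = row_partition(adjacency, name_to_num)
```

Unlike the column partition, the entries of a single row come from many different pages, so in a real job they must be brought together by key before a row can be written out.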