hw5-2010 - CS 140 Assignment 5: NFA Based String Matching...

CS 140 Assignment 5: NFA Based String Matching

Assigned February 10, 2010
Due by 11:59 pm Wednesday, February 24

The purpose of this assignment is for you to gain experience in a common real-world scenario: You are given an existing sequential program, and you will parallelize it using Cilk++. Your job is to convert the sequential program into a parallel one, without introducing any data races, and get a reasonable speedup. For grading purposes, both correctness and performance will count.

For this assignment, there is one part of the program you will parallelize (because it will be executed many times on large inputs), and another part that you will leave sequential (because it only runs once). You will measure speedup only on the part you parallelize.

1. The problem domain

The underlying problem is to locate a “target” character string that fits a particular description, within a particular set of text “data”. Versions of this problem show up in many different applications, ranging from the “find” command in a word processor to the reconstruction of a biological genome from DNA sequencing data. Finding a simple string in a word processor is an easy computation; but when the strings and the data get very long, parallel computing must come into play.

For this homework, you will parallelize a sequential code for a very basic and important string matching problem. There are two inputs: first, a “regular expression” that describes the set of target strings you’re looking for; second, a string (or maybe a list of strings) to be checked against the regular expression to see if they match the target set.

The program first converts the regular expression into a so-called “nondeterministic finite state automaton” or NFA, which is a description of an abstract machine that recognizes strings that match the regular expression. You don’t need to parallelize this conversion, because it only happens once on a small data set.
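To make the idea of an NFA concrete, here is a minimal C++ sketch of one possible in-memory representation. It is not taken from the assignment's code: the `Nfa` struct, `make_example_nfa`, and `successors` are hypothetical names, the NFA is built by hand for the regular expression `(a|b)*ab` rather than by a real regex-to-NFA converter, and epsilon transitions are left out for brevity.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <utility>

// Hypothetical NFA representation (an illustration, not the assignment's data structure).
struct Nfa {
    int num_states = 0;
    int start = 0;
    std::set<int> accepting;
    // (state, input character) -> set of possible next states.
    // A set with more than one element is what makes the machine nondeterministic.
    std::map<std::pair<int, char>, std::set<int>> delta;
};

// Hand-built NFA for the regular expression (a|b)*ab.
Nfa make_example_nfa() {
    Nfa nfa;
    nfa.num_states = 3;
    nfa.start = 0;
    nfa.accepting = {2};
    // In state 0, reading 'a' either stays in state 0 or "guesses" that the final "ab" begins here.
    nfa.delta[std::make_pair(0, 'a')] = {0, 1};
    nfa.delta[std::make_pair(0, 'b')] = {0};
    nfa.delta[std::make_pair(1, 'b')] = {2};
    return nfa;
}

// States reachable from `state` on input `c`; the empty set if no transition exists.
std::set<int> successors(const Nfa& nfa, int state, char c) {
    assert(state >= 0 && state < nfa.num_states);
    auto it = nfa.delta.find(std::make_pair(state, c));
    if (it == nfa.delta.end()) {
        return std::set<int>();
    }
    return it->second;
}
```

In this sketch, `successors(nfa, 0, 'a')` yields two states, which is exactly the nondeterminism the name refers to: the machine may be in several states at once after reading a character.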
The second step is the core of the algorithm, and the most important part: Here the program takes a particular input string and uses the NFA to decide whether or not the string is in the target set. You will parallelize this core part of the algorithm, which checks one single input string against the NFA.
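The matching step can be sketched sequentially as follows. This is a standard state-set simulation written under assumptions, not the assignment's provided code: the `Edge` struct and the functions `nfa_accepts` and `matches_abb_suffix` are hypothetical, the NFA has no epsilon transitions, and the example machine is hand-built for the regular expression `(a|b)*abb`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One labeled transition of a hypothetical NFA: from --symbol--> to.
struct Edge {
    int from;
    char symbol;
    int to;
};

// Returns true if the NFA accepts `input`.
// Assumes states are numbered 0..num_states-1 and there are no epsilon transitions.
bool nfa_accepts(int num_states, int start, const std::vector<int>& accepting,
                 const std::vector<Edge>& edges, const std::string& input) {
    assert(start >= 0 && start < num_states);
    // active[s] != 0 means the NFA could currently be in state s.
    std::vector<char> active(num_states, 0);
    active[start] = 1;
    for (char c : input) {
        std::vector<char> next(num_states, 0);
        // Follow every edge labeled c that leaves a currently active state.
        for (const Edge& e : edges) {
            if (e.symbol == c && active[e.from]) {
                next[e.to] = 1;
            }
        }
        active.swap(next);
    }
    // The string matches if any accepting state is active after the last character.
    for (int s : accepting) {
        if (active[s]) {
            return true;
        }
    }
    return false;
}

// Example: checks whether `input` matches (a|b)*abb using a hand-built 4-state NFA.
bool matches_abb_suffix(const std::string& input) {
    static const std::vector<Edge> edges = {
        {0, 'a', 0}, {0, 'b', 0}, {0, 'a', 1}, {1, 'b', 2}, {2, 'b', 3}};
    return nfa_accepts(4, 0, {3}, edges, input);
}
```

For example, `matches_abb_suffix("aabb")` returns true and `matches_abb_suffix("abba")` returns false. The outer loop over characters is inherently ordered, since each step depends on the previous set of active states. The inner loop over edges is the kind of work one might consider running in parallel, but several iterations can write to the same entry of `next`, so any parallel version would need care to stay free of data races. How the assignment intends the parallelization to be done is not shown in this preview.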

This note was uploaded on 12/27/2011 for the course CMPSC 140 taught by Professor Gilbert during the Fall '11 term at UCSB.
