jurafsky&martin_3rdEd_17 (1).pdf

to mean “any capital letter”). In cases where there is a well-defined sequence associated with a set of characters, the brackets can be used with the dash (-) to specify any one character in a range. The pattern /[2-5]/ specifies any one of the characters 2, 3, 4, or 5. The pattern /[b-g]/ specifies one of the characters b, c, d, e, f, or g. Some other examples are shown in Fig. 2.3.

RE        Match                  Example Patterns Matched
/[A-Z]/   an upper case letter   “we should call it ‘Drenched Blossoms’”
/[a-z]/   a lower case letter    “my beans were impatient to be hoed!”
/[0-9]/   a single digit         “Chapter 1: Down the Rabbit Hole”

Figure 2.3 The use of the brackets [] plus the dash - to specify a range.

The square braces can also be used to specify what a single character cannot be, by use of the caret ^. If the caret ^ is the first symbol after the open square brace [, the resulting pattern is negated. For example, the pattern /[^a]/ matches any single character (including special characters) except a. This is only true when the caret is the first symbol after the open square brace. If it occurs anywhere else, it usually stands for a caret; Fig. 2.4 shows some examples.

RE         Match (single characters)   Example Patterns Matched
/[^A-Z]/   not an upper case letter    “Oyfn pripetchik”
/[^Ss]/    neither ‘S’ nor ‘s’         “I have no exquisite reason for’t”
/[^\.]/    not a period                “our resident Djinn”
/[e^]/     either ‘e’ or ‘^’           “look up ^ now”
/a^b/      the pattern ‘a^b’           “look up a^b now”

Figure 2.4 Uses of the caret ^ for negation or just to mean ^. We discuss below the need to escape the period by a backslash.

How can we talk about optional elements, like an optional s in woodchuck and woodchucks? We can’t use the square brackets, because while they allow us to say “s or S”, they don’t allow us to say “s or nothing”. For this we use the question mark /?/, which means “the preceding character or nothing”, as shown in Fig. 2.5.
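The bracket ranges, caret negation, and question mark described above can be tried out directly. The following is a minimal sketch using Python's standard re module, not the text's own notation: the helper name first_match is hypothetical, and the test strings are the examples from Figs. 2.3 to 2.5. One difference is worth flagging as specific to Python: outside brackets, re treats an unescaped ^ as a start-of-string anchor, so the literal pattern a^b has to be written with a backslash there.

```python
import re


def first_match(pattern, text):
    """Hypothetical helper: return the first substring matched, or None."""
    m = re.search(pattern, text)
    return m.group() if m else None


# Ranges inside brackets match exactly one character from the range.
print(first_match(r"[A-Z]", "we should call it 'Drenched Blossoms'"))  # D
print(first_match(r"[a-z]", "my beans were impatient to be hoed!"))    # m
print(first_match(r"[0-9]", "Chapter 1: Down the Rabbit Hole"))        # 1

# A caret right after the opening bracket negates the set.
print(first_match(r"[^A-Z]", "Oyfn pripetchik"))                   # y
print(first_match(r"[^Ss]", "I have no exquisite reason for't"))   # I
print(first_match(r"[^\.]", "our resident Djinn"))                 # o

# A caret elsewhere inside the brackets is just a literal caret.
print(first_match(r"[e^]", "look up ^ now"))                       # ^

# Python-specific: outside brackets, an unescaped ^ is an anchor,
# so the literal sequence a^b needs an escaped caret.
print(first_match(r"a\^b", "look up a^b now"))                     # a^b
print(first_match(r"a^b", "look up a^b now"))                      # None

# The question mark makes the preceding character optional.
print(re.findall(r"woodchucks?", "a woodchuck and two woodchucks"))
print(re.findall(r"colou?r", "color or colour"))
```

The last two calls return both spellings in each case, which is the “s or nothing” behaviour that brackets alone cannot express.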
We can think of the question mark as meaning “zero or one instances of the previous character”. That is, it’s a way of specifying how many of something that
we want, something that is very important in regular expressions.

RE              Match                     Example Patterns Matched
/woodchucks?/   woodchuck or woodchucks   “woodchuck”
/colou?r/       color or colour           “colour”

Figure 2.5 The question mark ? marks optionality of the previous expression.

For example, consider the language of certain sheep, which consists of strings that look like the following:

baa!
baaa!
baaaa!
baaaaa!
...

This language consists of strings with a b, followed by at least two a’s, followed by an exclamation point. The set of operators that allows us to say things like “some number of a’s” are based on the asterisk or *, commonly called the Kleene * (generally pronounced “cleany star”). The Kleene star means “zero or more occurrences of the immediately previous character or regular expression”. So /a*/ means “any string of zero or more a’s”. This will match a or aaaaaa, but it will also match Off Minor since the string Off Minor has zero a’s. So the regular expression for matching
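The behaviour of the Kleene star described above, including the match against a string with no a's at all, can be checked with a short sketch in Python's standard re module. The pattern baaa*! is an assumption on our part: it is one pattern consistent with the stated description of the sheep language (a b, at least two a's, then an exclamation point), not necessarily the one the text goes on to give. The names SHEEP and is_sheep are hypothetical.

```python
import re

# /a*/ succeeds on any string: with no leading 'a' it matches
# the empty string at position 0.
m = re.search(r"a*", "Off Minor")
print(repr(m.group()), m.span())

# On a run of a's it matches the whole run.
print(re.search(r"a*", "aaaaaa").group())

# Assumed pattern for the sheep language: 'b', two required a's,
# zero or more further a's, then '!'.
SHEEP = re.compile(r"baaa*!")


def is_sheep(s):
    """Return True if the whole string s is in the sheep language."""
    return SHEEP.fullmatch(s) is not None


for s in ["baa!", "baaaaa!", "ba!", "b!", "baa"]:
    print(s, is_sheep(s))
```

Using fullmatch rather than search is a design choice for this sketch: it tests membership of the whole string in the language instead of looking for a sheep-like substring somewhere inside it.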