lecture18

lecture18 - Lecture 18: November 24, 2010 Regular...

Caltech CS 1: Fall 2010
Lecture 18: November 24, 2010
Regular Expressions
th.*g    ^th.*g$    th[a-z]*g    th?nksg.*g    th[^ ]*g    th[an]{2}.*g

Odds and ends (no specific topic)

Regular expressions (advanced string processing)

From Jamie Zawinski (programmer for the Netscape Navigator web browser, which later became Firefox):
"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems."

Regular expressions are a very powerful way of processing strings:
- classifying strings
- searching for patterns in strings
- replacing parts of strings that match patterns
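The three uses listed above can each be sketched with one function from Python's re module. This is only an illustrative sketch, assuming Python 3; the helper names and the DNA-style test strings are my own choices, not part of the lecture.

```python
import re

def is_dna(s):
    """Classify: True if s consists only of the letters A, C, G and T."""
    return re.fullmatch(r"[ACGT]+", s) is not None

def first_run_of_g(s):
    """Search: return the first run of two or more G's, or None if there is none."""
    m = re.search(r"G{2,}", s)
    return m.group(0) if m else None

def mask_runs_of_a(s):
    """Replace: substitute every run of one or more A's with a single '-'."""
    return re.sub(r"A+", "-", s)

if __name__ == "__main__":
    sample = "AGGTTCGGAATG"  # assumed sample string
    print(is_dna(sample))
    print(first_run_of_g(sample))
    print(mask_runs_of_a(sample))
```

Here re.fullmatch requires the whole string to fit the pattern, re.search looks for the pattern anywhere in the string, and re.sub rewrites every non-overlapping match.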

Regular expressions are very heavily used in practical applications, for instance:
- bioinformatics
- web programming
- "data crunching"
- and anywhere else strings are used

I can't cover everything about regular expressions in one lecture.
- Entire books have been written about them!
I will focus on the most general/useful aspects of regular expressions as found in Python.
- See the Python docs for more complete information.
Regular expressions are not Python-specific!
- Almost every language has a regular expression library, some more powerful than others.

Regular expressions are also known as:
- regexps
- regexes
- REs for short

There is a very rich theory behind regular expressions.
Regular expressions are related to:
- deterministic finite automata (DFA)
- nondeterministic finite automata (NFA)
Caltech's CS 21 course covers these topics.
You don't need to understand the theory to use regular expressions, but it's really interesting, so take CS 21 anyway!
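To give a rough feel for the connection to finite automata, here is a small hand-written DFA that accepts exactly the strings matched by the regular expression (AT)+. The pattern, the state numbering, and the function name are all assumptions chosen for illustration, and this is not a description of how the re module works internally.

```python
import re

# Transition table for a DFA equivalent to the regular expression (AT)+.
# States: 0 = start, 1 = just read 'A', 2 = just read 'AT' (accepting).
# Any (state, character) pair missing from the table leads to an implicit
# dead state, which is modeled here by rejecting immediately.
TRANSITIONS = {
    (0, "A"): 1,
    (1, "T"): 2,
    (2, "A"): 1,
}
ACCEPTING = {2}

def dfa_accepts(s):
    """Run the DFA over s and report whether it ends in an accepting state."""
    state = 0
    for ch in s:
        key = (state, ch)
        if key not in TRANSITIONS:
            return False
        state = TRANSITIONS[key]
    return state in ACCEPTING

if __name__ == "__main__":
    for s in ["AT", "ATAT", "ATA", ""]:
        # The DFA and the regular expression should always agree.
        print(repr(s), dfa_accepts(s), re.fullmatch(r"(AT)+", s) is not None)
```

The automaton reads the string one character at a time and only remembers its current state, which is the essential idea behind the theory mentioned above.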

Regular expressions in Python require that you import the re module.
We will go over the main functions in this module later.

Regular expressions are basically a kind of "little language" or "embedded language" for describing strings that exists inside Python.
This notion of "languages inside languages" may seem odd, but there are many examples of it in common use today.
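As a minimal sketch of what this little language looks like from the Python side, the patterns shown on the title slide can be written as ordinary (raw) strings and handed to a function in the re module. The helper name and the sample words are assumptions of mine, not taken from the lecture.

```python
import re

# Patterns copied from the title slide of this lecture.
PATTERNS = [
    r"th.*g",
    r"^th.*g$",
    r"th[a-z]*g",
    r"th?nksg.*g",
    r"th[^ ]*g",
    r"th[an]{2}.*g",
]

def matching_patterns(text):
    """Return the title-slide patterns that match somewhere in text
    (hypothetical helper, for illustration only)."""
    return [p for p in PATTERNS if re.search(p, text)]

if __name__ == "__main__":
    # 'thanksgiving' and 'thing' are assumed sample words.
    for word in ["thanksgiving", "thing"]:
        print(word, matching_patterns(word))
```

To Python each pattern is just a string; it only takes on meaning as a regular expression when a function such as re.search interprets it.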

Other little languages:
- HTML: language for describing web pages
- XML: language(s) for describing arbitrary structured data
- JSON: language for describing key-value pairs
All of these can be worked with using Python modules.

We have a string representing a DNA sequence:
    s = 'AGGTTCGGAATGAGATCCTAAG...'
We want to find if a particular subsequence is found in the string s, for instance, the sequence 'ATTGCC'.
How do we solve this?

Several ways of doing this in Python.

You can write an explicit loop:
    found = False
    for i, e in enumerate(s):
        if s[i:i+6] == 'ATTGCC':
            found = True

You can use the in operator:
    found = 'ATTGCC' in s
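For comparison, the same membership test can also be phrased with the re module. The sketch below puts the three approaches side by side; the function names are hypothetical, the sample string is only the visible beginning of the DNA sequence from the slide, and the re.search version is my assumption about how a regular expression would be applied here rather than the lecture's own code.

```python
import re

def found_with_loop(s, target):
    """Explicit loop: compare every slice of the right length against target."""
    for i in range(len(s) - len(target) + 1):
        if s[i:i + len(target)] == target:
            return True
    return False

def found_with_in(s, target):
    """The in operator performs the substring test directly."""
    return target in s

def found_with_re(s, target):
    """re.search scans s for the pattern; re.escape makes target literal text."""
    return re.search(re.escape(target), s) is not None

if __name__ == "__main__":
    s = "AGGTTCGGAATGAGATCCTAAG"  # visible prefix of the slide's sequence
    for target in ["ATTGCC", "GAATGA"]:
        print(target,
              found_with_loop(s, target),
              found_with_in(s, target),
              found_with_re(s, target))
```

For a fixed substring like this one the in operator is the simplest choice; a regular expression only starts to pay off once the thing being searched for is a pattern rather than a literal string.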