Functions that use regular expressions regexprpattern

in an HTML document
•  create variables from information found in text
•  clean and transform text into a uniform format, resolving inconsistencies in format between files
•  mine text by treating documents directly as data
•  “scrape” the web for data

•  A regular expression (aka regex or regexp) is a pattern that describes a set of strings.
•  This set may be finite or infinite, depending on the particular regexp. We say the regexp “matches” each element of that set.
•  For example, the regexp grey|gray matches both grey and gray, whereas ^A.* matches any string starting with capital A.
•  The idea is similar to wildcards in UNIX, but with many more possibilities.

Syntax:
•  Literal characters are matched only by the character itself.
•  A character...
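The two example patterns above, grey|gray and ^A.*, can be tried out directly. The sketch below is only an illustration and uses Python's standard re module as an assumption; the course material itself appears to refer to other functions (such as regexpr), which are not shown here. The word list and variable names are made up for the example.

```python
import re

# Alternation: the pattern describes the finite set {"grey", "gray"}.
colour = re.compile(r"grey|gray")

# Anchor plus wildcard: the infinite set of strings starting with capital A.
starts_with_a = re.compile(r"^A.*")

# Hypothetical sample strings, chosen only for illustration.
words = ["grey", "gray", "green", "Apple", "apple", "An A"]

# fullmatch() requires the whole string to belong to the set.
colour_hits = [w for w in words if colour.fullmatch(w)]

# match() checks from the start of the string, which is what ^ asks for.
a_hits = [w for w in words if starts_with_a.match(w)]

print(colour_hits)  # ['grey', 'gray']
print(a_hits)       # ['Apple', 'An A']
```

Note that "apple" is rejected by ^A.* because literal characters are matched only by themselves, so a lowercase a does not match a capital A.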

This document was uploaded on 02/16/2014 for the course STATISTICS 3026 at Columbia.