
Lecture 3: Languages and Regular Expressions

Basic concepts

• Alphabet: a finite set of symbols, Σ.
• Word (or string): a finite sequence of symbols from an alphabet.

    Alphabet               Words
    { a, b, ..., z }       man, abc, ...
    { 0, 1 }               000, 010101, ...
    { #, $, a, b, c }      #cb$, $$$, ...

• |w|: the length of a word w, i.e. the number of symbols in w.
• e: the empty word containing no symbols, i.e. the word of zero length. To avoid confusion, e should not be a symbol of any alphabet.
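
The definitions above can be tried out in a few lines of Python. This is only a sketch under stated assumptions: a word is modelled as a Python string, an alphabet as a set of one-character strings, and the empty word e as the empty string. The names is_word_over, BINARY and EMPTY_WORD are hypothetical and chosen for illustration only.

```python
# Hypothetical helper: True if every symbol of `word` belongs to `alphabet`.
def is_word_over(word: str, alphabet: set[str]) -> bool:
    return all(symbol in alphabet for symbol in word)

BINARY = {"0", "1"}   # the alphabet { 0, 1 }
EMPTY_WORD = ""       # the empty word e, modelled as the empty string

print(is_word_over("010101", BINARY))    # True
print(is_word_over("0121", BINARY))      # False: 2 is not in the alphabet
print(len("010101"))                     # 6, i.e. |010101| = 6
print(len(EMPTY_WORD))                   # 0, i.e. |e| = 0
print(is_word_over(EMPTY_WORD, BINARY))  # True: e is a word over any alphabet
```

Note that the empty string is a word over BINARY without being a symbol of BINARY, which mirrors the requirement that e is not a member of any alphabet.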

Languages

A language is a set of words defined over an alphabet Σ.

Examples:
1. The set of all English words: a language over { a, b, ..., z }.
2. { 01, 0101, 010101, ... }: a language over { 0, 1 }.
3. { e }: a language over any alphabet.

• ∅: the empty language, i.e. the language containing no words.
• Σ*: the set of all words over the alphabet Σ. It is called the universal language. Any language L is a subset of Σ*.

Note: ∅* = { e }, since the only word over the empty alphabet is the empty word.
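
As a rough illustration, a language can be modelled as a Python set of strings. The universal language is infinite, so the sketch below only enumerates the words up to a chosen length. The function words_up_to and the variable names are hypothetical, and the empty word e is again modelled as the empty string.

```python
from itertools import product

def words_up_to(alphabet, max_len):
    """Return all words over `alphabet` of length 0..max_len (a finite slice of the universal language)."""
    result = set()
    for n in range(max_len + 1):
        for symbols in product(sorted(alphabet), repeat=n):
            result.add("".join(symbols))
    return result

BINARY = {"0", "1"}
universal_4 = words_up_to(BINARY, 4)   # 1 + 2 + 4 + 8 + 16 = 31 words

language = {"01", "0101"}              # the words of example 2 with length <= 4
empty_language = set()                 # the empty language: no words at all
only_empty_word = {""}                 # { e }: exactly one word, the empty one

print(len(universal_4))                            # 31
print(language <= universal_4)                     # True: a language is a subset
print(len(empty_language), len(only_empty_word))   # 0 1
print(words_up_to(set(), 4))                       # {''}
```

The last two lines show that the empty language and { e } are different sets, and that over the empty alphabet the only word that can be formed is the empty word.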

Operations on words

Concatenation merges two given words to form a new word, e.g. the concatenation of abc and 123 is abc123.
Properties:
    ew = we = w
    (uv)w = u(vw)

Reversal reverses the order of all the symbols in a given word w:
    if w = a_1 a_2 ... a_n then w^R = a_n ... a_2 a_1
Inductive definition:
    (1) e^R = e
    (2) (au)^R = u^R a, where a ∈ Σ, u ∈ Σ*

Power concatenates n copies of w to form a new word:
    w^n = ww...w (n copies of w)
Inductive definition:
    (1) w^0 = e
    (2) w i

