Regular Languages Refresher
Roman Manevich, Ben-Gurion University of the Negev
Slide 1: Fall 2017-2018 Compiler Principles
Regular Languages Refresher
Roman Manevich
Ben-Gurion University of the Negev
Slide 2: Regular languages refresher
Slide 3: Regular languages refresher
Formal languages
  Alphabet = finite set of letters
  Word = sequence of letters
  Language = set of words
Regular languages are defined equivalently by
  Regular expressions
  Finite-state automata
Slide 4: Regular expressions
Empty string: ε
Letter: a
Concatenation: R1 R2
Union: R1 | R2
Kleene star: R*
Shorthand: R+ stands for R R*
Scope: (R)
Example: (0* 1*) | (1* 0*)
What is this language?
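One way to explore the example is to test candidate words against it. The sketch below is only an illustration, assuming Python's `re` module as a stand-in for the slide notation (spaces in the slide expression denote concatenation, so they are dropped); the helper name `in_language` is hypothetical.

```python
import re

# The slide's example, written in Python regex syntax.
# Spaces in the slide notation are plain concatenation, so they are omitted.
PATTERN = re.compile(r"(0*1*)|(1*0*)")

def in_language(word: str) -> bool:
    """Return True if the whole word matches (0* 1*) | (1* 0*)."""
    return PATTERN.fullmatch(word) is not None

if __name__ == "__main__":
    # Try a few words over the alphabet {0, 1}.
    for w in ["", "0011", "1100", "010", "101"]:
        print(repr(w), in_language(w))
```

Trying words such as `0011`, `1100`, and `010` suggests the answer: the expression describes words over {0, 1} in which the letter changes at most once.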
Slide 5: Finite automata
Slide 6: Finite automata: known results
Types of finite automata:
  Deterministic (DFA)
  Non-deterministic (NFA)
  Non-deterministic + epsilon transitions (NFA+ε)
Theorem: translation of regular expressions to NFA+ε (linear time)
Theorem: translation of NFA+ε to DFA (worst-case exponential time)
Theorem [Myhill-Nerode]: for every DFA there is an equivalent unique minimal DFA
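Slide 7: Finite automata
[Figure: an automaton with transitions labeled a, b, b, c; callouts mark the start state, a transition, and the accepting state]
An automaton $M = \langle Q, \Sigma, \delta, q_0, F \rangle$ is defined by states and transitions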
Slides 8-11: Automaton running example
[Figure, animated over four slides: the automaton reads an input made of the letters a, b, c]
Words are read left-to-right
Word accepted
Slides 12-13: Word outside of language 1
[Figure: the automaton with an input made of the letters b, b, c]
Missing transition means non-acceptance
Slides 14-16: Word outside of language 2
[Figure, animated over three slides: the automaton with an input made of the letters a, b, b]
Final state is not an accepting state
Slide 17: Exercise - Question
What is the language defined by the automaton below?
[Figure: automaton with transitions labeled a, b, b, c]
Slide 18: Exercise - Answer
What is the language defined by the automaton below?
[Figure: automaton with transitions labeled a, b, b, c]
a b* c
Generally: all paths leading to accepting states
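The two rejection rules from the earlier slides (a missing transition, or ending in a non-accepting state) can be sketched as a small simulator. This is a minimal illustration, not the slide's exact automaton: the figure did not survive extraction, so the three-state DFA for a b* c below and the state names `q0`, `q1`, `q2` are assumptions.

```python
# Hypothetical DFA for a b* c; state names are illustrative only.
START = "q0"
ACCEPTING = {"q2"}
DELTA = {
    ("q0", "a"): "q1",  # read the leading a
    ("q1", "b"): "q1",  # loop on any number of b's
    ("q1", "c"): "q2",  # the closing c leads to the accepting state
}

def dfa_accepts(word: str) -> bool:
    """Read the word left-to-right and report whether it is accepted."""
    state = START
    for letter in word:
        key = (state, letter)
        if key not in DELTA:
            # Missing transition means non-acceptance.
            return False
        state = DELTA[key]
    # The word is accepted only if the final state is accepting.
    return state in ACCEPTING

if __name__ == "__main__":
    for w in ["abc", "ac", "bbc", "abb"]:
        print(w, dfa_accepts(w))
```

With this assumed automaton, `bbc` fails on its first letter because `q0` has no b transition, while `abb` is read completely but stops in `q1`, which is not accepting.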
Slide 19: Nondeterministic finite automata
Slide 20: Non-deterministic automata
Allow multiple transitions from a given state labeled by the same letter
[Figure: NFA with transitions labeled a, a, b, c, b, c]
Slides 21-24: NFA run example
[Figure, animated over four slides: the NFA reads an input made of the letters a, b, c]
Maintain a set of states
Accept the word if any of the states in the set is accepting
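Maintaining a set of states can be sketched in a few lines. The NFA below is hypothetical (the slide's figure is not recoverable): it has two a transitions out of the start state, one continuing with b c and the other with c b, and all state names are assumptions made for illustration.

```python
# Hypothetical NFA: two a-transitions leave the start state s0.
NFA_START = "s0"
NFA_ACCEPTING = {"s4", "s6"}
NFA_DELTA = {
    ("s0", "a"): {"s1", "s2"},  # non-deterministic choice on a
    ("s1", "b"): {"s3"},
    ("s3", "c"): {"s4"},
    ("s2", "c"): {"s5"},
    ("s5", "b"): {"s6"},
}

def nfa_accepts(word: str) -> bool:
    """Simulate the NFA by tracking every state it could be in."""
    current = {NFA_START}
    for letter in word:
        # Follow the letter from each current state and merge the targets.
        current = {
            target
            for state in current
            for target in NFA_DELTA.get((state, letter), set())
        }
    # Accept if any of the states in the set is accepting.
    return bool(current & NFA_ACCEPTING)

if __name__ == "__main__":
    for w in ["abc", "acb", "ab", "abb"]:
        print(w, nfa_accepts(w))
```

After reading a, the set is {s1, s2}; each later letter prunes the branches that have no matching transition.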
Slide 25: NFA+ε automata
ε transitions can "fire" without reading the input
[Figure: NFA+ε with transitions labeled a, b, c and one ε transition]
Slides 26-31: NFA+ε run example
[Figure, animated over six slides: the NFA+ε reads an input made of the letters a, b, c]
Now the ε transition can non-deterministically take place
Word accepted
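A run of an NFA+ε can be sketched by closing the current set of states under ε transitions before and after every input letter. The automaton below is an assumption made for illustration (an ε transition that lets the b be skipped), not a reconstruction of the slide's figure; the names `EPS`, `LETTER_DELTA`, and the state names are hypothetical.

```python
# Hypothetical NFA+ε: s0 -a-> s1, s1 -b-> s2, s1 -ε-> s2, s2 -c-> s3.
EPS_START = "s0"
EPS_ACCEPTING = {"s3"}
LETTER_DELTA = {
    ("s0", "a"): {"s1"},
    ("s1", "b"): {"s2"},
    ("s2", "c"): {"s3"},
}
EPS = {"s1": {"s2"}}  # ε transitions: state -> set of states

def epsilon_closure(states, eps):
    """Return all states reachable from `states` using only ε transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for target in eps.get(state, ()):
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return closure

def eps_nfa_accepts(word: str) -> bool:
    """Simulate the NFA+ε on the word, firing ε transitions freely."""
    current = epsilon_closure({EPS_START}, EPS)
    for letter in word:
        moved = {
            target
            for state in current
            for target in LETTER_DELTA.get((state, letter), ())
        }
        # ε transitions may fire without reading the input.
        current = epsilon_closure(moved, EPS)
    return bool(current & EPS_ACCEPTING)
```

In this assumed automaton both `abc` and `ac` are accepted: after reading a, the closure already contains s2, so the c transition is available with or without the b.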
Slide 32: From regular expressions to NFA
Slide 33: From reg. exp. to automata
Theorem: there is an algorithm to build an NFA+ε automaton for any regular expression
Proof: by induction on the structure of the regular expression
For each sub-expression R we build an automaton with exactly one start state and one accepting state
  Start state has no incoming transitions
  Accepting state has no outgoing transitions
Slide 34: From reg. exp. to NFA+ε automata
Theorem: there is an algorithm to build an NFA+ε automaton for any regular expression
Proof: by induction on the structure of the regular expression
Slide 35: Inductive constructions
[Figures: NFA+ε constructions for the cases R = ε, R = a, R1 R2, R1 | R2, and R*]
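The inductive cases can be sketched as a recursive builder in the style of Thompson's construction. Since the slide's figures are not recoverable, the exact wiring below is an assumption; it does keep the stated invariant that each fragment has one start state with no incoming transitions and one accepting state with no outgoing transitions. The tuple encoding of expressions (`("eps",)`, `("lit", a)`, `("cat", r1, r2)`, `("alt", r1, r2)`, `("star", r)`) and the names `Builder` and `matches` are hypothetical.

```python
import itertools
from collections import defaultdict

class Builder:
    """Builds an NFA+ε from a regular expression given as nested tuples."""

    def __init__(self):
        self.counter = itertools.count()
        self.delta = defaultdict(set)  # (state, letter) -> set of states
        self.eps = defaultdict(set)    # state -> set of states (ε moves)

    def build(self, r):
        """Return (start, accept) for the fragment recognizing r."""
        kind = r[0]
        if kind == "cat":
            s1, f1 = self.build(r[1])
            s2, f2 = self.build(r[2])
            self.eps[f1].add(s2)  # glue R1's accepting state to R2's start
            return s1, f2
        # Every other case gets a fresh start and accepting state,
        # so each sub-expression adds at most two states.
        start, accept = next(self.counter), next(self.counter)
        if kind == "eps":
            self.eps[start].add(accept)
        elif kind == "lit":
            self.delta[(start, r[1])].add(accept)
        elif kind == "alt":
            for sub in (r[1], r[2]):
                s, f = self.build(sub)
                self.eps[start].add(s)
                self.eps[f].add(accept)
        elif kind == "star":
            s, f = self.build(r[1])
            self.eps[start] |= {s, accept}  # enter R or skip it
            self.eps[f] |= {s, accept}      # repeat R or leave
        else:
            raise ValueError(f"unknown expression kind: {kind!r}")
        return start, accept

def matches(regex, word):
    """Build the NFA+ε for regex and simulate it on word."""
    b = Builder()
    start, accept = b.build(regex)

    def closure(states):
        result, stack = set(states), list(states)
        while stack:
            for t in b.eps.get(stack.pop(), ()):
                if t not in result:
                    result.add(t)
                    stack.append(t)
        return result

    current = closure({start})
    for letter in word:
        moved = {t for s in current for t in b.delta.get((s, letter), ())}
        current = closure(moved)
    return accept in current
```

For example, a b* c can be encoded as `("cat", ("lit", "a"), ("cat", ("star", ("lit", "b")), ("lit", "c")))` and passed to `matches` together with a word.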
Slide 36: Running time of NFA+ε
Construction requires $O(k)$ states for a reg-exp of length $k$
Running an NFA+ε with $k$ states on a string of length $n$ takes $O(n \cdot k^2)$ time
Can we reduce the $k^2$ factor?
Each state in a configuration of $O(k)$ states may have $O(k)$ outgoing edges, so processing an input letter may take $O(k^2)$ time
Slide 37: From NFA+ε to DFA
Construction requires $O(k)$ states for a reg-exp of length $k$
Running an NFA+ε with $k$ states on a string of length $n$ takes $O(n \cdot k^2)$ time
Can we reduce the $k^2$ factor?
Theorem: for any NFA+ε automaton there exists an equivalent deterministic automaton
Proof: determinization via subset construction
  Number of states in the worst case: $O(2^k)$
  Running time: $O(n)$
Slide 38: NFA determinization
Slide 39: Subset construction
For an NFA+ε with states M = {s1, …, sk}
Construct a DFA with one state per set of states of the corresponding NFA
M' = { [], [s1], [s1,s2], [s2,s3], [s1,s2,s3], … }
Simulate transitions between individual states for every letter
[Figure: the NFA+ε transitions s1 -a-> s2 and s4 -a-> s7 become the DFA transition [s1,s4] -a-> [s2,s7]]
Slide 40: Handling epsilon transitions
Extend macro states by states reachable via ε transitions
[Figure: given the NFA+ε transition s1 -ε-> s4, the DFA macro state [s1,s2] is extended to [s1,s2,s4]]
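Subset construction with ε handling can be sketched as a worklist over macro states. This is a minimal sketch under stated assumptions: it builds only the macro states reachable from the start (rather than all subsets), represents each macro state as a `frozenset`, and is demonstrated on a hypothetical NFA+ε whose state names are not taken from the slides.

```python
from collections import deque

def epsilon_closure(states, eps):
    """Extend a set of states by everything reachable via ε transitions."""
    closure, stack = set(states), list(states)
    while stack:
        for target in eps.get(stack.pop(), ()):
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return closure

def determinize(start, delta, eps, accepting, alphabet):
    """Return (dfa_start, dfa_delta, dfa_accepting) for the given NFA+ε."""
    dfa_start = frozenset(epsilon_closure({start}, eps))
    dfa_delta, dfa_accepting = {}, set()
    seen, queue = {dfa_start}, deque([dfa_start])
    while queue:
        macro = queue.popleft()
        if macro & accepting:
            dfa_accepting.add(macro)  # some member state is accepting
        for letter in alphabet:
            # Simulate the letter on every individual state of the macro state.
            moved = {t for s in macro for t in delta.get((s, letter), ())}
            target = frozenset(epsilon_closure(moved, eps))
            dfa_delta[(macro, letter)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dfa_start, dfa_delta, dfa_accepting

def run_dfa(dfa, word):
    """Run the determinized automaton; one table lookup per input letter."""
    state, table, accepting = dfa
    for letter in word:
        state = table.get((state, letter), frozenset())
    return state in accepting

# Hypothetical NFA+ε: s0 -a-> s1, s1 -b-> s2, s1 -ε-> s2, s2 -c-> s3.
EXAMPLE_DELTA = {("s0", "a"): {"s1"}, ("s1", "b"): {"s2"}, ("s2", "c"): {"s3"}}
EXAMPLE_EPS = {"s1": {"s2"}}
EXAMPLE_DFA = determinize("s0", EXAMPLE_DELTA, EXAMPLE_EPS, {"s3"}, "abc")
```

In this example, reading a from the start macro state leads to the macro state containing both s1 and s2, because s2 is reachable from s1 by the ε transition. The empty macro state plays the role of a dead state.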