Slide1
David Stotts
Computer Science Department
UNC Chapel Hill
Data Structures
and Analysis
(COMP 410)
Slide2
Design Problem
Slide3
Type ahead
Like on Google search, phone typing… you type a few chars and the program fills in a list of possible choices for you… based on the prefix you have typed
Keep typing more chars, the choices narrow and change
Design a data structure that will let you do this
Describe the time complexity of using it… searching it as typing is done, generating alternatives, etc.
Real Problem
Slide4
Discuss an approach with your neighbor
In 5-10 mins we will discuss ideas as a class
Take some time
Slide5
Let’s not use a node to store a whole word
Use a child link to represent a char typed
The path is then the word
Basic idea
[Figure: a trie under <root> with one typed character per child link; word nodes labeled an, as, new, tar, tea, to]
Slide6
Basic idea…
[Figure: the same trie under <root> with more branches added, one typed character per child link; word nodes labeled an, ant, as, nest, net, new, no, tan, tar, tea, to, toe, ton]
This tree encodes (stores) these words:
tar, tan, tea, to, ton, toe, a, an, ant, as, net, nest, new, no
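The idea on these two slides can be sketched in a few lines of Python. This is a minimal illustration, not the course's reference code: the names `Node`, `insert`, and `complete` are made up here, and a dict stands in for the child links purely for brevity. It stores the fourteen words listed above and returns the stored words that extend a typed prefix.

```python
class Node:
    """One trie node; the path of characters from the root spells a prefix."""

    def __init__(self):
        self.child = {}        # char typed -> next Node (a dict, chosen for brevity)
        self.is_word = False   # True when the path to this node is a stored word


def insert(root, word):
    """Follow (or create) one child link per character, then mark the last node."""
    node = root
    for ch in word:
        node = node.child.setdefault(ch, Node())
    node.is_word = True


def complete(root, prefix):
    """Return every stored word that starts with prefix, in alphabetical order."""
    node = root
    for ch in prefix:
        node = node.child.get(ch)
        if node is None:       # nobody stored a word with this prefix
            return []
    found = []

    def walk(n, path):
        if n.is_word:
            found.append(path)
        for ch in sorted(n.child):
            walk(n.child[ch], path + ch)

    walk(node, prefix)
    return found


WORDS = ["tar", "tan", "tea", "to", "ton", "toe", "a", "an", "ant",
         "as", "net", "nest", "new", "no"]
root = Node()
for w in WORDS:
    insert(root, w)

print(complete(root, "t"))    # ['tan', 'tar', 'tea', 'to', 'toe', 'ton']
print(complete(root, "to"))   # ['to', 'toe', 'ton']
print(complete(root, "ne"))   # ['nest', 'net', 'new']
```

Typing one more character simply moves one link further down before the walk starts, which is why the list of choices narrows as the prefix grows.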
Slide7
Trie
Pronounced “try” or “tree”, both ways
Or “trie tree”: tree-tree, try-tree
Comes from “reTRIEval”
Used for prefix-based retrieval of strings formed over an alphabet
This has a name
Slide8
How many children at each node?
As many as there are chars you can type
Let’s say 26 for this example
node {
  string val = null;
  node[26] child = new [null,null,…,null];
  boolean isWord = false;
}
Representation
Slide9
node {
  string val = null;
  node[26] child = new [null,null,…,null];
  boolean isWord = false;
}
Representation
[Figure: one node drawn as a box with fields val (empty), child (array slots 0 through 25), and isWord: false]
Slide10
Representation
[Figure: four nodes, each with child slots 0 through 25]
val: (empty), isWord: false
val: “a”, isWord: true
val: “b”, isWord: false
val: “be”, isWord: true
Slide11
[Figure: the same four nodes (val empty, “a”, “b”, “be”) shown next to the trie they form: <root> with child links a and b, and b with child link e, giving the words a and be]
Representation
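The node layout on these slides might look as follows in Python. This is a sketch under assumptions: `ALPHABET_SIZE`, `slot`, and `insert` are names invented for the example, `is_word` mirrors the slides' `isWord`, and only lower-case a..z is handled. It builds the four nodes drawn above by inserting "a" and "be".

```python
ALPHABET_SIZE = 26  # lower-case a..z, as on the slides


class Node:
    def __init__(self, val=None):
        self.val = val                        # prefix spelled by the path to this node
        self.child = [None] * ALPHABET_SIZE   # slot 0 is 'a', slot 25 is 'z'
        self.is_word = False                  # mirrors isWord on the slides


def slot(ch):
    """Map a lower-case letter to its index in the child array."""
    return ord(ch) - ord("a")


def insert(root, word):
    node = root
    for i, ch in enumerate(word):
        k = slot(ch)
        if node.child[k] is None:
            node.child[k] = Node(word[: i + 1])   # new node remembers its prefix
        node = node.child[k]
    node.is_word = True                           # only the final node is a word


root = Node()
insert(root, "a")
insert(root, "be")

b = root.child[slot("b")]
be = b.child[slot("e")]
print(b.val, b.is_word)     # b False
print(be.val, be.is_word)   # be True
```

The "b" node exists only because "be" passes through it, so its `is_word` flag stays false, matching the third node in the figure.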
Slide12
Big Oh time complexity is always expressed in terms of some problem size
Here the problem size is not the number of words encoded in the tree, as it would be for a BST
Rather we choose M, the length of a word being inserted or searched for
Analysis
Slide13
The worst case time needed to find a word of length M is… O(M)
This is true whether the tree contains 10 words or 10 million words
Length of the longest path in the tree is the length of the longest word stored in the tree
Analysis
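A small experiment can make the O(M) claim concrete. The sketch below is illustrative only: it uses a nested-dict trie with a made-up `"$"` end-of-word marker, and the "large" word set is synthetic. `find` counts how many child links it follows, and that count depends on the word's length, not on how many words are stored.

```python
import itertools


def build(words):
    """Nested-dict trie; the key "$" marks the end of a stored word."""
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node["$"] = True
    return root


def find(root, word):
    """Return (found, links_followed); assumes words contain only letters."""
    node, steps = root, 0
    for ch in word:
        if ch not in node:
            return False, steps
        node = node[ch]
        steps += 1
    return "$" in node, steps


small = build(["tea", "to", "tar"])

# Synthetic larger set: every 3-letter string over a..j (1000 of them), plus "tea".
many = ["".join(p) for p in itertools.product("abcdefghij", repeat=3)] + ["tea"]
big = build(many)

print(find(small, "tea"))   # (True, 3)
print(find(big, "tea"))     # (True, 3)
print(find(small, "te"))    # (False, 2): a prefix, but not a stored word
```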
Slide14
If a word of length M can be made from N different characters (like 26 in the alphabet), then every node can branch N ways at each of the M levels, so the number of possible nodes in the data structure is on the order of
N^M
A trie to store words 20 characters long in an alphabet of 52 chars (upper and lower) has on the order of
52^20
possible nodes
Analysis
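One way to sanity-check a node-count estimate is to count level by level: a node has N child slots, so depth k can hold at most N^k nodes. The sketch below (function names are invented for the example) counts the nodes of a complete trie, one in which every possible word of length M is stored, for small values of N and M.

```python
def count_nodes(n, m):
    """Nodes in a complete trie: alphabet size n, every word of length m stored."""
    if m == 0:
        return 1                          # a leaf: just this node
    return 1 + n * count_nodes(n, m - 1)  # this node plus n full subtrees


def level_sizes(n, m):
    """Number of nodes at each depth 0..m of that complete trie."""
    return [n ** k for k in range(m + 1)]


print(level_sizes(3, 2), count_nodes(3, 2))   # [1, 3, 9] 13
print(count_nodes(26, 3))                     # 18279
```

The bottom level dominates the total, which is why the count is described by the exponential term alone.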
Slide15
Note that if we store 26-character words and limit ourselves to lower case we get
26^26 possible nodes…
This is even worse than 26!
26 * 26 * 26 * … * 26
is worse than
26 * 25 * 24 * … * 2 * 1
Analysis
Slide16
How bad is N! ?
Let's compare, with N = 20
2^N is 2^20, which is about a million
N! is 20!, which is 2.432902e+18
≈ 2,432,902,000,000,000,000
≈ 2,432,902,000,000 * a million
≈ 2.4 trillion millions
Analysis
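The figures on this slide are easy to reproduce. The snippet below is a quick check using only the standard library; the variable names are arbitrary, and N^N is included because it is the count the earlier slide compares against N!.

```python
import math

N = 20
two_n = 2 ** N
fact_n = math.factorial(N)
n_n = N ** N

print(f"2^N = {two_n:,}")    # 2^N = 1,048,576
print(f"N!  = {fact_n:,}")   # N!  = 2,432,902,008,176,640,000
print(f"N^N = {n_n:,}")      # N^N = 104,857,600,000,000,000,000,000,000
```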
Slide17
A trie made to hold 20-character words…
made from 20 lower case characters
Worst case find operation is O(20), or O(N)
Worst case space (every possible word stored)… on the order of N^N nodes, which is even worse than N!
So
-- it's very fast to use
-- impossible (very impractical) to build in time and space
So what?