Professor Xiannong Meng Spring 2018 Lecture and activity contents are based on what Prof Chris Ré used in his CS 145 in the fall 2016 term with permission 2 Finding functional dependencies ID: 783433
Download The PPT/PDF document "Lectures 13: Design Theory II" is the property of its rightful owner. Permission is granted to download and print the materials on this web site for personal, non-commercial use only, and to display it on your personal computer provided you do not modify the materials and that you retain all copyright notices contained in the materials. By downloading content from our website, you accept the terms of this agreement.
Slide1
Lectures 13:Design Theory II
Professor Xiannong Meng
Spring 2018
Lecture and activity contents
are based
on what Prof Chris
Ré
used in his CS 145 in the fall 2016 term
with permission
Slide22. Finding functional dependencies
2
Slide3What you will learn about in this section
“Good” vs. “Bad” FDs: Intuition
Finding FDs
Closures
ACTIVITY: Compute the closures
3
Slide44
“Good” vs. “Bad” FDs
We can start to develop a notion of
good
vs.
bad
FDs:
EmpID
Name
Phone
Position
E0045Smith1234ClerkE3542Mike9876SalesrepE1111Smith9876SalesrepE9999Mary1234Lawyer
Intuitively:EmpID -> Name, Phone, Position is “good FD”Minimal redundancy, less possibility of anomalies
Slide55
“Good” vs. “Bad” FDs
We can start to develop a notion of
good
vs.
bad
FDs:
EmpID
Name
Phone
Position
E0045Smith1234ClerkE3542Mike9876SalesrepE1111Smith9876SalesrepE9999Mary1234Lawyer
Intuitively:EmpID -> Name, Phone, Position is “good FD”But Position -> Phone
is a “bad FD
”
Redundancy! Possibility of data anomalies
Slide6Student
Course
Room
Mary
CS145
B01
Joe
CS145
B01
Sam
CS145
B01......Given a set of FDs (from user) our goal is to:Find all FDs, and Eliminate the “Bad Ones".Returning to our original example… can you see how the “bad FD” {Course} -> {Room} could lead to an:Update AnomalyInsert AnomalyDelete Anomaly…“Good” vs. “Bad” FDs
Slide7FDs for Relational Schema Design
High-level idea:
why do we care about FDs?
Start with some relational
schema
Find out its
functional dependencies (FDs)Use these to design a better schemaOne which minimizes possibility of anomalies
This part can
be tricky!
Slide8Finding Functional Dependencies
There can be a very
large number
of FDs…
How to find them all efficiently?
We can’t necessarily show that any FD will hold
on all instances…How to do this?We will start with this problem:Given a set of FDs, F, what other FDs must hold?
Slide9Equivalent to asking: Given a set of FDs, F = {f1,…
f
n
}, does an FD g hold?
Inference problem
: How do we decide?
Finding Functional Dependencies
Slide10Finding Functional Dependencies
1.
{Name}
{Color}
2. {Category} {Department}3. {Color, Category}
{Price}
Name
Color
CategoryDepPriceGizmoGreenGadgetToys49WidgetBlackGadgetToys59GizmoGreenWhatsitGarden99Which / how many other FDs do?!? Provided FDs:ProductsGiven the provided FDs, we can see that {Name, Category} {Price} must also hold on any instance… Example:
Slide11Equivalent to asking: Given a set of FDs, F = {f1,…
f
n
}, does an FD g hold?
Inference problem
: How do we decide?
Answer: Three simple rules called Armstrong’s Rules.Split/Combine,Reduction, andTransitivity… ideas by picture
Finding Functional Dependencies
Slide121. Split/Combine
A
1
…
A
m
B
1
…
B
n
A
1
, …, A
m
B
1
,…,
B
n
Slide131. Split/Combine
A
1
…
A
m
B
1
…
B
n
A
1
, …, A
m
B
1
,…,
B
n
… is equivalent to the following
n
FDs…
A
1
,…,A
m
B
i
for
i
=1,…,n
Slide141. Split/Combine
A
1
…
A
m
B
1
…
B
n
A
1
, …, A
m
B
1
,…,
B
n
… is equivalent to …
And vice-versa,
A
1
,…,A
m
B
i
for
i
=1,…,n
Slide152. Reduction/Trivial
A
1
…
A
m
A
1
,…,A
m
Aj for any j=1,…,m
Slide163. Transitive Closure
A
1
…
A
m
B
1
…
B
nC1…Ck
A
1
, …, A
m
B
1
,…,
B
n
and
B
1
,…,
B
n
C
1
,…,
C
k
Slide173. Transitive Closure
A
1
…
A
m
B
1
…
B
nC1…Ck
A
1
, …, A
m
B
1
,…,
B
n
and
B
1
,…,
B
n
C
1
,…,
C
k
implies
A
1
,…,A
m
C
1
,…,
C
k
Slide18Finding Functional Dependencies
1.
{Name}
{Color}
2. {Category} {Department}3. {Color, Category}
{Price}
Name
Color
CategoryDepPriceGizmoGreenGadgetToys49WidgetBlackGadgetToys59GizmoGreenWhatsitGarden99Which / how many other FDs hold?Provided FDs:ProductsExample:
Slide19Finding Functional Dependencies
1. {Name}
{Color}
2
. {Category} {Dept.}3. {Color,
Category}
{Price}
Which / how many other FDs hold?
Provided FDs:Inferred FDs:Example:Inferred FDRule used4. {Name, Category} -> {Name}?5. {Name, Category} -> {Color}?6. {Name, Category} -> {Category}?7. {Name, Category -> {Color, Category}?8. {Name, Category} -> {Price}?Rules:1. Split/Combine2. Reduction/Trivial3. Transitive Closure
Slide20Finding Functional Dependencies
1. {Name}
{Color}
2
. {Category} {Dept.}3. {Color,
Category}
{Price}
Can we find
an algorithmic way to do this?Provided FDs:Inferred FDs:Example:Inferred FDRule used4. {Name, Category} -> {Name}Trivial5. {Name, Category} -> {Color}Transitive (4 -> 1)6. {Name, Category} -> {Category}Trivial7. {Name, Category -> {Color, Category}Split/combine (5 + 6)8. {Name, Category} -> {Price}Transitive (7 -> 3)