Homework 1: Solutions

Uploaded On 2017-06-26

Presentation Transcript

Slide1

Homework 1: Solutions

CS4445/B12

Provided by: Kenneth J. Loomis

Slide2

Entropy of the original set

genre   critics-reviews  rating  IMAX   likes
comedy  thumbs-up        R       FALSE  no
comedy  thumbs-up        R       TRUE   no
comedy  neutral          R       FALSE  no
action  thumbs-down      PG-13   TRUE   no
action  neutral          R       TRUE   no
comedy  thumbs-down      PG-13   FALSE  yes
comedy  neutral          PG-13   TRUE   yes
drama   thumbs-up        R       FALSE  yes
drama   thumbs-down      PG-13   TRUE   yes
drama   neutral          R       TRUE   yes
drama   thumbs-up        PG-13   FALSE  yes
action  neutral          R       FALSE  yes
action  thumbs-down      PG-13   FALSE  yes
action  neutral          PG-13   FALSE  yes

Entropy (target attribute)
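The entropy value itself did not survive in this transcript. As a minimal sketch, assuming the standard Shannon entropy in bits that ID3 uses, it can be recomputed from the class counts in the table above (9 yes, 5 no). The function name `entropy` is illustrative and not taken from the slides.

```python
import math

def entropy(counts):
    """Shannon entropy (in bits) of a class distribution given as raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# The likes column above holds 9 "yes" and 5 "no" instances.
h_original = entropy([9, 5])
print(round(h_original, 4))  # 0.9403
```

Every candidate split on the following slides is judged against this starting value of about 0.9403 bits.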

 Slide3

Determine the root node attribute

Candidate split: genre (=comedy, =drama, =action)

Entropy (genre) = .6935

genre   critics-reviews  rating  IMAX   likes
action  thumbs-down      PG-13   TRUE   no
action  neutral          R       TRUE   no
action  neutral          R       FALSE  yes
action  thumbs-down      PG-13   FALSE  yes
action  neutral          PG-13   FALSE  yes
comedy  thumbs-up        R       FALSE  no
comedy  thumbs-up        R       TRUE   no
comedy  neutral          R       FALSE  no
comedy  thumbs-down      PG-13   FALSE  yes
comedy  neutral          PG-13   TRUE   yes
drama   thumbs-up        R       FALSE  yes
drama   thumbs-down      PG-13   TRUE   yes
drama   neutral          R       TRUE   yes
drama   thumbs-up        PG-13   FALSE  yes

Slide4

Determine the root node attribute

Candidate split: critics-reviews (=thumbs-up, =neutral, =thumbs-down)

Entropy (critics-reviews) = .9111

genre   critics-reviews  rating  IMAX   likes
action  neutral          R       TRUE   no
comedy  neutral          R       FALSE  no
action  neutral          R       FALSE  yes
action  neutral          PG-13   FALSE  yes
comedy  neutral          PG-13   TRUE   yes
drama   neutral          R       TRUE   yes
action  thumbs-down      PG-13   TRUE   no
action  thumbs-down      PG-13   FALSE  yes
comedy  thumbs-down      PG-13   FALSE  yes
drama   thumbs-down      PG-13   TRUE   yes
comedy  thumbs-up        R       FALSE  no
comedy  thumbs-up        R       TRUE   no
drama   thumbs-up        R       FALSE  yes
drama   thumbs-up        PG-13   FALSE  yes

Slide5

Determine the root node attribute

Candidate split: rating (=PG-13, =R)

Entropy (rating) = .7885

genre   critics-reviews  rating  IMAX   likes
action  thumbs-down      PG-13   TRUE   no
action  neutral          PG-13   FALSE  yes
comedy  neutral          PG-13   TRUE   yes
action  thumbs-down      PG-13   FALSE  yes
comedy  thumbs-down      PG-13   FALSE  yes
drama   thumbs-down      PG-13   TRUE   yes
drama   thumbs-up        PG-13   FALSE  yes
action  neutral          R       TRUE   no
comedy  neutral          R       FALSE  no
comedy  thumbs-up        R       FALSE  no
comedy  thumbs-up        R       TRUE   no
action  neutral          R       FALSE  yes
drama   neutral          R       TRUE   yes
drama   thumbs-up        R       FALSE  yes

Slide6

Determine the root node attribute

Candidate split: IMAX (=FALSE, =TRUE)

Entropy (IMAX) = .8922

genre   critics-reviews  rating  IMAX   likes
comedy  neutral          R       FALSE  no
comedy  thumbs-up        R       FALSE  no
action  neutral          PG-13   FALSE  yes
action  thumbs-down      PG-13   FALSE  yes
comedy  thumbs-down      PG-13   FALSE  yes
drama   thumbs-up        PG-13   FALSE  yes
action  neutral          R       FALSE  yes
drama   thumbs-up        R       FALSE  yes
action  thumbs-down      PG-13   TRUE   no
action  neutral          R       TRUE   no
comedy  thumbs-up        R       TRUE   no
comedy  neutral          PG-13   TRUE   yes
drama   thumbs-down      PG-13   TRUE   yes
drama   neutral          R       TRUE   yes

Slide7
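The four candidate splits shown so far can be checked numerically. The sketch below assumes the usual ID3 measure: the entropy of likes within each subset, averaged with weights proportional to subset size. The helper names (`entropy`, `split_entropy`) are illustrative and not from the slides; the 14 rows are the training set from the original table.

```python
import math
from collections import Counter

ATTRS = ["genre", "critics-reviews", "rating", "IMAX"]

# The 14 training instances; the last column is the target attribute, likes.
RAW = """
comedy thumbs-up R FALSE no
comedy thumbs-up R TRUE no
comedy neutral R FALSE no
action thumbs-down PG-13 TRUE no
action neutral R TRUE no
comedy thumbs-down PG-13 FALSE yes
comedy neutral PG-13 TRUE yes
drama thumbs-up R FALSE yes
drama thumbs-down PG-13 TRUE yes
drama neutral R TRUE yes
drama thumbs-up PG-13 FALSE yes
action neutral R FALSE yes
action thumbs-down PG-13 FALSE yes
action neutral PG-13 FALSE yes
"""
DATA = [line.split() for line in RAW.strip().splitlines()]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(n / total * math.log2(n / total) for n in Counter(labels).values())

def split_entropy(rows, col):
    """Entropy of likes after splitting on column `col`, weighted by subset size."""
    groups = {}
    for row in rows:
        groups.setdefault(row[col], []).append(row[-1])
    return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

scores = {name: split_entropy(DATA, i) for i, name in enumerate(ATTRS)}
for name, score in scores.items():
    print(f"{name}: {score:.4f}")

root = min(scores, key=scores.get)  # attribute with the lowest weighted entropy
print("root:", root)
```

Picking the lowest weighted entropy is equivalent to picking the highest information gain, because the entropy of the unsplit set is the same for every candidate.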

Determine the root node attribute

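Entropy (genre) = .6935
Entropy (critics-reviews) = .9111
Entropy (rating) = .7885
Entropy (IMAX) = .8922

We can see that genre provides us with the lowest entropy, thus it becomes the root node of our ID3 tree.

Tree so far: genre (=comedy: ?, =drama: ?, =action: ?)

Slide8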

Determine the left child attribute

Tree so far: genre (=comedy: ?, =drama: ?, =action: ?)

We now move on to the left child node of our tree (genre = comedy). What attribute do we choose for this node?

Options: critics-reviews, rating, or IMAX?

Slide9

Determine the left child attribute

Candidate split under genre = comedy: critics-reviews (=thumbs-up, =neutral, =thumbs-down)

Entropy (critics-reviews) = .4000

genre   critics-reviews  rating  IMAX   likes
comedy  neutral          R       FALSE  no
comedy  neutral          PG-13   TRUE   yes
comedy  thumbs-down      PG-13   FALSE  yes
comedy  thumbs-up        R       FALSE  no
comedy  thumbs-up        R       TRUE   no

Slide10

Determine the left child attribute

Candidate split under genre = comedy: rating (=R, =PG-13)

genre   critics-reviews  rating  IMAX   likes
comedy  neutral          PG-13   TRUE   yes
comedy  thumbs-down      PG-13   FALSE  yes
comedy  neutral          R       FALSE  no
comedy  thumbs-up        R       FALSE  no
comedy  thumbs-up        R       TRUE   no

Slide11

Determine the left child attribute

Candidate split under genre = comedy: IMAX (=FALSE, =TRUE)

genre   critics-reviews  rating  IMAX   likes
comedy  neutral          R       FALSE  no
comedy  thumbs-up        R       FALSE  no
comedy  thumbs-down      PG-13   FALSE  yes
comedy  thumbs-up        R       TRUE   no
comedy  neutral          PG-13   TRUE   yes

Slide12

Determine the left child attribute

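Tree so far: genre (=comedy: rating (=R, =PG-13), =drama: ?, =action: ?)

Entropy (critics-reviews) = .4000

We can see that rating provides us with the lowest entropy (0, since both of its subsets are homogeneous), thus it becomes the left child node of our ID3 tree.

Slide13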
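Only the .4000 value survived in this transcript for the comedy branch. As a worked sketch, assuming the same size-weighted entropy used for the root node, and writing H(a, b) for the entropy of a subset with a yes and b no instances (notation introduced here, not taken from the slides), the three candidates for the five comedy instances work out as follows:

```latex
\begin{aligned}
\text{Entropy(critics-reviews)} &= \tfrac{2}{5}H(0,2) + \tfrac{2}{5}H(1,1) + \tfrac{1}{5}H(1,0)
  = \tfrac{2}{5}(0) + \tfrac{2}{5}(1) + \tfrac{1}{5}(0) = 0.4000 \\
\text{Entropy(rating)} &= \tfrac{3}{5}H(0,3) + \tfrac{2}{5}H(2,0)
  = \tfrac{3}{5}(0) + \tfrac{2}{5}(0) = 0 \\
\text{Entropy(IMAX)} &= \tfrac{3}{5}H(1,2) + \tfrac{2}{5}H(1,1)
  \approx \tfrac{3}{5}(0.9183) + \tfrac{2}{5}(1) \approx 0.9510
\end{aligned}
```

The rating split is the only one whose subsets are all pure, which is why it has the lowest value.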

Determine the left child attribute

Tree so far: genre (=comedy: rating (=R: [no], =PG-13: [yes]), =drama: ?, =action: ?)

This also makes this split homogeneous so we can add our leaf nodes here.

genre   critics-reviews  rating  IMAX   likes
comedy  neutral          PG-13   TRUE   yes
comedy  thumbs-down      PG-13   FALSE  yes
comedy  neutral          R       FALSE  no
comedy  thumbs-up        R       FALSE  no
comedy  thumbs-up        R       TRUE   no

Slide14

Determine the center child attribute

Tree so far: genre (=comedy: rating (=R: [no], =PG-13: [yes]), =drama: [yes], =action: ?)

We can see that genre = drama provides us with a homogeneous subset, so we can provide a leaf node here.

genre   critics-reviews  rating  IMAX   likes
drama   thumbs-up        R       FALSE  yes
drama   thumbs-down      PG-13   TRUE   yes
drama   neutral          R       TRUE   yes
drama   thumbs-up        PG-13   FALSE  yes

Slide15

Determine the right child attribute

Tree so far: genre (=comedy: rating (=R: [no], =PG-13: [yes]), =drama: [yes], =action: ?)

We now move on to the right child node of our tree (genre = action). What attribute do we choose for this node?

Options: critics-reviews, rating, or IMAX?

Slide16

Determine the right child attribute

Tree so far: genre (=comedy: rating (=R: [no], =PG-13: [yes]), =drama: [yes], =action: ?)

Candidate split under genre = action: critics-reviews (=thumbs-up, =neutral, =thumbs-down)

genre   critics-reviews  rating  IMAX   likes
action  neutral          R       TRUE   no
action  neutral          R       FALSE  yes
action  neutral          PG-13   FALSE  yes
action  thumbs-down      PG-13   TRUE   no
action  thumbs-down      PG-13   FALSE  yes

Slide17

Determine the right child attribute

Tree so far: genre (=comedy: rating (=R: [no], =PG-13: [yes]), =drama: [yes], =action: ?)

Candidate split under genre = action: rating (=R, =PG-13)

genre   critics-reviews  rating  IMAX   likes
action  thumbs-down      PG-13   TRUE   no
action  neutral          PG-13   FALSE  yes
action  thumbs-down      PG-13   FALSE  yes
action  neutral          R       TRUE   no
action  neutral          R       FALSE  yes

Slide18

Determine the right child attribute

Tree so far: genre (=comedy: rating (=R: [no], =PG-13: [yes]), =drama: [yes], =action: ?)

Candidate split under genre = action: IMAX (=TRUE, =FALSE)

genre   critics-reviews  rating  IMAX   likes
action  neutral          PG-13   FALSE  yes
action  thumbs-down      PG-13   FALSE  yes
action  neutral          R       FALSE  yes
action  thumbs-down      PG-13   TRUE   no
action  neutral          R       TRUE   no

Slide19

Determine the right child attribute

Tree so far: genre (=comedy: rating (=R: [no], =PG-13: [yes]), =drama: [yes], =action: IMAX (=TRUE, =FALSE))

Entropy (critics-reviews) = .9510
Entropy (rating) = .9510
Entropy (IMAX) = 0.0

We can see that IMAX provides us with the lowest entropy, thus it becomes the right child node of our ID3 tree.

Slide20

Determine the right child attribute

Tree: genre (=comedy: rating (=R: [no], =PG-13: [yes]), =drama: [yes], =action: IMAX (=TRUE: [no], =FALSE: [yes]))

This also makes this split homogeneous so we can add our leaf nodes here.

genre   critics-reviews  rating  IMAX   likes
action  neutral          PG-13   FALSE  yes
action  thumbs-down      PG-13   FALSE  yes
action  neutral          R       FALSE  yes
action  thumbs-down      PG-13   TRUE   no
action  neutral          R       TRUE   no

Slide21

ID3 Decision tree is complete

Tree: genre (=comedy: rating (=R: [no], =PG-13: [yes]), =drama: [yes], =action: IMAX (=TRUE: [no], =FALSE: [yes]))

Since we have only leaf nodes remaining we are finished building our tree.

Slide22
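The whole construction can be reproduced with a short recursive routine. This is a minimal sketch of ID3 under stated assumptions, not the course's reference implementation: ties are broken by attribute order, recursion stops when a subset is homogeneous or no attributes remain, and each leaf stores its class counts. The nested-dict tree layout and all function names are illustrative choices.

```python
import math
from collections import Counter

ATTRS = ["genre", "critics-reviews", "rating", "IMAX"]
RAW = """
comedy thumbs-up R FALSE no
comedy thumbs-up R TRUE no
comedy neutral R FALSE no
action thumbs-down PG-13 TRUE no
action neutral R TRUE no
comedy thumbs-down PG-13 FALSE yes
comedy neutral PG-13 TRUE yes
drama thumbs-up R FALSE yes
drama thumbs-down PG-13 TRUE yes
drama neutral R TRUE yes
drama thumbs-up PG-13 FALSE yes
action neutral R FALSE yes
action thumbs-down PG-13 FALSE yes
action neutral PG-13 FALSE yes
"""
DATA = [dict(zip(ATTRS + ["likes"], line.split())) for line in RAW.strip().splitlines()]

def entropy(rows):
    """Shannon entropy (in bits) of the likes column of `rows`."""
    counts = Counter(r["likes"] for r in rows)
    return -sum(n / len(rows) * math.log2(n / len(rows)) for n in counts.values())

def split(rows, attr):
    """Group rows by their value of `attr`."""
    groups = {}
    for r in rows:
        groups.setdefault(r[attr], []).append(r)
    return groups

def split_entropy(rows, attr):
    """Size-weighted entropy of the subsets produced by splitting on `attr`."""
    return sum(len(g) / len(rows) * entropy(g) for g in split(rows, attr).values())

def id3(rows, attrs):
    """Return a leaf ({class: count}) or an internal node ({attr: {value: subtree}})."""
    if len({r["likes"] for r in rows}) == 1 or not attrs:
        return dict(Counter(r["likes"] for r in rows))
    best = min(attrs, key=lambda a: split_entropy(rows, a))
    rest = [a for a in attrs if a != best]
    return {best: {v: id3(g, rest) for v, g in split(rows, best).items()}}

tree = id3(DATA, ATTRS)
print(tree)
```

Storing the class counts at each leaf costs nothing during construction and gives a per-leaf tally of training instances that later steps can reuse.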

Handling missing values during prediction

Tree: genre (=comedy: rating (=R: [no], =PG-13: [yes]), =drama: [yes], =action: IMAX (=TRUE: [no], =FALSE: [yes]))

How can we handle missing values using this decision tree?

Given an instance:

Genre = action
Critics-reviews = ?
Rating = R
IMAX = ?

How do we classify it?

Slide23

Handling missing values during prediction: a solution

Consider adding frequency counts to each leaf node, shown here in curly braces.

Tree: genre (=comedy: rating (=R: [no] {3}, =PG-13: [yes] {2}), =drama: [yes] {4}, =action: IMAX (=TRUE: [no] {2}, =FALSE: [yes] {3}))

Slide24

Handling missing values during prediction: a solution

Genre = action
Critics-reviews = ?
Rating = R
IMAX = ?

Traverse the tree.

Tree: genre (=comedy: rating (=R: [no] {3}, =PG-13: [yes] {2}), =drama: [yes] {4}, =action: IMAX (=TRUE: [no] {2}, =FALSE: [yes] {3}))

Slide25

Handling missing values during prediction: a solution

Genre = action
Critics-reviews = ?
Rating = R
IMAX = ?

Traverse the decision tree normally when the attribute value is known.

Tree: genre (=comedy: rating (=R: [no] {3}, =PG-13: [yes] {2}), =drama: [yes] {4}, =action: IMAX (=TRUE: [no] {2}, =FALSE: [yes] {3}))

Slide26

Handling missing values during prediction: a solution

Genre = action
Critics-reviews = ?
Rating = R
IMAX = ?

Traverse every possible path when a missing value is encountered.

Tree: genre (=comedy: rating (=R: [no] {3}, =PG-13: [yes] {2}), =drama: [yes] {4}, =action: IMAX (=TRUE: [no] {2}, =FALSE: [yes] {3}))

Slide27

Handling missing values during prediction: a solution

Genre = action
Critics-reviews = ?
Rating = R
IMAX = ?

Traverse every possible path when a missing value is encountered.

Sum the frequency counts of all like leaf nodes that are reached: here both IMAX leaves are reached, giving yes = 3 and no = 2.

Tree: genre (=comedy: rating (=R: [no] {3}, =PG-13: [yes] {2}), =drama: [yes] {4}, =action: IMAX (=TRUE: [no] {2}, =FALSE: [yes] {3}))

Slide28

Handling missing values during prediction: a solution

Genre = action
Critics-reviews = ?
Rating = R
IMAX = ?

likes = yes

Follow every possible path when a missing value is encountered.

Determine the frequency count by summing like classification frequencies: yes = 3, no = 2.

Classify based on the highest frequency count.

Tree: genre (=comedy: rating (=R: [no] {3}, =PG-13: [yes] {2}), =drama: [yes] {4}, =action: IMAX (=TRUE: [no] {2}, =FALSE: [yes] {3}))

Slide29

Handling missing values during prediction: 2nd example

Consider this 2nd example:

Genre = ?
Critics-reviews = ?
Rating = R
IMAX = TRUE

likes = no

Tree: genre (=comedy: rating (=R: [no] {3}, =PG-13: [yes] {2}), =drama: [yes] {4}, =action: IMAX (=TRUE: [no] {2}, =FALSE: [yes] {3}))

Slide30

Handling missing values during prediction: 3rd example

Consider if all attribute values are unknown:

Genre = ?
Critics-reviews = ?
Rating = ?
IMAX = ?

likes = yes

Tree: genre (=comedy: rating (=R: [no] {3}, =PG-13: [yes] {2}), =drama: [yes] {4}, =action: IMAX (=TRUE: [no] {2}, =FALSE: [yes] {3}))
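The three examples can be traced with a small traversal routine. This is a sketch of the frequency-count scheme described on the slides, with several illustrative assumptions: the tree is hard-coded as nested tuples, a missing value is represented by leaving the attribute out of the instance dict, and attribute names are written as in the data table. None of the names below come from the original deck.

```python
from collections import Counter

# Internal node: (attribute, {value: child}). Leaf: (class label, frequency count).
TREE = ("genre", {
    "comedy": ("rating", {"R": ("no", 3), "PG-13": ("yes", 2)}),
    "drama": ("yes", 4),
    "action": ("IMAX", {"TRUE": ("no", 2), "FALSE": ("yes", 3)}),
})

def leaf_counts(node, instance):
    """Sum leaf frequency counts per class over every path consistent with `instance`."""
    head, rest = node
    if not isinstance(rest, dict):       # leaf: head is the class, rest is its count
        return Counter({head: rest})
    value = instance.get(head)           # None means the attribute value is missing
    children = rest.values() if value is None else [rest[value]]
    total = Counter()
    for child in children:
        total += leaf_counts(child, instance)
    return total

def classify(instance):
    """Predict the class with the highest summed frequency count."""
    return leaf_counts(TREE, instance).most_common(1)[0][0]

ex1 = {"genre": "action", "rating": "R"}   # critics-reviews and IMAX missing
ex2 = {"rating": "R", "IMAX": "TRUE"}      # genre and critics-reviews missing
ex3 = {}                                   # every attribute missing

for ex in (ex1, ex2, ex3):
    print(dict(leaf_counts(TREE, ex)), "->", classify(ex))
```

For the first example only the two IMAX leaves are reached (yes 3, no 2). For the second, the comedy and action branches each contribute to no (3 + 2 = 5) against 4 for yes. With everything missing, the counts reduce to the class totals of the training set (yes 9, no 5). How to break an exact tie is not covered by the slides, so this sketch leaves it to `most_common`.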