1. Lab #10: Demonstration of AdaBoost
2. Our Data

Training Data:

| Chest Pain | Blocked Arteries | Patient Weight | Heart Disease |
|---|---|---|---|
| Yes | Yes | 205 | Yes |
| No  | Yes | 180 | Yes |
| Yes | No  | 210 | Yes |
| Yes | Yes | 167 | Yes |
| No  | Yes | 156 | No  |
| No  | Yes | 125 | No  |
| Yes | No  | 168 | No  |
| Yes | Yes | 172 | No  |
3. Bagging
Bootstrap the training data (i.e., resample the rows with replacement).
Train a new decision tree Ti on the bootstrapped sample.
[Training data table from slide 2]

4. Bagging
Do this N times.
We now have {T1, T2, T3, ..., TN}.
[Training data table from slide 2]

5. Bagging
Testing Data:

| Chest Pain | Blocked Arteries | Patient Weight | Heart Disease |
|---|---|---|---|
| No | Yes | 158 | ? |

[Training data table from slide 2]

6. Bagging
To classify the test patient, take a majority vote from {T1, T2, T3, ..., TN}.
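The bagging loop on slides 3-6 can be sketched in plain Python. To keep the sketch self-contained, decision stumps (one-split trees) stand in for full decision trees; the data and the 158-pound test patient are from the slides, with Yes/No encoded as 1/0. All function names here are illustrative, not from any library.

```python
import random
from collections import Counter

# Training data from the slides: (chest pain, blocked arteries, weight), Yes/No -> 1/0.
X = [(1, 1, 205), (0, 1, 180), (1, 0, 210), (1, 1, 167),
     (0, 1, 156), (0, 1, 125), (1, 0, 168), (1, 1, 172)]
y = [1, 1, 1, 1, 0, 0, 0, 0]

def fit_stump(X, y):
    """Exhaustively find the one-split tree (feature, threshold) with the fewest errors."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for left in (0, 1):  # label predicted on the x[f] <= t side
                errs = sum((left if x[f] <= t else 1 - left) != yi
                           for x, yi in zip(X, y))
                if best is None or errs < best[0]:
                    best = (errs, f, t, left)
    _, f, t, left = best
    return lambda x: left if x[f] <= t else 1 - left

def bag(X, y, n_trees=25, seed=0):
    """Bagging: bootstrap the rows, train a tree, repeat N times."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # sample with replacement
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return trees

def vote(trees, x):
    """Classify by majority vote over the ensemble."""
    return Counter(t(x) for t in trees).most_common(1)[0][0]

trees = bag(X, y)
print(vote(trees, (0, 1, 158)))  # the test patient from slide 5
```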
7. Random Forests
Bootstrap the data (resample the rows with replacement).
Select a random subset of Pi features.
Train a new decision tree Ti.
(Note: bootstrapping plus a random feature subset per tree is the random-forest recipe, not boosting.)
[Training data table from slide 2]

8. Random Forests
Do this N times.
We now have {T1, T2, T3, ..., TN}.
[Training data table from slide 2]

9. Random Forests
Testing Data:

| Chest Pain | Blocked Arteries | Patient Weight | Heart Disease |
|---|---|---|---|
| No | Yes | 158 | ? |

Take a majority vote from {T1, T2, T3, ..., TN}.
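Adding one line to the bagging recipe gives the variant above: each tree is trained on a bootstrap sample and is only allowed to split on a random subset of the features (the Pi features on the slide). A minimal sketch, with the same slide data and stand-in stumps as before; Pi = 2 here is an illustrative choice.

```python
import random
from collections import Counter

# Training data from the slides: (chest pain, blocked arteries, weight), Yes/No -> 1/0.
X = [(1, 1, 205), (0, 1, 180), (1, 0, 210), (1, 1, 167),
     (0, 1, 156), (0, 1, 125), (1, 0, 168), (1, 1, 172)]
y = [1, 1, 1, 1, 0, 0, 0, 0]

def fit_stump(X, y, feats):
    """Best one-split tree, but only allowed to split on the given feature subset."""
    best = None
    for f in feats:
        for t in sorted({x[f] for x in X}):
            for left in (0, 1):
                errs = sum((left if x[f] <= t else 1 - left) != yi
                           for x, yi in zip(X, y))
                if best is None or errs < best[0]:
                    best = (errs, f, t, left)
    _, f, t, left = best
    return lambda x: left if x[f] <= t else 1 - left

rng = random.Random(0)
trees = []
for _ in range(25):
    idx = [rng.randrange(len(X)) for _ in range(len(X))]  # bootstrap the rows
    feats = rng.sample(range(3), 2)                       # random subset of Pi = 2 features
    trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))

pred = Counter(t((0, 1, 158)) for t in trees).most_common(1)[0][0]
print(pred)  # majority vote for the test patient
```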
10. Ideas
"Fool me once, shame on you; fool me twice, shame on me." (Proverb)
"Fool me once, shame on ... shame on you. Fool me – you can't get fooled again." (George W. Bush)
[Training data table from slide 2]
We have {T1, T2, T3, ..., TN}.

11. Ideas
(Same as slide 10, plus the punchline:)
Let's learn from our mistakes!
12. Gradient Boosting
[Training data table from slide 2]
We have {T1, T2, T3, ..., TN}.

13. Gradient Boosting
Each Ti is:
- a "weak" (simple) decision tree,
- built after the previous tree,
- trained to correct the shortcomings (the errors/residuals) of the previous trees' predictions.
14.–21. Gradient Boosting: illustration [a sequence of figures; images not captured in this transcript]
22. Gradient Boosting
[Training data table from slide 2]
We have {T1, T2, T3, ..., TN}.
We can determine each tree h by using gradient descent: at every step, the new tree is fit to the negative gradient of the loss with respect to the current predictions (for squared loss, this is exactly the residual).
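The residual-fitting loop can be sketched for squared loss, where the negative gradient is exactly the residual y − F(x). This is a minimal illustration, not a full implementation: regression stumps stand in for the weak trees, Heart Disease is encoded as 1/0 so it can be treated as a regression target, and the learning rate 0.3 and 20 rounds are arbitrary choices.

```python
# Gradient boosting with squared loss: each stage fits a regression stump
# to the residuals of the running model F. Toy data from the slides.
X = [(1, 1, 205), (0, 1, 180), (1, 0, 210), (1, 1, 167),
     (0, 1, 156), (0, 1, 125), (1, 0, 168), (1, 1, 172)]
y = [1, 1, 1, 1, 0, 0, 0, 0]

def fit_reg_stump(X, r):
    """Regression stump minimizing squared error against the residuals r."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            left = [r[i] for i, x in enumerate(X) if x[f] <= t]
            right = [r[i] for i, x in enumerate(X) if x[f] > t]
            if not left or not right:
                continue  # skip degenerate splits
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((ri - (lm if x[f] <= t else rm)) ** 2 for x, ri in zip(X, r))
            if best is None or sse < best[0]:
                best = (sse, f, t, lm, rm)
    _, f, t, lm, rm = best
    return lambda x: lm if x[f] <= t else rm

lr = 0.3
F = [sum(y) / len(y)] * len(X)          # F_0: predict the mean
stumps = []
for _ in range(20):
    residuals = [yi - fi for yi, fi in zip(y, F)]
    h = fit_reg_stump(X, residuals)     # learn the current shortcomings
    stumps.append(h)
    F = [fi + lr * h(x) for fi, x in zip(F, X)]

print([round(fi, 2) for fi in F])       # predictions move toward the 1/0 labels
```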
23. Idea
[Training data table from slide 2]
If the target is categorical (a classification task, not regression), we can use AdaBoost.

24. Idea
AdaBoost, at each step:
1. Train a single weak (stump) decision tree Ti.
2. Calculate the total weighted error of its predictions.
3. Use this error (ε) to determine how much stock to place in that tree.
4. Update the weight of each observation (misclassified points are upweighted).
5. Update our running ensemble.
25. AdaBoost
With a minor adjustment to the exponential loss function, gradient descent gives us the AdaBoost algorithm:
1. Choose an initial distribution over the training data: $w_n = \frac{1}{N}$ for $n = 1, \dots, N$.
2. At the $i$-th step, fit a simple classifier $T^{(i)}$ on the weighted training data.
3. Update the weights:
$w_n \leftarrow \frac{w_n \exp\!\left(-\lambda\, y_n T^{(i)}(x_n)\right)}{Z}$,
where $Z$ is the normalizing constant for the collection of updated weights.
4. Update $T \leftarrow T + \lambda T^{(i)}$, where $\lambda$ is the learning rate.
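These steps can be implemented directly. A sketch under the following assumptions: decision stumps serve as the weak classifiers, labels are encoded as ±1, five boosting rounds are run, and the per-round learning rate uses the closed form discussed on the learning-rate slide (slide 34). The helper names are illustrative.

```python
import math

# Training data from the slides; Heart Disease encoded as +1/-1 for AdaBoost.
X = [(1, 1, 205), (0, 1, 180), (1, 0, 210), (1, 1, 167),
     (0, 1, 156), (0, 1, 125), (1, 0, 168), (1, 1, 172)]
y = [1, 1, 1, 1, -1, -1, -1, -1]

def fit_weighted_stump(X, y, w):
    """One-split tree minimizing the *weighted* classification error."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for left in (-1, 1):
                pred = lambda x, f=f, t=t, left=left: left if x[f] <= t else -left
                err = sum(wi for xi, yi, wi in zip(X, y, w) if pred(xi) != yi)
                if best is None or err < best[0]:
                    best = (err, pred)
    return best  # (weighted error eps, classifier)

n = len(X)
w = [1.0 / n] * n                              # step 1: uniform initial distribution
ensemble = []
for _ in range(5):
    eps, T = fit_weighted_stump(X, y, w)       # step 2: fit on weighted data
    lam = 0.5 * math.log((1 - eps) / eps)      # per-round learning rate (slide 34)
    w = [wi * math.exp(-lam * yi * T(xi)) for wi, xi, yi in zip(w, X, y)]
    Z = sum(w)                                 # step 3: normalize by the constant Z
    w = [wi / Z for wi in w]
    ensemble.append((lam, T))                  # step 4: T <- T + lam * T_i

def predict(x):
    """Sign of the lambda-weighted vote of the ensemble."""
    return 1 if sum(lam * T(x) for lam, T in ensemble) > 0 else -1

print([predict(xi) for xi in X])
```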
26. AdaBoost: start with equal weights
27. AdaBoost: fit a simple decision tree
28. AdaBoost: update the weights
29. AdaBoost: fit another simple decision tree on re-weighted data
30. AdaBoost: add the new model to the ensemble
31. AdaBoost: update the weights
32. AdaBoost: fit a third simple decision tree on re-weighted data
33. AdaBoost: add the new model to the ensemble; repeat...
34. Choosing the Learning Rate
Unlike in the case of gradient boosting for regression, we can analytically solve for the optimal learning rate for AdaBoost by optimizing the exponential loss:
$\lambda^{(i)} = \arg\min_{\lambda} \sum_{n} w_n \exp\!\left(-\lambda\, y_n T^{(i)}(x_n)\right)$.
Doing so, we get
$\lambda^{(i)} = \frac{1}{2}\ln\!\frac{1 - \epsilon_i}{\epsilon_i}$,
where $\epsilon_i$ is the weighted classification error of $T^{(i)}$.
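As a quick numeric check of the standard closed form, λ = ½ ln((1 − ε)/ε): accurate classifiers earn a large positive say, a classifier with ε = 0.5 (random guessing) earns zero say, and ε > 0.5 earns a negative say, i.e. its vote is flipped.

```python
import math

def adaboost_lr(eps):
    """Optimal AdaBoost learning rate for a weighted error eps in (0, 1)."""
    return 0.5 * math.log((1 - eps) / eps)

# eps < 0.5 -> positive say; eps = 0.5 -> zero say; eps > 0.5 -> vote flipped.
for eps in (0.1, 0.25, 0.5, 0.75):
    print(f"eps = {eps:.2f}  ->  lambda = {adaboost_lr(eps):+.3f}")
```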