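1. Theorem of total probability

Let $B_1, B_2, \dots, B_N$ be mutually exclusive events whose union equals the sample space $S$. We refer to these sets as a partition of $S$.

An event $A$ can be represented as:

$$A = A \cap S = A \cap (B_1 \cup B_2 \cup \dots \cup B_N) = (A \cap B_1) \cup (A \cap B_2) \cup \dots \cup (A \cap B_N)$$

Since $B_1, B_2, \dots, B_N$ are mutually exclusive, the sets $A \cap B_i$ are mutually exclusive as well, so

$$P(A) = P(A \cap B_1) + P(A \cap B_2) + \dots + P(A \cap B_N)$$

And therefore

$$P(A) = P(A \mid B_1)\,P(B_1) + P(A \mid B_2)\,P(B_2) + \dots + P(A \mid B_N)\,P(B_N) = \sum_{i=1}^{N} P(A \mid B_i)\,P(B_i)$$

This is also called exhaustive conditionalization, or marginalization.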
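As a quick numerical check of the sum above, here is a minimal Python sketch. The function name `total_probability` is made up for illustration; the numbers are the dice figures used in these slides (99% fair dice, 1% loaded dice that show a six half of the time).

```python
def total_probability(priors, likelihoods):
    """Return P(A) = sum_i P(A | B_i) * P(B_i) over a partition B_1..B_N."""
    # A partition must cover the whole sample space, so the priors sum to 1.
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1 for a partition")
    return sum(l * p for l, p in zip(likelihoods, priors))


# Partition: B_1 = "die is fair", B_2 = "die is loaded"; A = "roll a six".
p_six = total_probability(priors=[0.99, 0.01], likelihoods=[1 / 6, 0.5])
print(round(p_six, 2))  # 0.17
```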
2. Bayes theorem

$$P(A \cap B) = P(B)\,P(A \mid B) = P(A)\,P(B \mid A)$$

$$\Rightarrow \quad P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A)}$$

$P(B \mid A)$: posterior probability
$P(A \mid B)$: conditional probability (likelihood)
$P(B)$: prior of $B$
$P(A)$: prior of $A$ (normalizing constant)

This is known as Bayes Theorem or Bayes Rule, and is one of the most useful relations in probability and statistics.

Bayes Theorem is the fundamental relation in Statistical Pattern Recognition.
3. Bayes theorem (cont'd)

Given $B_1, B_2, \dots, B_N$, a partition of the sample space $S$, suppose that event $A$ occurs; what is the probability of event $B_j$?

$$P(B_j \mid A) = \frac{P(A \mid B_j)\,P(B_j)}{P(A)} = \frac{P(A \mid B_j)\,P(B_j)}{\sum_{k=1}^{N} P(A \mid B_k)\,P(B_k)}$$

$P(B_j \mid A)$: posterior probability
$P(A \mid B_j)$: likelihood
$P(B_j)$: prior of $B_j$
$\sum_{k} P(A \mid B_k)\,P(B_k)$: normalizing constant (theorem of total probability)

$B_j$: different models / hypotheses

Having observed $A$, should you choose the model that maximizes $P(B_j \mid A)$ or $P(A \mid B_j)$? That depends on how much you know about the priors $P(B_j)$. When all priors are equal, the two choices coincide.
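The difference between the two choices can be seen in a small Python sketch. The helper name `posterior` is hypothetical. The numbers are the dice figures used in these slides (99% fair, 1% loaded with a 50% chance of a six), with three sixes observed and the rolls assumed independent given the die.

```python
def posterior(priors, likelihoods):
    """Return the list of P(B_j | A) for a partition B_1..B_N (Bayes' theorem)."""
    joint = [l * p for l, p in zip(likelihoods, priors)]  # P(A | B_j) * P(B_j)
    evidence = sum(joint)  # P(A), by the theorem of total probability
    return [j / evidence for j in joint]


hypotheses = ["fair", "loaded"]
priors = [0.99, 0.01]
# P(three sixes | B_j), assuming the rolls are independent given the die.
likelihoods = [(1 / 6) ** 3, 0.5 ** 3]

post = posterior(priors, likelihoods)
ml_choice = hypotheses[likelihoods.index(max(likelihoods))]  # maximizes P(A | B_j)
map_choice = hypotheses[post.index(max(post))]               # maximizes P(B_j | A)
print(ml_choice, map_choice)  # loaded fair
```

The likelihood alone favors the loaded die, but the strong prior toward fair dice keeps the posterior on "fair".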
4. Another example

We've talked about the casino dice: 99% are fair, 1% are loaded (a loaded die shows a six 50% of the time).

We said that if we randomly pick a die and roll it, we have a 17% chance of getting a six.

If we get three sixes in a row, what is the chance that the die is loaded?

How about five sixes in a row?
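5.

$$P(\text{loaded} \mid 666) = \frac{P(666 \mid \text{loaded})\,P(\text{loaded})}{P(666)} = \frac{0.5^3 \times 0.01}{0.5^3 \times 0.01 + (1/6)^3 \times 0.99} \approx 0.21$$

$$P(\text{loaded} \mid 66666) = \frac{P(66666 \mid \text{loaded})\,P(\text{loaded})}{P(66666)} = \frac{0.5^5 \times 0.01}{0.5^5 \times 0.01 + (1/6)^5 \times 0.99} \approx 0.71$$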
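The same calculation can be repeated for any run length. The sketch below is illustrative only: `posterior_loaded` and its parameter names are made up here, and it assumes the rolls are independent given which die was picked.

```python
def posterior_loaded(n_sixes, p_loaded=0.01, p_six_loaded=0.5, p_six_fair=1 / 6):
    """Return P(loaded | n sixes in a row), assuming independent rolls."""
    loaded = p_six_loaded ** n_sixes * p_loaded       # P(sixes | loaded) * P(loaded)
    fair = p_six_fair ** n_sixes * (1 - p_loaded)     # P(sixes | fair) * P(fair)
    return loaded / (loaded + fair)


# Posterior after 1 to 6 consecutive sixes.
for n in range(1, 7):
    print(n, round(posterior_loaded(n), 2))

# Smallest run of sixes after which "loaded" is the more probable hypothesis.
first_majority = next(n for n in range(1, 50) if posterior_loaded(n) > 0.5)
print(first_majority)  # 5
```

Each extra six multiplies the odds in favor of the loaded die by 0.5 / (1/6) = 3, so the 1:99 prior odds are overturned once 3^n exceeds 99.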