T-test A common statistical test: The Z test for different means

davies
Uploaded On 2023-11-04

Presentation Transcript

1. T-test

2. A common statistical test: The Z test for different means
A sample of N = 25 computer science students has mean IQ m = 135. Are they “smarter than average”?
The population mean is 100 with standard deviation 15.
The null hypothesis, H0, is that the CS students are “average”, i.e., the mean IQ of the population of CS students is 100.
What is the probability p of drawing the sample if H0 were true? If p is small, then H0 is probably false.
To answer this, find the sampling distribution of the mean of a sample of size 25 from a population with mean 100.

3. Remember: Central Limit Theorem
The sampling distribution of the mean of samples of size N approaches a normal (Gaussian) distribution as N approaches infinity.
If the samples are drawn from a population with mean $\mu$ and standard deviation $\sigma$, then the mean of the sampling distribution is $\mu$ and its standard deviation is $\sigma_{\bar{x}} = \sigma / \sqrt{N}$, which shrinks as N increases.
These statements hold irrespective of the shape of the original distribution.
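The two statements above can be checked with a small simulation. This is only a sketch: the exponential population (mean 1, standard deviation 1), the number of trials, and the helper name `sample_means` are assumptions chosen for illustration, not part of the slides.

```python
import random
import statistics

def sample_means(draw, n, trials, seed=0):
    """Return the means of `trials` samples of size `n`, each value drawn with `draw(rng)`."""
    rng = random.Random(seed)
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(trials)]

# Assumed population: exponential with mean 1 and standard deviation 1 (clearly non-normal).
means = sample_means(lambda rng: rng.expovariate(1.0), n=25, trials=20000)

mean_of_means = statistics.fmean(means)   # should be close to mu = 1.0
std_of_means = statistics.stdev(means)    # should be close to sigma / sqrt(N) = 1 / 5 = 0.2
print(mean_of_means, std_of_means)
```

Even though the population is strongly skewed, the sample means cluster around the population mean with a spread near $\sigma / \sqrt{N}$.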

4. The sampling distribution for the CS student example
If a sample of N = 25 students were drawn from a population with mean 100 and standard deviation 15 (the null hypothesis), then the sampling distribution of the mean would asymptotically be normal with mean 100 and standard deviation $15 / \sqrt{25} = 3$.
[Figure: sampling distribution of the mean, centered at 100, with the sample mean 135 marked; axis: IQ]
The mean of the CS students falls nearly 12 standard deviations away from the mean of the sampling distribution.
Only about 5% of a normal distribution falls more than two standard deviations away from the mean.
The probability of drawing such a sample if the students were “average” is roughly zero.

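5. The Z test
[Figure: the sampling distribution (mean 100, std = 3) with the sample statistic 135 marked, next to the standardized distribution (mean 0, std = 1.0) with the test statistic 11.67 marked]
The test statistic is the distance of the sample mean from the mean of the sampling distribution, in units of its standard deviation: $Z = (135 - 100) / 3 = 11.67$.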
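The recoding from sample statistic to test statistic can be sketched in a few lines. The function name `z_statistic` is a hypothetical helper; the numbers are those of the CS student example.

```python
import math

def z_statistic(sample_mean, pop_mean, pop_std, n):
    """Distance of the sample mean from the population mean, in standard errors."""
    standard_error = pop_std / math.sqrt(n)   # 15 / sqrt(25) = 3
    return (sample_mean - pop_mean) / standard_error

z = z_statistic(sample_mean=135, pop_mean=100, pop_std=15, n=25)
print(round(z, 2))  # 11.67
```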

6. Reject the null hypothesis?
Commonly we reject H0 when the probability of obtaining a sample statistic (e.g., mean = 135) given the null hypothesis is low, say < .05.
A test statistic value, e.g., Z = 11.67, recodes the sample statistic (mean = 135) to make it easy to find the probability of the sample statistic given H0.

7. Reject the null hypothesis?
We find the probabilities by looking them up in tables, or statistics packages provide them.
For example, Pr(Z ≥ 1.645) = .05 and Pr(Z ≥ 1.96) = .025.
Pr(Z ≥ 11) is approximately zero, so we reject H0.
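As an alternative to a printed table, upper-tail probabilities of the standard normal distribution can be computed with Python's standard library. This is a minimal sketch; `upper_tail` is a hypothetical helper name and the list of Z values is chosen for illustration.

```python
from statistics import NormalDist

def upper_tail(z):
    """One-tailed probability Pr(Z >= z) under the standard normal distribution."""
    return 1.0 - NormalDist().cdf(z)

for z in (1.645, 1.96, 2.33, 11.67):
    print(z, upper_tail(z))
```

The first three values come out near .05, .025 and .01; for Z = 11.67 the result is zero to floating-point precision.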

8. The t test
Same logic as the Z test, but appropriate when the population standard deviation is unknown, samples are small, etc.
The sampling distribution is t, not normal, but approaches normal as sample size increases.
The test statistic has a very similar form, but probabilities of the test statistic are obtained by consulting tables of the t distribution, not the normal.

9. The t test
Suppose N = 5 students have mean IQ = 135, std = 27.
Estimate the standard deviation of the sampling distribution using the sample standard deviation: $27 / \sqrt{5} \approx 12.1$.
[Figure: the sampling distribution (mean 100, std = 12.1) with the sample statistic 135 marked, next to the standardized distribution (mean 0, std = 1.0) with the test statistic 2.89 marked]
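The same computation for the five-student example might look as follows. The helper name `t_statistic` is an assumption for illustration. Note that the unrounded result is about 2.90; the slide's 2.89 appears to come from rounding the standard error to 12.1 first.

```python
import math

def t_statistic(sample_mean, pop_mean, sample_std, n):
    """Return (t, standard error), estimating the standard error from the sample std."""
    standard_error = sample_std / math.sqrt(n)
    return (sample_mean - pop_mean) / standard_error, standard_error

t, se = t_statistic(sample_mean=135, pop_mean=100, sample_std=27, n=5)
print(round(se, 2), round(t, 2))  # about 12.07 and 2.9
```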

10. Summary of hypothesis testing
H0 negates what you want to demonstrate. Find the probability p of the sample statistic under H0 by comparing the test statistic to the sampling distribution. If the probability is low, reject H0 with residual uncertainty proportional to p.
Example: We want to demonstrate that CS graduate students are smarter than average. H0 is that they are average. t = 2.89, p ≤ .022.
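The p value quoted for t = 2.89 can be reproduced without a table by integrating the density of the t distribution numerically. This is a sketch under stated assumptions: the test is taken to be one-tailed with N − 1 = 4 degrees of freedom, and `t_pdf` and `t_upper_tail` are hypothetical helper names, not functions from a statistics package.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """One-tailed Pr(T >= t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The density is symmetric, so the area to the right of 0 is 0.5.
    return 0.5 - total * h / 3

p = t_upper_tail(2.89, df=4)
print(round(p, 3))  # about 0.022
```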

11. p Values
Commonly we reject H0 when the probability of obtaining a sample statistic given the null hypothesis is low, say < .05.
The null hypothesis is rejected but might be true.
We find the probabilities by looking them up in tables, or statistics packages provide them.
The probability of obtaining a particular sample given the null hypothesis is called the p value.
By convention, one usually does not reject the null hypothesis unless p < 0.05 (statistically significant).

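12. Z Test: $Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{N}}$, where $\sigma$ is the standard deviation (population)
t Test: $t = \frac{\bar{x} - \mu}{s / \sqrt{N}}$, where $s$ is the sample standard deviation; used when the population standard deviation is unknown or samples are small
$\mu$ is the population mean, $\bar{x}$ the sample mean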

13. Example
Five cars are parked; the mean price of the cars is 20.270 € and the standard deviation of the sample is 5.811 €.
The mean cost of cars in town is 12.000 € (population).
H0: the parked cars are as expensive as the cars in town.
With N − 1 = 4 degrees of freedom, t = 3.18 has a p value less than 0.025, so we reject H0!
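The car example can be worked through step by step. The sketch below uses the critical-value approach: 2.776 is the standard table value of $t_{0.025}$ for 4 degrees of freedom, and the variable names are chosen for illustration.

```python
import math

n = 5
sample_mean = 20270.0   # mean price of the parked cars, in euros
sample_std = 5811.0     # sample standard deviation, in euros
pop_mean = 12000.0      # mean price of cars in town, in euros

# Table value: Pr(T >= 2.776) = 0.025 for N - 1 = 4 degrees of freedom.
T_CRIT_0025_DF4 = 2.776

standard_error = sample_std / math.sqrt(n)
t = (sample_mean - pop_mean) / standard_error
reject_h0 = t > T_CRIT_0025_DF4

print(round(t, 2), reject_h0)  # 3.18 True
```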

14. Confidence Intervals
Just looking at a figure representing the mean values, we cannot see whether the differences are significant.

15. Paired Sample t Test
Given a set of paired observations (from two normal populations):
A     B     d = A - B
x1    y1    x1 - y1
x2    y2    x2 - y2
x3    y3    x3 - y3
x4    y4    x4 - y4
x5    y5    x5 - y5

16. Calculate the mean $\bar{d}$ and the standard deviation $s$ of the differences.
H0: $\mu_d = 0$ (no difference)
H0: $\mu_d = k$ (difference is a constant)
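A paired test reduces to a one-sample t test on the differences. The sketch below assumes the usual statistic, (mean of differences − k) divided by s / √N with N − 1 degrees of freedom, which the transcript does not spell out. The function name `paired_t` and the data are hypothetical, made up purely for illustration.

```python
import math
import statistics

def paired_t(a, b, k=0.0):
    """Return (t, degrees of freedom) for H0: the mean difference A - B equals k."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = statistics.fmean(diffs)
    s_d = statistics.stdev(diffs)          # sample standard deviation (N - 1 denominator)
    return (mean_d - k) / (s_d / math.sqrt(n)), n - 1

# Hypothetical paired measurements, not taken from the slides.
a = [12.0, 15.0, 11.0, 14.0, 13.0]
b = [10.0, 14.0, 10.0, 11.0, 12.0]

t, df = paired_t(a, b)
print(round(t, 2), df)  # 4.0 4
```

The resulting t is then compared with the t distribution for the given degrees of freedom, exactly as in the one-sample case.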

17. Confidence Intervals ( known)Standard error from the standard deviation 95 Percent confidence interval for normal distribution is about the mean

18. Confidence interval ($\sigma$ unknown)
Standard error from the sample standard deviation: $s_{\bar{x}} = s / \sqrt{N}$
The 95 percent confidence interval for the t distribution ($t_{0.025}$ from a table) is $\bar{x} \pm t_{0.025}\, s_{\bar{x}}$
Previous Example:
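Assuming the "previous example" refers to the five parked cars (the slide's own worked numbers did not survive in the transcript), the interval could be computed as follows. The value 2.776 is the standard table entry for $t_{0.025}$ with 4 degrees of freedom; `t_confidence_interval` is a hypothetical helper name.

```python
import math

def t_confidence_interval(sample_mean, sample_std, n, t_crit):
    """Confidence interval for the mean using the sample std and a tabulated t value."""
    half_width = t_crit * sample_std / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

# Table value of t_0.025 for N - 1 = 4 degrees of freedom.
T_CRIT_0025_DF4 = 2.776

low, high = t_confidence_interval(20270.0, 5811.0, 5, T_CRIT_0025_DF4)
print(round(low), round(high))  # roughly 13056 and 27484 euros
```

The town mean of 12.000 € lies below the lower bound, which agrees with rejecting H0 in the car example.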

19. Example
Perform cross-validation of all your algorithms with Fold Count 4 and 8; Maximum Cases should be 1000.
Which algorithm is the best, and which varies less? Which is the better choice?
For simplification, we indicate error intervals as mean value ± standard deviation.
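Reporting results as mean ± standard deviation might be sketched like this. The algorithm names and per-fold error rates are hypothetical placeholders; in the exercise they would come from your own cross-validation runs.

```python
import statistics

def summarize(errors):
    """Return (mean, sample standard deviation) of per-fold error rates."""
    return statistics.fmean(errors), statistics.stdev(errors)

# Hypothetical per-fold error rates for Fold Count 4, not real results.
fold_errors = {
    "algorithm_a": [0.10, 0.12, 0.11, 0.13],
    "algorithm_b": [0.08, 0.20, 0.05, 0.15],
}

summary = {name: summarize(errs) for name, errs in fold_errors.items()}
for name, (mean, std) in summary.items():
    print(f"{name}: {mean:.3f} +- {std:.3f}")
```

With these made-up numbers the two algorithms have similar mean error, but the first varies far less across folds, which is the kind of trade-off the exercise asks about.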