Hypothesis Testing, Martina Litschmannová

Uploaded On 2024-03-13

Presentation Transcript

1. Hypothesis Testing. Martina Litschmannová, martina.litschmannova@vsb.cz, EA 538

2. Terms Introduced in Prior Chapter
- Population … all possible values
- Sample … a portion of the population
- Statistical inference … generalizing from a sample to a population with a calculated degree of certainty
- Two forms of statistical inference: hypothesis testing and estimation
- Parameter … a characteristic of the population, e.g., population mean μ
- Statistic … calculated from data in the sample, e.g., sample mean x̄

3. Distinctions Between Parameters and Statistics (Exercise 8 and 9 review)

| | Parameters | Statistics |
|---|---|---|
| Source | Population | Sample |
| Notation | Greek (e.g. μ) | Roman (e.g. x̄) |
| Variability | No | Yes |
| Calculated | No | Yes |

4. What is Hypothesis Testing?
A statistical hypothesis is an assumption about a population. This assumption may or may not be true.
Hypothesis testing refers to the formal procedures used by statisticians to reject or fail to reject statistical hypotheses.
The best way to determine whether a statistical hypothesis is true would be to examine the entire population. Since that is often impractical, researchers typically examine a random sample from the population. If the sample data are not consistent with the statistical hypothesis, the hypothesis is rejected.

5. Statistical Hypotheses
There are two types of statistical hypotheses.
- Null hypothesis. The null hypothesis, denoted by H0, is usually the hypothesis that sample observations result purely from chance. H0 is the statement being tested in a test of hypothesis.
- Alternative hypothesis. The alternative hypothesis, denoted by H1 or HA, is the hypothesis that sample observations are influenced by some non-random cause. HA is what is believed to be true if H0 is false.

6. Statistical Hypotheses
In this course, we will assume that the null hypothesis for a population parameter always specifies a single value for that parameter. So, an equal sign always appears: H0: θ = θ0.
If the primary concern is deciding whether a population parameter θ is different from a specified value θ0, the alternative hypothesis should be HA: θ ≠ θ0.
A hypothesis test whose alternative hypothesis has this form is called a two-tailed test.

7. Statistical Hypotheses
If the primary concern is whether a population parameter θ is less than a specified value θ0, the alternative hypothesis should be HA: θ < θ0.
A hypothesis test whose alternative hypothesis has this form is called a left-tailed test.

8. Statistical Hypotheses
If the primary concern is whether a population parameter θ is greater than a specified value θ0, the alternative hypothesis should be HA: θ > θ0.
A hypothesis test whose alternative hypothesis has this form is called a right-tailed test.
A hypothesis test is called a one-tailed test if it is either right- or left-tailed, i.e., if it is not a two-tailed test.

9. Can We Accept the Null Hypothesis?
Some researchers say that a hypothesis test can have one of two outcomes: you accept the null hypothesis or you reject the null hypothesis. Many statisticians, however, take issue with the notion of "accepting the null hypothesis." Instead, they say: you reject the null hypothesis or you don't reject the null hypothesis.
Why the distinction between "accept" and "don't reject"? Acceptance implies that the null hypothesis is true. Not rejecting implies only that the data are not sufficiently persuasive for us to prefer the alternative hypothesis over the null hypothesis.

10. Hypothesis Testing
Hypothesis testing is a formal process to determine whether to reject a null hypothesis, based on sample data. This process consists of four steps.
(1) State the hypotheses. This involves stating the null and alternative hypotheses. The hypotheses are stated in such a way that they are mutually exclusive. That is, if one is true, the other must be false.
(2) Formulate an analysis plan. The analysis plan describes how to use sample data to evaluate the null hypothesis. The evaluation often centers on a single test statistic.
(3) Analyze sample data. Find the value of the test statistic (mean score, proportion, t-score, z-score, etc.) described in the analysis plan.
(4) Interpret results. Apply the decision rule described in the analysis plan. If the value of the test statistic is unlikely, based on the null hypothesis, reject the null hypothesis.

11. Decision Errors

| Your statistical decision | H0 is true (example: the drug doesn't work) | H0 is false (example: the drug works) |
|---|---|---|
| Reject H0 (ex: you conclude that the drug works) | Type I error (α) | Correct |
| Do not reject H0 (ex: you conclude that there is insufficient evidence that the drug works) | Correct | Type II error (β) |

12. Decision Errors
Two types of errors can result from a hypothesis test.
- Type I error. A Type I error occurs when the researcher rejects a null hypothesis that is true. The probability of committing a Type I error is called the significance level. This probability is also called alpha, and is often denoted by α.
- Type II error. A Type II error occurs when the researcher fails to reject a null hypothesis that is false. The probability of committing a Type II error is called beta, and is often denoted by β. The probability of not committing a Type II error, 1 − β, is called the power of the test.
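To make α, β and power concrete, the following is a minimal sketch for a right-tailed z-test of a mean with known standard deviation. The function name and all numbers (`mu0`, `mu_true`, `sigma`, `n`) are hypothetical and chosen only for illustration; they do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def power_right_tailed_z(mu0, mu_true, sigma, n, alpha):
    """Power of a right-tailed z-test of H0: mu = mu0 when the true mean is mu_true."""
    se = sigma / sqrt(n)  # standard error of the sample mean
    # H0 is rejected when the sample mean exceeds this critical value.
    critical_mean = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # beta = P(do not reject H0) when the true mean is mu_true (Type II error).
    beta = NormalDist(mu_true, se).cdf(critical_mean)
    return 1 - beta


# Hypothetical values, for illustration only.
alpha = 0.05
power = power_right_tailed_z(mu0=100, mu_true=105, sigma=15, n=36, alpha=alpha)
beta = 1 - power
print(f"alpha = {alpha}, beta = {beta:.3f}, power = {power:.3f}")
```

When the true mean equals the hypothesized mean, the same function returns α, because rejecting a true H0 is exactly a Type I error.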

13. Decision Rules
The analysis plan includes decision rules for rejecting the null hypothesis. In practice, statisticians describe these decision rules in two ways: with reference to a p-value or with reference to a region of acceptance.
p-value. The strength of evidence in support of a null hypothesis is measured by the p-value. Suppose the test statistic is equal to S. The p-value is the probability of observing a test statistic at least as extreme as S, assuming the null hypothesis is true. If the p-value is less than the significance level, we reject the null hypothesis.

14. Decision Rules
Region of acceptance. The region of acceptance is a range of values. If the test statistic falls within the region of acceptance, the null hypothesis is not rejected. The region of acceptance is defined so that the chance of making a Type I error is equal to the significance level.
The set of values outside the region of acceptance is called the region of rejection. If the test statistic falls within the region of rejection, the null hypothesis is rejected. In such cases, we say that the hypothesis has been rejected at the α level of significance.
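As an illustration of a region of acceptance, the sketch below computes it for a test statistic that follows the standard normal distribution under the null hypothesis. The function name `acceptance_region` and the `tail` labels are assumptions made for this example, not notation from the slides.

```python
from statistics import NormalDist


def acceptance_region(alpha, tail="two"):
    """Region of acceptance (low, high) for a standard normal test statistic."""
    z = NormalDist()  # standard normal null distribution
    if tail == "two":
        c = z.inv_cdf(1 - alpha / 2)
        return (-c, c)
    if tail == "left":
        # Reject for small values, so the region is bounded below only.
        return (z.inv_cdf(alpha), float("inf"))
    if tail == "right":
        # Reject for large values, so the region is bounded above only.
        return (float("-inf"), z.inv_cdf(1 - alpha))
    raise ValueError("tail must be 'two', 'left' or 'right'")


low, high = acceptance_region(0.05, tail="two")
print(f"Do not reject H0 if {low:.3f} <= z <= {high:.3f}")
```

A test statistic outside the returned interval falls in the region of rejection, so the null hypothesis is rejected at the α level of significance.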

15. How to Test Hypotheses?
State the hypotheses. Every hypothesis test requires the analyst to state a null hypothesis and an alternative hypothesis. The hypotheses are stated in such a way that they are mutually exclusive. That is, if one is true, the other must be false; and vice versa.
Formulate an analysis plan. The analysis plan describes how to use sample data to reject or fail to reject the null hypothesis. It should specify the following elements.
- Significance level. Often, researchers choose a significance level equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.

16. How to Test Hypotheses?
- Test method. Typically, the test method involves a test statistic and a sampling distribution. Computed from sample data, the test statistic might be a mean score, proportion, difference between means, difference between proportions, z-score, t-score, chi-square, etc. Given a test statistic and its sampling distribution, a researcher can assess probabilities associated with the test statistic. If the test statistic probability is less than the significance level, the null hypothesis is rejected.

17. How to Test Hypotheses?
Analyze sample data. Using sample data, perform the computations called for in the analysis plan.
- Test statistic. Find the value of the test statistic described in the analysis plan.
- p-value. The p-value is the probability of observing a sample statistic as extreme as the test statistic, assuming the null hypothesis is true.

18. How to Test Hypotheses?
Interpret the results. If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the p-value to the significance level, and rejecting the null hypothesis when the p-value is less than the significance level.
P-value is low, null hypothesis must go!

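19. How to calculate p-value?
The p-value is computed from two ingredients: the observed value of the test statistic, calculated under the assumption that H0 is true, and the distribution function of a random variable with the null distribution. Which tail probability is used depends on the form of HA.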
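The slide's formulas did not survive in this transcript, so the following is only a sketch of the usual textbook rule for a continuous null distribution. The names `x_obs`, `cdf` and `alternative` are illustrative assumptions, not the author's notation.

```python
from statistics import NormalDist


def p_value(x_obs, cdf, alternative):
    """p-value from the observed test statistic and the null distribution function.

    x_obs       observed value of the test statistic
    cdf         distribution function of the null distribution (assumed continuous)
    alternative "less" (left-tailed), "greater" (right-tailed) or "two-sided"
    """
    f = cdf(x_obs)
    if alternative == "less":
        return f
    if alternative == "greater":
        return 1 - f
    if alternative == "two-sided":
        return 2 * min(f, 1 - f)
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")


# Example with a standard normal null distribution and an illustrative statistic.
standard_normal_cdf = NormalDist().cdf
for alt in ("less", "greater", "two-sided"):
    print(alt, round(p_value(-1.75, standard_normal_cdf, alt), 4))
```

Passing the distribution function as an argument keeps the rule independent of the particular test, so the same helper could be reused with a t or chi-square distribution function.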

20. How to interpret results?
α … significance level (typically 0.05)

| p-value | Result |
|---|---|
| p-value < α | We reject H0 at significance level α |
| p-value ≥ α | We do not reject H0 at significance level α |

21. Hypothesis problems
Table columns: Null Hypothesis, Assumptions, Test Statistic, Null Distribution.
Assumptions listed:
- normal or near-normal population, large sample
- normal or near-normal population, small sample

22. Hypothesis problems
Table columns: Null Hypothesis, Assumptions, Test Statistic, Null Distribution.
Assumptions listed:
- independent samples, normal or near-normal populations
- not independent samples (paired data), normal or near-normal populations

23. An inventor has developed a new, energy-efficient lawn mower engine. He claims that the engine will run continuously for 5 hours (300 minutes) on a single gallon of regular gasoline. Suppose a simple random sample of 50 engines is tested. The engines run for an average of 295 minutes, with a standard deviation of 20 minutes. Test the null hypothesis that the mean run time is 300 minutes against the alternative hypothesis that the mean run time is not 300 minutes. Use a 0.05 level of significance. (Assume that run times for the population of engines are normally distributed.)
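One possible way to work this exercise is sketched below. It tests H0: μ = 300 against HA: μ ≠ 300 and, because the sample is fairly large (n = 50), approximates the null distribution with the standard normal distribution; that approximation is an assumption of this sketch. A t distribution with 49 degrees of freedom gives a slightly larger p-value and the same decision.

```python
from math import sqrt
from statistics import NormalDist

# Values from the exercise.
mu0, n, sample_mean, sample_sd, alpha = 300, 50, 295, 20, 0.05

se = sample_sd / sqrt(n)       # estimated standard error of the mean
z = (sample_mean - mu0) / se   # test statistic
# Two-tailed p-value under the normal approximation.
p_value = 2 * NormalDist().cdf(-abs(z))
reject_h0 = p_value < alpha

print(f"z = {z:.3f}, p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

Under these assumptions the p-value exceeds 0.05, so the claim of a 300-minute mean run time is not rejected.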

24. Bon Air Elementary School has 300 students. The principal of the school thinks that the average IQ of students at Bon Air is at least 110. To prove her point, she administers an IQ test to 20 randomly selected students. Among the sampled students, the average IQ is 108 with a standard deviation of 10. Based on these results, should the principal accept or reject her original hypothesis? Assume a significance level of 0.01.
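A possible solution sketch follows, reading the principal's belief as H0: μ = 110 against HA: μ < 110 (a left-tailed test); that reading is an assumption. With only 20 students, it uses a t distribution with 19 degrees of freedom. To stay within the standard library, the helper `t_cdf` integrates the t density numerically with Simpson's rule; it is a hypothetical helper written for this example.

```python
from math import exp, lgamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """Distribution function of Student's t via Simpson's rule on [0, |x|]."""
    b = abs(x)
    if b == 0:
        return 0.5
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 < T < |x|)
    return 0.5 + area if x > 0 else 0.5 - area


# Values from the exercise.
mu0, n, sample_mean, sample_sd, alpha = 110, 20, 108, 10, 0.01

t_stat = (sample_mean - mu0) / (sample_sd / sqrt(n))
df = n - 1
p_value = t_cdf(t_stat, df)  # left-tailed test
reject_h0 = p_value < alpha

print(f"t = {t_stat:.3f}, df = {df}, p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

The p-value is far above 0.01, so under these assumptions the data give no reason to reject the hypothesized mean of 110.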

25. Within a school district, students were randomly assigned to one of two Math teachers: Mrs. Smith and Mrs. Jones. After the assignment, Mrs. Smith had 30 students, and Mrs. Jones had 25 students. At the end of the year, each class took the same standardized test. Mrs. Smith's students had an average test score of 78, with a standard deviation of 10; and Mrs. Jones' students had an average test score of 85, with a standard deviation of 15. Test the hypothesis that Mrs. Smith and Mrs. Jones are equally effective teachers. Use a 0.10 level of significance. (Assume that student performance is approximately normal and the variances of both groups are equal.)
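The sketch below shows one way to approach this exercise, formalizing "equally effective" as H0: μ1 = μ2 against HA: μ1 ≠ μ2 and using the pooled-variance two-sample t-test, since the exercise says to assume equal variances. The helper `t_cdf` is a hypothetical standard-library stand-in for a t distribution function, based on Simpson's rule.

```python
from math import exp, lgamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """Distribution function of Student's t via Simpson's rule on [0, |x|]."""
    b = abs(x)
    if b == 0:
        return 0.5
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


# Values from the exercise: (sample size, mean, standard deviation).
n1, mean1, sd1 = 30, 78, 10   # Mrs. Smith
n2, mean2, sd2 = 25, 85, 15   # Mrs. Jones
alpha = 0.10

df = n1 + n2 - 2
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
se = sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean1 - mean2) / se
p_value = 2 * t_cdf(-abs(t_stat), df)  # two-tailed test
reject_h0 = p_value < alpha

print(f"t = {t_stat:.3f}, df = {df}, p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

With the pooled variance, the p-value falls below 0.10, so under these assumptions the hypothesis of equal effectiveness is rejected at the stated level.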

26. The Acme Company has developed a new battery. The engineer in charge claims that the new battery will operate continuously for at least 7 minutes longer than the old battery. To test the claim, the company selects a simple random sample of 100 new batteries and 100 old batteries. The old batteries run continuously for 190 minutes with a standard deviation of 20 minutes; the new batteries, 200 minutes with a standard deviation of 40 minutes. Test the engineer's claim that the new batteries run at least 7 minutes longer than the old. Use a 0.05 level of significance. (Assume that there are no outliers in either sample and the variances in both groups are unequal.)
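A hedged sketch of one approach: state the claim as H0: μ_new − μ_old = 7 against HA: μ_new − μ_old < 7, use the unpooled standard error because the variances are assumed unequal, and approximate the null distribution by the standard normal since both samples have 100 observations. Both the form of the hypotheses and the normal approximation are assumptions of this sketch.

```python
from math import sqrt
from statistics import NormalDist

# Values from the exercise.
n_new, mean_new, sd_new = 100, 200, 40
n_old, mean_old, sd_old = 100, 190, 20
claimed_diff, alpha = 7, 0.05

# Unpooled standard error of the difference in sample means.
se = sqrt(sd_new**2 / n_new + sd_old**2 / n_old)
z = ((mean_new - mean_old) - claimed_diff) / se
p_value = NormalDist().cdf(z)  # left-tailed test
reject_h0 = p_value < alpha

print(f"z = {z:.3f}, p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

The observed difference of 10 minutes is above the claimed 7, so the left-tailed p-value is large and the engineer's claim is not rejected under these assumptions.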

27. Forty-four sixth graders were randomly selected from a school district. Then, they were divided into 22 matched pairs, each pair having equal IQs. One member of each pair was randomly selected to receive special training. Then, all of the students were given an IQ test. Test results are in dataset IQtest.xls. Do these results provide evidence that the special training helped or hurt student performance? Use a 0.05 level of significance. Assume that the mean differences are approximately normally distributed.

28. The CEO of a large electric utility claims that 80 percent of his 1,000,000 customers are very satisfied with the service they receive. To test this claim, the local newspaper surveyed 100 customers, using simple random sampling. Among the sampled customers, 73 percent say they are very satisfied. Based on these findings, can we reject the CEO's hypothesis that 80% of the customers are very satisfied? Use a 0.05 level of significance.
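This exercise can be sketched as a one-sample z-test for a proportion, with H0: p = 0.80 against HA: p ≠ 0.80. The two-tailed alternative and the normal approximation to the sampling distribution of the sample proportion are assumptions of this sketch.

```python
from math import sqrt
from statistics import NormalDist

# Values from the exercise.
p0, p_hat, n, alpha = 0.80, 0.73, 100, 0.05

# Standard deviation of the sample proportion if H0 is true.
sigma = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / sigma
p_value = 2 * NormalDist().cdf(-abs(z))  # two-tailed test
reject_h0 = p_value < alpha

print(f"z = {z:.2f}, p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

Under these assumptions the p-value is about 0.08, which is above 0.05, so the CEO's claim is not rejected.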

29. Suppose the Acme Drug Company develops a new drug, designed to prevent colds. The company states that the drug is equally effective for men and women. To test this claim, they choose a simple random sample of 100 women and 200 men from a population of 100,000 volunteers. At the end of the study, 38% of the women caught a cold; and 51% of the men caught a cold. Based on these findings, can we reject the company's claim that the drug is equally effective for men and women? Use a 0.05 level of significance.
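A minimal sketch of a two-proportion z-test for this exercise, with H0: p_women = p_men against HA: p_women ≠ p_men. It uses the pooled sample proportion for the standard error and the normal approximation, both of which are assumptions of this sketch.

```python
from math import sqrt
from statistics import NormalDist

# Values from the exercise.
n_women, p_women = 100, 0.38
n_men, p_men = 200, 0.51
alpha = 0.05

# Pooled proportion of volunteers who caught a cold, used because H0 says p_women = p_men.
pooled = (p_women * n_women + p_men * n_men) / (n_women + n_men)
se = sqrt(pooled * (1 - pooled) * (1 / n_women + 1 / n_men))
z = (p_women - p_men) / se
p_value = 2 * NormalDist().cdf(-abs(z))  # two-tailed test
reject_h0 = p_value < alpha

print(f"z = {z:.2f}, p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

Here the p-value is below 0.05, so under these assumptions the claim of equal effectiveness is rejected.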

30. Chi-Square Test for Independence

31. Chi-Square Test for Independence
The test is applied when you have two categorical variables from a single population. It is used to determine whether there is a significant association between the two variables.
For example, in an election survey, voters might be classified by gender (male or female) and voting preference (Democrat, Republican, or Independent). We could use a chi-square test for independence to determine whether gender is related to voting preference.

32. Chi-Square Test for Independence

Voting preferences:

| | Republican | Democrat | Independent | Row total |
|---|---|---|---|---|
| Male | 200 | 150 | 50 | 400 |
| Female | 250 | 300 | 50 | 600 |
| Column total | 450 | 450 | 100 | 1000 |

33. When to Use Chi-Square Test for Independence?
- The sampling method is simple random sampling.
- Each population is at least 10 times as large as its respective sample.
- The variables under study are each categorical.
- If sample data are displayed in a contingency table, the expected frequency count for each cell of the table is at least 5.

34. State the Hypotheses
Suppose that Variable A has r levels, and Variable B has c levels. The null hypothesis states that knowing the level of Variable A does not help you predict the level of Variable B. That is, the variables are independent.
H0: Variable A and Variable B are independent.
HA: Variable A and Variable B are not independent.

35. Chi-Square Test for Independence
H0: Variable A and Variable B are independent.
HA: Variable A and Variable B are not independent.
Test statistic: $\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$, where $O_{ij}$ … observed frequencies, $E_{ij}$ … expected frequencies
Null distribution: chi-square distribution with $(r-1)(c-1)$ degrees of freedom
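As a worked illustration, the sketch below applies the chi-square test for independence to the voting-preference table from slide 32. Expected frequencies are computed as (row total × column total) / grand total, and the statistic sums (observed − expected)² / expected over all cells. The p-value line relies on the fact that for exactly 2 degrees of freedom the chi-square upper tail probability is exp(−x/2); this shortcut is specific to this 2 × 3 table and is not a general formula.

```python
from math import exp

# Observed frequencies from slide 32 (rows: male, female;
# columns: Republican, Democrat, Independent).
observed = [
    [200, 150, 50],
    [250, 300, 50],
]
alpha = 0.05

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency of each cell if the two variables are independent.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)
if df != 2:
    raise ValueError("the closed-form p-value below is valid only for df = 2")
p_value = exp(-chi2 / 2)  # upper tail of the chi-square distribution with 2 df
reject_h0 = p_value < alpha

print(f"chi2 = {chi2:.2f}, df = {df}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

All expected frequencies here are at least 5 (the smallest is 40), and the p-value is far below 0.05, so independence of gender and voting preference is rejected for this table.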

36. A public opinion poll surveyed a simple random sample of 1000 voters. Respondents were classified by gender (male or female) and by voting preference (Republican, Democrat, or Independent). Results are shown in the contingency table in dataset public_opinion.xls. Is there a gender gap? Do the men's voting preferences differ significantly from the women's preferences? Use a 0.05 level of significance. You can use Statgraphics or http://www.quantpsy.org/chisq/chisq.htm

37. Study materials:
- http://homel.vsb.cz/~bri10/Teaching/Bris%20Prob%20&%20Stat.pdf (p. 111 - p. 129)
- http://stattrek.com/tutorials/statistics-tutorial.aspx?Tutorial=Stat (Hypothesis testing)
- https://onlinecourses.science.psu.edu/stat500/node/56 (Chi-Square Test of Independence)