Statistics 200 Lecture # - PowerPoint Presentation

tatyana-admore
Uploaded On 2018-02-23


Presentation Transcript

Slide1

Statistics 200

Lecture #28
Thursday, December 1, 2016
Textbook: 16.1

Objectives:
• Generalize the two-sample t-test to more than two samples.
• Explain how testing equality of means can be rephrased as a test of variance (“Analysis of variance”).
• Formulate null and alternative hypotheses for ANOVA.
• State assumptions necessary for ANOVA.
• Read an ANOVA table.

Slide2

Example of a stemplot (easy to make by hand!)

  The decimal point is 1 digit(s) to the right of the |

   0 |
   1 | 00
   2 |
   3 | 0
   4 | 77
   5 | 2
   6 | 1348
   7 | 0677
   8 | 014558899
   9 | 4445666888899
  10 | 0002234445556788899
  11 | 0000011222223445599

These are the lab grades (out of 120 possible) for one of the 4 lab sections.

Slide3

Motivating Example: Do students change their study habits as they go through college?

In this example, the response is _________ and the explanatory variable is _________ (use quantitative or categorical).

Answer: the response is quantitative and the explanatory variable is categorical.

Slide4

We have seen this in the past. For example, compare male GPA with female GPA.

Quantitative response, categorical predictor

In such cases, we have used difference of two means (D.O.T.M.).

D.O.T.M. works because the categorical variable only has two levels.

But what about the current example?

Slide5

In the current example, our categorical predictor has four levels: Freshman, Sophomore, Junior, Senior.

Quantitative response, categorical predictor

There’s no such thing as “difference of four means”!

(We could try doing all of the pairwise comparisons, but this would be super-tedious AND there is a statistical problem with this approach.)

• Fr vs. So
• Fr vs. Ju
• Fr vs. Se
• So vs. Ju
• So vs. Se
• Ju vs. Se

Try googling “multiple comparisons problem”.

Slide6
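The statistical problem with running every pairwise comparison can be made concrete with a short calculation. The sketch below counts the pairs among the four class levels and estimates the chance of at least one false rejection when every null hypothesis is true. The 0.05 significance level and the independence of the six tests are simplifying assumptions made only for illustration; the real pairwise tests share data and are not independent.

```python
from itertools import combinations

levels = ["Freshman", "Sophomore", "Junior", "Senior"]

# Every unordered pair of class levels: Fr vs. So, Fr vs. Ju, ...
pairs = list(combinations(levels, 2))

alpha = 0.05  # assumed per-test significance level (not stated in the lecture)

# Chance of at least one false rejection across all tests,
# ASSUMING the tests are independent (a simplification).
familywise_error = 1 - (1 - alpha) ** len(pairs)

print(len(pairs))                  # 6
print(round(familywise_error, 3))  # 0.265
```

Under these assumptions, six tests at the 0.05 level give roughly a 26% chance of at least one false alarm, which is the essence of the multiple comparisons problem.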

Quantitative response, categorical predictor

Variable  CollYear   Count  Mean    SE Mean  StDev
StudHrWk  Freshman      47  14.00   1.48     10.17
          Sophomore    178  13.531  0.704    9.397
          Junior        51  14.12   1.32     9.42
          Senior        11  9.73    1.54     5.12

We have all the data we might need to compare the four means. We even know what the null hypothesis should be:

H0: µ1 = µ2 = µ3 = µ4

Slide7

Quantitative response, categorical predictor

Variable  CollYear   Count  Mean    SE Mean  StDev
StudHrWk  Freshman      47  14.00   1.48     10.17
          Sophomore    178  13.531  0.704    9.397
          Junior        51  14.12   1.32     9.42
          Senior        11  9.73    1.54     5.12

It may sound strange, but we’re going to test equality of the means by a procedure called analysis of variance, or ANOVA.

H0: µ1 = µ2 = µ3 = µ4

Slide8

How does ANOVA work to compare means?

Answer: If the means are very different from one another, the variance of the sample means will be large.

See the four red X’s in the plot? Those are the sample means.

Does their variability seem large? What will you compare it to?

Slide9

How does ANOVA work to compare means?

ANOVA works by comparing the variation between the group means to the variation within the groups.

Focus on the horizontal variation (the vertical is meaningless here).

Slide10
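The between-versus-within comparison can be sketched directly from the summary statistics for StudHrWk (count, mean, and standard deviation for each class). This is a minimal illustration using the standard one-way ANOVA sums of squares, not a reproduction of how Minitab computes them; because the summary statistics are rounded, the results only approximately match software output.

```python
# (count, mean, standard deviation) for study hours per week, by class
groups = {
    "Freshman":  (47, 14.00, 10.17),
    "Sophomore": (178, 13.531, 9.397),
    "Junior":    (51, 14.12, 9.42),
    "Senior":    (11, 9.73, 5.12),
}

n_total = sum(n for n, _, _ in groups.values())
k = len(groups)

# Overall mean, weighting each group mean by its sample size
grand_mean = sum(n * mean for n, mean, _ in groups.values()) / n_total

# "Between" variation: how far the group means sit from the grand mean
ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())

# "Within" variation: spread of the observations inside each group
ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())

ms_between = ss_between / (k - 1)       # 3 degrees of freedom
ms_within = ss_within / (n_total - k)   # 283 degrees of freedom

f_stat = ms_between / ms_within
print(round(ss_between, 1), round(ss_within, 1), round(f_stat, 2))
```

A small F statistic, as here, says the group means vary no more than the within-group noise would lead us to expect.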

Another graphical summary of the data:

ANOVA works by comparing the variation between the group means to the variation within the groups.

Looking at the four CIs, one for each mean, does it look like we’ll reject H0?

Slide11

How well do you understand this plot?

Which of the four groups has the largest sample size?

• Freshmen
• Juniors
• Seniors
• Sophomores

A key piece of information is here.

Each MOE equals: multiplier times s/sqrt(n). And the same s is used for each.

Slide12
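Because the multiplier and s are shared across groups, the margin of error depends only on the sample size, so the narrowest interval belongs to the largest group. The sketch below illustrates this with the group counts from the summary table; the multiplier and s are set to placeholder values of 1 (an assumption for illustration), so only the relative widths are meaningful.

```python
from math import sqrt

counts = {"Freshman": 47, "Sophomore": 178, "Junior": 51, "Senior": 11}

def margin_of_error(multiplier, s, n):
    """MOE = multiplier * s / sqrt(n), as stated on the slide."""
    return multiplier * s / sqrt(n)

# Placeholder multiplier = 1 and s = 1: these are NOT the real values,
# they only let us compare interval widths across groups.
relative_moe = {group: margin_of_error(1.0, 1.0, n) for group, n in counts.items()}

narrowest = min(relative_moe, key=relative_moe.get)
widest = max(relative_moe, key=relative_moe.get)
print(narrowest, widest)  # Sophomore Senior
```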

Hypotheses for ANOVA:

H0: Population means are the same for each group.
Ha: Not all population group means are the same.

Remember: There are groups within the population, as defined by their values of the categorical variable.

H0: µ1 = µ2 = … = µk
Ha: Not all µ’s are the same.

In our particular situation…

H0: Each class (F, So, J, Se) has the same mean study time.
Ha: The mean study times are not the same for each class.

Slide13

There are some assumptions made in ANOVA:

An excerpt from page 638:

• Each population group has a normal distribution.
• Each population group has the same standard deviation.

In STAT 200, you will not be asked to check assumptions. However, you must know these two!

Slide14

We may use Minitab to perform ANOVA:

“Factor” is another name for a categorical variable, used often in an ANOVA context.

Slide15

ANOVA output is summarized in a single table:

Analysis of Variance

Source     DF    Adj SS   Adj MS   F-Value   P-Value
CollYear    3     186.7    62.23      0.70     0.552
Error     283   25087.6    88.65
Total     286   25274.2

The top row gives the “between” variation.

The MS, or mean square, is the estimated variance. Note: It equals SS / DF.

Slide16

ANOVA output is summarized in a single table:

Analysis of Variance

Source     DF    Adj SS   Adj MS   F-Value   P-Value
CollYear    3     186.7    62.23      0.70     0.552
Error     283   25087.6    88.65
Total     286   25274.2

The second row gives the “within” variation.

The MS, or mean square, is the estimated variance. Note: It equals SS / DF.

Slide17

ANOVA output is summarized in a single table:

Analysis of Variance

Source     DF    Adj SS   Adj MS   F-Value   P-Value
CollYear    3     186.7    62.23      0.70     0.552
Error     283   25087.6    88.65
Total     286   25274.2

The second row gives the “within” variation.

The F statistic is merely the ratio of the MS between to the MS within. It is the test statistic we use for ANOVA!

Slide18
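The arithmetic behind the Adj MS and F-Value columns can be checked by hand from the SS and DF columns of the study-hours table. The helper name `mean_square` is introduced here only for illustration.

```python
def mean_square(sum_of_squares, degrees_of_freedom):
    """MS = SS / DF."""
    return sum_of_squares / degrees_of_freedom

# Values copied from the ANOVA table for study hours per week
ms_between = mean_square(186.7, 3)      # CollYear row
ms_within = mean_square(25087.6, 283)   # Error row

# The F statistic is the ratio of the two mean squares
f_value = ms_between / ms_within

print(round(ms_between, 2))  # 62.23
print(round(ms_within, 2))   # 88.65
print(round(f_value, 2))     # 0.7
```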

ANOVA output is summarized in a single table:

Analysis of Variance

Source     DF    Adj SS   Adj MS   F-Value   P-Value
CollYear    3     186.7    62.23      0.70     0.552
Error     283   25087.6    88.65
Total     286   25274.2

The p-value is based on the F statistic and the two DF values for between and within.

Slide19
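To show how the F statistic and the two DF values determine the p-value, here is a rough sketch that integrates the F density numerically using only the standard library. Statistical software uses more accurate methods; the function names, the Simpson's-rule approach, and the step count are choices made for this illustration, and the sketch assumes the numerator DF is greater than 2 so the density is finite at zero.

```python
from math import exp, lgamma, log

def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) degrees of freedom."""
    if x <= 0:
        return 0.0  # correct at x = 0 when d1 > 2 (assumed here)
    log_beta = lgamma(d1 / 2) + lgamma(d2 / 2) - lgamma((d1 + d2) / 2)
    log_density = (
        (d1 / 2) * log(d1 / d2)
        + (d1 / 2 - 1) * log(x)
        - ((d1 + d2) / 2) * log(1 + d1 * x / d2)
        - log_beta
    )
    return exp(log_density)

def f_p_value(f_stat, d1, d2, steps=20000):
    """Upper-tail area P(F >= f_stat): 1 minus the area on [0, f_stat].

    The area is approximated with Simpson's rule; steps must be even.
    """
    h = f_stat / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(f_stat, d1, d2)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * f_pdf(i * h, d1, d2)
    cdf = total * h / 3
    return 1 - cdf

# F-Value and DF values from the study-hours ANOVA table
p_value = f_p_value(0.70, 3, 283)
print(round(p_value, 2))  # should land near the 0.552 reported in the table
```

The table's F-Value is rounded to two decimals, so the result agrees with the reported p-value only approximately.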

ANOVA output is summarized in a single table:

Analysis of Variance

Source     DF    Adj SS   Adj MS   F-Value   P-Value
CollYear    3     186.7    62.23      0.70     0.552
Error     283   25087.6    88.65
Total     286   25274.2

One more ANOVA table fact: The MS error, or MS within, is also called the pooled sample variance. You can take its square root to get the pooled standard deviation. The pooled stdev was used in the interval plot seen earlier.

Slide20
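The link between the MS error and the interval plot can be sketched as follows: take the square root of the MS error to get the pooled standard deviation, then build an interval for each class mean. The multiplier of 1.97 is an assumed approximation to the 95% t multiplier with 283 degrees of freedom; the lecture does not state which multiplier the plot used.

```python
from math import sqrt

ms_error = 88.65                # Adj MS for the Error row
pooled_sd = sqrt(ms_error)      # pooled standard deviation

multiplier = 1.97  # ASSUMED: approximate 95% t multiplier for 283 DF

# (count, sample mean) for study hours per week, from the summary table
groups = {
    "Freshman":  (47, 14.00),
    "Sophomore": (178, 13.531),
    "Junior":    (51, 14.12),
    "Senior":    (11, 9.73),
}

intervals = {}
for name, (n, mean) in groups.items():
    moe = multiplier * pooled_sd / sqrt(n)  # same pooled_sd for every group
    intervals[name] = (mean - moe, mean + moe)

print(round(pooled_sd, 3))
for name, (low, high) in intervals.items():
    print(f"{name:<10} {low:6.2f} to {high:6.2f}")
```

Under these assumptions all four intervals overlap, which is consistent with the large p-value in the table.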

Use ANOVA table to write a conclusion

Analysis of Variance

Source     DF    Adj SS   Adj MS   F-Value   P-Value
CollYear    3     186.7    62.23      0.70     0.552
Error     283   25087.6    88.65
Total     286   25274.2

We do not reject the null hypothesis (p-value = 0.552), which means that there is no evidence of any difference among the mean study hours per week for freshmen, sophomores, juniors, and seniors.

Slide21

What about mean GPA goal?

Slide22

What about mean GPA goal?

Analysis of Variance

Source     DF   Adj SS    Adj MS   F-Value   P-Value
CollYear    3    3.294   1.09800     18.14     0.000
Error     283   17.127   0.06052
Total     286   20.421

We reject the null hypothesis (p-value < 0.0005), which means that there is strong evidence that the mean goal GPA is not the same among the four groups of freshmen, sophomores, juniors, and seniors.

You may wonder whether we may then follow up to find out where the differences lie. Yes, but not in this class…

Slide23

If you understand today’s lecture…

16.1, 16.3, 16.4, 16.8, 16.13

Objectives:
• Generalize the two-sample t-test to more than two samples.
• Explain how testing equality of means can be rephrased as a test of variance (“Analysis of variance”).
• Formulate null and alternative hypotheses for ANOVA.
• State assumptions necessary for ANOVA.
• Read an ANOVA table.