Introduction to Statistics for the Social Sciences SBS200 - Lecture


min-jolicoeur
Uploaded On 2019-11-01






Presentation Transcript

Introduction to Statistics for the Social Sciences, SBS200 - Lecture. Section 001, Spring 2019. Room 150 Harvill Building. 9:00 - 9:50 Mondays, Wednesdays & Fridays. http://www.youtube.com/watch?v=oSQJP40PcGI (March 18)

Even if you have not yet registered your clicker, you can still participate. The Green Sheets

Schedule of readings: Before the next exam (April 5th), please read chapters 1 - 11 in the OpenStax textbook, and Chapters 2, 3, and 4 in Plous. Chapter 2: Cognitive Dissonance; Chapter 3: Memory and Hindsight Bias; Chapter 4: Context Dependence.

Lab sessions: Everyone will want to be enrolled in one of the lab sessions. Labs continue this week.

Let's do ANOVA using Excel. A Girl Scout troop leader wondered whether providing an incentive to whoever sold the most Girl Scout cookies would have an effect on the number of cookies sold. She provided a big incentive to one troop (a trip to Hawaii), a lesser incentive to a second troop (a bicycle), and no incentive to a third troop, and then looked to see who sold more cookies. Troop 1 (nada): 10, 8, 12, 7, 13 (n = 5, mean = 10). Troop 2 (bicycle): 12, 14, 10, 11, 13 (n = 5, mean = 12). Troop 3 (Hawaii): 14, 9, 19, 13, 15 (n = 5, mean = 14).
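The lecture runs this ANOVA in Excel; as a minimal sketch of the same computation (the function name and dictionary layout here are illustrative, not from the lecture), the between- and within-groups sums of squares can be computed by hand in Python:

```python
from statistics import fmean

# Cookie sales per troop (from the slides)
troops = {
    "nada":    [10, 8, 12, 7, 13],
    "bicycle": [12, 14, 10, 11, 13],
    "hawaii":  [14, 9, 19, 13, 15],
}

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_scores = [x for g in groups for x in g]
    grand_mean = fmean(all_scores)
    k = len(groups)                       # number of groups
    n_total = len(all_scores)
    # Between-groups SS: each group's size times its mean's squared
    # deviation from the grand mean
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups SS: squared deviations from each group's own mean
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within

f_obs, df1, df2 = one_way_anova(list(troops.values()))
print(f"F({df1},{df2}) = {f_obs:.2f}")   # F(2,12) = 2.73
# The critical F(2,12) at alpha = .05 is 3.89 (from an F table),
# and 2.73 < 3.89, so we do not reject the null hypothesis.
```

This reproduces the observed F of 2.73 reported on the slides, and the same decision: 2.73 does not exceed the critical value 3.89.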

Let's do one replication of the study (new data).

Let's do the same problem using MS Excel.


Means for each group

F critical: is the observed F greater than the critical F? No, so it is not significant; do not reject the null. P-value: is it less than .05? No, so it is not significant; do not reject the null.

Make decision whether or not to reject the null hypothesis: observed F = 2.73, critical F(2,12) = 3.89. 2.73 is not farther out on the curve than 3.89, so we do not reject the null hypothesis. Also, the p-value is not smaller than 0.05, so we do not reject the null hypothesis. Step 5, Conclusion: there appears to be no effect of type of incentive on the number of Girl Scout cookies sold.

Make decision whether or not to reject the null hypothesis: observed F = 2.73, critical F(2,12) = 3.89. 2.73 is not farther out on the curve than 3.89, so we do not reject the null hypothesis. Conclusion: there appears to be no effect of type of incentive on the number of Girl Scout cookies sold, F(2,12) = 2.73; n.s. Written up: The average numbers of cookies sold for three different incentives were compared. The mean number of cookie boxes sold for the "Hawaii" incentive was 14, the mean for the "Bicycle" incentive was 12, and the mean for the "No incentive" condition was 10. An ANOVA was conducted and there appears to be no significant difference in the number of cookies sold as a result of the different levels of incentive, F(2, 12) = 2.73; n.s.

Type I or Type II error? What if we were looking to see whether an individual were guilty of a crime? What would the null hypothesis be? This person is innocent; there is no crime here. Type I error: rejecting a true null hypothesis; saying the person is guilty when they are not (false alarm); sending an innocent person to jail (while the guilty person stays free). Type II error: not rejecting a false null hypothesis; saying the person is innocent when they are guilty (miss); allowing a guilty person to stay free. Two ways to be correct: say they are guilty when they are guilty; say they are not guilty when they are innocent. Two ways to be incorrect: say they are guilty when they are not; say they are not guilty when they are. Which is worse?

Five steps to hypothesis testing. Step 1: Identify the research problem (hypothesis); describe the null and alternative hypotheses. Step 2: Decision rule: alpha level (α = .05 or .01)? One- or two-tailed test (a balance between Type I versus Type II error)? Critical statistic (e.g. z, t, F, or r) value? Step 3: Calculations. Step 4: Make decision whether or not to reject the null hypothesis: if the observed z (or t) is bigger than the critical z (or t), then reject the null. Step 5: Conclusion: tie findings back in to the research problem.

Degrees of freedom (d.f.) is a parameter based on the sample size that is used to determine the value of the t statistic. Degrees of freedom tell how many observations are used to calculate s, less the number of intermediate estimates used in the calculation. We lose one degree of freedom for every parameter we estimate.
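The "lose one degree of freedom per estimated parameter" idea can be seen directly in how the sample variance is computed. A minimal sketch, reusing the Troop 1 data from the ANOVA example (the variable names are illustrative):

```python
import statistics

# Troop 1 cookie sales from the ANOVA example
data = [10, 8, 12, 7, 13]                 # n = 5, mean = 10

n = len(data)
mean = statistics.fmean(data)
ss = sum((x - mean) ** 2 for x in data)   # sum of squared deviations = 26

# The population variance divides by n; the sample estimate divides by
# n - 1 because one degree of freedom was spent estimating the mean.
print(ss / n)        # 5.2  (divide by n:     population variance)
print(ss / (n - 1))  # 6.5  (divide by n - 1: sample variance, df = 4)

# The statistics module makes the same distinction:
assert statistics.pvariance(data) == ss / n
assert statistics.variance(data) == ss / (n - 1)
```

Because the deviations from the estimated mean are forced to sum to zero, only n - 1 of them are free to vary, which is why s uses n - 1 in the denominator.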

Comparing z-score distributions with t-score distributions. Similarities include: using bell-shaped distributions to make confidence interval estimations and decisions in hypothesis testing; using a table to find areas under the curve (a different table, though; the areas often differ from those for z-scores). Summary of the two main differences: 1) we are now estimating the standard deviation from the sample (we don't know the population standard deviation); 2) we have to deal with degrees of freedom.

Comparing z-score distributions with t-score distributions. Differences include: 1) we use the t-distribution when we don't know the standard deviation of the population and have to estimate it from our sample; 2) the shape of the sampling distribution is very sensitive to small sample sizes (it actually changes shape depending on n). Please notice: as sample sizes get smaller, the tails get thicker; as sample sizes get bigger, the tails get thinner and look more like the z-distribution.

Comparing z-score distributions with t-score distributions (continued). 3) Because the shape changes, the relationship between the scores and the proportions under the curve changes (so we would need a different table for every possible n, but just the important ones are summarized in our t-table). Please note: once sample sizes get big enough, the t-distribution (curve) starts to look exactly like the z-distribution (curve).

Interpreting the t-table. Technically, we have a different t-distribution for each sample size; each curve is based on its own degrees of freedom (df), which we use to approximate sample size, and has its own table tying together t-scores with the area under the curve. This t-table summarizes the most useful values for several distributions, organized by degrees of freedom. Remember these useful values for z-scores: 1.64, 1.96, 2.58.


The t-table columns give the area between two scores, the area beyond two scores (out in the tails), and the area in each tail, organized by df. Notice that with a large sample size the values are the same as those useful z-score values: 1.64, 1.96, 2.58.

Hypothesis testing: one-sample t-test. Is the mean of my observed sample consistent with the known population mean, or did it come from some other distribution? We are given the following problem: 800 students took a chemistry exam. Accidentally, 25 students got an additional ten minutes. Did this extra time make a significant difference in the scores? The average number correct for the large class was 74. The scores for the sample of 25 were: 76, 72, 78, 80, 73, 70, 81, 75, 79, 76, 77, 79, 81, 74, 62, 95, 81, 69, 84, 76, 75, 77, 74, 72, 75. Please note: in this example we are comparing our sample mean with the population mean (one-sample t-test). Let's jump right in and do a t-test.

Hypothesis testing, Step 1: Identify the research problem / hypothesis. Did the extra time given to this sample of students affect their chemistry test scores? Describe the null and alternative hypotheses: H0: µ = 74; H1: µ ≠ 74. One-tail or two-tail test?

Hypothesis testing, Step 2: Decision rule. α = .05, two-tail test, n = 25. Degrees of freedom (df) = (n - 1) = (25 - 1) = 24. The earlier table was for z-scores; we use a different table for t-tests.

α = .05, df = 24, two-tail test: critical t(24) = 2.064.

Hypothesis testing, Step 3: Calculations. µ = 74, N = 25. Sample mean: x̄ = Σx / N = 1911 / 25 = 76.44. Taking each score's deviation from the mean (x - x̄), the deviations sum to zero, Σ(x - x̄) = 0, and the squared deviations sum to Σ(x - x̄)² = 868.16. Sample standard deviation: s = √(868.16 / 24) = 6.01.

Hypothesis testing, Step 3 (continued): with s = 6.01, x̄ = 76.44, µ = 74, and N = 25: t = (76.44 - 74) / (6.01 / √25) = 2.44 / 1.20 = 2.03. Observed t(24) = 2.03.
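The Step 3 hand calculation above can be sketched in Python; this is a minimal illustration of the same arithmetic, not part of the lecture (the variable names are mine):

```python
import math
import statistics

# The 25 chemistry scores from the slides
scores = [76, 72, 78, 80, 73, 70, 81, 75, 79, 76,
          77, 79, 81, 74, 62, 95, 81, 69, 84, 76,
          75, 77, 74, 72, 75]
mu = 74                                  # known population mean

n = len(scores)
mean = statistics.fmean(scores)          # 76.44
s = statistics.stdev(scores)             # sample sd (divides by n - 1)
se = s / math.sqrt(n)                    # standard error of the mean
t_obs = (mean - mu) / se

print(f"t({n - 1}) = {t_obs:.2f}")       # t(24) = 2.03
# Critical t(24), two-tailed, alpha = .05, is 2.064 (from the t-table):
# 2.03 < 2.064, so we do not reject the null hypothesis.
```

This reproduces the slide's numbers: x̄ = 76.44, s = 6.01, and observed t(24) = 2.03.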

Hypothesis testing, Step 4: Make decision whether or not to reject the null hypothesis. Observed t(24) = 2.03; critical t(24) = 2.064. 2.03 is not farther out on the curve than 2.064, so we do not reject the null hypothesis. Step 5, Conclusion: the extra time did not have a significant effect on the scores.

Hypothesis testing: did the extra time given to these 25 students affect their average test score? Notice we are comparing a sample mean with a population mean: a single-sample t-test. Written up: The mean score for those students who were given extra time was 76.44 percent correct, while the mean score for the rest of the class was only 74 percent correct. A t-test was completed and there appears to be no significant difference in the test scores for these two groups, t(24) = 2.03; n.s. To write up results: start the summary with the two means (based on the DV) for the two levels of the IV; describe the type of test (t-test versus z-test) with a brief overview of the results; finish with the statistical summary, giving the type of test with degrees of freedom and the value of the observed statistic: t(24) = 2.03; n.s. Here n.s. means "not significant" and p < 0.05 means "significant"; if the results *were* significant, it would read, for example, t(24) = -5.71; p < 0.05.
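The reporting convention above (observed statistic with df, then "n.s." or "p < 0.05") is mechanical enough to sketch as a small helper; the function name and signature here are illustrative, not from the lecture:

```python
def report_t(t_obs, df, t_crit):
    """Format a t-test result in the lecture's reporting style:
    't(df) = value; n.s.' or 't(df) = value; p < 0.05'."""
    # Two-tailed decision: significant if |observed t| exceeds critical t
    tail = "p < 0.05" if abs(t_obs) > t_crit else "n.s."
    return f"t({df}) = {t_obs:.2f}; {tail}"

print(report_t(2.03, 24, 2.064))     # t(24) = 2.03; n.s.
print(report_t(-5.71, 24, 2.064))    # t(24) = -5.71; p < 0.05
```

Note the absolute value: for a two-tailed test, a large negative observed t is just as significant as a large positive one.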

Preview of homework assignment


Thank you! See you next time!!