Slide1
Using Simulation to Introduce Concepts of Statistical Inference
Allan Rossman
Cal Poly – San Luis Obispo
arossman@calpoly.edu
http://statweb.calpoly.edu/arossman/
Slide2
Advertisement
I will present a longer, more interactive version of this in a workshop on Saturday afternoon in Lincoln room
Lunch provided!
Thanks to John Wiley and Sons
Rossman
Northwest Two-Year College Math Conf
Slide3
Outline
Who are you?
Overview, motivation
Four examples
Advantages/merits
Implementation suggestions
Assessment suggestions
Resources
Q&A
Slide4
Who are you?
How many years have you been teaching?
< 1 year
1-3 years
4-8 years
8-15 years
> 15 years
Slide5
Who are you?
How many years have you been teaching statistics?
Never
1-3 years
4-8 years
8-15 years
> 15 years
Slide6
Who are you?
What is your background in statistics?
No formal background
A course or two
Several courses but no degree
Undergraduate degree in statistics
Graduate degree in statistics
Other
Slide7
Who are you?
Have you used simulation in teaching statistics?
Never
A bit, to demonstrate probability ideas
Somewhat, to demonstrate sampling distributions
A great deal, as an inference tool as well as for pedagogical demonstrations
Slide8
Motivation
“Ptolemy’s cosmology was needlessly complicated, because he put the earth at the center of his system, instead of putting the sun at the center. Our curriculum is needlessly complicated because we put the normal distribution, as an approximate sampling distribution for the mean, at the center of our curriculum, instead of putting the core logic of inference at the center.”
– George Cobb (TISE, 2007)
Slide9
Example 1: Helper/hinderer?
Sixteen pre-verbal infants were shown two videos of a toy trying to climb a hill
One where a “helper” toy pushes the original toy up
One where a “hinderer” toy pushes the toy back down
Infants were then presented with the two toys from the videos
Researchers noted which toy the infant chose to play with
http://www.yale.edu/infantlab/socialevaluation/Helper-Hinderer.html
Slide10
Example 1: Helper/hinderer?
Data: 14 of the 16 infants chose the “helper” toy
Two possible explanations
Infants choose randomly, no genuine preference, researchers just got lucky
Infants have a genuine preference for the helper toy
Core question of inference:
Is such an extreme result unlikely to occur by chance (random choice) alone …
… if there were no genuine preference (null model)?
Slide11
Analysis options
Could use the normal approximation to the binomial, but sample size is too small for CLT
Could use a binomial probability calculation
We prefer a simulation approach
To illustrate “how often would we get a result like this just by random chance?”
Starting with tactile simulation
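For comparison, the binomial probability calculation listed among the options above can be sketched in a few lines. This is an illustrative sketch, not part of the talk: it assumes the null model is a fair coin (probability 0.5) and uses the 14-of-16 result from the data slide. The function name `binomial_upper_tail` is hypothetical.

```python
from math import comb

def binomial_upper_tail(n, k, p=0.5):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k, n + 1))

# 14 or more of 16 infants choosing the helper toy, if each choice is a fair coin flip
exact_p_value = binomial_upper_tail(16, 14)
print(f"Exact p-value: {exact_p_value:.4f}")
```

The three terms in the sum are 120, 16 and 1 out of 65536 equally likely sequences, so the exact probability is 137/65536, about 0.002.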
Slide12
Strategy
Students flip a fair coin 16 times
Count number of heads, representing choices of helper and hinderer toys
Under the null model of no genuine preference
Repeat several times, combine results
See how surprising it is to get 14 or more heads even with “such a small sample size”
Approximate (empirical) p-value
Turn to applet for large number of repetitions: www.rossmanchance.com/ISIapplets.html (One Proportion)
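The coin-flipping strategy on this slide could also be run in code rather than in the applet. The sketch below is an assumption-laden stand-in for the applet, not its actual implementation: the function name, the default of 10,000 repetitions and the fixed seed are all illustrative choices.

```python
import random

def simulate_p_value(n_tosses=16, observed=14, reps=10_000, prob_heads=0.5, seed=1):
    """Approximate P(heads >= observed) under the null model by repeated coin flipping."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    extreme = 0
    for _ in range(reps):
        # One repetition: flip the coin n_tosses times and count heads
        heads = sum(rng.random() < prob_heads for _ in range(n_tosses))
        if heads >= observed:
            extreme += 1
    return extreme / reps

approx_p_value = simulate_p_value()
print(f"Approximate p-value: {approx_p_value:.4f}")
```

The three arguments `prob_heads`, `n_tosses` and `reps` correspond to the decisions students make when setting up the simulation: what the null model says, how many infants there were, and how many times to repeat.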
Slide13
Results
Pretty unlikely to obtain 14 or more heads in 16 tosses of a fair coin, so …
Pretty strong evidence that pre-verbal infants do have a genuine preference for helper toy and were not just choosing at random
Slide14
Follow-up activity
Facial prototyping
Who is on the left – Bob or Tim?
Slide15
Follow-up activity
Facial prototyping
Does our sample result provide convincing evidence that people have a genuine tendency to assign the name Tim to the face on the left?
How can we use simulation to investigate this question?
What conclusion would you draw?
Explain reasoning process behind conclusion
Slide16
Example 2: Dolphin therapy?
Subjects who suffer from mild to moderate depression were flown to Honduras, randomly assigned to a treatment
Is dolphin therapy more effective than control?
Core question of inference:
Is such an extreme difference unlikely to occur by chance (random assignment) alone (if there were no treatment effect)?
Slide17
Some approaches
Could calculate test statistic, p-value from approximate sampling distribution (z, chi-square)
But it’s approximate
But conditions might not hold
But how does this relate to what “significance” means?
Could conduct Fisher’s Exact Test
But there’s a lot of mathematical start-up required
But that’s still not closely tied to what “significance” means
Even though this is a randomization test
Slide18
Alternative approach
Simulate random assignment process many times, see how often such an extreme result occurs
30 index cards representing 30 subjects
Assume no treatment effect (null model)
13 improver cards, 17 non-improver cards
Re-randomize 30 subjects to two groups of 15 and 15
Determine number of improvers in dolphin group
Or, equivalently, difference in improvement proportions
Repeat large number of times (turn to computer)
Ask whether observed result is in tail of distribution
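The index-card procedure above translates directly into a shuffle-and-deal loop. This is a minimal sketch under stated assumptions: the card counts (13 improvers, 17 non-improvers, groups of 15) come from the slide, but the observed count of 10 improvers in the dolphin group is not stated here. It is taken from the "10 or more successes" wording on the later assessment slide and should be replaced with the study's actual value. The function name, repetition count and seed are illustrative.

```python
import random

def dolphin_randomization(observed=10, improvers=13, non_improvers=17,
                          group_size=15, reps=10_000, seed=1):
    """Re-randomize the cards and report how often the dolphin group
    gets at least `observed` improvers by chance alone."""
    rng = random.Random(seed)
    cards = [1] * improvers + [0] * non_improvers  # 1 = improver card, 0 = non-improver card
    extreme = 0
    for _ in range(reps):
        rng.shuffle(cards)                          # shuffle all 30 cards
        in_dolphin_group = sum(cards[:group_size])  # deal the first 15 to the dolphin group
        if in_dolphin_group >= observed:
            extreme += 1
    return extreme / reps

approx_p_value = dolphin_randomization()
print(f"Approximate p-value: {approx_p_value:.4f}")
```

Counting improvers in the dolphin group is equivalent to tracking the difference in improvement proportions, because the totals are fixed once the cards are made.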
Slide19
Analysis
www.rossmanchance.com/ISIapplets.html (Two Proportions)
Slide20
Conclusion
Experimental result is statistically significant
And what is the logic behind that?
Observed result very unlikely to occur by chance (random assignment) alone (if dolphin therapy was not effective)
Providing evidence that dolphin therapy is more effective
Slide21
Example 3: Lingering sleep deprivation?
Does sleep deprivation have harmful effects on cognitive functioning three days later?
21 subjects; random assignment
Core question of inference:
Is such an extreme difference unlikely to occur by chance (random assignment) alone (if there were no treatment effect)?
Slide22
One approach
Calculate test statistic, p-value from approximate sampling distribution
Slide23
Another approach
Simulate randomization process many times under null model, see how often such an extreme result (difference in group means) occurs
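A rough sketch of this randomization approach for a quantitative response follows. The slides do not reproduce the study's data, so the two lists below are made-up placeholder scores, used only to make the sketch runnable. Only the total of 21 subjects comes from the slide; the 11/10 split, the function name and the defaults are assumptions.

```python
import random

def randomization_test_means(group_a, group_b, reps=10_000, seed=1):
    """Return the observed difference in means (a - b) and the proportion of
    re-randomizations with a difference at least that large."""
    rng = random.Random(seed)
    mean = lambda values: sum(values) / len(values)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)  # under the null model, group labels are arbitrary
    n_a = len(group_a)
    extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)                 # re-assign subjects to groups at random
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            extreme += 1
    return observed, extreme / reps

# PLACEHOLDER scores, not the study's data: 11 unrestricted-sleep and 10 sleep-deprived subjects
unrestricted = [25, 21, 18, 30, 15, 22, 12, 19, 28, 10, 24]
deprived = [8, 14, -5, 20, 3, 11, 16, -2, 9, 6]

observed_diff, approx_p_value = randomization_test_means(unrestricted, deprived)
print(f"Observed difference in means: {observed_diff:.2f}")
print(f"Approximate p-value: {approx_p_value:.4f}")
```

Swapping `mean` for a median or another statistic changes only one line, which is the flexibility in choice of test statistic that this approach offers.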
Slide24
Example 4: Draft lottery
Slide25
Closer look
r = -0.226
Slide26
Familiar refrain
How often would such an extreme result (r < -0.226 or r > 0.226) occur by chance alone from a fair, random lottery?
Simulate!
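One way to sketch this simulation: under a fair lottery, the draft numbers are a random shuffle of the birthdays, so shuffle repeatedly and see how often the correlation is as extreme as the observed one. The slide does not give the number of dates; 366 is assumed here (one per possible birthday), and the function names, repetition count and seed are illustrative.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def lottery_simulation(observed_r=-0.226, n_days=366, reps=1_000, seed=1):
    """Proportion of fair lotteries with a correlation at least as extreme as observed_r."""
    rng = random.Random(seed)
    days = list(range(1, n_days + 1))           # sequential birthday number
    draft_numbers = list(range(1, n_days + 1))  # draft numbers to be assigned at random
    extreme = 0
    for _ in range(reps):
        rng.shuffle(draft_numbers)              # a fair, random lottery
        if abs(pearson_r(days, draft_numbers)) >= abs(observed_r):
            extreme += 1
    return extreme / reps

approx_p_value = lottery_simulation()
print(f"Approximate two-sided p-value: {approx_p_value:.4f}")
```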
Slide27
Simulation result
Such an extreme result would virtually never occur from a fair, random lottery
Overwhelming evidence that lottery used was not random
Slide28
Advantages
You can do this from beginning of course!
Emphasizes entire process of conducting statistical investigations to answer real research questions
From data collection to inference in one day
As opposed to disconnected blocks of data analysis, then data collection, then probability, then statistical inference
Leads to deeper understanding of concepts such as statistical significance, p-value, confidence
Very powerful, easily generalized tool
Flexibility in choice of test statistic (e.g. medians, odds ratio)
Generalize to more than two groups
Slide29
Implementation suggestions
Begin every example/activity with fundamental questions about the study/data
Observational units?
Variables?
Types (cat/quant) and roles (expl/resp) of variables
Observational study or experiment?
Random sampling?
Random assignment?
Slide30
Implementation suggestions
Emphasize four pillars of inference
Is there a significant effect/difference?
How large is it?
To what population can you generalize?
Can you draw a cause/effect conclusion?
Notice that last two questions highlight distinction between random sampling and random assignment
Slide31
Implementation suggestions
What about normal-based methods: why?
Do not ignore them!
A common shape often arises for empirical randomization/sampling distributions
Duh!
Students will see t-tests in other courses, research literature
Process of standardization has inherent value
Gain intuition through formulas
Slide32
Implementation suggestions
What about normal-based methods: how?
Introduce after students have gained experience with randomization-based methods
As a prediction of how simulation results would turn out
Focus on standard deviation of statistic (standard error)
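The idea of a formula as a prediction of how simulation results would turn out can be illustrated by comparing the standard error formula for a sample proportion with the standard deviation of simulated proportions. This sketch assumes the one-proportion setting of the helper/hinderer example (n = 16, null probability 0.5); the function names and defaults are hypothetical.

```python
import random
from math import sqrt
from statistics import pstdev

def simulated_sd(n=16, p=0.5, reps=10_000, seed=1):
    """Standard deviation of simulated sample proportions under the null model."""
    rng = random.Random(seed)
    proportions = [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]
    return pstdev(proportions)

def theoretical_se(n=16, p=0.5):
    """Standard error of a sample proportion: sqrt(p(1 - p) / n)."""
    return sqrt(p * (1 - p) / n)

print(f"Simulated SD of the statistic: {simulated_sd():.4f}")
print(f"Formula prediction (standard error): {theoretical_se():.4f}")
```

With n = 16 and p = 0.5 the formula gives exactly 0.125, and the simulated value should land close to it.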
Slide33
Implementation suggestions
What about interval estimation?
Two possible simulation-based approaches
Invert test
Test “all” possible values of parameter, see which do not put observed result in tail
Easy enough (but tedious) with one-proportion situation (sliders), but not as obvious how to do this with comparing two proportions
Estimate +/- margin-of-error
Could estimate margin-of-error with simulated randomization distribution
Rough confidence interval as statistic ± 2×(SD of statistic)
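The "invert test" approach for one proportion can be sketched as a loop over candidate parameter values. To keep the output deterministic, this sketch swaps in exact binomial tail probabilities where the applet would use simulated ones. The grid of candidates (0.01 to 0.99), the two-sided rule (twice the smaller tail) and the 0.05 cutoff are all assumptions, as are the function names. The data are the 14-of-16 helper/hinderer result.

```python
from math import comb

def binom_pmf(n, x, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def two_sided_p_value(n, observed, p):
    """Twice the smaller tail probability, capped at 1 (one of several two-sided conventions)."""
    lower = sum(binom_pmf(n, x, p) for x in range(0, observed + 1))
    upper = sum(binom_pmf(n, x, p) for x in range(observed, n + 1))
    return min(1.0, 2 * min(lower, upper))

def plausible_values(n=16, observed=14, alpha=0.05):
    """Candidate parameter values that do not put the observed result in the tail."""
    grid = [i / 100 for i in range(1, 100)]
    return [p for p in grid if two_sided_p_value(n, observed, p) > alpha]

kept = plausible_values()
print(f"Plausible values of the parameter: {kept[0]:.2f} to {kept[-1]:.2f}")
```

Moving a slider through parameter values in the applet plays the same role as the loop over `grid` here.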
Slide34
Implementation suggestions
Can we introduce SBI (simulation-based inference) gradually?
Yes!
One class period:
Use helper/hinderer activity to introduce concepts of statistical significance, p-value, could this have happened by random chance alone
Two class periods:
Also use dolphin therapy activity to introduce inference for comparing two groups (chance = random assignment)
Three class periods:
Also use sleep deprivation activity prior to two-sample t-tests (for quantitative response)
Four class periods:
Also use draft lottery activity (two quantitative variables)
Slide35
Assessment suggestions
Quick assessment of understanding of class activity
What did the cards represent?
What did shuffling and dealing the cards represent?
What implicit assumption about the two groups did the shuffling of cards represent?
What observational units were represented by the dots on the dotplot?
Why did we count the number of repetitions with 10 or more “successes” (that is, why 10 and why “or more”)?
Slide36
Assessment suggestions
Conceptual understanding of logic of inference
Interpret p-value in context: Probability of observed data, or more extreme, under randomness hypothesis, if null model is true
Summarize conclusion in context, and explain reasoning process
Apply to new studies, new scenarios
Define null model, design simulation, draw conclusion
More complicated scenarios (e.g., compare 3 groups), new statistics (e.g., relative risk)
Slide37
Assessment suggestions
Multiple-choice example (not simulation-based)
Suppose one study finds that 30% of women sampled dream in color, compared to 20% of men. Study A sampled 100 people of each sex, whereas Study B sampled 40 people of each sex. Which study would provide stronger evidence that there is a genuine difference between men and women on this issue?
Study A
Study B
The strength of evidence would be the same for these two studies
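For instructors checking the intended reasoning, the effect of sample size can be sketched with a pooled two-proportion z-statistic. This is one possible normal-based check, not a method prescribed by the question, and it assumes both studies observe the same 30% and 20% sample proportions. The function name is hypothetical.

```python
from math import sqrt

def two_proportion_z(p1, p2, n1, n2):
    """Pooled two-proportion z-statistic for sample proportions p1 and p2."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error

z_study_a = two_proportion_z(0.30, 0.20, 100, 100)  # 100 people of each sex
z_study_b = two_proportion_z(0.30, 0.20, 40, 40)    # 40 people of each sex
print(f"Study A: z = {z_study_a:.2f}")
print(f"Study B: z = {z_study_b:.2f}")
```

The same difference in proportions sits further out in the tail when the samples are larger, because the standard error shrinks as the sample sizes grow.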
Slide38
Assessment suggestions
Free response example (simulation-based)
In a recent study, researchers presented young children (aged 5 to 8 years) with a choice between two toy characters who were offering stickers. One character was described as mean, and the other was described as nice. The mean character offered two stickers, and the nice character offered one sticker. Researchers wanted to investigate whether children would tend to select the nice character over the mean character, despite receiving fewer stickers. They found that 16 of the 20 children in the study selected the nice character.
Slide39
Assessment suggestions
Free response example (simulation-based)
Describe (in words) the null model/hypothesis in this study.
Suppose that you were to conduct a simulation analysis of this study to investigate whether the observed result provides strong evidence that children genuinely prefer the nice toy with one sticker over the mean toy with two stickers. Indicate what you would enter for the following three inputs:
Probability of heads: _____
Number of tosses: _____
Number of repetitions: _____
Slide40
Assessment suggestions
Free response example (simulation-based)
One of the following graphs was produced from a correct simulation analysis. The other two were produced from incorrect simulation analyses. Circle the correct one.
Which of the following is closest to the p-value for this study?
5.0, .50, .05, .005
Slide41
Assessment suggestions
Free response example (simulation-based)
Write an interpretation of this p-value in the context of this study (probability of what, assuming what?).
Summarize your conclusion from this research study and simulation analysis.
Slide42
Resources
Slide43
Resources
Slide44
Resources
Simulation-based inference blog:
www.causeweb.org/sbi/
ISI applets:
www.rossmanchance.com/ISIapplets.html
StatKey app:
lock5stat.com/statkey
Slide45
Thanks!
Want to learn more?
Workshop (with lunch) on Saturday afternoon
in Lincoln room, thanks to John Wiley and Sons
http://www.math.hope.edu/isi/
arossman@calpoly.edu