Slide1
Statistics made simple
Modified from Dr. Tammy Frank’s presentation, NOVA
Slide2
Why do we need statistics?
Example:
Chemical may increase growth of animal
Will be tested on housefly
A colony of 20,000 houseflies is divided into 2 groups
Group 1 gets chemical in food
Group 2 gets a placebo in same food
What comes next?
Slide3
2 weeks later – take random sample of 25 houseflies from each group, measure wingspan
What are the results?
Slide4
Housefly results
25 houseflies from each group
Group 1 (with chemical) – 7.5 mm average wingspan
Group 2 (without) – 7.2 mm average wingspan
What does this mean?
Are group 1 flies really bigger?
Some might say yes, some might say no
Did you, by chance, happen to pick some larger flies from group 1?
Was there sampling error or bias?
Slide5
One way to be sure is to measure all 20,000 flies... not feasible
So what do we do?
Slide6
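To see why two samples of 25 can differ even when nothing is going on, here is a minimal simulation sketch. The population mean (7.2 mm) and spread (0.5 mm) are assumptions made up for illustration, not values from the presentation, and `sample_mean_gap` is a hypothetical helper name.

```python
import random
import statistics


def sample_mean_gap(population, sample_size, rng):
    """Draw two random samples from the SAME population and return the gap between their means."""
    first = rng.sample(population, sample_size)
    second = rng.sample(population, sample_size)
    return statistics.mean(first) - statistics.mean(second)


# Assumed population: 20,000 wingspans (mm) with no chemical effect at all.
rng = random.Random(42)
population = [rng.gauss(7.2, 0.5) for _ in range(20_000)]

# Repeat the "pick 25 flies twice" experiment 1,000 times.
gaps = [sample_mean_gap(population, 25, rng) for _ in range(1_000)]
largest_gap = max(abs(gap) for gap in gaps)
print(f"largest gap between two samples of 25: {largest_gap:.2f} mm")
```

Every gap printed here comes from sampling alone, which is the kind of difference a statistical test has to rule out.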
Statistics
You say the flies are bigger, I say not
Statistics provide rules to help us find out
Statistics will help tell us if these are significant (real) differences
Is there bias? Were bigger ones in group 1 picked by chance?
Statistics will tell us what the chances are that the results are due to sampling bias or random chance
Slide7
Significant Difference
Real difference
Due to chemical, not chance
If test shows probability of getting results by chance or random error is <5%, we accept claim that chemical produced larger fly
If test shows that the probability of getting results by chance or random error is >5%, we reject claim that chemical produced larger fly
Slide8
5% is an arbitrary cut-off point that is generally accepted
However, if the cost of making an incorrect decision is very high, a stricter cut-off like 1% is used, such as in research with cancer drugs
Probability value is the p-value
Measure of the probability that a pattern like the one we see in our data would arise from sampling error or random chance alone
Slide9
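One way to make the p-value concrete is a permutation test, sketched below. This is a swapped-in illustration, not a method named in the presentation: it shuffles the group labels many times and counts how often chance alone produces a gap at least as large as the observed one. The wingspan numbers are invented so that the group means match the 7.5 mm and 7.2 mm of the example, and `permutation_p_value` is a hypothetical helper.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, n_shuffles=5_000, seed=0):
    """Estimate a two-sided p-value for a difference in means by shuffling group labels."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        gap = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if gap >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_shuffles + 1)


# Invented wingspans (mm): group means are 7.5 and 7.2.
treated = [7.6, 7.4, 7.7, 7.5, 7.3, 7.6, 7.5, 7.4]
placebo = [7.1, 7.3, 7.2, 7.0, 7.4, 7.2, 7.1, 7.3]

p = permutation_p_value(treated, placebo)
print(f"estimated p-value: {p:.4f}")
```

A small result means label shuffling rarely reproduces the observed gap, so chance is an unlikely explanation.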
Scientific Method
Remember that we cannot “prove” anything. We can only accept or reject a hypothesis
A theory is the closest that a biologist can come to “proving” a hypothesis
Supported and validated by data and scientific community
Slide10
Null and Alternative Hypotheses
For any experiment/survey/study, there must be a null hypothesis and an alternative hypothesis
Set up so that one of them must be true, and one must be false
Null hypothesis (H0): = or ≤ or ≥
Example:
The average weight of hermit crab group A is the same as that of hermit crab group B (=)
OR
The average weight of hermit crab group A is the same or greater than that of hermit crab group B (≥)
OR
The average weight of hermit crab group A is the same or less than that of hermit crab group B (≤)
Slide11
If null is true, then alternative must be false
H0: average weight of hermit crab group A = average weight of group B
HA: average weight of hermit crab group A ≠ average weight of group B
Slide12
Two-tailed hypotheses
Use if you have no expectations
You are trying to find out if weights are different but have no reason to expect them to be
H0: average weight of hermit crab group A = average weight of group B
HA: average weight of hermit crab group A ≠ average weight of group B
Slide13
One-tailed hypothesis
Use if you have an expectation of the outcome, based on previous studies or information
For example, previous studies have demonstrated that the Group A area has more hermit crab food than the Group B area
H0: average weight of hermit crab group A ≤ average weight of group B
HA: average weight of hermit crab group A > average weight of group B
Alternative hypothesis corresponds to what you expect
Slide14
Always reject or accept the null hypothesis, never reject the alternative
If you accept or support the null, then don’t mention the alternative
If you reject the null, then accept or support the alternative
We never prove a hypothesis
We just gain a measure of how confident we are with our hypothesis
Slide15
p-value
The measure of the probability that a pattern like the one we see in our data would arise from random chance or sampling error alone
0.05 is the value most commonly used
If p-value is >0.05 (high p-value), accept null
Weight is not significantly different
If p-value is ≤0.05 (low p-value), reject null and accept alternative
Weight is significantly different
Slide16
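The accept/reject rule for the p-value can be written as a tiny function. This is only a sketch of the rule as stated on the slide; the function name `decide` and its wording of the outcomes are assumptions for illustration.

```python
def decide(p_value, alpha=0.05):
    """Apply the slide's rule: reject the null when p <= alpha, otherwise accept it."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value <= alpha:
        return "reject null, accept alternative (significant difference)"
    return "accept null (no significant difference)"


print(decide(0.03))              # below the usual 5% cut-off
print(decide(0.20))              # above the usual 5% cut-off
print(decide(0.03, alpha=0.01))  # same p-value, stricter 1% cut-off
```

The last call shows that the same p-value can lead to a different decision when a stricter cut-off is chosen.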
Important terms:
x = measurement value
∑ = sum of
n = sample size
df = degrees of freedom = n – 1
X̄ = mean or average = ∑x / n
s² = variance = sum of squares divided by degrees of freedom = ∑(x – X̄)² / df
Tells you how much your values varied from the mean
s = √s² = standard deviation = roughly the average distance from the mean
Large variance means there is a large spread in the data, small variance means data points are closer to the mean
Slide17
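The terms for mean, degrees of freedom, variance, and standard deviation can be computed step by step as follows. This is a minimal sketch; the five wingspan values are invented for illustration and `describe` is a hypothetical helper name.

```python
import math


def describe(values):
    """Return n, df, mean, variance, and standard deviation using df = n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two measurements")
    df = n - 1
    mean = sum(values) / n
    sum_of_squares = sum((x - mean) ** 2 for x in values)
    variance = sum_of_squares / df
    return {
        "n": n,
        "df": df,
        "mean": mean,
        "variance": variance,
        "std_dev": math.sqrt(variance),
    }


# Invented wingspans in mm.
wingspans = [7.6, 7.4, 7.7, 7.5, 7.3]
summary = describe(wingspans)
print(summary)
```

For these values the mean is 7.5 mm and the sum of squares is 0.10, so the variance is 0.10 / 4 = 0.025.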
What test do you use to get p?
Depends on what type of data you are collecting
Measurement variable or nominal variable?
Slide18
Measurement variables
Something that can be counted or measured
Involves numbers
Examples: length, weight, quantity
What are examples of tests that can be used?
Slide19
t-test
Used to determine if two sets of data have the same mean
Paired t-test – when measurements are linked
Patient before and after using drug
The null would state there is no difference
Unpaired t-test – when measurements come from 2 different, independent groups
Patients with drug (group 1) and patients without drug (group 2)
Slide20
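The paired and unpaired t-tests differ in what goes into the t statistic, which the sketch below computes by hand. Assumptions: all data are invented, the unpaired version is the classic equal-variance (pooled) form, and the function names are hypothetical. Turning t into a p-value needs a t-table or a statistics library (for example SciPy's `ttest_rel` and `ttest_ind`), which is not done here.

```python
import math
import statistics


def paired_t(before, after):
    """t statistic and df for linked measurements (same subjects measured twice)."""
    if len(before) != len(after):
        raise ValueError("paired data need equal-length lists")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1


def unpaired_t(group_1, group_2):
    """t statistic and df for two independent groups, assuming equal variances."""
    n1, n2 = len(group_1), len(group_2)
    pooled_var = (
        (n1 - 1) * statistics.variance(group_1)
        + (n2 - 1) * statistics.variance(group_2)
    ) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(group_1) - statistics.mean(group_2)) / standard_error
    return t, n1 + n2 - 2


# Invented readings for 5 patients before and after a drug.
before = [140, 152, 138, 147, 150]
after = [135, 147, 136, 141, 146]
t_paired, df_paired = paired_t(before, after)

# Invented measurements for two separate groups.
with_drug = [7.6, 7.4, 7.7, 7.5, 7.3, 7.6, 7.5, 7.4]
without_drug = [7.1, 7.3, 7.2, 7.0, 7.4, 7.2, 7.1, 7.3]
t_unpaired, df_unpaired = unpaired_t(with_drug, without_drug)

print(f"paired: t = {t_paired:.2f}, df = {df_paired}")
print(f"unpaired: t = {t_unpaired:.2f}, df = {df_unpaired}")
```

The paired version works on the per-patient differences, so it has n – 1 degrees of freedom, while the unpaired version pools both groups and has n1 + n2 – 2.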
What do you do when there are more than 2 sets of data?
ANOVA – analysis of variance
Null would state that the means are equal
Example would be if you had 5 groups of patients taking drugs at different dosages per group
Single factor ANOVA
Only vary one parameter – drug dosage
Two factor ANOVA with or without replication
Vary dosage and time of day
Slide21
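For single factor ANOVA, the test statistic F compares the spread between group means with the spread inside the groups. The sketch below computes F by hand under stated assumptions: the dosage data are invented, only three groups are used to keep the arithmetic short, and `one_way_anova_f` is a hypothetical helper. Converting F to a p-value requires an F-table or a statistics library and is left out.

```python
import statistics


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a single factor ANOVA."""
    k = len(groups)
    n_total = sum(len(group) for group in groups)
    grand_mean = sum(sum(group) for group in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2 for group in groups
    )
    # Variation of each value around its own group mean.
    ss_within = sum(
        (x - statistics.mean(group)) ** 2 for group in groups for x in group
    )
    df_between = k - 1
    df_within = n_total - k
    if ss_within == 0:
        raise ValueError("no variation within groups, F is undefined")
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Invented responses for three dosage groups.
low = [4, 5, 6]
medium = [6, 7, 8]
high = [9, 10, 11]

f_value, df_between, df_within = one_way_anova_f([low, medium, high])
print(f"F = {f_value:.2f} with df = ({df_between}, {df_within})")
```

Here the between-group sum of squares is 38 and the within-group sum of squares is 6, giving F = (38 / 2) / (6 / 6) = 19.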
Nominal variables
Usually involves categories
A nominal variable is often a word or percentage
Examples: color, sex, genotypes
What are examples of tests that can be used?
Slide22
Goodness of fit test
Chi-square
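A chi-square goodness of fit test compares observed counts in each category with the counts expected under the null hypothesis. The sketch below is an illustration with invented counts: 100 offspring tested against an assumed 3:1 ratio. The helper name is hypothetical, and 3.841 is the standard chi-square critical value for 1 degree of freedom at the 0.05 level.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need the same number of categories")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Invented counts: 84 dominant and 16 recessive out of 100 offspring.
observed = [84, 16]
# Expected counts under the null hypothesis of a 3:1 ratio.
expected = [75, 25]

chi_square = chi_square_statistic(observed, expected)
df = len(observed) - 1

CRITICAL_VALUE_DF1_ALPHA_05 = 3.841
reject_null = chi_square > CRITICAL_VALUE_DF1_ALPHA_05
print(f"chi-square = {chi_square:.2f}, df = {df}, reject null: {reject_null}")
```

With these counts the statistic is 81/75 + 81/25 = 4.32, which is above 3.841, so the null hypothesis of a 3:1 ratio would be rejected at the 5% level.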