
Chapter 2 - PowerPoint Presentation

faustina-dinatale
Uploaded On 2016-07-10

Presentation Transcript

Slide1

Chapter 2

Collecting Data Sensibly

Slide2

2.1 Statistical Studies: Observation and Experimentation

Whether or not a conclusion is reasonable depends on how the data were collected.

Sometimes we are interested in answering questions about the characteristics of a single population or in comparing two or more well-defined populations.

Sometimes we are trying to answer questions dealing with the effect of a certain explanatory variable on some response.

Slide3

In the former situation, an observational study is conducted.

Investigator observes characteristics of a subset of the members of one or more existing populations.

Goal is usually to draw conclusions about the corresponding population or about differences between two or more populations.

In the latter, an experiment is conducted.

Investigator observes how a response variable behaves when one or more explanatory variables, sometimes called factors, are manipulated.

Goal is to determine the effect of the manipulated factors on the response variable.

Composition of the groups that will be exposed to different experimental conditions is determined by random assignment.

Slide4

Important difference between an observational study and an experiment:

A well-designed experiment can result in data that provide evidence for a cause-and-effect relationship.

An observational study cannot, because it is possible that the observed effect is due to some variable other than the factor being studied.

Such variables are called confounding variables – variables that are related to both group membership and the response variable of interest in a research study.

Slide5

Example of confounding variables in a study

A July 1, 2003 article from the San Luis Obispo Tribune summarized the conclusions of a government advisory panel that investigated the benefits of vitamin use.

Panel looked at a large number of studies and concluded that the results were “inadequate or conflicted.”

Major concern: many studies were observational in nature, and the panel worried that people who take vitamins might be healthier just because they take better care of themselves in general.

Confounding variable was lifestyle.

Slide6
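The vitamin example can be made concrete with a small simulation. This is only a sketch: the function names, probabilities, and population size are invented for illustration and are not taken from the panel's report. In the simulated population vitamins have no effect on health at all, yet vitamin takers still look healthier because lifestyle is related to both vitamin use and health.

```python
import random

def simulate_population(n, seed=0):
    """Simulate n people as (healthy_lifestyle, takes_vitamins, good_health).

    All probabilities are made-up values chosen for illustration.
    Vitamin use has NO direct effect on health in this model.
    """
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        healthy_lifestyle = rng.random() < 0.5
        # Lifestyle is related to group membership (vitamin use) ...
        takes_vitamins = rng.random() < (0.7 if healthy_lifestyle else 0.2)
        # ... and to the response (good health); vitamins play no role here.
        good_health = rng.random() < (0.8 if healthy_lifestyle else 0.4)
        people.append((healthy_lifestyle, takes_vitamins, good_health))
    return people

def health_rate(people, takes_vitamins, lifestyle=None):
    """Proportion in good health among people matching the given filters."""
    group = [p for p in people
             if p[1] == takes_vitamins and (lifestyle is None or p[0] == lifestyle)]
    return sum(p[2] for p in group) / len(group)

people = simulate_population(100_000)
print("vitamin takers:", round(health_rate(people, True), 3))
print("non-takers:    ", round(health_rate(people, False), 3))
# Comparing within one lifestyle group removes most of the apparent difference.
print("takers, healthy lifestyle:    ", round(health_rate(people, True, True), 3))
print("non-takers, healthy lifestyle:", round(health_rate(people, False, True), 3))
```

The overall comparison suggests a vitamin benefit, while the within-lifestyle comparison does not. An observational study that never recorded lifestyle could not tell these two stories apart.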

Two different types of conclusions have been described:

One type involves generalizing from what we have seen in a sample to some larger population.

The other involves reaching a cause-and-effect conclusion about the effect of an explanatory variable on a response.

It is important to think carefully about the objectives of a statistical study before planning how the data will be collected.

Both observational studies and experiments must be carefully designed if the resulting data are to be useful.

Slide7

Table 2.1 Drawing conclusions from statistical studies

Study description | Reasonable to generalize about group characteristics to the population? | Reasonable to draw cause-and-effect conclusion?

Observational study with sample selected at random from a population of interest | Yes | No

Observational study based on convenience or voluntary response sample (poor sampling design) | No | No

Experiment with groups formed by random assignment of individuals or objects to experimental conditions:

  Individuals used in study are volunteers or not randomly selected from some population of interest | No | Yes

  Individuals or objects used in study are randomly selected from some population of interest | Yes | Yes

Experiment with groups not formed by random assignment to experimental conditions (poorly designed experiment) | No | No

Slide8

2.2 Sampling

If one is to generalize about a population from a sample, the sample must be representative of the population.

If a sample is chosen haphazardly or on the basis of convenience alone, it is impossible to interpret the resulting data with confidence.

There is no way to tell just by looking at a sample whether it is representative of the population from which it was drawn.

A census – obtaining information from an entire population – is often not feasible, so samples are selected instead.

Process may be destructive

Limited resources: not enough time and money

Slide9

Bias in Sampling

Bias – the tendency for samples to differ from the corresponding population in some systematic way.

Selection bias – bias resulting from the systematic exclusion of some part of the population.

Example: Taking a sample of opinion in a community by selecting participants from phone numbers in the local phone book would systematically exclude people who choose to have unlisted numbers, people who do not have phones, and people who have moved into the community since the telephone directory was published.

Slide10

Measurement bias or response bias – bias that results when the method of observation tends to produce values that differ from the true value.

Examples:

Taking a sample of weights of a type of apple when the scale consistently gives a weight that is 0.2 ounces high.

When questions on a survey are worded in a way that tends to influence the response.

A Gallup survey sponsored by the American Paper Institute (Wall Street Journal, May 17, 1994) included the following question: “It is estimated that disposable diapers account for less than 2 percent of the trash in today’s landfills. In contrast, beverage containers, third-class mail and yard waste are estimated to account for 21 percent of trash in landfills. Given this, in your opinion, would it be fair to tax or ban disposable diapers?”

Slide11

Nonresponse bias – bias that results when data are not obtained from all individuals selected for inclusion in the sample.

This form of bias is lowest when the response rate is high.

Nonresponse rates are highest for mail, telephone, and internet surveys, but these are the cheapest to conduct.

Response rates are best for personal interviews, but these are expensive to conduct.

Slide12

Important note on bias

Bias is introduced by the way in which a sample is selected or by the way in which the data are collected from the sample, so increasing the size of the sample does nothing to reduce the bias.

Slide13
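A short simulation can illustrate why a larger sample does not remove bias. The sketch below reuses the apple example with a scale that reads 0.2 ounces high; the true mean weight, the spread of weights, and the function name are assumptions made up for illustration.

```python
import random
import statistics

TRUE_MEAN = 6.0    # assumed true mean apple weight in ounces (illustrative)
SCALE_BIAS = 0.2   # the scale reads 0.2 ounces high, as in the apple example

def mean_measured_weight(n, rng):
    """Weigh n randomly chosen apples on the biased scale; return the sample mean."""
    weights = [rng.gauss(TRUE_MEAN, 0.5) + SCALE_BIAS for _ in range(n)]
    return statistics.fmean(weights)

rng = random.Random(1)
for n in (10, 100, 10_000):
    estimate = mean_measured_weight(n, rng)
    print(f"n = {n:>6}: sample mean = {estimate:.3f}, error = {estimate - TRUE_MEAN:+.3f}")
```

As n grows, the sample mean settles down, but it settles near the true mean plus the bias rather than near the true mean itself.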

Random Sampling

Random sampling helps reduce bias from samples.

Most inferential methods introduced in this text are based on the idea of random selection.

Simple Random Sample of size n – a sample that is selected in a way that ensures that every different possible sample of the desired size has the same chance of being selected.

A common method of selecting a random sample is to first create a list, called a sampling frame, of the individuals in the population. Each item on the list can then be identified by a number, and a table of random digits or a random number generator can be used to select the sample.

Slide14
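The sampling-frame procedure can be sketched in a few lines. This is a minimal illustration, not a prescribed method: the frame of 500 student IDs and the function name `simple_random_sample` are hypothetical, and Python's standard `random` module stands in for the random number generator.

```python
import random

def simple_random_sample(sampling_frame, n, seed=None):
    """Select n distinct individuals so that every possible sample of size n is equally likely."""
    rng = random.Random(seed)
    # Number the frame 0..N-1 and let the generator pick n distinct numbers.
    chosen_numbers = rng.sample(range(len(sampling_frame)), n)
    return [sampling_frame[i] for i in chosen_numbers]

# Hypothetical sampling frame: a list of 500 student IDs.
frame = [f"student_{i:03d}" for i in range(500)]
sample = simple_random_sample(frame, n=10, seed=42)
print(sample)
```

Fixing `seed` makes the selection reproducible, which is convenient for checking work; leaving it as `None` gives a fresh sample each run.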

Sampling with replacement – means that after each successive item is selected for the sample, the item is “replaced” back into the population and may therefore be selected again.

Example: Choose a sample of 5 digits by spinning a spinner and choosing the number where the pointer is directed.

Slide15

Sampling without replacement – after an item is selected for the sample it is removed from the population and therefore cannot be selected again.

Example: A hand of “five card stud” poker is dealt from an ordinary deck of playing cards. Typically, once a card is dealt it is not possible for that card to appear again until the deck is reshuffled and dealt again.

Slide16
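The two examples map onto two different calls in Python's standard `random` module, shown here as a rough sketch. The spinner is assumed to be labeled with the digits 0 through 9, which the slides do not specify.

```python
import random

rng = random.Random(7)

# With replacement: five spins of a spinner labeled 0-9 (assumed labels).
# Each spin is independent, so a digit may come up more than once.
digits = list(range(10))
spins = rng.choices(digits, k=5)
print("spins:", spins)

# Without replacement: a five-card hand from a 52-card deck.
# Once dealt, a card cannot appear again in the same hand.
ranks = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = [f"{rank} of {suit}" for suit in suits for rank in ranks]
hand = rng.sample(deck, k=5)
print("hand:", hand)
```

`random.choices` may return repeated items, while `random.sample` never does.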
Slide17

A Note on Sample Size

Common misconception: if the sample size is relatively small compared to the population size, the sample can’t possibly accurately reflect the population.

The random selection process allows us to be confident that the resulting sample adequately reflects the population, even when the sample consists of only a small fraction of the population (see Figure 2.1 for illustration of this idea).

Slide18
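The idea can also be checked numerically. The sketch below builds a hypothetical, deliberately skewed population of 200,000 values (the distribution and its parameters are invented for illustration) and compares it with a simple random sample that is only 0.5% of its size.

```python
import random
import statistics

rng = random.Random(3)

# Hypothetical population of 200,000 values (say, ages), skewed on purpose.
population = [18 + rng.expovariate(1 / 15) for _ in range(200_000)]

# A simple random sample of 1,000 is only 0.5% of the population.
sample = rng.sample(population, 1_000)

print(f"population mean:   {statistics.fmean(population):.2f}")
print(f"sample mean:       {statistics.fmean(sample):.2f}")
print(f"population median: {statistics.median(population):.2f}")
print(f"sample median:     {statistics.median(sample):.2f}")
```

The sample summaries land close to the population summaries even though 99.5% of the population was never examined. What matters is that the selection was random and the sample itself is reasonably large, not the fraction of the population it represents.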
Slide19

Other Sampling Methods

In some situations, alternative sampling methods may be less costly, easier to implement, or more accurate.

Stratified Random Sampling – separate random samples are taken from each of a set of non-overlapping subpopulations, called strata (singular: stratum).

Example: Estimating malpractice insurance cost among subgroups of doctors.

Provides information about subgroups as well as the overall population.

Often allows more accurate inferences about a population than does a simple random sample.

Slide20
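A stratified random sample amounts to one simple random sample per stratum. The sketch below assumes doctors are grouped by specialty; the specialty names, stratum sizes, and the function `stratified_sample` are hypothetical.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Take a separate simple random sample from each non-overlapping stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical strata: doctors grouped by specialty (names and sizes are made up).
strata = {
    "surgery":    [f"surgeon_{i}" for i in range(300)],
    "pediatrics": [f"pediatrician_{i}" for i in range(500)],
    "radiology":  [f"radiologist_{i}" for i in range(200)],
}
sample = stratified_sample(strata, n_per_stratum=20, seed=11)
for specialty, doctors in sample.items():
    print(specialty, len(doctors), doctors[:3])
```

Taking the same number from each stratum is only one possible choice; sampling each stratum in proportion to its size is another common one.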

Cluster sampling – involves dividing a population of interest into nonoverlapping subgroups, called clusters, selecting clusters at random, and including all individuals in the selected clusters in the sample.

In this case, it is ideal for each cluster to mirror the characteristics of the population.

Note: The ideal situation occurs when it is reasonable to assume that each cluster reflects the general population. If that is not the case or when clusters are small, a large number of clusters must be selected to get a sample that reflects the population.

Second note: Do not confuse stratified and cluster sampling. Strata should be homogeneous (similar). Clusters should be heterogeneous (reflecting variability in the population).

Slide21
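In code, the contrast with stratified sampling is that the random selection happens at the level of whole clusters. The following sketch uses hypothetical homerooms as clusters; the names, sizes, and the function `cluster_sample` are assumptions for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Choose whole clusters at random and include every individual in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    individuals = [person for name in chosen for person in clusters[name]]
    return chosen, individuals

# Hypothetical clusters: 40 homerooms of 25 students each.
clusters = {
    f"homeroom_{c:02d}": [f"student_{c:02d}_{s:02d}" for s in range(25)]
    for c in range(40)
}
chosen, individuals = cluster_sample(clusters, n_clusters=4, seed=5)
print("clusters chosen:", chosen)
print("sample size:", len(individuals))
```

No individual is selected on their own: a student is in the sample exactly when their homeroom was drawn.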

Systematic sampling is a procedure that can be employed when it is possible to view the population of interest as consisting of a list or some other sequential arrangement. A value k is specified (a number such as 25, 100, 2500…). Then one of the first k individuals is selected at random, and every kth individual in the sequence after that is selected to be included in the sample. This is called 1 in k systematic sampling.

Example: In a large university, a professor wanting to select a sample of students to determine the students’ ages might take the student directory (an alphabetical list), randomly choose one of the first 100 students, and then take every 100th student from that point on.

Works as long as there are no repeating patterns in the population.

Slide22
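The 1 in k procedure is short enough to write out directly. In this sketch the directory of 12,000 students and the function name `systematic_sample` are hypothetical; only the random starting point involves chance.

```python
import random

def systematic_sample(ordered_list, k, seed=None):
    """1 in k systematic sample: random start among the first k, then every kth."""
    rng = random.Random(seed)
    start = rng.randrange(k)       # index 0..k-1, i.e. one of the first k individuals
    return ordered_list[start::k]  # that individual and every kth one after it

# Hypothetical alphabetical student directory with 12,000 entries.
directory = [f"student_{i:05d}" for i in range(12_000)]
sample = systematic_sample(directory, k=100, seed=9)
print(len(sample), sample[:3])
```

If the list had a pattern that repeats every k entries, every selected individual would fall at the same point in the pattern, which is why the method depends on the absence of such patterns.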

Convenience sampling is using an easily available or convenient group to form a sample.

Example: A “voluntary response sample” is often taken by television news programs. Viewers are encouraged to go to a website and “vote” yes or no on some issue. The commentator then would announce the results of the survey.

A recipe for disaster!

Results are rarely informative about the true nature of the population; wouldn’t want to generalize about the population.

Slide23

2.3 Simple Comparative Experiments

Sometimes the question we are trying to answer deals with the effect of a certain explanatory variable on some response.

“What happens when…?”

“What is the effect of…?”

To address these types of questions, the researcher conducts an experiment.

Slide24

Experiment – a planned intervention undertaken to observe the effects of one or more explanatory variables, called factors, on a response variable.

Purpose is to increase understanding of the nature of the relationships between the explanatory and response variables.

Any particular combination of values for the explanatory variables is called an experimental condition or treatment.

The design of an experiment is the overall plan for conducting an experiment.

Slide25

A good experiment requires more than just manipulating the explanatory variable. The design must also eliminate rival explanations or the experimental results will not be conclusive.

Example: Testing the effects of room temperature on student performance on a semester physics exam.

Four sections, two assigned to 65 deg F and two to 75 deg F.

If the 65 deg F group had a higher average, would these results be conclusive? Why or why not?

Slide26

Extraneous factor – one that is not of interest in the current study but is thought to affect the response variable; also called a lurking variable.

A well-designed experiment copes with the potential effects of extraneous factors by:

Random assignment to experimental conditions

Direct control

Blocking

Slide27

Direct control – when an experimenter holds extraneous factors constant so that their effects are not confounded with those of the experimental conditions.

Revisiting the physics exam example:

Requiring the use of the same physics textbook for all sections.

All sections meet at same time of day.

Slide28

Blocking – using extraneous factors to create groups (blocks) that are similar. All experimental conditions are then tried in each block.

Extraneous factors addressed through blocking are called blocking factors.

In the physics example, if the four sections are taught by two different instructors, we might block by instructor.

Slide29
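Blocking by instructor can be sketched as a random assignment carried out separately inside each block. The sketch assumes each instructor teaches exactly two of the four sections, which the slides do not state; the section labels and the function `randomized_block_assignment` are hypothetical.

```python
import random

def randomized_block_assignment(blocks, treatments, seed=None):
    """Within each block, randomly assign each treatment to one unit."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, units in blocks.items():
        shuffled = list(treatments)
        rng.shuffle(shuffled)  # fresh random order of treatments for this block
        for unit, treatment in zip(units, shuffled):
            assignment[unit] = (block_name, treatment)
    return assignment

# Assumed layout: each instructor teaches two of the four sections.
blocks = {
    "Instructor A": ["Section 1", "Section 2"],
    "Instructor B": ["Section 3", "Section 4"],
}
treatments = ["65 deg F", "75 deg F"]
assignment = randomized_block_assignment(blocks, treatments, seed=4)
for section, (instructor, temperature) in assignment.items():
    print(section, instructor, temperature)
```

Because both temperatures appear under each instructor, a difference between the temperature groups cannot be explained by one instructor teaching only the cooler rooms.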

We can control the effects of extraneous factors through direct control or blocking as described above, but some factors cannot be controlled or blocked.

Example: Student ability in the physics test example

We can handle such extraneous factors through random assignment to experimental groups, a process called randomization.

Ensures that the experiment does not systematically favor one experimental condition over another and attempts to create experimental groups that are as much alike as possible.

Ideal situation: Ability to both randomly select subjects and randomly assign them to experimental conditions.

Would allow for conclusions to be made about the larger population.

The former is not always possible, but we can still make conclusions about the treatment.

Slide30

An investigation to test if an online review of course material before an exam would improve exam performance.

Subjects selected might have different ability, which is reflected in their SAT math and verbal scores.

Slide31

If we are going to assign these students to two groups, one receiving the review and one not, we should make sure that assignment does not favor one group over another.

The figure is supposed to show, by use of color, that the subjects were randomly assigned. The orange and blue dots in the original figure were randomly dispersed within any given row of dots.

Slide32
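Random assignment to the review and no-review groups can be sketched as a shuffle followed by a split. The 50 subjects and their SAT math scores below are simulated stand-ins, not data from the study, and the function `randomly_assign` is a hypothetical helper.

```python
import random
import statistics

def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into two groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical subjects: (ID, SAT math score); the scores are simulated.
score_rng = random.Random(0)
subjects = [(f"subject_{i:02d}", round(score_rng.gauss(550, 80))) for i in range(50)]

review_group, no_review_group = randomly_assign(subjects, seed=21)
print("review mean SAT math:   ", statistics.fmean(s for _, s in review_group))
print("no-review mean SAT math:", statistics.fmean(s for _, s in no_review_group))
```

Comparing the group means of an extraneous variable such as SAT score is a quick check that the assignment did not happen to favor one group.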

As long as the number of subjects is not too small, we can rely on random assignment to produce comparable experimental groups, eliminating the problem of extraneous variables.

This is the reason that randomization is part of all well-designed experiments.

The gas additive/mileage example.

Test the effect of three different fuel additives on fuel efficiency.

Use same car, 10 trials for each additive.

When an experiment can be viewed as a sequence of trials, randomization involves the random assignment of treatments to trials.

Slide33
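For the additive example, assigning treatments to trials at random means shuffling the order of the 30 runs. A minimal sketch follows; the additive labels and the function `randomized_trial_order` are hypothetical.

```python
import random

def randomized_trial_order(treatments, trials_per_treatment, seed=None):
    """Return a random run order containing each treatment the required number of times."""
    rng = random.Random(seed)
    order = [t for t in treatments for _ in range(trials_per_treatment)]
    rng.shuffle(order)
    return order

additives = ["Additive A", "Additive B", "Additive C"]  # hypothetical labels
order = randomized_trial_order(additives, trials_per_treatment=10, seed=13)
for trial_number, additive in enumerate(order, start=1):
    print(trial_number, additive)
```

Running all ten trials of one additive first would tie that additive to whatever changes over time, such as engine wear or weather; a shuffled order spreads those influences across all three additives.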

Replication – a design strategy of making enough observations for each experimental condition that each group reliably reflects the variability of the population.

Example 2.3 Subliminal Messages

Language test, one group with words related to politeness, the other related to rudeness.

After the test, 63% of the group given words related to rudeness interrupted a conversation; in the other group, only 17% interrupted.

Slide34

Many experiments compare a group that receives a particular treatment to a control group that receives no treatment.

Allows the experimenter to assess how the response variable behaves when the treatment is not used.

Example 2.4 Chilling Newborns? Then You Need a Control Group.

Infants were randomly assigned to usual care (control group) or whole-body cooling.

Results indicated that cooling reduced the risk of death and disability for infants deprived of oxygen at birth.

Slide35

Before proceeding with an experiment you must be able to answer these questions.

1. What is the research question that data from the experiment will be used to answer?

2. What is the response variable?

3. How will the values of the response variable be determined?

4. What are the factors (explanatory variables) for the experiment?

5. For each factor, how many different values are there, and what are these values?

6. What are the treatments for the experiment?

7. What extraneous variables might influence the response?

8. How does the design incorporate random assignment of subjects to treatments (or treatments to subjects) or random assignment of treatments to trials?

9. For each extraneous variable listed in Question 7, how does the design protect against its potential influence on the response through blocking, direct control, or randomization?

10. Will you be able to answer the research question using the data collected in the experiment?

Slide36

2.4 More on Experimental Design

Goal of experimental design is to provide a method of data collection that:

Minimizes extraneous sources of variability in the response so that any differences in response for various experimental conditions can be more easily assessed.

Creates experimental groups that are similar with respect to extraneous variables that cannot be controlled either directly or through blocking.

Notes on control groups:

When comparing a new treatment to an old one, the old treatment is considered the control.

Not all experiments use a control.

Example: how oven temperature affects overall cooking time.

Slide37

Dealing with human subjects

Placebo – something that is identical (in appearance, taste, feel, etc.) to the treatment received by the treatment group, except that it contains no active ingredients.

Used because people sometimes respond to the power of suggestion.

Single-blind experiment – experiment in which subjects do not know what treatment they have received.

Double-blind experiment – experiment in which neither the subjects nor the individuals who measure the response know which treatment was received.

Slide38

Experimental units and replication

Experimental unit – the smallest unit to which a treatment is applied.

Replication – each treatment is applied to more than one experimental unit.

Necessary for randomization to be an effective way to create similar experimental groups, and to get a sense of the variability in the values of the response for individuals that receive the same treatment.

Slide39

2.5 More on Observational Studies: Designing Surveys (Optional)

Survey – a voluntary encounter between two strangers in which an interviewer seeks information from a respondent by engaging in a special type of conversation.

Designing and administering a survey is not as easy as it might seem.

Great care is required to obtain good information.

Survey researchers and psychologists agree that there is a sequence of tasks in answering a survey question:

Comprehension of the question

Retrieval from memory

Reporting a response

Slide40

Keep in mind when writing a survey:

Questions should be understandable by the individuals in the population being surveyed. Vocabulary should be at an appropriate level, and sentence structure should be simple.

Questions should, as much as possible, recognize that human memory is fickle. Questions that are specific will aid the respondent by providing better memory cues. Limitations of memory should be kept in mind when interpreting responses.

Questions should not make respondents feel embarrassed or threatened. In such cases, respondents may introduce social desirability bias. This can compromise conclusions drawn from survey data.