Introduction to Statistical Analysis

bitsy
Uploaded On 2022-06-21

Presentation Transcript

Slide1

Introduction to Statistical Analysis

Slide2

What is STATISTICS?

Statistics fulfills one of the basic human needs.

A process to:

Manage - to clean and format the data in order to obtain valid data that can feasibly be analyzed.

Analyze - to explore the data in order to answer the objective.

Interpret data - to convert the statistical interpretation into common understanding.

Slide3

Classification of Statistics

Descriptive Statistics

- describe the data by summarizing them

Inferential Statistics

- techniques by which inferences are drawn about the population parameters from the sample statistics

OR

- conclusions are made about the population using sample data

Slide4

What are Descriptive Statistics for?

In any study…

Before answering the research question, we should recognize the characteristics of the sample

- (e.g. age, gender, ethnicity, etc.)

Slide5

How to describe a categorical variable?

Statistics

- Frequency

- Relative frequency

- Cumulative relative frequency

Figure/Chart

Bar

Pie
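The three statistics listed for a categorical variable can be computed directly from the raw category labels. The following is a minimal Python sketch; the `frequency_table` helper and the ethnicity values are hypothetical and made up for illustration, not taken from the presentation's data.

```python
from collections import Counter


def frequency_table(values):
    """Return (category, frequency, relative frequency, cumulative relative frequency) rows."""
    counts = Counter(values)
    n = len(values)
    rows = []
    cumulative = 0.0
    for category, freq in sorted(counts.items()):
        relative = freq / n
        cumulative += relative
        rows.append((category, freq, relative, cumulative))
    return rows


# Hypothetical sample of 10 respondents (illustrative values only).
ethnicity = ["Malay"] * 5 + ["Chinese"] * 3 + ["Indian"] * 2

table = frequency_table(ethnicity)
for category, freq, relative, cumulative in table:
    print(f"{category:8s} {freq:3d} {relative:6.1%} {cumulative:6.1%}")
```

The frequency column is what a bar chart plots, and the relative frequency column is what a pie chart plots.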

Slide6

How to describe a numerical variable?

Statistics

Central tendency

Mean

Median (50th percentile)

Dispersion

Standard deviation

Inter-quartile range (IQR) (3rd quartile – 1st quartile)

Figure/Chart

Histogram/Frequency polygon

Box plot
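As a rough illustration of the measures named on this slide, the sketch below uses Python's standard `statistics` module. The `describe_numerical` helper and the exercise durations are hypothetical. Quartiles can be computed by several conventions, so the IQR reported by other software (for example SPSS) may differ slightly from the value obtained here.

```python
import statistics


def describe_numerical(values):
    """Summarize a numerical variable by central tendency and dispersion."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median) and Q3.
    q1, _q2, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),  # sample standard deviation
        "iqr": q3 - q1,                  # 3rd quartile minus 1st quartile
    }


# Hypothetical exercise durations in minutes/day (illustrative values only).
exercise_minutes = [10, 12, 14, 15, 16, 16, 18, 20, 22, 25]

summary = describe_numerical(exercise_minutes)
print(summary)
```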

Slide7

Inferential statistics

With inferential statistics, we take a sample (a small subset of a larger set of data).

We then use this sample to draw inferences or make generalizations about the population from which the samples were drawn.

Estimation (Confidence interval)

E.g.: estimation of mean, estimation of proportion

Hypothesis test

E.g.: comparing means, comparing proportions, association between 2 variables

Slide8

Estimation

In estimation, the sample is used to estimate a population parameter and a confidence interval about the estimate is constructed. 

Estimation (CI) of mean:

E.g.: x̄ = 16.14, 95% CI = (15.30, 16.98)

We are 95% confident that the population mean duration of exercise lies between 15.30 and 16.98 minutes/day.

Estimation (CI) of proportion:

E.g.: p = 0.37, 95% CI = (0.27, 0.47)

We are 95% confident that the prevalence of obesity in the population is between 27% and 47%.
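To make the construction of such intervals concrete, here is a minimal Python sketch using the normal approximation (estimate ± z × standard error). The helper names are hypothetical, and the sample size of 100 and the standard deviation of 4.2 are assumptions chosen for illustration; the slide does not state how its intervals were computed, so the output is not expected to match the slide's figures exactly.

```python
import math
from statistics import NormalDist


def mean_ci(mean, sd, n, level=0.95):
    """Normal-approximation confidence interval for a population mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


def proportion_ci(p, n, level=0.95):
    """Wald (normal-approximation) confidence interval for a proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Assumed inputs: n = 100 and sd = 4.2 are illustrative, not from the slide.
mean_low, mean_high = mean_ci(16.14, 4.2, 100)
prop_low, prop_high = proportion_ci(0.37, 100)
print(f"mean: ({mean_low:.2f}, {mean_high:.2f})")
print(f"proportion: ({prop_low:.3f}, {prop_high:.3f})")
```

For small samples, a t-based interval for the mean and an exact or score interval for the proportion are usually preferred over these approximations.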

Slide9

Hypothesis

Testable statement that describes relationships between variables.

Derived from research questions.

Postulating the existence of:

1. A difference between groups.

2. An association among factors.

Null hypothesis (H0): the hypothesis to be tested, of no difference.

Alternative hypothesis (Ha): the hypothesis that postulates that there is a treatment effect or a difference between groups.

The process of inferential statistics is to determine whether we have enough evidence (based on probability) to reject or fail to reject H0.

Slide10

Statistical Procedure

Sample Size

Data Collection

Data Quality

Statistical Analysis

Interpretation

Slide11

Sample Size

Slide12

What Is Sample Size?

Sample size is the number of units (persons, animals, patients, specific circumstances, etc.) in a population that need to be studied to represent the population.

Slide13

Why Do We Need To Calculate Sample Size?

Guide: When to start and stop collecting? How are we going to collect it?

Minimum required sample: depends on availability of the sample, time constraints, subject constraints and ethical issues

Study design: influences the quality and accuracy of research

Economic: resources are wasted if the study is not capable of producing useful results

Slide14

Process of Sample Size Determination

Slide15

Journal sample size for pilot study

Slide16

What Happens If Sample Size Is…

Too small

A well-conducted study may fail to answer its research question.

May fail to detect important effects

May estimate those effects imprecisely

Too large

Costly – the longer the study, the higher its cost

Difficulties faced – lack of manpower and time

Tiring – recruiting subjects or collecting outcomes over a long period may be tiring

Sample size should be adequate to achieve good precision in estimation
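The link between sample size and precision can be sketched with one commonly used formula for estimating a single proportion, n = z² p (1 − p) / d², where p is the anticipated proportion and d is the desired margin of error. This formula is an assumption brought in for illustration; the presentation does not say which formula its tools use, and the helper name below is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p, margin, level=0.95):
    """Minimum n to estimate a proportion p within +/- margin (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Anticipated prevalence 37%, desired precision +/- 5 percentage points.
n_prevalence = sample_size_for_proportion(0.37, 0.05)

# p = 0.5 is the most conservative choice when the prevalence is unknown.
n_conservative = sample_size_for_proportion(0.5, 0.05)
print(n_prevalence, n_conservative)
```

Halving the margin of error roughly quadruples the required sample, which is why a study that demands very high precision can become costly.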

Slide17

Statistical Software: Website address

Power and sample size

http://biostat.mc.vanderbilt.edu/wiki/Main/PowerSampleSize

Epi Info Software

http://wwwn.cdc.gov/epiinfo/html/downloads.htm

EpiCalc

http://www.brixtonhealth.com/epicalc.html

Sample size for Prevalence Studies.xls – Lin Naing

Sample size for Sensitivity and Specificity.xls – Lin Naing

Power Analysis and Sample Size (PASS) – most powerful, but a license must be purchased first

http://www.ncss.com/pass.html

Slide18

Data Collection

Slide19

E.g.: gender, race

Logical ordering to the categories, e.g.: education level, pain severity

E.g.: age, weight, height

Statistics:

Frequency & percentage

Relative frequency & percentage

Cumulative frequency & percentage

Figure/chart:

Bar

Pie

Statistics:

Central tendency & dispersion

Mean & SD (if normality assumed)

Median & IQR (if skewed)

Figure/chart:

Histogram

Boxplot

Slide20

Before key-in

What to prepare?

INSTRUMENT/QUESTIONNAIRE

Purpose and objective

Variables and units

Format

DATA DEFINITION/ DATA DICTIONARY

Explain the summary of variables in terms of variable name, description, formatting and labelling (where necessary/applicable).

DATA ENTRY

Slide21

DATA DEFINITION/DATA DICTIONARY

Slide22

Data Quality

Slide23

DATA CLEANING

DUPLICATES

More than one observation having the same patientid.

MISSING VALUES

Blank cell without information.

INCONSISTENCY

3 means nothing in the definition for ptgender.

EXTREME VALUES

Value exceeds the upper limit.

Definition:

patientid - patient identification number.

ptgender - 1 is Male and 2 is Female.

height - defined from 1.4m to 1.8m.

weight - defined from 50.0kg to 150.0kg.
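The four data-quality checks on this slide can be automated against the data definition. The sketch below is one possible way to do it in plain Python; the variable names and valid ranges come from the definition on the slide, while the `check_records` helper and the five records are hypothetical and made up so that each problem appears once.

```python
from collections import Counter

# Data dictionary, following the slide's definition.
VALID_GENDER = {1, 2}                                    # 1 = Male, 2 = Female
RANGES = {"height": (1.4, 1.8), "weight": (50.0, 150.0)}  # metres, kilograms

# Hypothetical records (illustrative values only); None marks a blank cell.
records = [
    {"patientid": 1, "ptgender": 1, "height": 1.65, "weight": 70.0},
    {"patientid": 2, "ptgender": 2, "height": 1.58, "weight": None},
    {"patientid": 2, "ptgender": 2, "height": 1.58, "weight": 61.5},
    {"patientid": 3, "ptgender": 3, "height": 1.72, "weight": 80.0},
    {"patientid": 4, "ptgender": 1, "height": 1.95, "weight": 90.0},
]


def check_records(rows):
    """Return duplicate ids and (row index, field) pairs for each kind of problem."""
    id_counts = Counter(row["patientid"] for row in rows)
    issues = {
        "duplicates": sorted(pid for pid, count in id_counts.items() if count > 1),
        "missing": [],
        "inconsistent": [],
        "extreme": [],
    }
    for index, row in enumerate(rows):
        for field, value in row.items():
            if value is None:
                issues["missing"].append((index, field))
        gender = row["ptgender"]
        if gender is not None and gender not in VALID_GENDER:
            issues["inconsistent"].append((index, "ptgender"))
        for field, (low, high) in RANGES.items():
            value = row[field]
            if value is not None and not (low <= value <= high):
                issues["extreme"].append((index, field))
    return issues


issues = check_records(records)
print(issues)
```

Flagged rows still need a human decision: a duplicate may be a re-entry or a genuine second visit, and an extreme value may be a typing error or a real measurement.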

Slide24

Statistical Analysis

Slide25

Type of Analysis (dependent variable: numerical data)

Number of groups | Independent variable | Parametric test | Non-parametric test

Two (independent) | Categorical (e.g. smokers and non-smokers) | Independent t-test | Mann-Whitney test

> two (independent) | Categorical (e.g. Malay, Chinese and Indian) | One-way ANOVA | Kruskal-Wallis test

Two (dependent) | Categorical (e.g. pre and post intervention) | Paired t-test | Wilcoxon Signed Rank test

- | Numerical (e.g. weight in kg) | Pearson’s correlation | Spearman’s correlation
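The choice of test for a numerical dependent variable amounts to a lookup on the study design and on whether a parametric test is appropriate. The sketch below encodes the test names from this slide as a small Python table; the design labels and the `choose_test` helper are hypothetical names introduced for illustration.

```python
# Keys: (study design, parametric?) -> test name as listed on the slide.
TESTS_FOR_NUMERICAL_OUTCOME = {
    ("two independent groups", True): "Independent t-test",
    ("two independent groups", False): "Mann-Whitney test",
    ("more than two independent groups", True): "One-way ANOVA",
    ("more than two independent groups", False): "Kruskal-Wallis test",
    ("two dependent groups", True): "Paired t-test",
    ("two dependent groups", False): "Wilcoxon Signed Rank test",
    ("numerical independent variable", True): "Pearson's correlation",
    ("numerical independent variable", False): "Spearman's correlation",
}


def choose_test(design, parametric):
    """Look up the test for a numerical dependent variable."""
    return TESTS_FOR_NUMERICAL_OUTCOME[(design, parametric)]


# Obese vs non-obese, duration of exercise assumed approximately normal.
print(choose_test("two independent groups", parametric=True))
```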

Slide26

Type of Analysis (dependent variable: categorical data)

Number of groups | Independent variable | Non-parametric test

Two (independent) | Categorical (e.g. smokers and non-smokers) | Chi-square test, Fisher’s exact test

> two (independent) | Categorical (e.g. Malay, Chinese and Indian) | Chi-square test, Fisher’s exact test

Assumptions:

The number of cells with Expected Count (EC) less than 5 must be less than 25% of the total number of cells.

The smallest EC must be at least 2.
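Both assumptions refer to expected counts, which for each cell equal (row total × column total) / grand total. A minimal Python sketch of the check follows; the helper names and both contingency tables are hypothetical and made up for illustration.

```python
def expected_counts(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_assumptions_met(table):
    """Fewer than 25% of cells with EC < 5, and smallest EC at least 2."""
    cells = [ec for row in expected_counts(table) for ec in row]
    share_below_5 = sum(ec < 5 for ec in cells) / len(cells)
    return share_below_5 < 0.25 and min(cells) >= 2


# Hypothetical 2x2 tables (rows: groups, columns: outcome yes/no).
large_sample = [[18, 30], [19, 33]]
small_sample = [[1, 9], [3, 7]]

print(chi_square_assumptions_met(large_sample))
print(chi_square_assumptions_met(small_sample))
```

In the second table half of the cells have an expected count of 2, so the 25% rule fails even though the minimum of 2 is satisfied.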

Slide27

Example Study – numerical data

RQ: Is there any difference in time spent on exercise between the obese and non-obese groups?

Objective: To compare the mean duration of exercise between the obese and non-obese groups

Assumption:

Dependent variable should be approximately normally distributed for each category of the independent variable.

Slide28

Second step: test statistic using SPSS

STEP:

Analyze >> Compare means >> Independent t test

Analyze >> Descriptive Statistics >> Explore

Slide29

Third step: make a decision

Descriptive statistics of each variable

Levene’s test result

If P value > 0.05 (not significant), read the first row (Equal variances assumed).

If P value < 0.05 (significant), read the second row (Equal variances not assumed).

The Levene’s test is not significant.

Slide30

Interpretation

“An independent-samples t-test indicated that duration of exercise was not significantly different between obese (Mean=16.7, SD=4.83) and non-obese (Mean=15.8, SD=3.88) respondents, t(98)=1.06, p=.291. Therefore, there is no significant association between duration of exercise and obesity.”

Table 1: Comparing mean duration of exercise between obese (n=37) and non-obese (n=63) respondents
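The reported test statistic can be roughly checked from the summary values alone. The sketch below computes the equal-variances (pooled) t statistic in Python; the `pooled_t_from_summary` helper is a hypothetical name, and the inputs are the rounded means, SDs and group sizes quoted in the interpretation. With these rounded inputs the result is about 1.02 on 98 degrees of freedom, close to but not identical to the reported 1.06, plausibly because the published means and SDs are rounded.

```python
import math


def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variances independent t statistic and degrees of freedom."""
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Rounded summary values quoted in the interpretation (obese vs non-obese).
t_stat, df = pooled_t_from_summary(16.7, 4.83, 37, 15.8, 3.88, 63)
print(f"t({df}) = {t_stat:.2f}")
```

The pooled form corresponds to the "Equal variances assumed" row, which is the row to read when Levene's test is not significant.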

Slide31

Example Study – categorical data

Comparing 2 or more proportions.

RQ: Is there any association between gender and obesity group?

Slide32

Second step: test statistic using SPSS

Analyze >> Descriptive Statistics >> Crosstabs

Slide33

Third step: make a decision

The percentage of cells with expected count less than 5 must be less than 25%

The minimum expected count must be at least 2

Slide34

Fourth step: interpretation

“A Chi-square test for independence indicated that the prevalence (proportion) of obesity is not significantly different between males and females (P=0.753). Therefore, there is no significant association between gender and obesity.”

Table 9: Association between gender and obesity
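For a 2×2 table, the Pearson chi-square statistic and its P value can be computed by hand. The sketch below is a minimal Python version without continuity correction; the `pearson_chi_square_2x2` helper and the cell counts are hypothetical, since the counts behind Table 9 are not given in the transcript, so the output is not expected to reproduce P=0.753. The P value formula used here holds only for 1 degree of freedom, i.e. a 2×2 table.

```python
import math


def pearson_chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) and P value for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows = male, female; columns = obese, non-obese.
gender_by_obesity = [[18, 30], [19, 33]]

chi2, p_value = pearson_chi_square_2x2(gender_by_obesity)
print(f"chi-square = {chi2:.4f}, P = {p_value:.3f}")
```

A P value above 0.05, as in this made-up table, leads to the same kind of conclusion as the slide: no significant association between the two variables.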

Slide35

Statistical Procedure

Sample Size

Data Collection

Data Quality

Statistical Analysis

Interpretation

Slide36

Thank You