1. Single-Factor Studies (KNNL – Chapter 16)
2. Single-Factor Models
  Independent variable can be qualitative or quantitative
  If quantitative, we typically assume a linear, polynomial, or no “structural” relation
  If qualitative, we typically have no “structural” relation
  Balanced designs have equal numbers of replicates at each level of the independent variable
  When no structure is assumed, we refer to the models as “Analysis of Variance” models and use indicator variables for treatments in the regression model
3. Single-Factor ANOVA Model
  Model Assumptions for Model Testing
    All probability distributions are normal
    All probability distributions have equal variance
    Responses are random samples from their probability distributions, and are independent
  Analysis Procedure
    Test for differences among factor level means
    Follow-up (post-hoc) comparisons among pairs or groups of factor level means
4. Cell Means Model
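For reference, the cell means model is usually written as below. This is a sketch of the standard textbook form, assuming the notation used elsewhere in these slides ($r$ factor levels, $n_i$ observations at level $i$); the symbol $\sigma^2$ for the common error variance is an assumption here.

```latex
Y_{ij} = \mu_i + \varepsilon_{ij}, \qquad
\varepsilon_{ij} \overset{\text{iid}}{\sim} N(0, \sigma^2), \qquad
i = 1, \dots, r; \quad j = 1, \dots, n_i
```

Each $\mu_i$ is the mean response at factor level $i$, so the model places no structure on how the means relate to one another.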
5. Example – Virtual Training of Lifeboat Task
  Treatments – r = 4 Methods of Virtual Training
    Traditional Lecture and Materials (LEC/MAT)
    Computer Monitor / Keyboard (MON/KEY)
    Head Mounted Display / Joypad (HMD/JPD)
    Head Mounted Display / Wearable Sensors (HMD/WEA)
  Response (Y) – Procedural Knowledge Test Score
  n = 16 subjects per treatment (data generated to match)
  Source: J. Jung and Y.J. Ahn (2018). “Effects of Interface on Procedural Skill Transfer in Virtual Training: Lifeboat Launching Operation Study,” Computer Animation & Virtual Worlds, 29, e1812. https://doi.org/10.1002/cav.1812
7. Cell Means Model – Regression Form
8. Cell Means Model – Regression Form - Example
9. R Program
    vt <- read.csv("http://users.stat.ufl.edu/~winner/data/virtual_training.csv")
    attach(vt); names(vt)

    #### Matrix Form - Cell Means Model
    Y <- matrix(procKnow, ncol=1)       # 64x1 column vector of responses
    one <- matrix(rep(1,16), ncol=1)    # 16x1 column vector of 1s
    zero <- matrix(rep(0,16), ncol=1)   # 16x1 column vector of 0s
    ## Xcm is 64x4 block diagonal matrix
    Xcm <- rbind(cbind(one,zero,zero,zero), cbind(zero,one,zero,zero),
                 cbind(zero,zero,one,zero), cbind(zero,zero,zero,one))
    XcmPXcm <- t(Xcm) %*% Xcm           # X'X
    XcmPY <- t(Xcm) %*% Y               # X'Y
    XcmPXcmINV <- solve(XcmPXcm)        # inv(X'X)
    betahatcm <- XcmPXcmINV %*% XcmPY   # betahat
    round(cbind(XcmPXcm, XcmPY, XcmPXcmINV, betahatcm), 4)

    > round(cbind(XcmPXcm, XcmPY, XcmPXcmINV, betahatcm), 4)
         [,1] [,2] [,3] [,4]     [,5]   [,6]   [,7]   [,8]   [,9]  [,10]
    [1,]   16    0    0    0  78.8888 0.0625 0.0000 0.0000 0.0000 4.9306
    [2,]    0   16    0    0 123.3334 0.0000 0.0625 0.0000 0.0000 7.7083
    [3,]    0    0   16    0 107.7777 0.0000 0.0000 0.0625 0.0000 6.7361
    [4,]    0    0    0   16 109.9999 0.0000 0.0000 0.0000 0.0625 6.8750
10. Model Interpretations
  Factor Level Means
    Observational Studies – The $\mu_i$ represent the population means among units from the populations of factor levels
    Experimental Studies – The $\mu_i$ represent the means of the various factor levels, had they been assigned to a population of experimental units
  Fixed and Random Factors
    Fixed Factors – All levels of interest are observed in the study
    Random Factors – Factor levels included in the study represent a sample from a population of factor levels
11. Fitting ANOVA Models
12. Analysis of Variance
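The decomposition behind the ANOVA table can be sketched as follows. The formulas mirror the quantities computed in the "Direct Computations" R program (SSTR, SSE, SSTO and their degrees of freedom); the bar and dot notation for the means and $s_i^2$ for the sample variance at level $i$ are assumed here.

```latex
\begin{aligned}
SSTO &= \sum_{i=1}^{r} \sum_{j=1}^{n_i} \left(Y_{ij} - \bar{Y}_{\cdot\cdot}\right)^2, & df_{TO} &= n_T - 1 \\
SSTR &= \sum_{i=1}^{r} n_i \left(\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot}\right)^2, & df_{TR} &= r - 1 \\
SSE  &= \sum_{i=1}^{r} \sum_{j=1}^{n_i} \left(Y_{ij} - \bar{Y}_{i\cdot}\right)^2
      = \sum_{i=1}^{r} (n_i - 1)\, s_i^2, & df_{E} &= n_T - r \\
SSTO &= SSTR + SSE, & n_T - 1 &= (r - 1) + (n_T - r)
\end{aligned}
```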
13. Procedural Knowledge Scores – Sums of Squares
14. ANOVA Table
15. F-Test for $H_0: \mu_1 = \cdots = \mu_r$
16. Procedural Knowledge Scores – ANOVA
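17. R Program – Direct Computations
    #### "Brute-Force" Calculations
    n.trt <- as.vector(tapply(procKnow, grp.trt, length))   # group sample sizes
    mean.trt <- as.vector(tapply(procKnow, grp.trt, mean))  # group means
    sd.trt <- as.vector(tapply(procKnow, grp.trt, sd))      # group standard deviations
    mean.all <- mean(procKnow)
    r <- length(n.trt)
    n_T <- sum(n.trt)
    SSTR <- sum(n.trt * (mean.trt - mean.all)^2); dfTR <- r-1
    SSE <- sum((n.trt-1) * sd.trt^2); dfE <- n_T-r
    SSTO <- SSTR + SSE; dfTO <- n_T-1
    MSTR <- SSTR / dfTR
    MSE <- SSE / dfE
    F_star <- MSTR / MSE
    F.95 <- qf(.95,dfTR,dfE)
    pF <- 1 - pf(F_star, dfTR, dfE)
    ## Assemble the ANOVA table (this step is not shown on the original slide)
    anova.out <- rbind(c(dfTR, SSTR, MSTR, F_star, F.95, pF),
                       c(dfE, SSE, MSE, NA, NA, NA),
                       c(dfTO, SSTO, NA, NA, NA, NA))
    colnames(anova.out) <- c("df","SS","MS","F*","F(.95)","P(>F)")
    rownames(anova.out) <- c("Treatment","Error","Total")
    round(anova.out, 4)

    > round(anova.out, 4)
              df       SS      MS     F* F(.95)  P(>F)
    Treatment  3  65.6639 21.8880 4.9406 2.7581 0.0039
    Error     60 265.8149  4.4302     NA     NA     NA
    Total     63 331.4788      NA     NA     NA     NA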
18. R Commands with aov Function
    vt <- read.csv("http://users.stat.ufl.edu/~winner/data/virtual_training.csv")
    attach(vt); names(vt)
    grp.trt <- factor(grp.trt, levels=1:4,
                      labels=c("LEC/MAT","MON/KEY","HMD/JOY","HMD/WEA"))
    vt.mod1 <- aov(procKnow ~ grp.trt)
    anova(vt.mod1)

    > anova(vt.mod1)
    Analysis of Variance Table

    Response: procKnow
              Df  Sum Sq Mean Sq F value   Pr(>F)
    grp.trt    3  65.664 21.8880  4.9406 0.003931 **
    Residuals 60 265.815  4.4302
19. F-Test for $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$ - Example
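As a sketch of how the test can be carried out from summary quantities alone, the block below plugs in the sums of squares and degrees of freedom reported in the ANOVA output of the R programs in this deck. The 5% significance level matches the `F(.95)` column there; the variable names are illustrative.

```r
# Sums of squares and degrees of freedom copied from the ANOVA table output
SSTR <- 65.6639;  dfTR <- 3    # treatments
SSE  <- 265.8149; dfE  <- 60   # error

MSTR <- SSTR / dfTR
MSE  <- SSE / dfE
F_star <- MSTR / MSE                                   # test statistic
F_crit <- qf(0.95, dfTR, dfE)                          # critical value at alpha = 0.05
p_value <- pf(F_star, dfTR, dfE, lower.tail = FALSE)   # upper-tail probability
reject <- F_star > F_crit                              # decision rule

round(c(F_star = F_star, F_crit = F_crit, p_value = p_value), 4)
reject
```

The results should agree with the reported values (F* = 4.9406, F(.95) = 2.7581, P = 0.0039), so the hypothesis of equal means is rejected at the 5% level.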
20. General Linear Test of Equal Means
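A sketch of the general linear test statistic, assuming the usual full (F) and reduced (R) model notation. For the test of equal means, the full model is the cell means model and the reduced model has a single common mean, so the statistic reduces to the ANOVA F-ratio.

```latex
\begin{aligned}
F^* &= \frac{\left[SSE(R) - SSE(F)\right] / \left(df_R - df_F\right)}{SSE(F) / df_F} \\
SSE(F) &= SSE, \quad df_F = n_T - r, \qquad SSE(R) = SSTO, \quad df_R = n_T - 1 \\
F^* &= \frac{(SSTO - SSE)/(r - 1)}{SSE/(n_T - r)} = \frac{MSTR}{MSE}
\end{aligned}
```

With the values in the R output for this example, $(331.48 - 265.81)/3$ divided by $265.81/60$ gives approximately 4.94, matching the F value reported by `anova(vt.mod.r, vt.mod.c)`.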
21. R Program
    ### General Linear Test
    ## Complete model - Cell Means with no intercept
    vt.mod.c <- aov(procKnow ~ grp.trt - 1)
    anova(vt.mod.c)
    ## Reduced Model - Intercept only model
    vt.mod.r <- aov(procKnow ~ 1)
    anova(vt.mod.r)
    ## General Linear Test
    anova(vt.mod.r, vt.mod.c)

    > anova(vt.mod.c)
    Analysis of Variance Table

    Response: procKnow
              Df  Sum Sq Mean Sq F value    Pr(>F)
    grp.trt    4 2821.91  705.48  159.24 < 2.2e-16 ***
    Residuals 60  265.81    4.43

    > anova(vt.mod.r)
    Analysis of Variance Table

    Response: procKnow
              Df Sum Sq Mean Sq F value Pr(>F)
    Residuals 63 331.48  5.2616

    > anova(vt.mod.r, vt.mod.c)
    Model 1: procKnow ~ 1
    Model 2: procKnow ~ grp.trt - 1
      Res.Df    RSS Df Sum of Sq      F   Pr(>F)
    1     63 331.48
    2     60 265.81  3    65.664 4.9406 0.003931
22. Factor Effects Model
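The factor effects model is commonly written as below. This sketch assumes the unweighted-mean version, which is what the sum-to-zero coding (rows of $-1$ for the last treatment) in the matrix-form R program corresponds to; the weighted-mean variant defines $\mu_\cdot$ differently.

```latex
\begin{aligned}
Y_{ij} &= \mu_\cdot + \tau_i + \varepsilon_{ij}, \qquad
\varepsilon_{ij} \overset{\text{iid}}{\sim} N(0, \sigma^2) \\
\mu_\cdot &= \frac{1}{r} \sum_{i=1}^{r} \mu_i, \qquad
\tau_i = \mu_i - \mu_\cdot, \qquad
\sum_{i=1}^{r} \tau_i = 0 \;\Rightarrow\; \tau_r = -\tau_1 - \cdots - \tau_{r-1}
\end{aligned}
```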
23. Regression Approach – Factor Effects Model
24. Factor Effects Model - Example
25. R Program
    #### Matrix Form - Factor Effects Model
    Y <- matrix(procKnow, ncol=1)       # 64x1 column vector of responses
    one <- matrix(rep(1,16), ncol=1)    # 16x1 column vector of 1s
    zero <- matrix(rep(0,16), ncol=1)   # 16x1 column vector of 0s
    ## Xfe is 64x4 model matrix
    Xfe <- rbind(cbind(one,one,zero,zero), cbind(one,zero,one,zero),
                 cbind(one,zero,zero,one), cbind(one,-one,-one,-one))
    XfePXfe <- t(Xfe) %*% Xfe           # X'X
    XfePY <- t(Xfe) %*% Y               # X'Y
    XfePXfeINV <- solve(XfePXfe)        # inv(X'X)
    betahatfe <- XfePXfeINV %*% XfePY   # betahat
    round(cbind(XfePXfe, XfePY, XfePXfeINV, betahatfe), 4)

    > round(cbind(XfePXfe, XfePY, XfePXfeINV, betahatfe), 4)
         [,1] [,2] [,3] [,4]     [,5]   [,6]    [,7]    [,8]    [,9]   [,10]
    [1,]   64    0    0    0 419.9998 0.0156  0.0000  0.0000  0.0000  6.5625
    [2,]    0   32   16   16 -31.1111 0.0000  0.0469 -0.0156 -0.0156 -1.6319
    [3,]    0   16   32   16  13.3335 0.0000 -0.0156  0.0469 -0.0156  1.1458
    [4,]    0   16   16   32  -2.2222 0.0000 -0.0156 -0.0156  0.0469  0.1736
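To see how the two parameterizations relate, the following sketch converts the estimated cell means reported in the cell means R program into factor effects, assuming the unweighted-mean definition (effects are deviations from the simple average of the cell means). The variable names are illustrative.

```r
# Estimated cell means from the cell means model output (betahatcm)
mu_hat <- c(4.9306, 7.7083, 6.7361, 6.8750)

mu_dot_hat <- mean(mu_hat)         # unweighted overall mean
tau_hat <- mu_hat - mu_dot_hat     # factor effects, constrained to sum to zero

round(mu_dot_hat, 4)
round(tau_hat, 4)
```

The overall mean and the first three effects should reproduce the factor effects estimates (6.5625, -1.6319, 1.1458, 0.1736), and the fourth effect is the negative of the sum of the other three.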
26. Factor Effects Model with Weighted Mean
27. Regression for Cell Means Model
28. Randomization (aka Permutation) Tests
  Treats the units in the study as a finite population of units, each with a fixed error term $\varepsilon_{ij}$
  When the randomization procedure assigns the unit to treatment $i$, we observe $Y_{ij} = \mu_\cdot + \tau_i + \varepsilon_{ij}$
  When there are no treatment effects (all $\tau_i = 0$), $Y_{ij} = \mu_\cdot + \varepsilon_{ij}$
  We can compute a test statistic, such as $F^*$, under all (or in practice, many) potential treatment arrangements of the observed units (responses)
  The p-value is the proportion of these permutation test statistics that are as or more extreme than the one from the original arrangement
  Total number of potential permutations = $n_T!/(n_1! \cdots n_r!)$
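To show why only a random sample of arrangements is used in practice, this sketch evaluates the count of distinct arrangements for the lifeboat example (16 subjects in each of 4 groups). It works on the log scale with `lfactorial` to avoid overflow; the variable names are illustrative.

```r
n.trt <- rep(16, 4)                      # group sizes in the lifeboat example
n_T <- sum(n.trt)                        # total sample size (64)

# log of n_T! / (n_1! * ... * n_r!)
log.count <- lfactorial(n_T) - sum(lfactorial(n.trt))
n.arrangements <- exp(log.count)

format(n.arrangements, digits = 3, scientific = TRUE)
```

The count is on the order of 10^35, far too many to enumerate, which is why the program draws a few thousand random permutations instead.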
29. R Program
    ### Randomization Test
    n.trt <- as.vector(tapply(procKnow, grp.trt, length))
    mean.trt <- as.vector(tapply(procKnow, grp.trt, mean))
    mean.all <- mean(procKnow)
    r <- length(n.trt)
    n_T <- sum(n.trt)
    SSTR.obs <- sum(n.trt * (mean.trt - mean.all)^2)
    N.perm <- 9999
    set.seed(1234)
    SSTR.perm <- rep(0, N.perm)
    for (i in 1:N.perm) {
      perm <- sample(1:n_T, n_T, replace=FALSE)   ## Random permutation of 1:n_T
      Y.perm <- procKnow[perm]
      mean.trt.perm <- as.vector(tapply(Y.perm, grp.trt, mean))
      SSTR.perm[i] <- sum(n.trt * (mean.trt.perm - mean.all)^2)
    }
    (perm.pvalue <- (sum(SSTR.perm >= SSTR.obs) + 1) / (N.perm + 1))
    hist(SSTR.perm, breaks="Scott")
    abline(v = SSTR.obs, col="blue", lwd=2)

    > (perm.pvalue <- (sum(SSTR.perm >= SSTR.obs) + 1) / (N.perm + 1))
    [1] 0.005
31. Power Approach to Sample Size Choice - Tables
32. Power Approach to Sample Size Choice – R Code
33. R Program
    ### Power Computations
    mu <- c(5,8,7,7); sigma2 <- 4.5
    r <- length(mu)
    n <- rep(10, r)
    mu_dot <- sum(n*mu) / sum(n)
    lambda <- sum(n * (mu - mu_dot)^2) / sigma2
    df_E <- sum(n) - r
    (f.95 <- qf(.95,r-1,df_E))
    (power <- 1 - pf(f.95, r-1, df_E, lambda))

    > (f.95 <- qf(.95,r-1,df_E))
    [1] 2.866266
    > (power <- 1 - pf(f.95, r-1, df_E, lambda))
    [1] 0.7354687
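The same computation can be turned around to choose a sample size. The sketch below reuses the hypothesized means and variance from the power program and searches for the smallest common group size that reaches a target power. The 80% target and the helper name `power.at` are assumptions made for illustration.

```r
mu <- c(5, 8, 7, 7); sigma2 <- 4.5   # hypothesized means and error variance
r <- length(mu)
alpha <- 0.05
target <- 0.80                       # assumed target power

# Power of the F-test with n observations per treatment (balanced design)
power.at <- function(n) {
  lambda <- n * sum((mu - mean(mu))^2) / sigma2   # non-centrality parameter
  df_E <- r * n - r                               # error degrees of freedom
  f.crit <- qf(1 - alpha, r - 1, df_E)
  pf(f.crit, r - 1, df_E, ncp = lambda, lower.tail = FALSE)
}

n <- 2
while (power.at(n) < target) n <- n + 1   # smallest n reaching the target
n.needed <- n

c(n.needed = n.needed, power = power.at(n.needed))
```

With n = 10 per group the function reproduces the power of about 0.735 shown in the output, so the required group size for 80% power is larger than 10.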
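34. R Program (plot of central and non-central F densities)
    ## Uses r, df_E, lambda and f.95 from the power computations;
    ## df_E (= 36) replaces n_T-r, since n_T is left over from the earlier programs (n_T = 64)
    f.seq <- seq(0,10,.001)
    plot(f.seq, df(f.seq,r-1,df_E), type="l", col="blue", lwd=2)
    lines(f.seq, df(f.seq,r-1,df_E,lambda), col="red", lwd=2)
    abline(v=f.95)
    legend("topright", c("central F", "non-central F"), col=c("blue", "red"),
           lty=1, lwd=2, box.lty=0)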
35. Power Approach to Finding “Best” Treatment
36. Effects of Model Departures
  Non-normal data – Generally not problematic for the F-test if the data are not too far from normal and the sample sizes are reasonably large
  Unequal error variances – Generally not a problem for the F-test as long as the sample sizes are approximately equal
  Non-independence of error terms – Can cause problems with tests; use Repeated Measures ANOVA if the same subject receives each treatment
37. Tests for Constant Variance $H_0: \sigma_1^2 = \cdots = \sigma_r^2$
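One common test of this hypothesis is the median-centered Levene (Brown-Forsythe) test, which amounts to a one-way ANOVA F-test on the absolute deviations of each response from its group median. The sketch below uses simulated toy data, not the study data, and base R only; the group standard deviations are arbitrary assumptions.

```r
set.seed(1)                                   # simulated toy data for illustration
g <- factor(rep(1:4, each = 16))              # 4 groups of 16
y <- rnorm(64, mean = 6, sd = rep(c(1, 2, 2, 3), each = 16))

d <- abs(y - ave(y, g, FUN = median))         # absolute deviations from group medians
bf <- anova(lm(d ~ g))                        # one-way ANOVA on the deviations
bf
```

The F value and p-value in the first row of `bf` are the test statistic and p-value for equal variances.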
38. Bartlett’s Test
  General test that can be used in many settings with groups
  $H_0: \sigma_1^2 = \cdots = \sigma_r^2$ (homogeneous variances)
  $H_a$: Population variances are not all equal
  MSE ≡ Pooled Variance
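A sketch of the Bartlett statistic computed by hand, using the pooled variance (MSE) and the group sample variances, with a chi-square reference distribution on r - 1 degrees of freedom. The data are simulated toy data, not the study data, and the variable names are illustrative.

```r
set.seed(2)                                   # simulated toy data for illustration
g <- factor(rep(1:4, each = 16))
y <- rnorm(64, mean = 6, sd = rep(c(1, 2, 2, 3), each = 16))

n.i  <- tapply(y, g, length)                  # group sizes
s2.i <- tapply(y, g, var)                     # group sample variances
r <- nlevels(g); n_T <- sum(n.i)

MSE <- sum((n.i - 1) * s2.i) / (n_T - r)      # pooled variance
C <- 1 + (sum(1 / (n.i - 1)) - 1 / (n_T - r)) / (3 * (r - 1))   # correction factor
B <- ((n_T - r) * log(MSE) - sum((n.i - 1) * log(s2.i))) / C    # Bartlett statistic
p.value <- pchisq(B, df = r - 1, lower.tail = FALSE)

c(B = B, p.value = p.value)
```

The hand computation can be checked against `bartlett.test(y ~ g)`, which reports the same statistic as "Bartlett's K-squared".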
39. R Program
    ### Levene's and Bartlett's Tests of Equal Variances
    ## Levene's Test (in car package)
    install.packages("car")    # only needed once
    library(car)
    leveneTest(procKnow, grp.trt, "median")
    ## Bartlett's Test
    bartlett.test(procKnow ~ grp.trt)

    > leveneTest(procKnow, grp.trt, "median")
    Levene's Test for Homogeneity of Variance (center = "median")
          Df F value Pr(>F)
    group  3  2.1463 0.1038
          60

    > bartlett.test(procKnow ~ grp.trt)
    Bartlett test of homogeneity of variances
    data: procKnow by grp.trt
    Bartlett's K-squared = 6.7626, df = 3, p-value = 0.07986
40. Remedial Measures
  Normally distributed, unequal variances
    Use Weighted Least Squares with weights: $w_{ij} = 1/s_i^2$
    Welch’s Test
  Non-normal data (with possibly unequal variances)
    Variance Stabilizing and Box-Cox Transformations
      Variance proportional to mean: $Y' = \sqrt{Y}$
      Standard deviation proportional to mean: $Y' = \log(Y)$
      Standard deviation proportional to mean²: $Y' = 1/Y$
      Response is a (binomial) proportion: $Y' = 2\arcsin(\sqrt{Y})$
    Non-parametric tests – F-test based on ranks and Kruskal-Wallis Test
41. Welch’s Test – Unequal Variances
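A sketch of Welch's statistic computed by hand: group means are weighted by $n_i/s_i^2$, and the denominator degrees of freedom are estimated from the data. The data are simulated toy data, not the study data; the variable names are illustrative, and the arithmetic follows the form used by R's `oneway.test` with unequal variances.

```r
set.seed(3)                                   # simulated toy data for illustration
g <- factor(rep(1:4, each = 16))
y <- rnorm(64, mean = rep(c(5, 7, 7, 7), each = 16),
               sd   = rep(c(1, 2, 2, 3), each = 16))

n.i <- tapply(y, g, length)                   # group sizes
m.i <- tapply(y, g, mean)                     # group means
v.i <- tapply(y, g, var)                      # group variances
r <- nlevels(g)

w.i <- n.i / v.i                              # weights: inverse variance of each mean
m.w <- sum(w.i * m.i) / sum(w.i)              # weighted grand mean
tmp <- sum((1 - w.i / sum(w.i))^2 / (n.i - 1)) / (r^2 - 1)

F.welch <- sum(w.i * (m.i - m.w)^2) / ((r - 1) * (1 + 2 * (r - 2) * tmp))
df2 <- 1 / (3 * tmp)                          # estimated denominator df
p.value <- pf(F.welch, r - 1, df2, lower.tail = FALSE)

c(F = F.welch, df1 = r - 1, df2 = df2, p.value = p.value)
```

The denominator degrees of freedom are generally not an integer, which is why output from `oneway.test` shows a fractional "denom df".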
42. Example – Virtual Training of Lifeboat Task
43. R Commands with oneway.test Function
    oneway.test(procKnow ~ grp.trt, var.equal=FALSE)

    > oneway.test(procKnow ~ grp.trt, var.equal=FALSE)
    One-way analysis of means (not assuming equal variances)
    data: procKnow and grp.trt
    F = 6.827, num df = 3.000, denom df = 32.562, p-value = 0.001072
44. Nonparametric Tests – Non-Normal Data
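A sketch of the Kruskal-Wallis statistic computed by hand: the responses are replaced by their ranks across all groups, and the statistic compares the group rank sums, with a chi-square reference distribution on r - 1 degrees of freedom. The data are simulated toy data, not the study data, and are continuous, so no tie correction is applied here.

```r
set.seed(4)                                   # simulated toy data for illustration
g <- factor(rep(1:4, each = 16))
y <- rnorm(64, mean = rep(c(5, 7, 7, 7), each = 16))

rk  <- rank(y)                                # ranks over all observations
n.i <- tapply(rk, g, length)                  # group sizes
R.i <- tapply(rk, g, sum)                     # group rank sums
n_T <- length(y)

H <- 12 / (n_T * (n_T + 1)) * sum(R.i^2 / n.i) - 3 * (n_T + 1)
p.value <- pchisq(H, df = nlevels(g) - 1, lower.tail = FALSE)

c(H = H, p.value = p.value)
```

With tied responses, `kruskal.test` also divides by a tie-correction factor, so the uncorrected value above would differ slightly in that case.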
45. Example – Virtual Training of Lifeboat Task
46. R Commands – kruskal.test Function
    kruskal.test(procKnow ~ grp.trt)

    > kruskal.test(procKnow ~ grp.trt)
    Kruskal-Wallis rank sum test
    data: procKnow by grp.trt
    Kruskal-Wallis chi-squared = 12.915, df = 3, p-value = 0.004824