
Bayesian Knowledge Tracing and Other Predictive Models in Educational Data Mining

Zachary A. Pardos

PSLC Summer School 2011


Outline of Talk

- Introduction to Knowledge Tracing
  - History
  - Intuition
  - Model
  - Demo
- Variations (and other models)
- Evaluations (Baker work / KDD)
- Random Forests
  - Description
  - Evaluations (KDD)
- Time left? Vote on next topic

Intro to Knowledge Tracing


History

Introduced in 1995 (Corbett & Anderson, UMUAI)

Based on the ACT-R theory of skill knowledge (Anderson, 1993)

Computations based on a variation of Bayesian calculations proposed in 1972 (Atkinson)


Intuition

Based on the idea that practice on a skill leads to mastery of that skill

Has four parameters used to describe student performance

Relies on a KC (knowledge component) model

Tracks student knowledge over time

Given a student's response sequence 1 to n, predict response n+1.

For some skill K, the chronological response sequence for student Y:

0 0 0 1 1 1 ?
1 .... n n+1

[0 = incorrect response, 1 = correct response]


0 0 0 1 1 1 1

Track knowledge over time (model of learning)


Knowledge Tracing (KT) can be represented as a simple HMM, with a latent knowledge node and an observed question node at each opportunity.

Node representations: K = knowledge node, Q = question node

Node states: K = two state (0 or 1), Q = two state (0 or 1)


Four parameters of the KT model:

P(L0) = probability of initial knowledge
P(T) = probability of learning
P(G) = probability of guess
P(S) = probability of slip

The probability of forgetting is assumed to be zero (fixed).


Formulas for inference and prediction

Derivation (Reye, JAIED 2004). The formulas use Bayes' theorem to make inferences about the latent variable. With P(Ln) the probability the skill is known before observing response n:

(1) P(Ln | correct) = P(Ln)(1 - P(S)) / [ P(Ln)(1 - P(S)) + (1 - P(Ln)) P(G) ]

(2) P(Ln | incorrect) = P(Ln) P(S) / [ P(Ln) P(S) + (1 - P(Ln)) (1 - P(G)) ]

(3) P(Ln+1) = P(Ln | evidence) + (1 - P(Ln | evidence)) P(T)

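The three formulas above can be written as a short, self-contained sketch (the function names are mine, not from the slides):

```python
def bkt_update(p_know, correct, p_t, p_g, p_s):
    """Posterior over knowledge given one response (formulas 1-2),
    then the learning transition (formula 3; forgetting fixed at zero)."""
    if correct:
        posterior = p_know * (1 - p_s) / (p_know * (1 - p_s) + (1 - p_know) * p_g)
    else:
        posterior = p_know * p_s / (p_know * p_s + (1 - p_know) * (1 - p_g))
    return posterior + (1 - posterior) * p_t

def bkt_predict(p_know, p_g, p_s):
    """Probability the next response is correct."""
    return p_know * (1 - p_s) + (1 - p_know) * p_g
```

Running bkt_update over a response sequence, and calling bkt_predict before each response, reproduces the knowledge-tracing loop used in the later slides.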

Model Training:

Model training step: values of the parameters P(T), P(G), P(S) & P(L0) are used to predict student responses.

Ad-hoc values could be used, but will likely not be the best fitting.

Goal: find a set of values for the parameters that minimizes prediction error.

[Slide figure: example response sequences for Students A, B, and C]

Model Prediction:

Model tracing step. Skill: Subtraction. The student's last three responses to Subtraction questions (in the unit): 0 1 1.

Latent knowledge estimates P(K) rise across the observed sequence (10%, 45%, 75%) and continue over the test set questions (79%, 83%); the observable response predictions P(Q) for the test set questions are 71% and 74%.

Influence of parameter values

Estimate of knowledge for a student with response sequence: 0 1 1 1 1 1 1 1 1 1

P(L0): 0.50, P(T): 0.20, P(G): 0.14, P(S): 0.09. The student reached 95% probability of knowledge after the 4th opportunity.

Influence of parameter values

Estimate of knowledge for a student with the same response sequence: 0 1 1 1 1 1 1 1 1 1

P(L0): 0.50, P(T): 0.20, P(G): 0.64, P(S): 0.03. With the higher guess rate, the student reached 95% probability of knowledge only after the 8th opportunity.
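A quick way to check the effect shown on these two slides is to simulate the update rule and note when the knowledge estimate first crosses 95%. This is a sketch; the exact opportunity count depends on whether the estimate is read before or after the learning transition:

```python
def opportunities_to_mastery(responses, l0, p_t, p_g, p_s, threshold=0.95):
    """Return the 1-based opportunity at which P(knowledge) first reaches
    the threshold, or None if it never does."""
    p = l0
    for n, correct in enumerate(responses, start=1):
        # Bayesian update on the observed response
        if correct:
            p = p * (1 - p_s) / (p * (1 - p_s) + (1 - p) * p_g)
        else:
            p = p * p_s / (p * p_s + (1 - p) * (1 - p_g))
        if p >= threshold:
            return n
        p = p + (1 - p) * p_t  # learning transition

seq = [0, 1, 1, 1, 1, 1, 1, 1, 1, 1]
low_guess = opportunities_to_mastery(seq, 0.50, 0.20, 0.14, 0.09)
high_guess = opportunities_to_mastery(seq, 0.50, 0.20, 0.64, 0.03)
# the high-guess parameters delay the mastery decision by several opportunities
```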

( Demo )

Variations on Knowledge Tracing (and other models)

Prior Individualization Approach

Do all students enter a lesson with the same background knowledge?

Node representations: K = knowledge node, Q = question node, S = student node (observed)

Node states: K = two state (0 or 1), Q = two state (0 or 1), S = multi state (1 to N), conditioning the prior P(L0|S)

Prior Individualization Approach

Conditional probability table (CPT) of the student node:

S value | P(S = value)
1 | 1/N
2 | 1/N
... | ...
N | 1/N

The CPT of the observed student node is fixed. It is possible to have an S value for every student ID, but this raises an initialization issue (where do these prior values come from?). The S value can instead represent a cluster or type of student.

Prior Individualization Approach

CPT of the individualized prior node, P(L0|S):

S value | P(L0|S)
1 | 0.05
2 | 0.30
3 | 0.95
... | ...
N | 0.92

Individualized L0 values need to be seeded. This CPT can be fixed, or the values can be learned. Fixing this CPT and seeding it with values based on a student's first response can be an effective strategy.

This model, which individualizes only L0, is called the Prior Per Student (PPS) model.

Prior Individualization Approach

CPT of the individualized prior node: bootstrapping the prior from the first response.

S value | P(L0|S)
0 | 0.05
1 | 0.30

If a student answers incorrectly on the first question, she gets a low prior; if she answers correctly on the first question, she gets a higher prior.
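The bootstrapped prior can be expressed in a couple of lines (the 0.05 / 0.30 defaults are the example values from this slide):

```python
def pps_prior(first_response, prior_if_incorrect=0.05, prior_if_correct=0.30):
    """Seed an individualized P(L0) from the student's first response."""
    return prior_if_correct if first_response else prior_if_incorrect
```

The returned value simply replaces the shared P(L0) when tracing that student's remaining responses.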


Prior Individualization Approach

What values to use for the two priors?

S value | P(L0|S)
0 | ?
1 | ?

Options for setting them:
- Use ad-hoc values (e.g., P(L0|S=0) = 0.10, P(L0|S=1) = 0.85)
- Learn the values (EM)
- Link with the guess/slip CPT: P(L0|S=0) = P(S), P(L0|S=1) = 1 - P(G)

With ASSISTments data, PPS (ad-hoc) achieved an R² of 0.301 (vs. 0.176 with standard KT) (Pardos & Heffernan, UMAP 2010).


Variations on Knowledge Tracing (and other models)

1. BKT-BF (Baker et al., 2010)

Learns values for the four parameters by performing a grid search (0.01 granularity) and chooses the set of parameters with the best squared error.
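A brute-force fit in the spirit of BKT-BF can be sketched as follows. The granularity here is a coarser 0.1 so it runs in seconds, and the squared error is computed over one-step-ahead predictions, which is my reading of the slide:

```python
from itertools import product

def bkt_sse(seq, l0, p_t, p_g, p_s):
    """Sum of squared one-step-ahead prediction errors for one sequence."""
    p, sse = l0, 0.0
    for correct in seq:
        pred = p * (1 - p_s) + (1 - p) * p_g  # P(correct) before observing
        sse += (correct - pred) ** 2
        if correct:
            p = p * (1 - p_s) / (p * (1 - p_s) + (1 - p) * p_g)
        else:
            p = p * p_s / (p * p_s + (1 - p) * (1 - p_g))
        p = p + (1 - p) * p_t
    return sse

def fit_bkt_bf(sequences, step=0.1):
    """Exhaustive grid search over (L0, T, G, S); returns (best SSE, params)."""
    grid = [round(step * i, 2) for i in range(1, int(1 / step))]
    return min(
        (sum(bkt_sse(s, *params) for s in sequences), params)
        for params in product(grid, repeat=4)
    )
```

At 0.01 granularity the same loop evaluates 99^4 parameter sets, which is why the slides describe it as brute force.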


2. BKT-EM (Chang et al., 2006)

Learns values for the parameters with Expectation Maximization (EM), maximizing the log likelihood fit to the data.

3. BKT-CGS (Baker, Corbett, & Aleven, 2008)

Guess and slip parameters are assessed contextually using a regression on features generated from student performance in the tutor.

4. BKT-CSlip (Baker, Corbett, & Aleven, 2008)

Uses the student's averaged contextual slip parameter, learned across all incorrect actions.

5. BKT-LessData (Nooraiei et al., 2011)

Limits each student's response sequence to the most recent 15 responses (max) during EM training.

6. BKT-PPS (Pardos & Heffernan, 2010)

Prior per student (PPS) model, which individualizes the prior parameter P(L0|S). Students are assigned a prior based on their response to the first question.

7. CFAR (Yu et al., 2010)

Correct on First Attempt Rate (CFAR) calculates the student's percent correct on the current skill up until the question being predicted.

Student responses for Skill X: 0 1 0 1 0 1 _
Predicted next response would be 0.50.
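CFAR is simple enough to state directly in code:

```python
def cfar_predict(responses):
    """Percent correct on the current skill up to the question being predicted."""
    return sum(responses) / len(responses)
```

cfar_predict([0, 1, 0, 1, 0, 1]) gives 0.50, matching the slide's example.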


8. Tabling (Wang et al., 2011)

Uses the student's response sequence (max length 3) to predict the next response by looking up the average next response among students with the same sequence in the training set.

Training set:
Student A: 0 1 1 0
Student B: 0 1 1 1
Student C: 0 1 1 1

Test set student: 0 0 1 _
Predicted next response would be 0.66.

With the max table length set to 3, the table size was 2^0 + 2^1 + 2^2 + 2^3 = 15.
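A minimal sketch of the tabling predictor (function names are mine; unseen sequences fall back to a default):

```python
def build_table(training_sequences, max_len=3):
    """Map each observed response subsequence (up to max_len) to the
    average next response seen in the training set."""
    nexts = {}
    for seq in training_sequences:
        for i in range(1, len(seq)):
            key = tuple(seq[max(0, i - max_len):i])
            nexts.setdefault(key, []).append(seq[i])
    return {k: sum(v) / len(v) for k, v in nexts.items()}

def table_predict(table, responses, max_len=3, default=0.5):
    """Look up the student's most recent responses in the table."""
    return table.get(tuple(responses[-max_len:]), default)
```

With the slide's training set (A: 0 1 1 0, B: 0 1 1 1, C: 0 1 1 1), looking up the sequence 0 1 1 yields (0 + 1 + 1) / 3 ≈ 0.66.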


9. PFA (Pavlik et al., 2009)

Performance Factors Analysis (PFA): a logistic regression model which elaborates on the Rasch IRT model. It predicts performance based on the counts of the student's prior failures and successes on the current skill. An overall difficulty parameter β is also fit for each skill or each item; in this study we use the variant of PFA that fits β for each skill. The PFA equation is:

m = Σ_j ( β_j + γ_j · s_j + ρ_j · f_j ),  P(correct) = 1 / (1 + e^(−m))

where s_j and f_j are the student's prior successes and failures on skill j.
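Under those definitions, a single-skill PFA prediction looks like this (the parameter values in the usage are made up for illustration; in practice β, γ, ρ are fit by logistic regression):

```python
import math

def pfa_predict(successes, failures, beta, gamma, rho):
    """PFA: logistic function of skill difficulty plus weighted counts
    of prior successes and failures on the skill."""
    m = beta + gamma * successes + rho * failures
    return 1 / (1 + math.exp(-m))
```

With β = 0, zero prior successes and failures give a prediction of 0.5, and (for γ > 0 > ρ) each additional success pushes the prediction up while each failure pushes it down.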


Evaluation Methodology

Dataset: Cognitive Tutor for Genetics
- 76 CMU undergraduate students
- 9 skills (no multi-skill steps)
- 23,706 problem solving attempts
- 11,582 problem steps in the tutor
- 152 average problem steps completed per student (SD = 50)
- Pre- and post-tests were administered with this assignment

Methodology: model in-tutor prediction

Predictions were made by the 9 models using 5-fold cross-validation by student. For each response, each model produces a predicted probability of correct alongside the actual response, e.g.:

Student 1, Skill A | BKT-BF | BKT-EM | Actual
Resp 1 | 0.10 | 0.22 | 0
Resp 2 | 0.51 | 0.26 | 1
... | | |
Resp N | 0.77 | 0.40 | 1

Student 1, Skill B | BKT-BF | BKT-EM | Actual
Resp 1 | 0.55 | 0.60 | 1
... | | |
Resp N | 0.41 | 0.61 | 0

Accuracy was calculated with A' for each student. Those values were then averaged across students to report the model's A' (higher is better).

Results: in-tutor model prediction (A' results averaged across students)

Model | A'
BKT-PPS | 0.7029
BKT-BF | 0.6969
BKT-EM | 0.6957
BKT-LessData | 0.6839
PFA | 0.6629
Tabling | 0.6476
BKT-CSlip | 0.6149
CFAR | 0.5705
BKT-CGS | 0.4857

There were no significant differences among the top BKT variants (BKT-PPS, BKT-BF, BKT-EM, BKT-LessData), but significant differences between these BKT variants and PFA.

Methodology: ensemble in-tutor prediction

5 ensemble methods were used, trained with the same 5-fold cross-validation folds. The ensembles were trained using the 9 model predictions as the features and the actual response as the label.

Ensemble methods used:
- Linear regression with no feature selection (predictions bounded between {0,1})
- Linear regression with feature selection (stepwise regression)
- Linear regression with only BKT-PPS & BKT-EM
- Linear regression with only BKT-PPS, BKT-EM & BKT-CSlip
- Logistic regression

Results: in-tutor ensemble prediction (A' results averaged across students)

Model | A'
Ensemble: LinReg with BKT-PPS, BKT-EM & BKT-CSlip | 0.7028
Ensemble: LinReg with BKT-PPS & BKT-EM | 0.6973
Ensemble: LinReg with feature selection (stepwise) | 0.6954
Ensemble: LinReg without feature selection | 0.6945
Ensemble: Logistic without feature selection | 0.6854

No significant difference between ensembles.

Results: in-tutor ensemble & model prediction (A' results averaged across students)

Model | A'
BKT-PPS | 0.7029
Ensemble: LinReg with BKT-PPS, BKT-EM & BKT-CSlip | 0.7028
Ensemble: LinReg with BKT-PPS & BKT-EM | 0.6973
BKT-BF | 0.6969
BKT-EM | 0.6957
Ensemble: LinReg with feature selection (stepwise) | 0.6954
Ensemble: LinReg without feature selection | 0.6945
Ensemble: Logistic without feature selection | 0.6854
BKT-LessData | 0.6839
PFA | 0.6629
Tabling | 0.6476
BKT-CSlip | 0.6149
CFAR | 0.5705
BKT-CGS | 0.4857

Results: in-tutor ensemble & model prediction (A' results calculated across all actions)

Model | A'
Ensemble: LinReg with BKT-PPS, BKT-EM & BKT-CSlip | 0.7451
Ensemble: LinReg without feature selection | 0.7428
Ensemble: LinReg with feature selection (stepwise) | 0.7423
Ensemble: Logistic regression without feature selection | 0.7359
Ensemble: LinReg with BKT-PPS & BKT-EM | 0.7348
BKT-EM | 0.7348
BKT-BF | 0.7330
BKT-PPS | 0.7310
PFA | 0.7277
BKT-LessData | 0.7220
CFAR | 0.6723
Tabling | 0.6712
BKT-CSlip (Contextual Slip) | 0.6396
BKT-CGS | 0.4917

Random Forests

In the KDD Cup. Motivation for trying a non-KT approach: the Bayesian method only uses the KC, the opportunity count, and the student as features, so much information is left unutilized; another machine learning method is required.

Strategy: engineer additional features from the dataset and use Random Forests to train a model.

Strategy: create rich feature datasets, including features derived from fields not included in the test set.

Created by Leo Breiman:
- The method trains T separate decision tree classifiers (50-800)
- Each decision tree selects a random 1/P portion of the available features (1/3)
- Each tree is grown until there are at least M observations in the leaf (1-100)
- When classifying unseen data, each tree votes on the class; the popular vote wins (or the votes are averaged, for regression)
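The bagging-and-voting idea can be illustrated with a toy forest of decision stumps (one-split trees). The per-tree random feature subsampling is omitted here for brevity, so this is closer to plain bagging than a full random forest:

```python
import random

def train_stump(rows):
    """Exhaustively pick the (feature, threshold, right-side label)
    with the fewest errors on this bootstrap sample."""
    best, best_err = None, float("inf")
    for f in range(len(rows[0][0])):
        for x, _ in rows:
            thr = x[f]
            for right_label in (0, 1):
                err = sum(
                    y != (right_label if xx[f] > thr else 1 - right_label)
                    for xx, y in rows
                )
                if err < best_err:
                    best, best_err = (f, thr, right_label), err
    return best

def bagged_stumps(train, n_trees=25, seed=0):
    """Train n_trees stumps on bootstrap samples; majority vote at predict time."""
    rng = random.Random(seed)
    stumps = [train_stump([rng.choice(train) for _ in train]) for _ in range(n_trees)]

    def predict(x):  # each tree votes; the popular vote wins
        votes = sum((lab if x[f] > thr else 1 - lab) for f, thr, lab in stumps)
        return 1 if 2 * votes >= len(stumps) else 0

    return predict
```

Replacing the stumps with deep trees and adding the random feature subset per tree recovers the method described above.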


Feature Importance

Features extracted from the training set:

Student progress features (avg. importance: 1.67)
- Number of data points [today, since the start of unit]
- Number of correct responses out of the last [3, 5, 10]
- Z-score sums for step duration, hint requests, and incorrects
- Skill-specific versions of all these features

Percent correct features (avg. importance: 1.60)
- % correct of unit, section, problem, and step, and totals for each skill and each student (10 features)

Student modeling approach features (avg. importance: 1.32)
- The predicted probability of correct for the test row
- The number of data points used in training the parameters
- The final EM log likelihood fit of the parameters / data points

Features of the user were more important in Bridge to Algebra than in Algebra. Student progress / gaming-the-system features (Baker et al., UMUAI 2008) were important in both datasets.

Algebra:

Rank | Feature set | RMSE | Coverage
1 | All features | 0.2762 | 87%
2 | Percent correct+ | 0.2824 | 96%
3 | All features (fill) | 0.2847 | 97%

Bridge to Algebra:

Rank | Feature set | RMSE | Coverage
1 | All features | 0.2712 | 92%
2 | All features (fill) | 0.2791 | 99%
3 | Percent correct+ | 0.2800 | 98%

The best Bridge to Algebra RMSE on the leaderboard was 0.2777; the Random Forest RMSE of 0.2712 here is exceptional.

Skill data for a student was not always available for each test row; because of this, many skill-related feature sets only had 92% coverage.

Conclusions from KDD

- Combining user features with skill features was very powerful in both the modeling and classification approaches
- Model tracing based predictions performed formidably against pure machine learning techniques
- Random Forests also performed very well on this educational data set compared to other approaches such as Neural Networks and SVMs; this method could significantly boost accuracy in other EDM datasets

Hardware/Software

Software:
- MATLAB used for all analysis
- Bayes Net Toolbox for Bayesian network models
- Statistics Toolbox for the Random Forests classifier
- Perl used for pre-processing

Hardware:
- Two Rocks clusters used for skill model training: 178 CPUs in total; training the KT models took ~48 hours utilizing all CPUs
- Two 32 GB RAM systems for Random Forests; RF models took ~16 hours to train with 800 trees

Time left? Choose the next topic:
- KT
- Prediction
- Evaluation
- Significance tests
- Regression / significance tests

Individualize Everything?

Fully Individualized Model (Pardos & Heffernan, JMLR 2011)

- S identifies the student
- T contains the CPT lookup table of individual student learn rates
- P(T) is trained for each skill, which gives a learn rate for P(T|T=1) [high learner] and P(T|T=0) [low learner]

SSI model results (Pardos & Heffernan, JMLR 2011)

Dataset | New RMSE | Prev RMSE | Improvement
Algebra | 0.2813 | 0.2835 | 0.0022
Bridge to Algebra | 0.2824 | 0.2860 | 0.0036

The average improvement is the difference between 1st and 3rd place; it is also the difference between 3rd and 4th place. The differences between PPS and SSI are significant in each dataset at the p < 0.01 level (t-test of squared errors).