Slide1
Discovering Optimal Training Policies: A New Experimental Paradigm
Robert V. Lindsey, Michael C. Mozer
Institute of Cognitive Science
Department of Computer Science
University of Colorado, Boulder
Harold Pashler
Department of Psychology
UC San Diego
Slide2
Common Experimental Paradigm In Human Learning Research
Propose several instructional conditions to compare based on intuition or theory
E.g., spacing of study sessions in fact learning
Equal: 1 – 1 – 1
Increasing: 1 – 2 – 4
Run many participants in each condition
Perform statistical analyses to establish a reliable difference between conditions
Slide3
What Most Researchers Interested In Improving Instruction Really Want To Do
Find the best training policy (study schedule)
Abscissa: space of all training policies
Performance function defined over policy space
Slide4
Approach
Perform single-participant experiments at selected points in policy space (o)
Use function approximation techniques to estimate shape of the performance function
Given current estimate, select promising policies to evaluate next.
promising = has potential to be the optimum policy
Figure labels: linear regression, Gaussian process regression
Slide5
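The selection step on the Approach slide ("promising = has potential to be the optimum policy") can be sketched with an upper confidence bound, the heuristic the deck names later under Embellishments. This is only an illustration: the function name `select_next_policy`, the weight `kappa`, and the four candidate estimates are hypothetical, not values from the presentation.

```python
import numpy as np

def select_next_policy(mean, std, kappa=2.0):
    """Return the index of the candidate with the largest upper confidence bound.

    mean, std: current estimate of performance and its uncertainty at each
    candidate policy. kappa is an assumed exploration weight.
    """
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    ucb = mean + kappa * std  # optimistic estimate of each policy's performance
    return int(np.argmax(ucb))

# Hypothetical estimates at four candidate policies.
mean = [0.60, 0.66, 0.64, 0.55]
std = [0.02, 0.03, 0.08, 0.01]

best_guess = int(np.argmax(mean))            # highest estimated performance so far
next_policy = select_next_policy(mean, std)  # uncertain policies get a chance too
print(best_guess, next_policy)
```

Here the policy with the highest current estimate and the policy chosen for the next participant differ, because the second one is still uncertain enough that it could turn out to be the optimum.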
Gaussian Process Regression
Assumes only that functions are smooth
Uses data efficiently
Accommodates noisy data
Produces estimates of both function shape and uncertainty
Slide6
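The properties listed on the Gaussian Process Regression slide (smoothness, noisy data, estimates of both shape and uncertainty) can be illustrated with a small NumPy sketch. It assumes a zero-mean GP with a squared-exponential covariance; the kernel choice, the hyperparameter values, and the toy sine data are assumptions for illustration, not details from the presentation.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2, signal_var=1.0):
    """Squared-exponential covariance between 1-D point sets a and b."""
    d = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise_var=0.05):
    """Posterior mean and standard deviation of a zero-mean GP at x_test."""
    # Observation noise enters on the diagonal of the training covariance.
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    K_ss = rbf_kernel(x_test, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v ** 2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

rng = np.random.default_rng(0)
x_train = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.1, size=x_train.size)

# One test point inside the observed region, one far from any observation.
x_test = np.array([0.3, 0.95])
mean, std = gp_posterior(x_train, y_train, x_test)
print(mean, std)
```

The posterior standard deviation is small where observations are dense and grows toward the prior value away from them, which is the uncertainty estimate an active selection rule can exploit.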
Simulated Experiment
Slide7
Slide8
Embellishments On Off-The-Shelf GP Regression
Active selection heuristic: upper confidence bound
GP is embedded in generative task model
GP represents skill level (-∞ to +∞)
Mapped to population mean accuracy on test (0 to 1)
Mapped to individual’s mean accuracy, allowing for interparticipant variability
Mapped to # correct responses via binomial sampling
Hierarchical Bayesian approach to parameter selection
Interparticipant variability
GP smoothness (covariance function)
Slide9
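The chain of mappings on the Embellishments slide (skill level, population mean accuracy, individual mean accuracy, number correct) can be sketched as a forward simulation. The slide does not name the link function or the distribution for interparticipant variability; the logistic link, the Beta distribution, and the `concentration` parameter below are assumptions made only to get a runnable example.

```python
import numpy as np

def simulate_scores(skill, n_participants, n_trials=24, concentration=20.0, seed=0):
    """Sample test scores for participants trained under one policy.

    skill: unbounded skill level, the quantity the GP represents.
    n_trials: 24 test trials, as on the Experiment slide.
    concentration: assumed spread parameter; larger means less
    interparticipant variability.
    """
    rng = np.random.default_rng(seed)
    # Skill level -> population mean accuracy in (0, 1); logistic link is an assumption.
    pop_mean = 1.0 / (1.0 + np.exp(-skill))
    # Population mean -> each individual's mean accuracy; Beta is an assumption.
    a = pop_mean * concentration
    b = (1.0 - pop_mean) * concentration
    individual = rng.beta(a, b, size=n_participants)
    # Individual mean accuracy -> number of correct responses via binomial sampling.
    correct = rng.binomial(n_trials, individual)
    return pop_mean, individual, correct

pop_mean, individual, correct = simulate_scores(skill=0.7, n_participants=5)
print(pop_mean, individual, correct)
```

Fitting the model runs this chain in reverse: observed counts constrain the latent skill surface, while the hierarchical Bayesian treatment mentioned on the slide would infer quantities such as the variability parameter rather than fix them by hand.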
Concept Learning Task
Slide10
Slide11
Slide12
Slide13
Slide14
Slide15
GLOPNOR = Graspability
Ease of picking up & manipulating object with one hand
Based on norms from Salmon, McMullen, & Filliter (2010)
Slide16
Two-Dimensional Policy Space
Fading policy
Repetition/alternation policy
Slide17
Two-Dimensional Policy Space
Slide18
Policy Space
fading policy
repetition/alternation policy
Slide19
Experiment
Training
25-trial sequence generated by chosen policy
Balanced positive / negative
Testing
24 test trials, ordered randomly, balanced
No feedback, forced choice
Amazon Mechanical Turk
$0.25 / participant
Slide20
Results
# correct of 25
Slide21
Best Policy
Fade from easy to semi-difficult
Repetitions initially, alternations later
Slide22
Results
Slide23
Final Evaluation
65.7%, 60.9%, 66.6%, 68.6%
N=49, N=53, N=50, N=48
Slide24
Novel Experimental Paradigm
Instead of running a few conditions each with many participants, …
…run many conditions each with a different participant.
Although individual participants provide a very noisy estimate of the population mean, optimization techniques allow us to determine the shape of the performance function over the policy space.
Slide25
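The claim on the Novel Experimental Paradigm slide, that many noisy single-participant evaluations can still reveal where performance peaks, can be checked in a toy simulation. Everything here is hypothetical: the one-dimensional policy space, the `true_accuracy` function with its peak at 0.7, and the use of a Nadaraya-Watson kernel smoother as a simple stand-in for the Gaussian process model used in the presentation.

```python
import numpy as np

def true_accuracy(policy):
    """Hypothetical population accuracy over a 1-D policy space, peaked at 0.7."""
    return 0.55 + 0.25 * np.exp(-0.5 * ((policy - 0.7) / 0.2) ** 2)

def smooth(policies, scores, grid, bandwidth=0.1):
    """Nadaraya-Watson kernel smoother: weighted average of nearby scores."""
    w = np.exp(-0.5 * ((grid[:, None] - policies[None, :]) / bandwidth) ** 2)
    return (w @ scores) / w.sum(axis=1)

rng = np.random.default_rng(1)
n_participants, n_trials = 500, 24

# Each simulated participant is trained under a different policy.
policies = rng.uniform(0.0, 1.0, size=n_participants)
# One participant's test score is a very noisy estimate of the population mean.
scores = rng.binomial(n_trials, true_accuracy(policies)) / n_trials

grid = np.linspace(0.0, 1.0, 101)
estimate = smooth(policies, scores, grid)
best_policy = grid[np.argmax(estimate)]
print(best_policy)
```

No single policy is evaluated more than once, yet pooling information across neighboring policies locates the region of the optimum. This sketch samples policies uniformly; the paradigm described in the deck instead selects each next policy actively.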
What Next?
Plea for more interesting policy spaces!
Other optimization problems
Abstract concepts from examples
E.g., irony, recyclability, retribution
Motivation
Manipulations
Rewards/points, trial pace, task difficulty, time pressure
Measure
Voluntary time on task
Slide26
Machine Learning To Boost Human Learning
Robert Lindsey*, Jeff Shroyer*, Hal Pashler+, Mike Mozer*
* University of Colorado at Boulder
+ University of California, San Diego
Slide27
People Forget What They Have Learned
Slide28
Forgetting Can Be Reduced By Appropriately Timed Review
Slide29
Challenge Of Exploiting Spaced Review
The optimal spacing of study depends on
characteristics of the individual student
characteristics of the specific item (e.g., vocabulary word) being learned
the exact study history (timing and retrieval success)
Slide30
Our Approach
Data from a population of students studying a set of items
Collaborative filtering
Prediction of when a specific student should study a particular item
Psychological model of human memory
Slide31
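To make the idea on the Our Approach slide concrete (a memory model plus student and item characteristics yielding a prediction of what a student should study), here is a deliberately simplified sketch. It is not the model from the presentation: the exponential forgetting curve, the way `student_ability`, `item_ease`, and review count combine into a memory strength, the rule of reviewing the item most at risk of being forgotten, and the example vocabulary are all assumptions.

```python
import math

def predicted_recall(days_since_study, student_ability, item_ease, n_reviews):
    """Assumed exponential forgetting curve.

    Memory strength grows with a student factor, an item factor, and the
    number of past reviews; all three are hypothetical parameters that a
    real system would estimate from population data.
    """
    strength = student_ability * item_ease * (1 + n_reviews)
    return math.exp(-days_since_study / strength)

def pick_item_to_review(history, student_ability, item_ease):
    """Return the item with the lowest predicted recall probability."""
    return min(
        history,
        key=lambda item: predicted_recall(
            history[item]["days_since_study"],
            student_ability,
            item_ease[item],
            history[item]["n_reviews"],
        ),
    )

# Hypothetical item parameters and one student's study history.
item_ease = {"gato": 3.0, "desarrollar": 1.0, "ventana": 2.0}
history = {
    "gato": {"days_since_study": 7, "n_reviews": 2},
    "desarrollar": {"days_since_study": 3, "n_reviews": 1},
    "ventana": {"days_since_study": 10, "n_reviews": 0},
}

choice = pick_item_to_review(history, student_ability=1.5, item_ease=item_ease)
print(choice)
```

In the approach the slide describes, the student and item factors would come from collaborative filtering over the whole population's data rather than being set by hand as they are here.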
Colorado Optimized Language Tutor (COLT)
Slide32
Experiment In Fall 2012
Second year Spanish at Denver area middle school
180 students (6 class periods)
New vocabulary introduced each week for 10 weeks
COLT used 3 times a week for 30 min
Sessions 1 & 2: study new vocabulary to criterion; remainder of time spent on review
Session 3: quiz on new vocabulary; remainder of time spent on review
Slide33
Comparison Of Three Review Schedulers Within Student
Massed review (current educational practice)
Generic spaced review
Personalized spaced review using machine learning models
Slide34
Slide35
Slide36
Slide37
Bottom Line
17% boost in retention of cumulative course content one month after end of semester…
…if students spend the same amount of time using our machine-learning-based review software instead of cramming for the current week’s exam
Slide38
BRAIN Initiative
One goal of combining cognitive modeling and machine learning:
Help people learn and perform more efficiently
learning new concepts
choice and ordering of examples
improving long-term retention
personalized selection of material for review
assisting visual search (e.g., medical, satellite image analysis)
image enhancement
training complex visual tasks (e.g., fingerprint analysis)
highlighting to guide attention
diagnosing and remediating cognitive deficits
via modeling individual differences