Slide1
Transforming Multiple Choice Questions to Effectively Assess Application of Knowledge
STReME Series, August 11, 2011
Brenda Roman, MD, Professor of Psychiatry, BSOM
Paul Koles, MD, Associate Professor of Pathology and Surgery, BSOM
Slide2
Journey through Lunch
Power and Purposes of Assessment
Learning Approaches and Assessment
Assessment Using Multiple-Choice Questions (MCQs)
Evaluation of MCQ Quality
Identification of Flaws in MCQs
Practice: Find the Flaws
Practice: Choose the Highest-Quality MCQ
Slide3
Q1: Of the criteria listed below, which one do you believe is most important for judging the quality of a multiple choice question (MCQ)?
(a) The MCQ assesses knowledge that is considered important by the writer of the question.
(b) The MCQ is directly related to one or more of the course’s learning objectives.
(c) The MCQ asks the student to make a decision that is based on critical interpretation of data.
(d) The MCQ requires the student to appropriately apply knowledge, not just to recall facts.
Slide4
Flaws in the previous MCQ
Options
Non-homogeneous options: (a) and (b) are about content; (c) and (d) are about format and purpose
Unnecessarily long
Only (d) has a contrasting clause
Stem
Question can’t be answered if the answer options are covered up
“judging the quality”: which aspect of quality?
“do you believe” implies that the best answer is a matter of personal opinion (there is no single best answer)
Q1: Of the criteria listed below, which one do you believe is most important for judging the quality of a multiple choice question?
(a) The MCQ assesses knowledge considered important by the writer of the question.
(b) The MCQ is directly related to one or more of the course’s learning objectives.
(c) The MCQ asks the student to make a decision that is based on critical interpretation of data.
(d) The MCQ requires the student to appropriately apply knowledge, not just to recall facts.
Slide5
Power of Assessment
“Assessment drives student learning. Student assessment can be designed to foster the development of elaborated knowledge structure by making relationships and understanding—rather than isolated facts—the objects of assessment.”
Bordage G: Elaborated Knowledge: A Key to Successful Diagnostic Thinking. Acad Med 69:883-885, 1994
Slide6
Purposes of Assessment (using written questions)
Assumption: performance on a sample of questions allows inferences about the skills of examinees in a broader domain
Communicate what instructor views as important
Motivate students to learn
Allow objective comparisons among students who often experience variations in curriculum
Compensate for instructional gaps by encouraging students to read broadly and utilize a variety of educational tools
Case SM, Swanson DB: Constructing Written Test Questions for the Basic and Clinical Sciences, 3rd edition, NBME 2002
Slide7
Assumption Refuted: Physicians who pass licensure exams may lack some essential skills for practicing medicine
Slide8
Learning Behavior
Learning behavior: “. . . the set of cognitive and metacognitive processes that learners draw on to acquire knowledge, skills, and understanding” (Mitchell R; Acad Med 84:918-926, 2009)
424 residents from 7 IM residencies completed a cognitive behavior survey (140 items, 7-point Likert scale)
Seven learning behavior scales developed from survey data: memorization, conceptualization, reflection, independent learning, critical thinking, meaningful learning experience, attitude toward educational experience
RESULTS
Memorization not correlated positively with other 6 scales
Memorization correlated negatively with critical thinking
Residents in top 20% on reflection scale also conceptualized, learned independently, and thought critically more than the bottom 20%
Slide9
Competent Physicians
Integrate: “to bring together parts into a whole” (Webster’s)
Slide10
Slide11
Assessment in Medical Education
Primary purpose: measure student’s competence in course, clerkship, or residency
Secondary purpose: develop competent physicians
Motivate student to integrate new knowledge with previously mastered knowledge (longitudinal learning)
Foster critical thinking skills (clinical decision-making)
Impart direction for future learning (subliminal messages embedded in assessments)
Slide12
Learning Approaches and Assessment
Students adapt learning approaches to context in which learning occurs
Three basic approaches identified
Surface (memorization)
Deep (comprehension and application)
Strategic (adapted to meet perceived expectation of faculty)
Teaching methods influence students’ approach to learning
Some teaching methods hinder development of deep learning approach
Education of competent physicians requires “substantial changes in teaching, curriculum and, particularly, assessment . . .”
(Newble DI, Entwistle NJ: Learning Styles and Approaches: Implications for Medical Education. Medical Education 1986; 20:162-175)
Slide13
Can MCQs assess learner’s ability to apply knowledge by critical thinking and problem solving?

Authors: Coderre et al., BMC Medical Education 2004, 4:23
Method: Think-aloud protocols to determine problem-solving strategy used by gastroenterologists and MS4s in answering 8 questions about dysphagia, nausea/vomiting, diarrhea, and elevated liver enzymes
Results and Conclusions: Similar clinical reasoning skills used to answer 5-option and extended matching MCQs. Stem more important than options for testing clinical reasoning.

Authors: Beullens et al., Medical Education 2005, 39:410-417
Method: 20 final-year med students & 20 final-year IM residents solved extended matching questions (EMQs) aloud.
Results and Conclusions: Residents & upper 50% in both groups used more “forward” than “backward” reasoning. Processes of clinical reasoning can be assessed using EMQs.

Authors: Cuddy et al., Acad Med 2004, 79:S43-45
Method: 27 experts completed a survey about clinical relevance of 150 NBME Step 2 MCQs
Results and Conclusions: 92% of questions clinically relevant; 85% of content used in clinical practice
Slide14
* Bloom’s taxonomy of cognitive learning collapsed into 3 levels: (1) knowledge; (2) comprehension and application; (3) problem solving
Slide15
Slide16
MCQs using clinical vignettes in the stem
“Questions with rich descriptions of clinical context invite the more complex cognitive processes that are characteristic of clinical practice.”
“Conversely, context-poor questions can test basic factual knowledge but not its transferability to real clinical problems.”
Epstein RM: Assessment in Medical Education, New England Journal of Medicine 2007; 356:387-396.
Slide17
“There is nothing new under the sun” (Ecclesiastes 1:9)
“No teaching should be done without a patient for a text.”
(Osler William: On the Need of A Radical Reform in our Methods of Teaching Medical Students; Medical News 82:49-53, 1904.)
NBME announcement 2010-2011: decision to use only clinical or experimental vignette formats on USMLE Step 1.
Slide18
Format of Clinical Vignette
Outline (not all parts necessary)
Age and gender (“42-year-old woman”)
Site of care (“comes to the emergency department”)
Presenting complaint (“because of headache”)
Duration (“has persisted for 2 days”)
Past history (may not be relevant)
Physical findings (“pulsating artery anterior to ear”)
+/- diagnostic studies; +/- treatments
Example
“What area is supplied with blood by the posterior inferior cerebellar artery?”
“A 62-year-old man develops left-sided limb ataxia, Horner’s syndrome, nystagmus, and loss of appreciation of facial pain and temperature sensations. Which of the following arteries is most likely to be occluded?”
Slide19
How good is this MCQ?
Subjective methods to evaluate quality
Opinion of question author
Opinions of other content experts
Opinions of experienced MCQ writers
Opinions of students (pre-test, post-test)
Systematic identification of flaws by question author and trusted consultants (YOU ARE THE CONSULTANTS!)
Gold standard: performance of MCQ in an exam, as demonstrated by difficulty index and discrimination factor
Slide20
Year N diff. index top 25% bottom 25% disc.factor answer A B C D E
Difficulty index: percentage of examinees who answered the question correctly
Discrimination Factor: how well the item discriminates between students who performed highest on the exam (top 25%) and students who performed lowest on the exam (bottom 25%).
Higher D.F. suggests item is a more reliable measure of competence
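The two statistics defined above can be sketched in code. This is a minimal Python illustration, not the method used to produce the figures in this presentation. The slides do not state the formula behind the discrimination factor, so the sketch substitutes the simple upper-lower index (proportion correct in the top 25% minus proportion correct in the bottom 25%); that substitution is an assumption, and its values may differ from the D.F. reported in the slides. The function names and the sample data are hypothetical.

```python
from typing import Sequence


def difficulty_index(item_correct: Sequence[bool]) -> float:
    """Percentage of examinees who answered the item correctly."""
    return 100.0 * sum(item_correct) / len(item_correct)


def upper_lower_index(item_correct: Sequence[bool],
                      exam_scores: Sequence[float],
                      fraction: float = 0.25) -> float:
    """Proportion correct in the top group minus that in the bottom group.

    ASSUMPTION: this upper-lower index stands in for the slides'
    "discrimination factor", whose exact formula is not given.
    """
    n = len(exam_scores)
    k = max(1, int(n * fraction))  # size of the top and bottom groups
    # Rank examinees by total exam score, highest first.
    ranked = sorted(range(n), key=lambda i: exam_scores[i], reverse=True)
    top, bottom = ranked[:k], ranked[-k:]
    p_top = sum(item_correct[i] for i in top) / k
    p_bottom = sum(item_correct[i] for i in bottom) / k
    return p_top - p_bottom


# Hypothetical data: 8 examinees, their total exam scores, and whether
# each one answered a single item correctly.
exam_scores = [95, 90, 85, 80, 70, 60, 50, 40]
item_correct = [True, True, True, True, True, False, True, False]

di = difficulty_index(item_correct)
ul = upper_lower_index(item_correct, exam_scores)
print(f"difficulty index: {di:.1f}%")
print(f"upper-lower index: {ul:.2f}")
```

With this made-up data, 6 of 8 examinees answer correctly, both top-quartile examinees are correct, and one of the two bottom-quartile examinees is correct.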
Gold Standard: Performance of MCQ on an examination
Slide21
Systematic Identification of Flaws in MCQs
A) 5 common flaws in stems
B) 8 common flaws in answer options
Slide22
Systematic Identification of Flaws Pre-Exam: MCQ Stems
A1. Stem does not end with a question (lead-in) that can be answered by covering up answer options.
A 39-year-old female is seen for an annual exam. She had been on oral contraceptive pills as a teenager but discontinued that form of contraception over 15 years ago. Because of her contraceptive practice she has . . .
Prostate cancer is best treated . . .
Corticosteroid therapy . . .
According to the best scientific evidence available to date, HIV-1 came from . . .
Slide23
Systematic Identification of Flaws Pre-Exam: MCQ Stems
A2. Stem is unnecessarily complicated: too long, lots of irrelevant information.
A 48-year-old woman presents to the physician with lower back pain. She states that she has had the pain for about 2 weeks and that it has become steadily more severe. An x-ray film shows a lytic bone lesion in her lumbar spine. Review of systems reveals the recent onset of mild headaches, nausea, and weakness. Her CBC shows a normocytic anemia, and her erythrocyte sedimentation rate is elevated. Urinalysis shows heavy proteinuria, and a serum protein electrophoresis shows a monoclonal peak of IgG. Which of the following is responsible for this patient’s spinal lesions?
Bence-Jones protein
lymphoplasmacytoid proliferation
osteoblast activating factor
osteoclast activating factor
primary amyloidosis
Slide24
Systematic Identification of Flaws Pre-Exam: MCQ Stems
A3. Stem contains vague terms that invite a wide range of interpretations.
A B-cell-deficient toddler recovers as well as a normal child does to infection with the chickenpox virus. This child's immune system is capable of developing . . .
Slide25
Systematic Identification of Flaws Pre-Exam: MCQ Stems
A4. Stem contains abbreviations that are not clearly understood by all examinees.
A 32yo WF in her 1st trimester of pregnancy experiences GERD 3-4x/week and c/o heartburn. She has not responded to MOM. Which medication will be best to treat this patient?
Slide26
Systematic Identification of Flaws Pre-Exam: MCQ Stems
A5. Stem contains words about quantity that are difficult or impossible to quantify: probably, usually, infrequently, sometimes, in most cases, in few cases, etc.
In most cases, men who develop prostate cancer usually have limited dietary intake of which of the following food groups?
Slide27
Perception is unpredictable
Slide28
Systematic Identification of Flaws Pre-Exam: MCQ Answer Options
B1. One or more options do not follow grammatically from the stem.
Which of the following behaviors is most frequently observed in adolescents who smoke cigarettes?
intelligence quotient below 80
overeating
body mass index < 25
disrespect for authority
alcohol abuse
Slide29
Systematic Identification of Flaws Pre-Exam: MCQ Answer Options
B2. Options are heterogeneous in language or domains.
Which is necessary for the development of Burkitt lymphoma?
creation, by translocation, of a bcr/abl fusion gene in B-lymphocytes
deletion of p53 tumor suppressor gene in B-lymphocytes
infection of B-lymphocytes by Epstein-Barr virus
over-expression of the c-myc oncogene in B-lymphocytes
trisomy of chromosome 8
Slide30
Systematic Identification of Flaws Pre-Exam: MCQ Answer Options
B3. Option includes absolute terms that make it unlikely to be correct: “always”, “never”
In patients with advanced dementia due to Alzheimer disease, the memory defect
can be treated adequately with phosphatidylcholine (lecithin).
could be a sequela of early parkinsonism.
is never seen in patients with neurofibrillary tangles in the cerebral cortex.
is never severe.
possibly involves the cholinergic system.
Slide31
Systematic Identification of Flaws Pre-Exam: MCQ Answer Options
B4. Correct option is longer, more specific, or more complete than other options (“sore thumb”).
Secondary gain is
synonymous with malingering.
a frequent problem in obsessive-compulsive disorder.
a complication of a variety of illnesses and tends to prolong many of them.
never seen in organic brain damage.
Slide32
Slide33
Systematic Identification of Flaws Pre-Exam: MCQ Answer Options
B5. Correct option contains the most elements in common with other options (“convergence”).
Intramedullary destruction of red blood cells in beta-thalassemia is best explained by which mechanism?
beta-4 tetramer oxidation and precipitation
excessive iron accumulation in macrophages
increased formation of alpha chain aggregates
increased formation of Hb H (beta 4)
increased formation of Hb F (alpha 2 gamma 2)
Slide34
Systematic Identification of Flaws Pre-Exam: MCQ Answer Options
B6. Options are long, complicated, or composed of 2-3 parts, imposing irrelevant difficulty.
The figure below shows the dose-response curves for four different derivatives of a muscarinic receptor agonist. Each derivative acts by binding to the same site on the muscarinic receptor. The Heptyl derivative
has a lower binding affinity for the receptor than does the Hexyl derivative.
has a lower intrinsic activity than does the Hexyl derivative because it has a lower receptor affinity.
is a full agonist when compared with the Octyl derivative.
is more potent than the Hexyl derivative.
may act as a mixed agonist-antagonist if it has a higher receptor affinity than the Hexyl derivative.
Slide35
Systematic Identification of Flaws Pre-Exam: MCQ Answer Options
B7. Options contain words about quantity that are difficult or impossible to quantify: probably, usually, infrequently, sometimes, in most cases, in few cases, etc.
Severe obesity in early adolescence
usually responds dramatically to dietary regimens.
often is related to endocrine disorders.
has a 75% chance of resolving spontaneously.
shows a poor prognosis.
usually responds to pharmacotherapy and intensive psychotherapy.
Slide36
Systematic Identification of Flaws Pre-Exam: MCQ Answer Options
B8. “none of the above” or “all of the above” is used as an option.
Which of the following cities is closest to New York City?
Boston
Chicago
Dallas
Los Angeles
None of the above
Slide37
Slide38
Identify those flaws: Practice MCQ 1
P1) Which of the following applies to pseudogout?
It occurs frequently in women.
It is seldom associated with acute pain in a joint.
It may be associated with a finding of chondrocalcinosis.
It is clearly hereditary in most cases.
It responds well to treatment with allopurinol.
P1) Of 13 flaws listed in your worksheet, how many flaws are present in this MCQ?
1
2
3
4
5
Slide39
Identify those flaws: Practice MCQ 2
P2) A 17-year-old male presents with a two-year history of "severe" acne. He has previously been treated with numerous topical treatments and several different oral antibiotics. Multiple nodules and cysts are present diffusely on the face, shoulders, back, and upper chest. He has multiple depressed scars on the cheeks. He is administered an oral agent which leads to significant improvement in his condition. This agent works by
disruption of bacterial cell membranes.
exfoliation.
increased sebum production.
reduction of androgen levels.
suppression of sebum production.
P2) Of 13 flaws listed in your worksheet, how many flaws are present in this MCQ?
1
2
3
4
5
Slide40
Identify those flaws: Practice MCQ 3
P3) A 25-year-old woman consults her physician because she has decided to use oral contraceptives. After the physician asks about history of thrombophlebitis, pulmonary embolus, and smoking (all negative), he proceeds to physical exam:
Vital signs: within normal limits
Height: 4'0"
Weight: 85 lbs.
HEENT: large head with prominent, rounded forehead
Heart, Lungs, Abdomen: within normal limits
Extremities: short arms and legs (compared to trunk length).
He writes a prescription for oral contraceptives, but also records her most likely physical diagnosis in the chart. Which molecular abnormality best explains her diagnosis?
constitutive activation of fibroblast growth receptor 2
constitutive activation of fibroblast growth receptor 3
expansion mutation in HOXD13 with altered length of transcription factor
mutation in COL1A1 with deficient synthesis of type 1 collagen
mutation in COL2A1 with deficient synthesis of type 2 collagen
P3) Of 13 flaws listed in your worksheet, how many flaws are present in this MCQ?
1
2
3
4
5
Slide41
High-Quality MCQ: in principle
“A high-quality multiple-choice question is one that assesses content considered to be important, is free of flaws in both stem and options, and effectively identifies those who can use their knowledge to skillfully assess data and make decisions.” (modified from Case SM, Swanson DB: Constructing Written Test Questions for the Basic and Clinical Sciences, National Board of Medical Examiners, 2002)
Slide42
Year N diff. index top 25% bottom 25% disc.factor answer A B C D E
Difficulty index: percentage of examinees who answered the question correctly
Discrimination Factor: how well the item discriminates between students who performed highest on the exam (top 25%) and students who performed lowest on the exam (bottom 25%).
Higher D.F. suggests item is a more reliable measure of competence.
Statistical Definition of High-Quality MCQs: ones that perform well on an exam, as judged by difficulty index and discrimination factor
Slide43
Mastery MCQs
The data below show performance of 3 MCQs used in a final course exam for BSOM year 2 students.
All three assessed the same content domain.
All three were classified as “mastery” questions (answered correctly by ≥ 90% of students)
QM1) Based on the performance data shown below, which one is the highest-quality MCQ?

Option  n    D.I.  top 25%  bottom 25%  D.F.  A   B   C   D   E
A)      101  90    96       86          0.27  0   10  91  0   0
B)      105  90    96       67          0.41  1   95  3   4   2
C)      105  90    93       81          0.22  2   7   94  1   1
Slide44
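In the mastery table, the A–E columns for each item sum to n, so they appear to be counts of examinees who chose each option rather than percentages. Under that reading (an inference, not something the slide states), the difficulty index can be recomputed from the counts. In this Python sketch the keyed answer is assumed to be the most frequently chosen option, and the function and variable names are hypothetical.

```python
from typing import Dict

# Option counts copied from the mastery table (items A, B, C).
mastery_items: Dict[str, Dict[str, int]] = {
    "A": {"A": 0, "B": 10, "C": 91, "D": 0, "E": 0},
    "B": {"A": 1, "B": 95, "C": 3, "D": 4, "E": 2},
    "C": {"A": 2, "B": 7, "C": 94, "D": 1, "E": 1},
}


def difficulty_from_counts(counts: Dict[str, int]) -> float:
    """Percentage of examinees who chose the keyed option.

    ASSUMPTION: the keyed answer is the most frequently chosen option;
    the table itself does not mark which option is correct.
    """
    n = sum(counts.values())
    keyed = max(counts, key=lambda option: counts[option])
    return 100.0 * counts[keyed] / n


recomputed = {
    item: round(difficulty_from_counts(counts))
    for item, counts in mastery_items.items()
}
print(recomputed)
```

All three recomputed values round to 90, which agrees with the D.I. column of the table.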
Intermediate Difficulty MCQs
The data below show performance of 4 MCQs used in a final course exam for BSOM year 2 students.
All four assessed the same content domain.
All four were classified as “intermediate difficulty” questions (answered correctly by 70.0–89.9% of students)
QM2) Based on the performance data shown below, which one is the highest-quality MCQ?

Option  n    D.I.  top 25%  bottom 25%  D.F.  A   B   C   D   E
A)      105  81    100      59          0.43  0   19  85  1   0
B)      93   81    96       63          0.31  14  4   75  0   0
C)      93   70    96       54          0.40  13  65  10  1   4
D)      93   75    92       67          0.21  1   1   19  2   70
Slide45
Challenging MCQs
The data below show performance of 3 MCQs used in a final course exam for BSOM year 2 students.
All 3 assessed the same content domain.
All 3 were classified as “challenging” questions (answered correctly by < 70% of students)
QM3) Based on the performance data shown below, which one is the highest-quality MCQ?

Option  n    D.I.  top 25%  bottom 25%  D.F.  A   B   C   D   E
A)      104  64    74       41          0.32  12  67  6   12  7
B)      93   57    84       42          0.26  0   22  17  1   53
C)      101  69    81       55          0.33  2   0   3   70  26
Slide46
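The three difficulty bands used in these practice slides (mastery: ≥ 90%; intermediate difficulty: 70.0–89.9%; challenging: < 70%) can be written as a small classifier. The following is a sketch in Python, with a hypothetical function name; the D.I. values are the rounded figures reported in the three tables, so an item sitting near a cut-off could land in a different band if unrounded percentages were used.

```python
def difficulty_band(di_percent: float) -> str:
    """Map a difficulty index (percent correct) to the band named in the slides."""
    if di_percent >= 90:
        return "mastery"
    if di_percent >= 70:
        return "intermediate difficulty"
    return "challenging"


# Rounded D.I. values as reported in the QM1, QM2, and QM3 tables.
reported_di = {
    "QM1": [90, 90, 90],
    "QM2": [81, 81, 70, 75],
    "QM3": [64, 57, 69],
}

bands = {
    question: [difficulty_band(di) for di in values]
    for question, values in reported_di.items()
}
for question, labels in bands.items():
    print(question, labels)
```

The band only describes how hard an item was. It says nothing about discrimination, which is why the practice questions ask for a comparison of items within a single band.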