Slide1
Evaluating a Test
Test Usefulness (Bachman and Palmer, 1996)
Anne Mullen, anne.mullen@elul.ulaval.ca
Université Laval, October 2014
Slide2
Test Validity
The Progressive Matrix of Validity (Messick, 1989) was conceived:
- to control the quality of the evaluation
- to guarantee that the results of the evaluation are precise
- to ensure that the interpretations of the results are fair
Slide3
Plan
1. Qualities of test usefulness (definitions, questions)
2. Creating a valid test
3. Discussion and follow-up questions
Slide4
Six Qualities of Test Usefulness
- Reliability
- Construct Validity
- Authenticity
- Interactiveness
- Impact
- Practicality
Slide5
Six Qualities
- Reliability
- Construct Validity
- Authenticity
- Interactiveness
- Impact
- Practicality
Slide6
Reliability
- seeks to ascertain that the results of an evaluation are similar
- measures the consistency of results from one evaluation to another
- verifies the variation between results in different evaluations
- a minimal level of reliability is determined by the context
Slide7
Is this evaluation reliable?
- does the evaluation allow for comparison between test-takers?
- does the evaluation allow for comparison with other groups of test-takers in the same session or in different sessions?
Slide8
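One common way to put a number on the consistency described under Reliability is a test-retest correlation: the same test-takers sit the evaluation in two sessions and the two sets of scores are correlated. The slides do not prescribe a statistic, so the following is only a minimal sketch under that assumption; the function name `pearson_r` and the scores are hypothetical and serve purely as illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores (out of 10) for the same five test-takers in two sessions.
session_1 = [7, 5, 9, 4, 8]
session_2 = [8, 5, 9, 3, 7]

r = pearson_r(session_1, session_2)
print(f"test-retest correlation: {r:.2f}")
```

A value close to 1 suggests that test-takers are ranked similarly in both sessions. What counts as an acceptable minimum still depends on the context, as noted in the Reliability definition.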
Six Qualities
- Reliability
- Construct Validity
- Authenticity
- Interactiveness
- Impact
- Practicality
Slide9
Construct Validity
- the extent to which the results of an evaluation can be interpreted as an indicator of the ability that the evaluation is measuring
- is said to exist if the results of the evaluation are valid in a specific context and can be generalized (valid in another similar, but different, context)
Slide10
Does this evaluation measure the correct construct?
- does the evaluation actually evaluate the desired ability?
- what other abilities are measured?
Slide11
Six Qualities
- Reliability
- Construct Validity
- Authenticity
- Interactiveness
- Impact
- Practicality
Slide12
Authenticity
- the correspondence between the characteristics of the tasks in the context and those of the tasks in the evaluation
- helps in generalizing the results
Slide13
Is the evaluation authentic?
- will the test-takers need to do similar activities in their present or future academic or work lives?
Slide14
Six Qualities
- Reliability
- Construct Validity
- Authenticity
- Interactiveness
- Impact
- Practicality
Slide15
Interactiveness
- the extent and the type of individual characteristics the test-taker uses when completing the tasks of the evaluation
- includes:
  a) the goal
  b) the specific group being evaluated
  c) the specific context of the evaluation
Slide16
Is the evaluation interactive?
- does the evaluation reflect the classroom activities?
- does the evaluation lead the test-taker to use what has been taught and learned?
Slide17
Six Qualities
- Reliability
- Construct Validity
- Authenticity
- Interactiveness
- Impact
- Practicality
Slide18
Impact
- the effects of the evaluation on:
  a) society (employers)
  b) educational systems (administrators, teachers)
  c) other stakeholders (parents and test-takers)
- the consequences of the evaluation must be evaluated for each stakeholder
Slide19
What is the impact of the evaluation?
- how are the results of the test used?
- is anyone affected negatively by the evaluation?
- who benefits from the evaluation?
Slide20
Six Qualities
- Reliability
- Construct Validity
- Authenticity
- Interactiveness
- Impact
- Practicality
Slide21
Practicality
- the measurement and evaluation of the resources required:
  a) human (test markers, evaluators of the evaluation)
  b) material (space and equipment)
  c) time (test creation, marking, analysis)
Slide22
Is the evaluation practical?
- can it be completed in the allotted time?
- can it be marked easily and fairly for all test-takers?
- what resources are needed, and are they readily available?
Slide23
Determining Test Usefulness
Three principles to follow:
- find a middle ground between the six qualities
- have the six qualities combined and balanced
- evaluate for the context
Slide24
Six Qualities
- Reliability
- Construct Validity
- Authenticity
- Interactiveness
- Impact
- Practicality
Slide25
Creation of an evaluation
You need to design an evaluation for the following list of words:
to devour, to dirty, to imbibe, to purchase, to relish, to swallow, to savour, to scorch, to slip, to taste
Slide26
Context
- The class is an intermediate four-skills ESL class with 23 students.
- While listening to a text that included these ten words, test-takers were asked to answer comprehension questions.
- The ten words were listed and defined because of their presumed level of difficulty.
- The teacher also explained the meaning of these words orally and answered any questions.
Slide27
Is the test useful?
- does the evaluation allow for comparison between test-takers and groups over time? (Reliability)
- does the evaluation actually evaluate the desired ability? Do other abilities intervene? (Construct validity)
Slide28
Is the test useful?
- does the evaluation reflect the test-taker's present-day or future reality? (Authenticity)
- does the evaluation lead the test-taker to use what has been taught and learned? (Interactiveness)
Slide29
Is the test useful?
- what is the effect of the evaluation? (Impact)
- is the evaluation easy to administer? (Practicality)
Slide30
Thank you
anne.mullen@elul.ulaval.ca