Slide1
Electronic Essay Graders
Jay Lubomirski
Slide2
Topics
How electronic essay graders evaluate writing samples
Comparing the electronic graders to the human graders
Gaming the system
Slide3
ETS e-rater®
Educational Testing Service (ETS) is a non-profit test administration company
Responsible for tests like the GRE®, SAT® Subject Tests, TOEFL® Test, etc.
Criterion® Service – online writing evaluation service
e-rater® Scoring engine – system that scores essays written within the Criterion® Service
Slide4
e-rater®
Started in 1998, new versions since
Focuses on writing quality rather than content
Uses natural language processing to look at grammar, usage, mechanics, and development
Goal is to predict the score a human grader would give an essay
Slide5
Process
e-rater® is fed a sample set of essays written to the same prompt (question), along with the scores a human grader gave them
e-rater® builds a model of the essay content and how it relates to the scores the human grader gave the essays
e-rater® is then fed the evaluation essays to score
Assumption is that “good essays resemble other good essays”
Slide6
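The train-then-score loop from the Process slide can be illustrated with a small sketch. This is not e-rater®'s actual algorithm: it assumes a plain bag-of-words model and gives a new essay the human score of the most similar training essay, which is one literal reading of “good essays resemble other good essays”. The function names and the two training essays are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase word counts for one essay."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def train(scored_essays):
    """Build the 'model': one word-count vector per human-scored essay."""
    return [(bag_of_words(text), score) for text, score in scored_essays]


def predict(model, essay):
    """Return the human score of the most similar training essay."""
    vec = bag_of_words(essay)
    _, best_score = max(model, key=lambda pair: cosine(vec, pair[0]))
    return best_score


# Hypothetical training set: (essay text, human score) pairs for one prompt.
training = [
    ("College costs rise because states cut funding and tuition fills the gap.", 5),
    ("College is expensive. It is bad. I do not like it.", 2),
]
model = train(training)
print(predict(model, "Tuition rises when state funding is cut."))
```

Because the sketch only measures word overlap, it has the same blind spot the presentation goes on to describe: it cannot tell whether what the essay says is true.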
Grammar & Lexical Complexity
Grammar checker looks for 30 error types
Subject-verb agreement
Homophone errors
Misspellings
Overuse of vocabulary
The lexical complexity scorer computes a word frequency index and compares it against the word frequency in the model
Slide7
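The slides do not say how the word frequency index is calculated. As a rough sketch only, assume it is the mean log frequency of an essay's words in a reference table, compared with the same figure for the model essays. The table values, the floor for unknown words, and the function names below are all hypothetical.

```python
import math
import re

# Hypothetical reference counts per million words; a real system would
# load these from a large corpus.
REFERENCE_FREQ = {"the": 50000, "is": 10000, "cost": 200, "tuition": 20, "exorbitant": 2}
UNKNOWN_FREQ = 1  # assumed floor for words missing from the table


def frequency_index(text):
    """Mean log10 reference frequency of the words in the text.

    Lower values mean the essay leans on rarer vocabulary.
    """
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    logs = [math.log10(REFERENCE_FREQ.get(w, UNKNOWN_FREQ)) for w in words]
    return sum(logs) / len(logs)


def compare_to_model(essay, model_essays):
    """Essay index minus the mean index of the model essays."""
    model_mean = sum(frequency_index(e) for e in model_essays) / len(model_essays)
    return frequency_index(essay) - model_mean
```

Under these assumptions, a negative result from `compare_to_model` means the essay uses rarer words than the model essays do.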
Organization and Development
Automatically identifies sentences that follow essay-discourse categories
Introductory material
Thesis
Main ideas
Supporting ideas
Conclusion
Organization is determined by computing length of discourse elements
Scored against the model
Slide8
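The organization measure on the Organization and Development slide can be sketched the same way. The sketch assumes each sentence has already been labeled with a discourse category by an upstream classifier, which is not shown, and that "length" means word count. The category labels, function names, and distance measure are illustrative, not taken from e-rater®.

```python
from collections import defaultdict

CATEGORIES = ["introduction", "thesis", "main idea", "supporting idea", "conclusion"]


def element_lengths(labeled_sentences):
    """Total word count per discourse category.

    labeled_sentences is a list of (category, sentence) pairs; the labels
    are assumed to come from an upstream classifier that is not shown here.
    """
    totals = defaultdict(int)
    for category, sentence in labeled_sentences:
        totals[category] += len(sentence.split())
    return {c: totals[c] for c in CATEGORIES}


def distance_from_model(lengths, model_lengths):
    """Sum of absolute length differences across categories."""
    return sum(abs(lengths[c] - model_lengths[c]) for c in CATEGORIES)
```

A smaller distance means the essay's discourse elements are closer in length to those of the model essays.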
Scoring the Systems
In 2012, Mark Shermis compared 9 electronic grading systems (8 commercial, 1 open source) against 8 essay prompts
Essays sourced from high school writing assessments that were graded by human readers
Results demonstrated that electronic essay scoring was capable of producing scores similar to human readers
Slide9
Problems with electronic grading systems
These systems look at language structure; they cannot verify facts presented in the essay
Les Perelman, Director of Writing at MIT, wrote an essay that received the top score from e-rater®.
The essay prompt was about the rising costs of college. Perelman based his essay on the premise that college costs are so high because “Teaching assistants are paid an excessive amount of money.”
Slide10
“In conclusion, as Oscar Wilde said, "I can resist everything except temptation." Luxury dorms are not the problem. The problem is greedy teaching assistants. It gives me an organizational scheme that looks like an essay, it limits my focus to one topic and three subtopics so I don’t wander about thinking irrelevant thoughts, and it will be useful for whatever writing I do in any subject. I don’t know why some teachers seem to dislike it so much. They must have a different idea about education than I do.”
Slide11
Sources
Winerip, M. (2012). “Facing a Robo-Grader? Just Keep Obfuscating Mellifluously.” New York Times, April 22, 2012. Retrieved 4/13/2013 from http://www.nytimes.com/2012/04/23/education/robo-readers-used-to-grade-test-essays.html
Ramineni, C. (2012). “Evaluation of the e-rater® Scoring Engine for the GRE® Issue and Argument Prompts.” Educational Testing Service. Retrieved 4/13/2013 from http://www.ets.org/Media/Research/pdf/RR-12-02.pdf
Kolowich, S. (2012). “Large study shows little difference between human and robot essay graders.” Inside Higher Ed. Retrieved 4/11/2013 from http://www.insidehighered.com/news/2012/04/13/large-study-shows-little-difference-between-human-and-robot-essay-graders
Shermis, M. (2012). “Contrasting State-of-the-Art Automated Scoring of Essays: Analysis.” Retrieved 4/13/2013 from http://www.scoreright.org/NCME_2012_Paper3_29_12.pdf
Dikli, S. (2006). “An Overview of Automated Scoring of Essays.” Journal of Technology, Learning, and Assessment, 5(1).