Slide1
Corpus from scratch: collecting and processing a sizeable EAP corpus in a (relatively) resource-poor context
Priya Mathew, Hilary Nesi & Benet Vincent
Slide2
Types of DIY corpus:
  Expert writing collected by students.
  Student writing collected by lecturers.
  Student writing compared with expert writing (collected by students or lecturers).
Expert writing collected by students:
  Corpus compilation helps students learn more about their own disciplines
  Fairly quick and easy
  Can provide good examples for data-driven learning
Student writing collected by lecturers:
  Corpus compilation helps lecturers learn more about disciplinary requirements
  Fairly slow and laborious
  May contain errors
Slide3
The Middle East College DIY corpus
Created for needs analysis:
  What types of assignments do subject lecturers set?
  What genres of writing do the students produce?
  What do the best students do well, and where are they still having problems?
Created for learning activities:
  Using discipline-specific key words and phrases
  Noticing similarities and differences between their own and expert usage
Slide4
Context: MEC, Oman
Largest private college (6000 students)
Electronics, Civil Engineering, Mechanical Engineering, Computing and Business
Student population: 90% Omani, 10% International
Arabic background (8 years of English)
1-year foundation before undergraduate course (IELTS 5.5)
Slide5
Need for writing support post-Foundation
Many students not able to meet disciplinary writing requirements (feedback from subject lecturers, students and external examiners, student performance)
Slide6
Centre for Academic Writing at MEC
Supports UG and PG students through:
  workshops
  consultations
  WID (Writing in Disciplines) courses
Slide7
Initial questions
How to design courses if we don’t know:
  what genres students from different disciplines write
  the lexicogrammatical features of the different stages of the texts
  what subject lecturers value in their students’ written assignments
Texts need to be categorized into genres
Stages of the texts need to be marked up
Slide8
Creating the Corpus
Civil Engineering (coursework from 26 modules represented)
Obtained student consent (Consent Form on Moodle)
Slide9
Creating the Corpus
Subject lecturers chose some proficient assignments per module
Converted texts to XML format
Texts annotated during the conversion process
Tool: <oXygen/> XML Editor
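The slides do not show the XML schema used for the corpus. As a minimal sketch of annotating during conversion, assuming hypothetical element and attribute names (`text`, `stage`, `p`, `id`, `genre`, `semester`, `type`) and an invented sample text, a plain-text assignment could be wrapped with genre metadata and stage mark-up like this:

```python
import xml.etree.ElementTree as ET


def build_text_xml(text_id, genre, semester, stages):
    """Wrap one assignment in XML.

    Element and attribute names are hypothetical, not the MEC corpus schema.
    `stages` is a list of (stage_name, [paragraph, ...]) pairs.
    """
    root = ET.Element(
        "text", {"id": text_id, "genre": genre, "semester": str(semester)}
    )
    for stage_name, paragraphs in stages:
        stage = ET.SubElement(root, "stage", {"type": stage_name})
        for para in paragraphs:
            p = ET.SubElement(stage, "p")
            p.text = para
    return root


# Invented sample content, for illustration only.
sample = build_text_xml(
    "CE0001",
    "Lab Report",
    4,
    [
        ("introduction", ["This report describes a slump test on fresh concrete."]),
        ("method", ["The cone was filled in three layers.",
                    "Each layer was tamped 25 times."]),
    ],
)
xml_string = ET.tostring(sample, encoding="unicode")
print(xml_string)
```

Recording the genre as an attribute and the stages as nested elements would allow later queries to be restricted to, say, the method stage of lab reports.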
Slide10
The MEC Civ. Eng. Corpus
MEC Undergraduate Civil Engineering Programme consists of 8 semesters
Semester   Number of assignments   Number of words
1          10                      30200
2          10                      23700
3          12                      35000
4          22                      33600
5          41                      68100
6          15                      58000
7          23                      70000
Slide11
Genre Analysis
Categorized texts in corpus into genres based on:
  analysis of stages in texts (Nesi and Gardner 2012)
  interviews with subject lecturers
  assignment briefs
  module information guide
Slide12
MEC Civil Engineering Corpus, by genre
Genre                       No. of assignments   No. of words
Case Study                  34                   13800
Explanation                 27                   88600
Exercise                    14                   18000
Lab Report                  62                   48700
Manual                      2                    11200
Site Investigation Report   5                    14400
Slide13
Exploiting the corpus: some initial analyses
Data-driven analysis involving e.g.
  key words
  key terms
  n-grams
can be used to suggest pedagogical interventions
Slide14
Keywords
Wordforms that are significantly more frequent in the corpus than in a reference corpus
MEC CE Corpus vs. enTenTen13 (parameter: 1)
Suggests items / categories that may be worth teaching
Includes some that definitely aren’t!
NB Sketch Engine keywords
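As a rough illustration of how such a keyword list can be ranked, the sketch below assumes that "parameter: 1" refers to the smoothing constant N in Sketch Engine's "simple maths" keyness score, (fpm_focus + N) / (fpm_ref + N), where fpm is frequency per million tokens. This score is a ratio rather than a significance test. The function names and all counts are invented for the example and are not taken from the MEC corpus or enTenTen13.

```python
from collections import Counter


def per_million(count, corpus_size):
    """Normalise a raw count to frequency per million tokens."""
    return count * 1_000_000 / corpus_size


def keyness(focus_counts, focus_size, ref_counts, ref_size, n=1.0):
    """Rank focus-corpus words by (fpm_focus + n) / (fpm_ref + n).

    `n` is the smoothing parameter (assumed here to be the slide's
    "parameter: 1"); higher values favour more frequent words.
    """
    scores = {}
    for word, count in focus_counts.items():
        fpm_focus = per_million(count, focus_size)
        fpm_ref = per_million(ref_counts.get(word, 0), ref_size)
        scores[word] = (fpm_focus + n) / (fpm_ref + n)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy counts, invented for illustration only.
focus = Counter({"concrete": 50, "the": 600, "slump": 10})
reference = Counter({"concrete": 20, "the": 60000, "slump": 1})

ranked = keyness(focus, 10_000, reference, 1_000_000)
for word, score in ranked:
    print(f"{word}\t{score:.2f}")
```

With these toy figures, a word that is equally frequent in both corpora (here "the") scores 1.0 and falls to the bottom of the list, while discipline-specific items rise to the top.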
Slide15
Key terms
MEC CE Corpus vs. enTenTen13 (parameter: 1)
Almost all N + N / Adj + N
Measurement-related terms
Keyword procedure applied to multi-word items (MWIs)
Slide16
4-grams
Useful starting point to look at categories such as:
  reference to measurement / location
  reference to visuals
This can reveal common issues
aka 4-word lexical bundles
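A minimal sketch of 4-gram extraction, assuming a simple regex tokenizer and a hypothetical `min_texts` threshold (lexical bundle studies usually require a sequence to occur in several different texts, but the slides do not state the cut-offs used). The two sample sentences are invented.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into word tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9]+(?:['-][a-z0-9]+)*", text.lower())


def ngrams(tokens, n=4):
    """Return all contiguous n-token sequences as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams(texts, n=4, min_texts=2):
    """Count n-grams across texts, keeping those found in >= min_texts texts."""
    freq = Counter()        # total occurrences of each n-gram
    text_range = Counter()  # number of texts each n-gram occurs in
    for text in texts:
        grams = ngrams(tokenize(text), n)
        freq.update(grams)
        text_range.update(set(grams))
    return [(g, c) for g, c in freq.most_common() if text_range[g] >= min_texts]


# Invented sample sentences, for illustration only.
texts = [
    "As shown in the figure below, the load increases.",
    "The results are as shown in the table.",
]
bundles = top_ngrams(texts)
print(bundles)
```

Counting n-grams within each text separately keeps sequences from spanning text boundaries, and the range filter removes bundles that are frequent only because one assignment repeats them.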
Slide17
Referring to visuals: teaching material
Lines retrieved using CQL
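The CQL query used on the slide is not shown. As a rough Python analogue, and not a reconstruction of that query, the sketch below retrieves keyword-in-context lines for references to visuals with a regular expression. The pattern, the list of visual nouns and the sample text are all assumptions made for illustration.

```python
import re

# Hypothetical pattern: a visual noun followed by a number such as "2" or "3.1".
VISUAL_REF = re.compile(
    r"\b(?:figure|fig\.|table|graph|chart|diagram)\s+\d+(?:\.\d+)*",
    re.IGNORECASE,
)


def concordance(text, width=30):
    """Return (left context, match, right context) for each visual reference."""
    lines = []
    for m in VISUAL_REF.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        lines.append((left, m.group(0), right))
    return lines


# Invented sample text, for illustration only.
sample_text = "The results are shown in Figure 2. Table 3.1 lists the loads."
hits = concordance(sample_text)
for left, node, right in hits:
    print(f"{left:>30} [{node}] {right}")
```

Lines retrieved this way can be sorted by the left context to group the verbs and prepositions students use when pointing to a figure or table.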
Slide18
Further work to include…
Keywords of genres (e.g. case study) compared to rest of corpus
Comparisons of usage seen in corpus with more expert writing:
  BAWE Engineering writing
  Journal writing
  Textbook writing?
in terms of typical collocates and other phraseological features
Sharing results with teachers and students
Probably retrieves different types of keywords