PPT-10 practical uses of a million-word corpus
Author : luna | Published Date : 2024-02-03
10 practical uses of a million-word corpus in ELT. All easy to find and use on www.lextutor.ca, just add imagination. March 30, Fri., 10-11h30. Colloquium: Using corpus resources and findings in ELT. Tom Cobb
10 practical uses of a million-word corpus: Transcript
10 practical uses of a million-word corpus in ELT. All easy to find and use on www.lextutor.ca, just add imagination. March 30, Fri., 10-11h30. Colloquium: Using corpus resources and findings in ELT. Tom Cobb, Didactique des langues.