Computational Psycholinguistics
1 Course information
Lecture Times | Tuesdays & Thursdays 11am-12:30pm |
Lecture Location | 46-5193 |
Class website | http://stellar.mit.edu/S/course/9/sp17/9.19/index.html |
Syllabus | http://www.mit.edu/~rplevy/teaching/2017spring/9.19 |
2 Instructor information
Instructor | Roger Levy (rplevy@mit.edu) |
Instructor's office | 46-3033 |
Instructor's office hours | Mondays 4-5:30pm and by appointment; special office hours 3/20 and 3/21: Mon 3/20 5-6pm, Tu 3/21 4-5pm |
TA | Meilin Zhan (meilinz@mit.edu) |
TA's office | 46-3027D |
TA's office hours | Thursdays 10-11am and by appointment |
3 Course Description
Over the last two and a half decades, computational linguistics has been revolutionized as a result of three closely related developments: increases in computing power, the advent of large linguistic datasets, and a paradigm shift toward probabilistic modeling. At the same time, similar theoretical developments in cognitive science have led to a view of major aspects of human cognition as instances of rational statistical inference. These developments have set the stage for renewed interest in computational approaches to human language use. Correspondingly, this course covers some of the most exciting developments in computational psycholinguistics over the past decade. The course spans human language comprehension, production, and acquisition, and covers key phenomena from both phonetics and syntax. Students will learn key technical tools, including probabilistic models, formal grammars, and decision theory, and will learn how theory, computational modeling, and data can be combined to advance our fundamental understanding of human language use.
4 Course organization
We'll meet twice a week; the course format will be a combination of lecture, discussion, and in-class exercises as class size, structure, and interests permit.
5 Intended Audience
Undergraduates in Brain & Cognitive Sciences, Linguistics, Electrical Engineering & Computer Science, and any of a number of related disciplines. Graduate students are also welcome to take the course (contact the instructor if you are interested in taking this course as part of the BCS PhD program requirements; it's possible). Postdocs and faculty are also welcome to participate!
The course prerequisites are one semester of Python programming (fulfilled by 6.00), plus either one semester of probability/statistics/machine learning (fulfilled by, for example, 6.041B or 9.40) or one semester of introductory linguistics (fulfilled by 24.900). If you think you have the requisite background but have not taken the specific courses just mentioned, please talk to the instructor to work out whether you should take this course or do other prerequisites first.
We will be doing some Python programming in this course, and also using programs that must be run from the Unix/Linux/OS X command line.
6 Readings & Textbooks
Readings will frequently be drawn from the following textbooks:
- Daniel Jurafsky and James H. Martin. Speech and Language Processing. Third edition (draft). Draft chapters can be found here. (I refer to this book as "SLP" in the syllabus.)
This textbook is the single most comprehensive and up-to-date introduction available to the field of computational linguistics.
- Bird, Steven, Ewan Klein, and Edward Loper. 2009. Natural Language Processing with Python. O'Reilly Media. (I refer to this book as "NLTK" in the syllabus.)
This is the book for the Natural Language Toolkit (or NLTK), which we will be using extensively for our programming in Python. You can buy this book, or you can freely access it on the Web at http://www.nltk.org/book.
- Christopher D. Manning and Hinrich Schütze. 1999. Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press. Book chapter PDFs can be obtained through the MIT library website. (I refer to this book as "M&S" in the syllabus.)
This is an older but still very useful book on NLP.
- Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze. 2008. Introduction to Information Retrieval. Cambridge University Press. Book chapter PDFs can be obtained through the MIT library website. (I refer to this book as "MRS" in the syllabus.)
We'll also occasionally draw upon other sources for readings, and sometimes read original papers.
7 Syllabus (subject to modification)
Week | Day | Topic | Readings | Homework |
---|---|---|---|---|
Week 1 | Tues 7 Feb | Course intro; intro to probability theory; speech perception and the perceptual magnet | | |
Week 1 | Thurs 9 Feb | Class cancelled due to snow | | |
Week 2 | Tues 14 Feb | Elementary text classification in NLTK; Naive Bayes | MRS Chapter 13 | HW1 out |
Week 2 | Thurs 16 Feb | Word sequences, language models and \(N\)-grams | SLP 4.1-4.4 | |
Week 3 | Tues 21 Feb | No class (MIT on Monday schedule today) | | |
Week 3 | Thurs 23 Feb | More advanced issues in \(N\)-gram modeling | SLP 4.5-4.8 | |
Week 4 | Tues 28 Feb | Prediction in human language understanding; surprisal | Kutas et al., 2011; Piantadosi et al., 2011 | HW1 due; HW2 out |
Week 4 | Thurs 2 Mar | Regular expressions | SLP 2.1-2.1.6 | |
Week 5 | Tues 7 Mar | Finite-state automata | SLP 2.2-2.4 | |
Week 5 | Thurs 9 Mar | Finite-state automata, continued | | |
Week 6 | Tues 14 Mar | Class cancelled due to snow | | |
Week 6 | Thurs 16 Mar | Finite-state transducers; weighted finite-state machines; noisy-channel models | SLP 3; M&S 3.1 | HW2 due |
Week 7 | Tues 21 Mar | Policy optimization and modeling human eye movements in reading | Sutton & Barto in progress, 1.1-1.6; Bicknell & Levy, 2010 | |
Week 7 | Thurs 23 Mar | Midterm Exam | | |
Spring Break | Tues 28 Mar | No class this week | | |
Spring Break | Wed 29 Mar | Workshop | | |
Spring Break | Thurs 30 Mar | CUNY Sentence Processing Conference | | |
Spring Break | Fri 31 Mar | CUNY Sentence Processing Conference | | |
Spring Break | Sat 1 Apr | CUNY Sentence Processing Conference | | |
Week 8 | Tues 4 Apr | Bayes Nets and interventions | Kraljic et al., 2008; Levy in progress, Directed Graphical Models appendix | |
Week 8 | Thurs 6 Apr | Multi-factor models: logistic regression; word order preferences in language | SLP 7; Graphical models intro | HW3 out |
Week 9 | Tues 11 Apr | Hierarchical models and multi-factor models. The binomial construction. | Morgan & Levy, 2015 | |
Week 9 | Thurs 13 Apr | Is human language finite state? Context-free grammars; Syntactic analysis; searching Treebanks | SLP 11, 16; NLTK 8.1-8.5; Levy & Andrew, 2006 | |
Week 10 | Tues 18 Apr | No class (student holiday due to Patriots' Day) | | |
Week 10 | Thurs 20 Apr | Parsing with context-free grammars; dynamic programming. | SLP 13.1-13.5, 13.7; NLTK 8.6 | |
Week 11 | Tues 25 Apr | Probabilistic context-free grammars, incremental parsing, and human syntactic processing | SLP Chapter 13 (under Readings on Stellar); Levy, 2013 | HW3 due; HW4 out |
Week 11 | Thurs 27 Apr | Weighted intersection of context-free grammars and finite-state machines: noisy-channel syntactic comprehension | Levy, 2011 | |
Week 12 | Tues 2 May | Statistical word learning in humans; modeling with nonparametric Bayes | Saffran et al., 1996; Goldwater, Griffiths, & Johnson, 2009 | |
Week 12 | Thurs 4 May | The emergence of syntactic productivity in language development | Meylan et al., 2017 | HW4 due |
Week 13 | Tues 9 May | Pragmatics in language understanding. Scalar inference. The Rational Speech Acts model. | Goodman & Frank, 2016 (see also Frank & Goodman, 2012) | |
Week 13 | Thurs 11 May | Advanced pragmatics models: lexical uncertainty, scalar adjectives | Lassiter & Goodman, 2015 | |
Week 14 | Tues 16 May | Word embeddings | Levy, Goldberg, & Dagan, 2015 | |
Week 14 | Thurs 18 May | End-of-semester review | None | Final project due for grad students |
Wed 24 May, 9am-12pm | Final exam |
8 Requirements & grading
You'll be graded on:
- A number of homework assignments throughout the semester (50% of grade)
- A midterm and a final (50% of grade)
- In the case of grad students, a final project
Active participation in the class is also encouraged and taken into account in borderline grade cases!
9 Mailing list
There will be a mailing list for this course, which you can access at https://mailman.mit.edu:444/mailman/listinfo/9.19-2017-spring. Please make sure you're signed up for it! This list is both for discussion of ideas in the class and for communications about organizational issues.