# Computational Psycholinguistics

## 1 Course information

• Lecture times: Tuesdays & Thursdays 11am-12:30pm
• Lecture location: 46-5193
• Class website: http://stellar.mit.edu/S/course/9/sp17/9.19/index.html
• Syllabus: http://www.mit.edu/~rplevy/teaching/2017spring/9.19

## 2 Instructor information

• Instructor: Roger Levy (rplevy@mit.edu)
• Instructor's office: 46-3033
• Instructor's office hours: Mondays 4-5:30pm and by appointment; special office hours 3/20 and 3/21: Mon 3/20 5-6pm, Tu 3/21 4-5pm
• TA: Meilin Zhan (meilinz@mit.edu)
• TA's office: 46-3027D
• TA's office hours: Thursdays 10-11am and by appointment

## 3 Course Description

Over the last two and a half decades, computational linguistics has been revolutionized as a result of three closely related developments: increases in computing power, the advent of large linguistic datasets, and a paradigm shift toward probabilistic modeling. At the same time, similar theoretical developments in cognitive science have led to a view of major aspects of human cognition as instances of rational statistical inference. These developments have set the stage for renewed interest in computational approaches to human language use. Correspondingly, this course covers some of the most exciting developments in computational psycholinguistics over the past decade. The course spans human language comprehension, production, and acquisition, and covers key phenomena from both phonetics and syntax. Students will learn key technical tools including probabilistic models, formal grammars, and decision theory, and how theory, computational modeling, and data can be combined to advance our fundamental understanding of human language use.

## 4 Course organization

We'll meet twice a week; the course format will be a combination of lecture, discussion, and in-class exercises as class size, structure, and interests permit.

## 5 Intended Audience

Undergraduates in Brain & Cognitive Sciences, Linguistics, Electrical Engineering & Computer Science, and any of a number of related disciplines. Graduate students are also welcome to take the course (contact the instructor if you are interested in taking this course as part of the BCS PhD program requirements; it's possible). Postdocs and faculty are also welcome to participate!

The course prerequisites are one semester of Python programming (fulfilled by 6.00), plus either one semester of probability/statistics/machine learning (fulfilled by, for example, 6.041B or 9.40) or one semester of introductory linguistics (fulfilled by 24.900). If you think you have the requisite background but have not taken the specific courses just mentioned, please talk to the instructor to work out whether you should take this course or do other prerequisites first.

We will be doing some Python programming in this course, and also using programs that must be run from the Unix/Linux/OS X command line.

Readings will frequently be drawn from the following textbooks:

1. Daniel Jurafsky and James H. Martin. Speech and Language Processing. Third edition (draft). Draft chapters can be found here. (I refer to this book as "SLP" in the syllabus.)

This textbook is the single most comprehensive and up-to-date introduction available to the field of computational linguistics.

2. Bird, Steven, Ewan Klein, and Edward Loper. 2009. Natural Language Processing with Python. O'Reilly Media. (I refer to this book as "NLTK" in the syllabus.)

This is the book for the Natural Language Toolkit (or NLTK), a Python library that we will be using extensively for programming in this course. You can buy this book, or you can freely access it on the Web at http://www.nltk.org/book.

3. Christopher D. Manning and Hinrich Schütze. 1999. Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press. Book chapter PDFs can be obtained through the MIT library website. (I refer to this book as "M&S" in the syllabus.)

This is an older but still very useful book on NLP.

4. Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze. 2008. Introduction to Information Retrieval. Cambridge University Press. Book chapter PDFs can be obtained through the MIT library website. (I refer to this book as "MRS" in the syllabus.)

We'll also occasionally draw upon other sources for readings, and sometimes read original papers.

## 7 Syllabus (subject to modification)

| Week | Day | Topic | Readings | Assignments |
|---|---|---|---|---|
| Week 1 | Tues 7 Feb | Course intro; intro to probability theory; speech perception and the perceptual magnet | | |
| | Thurs 9 Feb | Class cancelled due to snow | | |
| Week 2 | Tues 14 Feb | Elementary text classification in NLTK; Naive Bayes | MRS Chapter 13 | HW1 out |
| | Thurs 16 Feb | Word sequences, language models and $$N$$-grams | SLP 4.1-4.4 | |
| Week 3 | Tues 21 Feb | No class (MIT on Monday schedule today) | | |
| | Thurs 23 Feb | More advanced issues in $$N$$-gram modeling | SLP 4.5-4.8 | |
| Week 4 | Tues 28 Feb | Prediction in human language understanding; surprisal | Kutas et al., 2011; Piantadosi et al., 2011 | HW1 due; HW2 out |
| | Thurs 2 Mar | Regular expressions | SLP 2.1-2.1.6 | |
| Week 5 | Tues 7 Mar | Finite-state automata | SLP 2.2-2.4 | |
| | Thurs 9 Mar | Finite-state automata, continued | | |
| Week 6 | Tues 14 Mar | Class cancelled due to snow | | |
| | Thurs 16 Mar | Finite-state transducers; weighted finite-state machines; noisy-channel models | SLP 3; M&S 3.1 | HW2 due |
| Week 7 | Tues 21 Mar | Policy optimization and modeling human eye movements in reading | Sutton & Barto in progress, 1.1-1.6; Bicknell & Levy, 2010 | |
| | Thurs 23 Mar | Midterm Exam | | |
| Spring Break | Tues 28 Mar | No class this week | | |
| | Wed 29 Mar | Workshop | | |
| | Thurs 30 Mar | CUNY Sentence Processing Conference | | |
| | Fri 31 Mar | CUNY Sentence Processing Conference | | |
| | Sat 1 Apr | CUNY Sentence Processing Conference | | |
| Week 8 | Tues 4 Apr | Bayes Nets and interventions | Kraljic et al., 2008; Levy in progress, Directed Graphical Models appendix | |
| | Thurs 6 Apr | Multi-factor models: logistic regression; word order preferences in language | SLP 7; Graphical models intro | HW3 out |
| Week 9 | Tues 11 Apr | Hierarchical models and multi-factor models. The binomial construction. | Morgan & Levy, 2015 | |
| | Thurs 13 Apr | Is human language finite state? Context-free grammars; Syntactic analysis; searching Treebanks | SLP 11, 16; NLTK 8.1-8.5; Levy & Andrew, 2006 | |
| Week 10 | Tues 18 Apr | No class (student holiday due to Patriots' Day) | | |
| | Thurs 20 Apr | Parsing with context-free grammars; dynamic programming. | SLP 13.1-13.5, 13.7; NLTK 8.6 | |
| Week 11 | Tues 25 Apr | Probabilistic context-free grammars, incremental parsing, and human syntactic processing | SLP Chapter 13 (under Readings on Stellar); Levy, 2013 | HW3 due; HW4 out |
| | Thurs 27 Apr | Weighted intersection of context-free grammars and finite-state machines: noisy-channel syntactic comprehension | Levy, 2011 | |
| Week 12 | Tues 2 May | Statistical word learning in humans; modeling with nonparametric Bayes | Saffran et al., 1996; Goldwater, Griffiths, & Johnson, 2009 | |
| | Thurs 4 May | The emergence of syntactic productivity in language development | Meylan et al., 2017 | HW4 due |
| Week 13 | Tues 9 May | Pragmatics in language understanding. Scalar inference. The Rational Speech Acts model. | Goodman & Frank, 2016 (see also Frank & Goodman, 2012) | |
| | Thurs 11 May | Advanced pragmatics models: lexical uncertainty, scalar adjectives | Lassiter & Goodman, 2015 | |
| Week 14 | Tues 16 May | Word embeddings | Levy, Goldberg, & Dagan, 2015 | |
| | Thurs 18 May | End-of-semester review | None | Final project due for grad students |
| | Wed 24 May, 9am-12pm | Final exam | | |

• A number of homework assignments throughout the semester (50% of grade)
• A midterm and a final (50% of grade)
• In the case of grad students, a final project

Active participation in the class is also encouraged and taken into account in borderline grade cases!

## 9 Mailing list

There will be a mailing list for this course, which you can access at https://mailman.mit.edu:444/mailman/listinfo/9.19-2017-spring. Please make sure you're signed up for it! This list is both for discussion of ideas in the class and for communications about organizational issues.

Created: 2017-05-18 Thu 17:09
