# Computational Psycholinguistics, Spring 2018

## 1 Course information

| | |
|---|---|
| Lecture Times | Mondays & Wednesdays 9:30-11:00am |
| Lecture Location | 46-4199 |
| Class website | http://stellar.mit.edu/S/course/9/sp18/9.19/index.html |
| Syllabus | http://www.mit.edu/~rplevy/teaching/2018spring/9.19 |

## 2 Instructor information

| | |
|---|---|
| Instructor | Roger Levy (rplevy@mit.edu) |
| Instructor's office | 46-3033 |
| Instructor's office hours | Mondays 11am-12pm, Tuesdays 2-3pm |
| Teaching Assistants | Yevgeni Berzak (berzak@mit.edu); Richard Futrell (futrell@mit.edu) |
| TA Offices | Yevgeni: 46-3027G; Richard: 46-3037 |
| TA Office Hours | Yevgeni: W 12:30-2pm; Richard: R 2-3pm |

## 3 Course Description

Over the last two and a half decades, computational linguistics has been revolutionized as a result of three closely related developments: increases in computing power, the advent of large linguistic datasets, and a paradigm shift toward probabilistic modeling. At the same time, similar theoretical developments in cognitive science have led to a view of major aspects of human cognition as instances of rational statistical inference. These developments have set the stage for renewed interest in computational approaches to human language use. Correspondingly, this course covers some of the most exciting developments in computational psycholinguistics over the past decade. The course spans human language comprehension, production, and acquisition, and covers key phenomena in phonetics, phonology, morphology, syntax, semantics, and pragmatics. Students will learn technical tools including probabilistic models, formal grammars, neural networks, and decision theory, and how theory, computational modeling, and data can be combined to advance our fundamental understanding of human language acquisition and use.

## 4 Course organization

We'll meet twice a week; the course format will be a combination of lecture, discussion, and in-class exercises as class size, structure, and interests permit.

## 5 Intended Audience

Undergraduate or graduate students in Brain & Cognitive Sciences, Linguistics, Electrical Engineering & Computer Science, and any of a number of related disciplines. The undergraduate section is 9.19; the graduate section is 9.190. Postdocs and faculty are also welcome to participate!

The course prerequisites are:

1. One semester of Python programming (fulfillable by 6.00/6.0001+6.0002, for example), plus
2. Either:
• one semester of probability/statistics/machine learning (fulfilled by, for example, 6.041B or 9.40), or
• one semester of introductory linguistics (fulfilled by 24.900).

If you think you have the requisite background but have not taken the specific courses just mentioned, please talk to the instructor to work out whether you should take this course or do other prerequisites first.

We will be doing some Python programming in this course, and also using programs that must be run from the Unix/Linux/OS X command line.

Readings will frequently be drawn from the following textbooks:

1. Daniel Jurafsky and James H. Martin. Speech and Language Processing. Third edition (draft). Draft chapters can be found here. (I refer to this book as "SLP" in the syllabus.)

This textbook is the single most comprehensive and up-to-date introduction available to the field of computational linguistics.

2. Bird, Steven, Ewan Klein, and Edward Loper. 2009. Natural Language Processing with Python. O'Reilly Media. (I refer to this book as "NLTK" in the syllabus.)

This is the book for the Natural Language Toolkit (or NLTK), which we will be using extensively for our programming in Python. You can buy this book, or you can freely access it on the Web at http://www.nltk.org/book.

3. Christopher D. Manning and Hinrich Schütze. 1999. Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press. Book chapter PDFs can be obtained through the MIT library website. (I refer to this book as "M&S" in the syllabus.)

This is an older but still very useful book on NLP.

4. Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze. 2008. Introduction to Information Retrieval. Cambridge University Press. Book chapter PDFs can be obtained through the MIT library website. (I refer to this book as "MRS" in the syllabus.)

We'll also occasionally draw upon other sources for readings, including original research papers in computational linguistics, psycholinguistics, and other areas of the cognitive science of language.

## 7 Syllabus (subject to modification)

| Week | Date | Topic | Readings | Problem sets |
|---|---|---|---|---|
| Week 1 | Wed 7 Feb | Course intro; intro to probability theory | | |
| Week 2 | Mon 12 Feb | Speech perception and the perceptual magnet | MRS Chapter 13 | Pset 1 out |
| | Wed 14 Feb | Elementary text classification in NLTK; Naive Bayes; word sequences, language models and $$N$$-grams | SLP 4.1-4.4 | |
| Week 3 | Tue 20 Feb | More advanced issues in $$N$$-gram modeling | SLP 4.5-4.8 | |
| | Wed 21 Feb | Prediction in human language understanding; surprisal | Kutas et al., 2011; Piantadosi et al., 2011; Smith & Levy, 2013 | |
| Week 4 | Mon 26 Feb | Regular expressions | SLP 2.1-2.1.6 | Pset 1 due; Pset 2 out |
| | Wed 28 Feb | Finite-state machines | SLP 2.2-2.4, 3 | |
| Week 5 | Mon 5 Mar | Finite-state machines II | | |
| | Wed 7 Mar | Finite-state transducers | | |
| Week 6 | Mon 12 Mar | Weighted finite-state machines; noisy-channel models; policy optimization and modeling human eye movements in reading | M&S 3.1; Sutton & Barto in progress, 1.1-1.6; Bicknell & Levy, 2010 | Pset 2 due; Pset 3 out |
| | Wed 14 Mar | Bayes Nets and interventions | Kraljic et al., 2008; Russell & Norvig, 2010, chapter 14 (on Stellar); Levy in progress, Directed Graphical Models appendix; Bayes Nets lecture notes | |
| Week 7 | Mon 19 Mar | Multi-factor models: logistic regression; word order preferences in language. Hierarchical models; binomial construction. | SLP 7; Graphical models intro; Morgan & Levy, 2015 | |
| | Wed 21 Mar | Midterm exam | | |
| Spring Break | 26-30 Mar | Spring break, no class | | |
| Week 8 | Mon 2 Apr | Is human language finite state? Context-free grammars; syntactic analysis; searching Treebanks | SLP 11, 16; NLTK 8.1-8.5; Levy & Andrew, 2006 | Pset 3 due; Pset 4 out |
| | Wed 4 Apr (46-3189) | Parsing with context-free grammars; dynamic programming | SLP 13.1-13.5, 13.7; NLTK 8.6 | |
| Week 9 | Mon 9 Apr | Probabilistic context-free grammars, incremental parsing, and human syntactic processing | SLP Chapter 13 (under Readings on Stellar); Levy, 2013 | |
| | Wed 11 Apr | Weighted intersection of context-free grammars and finite-state machines: noisy-channel syntactic comprehension | Levy, 2011 | |
| Week 10 | Mon 16 Apr | Patriots' Day, no class (student holiday) | | Pset 4 due; Pset 5 out |
| | Wed 18 Apr | Word embeddings | Levy, Goldberg, & Dagan, 2015 | |
| Week 11 | Mon 23 Apr | Implicit associations in word embeddings | Caliskan et al., 2017 | |
| | Wed 25 Apr | Recurrent neural network models for language | Collobert et al., 2011 | |
| Week 12 | Mon 30 Apr | What do neural networks learn about language structure? | Linzen et al., 2016 | Pset 5 due; Pset 6 out |
| | Wed 2 May | Statistical word learning in humans; modeling with nonparametric Bayes | Saffran et al., 1996; Goldwater, Griffiths, & Johnson, 2009 | |
| Week 13 | Mon 7 May | The emergence of syntactic productivity in language development | Meylan et al., 2017 | |
| | Wed 9 May (46-3189) | Bootstrapping syntactic acquisition | Abend et al., 2017 | |
| Week 14 | Mon 14 May | Pragmatics in language understanding. Scalar inference. The Rational Speech Acts model. | Goodman & Frank, 2016 (see also Frank & Goodman, 2012) | Pset 6 due |
| | Wed 16 May | Advanced pragmatics models: lexical uncertainty, scalar adjectives | Lassiter & Goodman, 2015 | |
| | TBD | Final exam | | |

| Work | Grade percentage (9.19) | Grade percentage (9.190) |
|---|---|---|
| A number of homework assignments throughout the semester | 50% | 37.5% |
| A midterm exam | 20% | 15% |
| A final exam | 30% | 22.5% |
| If you are enrolled in 9.190, a final project | -- | 25% |

Active participation in the class is also encouraged and taken into account in borderline grade cases!
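As a rough illustration of how the percentages above combine, here is a minimal Python sketch. It assumes each component (homework, midterm, final exam, and for 9.190 the final project) has already been reduced to a single score on a 0-100 scale; how individual psets are averaged, and how participation affects borderline cases, is not modeled. The names `WEIGHTS` and `weighted_course_score` are hypothetical and are not part of any official grading tool.

```python
# Weights (in percent) copied from the grading breakdown. The dictionary keys
# and the function name are hypothetical, chosen only for this illustration.
WEIGHTS = {
    "9.19": {"homework": 50.0, "midterm": 20.0, "final_exam": 30.0},
    "9.190": {
        "homework": 37.5,
        "midterm": 15.0,
        "final_exam": 22.5,
        "final_project": 25.0,
    },
}


def weighted_course_score(section, scores):
    """Combine component scores (each on a 0-100 scale) into a weighted total.

    Assumption: every component has already been summarized as one number.
    """
    weights = WEIGHTS[section]
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    # Each weight is a percentage, so divide the weighted sum by 100.
    return sum(weights[name] * scores[name] for name in weights) / 100.0


# Example with made-up scores for an undergraduate (9.19) student.
example_total = weighted_course_score(
    "9.19", {"homework": 90, "midterm": 80, "final_exam": 70}
)
```

Note that the 9.190 weights for homework, midterm, and final exam are the 9.19 weights scaled by 0.75, with the remaining 25% going to the final project.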

### 8.1 Homework late policy

Homework assignments can be turned in up to 7 days late; 10% will be deducted for each 24 hours of lateness (rounded up).
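The arithmetic of the late policy can be sketched in Python as follows. This is one reading of the policy, not an official calculator: it assumes "rounded up" means any started 24-hour period counts as a full one, that the deduction is in percentage points of the assignment's full credit, and that work more than 7 days late is not accepted. The function name `late_penalty_percent` is hypothetical.

```python
import math


def late_penalty_percent(hours_late, percent_per_day=10, max_days_late=7):
    """Return the percentage deducted, or None if the work is too late to accept.

    Assumptions (an interpretation of the stated policy): lateness is rounded
    up to whole 24-hour periods, and submissions beyond max_days_late are
    not accepted.
    """
    if hours_late <= 0:
        return 0  # on time: nothing deducted
    periods_late = math.ceil(hours_late / 24)  # round up to whole 24-hour periods
    if periods_late > max_days_late:
        return None  # past the 7-day window
    return percent_per_day * periods_late
```

Under these assumptions, a pset turned in 30 hours late counts as two 24-hour periods and loses 20%.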

### 8.2 Medical or personal circumstances impacting psets, exams, or projects

If medical or personal circumstances such as illness impact your work on a pset or project, or your ability to take an exam on the scheduled date with adequate preparation, please work with Student Support Services (S3) to verify these circumstances and be in touch with the instructor. We are happy to work with you in whatever way is most appropriate to your individual circumstances to help ensure that you are able to achieve your best performance in class while maintaining your health, happiness, and well-being.

## 9 Mailing list

There will be a mailing list for this course, which you can access at https://mailman.mit.edu:444/mailman/listinfo/9.19-2018-spring. Please make sure you're signed up for it! This list is both for discussion of ideas in the class and for communications about organizational issues.

Created: 2018-03-15 Thu 11:15
