6.863J/9.611J Natural Language Processing
 
 
 

Staff
Prof. Robert C. Berwick
berwick@csail.mit.edu
32-D728, x3-8918
Office hours: W 4:30-5:30pm

Course Support
Lisa Gaumond
lisag@mit.edu
32-D724, 617-324-1543
TAs:
Yuan Shen, yks@csail.mit.edu, 32-G442, x3-3043
Gabriel Zaccak, gabi@csail.mit.edu, 32-G360, x3-0081
Office hours: Tuesday 3pm-4:30pm and Friday 3pm-4:30pm, 32-363

Course Time & Place
Lectures: M, W 3-4:30 PM
Room: 32-144,  map

Level & Prerequisites
Undergrad/Graduate; 6.034 or permission of instructor

Policies
Textbooks & readings
Grading marks guide
Style guide

Course Description

A laboratory-oriented course in the theory and practice of building computer systems for human language processing, with an emphasis on how human knowledge of language can be integrated into natural language processing.

This subject qualifies as an Artificial Intelligence and Applications concentration subject.

Announcements:
• Please fill in the course evaluation, here: https://sixweb.mit.edu/student/evaluate/6.863-s2009 (live on May 11)
• Final projects are officially due Friday, May 15
• Lab 3 is posted here.
• [3/20] Over 100 final project ideas posted here. Remember: one paragraph on your team's project choice (perhaps a new one of your own) is due after Spring Break; email it to berwick@csail.mit.edu.
• [3/14] CGW results and test set will be released shortly! Thanks for playing! Congrats to the winning teams!
• [2/27] Laboratory 2 is now released, due March 13, here.
• [2/17] Reading & Response 2 is now released, due February 23, here.
• [2/13] Laboratory 1, part 2 is now released, due February 27, here. (Note change in due date.)
• [2/9] Laboratory 1, part 1 is now released, due February 18, here.
• [2/4] Please fill out the course questionnaire, so that we can schedule your free lunches, here.
• [2/4] For the first Reading & Response assignment and the later labs, you can use either Athena or your own computer; either way, install NLTK (the "Natural Language Toolkit"). Directions are provided in the first reading-and-response handout. The general NLTK web page is here.
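Once NLTK is installed, a quick sanity check along the following lines should run without errors (a minimal sketch; the model-download step assumes a current NLTK release rather than the 2009 one):

    # Sanity check: import NLTK, fetch the tokenizer models, tokenize a sentence.
    import nltk
    nltk.download('punkt')  # tokenizer models, needed once per machine
    tokens = nltk.word_tokenize("Colorless green ideas sleep furiously.")
    print(tokens)
    # ['Colorless', 'green', 'ideas', 'sleep', 'furiously', '.']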

Weeks 1 & 2: Fun NLP link of the week: Postmodernist paper generator. Try 'writing' a new paper by following this link.

Class days in blue, holidays in green, reg add/drop/final project dates in orange.

[Color-coded monthly calendars for February, March, April, and May 2009 appeared here.]
Course schedule at a glance
(Each entry lists the date and topic, then slides & reference readings, then laboratory/assignments.)
2/4 Weds
Introduction: walking the walk, talking the talk
Lecture 1 pdf slides; pdf 4-up; Jurafsky & Martin (JM), ch. 1.
JM ch. 4, pp. 1-8; review JM ch. 2 on finite-state automata/regular expressions if necessary.
If you don't know Python: NLTK book, chs. 1-3; otherwise, just chs. 2-3.
• Background Reading (for RR 1): Jurafsky & Martin on ngrams.
• Background Reading (for RR 1): Abney on statistics and language.
• Background Reading (for RR 1): Chomsky, extract on grammaticality, 1955.
• Background Reading: Russell & Norvig, ch. 22 (NLP).
Reading & response 1 out
(Ngrams; NLTK Python warmup; install NLTK; see the warmup sketch below)
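As a taste of the RR 1 warmup, here is a minimal bigram-counting sketch (illustrative only, assuming a current NLTK; the actual exercises are in the handout):

    # Count adjacent word pairs (bigrams) with NLTK's frequency distribution.
    from nltk import bigrams, FreqDist
    tokens = "the cat saw the cat".split()
    counts = FreqDist(bigrams(tokens))
    for pair, n in counts.most_common():
        print(pair, n)  # ('the', 'cat') 2, ('cat', 'saw') 1, ('saw', 'the') 1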

2/9 Mon
Ngrams; smoothing; word parsing & transducers; RR1 discussion
JM ch. 3; ch. 10, pp. 1-7
• Angluin, Induction of k-reversible automata
• Berwick & Pilato, Learning syntax by automata induction
• Background Reading: Karttunen, History of two-level morphology, 1996.
Reading & response 1 due MON
Lab 1, part 1 out

2/11 Weds
Word parsing I
Lecture 2 pdf slides; pdf 4-up
• Background Reading (RR 2): Harris, From phoneme to morpheme, 1955.
Lab 1, part 2 out Friday

2/17 Tues
Word-parsing complexity; what do children do?
Lecture 3 pdf slides; pdf 4-up
• Notes on finite-state automata and learning: Notes 1
• Background Reading (RR 2): Saffran, Statistical learning by 8-month-old infants, 1996.
• Background Reading: Yang, Universal Grammar, statistics or both?, 2004.

2/18 Weds
Part-of-speech tagging; finding words by MDL
Lecture 4 pdf slides; pdf 4-up
• NLTK book, part-of-speech tagging, ch. 4
Reading & response 2 out
Lab 1, part 1 due Weds
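For a first feel of tagging, a minimal sketch using NLTK's off-the-shelf tagger (the model names assume a current NLTK release, not course code):

    # Tag a tokenized sentence with NLTK's default English POS tagger.
    import nltk
    nltk.download('punkt')
    nltk.download('averaged_perceptron_tagger')
    tokens = nltk.word_tokenize("Time flies like an arrow.")
    print(nltk.pos_tag(tokens))
    # e.g. [('Time', 'NNP'), ('flies', 'VBZ'), ('like', 'IN'), ('an', 'DT'), ('arrow', 'NN'), ('.', '.')]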
2/23 Mon
RR2 discussion
Lecture 5 pdf slides; pdf 4-up
• JM, ch. 6
• NLTK book, parsing, ch. 8
Reading & response 2 due MON
2/25 Weds
Parsing & syntax I: dynamic programming
Lecture 6 pdf slides; pdf 4-up
• Russell & Norvig, ch. 23
• JM, ch. 11
• NLTK book, chart parsing; probabilistic parsing
Lab 1, part 2 due FRIDAY; Lab 2 out FRIDAY
3/2 Mon
Parsing & syntax II: dynamic programming
Lecture 7 pdf slides; pdf 4-up
• JM, ch. 12 draft (context-free parsing), pdf.
3/4 Weds
Earley parsing
Lecture 8 pdf slides; pdf 4-up
• Billot & Lang on 'packed parse forests', here. (Warning: advanced automata theory is required to understand this paper.)
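To see an Earley chart parser in action before the lecture, here is a toy sketch with NLTK's implementation (the grammar is invented for illustration, and the API assumes a current NLTK, not course code):

    # Parse a sentence with NLTK's Earley chart parser over a toy CFG.
    import nltk
    grammar = nltk.CFG.fromstring('''
        S -> NP VP
        NP -> Det N
        VP -> V NP
        Det -> 'the'
        N -> 'dog' | 'cat'
        V -> 'chased'
    ''')
    parser = nltk.parse.EarleyChartParser(grammar)
    for tree in parser.parse("the dog chased the cat".split()):
        print(tree)  # (S (NP (Det the) (N dog)) (VP (V chased) (NP (Det the) (N cat))))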




3/9 Mon
Competitive parsing

3/11 Weds
Competitive parsing discussion


3/16 Mon
Earley's algorithm; modern statistical parsers I
Lecture 9 pdf slides; pdf 4-up
• JM, 2nd edition, ch. 14, pdf.
• Background Reading: de Marcken, Lexical heads, phrase structure, and the induction of grammar, 1995.
• Background Reading: Collins, Head-driven statistical models for natural language parsing, 2003.


 
3/18 Weds
Modern statistical parsers II; evaluating Treebank parsers
Lab 2 due WEDS; Lab 3 out FRIDAY

3/30 Mon
Project topic selection paragraphs due
4/1 Weds
4/6 Mon
Learning syntax I: basic results
Lecture 11 pdf slides; pdf 4-up
• Background Reading: Levelt, Grammatical inference (brief); Pinker, Formal models of language learning.
• Background Reading: Gold, Language identification in the limit, 1967.



4/8 Weds
Learning syntax II
Lecture 12 pdf slides; pdf 4-up
• Background Reading: Berwick & Niyogi, A minimalist implementation of verb subcategorization, 2001.
• Background Reading: five papers on WordNet
Lab 3 due WEDS

4/13 Mon
Lexical semantics I: working with WordNet
Lecture 13 pdf slides; pdf 4-up
• NLTK docs, ch. 10
• JM, ch. 16
• JM, ch. 19
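A quick taste of working with WordNet through NLTK (a sketch; assumes a current NLTK plus the one-time 'wordnet' data download):

    # Look up senses (synsets) of 'dog' and print each gloss.
    import nltk
    nltk.download('wordnet')
    from nltk.corpus import wordnet as wn
    for syn in wn.synsets('dog')[:3]:
        print(syn.name(), '-', syn.definition())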


4/15 Weds
Semantics I: the lambda-calculus view
Lecture 14 pdf slides; pdf 4-up
• Background Reading: Perfors, Tenenbaum & Regier, Poverty of the stimulus? A rational approach, 2006.
• Background Reading: Smith, Learning the impossible, 1993.
• Background Reading: Fodor & Sakas, Statistics vs. UG in language learning, 2006.
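A tiny illustration of the lambda-calculus view, using NLTK's logic package (a sketch; the API assumes a current NLTK, and the predicate names are invented):

    # Build the meaning of "John sleeps" by applying the VP meaning to the NP meaning.
    from nltk.sem import Expression
    read = Expression.fromstring
    s = read(r'(\x.sleep(x))(john)').simplify()  # beta-reduce the application
    print(s)  # sleep(john)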


4/22 Weds
Semantics II: the lambda-calculus view
Lecture 15 pdf slides; pdf 4-up


4/27 Mon
Semantics III: quantifiers & discourse; answering questions
 
4/29 Weds
Learning semantics & answering questions
• Lecture 17 pdf slides; pdf 4-up
• Background Reading: Fodor, Is it a bird?, 2003.

5/4 Mon
Language Learning & Language Change
• Lecture 18 pdf slides; pdf 4-up
• Background Reading: Niyogi & Berwick, A language learning model for finite parameter spaces, 1996.
5/6 Weds
Language Learning & Language Change
• Lecture 19 pdf slides; pdf 4-up
• Background Reading: Niyogi & Berwick, A dynamical systems model for language change, 1997.
5/11 Mon
Evolution of language
• Lecture 20 pdf slides; pdf 4-up
• Background Reading: Hauser, Chomsky & Fitch, The faculty of language, 2002.
• Background Reading: Berwick, Syntax Facit Saltum, 2008.
 
5/13 Weds
Finale
• Lecture 21 pdf slides; pdf 4-up
Final group projects due FRIDAY 5/15
 

 
