9.S916: Statistical data analysis for scientific inference in cognitive science, Spring 2021
1 Class information
| Class Times | Mondays 11:00am–1:00pm (subject to change depending on class participant availability) |
| Lecture & Recitation Location | On Zoom! |
| Class website | https://canvas.mit.edu/courses/7745 |
| Syllabus | http://www.mit.edu/~rplevy/teaching/2021spring/9.S916 |
2 Instructor information
| Instructor | Roger Levy (rplevy@mit.edu) |
| Instructor's office | On Zoom! |
| Instructor's office hours | TBD |
3 Class Description
As the empirical domain of cognitive science research continues to grow, complex structured experimental materials and human responses play an increasingly important role, as does analysis of “out of the lab” datasets of naturally occurring human behavior. These new data sources offer great opportunities for the field, but often require complex statistical analysis techniques not extensively covered in core introductory classes to yield theoretical insights and preserve scientifically valid inferences. This class covers theory and practice of several analysis techniques of growing interest for cognitive science, including hierarchical/multilevel regression, generalized additive models, and causal inference. Examples will be primarily drawn from research in the cognitive science of language, but techniques and insights will be of wider relevance for research in cognitive science and allied disciplines. R will be the working programming language for the class.
4 Class organization
The synchronous sessions of this 6-unit class will include a combination of lecture, interactive discussion, review, and practicum sessions. I will also endeavor to offer pre-recorded lecture videos in advance of synchronous class meetings. Although I have taught statistical data analysis many times, I am incorporating substantial new content in this class, so there will be a fair amount of experimentation! Frequent feedback from class participants will be much appreciated.
5 Intended Audience
Class content will be of interest to students, postdocs, faculty, and researchers in cognitive science, psychology, and linguistics, and potentially to those in sister disciplines including neuroscience, computer science, political science, sociology, economics, and other social science fields. Previous background in statistics (at least a semester of probability & statistics, or equivalent hands-on experience in a research context) and experience with R or statistical programming in another language (Stata, SAS, Python) are strongly recommended.
6 Readings & Textbooks
The reading list will be developed over the course of the semester and will draw on readings from multiple fields. There is no ready-made textbook customized to the content we'll be covering, so keep in mind that different readings may vary in the terminology, notation, and framing used for the same fundamental concepts. I will also endeavor to provide written lecture notes throughout the semester.
7 Syllabus (very much subject to modification!)
| Week | Day | Topic |
|---|---|---|
| Week 1 | Mon Feb 22 | Introductory concepts in probability, statistics, and causation |
| Week 2 | Mon Mar 1 | Frequentist and Bayesian methods; likelihood ratio test; Markov chain Monte Carlo; Bayes factors |
| Week 3 | Tue Mar 9 | Generalized linear models; model selection; exploratory vs confirmatory analysis |
| Week 4 | Mon Mar 15 | Model class and contrasts; scientifically interpretable parameter estimation for regression |
| Week 5 | Mon Mar 22 | Student holiday, no class |
| Week 6 | Mon Mar 29 | Mixed effects (hierarchical, multi-level) models; connection with repeated-measures ANOVA |
| Week 7 | Mon Apr 5 | Testing for generalization at multiple levels for hierarchically organized data; Keeping It Maximal |
| Week 8 | Mon Apr 12 | Practicum session |
| Week 9 | Mon Apr 19 | Patriots' Day, no class |
| Week 10 | Mon Apr 26 | Generalized Additive Models |
| Week 11 | Mon May 3 | Causal effects and preserving scientific interpretability: collider bias, backdoor paths, and more |
| Week 12 | Mon May 10 | Moving toward data "in the wild": natural experiments; imbalanced datasets; instrumental variables |
| Week 13 | Mon May 17 | Practicum session |
| | Thu May 20 | Class projects due 10pm EDT per the MIT Calendar |
8 Requirements & grading
You need to:
- do the readings and watch any pre-recorded videos in advance;
- show up to and participate in synchronous class discussions;
- do problem sets that I will assign throughout the semester;
- complete a short class project due on Thursday, May 20 (more info on this will be forthcoming).