Class description

This class is a practical introduction to statistical modeling and experimental design, intended to provide essential skills for doing research. We'll cover basic techniques (e.g., hypothesis testing and regression models) for both traditional experiments and newer paradigms such as evaluating simulations. Students with research projects will be encouraged to share their experiences and project-specific questions.

Students are expected to attend class and participate in discussions. Coursework will consist of two "practicals"—analyzing simple datasets to solidify core concepts—and two "case studies"—critical reading assignments of actual articles. Each assignment should take roughly one hour. Students are welcome to work in groups, but each student must submit an individual write-up in his or her own words. If you do work in a group, please also indicate with whom you worked. To pass, students must get a check/check+ on all assignments.

Finally, as this class is meant to be practical, we welcome any suggestions on topics and teaching style that will help you gain more from this course.

What will you get out of this class?

By the end of the class, you'll be able to:

Schedule

Due to the snowstorm and MIT closing on Tuesday, Jan 27, all classes from then on will be pushed back by one day. Note that Practical 2 will still be due on Tuesday by email!

| Week | Date | Topic | Assignments due | Notes |
|------|------|-------|-----------------|-------|
| 1 | Tue Jan 20 | Introduction to statistics terminology, exploratory data analysis, and important distributions | | [PDF] (last updated 1/19) |
| | Wed Jan 21 | Confidence intervals and hypothesis testing | Paragraph on your research interests | [PDF] (last updated 1/19) |
| | Thu Jan 22 | Linear regression | | [PDF] (last updated 1/20) |
| | Fri Jan 23 | Regression diagnostics, advanced regression topics | | [PDF] (last updated 1/25) |
| 2 | Mon Jan 26 | Nonparametric tests, model fitting | Practical 1 | [PDF] (last updated 1/25) |
| | Wed Jan 28 (moved from Tue Jan 27) | Categorical data | Practical 2 (due by email no later than noon on Tuesday Jan 28) | [PDF] (last updated 1/26) |
| | Thu Jan 29 (moved from Wed Jan 28) | Experimental design | | [PDF] |
| | Fri Jan 30 (32-124), moved from Thu Jan 29 | Machine learning, predictive analytics | Case study writeups | |

Practicals

Each practical involves carrying out some statistical analysis on small, real-world datasets. You may use any software to complete the assignments; all the data is in comma-separated (CSV) format, which should be readable by most software packages. If you do not already have a favorite, we encourage you to try R, which is available on any Athena machine. We're also familiar with Python, MATLAB, Julia, and Excel/OpenOffice. Outside of those, we'll do our best to help, but can't promise to get you unstuck. Finally, keep in mind that in most cases each analysis will be a single line of R code; rarely will it be more than five. Please contact us if you find yourself getting bogged down in trying to run the analyses.
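To give a rough sense of how small these analyses tend to be, here is a minimal Python sketch that reads comma-separated data and compares two groups. Everything in it is an assumption made for illustration: the data values, the column names `group` and `score`, and the helper names `read_groups` and `welch_t` are invented and do not come from the course datasets. It uses only the Python standard library and computes Welch's t statistic by hand; it does not compute a p-value.

```python
import csv
import io
import math
import statistics

# Hypothetical data standing in for one of the CSV files;
# the column names and values are invented for illustration only.
CSV_TEXT = """group,score
control,4.1
control,3.8
control,4.4
control,4.0
treatment,4.9
treatment,5.2
treatment,4.7
treatment,5.0
"""


def read_groups(text):
    """Parse CSV text into a dict mapping group name -> list of float scores."""
    groups = {}
    for row in csv.DictReader(io.StringIO(text)):
        groups.setdefault(row["group"], []).append(float(row["score"]))
    return groups


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


groups = read_groups(CSV_TEXT)

# Exploratory summary: sample size, mean, and standard deviation per group.
for name, values in groups.items():
    print(name, "n =", len(values),
          "mean =", round(statistics.mean(values), 3),
          "sd =", round(statistics.stdev(values), 3))

t_stat = welch_t(groups["treatment"], groups["control"])
print("Welch t statistic:", round(t_stat, 2))
```

To use it on a real file, you would replace `io.StringIO(CSV_TEXT)` with an open file handle. In R, the same comparison is roughly `read.csv` followed by `t.test`, which is why most analyses fit on a line or two.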

In your write-up (feel free to use bullet points/keep it brief), make sure you explain your reasoning for the tests that you ran and the parameter settings that you used. Also explain and interpret the results of any exploratory data analysis and statistical inference. Include relevant plots and output to back up your claims; however, we don't want to just see loads of print-outs! Your job is to provide succinct summaries of your analysis, not just copy-paste the computer output.

Additional pointers for those using R: This short reference card contains a quick-lookup list of a lot of common functions. If you need more extensive data manipulation, this card is also a good reference. We've also listed the key commands/syntax you'll need for the assignments here.

These assignments should be handed in at the start of class on the day they're due.

Case studies

Review two of the articles listed below, or of your own choosing. Each review should be no more than one page. Lists, bullet points, etc. are fine as long as your writing is clear. Reviews should consist of:

  • Summary: What was the objective of the study? Summarize the hypothesis, design methodology, analysis approach, and major findings. (This is to check whether you understood the study.)
  • Experimental Design: Was the experimental design appropriate for the study? Provide your reasoning for both sound and unsound aspects.
  • Statistical Analysis: Was the statistical analysis sound? Provide your reasoning for both sound and unsound aspects.

Case study papers: pick two from this list or select your own. If you choose your own, you should be able to find at least one sound aspect and at least one unsound aspect of the paper's statistical and design methodology.

The case studies should be turned in by Friday, January 30 at 11:59 by emailing your writeups to iap-stats AT mit DOT edu.

Acknowledgments