Pol Meth Conf I 

 The 14 papers presented at the 2005 Conference of the Society for Political Methodology, held July 21st-July 23rd in heavily air-conditioned rooms at Florida State, provided plenty of good fodder for discussion.  I will focus on several I found especially provocative--and on which I could reasonably comment--in blog posts over the next few days. 

 Starting off the conference, Gary King's presentation of his paper "Death by Survey: Estimating Adult Mortality without Selection Bias" (with Emmanuela Gakidou) argued that we need a new approach to estimating death rates in the many countries that lack vital registration systems.  The dominant approach at present assumes that mortality rates do not vary with family size, but given the uneven pace of development in so many countries, that seems a heroic assumption, and their paper shows it is completely wrong empirically.  King and Gakidou's approach involves two fixes: first, weighting to deal with the over-sampling of families with more surviving children during the observation period (since samples are drawn in proportion to survivors rather than those alive at the start of the period), and second, extrapolation to deal with the fact that families with no surviving children are excluded from the sample entirely.  The first problem is fixed exactly by weighting; the second requires assumptions beyond the range of the data.  Some discussion focused on one of the main challenges to this second fix--that it involves extrapolation, extrapolation based on a small number of data points, and extrapolation based on a quadratic term.  The paper deals with the danger of extrapolation through replication across datasets: by showing that the relationship between mortality and the number of siblings has a constant shape across a wide range of countries, King and Gakidou argue that we can be reasonably confident about the fit of the curve from which we are extrapolating.  The authors are now gathering survey data to replicate this approach in cases where we know the answer--that is, where we also have accurate, non-survey data on mortality rates. 
That is especially critical since families without any surviving children might be disproportionately the victims of wars or other violence, for instance, making it challenging to use data about families with surviving children to make inferences about families without any surviving children. 
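To make the weighting fix concrete, here is a small simulation I put together; the family sizes, mortality rates, and sampling mechanism are all invented for illustration, not taken from the paper.

```python
import random

random.seed(0)

# Invented population: each family starts with `size` children, and each
# child dies during the window at a family-size-dependent rate (larger
# families face higher mortality, mirroring the paper's motivation).
population = []
for _ in range(50000):
    size = random.randint(1, 8)
    rate = 0.05 + 0.02 * size
    deaths = sum(random.random() < rate for _ in range(size))
    population.append((size, deaths))

true_rate = (sum(d for _, d in population) /
             sum(s for s, _ in population))

# The survey reaches a family in proportion to its SURVIVORS, so
# high-mortality families are under-sampled and families with no
# survivors are never sampled at all.
sample = [(s, d) for s, d in population if random.random() < (s - d) / 8]

# Naive estimate: deaths per child in the raw (size-biased) sample.
naive = sum(d for _, d in sample) / sum(s for s, _ in sample)

# Weighting fix: weight each sampled family by 1 / survivors to undo the
# size-biased inclusion probability.
weighted = (sum(d / (s - d) for s, d in sample) /
            sum(s / (s - d) for s, d in sample))

# The weighted estimator targets families with at least one survivor;
# reaching families with zero survivors still requires extrapolation.
true_restricted = (sum(d for s, d in population if s > d) /
                   sum(s for s, d in population if s > d))

print(round(true_rate, 3), round(naive, 3), round(weighted, 3))
```

In this setup the naive estimate understates mortality, the weighted estimate recovers the death rate among families with at least one survivor, and the remaining gap to the full-population rate is exactly the part that only extrapolation can close.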

 Another early paper came from Kevin Clarke, who argued that political science as a discipline has become too worried about omitted variable bias.  Clarke revisited the familiar theoretical omitted variable bias result and pointed out that, contrary to conventional wisdom, including additional variables can under certain circumstances exacerbate problems of omitted variable bias.  Those circumstances arise when something else is wrong: omitting a variable that is causally prior to the treatment, correlated with it, and affects the outcome will bias inferences in a predictable direction, and including it will reduce bias--but only when the other modeling assumptions are correct.  If you have five things wrong with your model and you fix four, it is at least possible that you will make things worse. 
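This is not Clarke's own example, but one stylized way "controlling for more" can backfire is conditioning on a variable caused by both the treatment and the disturbance (a collider). A sketch, with an invented data-generating process:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Invented DGP: T affects Y with true coefficient 2.0; Z is caused by
# both T and the unobserved disturbance U, so it looks like a plausible
# control but is actually a collider.
T = rng.normal(size=n)
U = rng.normal(size=n)
Y = 2.0 * T + U
Z = T + U + rng.normal(size=n)

def ols(covariates, y):
    """Least-squares coefficients with an intercept column prepended."""
    X = np.column_stack([np.ones(len(y)), *covariates])
    return np.linalg.lstsq(X, y, rcond=None)[0]

short = ols([T], Y)[1]     # Y ~ T:     consistent for the true 2.0
long = ols([T, Z], Y)[1]   # Y ~ T + Z: biased by conditioning on Z

print(round(short, 2), round(long, 2))
```

Here the "short" regression is already right, and adding the extra control pulls the estimated effect of T away from the truth (toward 1.5 in this parameterization), which is the spirit of Clarke's warning.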

 In my view, the sociology of the discipline embedded in Clarke's presentation was right on.  In substantive presentations, it is incredibly common for presenters to be barraged with questions of this form: "Did you account for [insert favorite variable]?  What about [insert second favorite variable]?  Or maybe [insert random variable that no one before or since has ever heard of]?"  Reviewers, too, seem to find this an easy way to respond to articles.  One way to deal with this problem--sensitivity tests--was highlighted during the discussion.  Our models are almost never perfectly specified, so there will always be omitted variables, and knowing how those variables would have to look to overturn a result is a good (if incomplete) start.  One example of how these kinds of sensitivity analyses might work, by the way, is David Harding's 2003 "Counterfactual Models of Neighborhood Effects: The Effect of Neighborhood Poverty on Dropping Out and Teenage Pregnancy" (American Journal of Sociology 109(3): 676-719).  Another is Paul Rosenbaum and Don Rubin's 1983 "Assessing Sensitivity to an Unobserved Binary Covariate in an Observational Study with Binary Outcome" (Journal of the Royal Statistical Society, Series B 45: 212-218). 
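A back-of-the-envelope version of the idea can be done with nothing more than the textbook omitted-variable-bias algebra (the Rosenbaum-Rubin paper works with a likelihood model; this is just the simplest sketch, and the numbers are hypothetical):

```python
# If the true model is Y = b*T + g*U + e and the omitted variable U
# satisfies U = d*T + v, then regressing Y on T alone converges to
# b + g*d.  So the confounding product g*d measures how much of an
# estimate an omitted variable could account for.
b_hat = 0.40   # hypothetical estimated treatment effect

# For each assumed confounder-treatment slope d, how large an effect g
# on the outcome would the omitted variable need to explain away b_hat?
thresholds = {d: b_hat / d for d in (0.1, 0.25, 0.5)}

for d, g in thresholds.items():
    print(f"d = {d}: omitted variable needs effect g = {g:.2f} on Y "
          "to drive the estimate to zero")
```

Reporting a table like this lets readers judge for themselves whether a confounder strong enough to overturn the result is plausible in the substantive setting.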

 One other case against overly saturated models--one that did not come up in the discussion but is probably familiar to many--is the challenge of thinking in terms of conditional effects as the number of variables increases.  For instance, with vote choice as our dependent variable, I understand what it means to talk about the impact of income conditional on race, but it is much harder to know what it means to speak of the impact of income conditional on ten other, inter-correlated variables.  This problem becomes all the more difficult when we remember that we are conditioning not just on the inclusion of certain variables but also on the functional form specified for them. 

 Because I am a Harvard graduate student, I should also play to type and say something briefly about how matching (which these days is well-represented in hallway conversations at IQSS) relates to omitted variables.  Obviously, it is no panacea, as unobserved confounders can be just as troublesome as in more conventional models.  But there is one way in which matching adds value here.  When we are matching units about which we have information not quantified in our dataset, looking at the list of matched pairs can help identify omitted variables.  If, say, we are studying countries and see that our observed variables wind up pairing Ethiopia and Greenland, we can use that pairing to think through what kinds of unobserved variables might be potential confounders. 
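The diagnostic is as simple as it sounds. A minimal nearest-neighbor matching sketch (the countries and covariate values below are invented for illustration) where the payoff is the last step, eyeballing the matched pairs:

```python
# Hypothetical treated and control countries with two standardized
# observed covariates each (values are made up).
treated = {"Ethiopia": (0.2, 0.1), "Brazil": (0.7, 0.6)}
control = {"Greenland": (0.25, 0.12), "Argentina": (0.65, 0.55),
           "Chad": (0.3, 0.15)}

def dist(a, b):
    """Euclidean distance between two covariate vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# Match each treated unit to its nearest control on the observables.
pairs = {t: min(control, key=lambda c: dist(tv, control[c]))
         for t, tv in treated.items()}

for t, c in pairs.items():
    print(f"{t} matched to {c}")  # scan this list for implausible pairs
```

On the observed covariates alone, Ethiopia lands next to Greenland; seeing that pair in print is what prompts the question of which unquantified differences between the two the model is ignoring.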

 Dan Hopkins, G4, Government (guest author)