Pol Meth Conf II 

 Dan Hopkins, G4, Government (guest author) 

 Continuing with the matching theme on which I ended the post of two days ago, Alexis Diamond and Jas Sekhon presented a paper on genetic matching, which they argued is a significant improvement on past approaches.  One of the challenges of matching is to weight each of the covariates so as to produce the optimal set of matches.  Genetic matching uses a genetic algorithm to search across the set of possible weight matrices for the one that minimizes some loss function.  Of course, what exactly that loss function should be is debatable.  In Rawlsian fashion, Diamond and Sekhon argued that the algorithm should maximize the p-value of the balance test on the most unbalanced covariate, and Sekhon's software (the GenMatch function in his Matching package for R) does exactly that; a rough sketch of the idea appears below.

 In some applications, one could certainly imagine other loss functions; seeking the best possible balance on the most unbalanced covariate could jeopardize balance on the remaining covariates, a libertarian sort of rebuttal.  The discussion of the paper also raised the question of whether a p-value is the right criterion at all.  P-values conflate balance with sample size: the same covariate difference yields a larger p-value in a smaller sample, so an algorithm comparing p-values across candidate matched samples of different sizes could disproportionately favor the smaller ones.
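
 To make the mechanics concrete, here is a minimal Python sketch of the idea.  This is not Sekhon's actual software (GenMatch is written in R and uses the rgenoud genetic optimizer, with a richer set of balance tests); the data, the crude evolutionary loop, and all function names below are invented for illustration.

```python
# Toy sketch of genetic matching: evolve a diagonal weight vector for
# weighted nearest-neighbor matching, scoring each candidate by the
# smallest balance-test p-value across covariates (the Rawlsian,
# maximize-the-worst criterion described above).
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)

# Simulated data: 50 treated and 200 control units, 3 covariates.
X_t = rng.normal(0.3, 1.0, size=(50, 3))   # treated
X_c = rng.normal(0.0, 1.0, size=(200, 3))  # control

def match_controls(weights):
    """For each treated unit, pick the control unit minimizing the
    weighted squared Euclidean distance (matching with replacement)."""
    d = ((X_t[:, None, :] - X_c[None, :, :]) ** 2 * weights).sum(axis=2)
    return X_c[d.argmin(axis=1)]

def fitness(weights):
    """Minimum balance p-value across covariates after matching; larger
    is better.  The plain two-sample t-tests here ignore the matched-pair
    structure and reuse of controls -- purely illustrative."""
    matched = match_controls(weights)
    pvals = [ttest_ind(X_t[:, j], matched[:, j]).pvalue for j in range(3)]
    return min(pvals)

# Crude evolutionary search over the weight space: keep the best
# candidates, then mutate them with multiplicative noise.
pop = rng.uniform(0.1, 10.0, size=(30, 3))
for generation in range(20):
    scores = np.array([fitness(w) for w in pop])
    parents = pop[scores.argsort()[-10:]]          # keep the 10 best
    children = parents.repeat(2, axis=0) * rng.lognormal(0, 0.3, (20, 3))
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(w) for w in pop])]
print("best weights:", best, "worst-covariate p-value:", fitness(best))
```

 Even this toy version shows why the approach is computationally demanding: every candidate weight vector requires a full matching pass and a battery of balance tests before it can be scored.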

 Despite these questions, I buy Diamond and Sekhon's argument.  Genetic matching makes effective use of computing power to search a high-dimensional space for the most balanced sample that the data can provide.  In cases where there is insufficient overlap on the covariates, data analysts will learn this quickly rather than devoting weeks to a Holy Grail-style quest for optimal matches.  And in cases where there is sufficient overlap to support causal inferences, data analysts can be far more confident that they have attained the best possible balance, subject again to the caveats about the loss function.