A Unified Theory of Statistical Inference? 

 If inference is the process of using data we have in order to learn about data we do not have, it seems obvious that there can never be a proof that anyone has arrived at the "correct" theory of inference.  After all, the data we have might have nothing to do with the data we don't have.  So all the (fairly religious) attempts at unification -- likelihood, Bayes, Bayes with frequentist checks, bootstrapping, etc., etc. -- each contribute a great deal, but they are unlikely to constitute The Answer.  The best we can hope for is an agreement, or a convention, or a set of practices that are consistent across fields.  But getting people to agree on normative principles in this area is not obviously different from getting them to agree on the normative principles of political philosophy (or any other normative principles).  It just doesn't happen, and even if it did, the result would have merely the status of a compromise rather than the correct answer, the latter being impossible.

 Yet there is a unifying principle that would represent real progress for the field: we will know that something like unification has occurred when we can distribute the same data, and the same inferential question, to a range of scholars who hold different theories of inference -- theories that go by different names, use different conventions, and are implemented in different software -- and they all produce approximately the same empirical answer.

 We are not there yet, and there are some killer examples where the different approaches yield very different conclusions, but there does appear to be some movement in this direction.  The basic unifying idea, I think, is that all theories of inference require some assumptions, but we should never take any theory of inference so seriously that we don't stop to check the veracity of its assumptions.  The key point is that conditioning on a model does not work, since of course all models are wrong, and some are really bad.  What I notice is that most of the time, you can get roughly the same answers using (1) likelihood or Bayesian models with careful goodness-of-fit checks and adjustments to the model if necessary, (2) various types of robust, semi-parametric, etc. statistical methods, (3) matching used to preprocess data that are later analyzed or further adjusted by parametric likelihood or Bayesian methods, (4) Bayesian model averaging, with a large enough class of models to average over, (5) the related "committee methods," (6) mixture-of-experts models, and (7) some highly flexible functional forms, like neural network models.  Done properly, these will all usually give similar answers.
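 To make this concrete, here is a minimal sketch (in Python, using numpy and scipy; the data-generating process, sample size, and tuning constants are all my own illustrative assumptions, not anyone's canonical procedure) that hands the same simulated data and the same quantity of interest -- E[y | x = 1] -- to three of the approaches above:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n = 500
    x = rng.uniform(-2.0, 2.0, n)
    y = 1.0 + 2.0 * x + rng.standard_t(df=3, size=n)  # heavy-tailed noise

    # (1) Normal-likelihood fit (OLS), plus a goodness-of-fit check
    # on the residuals that would prompt a model adjustment
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (intercept + slope * x)
    _, gof_p = stats.shapiro(resid)      # small p flags the heavy tails
    est_parametric = intercept + slope * 1.0

    # (2) A robust method: Theil-Sen, the median of pairwise slopes
    ts = stats.theilslopes(y, x)
    est_robust = ts[1] + ts[0] * 1.0     # intercept + slope * 1

    # (3) A flexible, nearly model-free fit: average the k nearest
    # neighbors of x = 1
    k = 25
    nearest = np.argsort(np.abs(x - 1.0))[:k]
    est_knn = y[nearest].mean()

    print(f"parametric: {est_parametric:.2f}  "
          f"robust: {est_robust:.2f}  knn: {est_knn:.2f}")
    print(f"residual normality p-value: {gof_p:.3g}")

 On data like these, all three estimates typically land near the true value of 3 and close to one another, while the residual check in (1) flags the heavy tails and tells the parametric modeler to adjust the model (say, to a t likelihood) -- which is exactly the pattern of agreement-after-checking described above.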

 This is related to Xiao-Li Meng's self-efficiency result: the rule that "more data are better" only holds under the right model.  Inference can't be completely automated for most quantities, and we typically can't make inferences without some modeling assumptions, but the answer won't be right unless the assumptions are correct, and we can't ever know that the assumptions are right.  That means that any approach has to come to terms with the fact that some of the data might not be right for the given model, or the model might be wrong for the observed data.  Each of the approaches above has an extra component that tries to get around the problem of incorrect models.  This isn't a unification of statistical procedure, or a single unified theory of inference, but it may be leading to a unification of the results of many diverse procedures, as we take the intuition from each area and apply it across them all.
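 A toy simulation (my own illustration with arbitrary numbers, not Meng's formal construction) makes the point concrete: an analyst who assumes all observations share one normal distribution, and so uses the grand mean, does worse with 200 observations, half of them from a contaminating process, than with 100 clean ones:

    import numpy as np

    rng = np.random.default_rng(1)
    theta = 0.0                     # estimand: mean of the clean process
    mse_small, mse_big = [], []
    for _ in range(2000):
        clean = rng.normal(theta, 1.0, 100)        # fits the assumed model
        extra = rng.normal(theta + 1.5, 1.0, 100)  # model is wrong for these
        mse_small.append(clean.mean() ** 2)        # n = 100, clean only
        mse_big.append(np.concatenate([clean, extra]).mean() ** 2)  # n = 200
    print(f"MSE with n=100 (clean):        {np.mean(mse_small):.3f}")
    print(f"MSE with n=200 (clean+extra):  {np.mean(mse_big):.3f}")

 Under the correct model the larger sample would of course help; here the extra data push the mean squared error from about 0.01 to about 0.57.  More data are better only under the right model.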