Incompatibility:  Are You Worried? 

 I’m a teaching fellow for a course in missing data this semester, and one topic keeps coming up peripherally in the course, even though we haven’t tackled it head-on just yet.  That topic is incompatible conditional distributions.  And here’s my question for blog readers:  how much does it bother you? 
 


 Reduced to its essence, here’s the issue.  Suppose I have a dataset with three variables, A, B, and C.  There are multiple missing data patterns, and suppose (although it’s not essential to the problem) that I want to use multiple imputation to create six or seven complete analysis datasets.  Suppose also that it’s very difficult to conceive of a minimally plausible joint distribution p(A, B, C).  Perhaps A is semi-continuous (e.g., income), B is categorical with 5 possible values, and C has support only over the negative integers.  What (as I understand it) is often done in this case is to assume conditional distributions, for example, p*(A|B, C), p*(B|A, C), and p*(C|A, B).  The idea is that one does a “Gibbs” with these three conditional distributions, as follows.  Find starting values for the missing Bs and Cs.  Draw missing As from p*(A|B, C).  Then draw new Bs from p*(B|A, C) using the newly drawn As and the starting Cs.  Then draw new Cs from p*(C|A, B) using the newly drawn As and Bs.  Continue as though you were doing a real “Gibbs.”  Stop after a certain number of iterations and call the result one of your multiply imputed datasets. 
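 For concreteness, here is a rough Python sketch of the cycling procedure just described.  Everything specific in it is an assumption made for illustration: the function name pseudo_gibbs, the three toy conditionals draw_A, draw_B, and draw_C, and their parameter values are all hypothetical.  It also holds each conditional fixed, whereas in practice the parameters of each conditional model would be estimated from the data.

```python
import random

def pseudo_gibbs(rows, conditionals, start, n_iter=10, seed=0):
    """Cycle through assumed conditionals to fill in missing values.

    rows: list of dicts; None marks a missing value.
    conditionals: dict mapping variable -> function(row, rng) that draws
        that variable given the current values of the others.
    start: dict mapping variable -> starting value for missing entries.
    """
    rng = random.Random(seed)
    missing = [{v for v in row if row[v] is None} for row in rows]
    # Work on copies so the observed data are never overwritten.
    filled = [{v: (start[v] if row[v] is None else row[v]) for v in row}
              for row in rows]
    for _ in range(n_iter):
        for var, draw in conditionals.items():  # A, then B, then C
            for row, miss in zip(filled, missing):
                if var in miss:
                    row[var] = draw(row, rng)
    return filled

# Hypothetical conditionals matching the variable types in the text.
def draw_A(row, rng):
    # Semi-continuous: a point mass at zero plus a lognormal part.
    if rng.random() < 0.3:
        return 0.0
    return rng.lognormvariate(0.1 * row["B"] + 0.05 * row["C"], 1.0)

def draw_B(row, rng):
    # Categorical on 1..5, with one category favored depending on A.
    favored = 1 + int(row["A"]) % 5
    weights = [2.0 if k == favored else 1.0 for k in range(1, 6)]
    return rng.choices(range(1, 6), weights=weights)[0]

def draw_C(row, rng):
    # Support on the negative integers: minus (1 + a geometric draw).
    return -(1 + int(rng.expovariate(0.2 * row["B"])))

rows = [
    {"A": 10.0, "B": None, "C": -3},
    {"A": None, "B": 2, "C": None},
    {"A": None, "B": None, "C": -1},
]
start = {"A": 1.0, "B": 1, "C": -1}
conditionals = {"A": draw_A, "B": draw_B, "C": draw_C}

imputed = pseudo_gibbs(rows, conditionals, start, n_iter=10, seed=0)
print(imputed)
```

 Calling pseudo_gibbs again with a different seed would give another imputed dataset.  Nothing in the code checks whether the three conditionals come from a common joint distribution; it runs either way.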

 The incompatibility problem is that there may be no joint distribution that has conditional distributions p*(A|B, C), p*(B|A, C), and p*(C|A, B).  Remember, (proper) joint distributions determine conditional distributions, but conditional distributions do not determine joint distributions, and in some cases, one can actually prove mathematically that no joint distribution has a particular set of conditionals.  If you ran your “Gibbs” long enough, eventually your draws would wander off to infinity or become absorbed into a boundary of the parameter space.  In other words, your computer would complain; exactly how it would complain depends on how you programmed it. 
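 A toy two-variable case (my own illustration, not one of the A, B, C conditionals above) shows the wandering behavior.  Take X|Y ~ N(y, 1) and Y|X ~ N(x, 1).  Any joint with these conditionals would need a density proportional to exp(-(x - y)^2 / 2), which does not integrate to a finite value over the plane, so no proper joint exists.  The sketch below, with hypothetical function names sweep and mean_square_x, runs many short and long chains from the same starting point and compares how far the draws have spread.

```python
import random

def sweep(x, y, rng):
    """One pass through the two conditionals X|Y ~ N(y, 1), Y|X ~ N(x, 1)."""
    x = rng.gauss(y, 1.0)
    y = rng.gauss(x, 1.0)
    return x, y

def mean_square_x(n_sweeps, n_chains=200, seed=1):
    """Average of x**2 over independent chains, each started at (0, 0)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_chains):
        x, y = 0.0, 0.0
        for _ in range(n_sweeps):
            x, y = sweep(x, y, rng)
        total += x * x
    return total / n_chains

short_run = mean_square_x(10)
long_run = mean_square_x(1000)
print(short_run, long_run)
```

 Each sweep adds two independent unit-variance steps, so the chain is a random walk and the spread of the draws keeps growing with the number of sweeps instead of settling down.  In this example the sampler never crashes; the draws simply drift without limit, which is one way the problem can stay hidden in a short run.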

 I confess this incompatibility problem bothers me more than it appears to bother some of my mentors.  If the conditional distributions are incompatible, then I KNOW that the "model" I’m fitting could not have generated the data I see.  It seems like even highly improbable models are better than impossible ones.  On the other hand, I am sympathetic to the idea of doing the best one can, and what else is there to do in (say) large datasets with multiple, complicated missing data patterns and unusual variable types? 

 How much does incompatibility bother you?