EM and Multi-level Models

One of the purposes of this blog is to let us share quantitative problems we’re currently considering. Here’s one that arose in my research, and I’d love any comments and suggestions readers might have: can one apply the EM algorithm to handle missing data in multi-level models?


Schematically, the problem I ran into is as follows. A_ij | B_i follows some distribution, call it p1_i, and I had n_i observations of A_ij. Each A_ij was a random vector, and some components of some observations were missing. B_i | C follows some other distribution, call it p2. Suppose I’m a frequentist and I want to make inferences about C. The problem I kept running into was that I couldn’t figure out how to use EM without integrating the B_i’s out of the likelihood, a mathematical task that exceeded my skills. I ended up switching to a Bayesian framework and using a Gibbs sampler, i.e., alternately drawing from the distribution of the missing data given the current values of the parameters, then from the distribution of the parameters given the now-complete data. But I couldn’t help wondering: are hard-nosed frequentists just screwed in this situation? Do they have to resort to something like Newton-Raphson, or is there an obvious way to use EM that I just missed?
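To make the Gibbs scheme described above concrete, here is a minimal sketch on a toy version of the problem. Everything specific in it is an assumption made for illustration, not the model from my research: A_ij is a scalar rather than a vector, p1_i is N(B_i, 1), p2 is N(C, 1), C gets a flat prior, and every group has the same number of observations. The function name `gibbs_sampler` and its arguments are hypothetical too.

```python
import numpy as np


def gibbs_sampler(A, n_iter=2000, burn_in=500, seed=0):
    """Toy data-augmentation Gibbs sampler (illustrative assumptions only).

    Assumed model: A_ij | B_i ~ N(B_i, 1), B_i | C ~ N(C, 1), flat prior on C.
    A is an (m, n) array in which np.nan marks a missing observation.
    Returns the post-burn-in draws of C.
    """
    rng = np.random.default_rng(seed)
    A = np.array(A, dtype=float)
    missing = np.isnan(A)
    m, n = A.shape

    # Crude starting values: fill the holes with the overall observed mean.
    A_fill = np.where(missing, np.nanmean(A), A)
    B = A_fill.mean(axis=1)
    C = B.mean()

    draws = []
    for it in range(n_iter):
        # Step 1: draw the missing data given the current parameters.
        B_rep = np.broadcast_to(B[:, None], A.shape)
        A_fill[missing] = rng.normal(B_rep[missing], 1.0)

        # Step 2a: draw each B_i given the now-complete data and C.
        # Normal-normal conjugacy: precision n + 1, mean (sum_j A_ij + C) / (n + 1).
        post_mean = (A_fill.sum(axis=1) + C) / (n + 1)
        B = rng.normal(post_mean, 1.0 / np.sqrt(n + 1))

        # Step 2b: draw C given the B_i's (flat prior gives N(mean(B), 1/m)).
        C = rng.normal(B.mean(), 1.0 / np.sqrt(m))

        if it >= burn_in:
            draws.append(C)

    return np.array(draws)
```

Calling `gibbs_sampler(A)` on an array with `np.nan` in the missing cells returns draws whose mean and spread summarize the posterior for C under these toy assumptions. Note that in this Bayesian version the B_i's are simply sampled along with everything else, so the integral that blocked me in the frequentist setting never has to be done by hand.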