Bayesian Models of Human Learning and Reasoning: A Recap 

An MIT tag team of Prof. Josh Tenenbaum and his graduate student Charles Kemp presented their research to the IQSS Research Workshop on Wednesday, October 19. The overarching topic of Prof. Tenenbaum's research is machine learning; one major aspect of this work is their method of characterizing the structure of the domain to be learned.

For example, it has made sense for hundreds of years to identify forms of life taxonomically according to a tree structure, so that the closeness of two species can be compared; it also makes some sense to rank species on an ordered scale by some other characteristic (one example presented was how jaw strength could be used to generalize to total strength). The presenters then showed how Bayesian inference can be used to determine which organizational structures are best suited to which systems, based on a set of covariates corresponding to certain observable features; the inferred structure can then be used to make other comparisons that might not be as evident, such as predictions about immune system behavior.

What confused me for much of the time was their insistence that they could use the data to decide on a prior distribution, an idea that set off some alarms: I have been under the strongest of directives from professors to keep the prior distribution limited to prior knowledge. My current understanding is that the following method is used:

1. Choose a family of structures to examine, such as trees, rings, or cliques (all of which, notably, kindergartners can learn rather quickly).

2. Conduct an analysis in which the prior distribution assigns equal probability to every possible structure of that type.

3. Repeat this with the other relevant families. The analyses with the most favorable results (the highest marginal likelihoods) then correspond to the most likely structure family.

4. Conduct further research on the system with the knowledge that one structure family is best suited to describing it.
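The steps above amount to Bayesian model comparison over structure families. Here is a minimal sketch of that idea, not the presenters' actual model: I invent a toy "smoothness" likelihood in which features that rarely disagree across a structure's edges are more probable, and hand-pick two tiny candidate families (the structures, data, and `beta` parameter are all hypothetical illustrations). With a uniform prior over the structures within a family, the family's marginal likelihood is just the average of its structures' likelihoods, and the family with the higher value wins.

```python
import itertools
import math

def feature_loglik(values, edges, beta=1.0):
    """Toy log-likelihood of one binary feature vector under a graph
    'structure': assignments with fewer disagreeing edges score higher."""
    def score(v):
        return -beta * sum(v[i] != v[j] for i, j in edges)
    # Normalize over all 2^n assignments (feasible only for tiny n).
    log_z = math.log(sum(math.exp(score(v))
                         for v in itertools.product([0, 1], repeat=len(values))))
    return score(values) - log_z

# Hypothetical families: each is a small list of candidate edge sets
# over 4 objects (a real analysis would enumerate or search many more).
families = {
    "chain": [[(0, 1), (1, 2), (2, 3)], [(0, 2), (2, 1), (1, 3)]],
    "ring":  [[(0, 1), (1, 2), (2, 3), (3, 0)],
              [(0, 2), (2, 1), (1, 3), (3, 0)]],
}

# Toy data: three features that vary smoothly along the chain 0-1-2-3.
data = [(0, 0, 1, 1), (0, 0, 0, 1), (1, 1, 1, 0)]

def family_logml(structs):
    """Steps 2-3: uniform prior over structures within the family, so the
    family's marginal likelihood averages over its structures."""
    per_struct = [sum(feature_loglik(f, s) for f in data) for s in structs]
    m = max(per_struct)  # log-sum-exp for numerical stability
    return m + math.log(sum(math.exp(x - m) for x in per_struct) / len(structs))

for name, structs in families.items():
    print(name, round(family_logml(structs), 3))
```

On this data the chain family scores higher, matching the intuition that the features were generated smoothly along a path rather than a cycle; swapping in data that wraps around would reverse the verdict.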

While I'm not as comfortable with their use of a data-driven prior distribution as they'd like, the authors seem sensitive enough to this concern: they keep the actual structures separate and use the data only to confirm their heuristic interpretations of the structures at hand, which sets me more at ease.

Now, the key to this research is that it is a model for human learning, and wouldn't you know it, we're still better at it than computers. But I'm very encouraged by the direction in which this is heading, and I'm looking forward to later reports from the Tenenbaum group.