Cog Sci Conf 

The annual Conference of the Cognitive Science Society took place in late July. Amid a slew of interesting debates and symposia, one paper stood out as having particularly interesting implications from the methodological perspective. The paper, by Navarro et al., is called "Modeling individual differences with Dirichlet processes" (pdf here).

 The basic idea is that many questions in cognitive and social science hinge on identifying which items (subjects, features, datapoints) belong to which groups.  The individual difference literature is replete with famous psychological theories along these lines: the factors contributing to IQ, the different "personality types", the styles of thought on this or that problem.  In cognitive science specifically, the process of classification and categorization - arguably one of the more fundamental of the mind's capabilities - is basically equivalent to figuring out which items belong to which groups.  Many existing approaches can capture different ways to assign subjects to groups, but in almost all of them the number of groups must be prespecified - an obvious (and large) limitation. 

A Dirichlet process is a "rich-get-richer" process: as new items are seen, each is assigned to an existing group with probability proportional to that group's size, with some nonzero probability (governed by a parameter alpha) of forming a new group. This naturally results in a power-law (Zipfian) distribution of group sizes, which parallels the natural distribution of many things in the world. It also often seems to form groups that match human intuitions about the "best" way to split things up. Dirichlet process models, often used in Bayesian statistics, have been around in machine learning and some niches of cognitive science for at least a few years. However, the Navarro article is one of the first I'm aware of that (i) examines their potential in modeling individual differences, and (ii) attempts to make them more widely known to a general cognitive science audience.
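To make the "rich-get-richer" dynamic concrete, here is a minimal sketch of the sequential view of a Dirichlet process (the so-called Chinese restaurant process). This is a generic illustration, not the specific model from the Navarro et al. paper; the function name and parameters are my own.

```python
import random

def crp(n_items, alpha, seed=0):
    """Simulate the "rich-get-richer" sequential view of a Dirichlet
    process (the Chinese restaurant process).

    Each new item joins an existing group with probability proportional
    to that group's current size, or starts a new group with probability
    proportional to alpha.
    """
    rng = random.Random(seed)
    counts = []       # counts[k] = number of items currently in group k
    assignments = []  # assignments[i] = group index of item i
    for i in range(n_items):
        # Total unnormalized weight: i items seen so far, plus alpha
        # for the option of opening a new group.
        r = rng.random() * (i + alpha)
        cum = 0.0
        for k, c in enumerate(counts):
            cum += c
            if r < cum:
                counts[k] += 1
                assignments.append(k)
                break
        else:
            # r fell in the alpha-sized slice: open a new group.
            counts.append(1)
            assignments.append(len(counts) - 1)
    return assignments, counts
```

Running this with a few hundred items typically yields a handful of large groups and a long tail of small ones, which is the power-law behavior described above.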

It's exciting to see more advanced Bayesian statistical models of this sort poke their way into cognitive science. As I think about how useful these can be, I have some questions. For instance, Navarro et al.'s model gives a more principled mechanism for figuring out how many groups best fit a set of data, but the exact number of groups identified is still dependent on the alpha parameter. Is this a severe limitation? Also, the "rich-get-richer" process is intuitive and natural in many cases, but not all groups follow power-law distributions. How might we use models with other processes (e.g., Gaussian process models) to assign items to an unspecified number of groups in ways that don't yield power-law distributions? I think we've only started to scratch the surface of the uses of this type of model, and I'm eager to see what happens next.
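On the alpha question: the dependence is quantifiable. Under the standard Dirichlet process, the expected number of groups after n items has a known closed form that grows roughly like alpha times log(n), so alpha directly scales how many groups you should expect to recover. A small sketch (the function name is my own):

```python
def expected_groups(n, alpha):
    """Exact expected number of groups after n items under a Dirichlet
    process with concentration parameter alpha:

        E[K] = sum_{i=0}^{n-1} alpha / (alpha + i)

    which grows roughly like alpha * log(n) for large n.
    """
    return sum(alpha / (alpha + i) for i in range(n))
```

For example, at n = 1000 the expected group count under alpha = 0.5 is several times smaller than under alpha = 5.0, which illustrates why the inferred number of groups cannot be read entirely off the data: the prior set by alpha always has a say.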