Stories and statistics 

 Lately I've been thinking a lot (and writing a little) about ways to combine the qualitative and quantitative empirical traditions in political science, so I was quite interested to read a new post on the philosophy blog at the New York Times, written by mathematician John Paulos.  He contrasts the logic of storytelling with the logic of statistics to draw out some interesting implications for how each mode of understanding colors the way we think about the world.

 In a sentence that could have come out of a "scope and methods" text, Paulos identifies the fundamental difference between literary and statistical traditions: "The focus of stories is on individual people rather than averages, on motives rather than movements, on point of view rather than the view from nowhere, context rather than raw data."  I think this is an accurate description of how two empirical cultures in social science have developed, but I disagree that the divide between them is inherent.

This may be unorthodox, but I don't see statistics as inherently "quantitative" or focused on the "general" rather than the "particular".  I see statistics as a relatively young field attempting to develop answers to the question "how should I go about formulating my beliefs about the world now that I've observed some part of it?"  Eventually, statistics will need to offer advice on how to update our picture of the world after observing any type of information -- not just information that comes from randomized experiments, fits neatly in rectangular matrices, or involves enough "N" for some central limit theorem to hold.

 Narrative research seems ideally suited to work with the types of information that traditional statistics has largely ignored.  Why then should statistics take up the task? Narratives are rich with data but researchers using narrative methods have little advice on how to make inferences from these data.  In the richest of literary narratives this ambiguity enhances the text, allowing the reader to reach many conclusions about the meaning and implications of a work.  In empirical social science, this ambiguity can become a liability.  If statisticians spent more time developing ways of making appropriate inferences from data in these settings -- frankly the most common settings that we face -- it might lessen this ambiguity by offering a clear set of rules for mapping complex narrative data to inference. 

 My hunch is that the people who work with data that lend themselves to narrative research already have ideas about the best practices for making valid inferences from these data.  Perhaps we should be more interested in learning to speak statisticians' language so that we can suggest these insights to them and they in turn can suggest refinements for us.  This exchange would help statisticians develop a science of inference and help us develop knowledge of social phenomena.