Relative Distributions 

When comparing how different groups fare on a particular measure (for example, the life expectancy of immigrants versus native-born individuals, or the wages of workers in 1950 versus 2000), we often focus on the difference in the averages of the two distributions. Sometimes we also examine disparities in distributional spread, asking whether one group's outcomes are more variable than the other's. Of course, summarizing distributions with one or two parameters discards a lot of potentially useful information. Enter Relative Distribution Methods in the Social Sciences, a clever book by Mark Handcock and Martina Morris. In what follows, I explore the basic insight of the book and test out some techniques myself (with graphs!).

Handcock and Morris present a neat way to compare the whole distributions of a reference group and a comparison group by asking, "at what quantile of the reference group's distribution would someone from the comparison group fall?" If the distributions are the same, we expect this "relative data" to be uniformly distributed. Deviations from uniformity provide insight into how the groups differ.

  
A bit of formality helps explain the utility of this framework. Let Y₀ and Y be random variables representing the measurement of interest on the reference population and comparison population, respectively, with CDFs F₀(y) and F(y) and PDFs f₀(y) and f(y). Handcock and Morris define a new random variable R (for "relative data") as R = F₀(Y). The CDF of R is F(F₀⁻¹(r)) [readers may recognize this as a probability-probability plot] and the PDF is f(F₀⁻¹(r)) / f₀(F₀⁻¹(r)) for 0 ≤ r ≤ 1. Notice that this PDF, the "relative density," is simply a ratio of densities, each evaluated at a given quantile of the reference distribution. Because of the transformation from the original variable scale via the quantile function F₀⁻¹(r), the relative density is a valid PDF, integrating to 1. Researchers can thus use the random variable R and its "relative distribution" for inference.
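The transformation R = F₀(Y) is easy to sketch empirically: sort the reference sample and find where each comparison value lands. In this minimal sketch (the distributions and sample sizes are my own illustrative assumptions, not anything from the book), both groups draw from the same distribution, so the relative data should look roughly Uniform(0, 1):

```python
import numpy as np

# Sketch of the relative-data transformation R = F0(Y) using an empirical
# CDF. The standard-normal samples and sizes are illustrative assumptions:
# when the groups share a distribution, R should be roughly Uniform(0, 1).
rng = np.random.default_rng(0)
reference = rng.normal(loc=0.0, scale=1.0, size=10_000)   # Y0, reference group
comparison = rng.normal(loc=0.0, scale=1.0, size=10_000)  # Y, comparison group

# Empirical CDF of the reference group: fraction of reference values
# at or below each comparison value.
ref_sorted = np.sort(reference)
R = np.searchsorted(ref_sorted, comparison, side="right") / len(ref_sorted)

# Under equal distributions, R's quantiles track the uniform's.
print(np.round(np.quantile(R, [0.25, 0.5, 0.75]), 2))
```

With different distributions, those quantiles would drift away from 0.25/0.5/0.75, and the shape of that drift is exactly what the relative distribution summarizes.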

  
I have been interested in trying out this technique for a while, and after listening to a recent podcast on why there are relatively few female scientists and then stumbling onto some old discussion on Andrew Gelman's blog on a similar topic, I figured comparing male and female scores on a standardized math exam might be a good test case. Studies have shown that male math test scores are, on average, higher than female scores, and they are also more variable. Can relative distribution methods provide any extra insight? In this example, I use data from Project Talent. The sample represents the population of US high school students in 1960. The data are out of date; I use them for illustrative purposes only. As expected, overlaid densities show that male high school students' math scores are on average higher, as well as more variable, than female students' scores.
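To make the rest of this post easy to play with without the Project Talent files, the qualitative pattern (higher male mean, larger male spread) can be mimicked with simulated scores. The normal distributions and parameters below are made-up assumptions, not estimates from the data:

```python
import numpy as np

# Illustrative stand-ins for the two score distributions (assumed normals;
# the parameters are invented, not Project Talent estimates): the male
# distribution has both a higher mean and a larger spread.
rng = np.random.default_rng(1)
male = rng.normal(loc=55, scale=12, size=50_000)
female = rng.normal(loc=50, scale=10, size=50_000)

print(round(male.mean() - female.mean(), 1))  # positive mean gap
print(round(male.std() / female.std(), 2))    # spread ratio above 1
```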

[Figure: overlaid density estimates of male and female math scores]

The relative distribution graph below combines the information in the two curves into a single line. In our sample, women are more likely than men to be in the lower tail (about 1.5 times as likely to achieve the lowest score), about equally likely to be in the middle, and less likely than men to be among the highest scorers. Because the relative data have a valid density function, we can also examine 95% confidence intervals. These intervals show that the differences we observe in our sample are statistically significant only in the upper tail, with female test takers about half as likely as their male counterparts to earn the highest scores. This finding is substantively useful, suggesting that the average difference is driven by what happens at the very top of the score distribution.
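A rough version of this graph can be sketched with a decile histogram of the relative data plus a normal approximation to the binomial bin counts for the intervals. This is a simplification of the kernel-based estimator Handcock and Morris use, and the normal score distributions are assumed stand-ins, not the Project Talent data:

```python
import numpy as np

# Decile estimate of the relative density with rough 95% intervals.
# Assumed normals stand in for the score data; the histogram-plus-
# binomial-CI estimator is a simplification of the book's kernel approach.
rng = np.random.default_rng(2)
male = rng.normal(55, 12, 50_000)    # reference group
female = rng.normal(50, 10, 50_000)  # comparison group

male_sorted = np.sort(male)
R = np.searchsorted(male_sorted, female, side="right") / len(male_sorted)

n, width = len(R), 0.1
count, _ = np.histogram(R, bins=10, range=(0, 1))
p_hat = count / n                     # share of women in each male decile
se = np.sqrt(p_hat * (1 - p_hat) / n)
g = p_hat / width                     # relative density by decile
ci_low = (p_hat - 1.96 * se) / width
ci_high = (p_hat + 1.96 * se) / width

# g > 1 means women are overrepresented at that decile of the
# men's distribution; g < 1 means underrepresented.
print(np.round(g, 2))
```

With these assumed parameters the bottom decile comes out above 1 and the top decile well below 1, mirroring the pattern described above.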

[Figure: relative distribution of female to male math scores, with 95% confidence intervals]

We can decompose the overall relative distribution into differences in the locations and shapes of the male and female math score distributions. The figure below suggests that any differences between men's and women's math scores at the low end of the distribution are due to men's higher median score, while gender differences among high scorers are driven by the greater spread in men's scores. (If all distributional differences were due to shape differences, the second panel of the figure would show a horizontal line at 1 and the third panel would look just like the first.)
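One way to sketch this decomposition: shift the reference distribution so its median matches the comparison group's (the location effect), then compare the comparison group against that shifted reference (the residual shape effect). The additive median shift and the normal stand-in distributions are my assumptions for illustration; the book develops more general adjustments:

```python
import numpy as np

# Sketch of a location/shape decomposition of the relative density
# (assumed normal stand-ins for the score data).
rng = np.random.default_rng(3)
male = rng.normal(55, 12, 50_000)    # reference group
female = rng.normal(50, 10, 50_000)  # comparison group

def relative_density(comparison, reference, bins=10):
    """Decile histogram estimate of the relative density."""
    ref_sorted = np.sort(reference)
    R = np.searchsorted(ref_sorted, comparison, side="right") / len(ref_sorted)
    g, _ = np.histogram(R, bins=bins, range=(0, 1), density=True)
    return g

# Location effect: shift the reference so its median matches the comparison's.
male_shifted = male + (np.median(female) - np.median(male))

g_overall = relative_density(female, male)         # total difference
g_location = relative_density(male_shifted, male)  # median shift alone
g_shape = relative_density(female, male_shifted)   # residual shape difference

# The excess of women in the bottom decile shows up in the location
# component, while the deficit in the top decile persists in the shape
# component, matching the story told by the three-panel figure.
print(np.round(g_location[0], 2), np.round(g_shape[-1], 2))
```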

[Figure: decomposition of the relative distribution into location and shape components]

 There are several neat extensions of relative distribution methods, including covariate adjustments and nonparametric summary measures of distributional divergence. I haven't seen these techniques applied frequently, but they seem useful to me. Let me know what you think.