28 April 2009
Please join us for our final meeting tomorrow, when Thomas Yee (Department of Statistics, University of Auckland) will present "Vector generalized linear and additive models". Thomas provided the following abstract for his talk:
The class of vector generalized linear and additive models (VGLMs/VGAMs) is very large and contains many statistical models relevant to quantitative social science, e.g., univariate and multivariate distributions, categorical data analysis, time series, survival analysis, extreme value analysis, mixture models, correlated binary data, and nonlinear regression. I'll first give an overview of the framework and tie it in with practice using my VGAM package for R. Then we will focus on two sub-topics: reduced-rank VGLMs and quantile/expectile regression. The former handles the reduced-rank multinomial logit model (aka the stereotype model) and Goodman's row-column association model; applications of the latter are becoming popular in many fields. Time allowing, I'll describe several sub-projects I have been working on since arriving at IQSS.

The Applied Statistics Workshop meets each Wednesday in room K-354, CGIS-Knafel (1737 Cambridge St). We start at 12 noon with a light lunch; presentations begin around 12:15 and we usually wrap up around 1:30 pm.
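For the curious, here is a minimal sketch of the kind of model the VGAM package fits: a multinomial logit estimated as a vector GLM, followed by its reduced-rank counterpart (the stereotype model mentioned in the abstract). The data are simulated purely for illustration.

```r
# Simulated three-category response: an excuse to call vglm() and rrvglm()
library(VGAM)
set.seed(1)
n <- 500
x1 <- rnorm(n); x2 <- rnorm(n)
eta <- cbind(0.8 * x1 - 0.5 * x2,    # log-odds of "a" vs "c"
             -0.4 * x1 + 0.9 * x2,   # log-odds of "b" vs "c"
             0)                      # reference category
p <- exp(eta) / rowSums(exp(eta))
y <- factor(apply(p, 1, function(pr) sample(c("a", "b", "c"), 1, prob = pr)))
dat <- data.frame(y = y, x1 = x1, x2 = x2)

fit <- vglm(y ~ x1 + x2, family = multinomial, data = dat)  # multinomial logit as a VGLM
coef(fit, matrix = TRUE)                                    # one coefficient column per logit

# Reduced-rank VGLM: constrain the coefficient matrix to rank 1 (stereotype model)
rrfit <- rrvglm(y ~ x1 + x2, family = multinomial, data = dat, Rank = 1)
```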
Posted by Justin Grimmer at 3:12 PM
24 April 2009
Today's New York Times contains an article reporting that the United States is "losing" the war on cancer. This, of course, made me think of a comic from PhD Comics yesterday.
More seriously, it also brought to mind a paper that Bo Honore and Adriana Lleras-Muney wrote several years ago exploring the war on cancer and how we measure its success. The challenge in deciding whether we are winning is that everyone must die of something: when the Times reports large declines in cardiovascular mortality, it follows that some other cause of death must increase to (partially) compensate. Honore and Lleras-Muney consider the challenges of estimating competing risks models when the causes of death are not independent. In simple mortality models they find no improvement in cancer mortality from the war on cancer, but those models require the assumptions that individuals who die of non-cancer causes can be treated as censored and that the underlying survival times are independent.
Their more sophisticated analysis recognizes that many risk factors for cancer mortality are also risk factors for other causes of death, so the assumption that the mortality risks are independent is clearly violated. They then present two alternatives for generating more plausible estimates of the effect of the war on cancer on cancer mortality. The first is simply to compute upper and lower bounds on survival (Manski bounds); the second entails making some assumptions about how the distributions of survival times for different causes of death are related. The bounding method leads to quite wide bounds, and they state "that it is not possible to make any statement about whether survival from cancer increased or decreased during this period [1970-2000]."
By assuming that the marginal survival distributions follow a specific functional form, they are able to tighten the bounds considerably and draw some conclusions about the efficacy of the war on cancer. Assuming independence, they find a small improvement in cancer mortality over the period 1970 to 2000. Allowing some dependence between cardiovascular and cancer mortality, however, yields evidence that the war on cancer had a very large effect on cancer mortality, between 10 and 20% depending on race and gender. Thus there is reasonable evidence that the war on cancer has not been a failure, though perhaps not a stunning success either. The lesson for social scientists is that every assumption matters: relaxing the independence between cardiovascular and cancer mortality dramatically increased the estimated effect of the war on cancer, and may even overturn the conclusion in the New York Times article.
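To make the role of that independence assumption concrete, here is a small simulation (my own toy construction, not the authors' data or model). A shared frailty makes the two latent death times dependent; the naive Kaplan-Meier estimate that treats other-cause deaths as independent censoring is then biased, while worst-case bounds of the kind the paper uses still cover the truth.

```r
# Dependent competing risks: naive KM vs. worst-case bounds on cancer mortality by t0
library(survival)
set.seed(42)
n <- 10000
frailty  <- rgamma(n, shape = 2, rate = 2)   # shared risk factor links the two causes
t_cancer <- rexp(n, rate = 0.05 * frailty)   # latent cancer death time
t_cvd    <- rexp(n, rate = 0.08 * frailty)   # latent cardiovascular death time
time  <- pmin(t_cancer, t_cvd)               # we only ever observe the first event
cause <- ifelse(t_cancer < t_cvd, "cancer", "cvd")

t0 <- 10
# Naive estimate: treat CVD deaths as independent censoring
km    <- survfit(Surv(time, cause == "cancer") ~ 1)
naive <- 1 - summary(km, times = t0)$surv

# Worst-case bounds on P(cancer death time <= t0):
lower <- mean(time <= t0 & cause == "cancer")  # others would never have died of cancer by t0
upper <- mean(time <= t0)                      # every death would have been a cancer death
truth <- mean(t_cancer <= t0)                  # knowable here only because we simulated

cat(sprintf("naive: %.3f  bounds: [%.3f, %.3f]  truth: %.3f\n", naive, lower, upper, truth))
```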
Posted by Martin Andersen at 10:34 AM
13 April 2009
Here's a paper for the "high internal, low external validity" file (via Kevin Lewis):
Interracial Workplace Cooperation: Evidence from the NBA
Joseph Price, Lars Lefgren & Henry Tappen
NBER Working Paper, February 2009

Abstract: Using data from the National Basketball Association (NBA), we examine whether patterns of workplace cooperation occur disproportionately among workers of the same race. We find that, holding constant the composition of teammates on the floor, basketball players are no more likely to complete an assist to a player of the same race than a player of a different race. Our confidence interval allows us to reject even small amounts of same-race bias in passing patterns. Our findings suggest that high levels of interracial cooperation can occur in a setting where workers are operating in a highly visible setting with strong incentives to behave efficiently.
Posted by Andy Eggers at 6:51 PM
12 April 2009
Please join us this Wednesday for the Applied Statistics Workshop, when Alberto Abadie, Professor of Public Policy, will present "A General Theory of Matching Estimation", joint work with Guido Imbens. Alberto provided the following abstract for his talk:
Matching methods provide simple and intuitive tools for adjusting the distribution of covariates among samples from different populations. Probably because of their transparency and intuitive appeal, matching methods are widely used in evaluation research to estimate treatment effects when all treatment confounders are observed (Rubin, 1973, 1977; Rosenbaum, 2002). In spite of their popularity, the problem of establishing the large sample distribution of matching estimators remains largely unsolved, with the exception of some special cases (see Abadie and Imbens, 2006). The reason is that matching estimators are non-smooth functionals of the data, which makes their large sample theory particularly challenging. This talk will describe a new general method to establish the large sample distribution of matching estimators. As an example of the applicability of the method, we will describe how to derive the distribution of matching estimators when matching is carried out without replacement, a result previously unavailable in the literature. We will also discuss how to adjust the standard errors for propensity score matching estimators to take into account first step estimation of the propensity score, a result also previously unavailable.
The Applied Statistics Workshop meets each Wednesday at 12 noon in K-354 CGIS-Knafel (1737 Cambridge St). The workshop begins with a light lunch; presentations usually start around 12:15 and last until about 1:30 pm.
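As a concrete illustration of the estimator at issue, here is a short sketch (simulated data, my own construction) of one-to-one matching without replacement using the Matching package in R. Note that the standard errors the package reports are the existing Abadie-Imbens ones; the results described in the abstract are precisely what has been missing for the without-replacement case.

```r
# Nearest-neighbor matching on covariates, without replacement
library(Matching)
set.seed(2)
n  <- 1000
X  <- cbind(rnorm(n), rnorm(n))
Tr <- rbinom(n, 1, plogis(0.5 * X[, 1] - 0.25 * X[, 2]))  # selection on observables
Y  <- 2 * Tr + X[, 1] + rnorm(n)                          # true treatment effect = 2

m <- Match(Y = Y, Tr = Tr, X = X, replace = FALSE)        # each control used at most once
summary(m)                                                # point estimate and standard error
```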
Posted by Justin Grimmer at 7:41 PM
9 April 2009
In the spirit of Weihua's post on inference in non-random and random settings, I found this paper by Anup Malani on selection bias in randomized studies. The thirty-second version is that patients' prior beliefs about study outcomes matter: as the probability of being assigned to the "good" arm (to the extent that there is one) increases, the marginal patient expects a smaller effect from the study. This means the study population is not representative of the overall population, and real-world outcomes will always be worse than those observed in an experiment, since in the real world the probability of receiving the "good" treatment is 1.
Malani's model proposes three potential treatments (none, conventional, and experimental), and each treatment has a heterogeneous effect across the population; for simplicity I will assume that the outcome is binary (alive/dead, employed/unemployed, elected/unelected, etc.). Individuals have beliefs about the probability of a good outcome for each treatment, and there is also a distribution of true probabilities of a good outcome for each treatment. The key assumption is that these two distributions are not independent; thus knowing an individual's preferences over the three potential treatments provides information about true outcomes (as long as it doesn't provide too much information).

The exciting part of this paper is the observation that an experiment is simply a lottery among these alternatives, perhaps a lottery between no treatment and the experimental treatment. Anyone who believes no treatment is better than conventional treatment should then enroll in the study, regardless of the lottery. The interesting case concerns individuals who believe that conventional treatment is better than no treatment: when should they take the risk of entering the study and potentially being assigned to no treatment? Intuitively, this happens when an individual believes that the expected benefit of entering the study is greater than that of taking conventional treatment, and this benefit depends on the lottery. Therefore, in a simplified form of Malani's argument, as the probability of assignment to the new treatment increases, I will enroll in the study even with a weaker belief in the new treatment. Malani then shows that under plausible assumptions this yields selection bias, and demonstrates that this bias could explain much of the benefit observed in randomized studies of anti-ulcer drugs.
Thus even randomized studies are not immune to selection bias.
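Here is a toy simulation of that logic (my own construction, not Malani's specification). Beliefs are noisy signals of the true benefit, so enrollees are selected on optimism; as the probability of drawing the new treatment rises, the enrollment threshold falls, and the gap between the effect in the study sample and in the population shrinks.

```r
# Self-selection into a trial as a function of the randomization probability
set.seed(3)
n <- 1e5
true_gain <- rnorm(n, 0.10, 0.20)             # true benefit of new treatment over none
belief    <- true_gain + rnorm(n, sd = 0.20)  # beliefs are correlated with the truth
conv      <- 0.15                             # believed benefit of conventional care

for (p_new in c(0.50, 0.75, 0.95)) {
  enroll <- p_new * belief > conv             # enroll iff the lottery beats conventional care
  cat(sprintf("P(new)=%.2f  enrolling=%.2f  gain in study=%.3f  in population=%.3f\n",
              p_new, mean(enroll), mean(true_gain[enroll]), mean(true_gain)))
}
```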
What does one do to address this problem? Malani proposes allowing subjects to choose which control they would like to receive: when a subject enrolls, she can choose a lottery over no treatment and the new treatment, or a lottery over conventional treatment and the new treatment. This provides estimates of the treatment effect for individuals who prefer conventional treatment to no treatment, which may be the most relevant group for policy purposes.
Posted by Martin Andersen at 2:00 PM
7 April 2009
We are pleased to announce a special presentation that should be of interest. David Firth, Professor of Statistics at the University of Warwick, will present on Quasi Variances this *Thursday* from 12-2 pm in room K-354 in CGIS-Knafel (1737 Cambridge St, the usual meeting place for the applied statistics workshop). Professor Firth provided the following abstract for his presentation:
The notion of quasi variances, as a device for both simplifying and enhancing the presentation of additive categorical-predictor effects in statistical models, was developed in Firth and de Menezes (Biometrika, 2004, 65-80). The approach generalizes the earlier idea of "floating absolute risk" (Easton et al., Statistics in Medicine, 1991), which has become rather controversial in epidemiology. In this talk I will outline and exemplify the method, and discuss its extension to some other contexts such as parameters that may be arbitrarily scaled and/or rotated.

Everyone (especially graduate students) is welcome and encouraged to attend.
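Professor Firth's own R package, qvcalc, implements the method. A minimal example (with made-up data) looks like this:

```r
# Quasi variances for a categorical predictor in an ordinary linear model
library(qvcalc)
set.seed(4)
dat <- data.frame(g = factor(sample(letters[1:4], 200, replace = TRUE)),
                  x = rnorm(200))
dat$y <- as.numeric(dat$g) + 0.5 * dat$x + rnorm(200)

model <- lm(y ~ g + x, data = dat)
qv <- qvcalc(model, "g")  # quasi variances for the levels of factor g
summary(qv)               # approximate SEs for any contrast, not just vs the baseline
plot(qv)                  # comparison intervals in the "floating absolute risk" style
```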
A bit of background on Professor Firth. He specializes in statistical theory and methods, and has a particular interest in generalized linear models, especially as applied to the social sciences. He has published extensively in the discipline's major journals of record, such as JRSS and Biometrika, and has written several packages for the R language and environment. He has made several significant contributions to the field, and is well known as the inventor of bias-reduced logistic regression (often called the "Firth logit").
He is at IQSS as a Distinguished Visiting Fellow (April 7--17), and will be spending part of his time here working with Arthur Spirling on models of momentum for contest data.
We hope everyone will be able to attend.
Posted by Justin Grimmer at 5:52 PM
The workshop will meet tomorrow, when Sandra Sequeira, a PhD candidate in public policy, will present her work on the efficiency cost of corruption, joint with Simeon Djankov. Sandra provided the following abstract for her talk:
This paper estimates the efficiency cost of corruption. We generate an original dataset on bribe payments at ports in Southern Africa that allows us to take an unusually close look into the black box of corruption, observing how bureaucrats set bribes and measuring their economic costs on firms and on the broader economy. We find that bribes are product-specific, frequent and substantial. Bribes can represent up to a 14% increase in total shipping costs for a standard 20ft container and a 600% increase in the monthly salary of a port official. Bribes are paid primarily to evade tariffs, protect cargo on the docks and avoid costly storage. We further identify three systemic effects associated with this type of corruption: a "diversion effect" where firms go the long way around to avoid the most corrupt port; a "revenue effect" as bribes reduce overall tariff revenue; and a "congestion effect" as the re-routing of firms increases congestion and transport costs by causing imbalanced cargo flows in the transport network. The evidence supports the theory that bribe payments at ports represent a significant distortionary tax on trade, as opposed to just a transfer between shippers and port officials that greases slow-moving clearing queues.
The Applied Statistics Workshop meets each Wednesday at 12 noon in K-354 CGIS-Knafel (1737 Cambridge St). The workshop begins with a light lunch; presentations usually start around 12:15 and last until about 1:30 pm.
Posted by Justin Grimmer at 5:46 PM
4 April 2009
Here is some recent progress on causal inference (recent to me, at least). William R. Shadish, M. H. Clark, and Peter M. Steiner published a paper in JASA (December 2008, 103(484): 1334-1344) based on "a randomized experiment comparing random and nonrandom assignments". Basically, "in the randomized experiment, participants were randomly assigned to mathematics or vocabulary training; in the nonrandomized experiment, participants chose their training." As the authors acknowledge, the randomized and nonrandomized experiments unsurprisingly produced different estimates of the training effects, very likely through selection bias caused by math phobia. The key finding is that statistical adjustments, including propensity score stratification, weighting, and covariance adjustment, can reduce estimation bias by about 58-96%.
Here is a link to the PPT of the paper. The comments on the paper are also very insightful.
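For readers who want to see what one of those adjustments involves, here is a minimal sketch (simulated data, my own construction) of the weighting approach; stratification and covariance adjustment follow the same pattern of conditioning on the estimated propensity score.

```r
# Inverse-propensity weighting to remove self-selection bias (math phobia drives
# both the choice of training and the outcome)
set.seed(5)
n <- 2000
math_phobia  <- rnorm(n)                                 # confounder
choose_vocab <- rbinom(n, 1, plogis(1.5 * math_phobia))  # phobic subjects avoid math
score <- 1 + 2 * choose_vocab - math_phobia + rnorm(n)   # outcome; true effect = 2

naive <- unname(coef(lm(score ~ choose_vocab))[2])       # biased by self-selection

ps  <- glm(choose_vocab ~ math_phobia, family = binomial)$fitted.values
w   <- ifelse(choose_vocab == 1, 1 / ps, 1 / (1 - ps))   # inverse-propensity weights
ipw <- unname(coef(lm(score ~ choose_vocab, weights = w))[2])

c(naive = naive, ipw = ipw)                              # ipw should be close to 2
```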
Posted by Weihua An at 10:31 PM
2 April 2009
As a follow-up to Andy's post, here's a great addition to the slides of any methods class. Here is the original. Via Megan McArdle at the Atlantic Monthly.
Posted by John Graves at 6:23 PM