Quantile Regression (rq)
DESCRIPTION:
Perform a quantile regression on a design matrix, x, of
explanatory variables and a vector, y, of responses.
USAGE:
rq(x, y, tau=-1, alpha=.1, dual=F, int=T, tol=1e-4, ci=T, method="score",
interpolate=T, tcrit=T, hs=T)
REQUIRED ARGUMENTS:
x: vector or matrix of explanatory variables. If a matrix,
each column represents a variable and each row represents
an observation (or case). This should not contain a
column of 1s unless the argument int is FALSE. The number
of rows of x should equal the number of elements of y,
and there should be fewer columns than rows. If x is
missing, rq() computes the ordinary sample quantile(s) of
y.
y: response vector with as many observations as the number of
rows of x.
OPTIONAL ARGUMENTS:
tau: desired quantile. If tau is missing or outside the range
[0,1] then all the regression quantiles are computed and
the corresponding primal and dual solutions are returned.
alpha: level of significance for the confidence intervals; de-
fault is set at 10%.
dual: return the dual solution if TRUE (default).
int: flag for intercept; if TRUE (default) an intercept term is
included in the regression.
tol: tolerance parameter for rq computations.
ci: flag for confidence interval; if TRUE (default) the confi-
dence intervals are returned.
method: if method="score" (default), ci is computed by re-
gression rank score inversion; if method="sparsity", ci is
computed using the sparsity function.
interpolate: if TRUE (default), the smoothed confidence inter-
vals are returned.
tcrit: if tcrit=T (default), a finite sample adjustment of the
critical point is performed using Student's t quantile,
else the standard Gaussian quantile is used.
hs: logical flag to use Hall-Sheather's sparsity estimator
(default); otherwise Bofinger's version is used.
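The Hall-Sheather and Bofinger bandwidth rules referenced by hs are not spelled out on this page. The following Python sketch shows the two rules as commonly implemented in later versions of quantreg; the exact constants are an assumption here, so verify against the source before relying on them (Python is used only to illustrate the arithmetic, not the S interface):

```python
from statistics import NormalDist  # stdlib normal quantile/density

def bandwidth_hs(tau, n, alpha=0.05):
    # Hall-Sheather sparsity bandwidth, shrinking at rate n^(-1/3)
    N = NormalDist()
    x0 = N.inv_cdf(tau)
    f0 = N.pdf(x0)
    z = N.inv_cdf(1 - alpha / 2)
    return n ** (-1 / 3) * z ** (2 / 3) * (1.5 * f0 ** 2 / (2 * x0 ** 2 + 1)) ** (1 / 3)

def bandwidth_bofinger(tau, n):
    # Bofinger bandwidth, shrinking at the slower rate n^(-1/5)
    N = NormalDist()
    x0 = N.inv_cdf(tau)
    f0 = N.pdf(x0)
    return n ** (-0.2) * (4.5 * f0 ** 4 / (2 * x0 ** 2 + 1) ** 2) ** 0.2
```

For large n the Hall-Sheather rule gives a narrower window than Bofinger's, which is the practical difference between the two choices of hs.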
VALUE (for a single tau in [0,1]):
coef: the estimated parameters of the tau-th conditional quan-
tile function.
resid: the estimated residuals of the tau-th conditional quan-
tile function.
dual: the dual solution (if dual=T).
h: the index of observations in the basis.
ci: confidence intervals (if ci=T).
VALUE (when tau is outside [0,1] and all solutions are computed):
sol: a (p+2) by m matrix whose first row contains the 'break-
points' tau_1,tau_2,...tau_m, of the quantile function,
i.e. the values in [0,1] at which the solution changes,
row two contains the corresponding quantiles evaluated at
the mean design point, i.e. the inner product of xbar and
b(tau_i), and the last p rows of the matrix give b(tau_i).
The solution b(tau_i) prevails from tau_i to tau_(i+1).
dsol: the matrix of dual solutions corresponding to the primal
solutions in sol. This is an n by m matrix whose ij-th
entry is 1 if y_i > x_i b(tau_j), is 0 if y_i < x_i
b(tau_j), and is between 0 and 1 otherwise, i.e. if the
residual is zero. See Gutenbrunner and Jureckova (1991)
for a detailed discussion of the statistical interpreta-
tion of dsol.
h: the matrix of observations indices in the basis corre-
sponding to sol or dsol.
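For a model containing an intercept, dual feasibility implies that each column of dsol sums to (1-tau_j)*n. A small Python check for the intercept-only case (toy data; Python only to illustrate the arithmetic, and the quantile q is hard-coded as the known check-loss minimizer for this sample):

```python
import numpy as np

# intercept-only case: the tau-th regression quantile is a sample
# quantile, and the dual must satisfy sum(a_i) = (1 - tau) * n
y = np.array([3., 1., 4., 1., 5., 9., 2., 6.])
tau, n = 0.3, len(y)
q = 2.0                        # the tau = 0.3 sample quantile of y
a = np.where(y > q, 1.0, 0.0)  # 1 above the fitted quantile, 0 below
# the single basic observation (zero residual) absorbs the rest,
# and must land in [0,1]
a[y == q] = (1 - tau) * n - a.sum()
basic = a[y == q].item()
```

Here five observations lie above the fit, so the basic observation takes the fractional dual value needed to reach (1-0.3)*8 = 5.6.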
EXAMPLES:
rq(stack.x,stack.loss,.5) #the l1 estimate for the stackloss data
rq(stack.x,stack.loss,tau=.5,ci=T,method="score") #same as above with
#regression rank score inversion confidence interval
rq(stack.x,stack.loss,.25) #the 1st quartile,
#note that 8 of the 21 points lie exactly
#on this plane in 4-space
rq(stack.x,stack.loss,-1) #this gives all of the rq solutions
rq(y=rnorm(10),method="sparsity") #ordinary sample quantiles
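The last example reduces to ordinary sample quantiles because the sample quantile minimizes the same check-function objective that rq minimizes over regression planes. A minimal Python sketch of that objective (Python here only to illustrate the math, not the S interface; a minimizer can always be found among the observed values, so brute force suffices):

```python
import numpy as np

def rho(u, tau):
    # check (pinball) loss: tau*u for u >= 0, (tau-1)*u for u < 0
    return u * (tau - (u < 0))

def sample_quantile(y, tau):
    # brute-force search over candidate fits drawn from the data
    y = np.asarray(y, dtype=float)
    return min(y, key=lambda q: rho(y - q, tau).sum())

y = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
print(sample_quantile(y, 0.3))  # -> 2.0
```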
METHOD:
The algorithm used is a modification of the Barrodale and
Roberts algorithm for l1-regression, l1fit in S, and is
described in detail in Koenker and d'Orey (1987).
REFERENCES:
[1] Koenker, R.W. and Bassett, G.W. (1978). Regression
quantiles, Econometrica, 46, 33-50.
[2] Koenker, R.W. and d'Orey, V. (1987). Computing Regression
Quantiles. Applied Statistics, 36, 383-393.
[3] Gutenbrunner, C. and Jureckova, J. (1991). Regression
quantile and regression rank score process in the linear
model and derived statistics, Annals of Statistics, 20,
305-330.
[4] Koenker, R.W. and d'Orey, V. (1994). Remark on Alg. AS
229: Computing Dual Regression Quantiles and Regression
Rank Scores, Applied Statistics, 43, 410-414.
[5] Koenker, R.W. (1994). Confidence Intervals for Regres-
sion Quantiles, in P. Mandl and M. Huskova (eds.), Asymp-
totic Statistics, 349-359, Springer-Verlag, New York.
SEE ALSO:
trq and qrq for further details and references.
Linearized Quantile Estimation (qrq)
DESCRIPTION:
Compute linearized quantiles from rq data structure.
USAGE:
qrq(s, a)
REQUIRED ARGUMENTS:
s: data structure returned by the quantile regression func-
tion rq with tau < 0 or tau > 1 (i.e. the full solution).
a: the vector of quantiles for which the corresponding lin-
earized quantiles are to be computed.
VALUE:
a vector of the linearized quantiles corresponding to vec-
tor a, as interpolated from the second row of s$sol.
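The interpolation qrq performs can be sketched in Python with hypothetical breakpoints and second-row values standing in for s$sol (np.interp here stands in for the linearization; the numbers are illustrative only):

```python
import numpy as np

# hypothetical row one of s$sol (breakpoints) and row two (quantile
# function at the mean design point); real values come from rq(x, y)
taus = np.array([0.0, 0.2, 0.5, 0.8, 1.0])
qhat = np.array([1.0, 2.0, 3.5, 5.0, 9.0])
a = np.array([0.1, 0.65])
lin = np.interp(a, taus, qhat)  # piecewise-linear interpolation in tau
```

For a = 0.1, halfway between the first two breakpoints, the linearized quantile is halfway between 1.0 and 2.0.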
SEE ALSO:
rq and trq for further details.
EXAMPLES:
z_qrq(rq(x,y),a) #assigns z the linearized quantiles
#corresponding to vector a.
Trimmed Mean Analogues for the Linear Regression Model (trq)
DESCRIPTION:
The function returns a regression trimmed mean and some
associated test statistics. The proportion a1 is trimmed
from the lower tail and a2 from the upper tail. If
a1+a2=1, a result is returned for the a1 quantile. If
a1+a2<1, two methods of trimming are possible, described
below as "primal" and "dual". The function "trq.print" may
be used to print results in the style of ls.print.
USAGE:
trq(x, y, a1=0.1, a2, int=T, z, method="primal", tol=1e-4)
REQUIRED ARGUMENTS:
x: vector or matrix of explanatory variables. If a matrix,
each column represents a variable and each row represents
an observation (or case). This should not contain a
column of 1s unless the argument int is FALSE. The number
of rows of x should equal the number of elements of y,
and there should be fewer columns than rows. Missing
values are not allowed.
y: response vector with as many observations as the number of
rows of x. Missing values are not allowed.
OPTIONAL ARGUMENTS:
a1: the lower trimming proportion; defaults to .1 if missing.
a2: the upper trimming proportion; defaults to a1 if missing.
int: flag for intercept; if TRUE (default), an intercept term
is included in the regression model.
z: structure returned by the function 'rq' with tau < 0 or
tau > 1.
If missing, the function rq(x,y,int=int) is automatically
called to generate this argument. If several calls to trq
are anticipated for the same data this avoids recomputing
the rq solution for each call.
method: method to be used for the trimming. If the choice is
"primal", as is the default, a trimmed mean of the primal
regression quantiles is computed based on the sol array
in the 'rq' structure. If the method is "dual", a weight-
ed least-squares fit is done using the dual solution in
the 'rq' structure to construct weights. The former
method is discussed in detail in Koenker and Portnoy
(1987), the latter in Ruppert and Carroll (1980) and
Gutenbrunner and Jureckova (1991).
tol: tolerance parameter for rq computations.
VALUE:
coef: estimated coefficient vector.
resid: residuals from the fit.
cov: the estimated covariance matrix for the coefficient vector.
v: the scaling factor of the covariance matrix under the iid
error assumption: cov = v*(x'x)^(-1).
wt: the weights used in the least squares computation. Re-
turned only when method="dual".
d: the bandwidth used to compute the sparsity function. Re-
turned only when a1+a2=1.
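The scaling relation cov = v*(x'x)^(-1) in the value list can be sketched with a hypothetical design matrix (Python stands in for the linear algebra only; v is an arbitrary illustrative scale factor, not an estimate):

```python
import numpy as np

# hypothetical design: intercept column plus one regressor, n = 5
X = np.column_stack([np.ones(5), np.arange(5.0)])
v = 2.0                            # illustrative scale factor
cov = v * np.linalg.inv(X.T @ X)   # cov = v * (x'x)^(-1), iid errors
```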
EXAMPLES:
z_rq(x,y) #z gets the full regression quantile structure
trq(x,y,.05,z=z) #5% symmetric primal trimming
trq(x,y,.01,.03,method="dual") #1% lower and 3% upper trimmed least-
#squares fit.
trq.print(trq(x,y)) #prints trq results in the style of ls.print.
METHOD:
Details of the methods may be found in Koenker and
Portnoy (1987) for the case of primal trimming and in
Gutenbrunner and Jureckova (1991) for dual trimming. On
the estimation of the covariance matrix for individual
quantiles, see Koenker (1987) and the discussion in
Hendricks and Koenker (1991). The estimation of the
covariance matrix under non-iid conditions is an open
research problem.
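The primal trimmed mean can be read as an average of the regression quantile solutions b(tau) over [a1, 1-a2], weighting each solution by the length of the tau-interval on which it prevails. A one-parameter Python sketch under that reading (toy breakpoints, not the trq code itself):

```python
import numpy as np

def primal_trim(taus, b, a1, a2):
    # average b(tau) over [a1, 1 - a2], weighting each solution by
    # the length of the tau-interval on which it prevails
    lo, hi = a1, 1.0 - a2
    w = np.clip(np.minimum(taus[1:], hi) - np.maximum(taus[:-1], lo),
                0.0, None)
    return (w @ b[:-1]) / (hi - lo)

# toy solution path: b[i] prevails on [taus[i], taus[i+1])
taus = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
b = np.array([1.0, 2.0, 3.0, 4.0, 4.0])
est = primal_trim(taus, b, 0.1, 0.1)
```

With 10% trimmed from each tail, the outer solutions receive reduced weight (0.15 each) and the inner ones full weight (0.25 each), and the weighted average is renormalized by hi - lo = 0.8.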
REFERENCES:
Bassett, G., and Koenker, R. (1982), "An Empirical Quan-
tile Function for Linear Models With iid Errors," Journal
of the American Statistical Association, 77, 407-415.
Koenker, R.W. (1987), "A Comparison of Asymptotic Methods
of Testing based on L1 Estimation," in Y. Dodge (ed.)
Statistical Data Analysis Based on the L1 norm and Related
Methods, New York: North-Holland.
Koenker, R.W., and Bassett, G.W. (1978), "Regression Quan-
tiles", Econometrica, 46, 33-50.
Koenker, R., and Portnoy, S. (1987), "L-Estimation for
Linear Models", Journal of the American Statistical Asso-
ciation, 82, 851-857.
Ruppert, D. and Carroll, R.J. (1980), "Trimmed Least
Squares Estimation in the Linear Model", Journal of the
American Statistical Association, 75, 828-838.