(KRLS) as described in Hainmueller
and Hazlett (2012). KRLS is a
machine learning method that can flexibly
fit solution surfaces of the form y=f(X)
that arise in regression or classification
problems without relying on linearity or
other assumptions that use the columns of
the predictor matrix X directly as basis
functions (such as additivity). KRLS finds
the best fitting function by minimizing a
Tikhonov regularization problem with a
square loss using Gaussian Kernels as
radial basis functions. More precisely, we
search over a space of possible functions
H and choose the best function f according
to the rule:
argmin_{f in H} sum_{i=1}^n (y_i - f(x_i))^2 + lambda ||f||_H^2
where (y_i - f(x_i))^2 is a
loss function that computes how 'wrong'
the function is at each observation i and
lambda ||f||_H^2 is the regularizer that
measures the complexity of the function
according to the L_2 norm ||f||^2 =
integral f(x)^2 dx. The representer
theorem states that under fairly general
conditions, the function that minimizes
the regularized loss within the hypothesis
space established by the choice of a
(positive semidefinite) kernel function k
is of the form: f(x_j) = sum_{i=1}^n c_i
k(x_i,x_j), where the kernel function
k(x_i,x_j) measures the similarity between
two observations x_i and x_j, K is the
kernel matrix of all pairwise similarities
K_ij=k(x_i,x_j), and c is the vector of
choice coefficients for each observation i,
so that the fitted values are y=Kc. Accordingly, the krls
function solves the following minimization
problem:
argmin_c (y - Kc)'(y - Kc) + lambda c'Kc
which is convex in c and
solved by c=(K +lambda I)^-1 y, a linear
solution that provides a fit that will be
potentially highly non-linear in terms of
the predictors.
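The closed-form solution above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the package's own code: the helper name gauss_kernel, the simulated data, the bandwidth sigma2 (the denominator of the exponent, set here to the number of columns) and the fixed lambda are all chosen for the example, and no standardization of the data is attempted.

```r
# Illustrative KRLS-style fit in base R (hypothetical helper and data).
gauss_kernel <- function(X, sigma2) {
  # Squared Euclidean distances between all pairs of rows of X
  D2 <- as.matrix(dist(X))^2
  exp(-D2 / sigma2)
}

set.seed(1)
n <- 50
X <- matrix(runif(n * 2), ncol = 2)
y <- sin(2 * pi * X[, 1]) + X[, 2]^2 + rnorm(n, sd = 0.1)

sigma2 <- ncol(X)  # assumed bandwidth for this example
lambda <- 0.1      # assumed, fixed by hand for this example

K <- gauss_kernel(X, sigma2)

# Closed-form solution c = (K + lambda I)^-1 y
c_hat <- solve(K + lambda * diag(n), y)

# Fitted values y = Kc: linear in c, non-linear in the predictors
y_hat <- drop(K %*% c_hat)
```

Solving the linear system with solve(A, b) avoids forming the inverse explicitly, which is usually the more stable choice when only c is needed.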
The function currently
implements KRLS for the following kernels:
"gaussian": k(x_i,x_j)=exp(-||x_i - x_j||^2 / sigma^2),
where ||x_i - x_j|| is the Euclidean
distance; "linear": k(x_i,x_j)=x_i'x_j; and
"poly1-4", which are polynomial kernels
based on k(x_i,x_j)=(x_i'x_j + 1)^p,
where p is the order. For the
Gaussian kernel the sigma width parameter
is set to the number of dimensions by
default. Unless otherwise specified by the
user, the lambda parameter is chosen by
minimization of the leave-one-out error.
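One way to picture the leave-one-out criterion is the sketch below. It relies on a standard identity for regularized least squares: with G = K + lambda I and c = G^-1 y, the leave-one-out residual for observation i equals c_i divided by the i-th diagonal element of G^-1, so no refitting is needed. The function name loo_error, the simulated data and the simple grid over lambda are assumptions made for illustration; the search strategy used by the package may differ.

```r
# Illustrative leave-one-out error for a KRLS-style fit (hypothetical names).
loo_error <- function(lambda, K, y) {
  Ginv <- solve(K + lambda * diag(nrow(K)))
  c_hat <- drop(Ginv %*% y)
  # Leave-one-out residual for observation i: c_i / [G^-1]_ii
  sum((c_hat / diag(Ginv))^2)
}

set.seed(1)
n <- 50
X <- matrix(runif(n * 2), ncol = 2)
y <- sin(2 * pi * X[, 1]) + X[, 2]^2 + rnorm(n, sd = 0.1)

# Gaussian kernel with the bandwidth set to the number of dimensions
K <- exp(-as.matrix(dist(X))^2 / ncol(X))

# Evaluate the criterion on a log-spaced grid (an assumed search strategy)
lambdas <- 10^seq(-3, 1, length.out = 30)
errors <- sapply(lambdas, loo_error, K = K, y = y)
best_lambda <- lambdas[which.min(errors)]
```

A grid is used here only because it is easy to read; any one-dimensional minimizer could be substituted.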
The function also
computes the variance-covariance matrix
for the choice coefficients c and fitted
values y=Kc based on a variance estimator
developed in Hainmueller
and Hazlett (2011). For the Gaussian
kernel, the function can also compute the
pointwise partial derivatives of the
fitted function with respect to each predictor in
X. Average derivatives are also computed
with variances.
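For the Gaussian kernel the pointwise derivatives have a simple closed form. Differentiating f(x_j) = sum_i c_i exp(-||x_i - x_j||^2 / sigma^2) with respect to predictor d of x_j gives (2 / sigma^2) sum_i c_i k(x_i,x_j) (x_i^(d) - x_j^(d)). The base R sketch below computes this on simulated data; the variable names, the data, and the fixed lambda and bandwidth are assumptions for illustration, and it makes no claim about how the package scales the data or estimates the variances.

```r
# Illustrative pointwise and average derivatives for a Gaussian-kernel fit.
set.seed(2)
n <- 40
X <- matrix(rnorm(n * 2), ncol = 2)
y <- X[, 1]^2 - X[, 2] + rnorm(n, sd = 0.1)

sigma2 <- ncol(X)  # assumed bandwidth (denominator of the exponent)
lambda <- 0.5      # assumed, fixed by hand for this example

K <- exp(-as.matrix(dist(X))^2 / sigma2)
c_hat <- solve(K + lambda * diag(n), y)

# derivs[j, d] = partial derivative of f at x_j with respect to predictor d
derivs <- sapply(seq_len(ncol(X)), function(d) {
  # diff_d[i, j] = x_i^(d) - x_j^(d)
  diff_d <- outer(X[, d], X[, d], "-")
  # Sum over i of c_i * k(x_i, x_j) * (x_i^(d) - x_j^(d)), scaled by 2 / sigma2
  drop(t(K * diff_d) %*% c_hat) * 2 / sigma2
})

# Average derivative of each predictor across observations
avg_derivs <- colMeans(derivs)
```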
KRLS is currently
available for R as an alpha version (a Stata
version is planned). Feedback from users
is highly appreciated. If you download the
software, please send us an email so that we can keep you
informed about updates.
KRLS for R
You can
obtain the KRLS package for R from CRAN
by typing:
install.packages("KRLS")
Source: http://cran.r-project.org/web/packages/KRLS/