Globally robust methods for constrained nonlinear programming
A method for constrained nonlinear programming is said to be globally robust if it possesses a `reasonable' convergence theory under `minimal' global hypotheses. Here the term `minimal' is used to specifically exclude hypotheses concerning problem feasibility, the existence of solutions, and most emphatically global constraint regularity hypotheses such as the linear independence condition or the Mangasarian-Fromovitz constraint qualification. In this talk we consider algorithms based on the SQP framework that are globally robust in the above sense.
Numerical Computation of Fitzpatrick Functions
The study of Fitzpatrick functions has produced many useful results
regarding the representation of operators. When an operator is
cyclically monotone, its antiderivative can be built from its
Fitzpatrick function of infinite order. Given such an operator, we
present fast algorithms for computing its Fitzpatrick functions of
arbitrary order on a grid, by applying the Piecewise Linear-Quadratic
model and related fast transforms.
We will first improve the existing quartic-time algorithm to
worst-case cubic time, then to quadratic time using a relationship
between Fitzpatrick functions and Rockafellar functions. We will then
show how antiderivatives can be computed in linear time from a special
case of the infinite-order Fitzpatrick function.
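For reference (a standard definition, included here for context), the classical Fitzpatrick function of a monotone operator $A$, which is the order-2 member of the family discussed above, is

```latex
F_A(x, x^*) \;=\; \sup_{(a,\,a^*) \in \operatorname{gra} A}
  \bigl( \langle x, a^* \rangle + \langle a, x^* \rangle - \langle a, a^* \rangle \bigr).
```

The order-$n$ Fitzpatrick functions take the supremum over chains of points in the graph of $A$, and the infinite-order function is their supremum over all orders.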
This is joint work with Yves Lucet.
Convexity of the proximal average
We complete the study of the convexity of the proximal average by proving that it is convex as a function of each one of its parameters, but not jointly convex as a function of any two of them. An application to the efficient plotting of the family of proximal averages is presented. (Joint work with Jennifer Johnstone and Yves Lucet.)
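For context, the proximal average of two functions $f_0$ and $f_1$ with parameter $\lambda \in [0,1]$ can be written (in standard notation, with $q = \tfrac{1}{2}\|\cdot\|^2$; this is the usual definition, not necessarily the exact parameterization used in the talk):

```latex
\mathcal{P}(f_0, f_1; \lambda)
  \;=\; \Bigl( \lambda\,(f_0 + q)^{*} \;+\; (1-\lambda)\,(f_1 + q)^{*} \Bigr)^{*} \;-\; q .
```

The parameters referred to above are then the pair of functions and the averaging weight $\lambda$.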
An Augmented Lagrangian Approach for Sparse Orthogonal Principal Component Analysis
Principal component analysis (PCA) is a useful technique for statistical analysis and dimension reduction with numerous applications throughout science and engineering. However, the principal components can sometimes be difficult to interpret because they are linear combinations of all the original variables. To facilitate interpretation, sparse PCA produces modified PCs with sparse loadings, i.e., loadings with very few non-zero elements. In this talk, we consider finding sparse orthogonal loadings of a given data matrix, which has been commonly recognized as a challenging problem in statistics. In our approach, we first formulate this problem as a constrained non-smooth optimization problem. We then propose an augmented Lagrangian (AL) method for solving a class of constrained non-smooth optimization problems, and establish its global convergence. Additionally, we propose two non-monotone gradient methods for solving the subproblems appearing in the AL method; the global and local convergence of these methods is also established. Finally, we present some preliminary computational results.
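One common way to pose sparse PCA with orthogonal loadings as a constrained non-smooth problem is the following (an illustrative textbook-style formulation, not necessarily the exact one used in the talk):

```latex
\min_{V \in \mathbb{R}^{p \times r}} \;
  -\operatorname{tr}\bigl(V^{\top} \Sigma V\bigr) \;+\; \rho\,\|V\|_{1}
\qquad \text{subject to} \qquad V^{\top} V = I_r ,
```

where $\Sigma$ is the sample covariance matrix and $\rho > 0$ trades explained variance against sparsity of the loading matrix $V$; the orthogonality constraint is what makes the feasible set non-convex.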
Pattern Discrete and Mixed Hit-and-Run for Global Optimization and Mixed Integer Programming
We develop new Markov chain Monte Carlo samplers for neighborhood generation in global optimization algorithms based on Hit-and-Run, a successful sampling algorithm. We define Sphere and Box Biwalks that are pattern-based and easily implemented for discrete and mixed continuous/discrete domains. The pattern-based Hit-and-Run Markov chains preserve the convergence properties of Hit-and-Run to a target distribution. We present bounds on the finite-time performance for the discrete cases of the Sphere Biwalk, O(n^4), and the Box Biwalk, O(n^3). Moreover, we provide an efficient implementation of our Pattern Discrete and Mixed Hit-and-Run on the discrete and mixed lattice points of a polytope. This leads to a randomized Improving Pattern Hit-and-Run algorithm for Mixed Integer Programming whose expected number of iterations is linear in the dimension, where each iteration requires O(n) random number generations.
Joint work with Zelda Zabinsky.
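As background, one step of classical continuous Hit-and-Run in a box can be sketched as follows (a minimal sketch of the baseline sampler only, not of the pattern-based Sphere/Box Biwalks proposed in the talk; the function name is ours):

```python
import numpy as np

def hit_and_run_step(x, lo, hi, rng):
    # One step of classical continuous Hit-and-Run in the box [lo, hi]:
    # draw a uniformly random direction, intersect the line x + t*d with
    # the box, and sample the next point uniformly on that chord.
    d = rng.standard_normal(x.shape)
    d /= np.linalg.norm(d)
    with np.errstate(divide="ignore"):
        t1 = (lo - x) / d
        t2 = (hi - x) / d
    t_min = np.max(np.minimum(t1, t2))  # largest lower crossing time
    t_max = np.min(np.maximum(t1, t2))  # smallest upper crossing time
    return x + rng.uniform(t_min, t_max) * d
```

Iterating this step from an interior point produces a Markov chain whose stationary distribution is uniform on the box; the Biwalks above replace the continuous chord sample with pattern-based moves suited to discrete and mixed lattices.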
SDP Representation of Convex Sets
A set is called SDP representable if it is expressible by some linear matrix inequality via lifting variables. This talk presents necessary and sufficient conditions for SDP representability of a set S. First, we show a necessary condition: S is convex semialgebraic, and the boundary pieces of S are nonnegatively curved on nonsingular parts. Second, we show a sufficient condition: S is convex semialgebraic, and the boundary pieces of S are nonsingular and positively curved. Some variations of these conditions will also be shown. This is joint work with Bill Helton.
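As a concrete textbook illustration of SDP representability (not an example from the talk itself), the unit disk admits the linear matrix inequality representation

```latex
\bigl\{ (x_1, x_2) : x_1^2 + x_2^2 \le 1 \bigr\}
\;=\;
\Bigl\{ (x_1, x_2) :
  \begin{pmatrix} 1 + x_1 & x_2 \\ x_2 & 1 - x_1 \end{pmatrix} \succeq 0
\Bigr\},
```

since the $2 \times 2$ matrix is positive semidefinite exactly when its trace ($=2$) and determinant ($= 1 - x_1^2 - x_2^2$) are nonnegative. Here no lifting variables are needed; in general, SDP representations may require them.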
A proximal point method for matrix least squares problem with nuclear norm regularization
We consider a Newton-CG proximal point method for solving the matrix least squares problem with nuclear norm regularization. For the symmetric problem, in which the matrix variable is symmetric, the proximal point method is the same as the augmented Lagrangian method applied to the dual problem. For the inner problems in the non-symmetric case, we show that the soft thresholding operator is strongly semi-smooth everywhere, which is a key property for successfully applying a semi-smooth Newton-CG method to solve the inner problems. Numerical experiments on a variety of large-scale SDP problems arising from regularized kernel estimation and matrix completion show that the proposed method is very efficient. [This is joint work with Kaifeng Jiang and Kim-Chuan Toh at National University of Singapore.]
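For context, the soft thresholding operator mentioned above (the proximal mapping of the nuclear norm) shrinks each singular value toward zero. A minimal dense-SVD sketch (the helper name is ours, and production implementations exploit structure rather than a full SVD):

```python
import numpy as np

def soft_threshold(X, tau):
    # Singular value soft thresholding: shrink every singular value of X
    # by tau, zeroing those below tau. This is the proximal mapping of
    # tau times the nuclear norm.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt
```

For example, applying it to a diagonal matrix simply shrinks the diagonal entries' magnitudes by tau, truncating at zero.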
Fundamentals of Convergence Theories for Convex Relaxation Hierarchies
Lift-and-project operators provide an automatic way for constructing
all facets of the convex hull of 0-1 vectors in a polytope given
by linear or polynomial inequalities. They also yield
tractable approximations provided that the input polytope is
tractable and that we only apply the operators O(1) times.
There are many generalizations of these operators which can
be used to generate arbitrarily tight, convex relaxations
of essentially arbitrary nonconvex sets.
I will quickly review the above-mentioned methods and then
focus on the main techniques used in providing
convergence theories for various convex relaxation hierarchies.
I will discuss the relationships among the various techniques
when the hierarchies are applied to polynomial optimization
problems. Finally, I will show how to use these techniques
to prove sum-of-squares type representation theorems for
polynomials that are nonnegative over some compact set.
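The best-known representation theorem of this type is Putinar's (stated here for context; the talk's results may be more general): if the quadratic module generated by $g_1, \dots, g_m$ is Archimedean and $p > 0$ on the compact set $S = \{x : g_1(x) \ge 0, \dots, g_m(x) \ge 0\}$, then

```latex
p \;=\; \sigma_0 \;+\; \sum_{i=1}^{m} \sigma_i\, g_i ,
```

where each $\sigma_i$ is a sum of squares of polynomials. Degree bounds on the $\sigma_i$ are what drive convergence rates of the associated relaxation hierarchies.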
Adaptive Parameterized Improving Hit-and-Run for Global Optimization
We build on the prior success of Improving Hit-and-Run (IHR) as a Monte Carlo random search algorithm for global optimization by generalizing the algorithm's sampling distribution. Specifically, in place of the uniform step-size distribution in IHR, we employ a family of parameterized step-size distributions to sample candidate points; the IHR step-size distribution is a special instance within this family. This parameterization is motivated by recent results on efficient decentralized search in so-called "Small World" networks. To improve the performance of the algorithm, we adaptively tune the parameter based on the success rate of obtaining improving points. We present analytical and numerical results on simple spherical programs to illustrate the relationship between the parameterization and algorithm performance. These results are then extended to global optimization problems with Lipschitz continuous objective functions. Our preliminary numerical results demonstrate the potential benefit of this approach.
Joint work with Archis Ghate and Zelda Zabinsky.
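For context, baseline Improving Hit-and-Run, the algorithm generalized above, can be sketched as follows (a minimal sketch with a uniform step size, which is exactly the distribution the talk replaces with a parameterized family; function names are ours):

```python
import numpy as np

def improving_hit_and_run(f, x0, radius, n_iters=1000, seed=0):
    # Baseline Improving Hit-and-Run: draw a uniformly random direction,
    # take a uniform step size on [0, radius] along it, and accept the
    # candidate only if it improves the objective value.
    rng = np.random.default_rng(seed)
    x, fx = np.asarray(x0, dtype=float), f(x0)
    for _ in range(n_iters):
        d = rng.standard_normal(x.shape)
        d /= np.linalg.norm(d)
        y = x + rng.uniform(0.0, radius) * d
        fy = f(y)
        if fy < fx:  # keep improving candidates only
            x, fx = y, fy
    return x, fx
```

The adaptive variant above replaces `rng.uniform(0.0, radius)` with a draw from a parameterized step-size family and tunes the parameter from the observed rate of accepted (improving) points.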
Techniques for Benchmarking of Optimization Algorithms: A Case Study Using Line Search Variants of the Nelder-Mead Algorithm
Research in optimization algorithm design is often accompanied by benchmarking the new algorithm against previously established methods. Such benchmarking is often done in a manner that subconsciously favours the newly developed algorithm. In this paper we present two techniques that can help researchers provide fairer benchmarking of algorithms. First, we recommend the use of performance profiles, as developed by Dolan and Moré. We review the construction of performance profiles and provide some examples to illustrate their use. Second, we advocate generating and providing results using each algorithm's ``optimal parameter selection''. We demonstrate that if algorithms are compared using arbitrary parameter selections, the results do not just compare the algorithms, but also compare the authors' ability to select good parameters.
This is joint work with Warren Hare.
On the maximal monotonicity of the sum of a maximal monotone linear operator and a normal cone operator
The question whether or not the sum of two maximal monotone operators is maximal monotone
under Rockafellar's constraint qualification is the most famous open problem in monotone
operator theory. In his 2008 monograph ``From Hahn-Banach to Monotonicity'', Stephen Simons
asked whether or not the sum theorem holds for the special case of a maximal monotone linear
operator and a normal cone operator of a closed convex set,
assuming that the interior of the
set makes a nonempty intersection with the domain of the linear operator.
In this talk, we provide an affirmative answer to Simons' question. In fact, we show that
the sum theorem is true for a maximal monotone linear relation and a normal cone operator.
The proof relies on Rockafellar's formula for the Fenchel conjugate of the sum,
as well as some results featuring the Fitzpatrick function.
This is joint work with Heinz Bauschke and Shawn Wang.
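For context, Rockafellar's conjugation formula states that under his constraint qualification (e.g., the interior of the domain of one function meets the domain of the other), the conjugate of the sum is the exact infimal convolution of the conjugates:

```latex
(f + g)^{*}(x^{*}) \;=\; \min_{y^{*}} \bigl( f^{*}(y^{*}) + g^{*}(x^{*} - y^{*}) \bigr),
```

with the minimum attained for every $x^{*}$.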