MIT 6.7220/15.084 – Nonlinear Optimization        Tue, Apr 23rd 2024
Lecture 15
Central path and interior-point methods
Instructor: Prof. Gabriele Farina (gfarina@mit.edu)†
† These notes are class material that has not undergone formal peer review. The TAs and I are grateful for any reports of typos.
Having laid the foundations of self-concordant functions, we are ready to see one of the most impor-
tant applications of these functions: interior-point methods.
Since we will be working extensively with self-concordant functions, we will make the blanket assumption that $\Omega$ is an open, convex, nonempty set.
1 Path-following interior-point methods: chasing the central path
Consider a problem of the form

$$\min_{x} \ \langle c, x\rangle \quad \text{s.t.} \quad x \in \overline{\Omega},$$

where $c \in \mathbb{R}^n$ and $\overline{\Omega}$ denotes the closure of the open, convex, and nonempty set $\Omega \subseteq \mathbb{R}^n$.
Unlike iterative methods that project onto the feasible set (such as, for example, projected gradient descent and the mirror descent algorithm), interior-point methods work by constructing a sequence of feasible points in $\Omega$ whose limit is the solution to the problem. To do so, interior-point methods consider a sequence of optimization problems with objective

$$\gamma \langle c, x\rangle + F(x),$$

where $\gamma \ge 0$ is a parameter and $F$ is a strongly nondegenerate self-concordant function on $\Omega$.
[Figure: The central path traced by the sequence of solutions $z(\gamma)$ to the regularized problem $\arg\min\{-\gamma \cdot (x + y) + F(x) : x \in \overline{\Omega}\}$, for increasing values of $\gamma \ge 0$, from $\gamma = 0$ to $\gamma = +\infty$. The self-concordant function $F$ is the polyhedral barrier. The red dot, corresponding to the solution at $\gamma = 0$, is called the analytic center.]
As we saw in Lecture 14, self-concordant functions shoot to infinity at the boundary of their domain, and hence the minimizer of the regularized objective is guaranteed to lie in the interior of the feasible set. The parameter $\gamma$ is increased over time: as $\gamma$ grows, the original objective function $\langle c, x\rangle$ becomes the dominant term, and the solution to the regularized problem approaches the boundary more and more. The path of solutions traced by the regularized problems is called the central path.
Definition 1.1 (Central path). Let $F : \Omega \to \mathbb{R}$ be a lower-bounded strongly nondegenerate self-concordant function. The central path is the curve $z$, parameterized over $\gamma \ge 0$, traced by the solutions¹ to the regularized optimization problem

$$z(\gamma) := \arg\min_{x} \ \gamma \langle c, x\rangle + F(x) \quad \text{s.t.} \quad x \in \Omega.$$
ยนRemember that lower-bounded self-concordant functions always have a unique minimizer, as seen in Theorem 2.5
of Lecture 14.
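As a concrete illustration of Definition 1.1 (not taken from the notes), consider the one-dimensional case $\Omega = \mathbb{R}_{>0}$ with the log barrier $F(x) = -\log(x)$ and a cost $c > 0$: the regularized objective $\gamma c x - \log x$ is minimized where $\gamma c - 1/x = 0$, so the central path has the closed form $z(\gamma) = 1/(\gamma c)$. A short sanity check:

```python
import math

def z(gamma, c):
    """Central path for Omega = (0, inf), F(x) = -log(x): the minimizer of
    gamma*c*x - log(x) satisfies gamma*c = 1/x, i.e. x = 1/(gamma*c)."""
    return 1.0 / (gamma * c)

def objective(x, gamma, c):
    return gamma * c * x - math.log(x)

c = 2.0
for gamma in (0.5, 1.0, 10.0, 100.0):
    x_star = z(gamma, c)
    # first-order optimality: the gradient gamma*c - 1/x vanishes at z(gamma)
    assert abs(gamma * c - 1.0 / x_star) < 1e-12
    # the central-path point beats nearby feasible points
    for x in (0.5 * x_star, 2.0 * x_star):
        assert objective(x_star, gamma, c) < objective(x, gamma, c)

print(z(100.0, 2.0))  # as gamma grows, z(gamma) approaches the boundary x = 0
```

Note how $z(\gamma) \to 0$ as $\gamma \to +\infty$: the path drifts toward the boundary minimizer of $\langle c, x\rangle$, exactly as described above.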
1.1 Barriers and their complexity parameter
As it turns out, the performance of path-following interior-point methods depends crucially on a
parameter of the strongly nondegenerate self-concordant function used, which is called the complex-
ity parameter of the function.
Definition 1.2 (Complexity parameter). The complexity parameter of a strongly nondegenerate self-concordant function $F : \Omega \to \mathbb{R}$ is defined as the supremum of the intrinsic squared norm of the second-order descent direction (Newton step) at any point in the domain, that is,

$$\nu_F := \sup_{x \in \Omega} \|n(x)\|_x^2.$$
Theorem 1.1 ([NN94], Corollary 2.3.3). The complexity parameter of a strongly nondegenerate
self-concordant function is at least 1.
We reserve the term barrier for only those self-concordant functions for which the complexity para-
meter is finite, as we make formal next.
Definition 1.3 (Barrier function). A strongly nondegenerate self-concordant barrier (for us, simply barrier) is a strongly nondegenerate self-concordant function $F$ whose complexity parameter is finite.
For example, in the case of the log barrier for the positive orthant, we can bound the complexity
parameter as follows.
Example 1.1. The logarithmic barrier for the positive orthant $\mathbb{R}^n_{>0}$, defined as

$$F : \mathbb{R}^n_{>0} \to \mathbb{R}, \qquad F(x) = -\sum_{i=1}^{n} \log(x_i),$$

has complexity parameter $\nu_F = n$.
Solution. The Hessian of the logarithmic barrier is

$$\nabla^2 F(x) = \operatorname{diag}\!\left(\frac{1}{x_1^2}, \dots, \frac{1}{x_n^2}\right),$$

and the Newton step is

$$n(x) = -[\nabla^2 F(x)]^{-1} \nabla F(x) = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$

Hence, the intrinsic norm of the Newton step satisfies

$$\|n(x)\|_x^2 = n(x)^\top [\nabla^2 F(x)]\, n(x) = \sum_{i=1}^{n} \frac{1}{x_i^2}\, x_i^2 = n,$$

as we wanted to show. □
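The computation in Example 1.1 can be checked numerically: at any point $x > 0$, the Newton step of the log barrier equals $x$ itself, and its intrinsic squared norm equals the dimension $n$. A small numpy sketch (illustrative, not from the notes):

```python
import numpy as np

# Verify Example 1.1: for F(x) = -sum(log x_i) on the positive orthant,
# the Newton step is n(x) = x and ||n(x)||_x^2 = n at every x > 0.
rng = np.random.default_rng(0)
n = 5
for _ in range(3):
    x = rng.uniform(0.1, 10.0, size=n)
    grad = -1.0 / x                       # gradient of F
    hess = np.diag(1.0 / x**2)            # Hessian of F
    step = -np.linalg.solve(hess, grad)   # Newton step; equals x here
    assert np.allclose(step, x)
    norm_sq = step @ hess @ step          # intrinsic squared norm ||n(x)||_x^2
    assert abs(norm_sq - n) < 1e-9
```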
However, not all self-concordant functions are barriers.
Example 1.2. The function $f(x) = x - \log(x)$ is strongly nondegenerate self-concordant on $\Omega := \mathbb{R}_{>0}$, but it is not a barrier.
Solution. We already know that $f$ is self-concordant, since $-\log(x)$ is self-concordant (see Lecture 14), and the addition of a linear function to a self-concordant function preserves self-concordance. The Hessian of $f$ is $\nabla^2 f(x) = 1/x^2$, and the Newton step is correspondingly

$$n(x) = -[\nabla^2 f(x)]^{-1} \nabla f(x) = -x^2\left(1 - \frac{1}{x}\right).$$

Hence, the intrinsic norm of the Newton step is

$$\|n(x)\|_x^2 = \frac{1}{x^2}\left[x^2\left(1 - \frac{1}{x}\right)\right]^2 = x^2 - 2x + 1,$$

which is unbounded as $x \to +\infty$. □
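The unboundedness in Example 1.2 is easy to observe numerically: the intrinsic squared norm of the Newton step equals $(x-1)^2$, which grows without bound. A quick check (illustrative sketch, not from the notes):

```python
# Companion to Example 1.2: for f(x) = x - log(x) on (0, inf),
# ||n(x)||_x^2 = (x - 1)^2, so f is self-concordant but not a barrier.
def newton_norm_sq(x):
    grad = 1.0 - 1.0 / x       # f'(x)
    hess = 1.0 / x**2          # f''(x)
    step = -grad / hess        # Newton step n(x) = -x^2 (1 - 1/x)
    return step * hess * step  # ||n(x)||_x^2 = n(x)^2 * f''(x)

for x in (2.0, 10.0, 1000.0):
    assert abs(newton_norm_sq(x) - (x - 1.0) ** 2) < 1e-6
```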
1.2 Complexity parameter and optimality gap of the central path
The complexity parameter of a barrier function is a crucial quantity that appears throughout the analysis of interior-point methods. As a first application, it provides an upper bound on the optimality gap of the regularized problem.
Theorem 1.2. Let $F : \Omega \to \mathbb{R}$ be a barrier function. For any $\gamma > 0$, the point $z(\gamma)$ on the central path (see Definition 1.1) satisfies the inequality

$$\langle c, z(\gamma)\rangle \le \left(\min_{x \in \overline{\Omega}} \langle c, x\rangle\right) + \frac{\nu_F}{\gamma}.$$
The above result ensures that as $\gamma$ becomes large enough, the objective value of the points on the central path becomes arbitrarily close to the optimal value of the original problem. With little extra work, the same can be said for approximate solutions to $z(\gamma)$.
Theorem 1.3. Let $F : \Omega \to \mathbb{R}$ be a barrier function. For any $\gamma > 0$ and point $x \in \Omega$ such that $\|x - z(\gamma)\|_x \le \frac{1}{6}$,

$$\langle c, x\rangle \le \left(\min_{x' \in \overline{\Omega}} \langle c, x'\rangle\right) + \frac{6}{5\gamma}\, \nu_F.$$
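In the one-dimensional log-barrier example ($\Omega = \mathbb{R}_{>0}$, $F(x) = -\log x$, $c > 0$), the bound of Theorem 1.2 is in fact tight: the central path is $z(\gamma) = 1/(\gamma c)$, the optimal value over $\overline{\Omega}$ is $0$, and $\nu_F = 1$, so $\langle c, z(\gamma)\rangle = 1/\gamma = \nu_F/\gamma$ exactly. A quick check of this (illustrative, not from the notes):

```python
# For Omega = (0, inf) with F(x) = -log(x) and cost c > 0:
#   z(gamma) = 1/(gamma*c),  min_{x >= 0} c*x = 0,  nu_F = 1,
# so the gap <c, z(gamma)> - 0 equals nu_F / gamma, matching Theorem 1.2
# with equality.
nu_F = 1.0
c = 3.0
for gamma in (0.1, 1.0, 50.0):
    z_gamma = 1.0 / (gamma * c)
    gap = c * z_gamma - 0.0  # optimality gap of the central-path point
    assert abs(gap - nu_F / gamma) < 1e-12
```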
2 The (short-step) barrier method
The idea of the short-step barrier method is to chase the central path closely at every iteration. This is conceptually the simplest interior-point method; more advanced versions are the long-step barrier method and the predictor-corrector barrier method, the latter of which is implemented in commercial solvers such as CPLEX and Gurobi. We will use the terms short-step barrier method and barrier method interchangeably today.
Assume that we know an initial point $x_1 \in \Omega$ that is close to the point $z(\gamma_1)$ on the central path, for some value of $\gamma_1 > 0$. The barrier algorithm now increases the parameter $\gamma_1$ to a value $\gamma_2 = \beta\gamma_1$ (where $\beta > 1$), and applies Newton's method to approximate the solution $z(\gamma_2)$. As long as $x_1$ was sufficiently close to $z(\gamma_1)$, we expect that in switching from $\gamma_1$ to $\gamma_2$, the point $x_1$ will still be in the region of quadratic convergence. In this case, Newton's method converges so fast that (as we will see formally in the next subsection) a single Newton step is sufficient to produce a point $x_2 := x_1 + n_{\gamma_2}(x_1)$ that is again very close to the central path at $z(\gamma_2)$. For the choice of parameter $\gamma_2$, the Newton step is in particular

$$x_2 := x_1 - [\nabla^2 F(x_1)]^{-1}(\gamma_2 c + \nabla F(x_1)),$$

since the objective to which we apply the second-order descent direction is by definition that of the problem

$$\min_{x} \ \gamma_2 \langle c, x\rangle + F(x) \quad \text{s.t.} \quad x \in \Omega.$$

Continuing this process indefinitely, that is,

$$\gamma_{t+1} := \beta\gamma_t, \qquad x_{t+1} := x_t - [\nabla^2 F(x_t)]^{-1}(\gamma_{t+1} c + \nabla F(x_t)),$$

we obtain the short-step barrier method.
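The iteration above is short enough to sketch in full. The following numpy sketch runs the short-step barrier method on the box $\Omega = (0,1)^2$ with objective $\langle c, x\rangle$ and barrier $F(x) = -\sum_i [\log x_i + \log(1-x_i)]$, whose complexity parameter is $\nu_F = 4$. The starting point and $\gamma_1$ follow the analytic-center initialization discussed later in these notes; everything else (the specific box, the cost vector) is an assumption made for illustration, not code from the lecture:

```python
import numpy as np

# Short-step barrier method on Omega = (0,1)^2 with the log barrier
# F(x) = -sum(log x_i + log(1 - x_i)), complexity parameter nu_F = 4.
c = np.array([1.0, -1.0])
nu_F = 4.0
beta = 1.0 + 1.0 / (8.0 * np.sqrt(nu_F))   # safe multiplicative increase of gamma

def grad_F(x):
    return -1.0 / x + 1.0 / (1.0 - x)

def hess_F(x):
    return np.diag(1.0 / x**2 + 1.0 / (1.0 - x)**2)

x = np.array([0.5, 0.5])                   # analytic center of the box
r = np.linalg.solve(hess_F(x), c)
gamma = 1.0 / (9.0 * np.sqrt(c @ r))       # gamma_1 = 1 / (9 ||[Hess F(w)]^{-1} c||_w)

for _ in range(400):
    gamma *= beta                          # gamma_{t+1} = beta * gamma_t
    g = gamma * c + grad_F(x)
    x = x - np.linalg.solve(hess_F(x), g)  # one Newton step per iteration

# The minimum of <c, x> over the closed box is -1, attained at (0, 1).
assert np.all(x > 0) and np.all(x < 1)     # iterates stay strictly feasible
assert c @ x - (-1.0) < 1e-3               # small optimality gap
```

Note how a single Newton step per value of $\gamma$ suffices: this is exactly the guarantee established in the next subsection.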
2.1 Update of the parameter $\gamma$
As we did in Lecture 14, we will denote the second-order direction of descent (that is, the Newton step) starting from a point $x$ using the letter $n$. However, since we are now dealing with a continuum of objective functions parameterized by $\gamma$, we will also need to specify which objective (that is, which value of $\gamma$) we are applying the Newton step to. For this reason, we introduce the notation

$$n_\gamma(x) := -[\nabla^2 F(x)]^{-1}(\gamma c + \nabla F(x)).$$
The main technical hurdle in analyzing the short-step barrier method is to quantify the proximity of the iterates to the central path. As is common with self-concordant functions, we will measure such proximity using the lengths of the Newton steps: $x_t$ is near $z(\gamma_t)$ in the sense that the intrinsic norm of the Newton step $n_{\gamma_t}(x_t)$ is small (this should feel natural recalling Theorem 3.1 in Lecture 14).
How close to the central path is close enough, so that the barrier method using a single Newton update per iteration is guaranteed to work? As we move our attention from the objective $\gamma_t\langle c, x\rangle + F(x)$ to the objective $\gamma_{t+1}\langle c, x\rangle + F(x)$, we can expect that the distance from $x_t$ to $z(\gamma_{t+1})$ exceeds the distance from $x_t$ to $z(\gamma_t)$ by a certain amount. If this amount is not too large, then we can hope to use Theorem 3.2 in Lecture 14 to "recover" the distance lost in a single Newton step, and close the induction. The following theorem operationalizes the idea we just stated, and provides a concrete quantitative answer to what "close enough" means. In particular, we will show that $\|n_{\gamma_t}(x_t)\|_{x_t} \le \frac{1}{9}$ is enough.
Theorem 2.1. If $x_t$ is close to the central path, in the sense that $\|n_{\gamma_t}(x_t)\|_{x_t} \le \frac{1}{9}$, then by setting

$$\gamma_{t+1} := \beta\gamma_t \quad \text{with} \quad \beta := 1 + \frac{1}{8\sqrt{\nu_F}},$$

the same proximity is guaranteed at time $t+1$, that is, $\|n_{\gamma_{t+1}}(x_{t+1})\|_{x_{t+1}} \le \frac{1}{9}$.
Proof. We need to go from a statement pertaining to $\|n_{\gamma_t}(x_t)\|_{x_t}$ to one pertaining to $\|n_{\gamma_{t+1}}(x_{t+1})\|_{x_{t+1}}$. We will do so by combining two facts:
1. First, observe the equality (valid for all $\gamma_{t+1}$ and $\gamma_t$)

$$\begin{aligned}
n_{\gamma_{t+1}}(x_t) &= -[\nabla^2 F(x_t)]^{-1}(\gamma_{t+1} c + \nabla F(x_t)) \\
&= -\frac{\gamma_{t+1}}{\gamma_t}[\nabla^2 F(x_t)]^{-1}\left(\gamma_t c + \frac{\gamma_t}{\gamma_{t+1}} \nabla F(x_t)\right) \\
&= -\frac{\gamma_{t+1}}{\gamma_t}[\nabla^2 F(x_t)]^{-1}(\gamma_t c + \nabla F(x_t)) + \frac{\gamma_{t+1} - \gamma_t}{\gamma_t}[\nabla^2 F(x_t)]^{-1}\nabla F(x_t) \\
&= \frac{\gamma_{t+1}}{\gamma_t}\, n_{\gamma_t}(x_t) + \left(\frac{\gamma_{t+1}}{\gamma_t} - 1\right)[\nabla^2 F(x_t)]^{-1}\nabla F(x_t).
\end{aligned}$$

Using the triangle inequality for the norm $\|\cdot\|_{x_t}$ and plugging in the hypotheses of the statement, we get

$$\begin{aligned}
\|n_{\gamma_{t+1}}(x_t)\|_{x_t} &\le \frac{\gamma_{t+1}}{\gamma_t}\|n_{\gamma_t}(x_t)\|_{x_t} + \left|\frac{\gamma_{t+1}}{\gamma_t} - 1\right| \cdot \|[\nabla^2 F(x_t)]^{-1}\nabla F(x_t)\|_{x_t} \\
&\le \frac{\gamma_{t+1}}{\gamma_t}\|n_{\gamma_t}(x_t)\|_{x_t} + \left|\frac{\gamma_{t+1}}{\gamma_t} - 1\right| \cdot \sqrt{\nu_F} \\
&\le \frac{1}{9}\left(1 + \frac{1}{8\sqrt{\nu_F}}\right) + \frac{1}{8\sqrt{\nu_F}} \cdot \sqrt{\nu_F} \\
&\le \frac{1}{9}\left(1 + \frac{1}{8}\right) + \frac{1}{8} = \frac{1}{4} \qquad (\text{since } \nu_F \ge 1).
\end{aligned}$$

However, the left-hand side of the inequality is $\|n_{\gamma_{t+1}}(x_t)\|_{x_t}$ and not $\|n_{\gamma_{t+1}}(x_{t+1})\|_{x_{t+1}}$. This is where the second step comes in.
2. To complete the bound, we will convert from $\|n_{\gamma_{t+1}}(x_t)\|_{x_t}$ to $\|n_{\gamma_{t+1}}(x_{t+1})\|_{x_{t+1}}$. To do so, remember that $x_{t+1}$ is obtained from $x_t$ by taking a Newton step. Hence, using Theorem 3.2 of Lecture 14, we have

$$\|n_{\gamma_{t+1}}(x_{t+1})\|_{x_{t+1}} \le \left(\frac{\|n_{\gamma_{t+1}}(x_t)\|_{x_t}}{1 - \|n_{\gamma_{t+1}}(x_t)\|_{x_t}}\right)^2 \le \left(\frac{1/4}{1 - 1/4}\right)^2 = \frac{1}{9}.$$

This completes the proof. □
Remark 2.1. Remarkably, a safe increase in $\gamma$ depends only on the complexity parameter $\nu_F$ of the barrier, and not on any other property of the function. For example, for a linear program

$$\min_{x} \ c^\top x \quad \text{s.t.} \quad Ax = b, \ x \ge 0, \ x \in \mathbb{R}^n,$$

using the polyhedral barrier function, the increase in $\gamma$ is independent of the number of constraints of the problem or the sparsity of $A$, and we can increase $\gamma_{t+1} = \gamma_t \cdot \left(1 + \frac{1}{8\sqrt{n}}\right)$.
The result in Theorem 2.1 shows that at every iteration, it is safe to increase $\gamma$ by a factor of $1 + \frac{1}{8\sqrt{\nu_F}} > 1$, which leads to exponential growth in the weight given to the objective function of the problem.
Hence, combining the previous result with Theorem 1.2 we find the following guarantee.
Theorem 2.2. Consider running the short-step barrier method with a barrier function $F$ with complexity parameter $\nu_F$, starting from a point $x_1$ close to $z(\gamma_1)$, i.e., $\|n_{\gamma_1}(x_1)\|_{x_1} \le 1/9$, for some $\gamma_1 > 0$. For any $\epsilon > 0$, after

$$T = \left\lceil 10\sqrt{\nu_F}\, \log\!\left(\frac{6\nu_F}{5\epsilon\gamma_1}\right) \right\rceil$$

iterations, the solution computed by the short-step barrier method guarantees an $\epsilon$-suboptimal objective value $\langle c, x_T\rangle \le \left(\min_{x \in \overline{\Omega}} \langle c, x\rangle\right) + \epsilon$.
Proof. Since at every iteration the value of $\gamma$ is increased by the factor $1 + \frac{1}{8\sqrt{\nu_F}}$, the number of iterations required to increase the value from $\gamma_1$ to any value $\gamma$ is given by

$$T = \left\lceil \frac{\log(\gamma/\gamma_1)}{\log\!\left(1 + \frac{1}{8\sqrt{\nu_F}}\right)} \right\rceil \le \left\lceil \log\!\left(\frac{\gamma}{\gamma_1}\right) \cdot \frac{5}{4} \cdot 8\sqrt{\nu_F} \right\rceil \qquad \left(\text{since } \frac{1}{\log(1+x)} \le \frac{5}{4x} \text{ for all } 0 < x \le \frac{1}{2}\right)$$

$$= \left\lceil 10\sqrt{\nu_F}\, \log\!\left(\frac{\gamma}{\gamma_1}\right) \right\rceil.$$

On the other hand, we know from Theorem 1.3 that the optimality gap of the final iterate is at most $6\nu_F/(5\gamma)$ as long as $\|x_T - z(\gamma_T)\|_{x_T} \le \frac{1}{6}$. This is indeed the case from Remark 3.2 of Lecture 14. So, to reach an optimality gap of $\epsilon$, we need $\gamma = 6\nu_F/(5\epsilon)$. Substituting this value into the previous bound yields the statement. □
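The two elementary facts used in the proof of Theorem 2.2 can be checked numerically: the inequality $1/\log(1+x) \le 5/(4x)$ on $(0, 1/2]$, and the resulting bound on the number of iterations. A quick sketch (illustrative, not from the notes):

```python
import math

# Check that 1/log(1+x) <= 5/(4x) on (0, 1/2], and that the exact iteration
# count ceil(log(g) / log(1 + 1/(8 sqrt(nu)))) is dominated by the rounded
# bound ceil(10 sqrt(nu) log(g)) used in Theorem 2.2.
for i in range(1, 501):
    x = 0.5 * i / 500.0  # x ranges over (0, 1/2]
    assert 1.0 / math.log1p(x) <= 5.0 / (4.0 * x) + 1e-12

for nu in (1.0, 4.0, 100.0):
    for g in (10.0, 1e3, 1e6):  # g plays the role of gamma / gamma_1
        exact = math.ceil(math.log(g) / math.log1p(1.0 / (8.0 * math.sqrt(nu))))
        bound = math.ceil(10.0 * math.sqrt(nu) * math.log(g))
        assert exact <= bound
```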
2.2 Finding a good initial point
The result in Theorem 2.2 shows that, as long as we know a point $x_1$ that is "close" (in the formal sense of Theorem 2.1) to the central path for a parameter $\gamma_1$ that is not too small, then we can guarantee an $\epsilon$-suboptimal solution in roughly $\sqrt{\nu_F}\log(1/\epsilon)$ iterations.
■ The analytic center. Intuitively, one might guess that a good initial point for the algorithm would be a point close to $\omega := z(0)$ (the minimizer of $F$ on $\Omega$), which is often called the analytic center of $\Omega$. Let's verify that this is indeed the case. By definition, such a point satisfies $\nabla F(\omega) = 0$, and so we have that

$$n_\gamma(\omega) = -\gamma[\nabla^2 F(\omega)]^{-1} c \implies \|n_\gamma(\omega)\|_\omega = \gamma \cdot \|[\nabla^2 F(\omega)]^{-1} c\|_\omega.$$

Hence, $x_1 = \omega$ is within proximity $1/9$ (in the sense of Theorem 2.1) of the central path for the value of

$$\gamma_1 = \frac{1}{9\,\|[\nabla^2 F(\omega)]^{-1} c\|_\omega}.$$
The only thing left to check is that $\gamma_1$ is not excessively small, so that the number of iterations predicted in Theorem 2.2 is not too large. We now show that we can indeed upper bound $\|[\nabla^2 F(\omega)]^{-1} c\|_\omega$.
Theorem 2.3. Let $\omega$ be the minimizer of the barrier $F$ on $\Omega$. Then,

$$\|[\nabla^2 F(\omega)]^{-1} c\|_\omega \le \langle c, \omega\rangle - \min_{x \in \overline{\Omega}} \langle c, x\rangle.$$

(So, in particular, $\|[\nabla^2 F(\omega)]^{-1} c\|_\omega \le \max_{x \in \overline{\Omega}} \langle c, x\rangle - \min_{x \in \overline{\Omega}} \langle c, x\rangle$.)
Proof. The direction $-[\nabla^2 F(\omega)]^{-1} c$ is a descent direction for the objective $\langle c, \cdot\rangle$, since

$$\langle c, -[\nabla^2 F(\omega)]^{-1} c\rangle = -\|[\nabla^2 F(\omega)]^{-1} c\|_\omega^2 \le 0.$$

Hence, as we consider points $x(\tau) := \omega - \tau \cdot [\nabla^2 F(\omega)]^{-1} c$ for $\tau \ge 0$ such that $x(\tau) \in \Omega$, the value of the objective $\langle c, x(\tau)\rangle$ decreases monotonically, and in particular

$$\langle c, x(\tau)\rangle = \langle c, \omega\rangle - \tau \cdot \|[\nabla^2 F(\omega)]^{-1} c\|_\omega^2,$$

which implies that

$$\|[\nabla^2 F(\omega)]^{-1} c\|_\omega^2 = \frac{\langle c, \omega\rangle - \langle c, x(\tau)\rangle}{\tau} \le \frac{\langle c, \omega\rangle - \min_{x \in \overline{\Omega}} \langle c, x\rangle}{\tau}.$$

To complete the proof, it suffices to show that we can move in the direction of $-[\nabla^2 F(\omega)]^{-1} c$ for a meaningful amount $\tau$. For this, we will use the property of self-concordant functions that the Dikin ellipsoid $\{x \in \mathbb{R}^n : \|x - \omega\|_\omega < 1\}$ is contained in $\Omega$. In particular, this implies that any $\tau \ge 0$ such that

$$1 > \|\omega - x(\tau)\|_\omega = \tau\, \|[\nabla^2 F(\omega)]^{-1} c\|_\omega$$

generates a point $x(\tau) \in \Omega$. So, we must have

$$\|[\nabla^2 F(\omega)]^{-1} c\|_\omega^2 \le \inf\left\{\frac{\langle c, \omega\rangle - \min_{x \in \overline{\Omega}} \langle c, x\rangle}{\tau} : 0 < \tau < \frac{1}{\|[\nabla^2 F(\omega)]^{-1} c\|_\omega}\right\} = \left(\langle c, \omega\rangle - \min_{x \in \overline{\Omega}} \langle c, x\rangle\right)\|[\nabla^2 F(\omega)]^{-1} c\|_\omega,$$

which implies the statement. □
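Theorem 2.3 is easy to check numerically on a concrete domain. On the box $\Omega = (0,1)^2$ with log barrier $F(x) = -\sum_i [\log x_i + \log(1-x_i)]$, the analytic center is $\omega = (1/2, 1/2)$ and $\nabla^2 F(\omega) = 8 I$; the choice of domain and the random costs below are assumptions for illustration, not from the notes:

```python
import numpy as np

# Check Theorem 2.3 on Omega = (0,1)^2 with the log barrier: verify that
# ||[Hess F(w)]^{-1} c||_w  <=  <c, w> - min_{x in closure} <c, x>
# for several random cost vectors c.
omega = np.array([0.5, 0.5])
H = np.diag(1.0 / omega**2 + 1.0 / (1.0 - omega)**2)  # Hess F(omega) = 8*I

rng = np.random.default_rng(1)
for _ in range(5):
    c = rng.normal(size=2)
    r = np.linalg.solve(H, c)
    lhs = np.sqrt(c @ r)  # ||[Hess F(omega)]^{-1} c||_omega = sqrt(c^T H^{-1} c)
    # min of <c, x> over the closed box [0,1]^2 is sum_i min(0, c_i)
    rhs = c @ omega - np.minimum(c, 0.0).sum()
    assert lhs <= rhs + 1e-12
```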
So, we have shown the following.
Theorem 2.4 (The analytic center $\omega$ is a good initial point). Let $F$ be a barrier function with complexity parameter $\nu_F$. If the short-step barrier method is initialized at the analytic center $\omega$, then the number of iterations required to obtain an $\epsilon$-suboptimal solution is bounded by

$$T = \left\lceil 10\sqrt{\nu_F}\, \log\!\left(\frac{11\,\nu_F}{\epsilon}\left(\langle c, \omega\rangle - \min_{x \in \overline{\Omega}} \langle c, x\rangle\right)\right) \right\rceil.$$
■ Path switching and the auxiliary central path. In practice, we might not know where the analytic center is. In this case, the typical solution is to first approximate the analytic center, and then start the short-step barrier method from there as usual.
To approximate the analytic center, one can use the auxiliary central path. The idea is the following: start from an arbitrary point $x' \in \Omega$. Such a point is on the central path traced by the solutions to

$$z'(\sigma) := \arg\min_{x} \ -\sigma\langle \nabla F(x'), x\rangle + F(x) \quad \text{s.t.} \quad x \in \Omega.$$

Indeed, note that $x'$ is the solution for $\sigma = 1$, that is, $x' = z'(1)$.
We can then run the short-step barrier method chasing $z'$ in reverse. At every step, we will decrease the value of $\sigma$ by a factor of $1 - \frac{1}{8\sqrt{\nu_F}}$. Once the value of $\sigma$ is sufficiently small that $\|[\nabla^2 F(x)]^{-1}\nabla F(x)\|_x \le 1/6$, we will have reached a point that is close to the analytic center, and we can start the regular short-step barrier method for $z(\gamma)$ from there. This technique is called path switching, since we follow two central paths (one from $x'$ to the analytic center, and one from the analytic center to the solution), switching from one path to the other at the analytic center. [▷ Try to work out the details and convince yourself this works!]
3 Further readings
The short book by Renegar [Ren01] and the monograph by Nesterov [Nes18] (Chapter 5) provide a comprehensive introduction to self-concordant functions and their applications in optimization.
I especially recommend the book by Renegar [Ren01] for a concise yet rigorous account.
[NN94] Y. Nesterov and A. Nemirovskii, Interior-Point Polynomial Algorithms in Convex Programming. Philadelphia, PA: Society for Industrial and Applied Mathematics, 1994. doi: 10.1137/1.9781611970791.
[Ren01] J. Renegar, A Mathematical View of Interior-point Methods in Convex Optimization.
Philadelphia, PA, USA: SIAM, 2001. doi: 10.1137/1.9780898718812.
[Nes18] Y. Nesterov, Lectures on Convex Optimization. Springer International Publishing, 2018.
[Online]. Available: https://link.springer.com/book/10.1007/978-3-319-91578-4