MIT 6.7220/15.084 — Nonlinear Optimization (Spring '25) May 6-8th 2025
Lecture 20
Central path and interior-point methods
Instructor: Prof. Gabriele Farina (gfarina@mit.edu)†
Having laid the foundations of self-concordant functions, we are ready to see one of the
most important applications of these functions: interior-point methods.
Since we will be working extensively with self-concordant functions, we will make the blanket
assumption that $\Omega$ is an open, convex, nonempty set.
L20.1 Path-following interior-point methods: chasing the central path
Consider a problem of the form
$$\min_x \;\; \langle c, x \rangle \qquad \text{s.t.} \;\; x \in \overline{\Omega},$$
where $c \in \mathbb{R}^n$ and $\overline{\Omega}$ denotes the closure of the open, convex, and nonempty set $\Omega \subseteq \mathbb{R}^n$.
Unlike iterative methods that project onto the feasible set (such as projected gradient descent and mirror descent), interior-point methods work by constructing a sequence of feasible points in $\Omega$ whose limit is the solution to the problem. To do so, interior-point methods consider a sequence of optimization problems with objective
$$\gamma \langle c, x \rangle + F(x),$$
where $\gamma \ge 0$ is a parameter and $F$ is a strongly nondegenerate self-concordant function on $\Omega$.
[Figure: The central path $z(\gamma)$ in a polytope $\Omega$, traced by the sequence of solutions to the regularized problem $\arg\min\{-\gamma \cdot (x + y) + F(x) : x \in \Omega\}$, for increasing values of $\gamma \ge 0$ (from $\gamma = 0$ to $\gamma = +\infty$). The self-concordant function $F$ is the polyhedral barrier. The red dot, corresponding to the solution at $\gamma = 0$, is called the analytic center.]
As we saw in Lecture 19, self-concordant functions shoot to infinity at the boundary of their domain, and hence the minimizer of the regularized objective is guaranteed to lie in the interior of the feasible set. The parameter $\gamma$ is increased over time: as $\gamma$ grows, the original objective function $\langle c, x \rangle$ becomes the dominant term, and the solution to the regularized problem moves closer and closer to the boundary. The path of solutions traced by the regularized problems is called the central path.
Definition L20.1 (Central path). Let $F : \Omega \to \mathbb{R}$ be a lower-bounded strongly nondegenerate self-concordant function. The central path is the curve $z$, parameterized over $\gamma \ge 0$, traced by the solutions¹ to the regularized optimization problem
$$z(\gamma) \coloneqq \arg\min_x \;\; \gamma \langle c, x \rangle + F(x) \qquad \text{s.t.} \;\; x \in \Omega.$$
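As a concrete illustration (not from the notes, but easy to verify): for the logarithmic barrier $F(x) = -\sum_{i=1}^n \log(x_i)$ on $\Omega = \mathbb{R}^n_{>0}$ and a cost vector with $c_i > 0$ for all $i$, the central path has a closed form for every $\gamma > 0$. Setting the gradient of the regularized objective to zero,
$$\gamma c_i - \frac{1}{x_i} = 0 \quad\iff\quad z(\gamma)_i = \frac{1}{\gamma c_i},$$
so every coordinate of $z(\gamma)$ slides toward $0$ as $\gamma \to +\infty$, that is, toward the minimizer of $\langle c, x \rangle$ on $\overline{\Omega}$.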
L20.1.1 Barriers and their complexity parameter
As it turns out, the performance of path-following interior-point methods depends crucially
on a parameter of the strongly nondegenerate self-concordant function used, which is called
the complexity parameter of the function.
Definition L20.2 (Complexity parameter). The complexity parameter of a strongly nondegenerate self-concordant function $F : \Omega \to \mathbb{R}$ is defined as the supremum of the intrinsic squared norm of the second-order descent direction (Newton step) at any point in the domain, that is,
$$\nu_F \coloneqq \sup_{x \in \Omega} \|n(x)\|_x^2.$$
Theorem L20.1 ([NN94], Corollary 2.3.3). The complexity parameter of a strongly
nondegenerate self-concordant function is at least 1.
We reserve the term barrier for only those self-concordant functions for which the complexity
parameter is finite, as we make formal next.
Definition L20.3 (Barrier function). A strongly nondegenerate self-concordant barrier (for us, simply barrier) is a strongly nondegenerate self-concordant function $F$ whose complexity parameter is finite.
For example, in the case of the log barrier for the positive orthant, we can bound the
complexity parameter as follows.
Example L20.1. The logarithmic barrier for the positive orthant $\mathbb{R}^n_{>0}$, defined as
$$F : \mathbb{R}^n_{>0} \to \mathbb{R}, \qquad F(x) = -\sum_{i=1}^n \log(x_i),$$
has complexity parameter $\nu_F = n$.
Solution. The Hessian of the logarithmic barrier is
$$\nabla^2 F(x) = \operatorname{diag}\!\left(\frac{1}{x_1^2}, \ldots, \frac{1}{x_n^2}\right),$$
and the Newton step is
$$n(x) = -[\nabla^2 F(x)]^{-1} \nabla F(x) = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$
Hence, the intrinsic norm of the Newton step satisfies
$$\|n(x)\|_x^2 = n(x)^\top [\nabla^2 F(x)]\, n(x) = \sum_{i=1}^n \frac{1}{x_i^2} \cdot x_i^2 = n,$$
as we wanted to show. □
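The computation above is easy to sanity-check numerically. The following sketch (illustrative code, not part of the notes) evaluates $\|n(x)\|_x^2$ for the log barrier at random interior points and confirms the value is always $n$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
for _ in range(3):
    x = rng.uniform(0.1, 10.0, size=n)     # random point in the positive orthant
    grad = -1.0 / x                        # gradient of F(x) = -sum(log(x_i))
    hess = np.diag(1.0 / x**2)             # Hessian of F
    newton = -np.linalg.solve(hess, grad)  # Newton step n(x); equals x coordinatewise
    print(newton @ hess @ newton)          # intrinsic squared norm; prints 5.0 every time
```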
However, not all self-concordant functions are barriers.
Example L20.2. The function $F(x) = x - \log(x)$ is strongly nondegenerate self-concordant on $\Omega \coloneqq \mathbb{R}_{>0}$, but it is not a barrier.

Solution. We already know that $F$ is self-concordant, since $-\log(x)$ is self-concordant (see Lecture 19), and the addition of a linear function to a self-concordant function preserves self-concordance.

The Hessian of $F$ is $\nabla^2 F(x) = 1/x^2$, and the Newton step is correspondingly
$$n(x) = -[\nabla^2 F(x)]^{-1} \nabla F(x) = -x^2 \left(1 - \frac{1}{x}\right).$$
Hence, the intrinsic norm of the Newton step is
$$\|n(x)\|_x^2 = \frac{1}{x^2} \left[x^2 \left(1 - \frac{1}{x}\right)\right]^2 = x^2 - 2x + 1,$$
which is unbounded as $x \to +\infty$. □
L20.1.2 Complexity parameter and optimality gap of the central path
The complexity parameter of a barrier function is a crucial quantity that appears in the analysis of interior-point methods. Its first application is in providing an upper bound on the optimality gap along the central path.
Theorem L20.2. Let $F : \Omega \to \mathbb{R}$ be a barrier function. For any $\gamma > 0$, the point $z(\gamma)$ on the central path (see Definition L20.1) satisfies the inequality
$$\langle c, z(\gamma) \rangle \le \left(\min_{x' \in \Omega} \langle c, x' \rangle\right) + \frac{\nu_F}{\gamma}.$$
The above result ensures that as $\gamma$ becomes large, the objective value of the points on the central path becomes arbitrarily close to the optimal value of the original problem. With little extra work, the same can be said for approximate solutions to $z(\gamma)$.
Theorem L20.3. Let $F : \Omega \to \mathbb{R}$ be a barrier function. For any $\gamma > 0$ and any point $x \in \Omega$ such that $\|x - z(\gamma)\|_x \le \frac{1}{6}$,
$$\langle c, x \rangle \le \left(\min_{x' \in \Omega} \langle c, x' \rangle\right) + \frac{6}{5\gamma} \nu_F.$$
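Incidentally, the closed-form example after Definition L20.1 shows that the bound of Theorem L20.2 is tight: for the log barrier on $\mathbb{R}^n_{>0}$ with $c > 0$ componentwise, we have $\nu_F = n$ and $\min_{x' \in \overline{\Omega}} \langle c, x' \rangle = 0$, while
$$\langle c, z(\gamma) \rangle = \sum_{i=1}^n c_i \cdot \frac{1}{\gamma c_i} = \frac{n}{\gamma} = \frac{\nu_F}{\gamma},$$
so the optimality gap at $z(\gamma)$ is exactly $\nu_F / \gamma$.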
L20.2 The (short-step) barrier method
The idea of the short-step barrier method is to chase the central path closely at every iteration. This is conceptually the simplest interior-point method; more advanced versions include the long-step barrier method and the predictor-corrector barrier method, the latter of which is implemented in commercial solvers such as CPLEX and Gurobi. We will use the terms short-step barrier method and barrier method interchangeably today.
Assume that we know an initial point $x_1 \in \Omega$ that is close to the point $z(\gamma_1)$ on the central path, for some value of $\gamma_1 > 0$. The barrier algorithm now increases the parameter $\gamma_1$ to a value $\gamma_2 = \beta\gamma_1$ (where $\beta > 1$), and applies Newton's method to approximate the solution $z(\gamma_2)$. As long as $x_1$ was sufficiently close to $z(\gamma_1)$, we expect that in switching from $\gamma_1$ to $\gamma_2$, the point $x_1$ will still be in the region of quadratic convergence. In this case, Newton's method converges so fast that (as we will see formally in the next subsection) a single Newton step is sufficient to produce a point $x_2 \coloneqq x_1 + n_{\gamma_2}(x_1)$ that is again very close to the central path at $z(\gamma_2)$. For the choice of parameter $\gamma_2$, the Newton step is in particular
$$x_2 \coloneqq x_1 - [\nabla^2 F(x_1)]^{-1} (\gamma_2 c + \nabla F(x_1)),$$
since the objective to which we apply the second-order descent direction is by definition that of the problem
$$\min_x \;\; \gamma_2 \langle c, x \rangle + F(x) \qquad \text{s.t.} \;\; x \in \Omega.$$
Continuing this process indefinitely, that is,
$$\gamma_{t+1} \coloneqq \beta\gamma_t, \qquad x_{t+1} \coloneqq x_t - [\nabla^2 F(x_t)]^{-1} (\gamma_{t+1} c + \nabla F(x_t)),$$
we have the short-step barrier method.
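To make the loop concrete, here is a minimal sketch of the method, assuming the log barrier on the positive orthant (so $\nu_F = n$ and the Hessian solve is a diagonal scaling) and an initial pair $(x_1, \gamma_1)$ already close to the central path in the sense made precise below; the names and the stopping rule are illustrative, not from the notes.

```python
import numpy as np

def short_step_barrier(c, x, gamma, eps):
    """Sketch of the short-step barrier method for min <c, x> over x > 0,
    using the log barrier F(x) = -sum(log(x_i)) with complexity parameter
    nu = n. Assumes ||n_gamma(x)||_x <= 1/9 at the initial (x, gamma)."""
    nu = len(c)                          # nu_F = n for the log barrier
    beta = 1 + 1 / (8 * np.sqrt(nu))     # safe increase factor (Theorem L20.4)
    while 6 * nu / (5 * gamma) > eps:    # optimality-gap bound (Theorem L20.3)
        gamma *= beta                    # step 1: increase gamma
        grad = gamma * c - 1 / x         # gradient of gamma*<c, x> + F(x)
        x = x - x**2 * grad              # step 2: one Newton step; [inv Hessian] = diag(x_i^2)
    return x
```

Each iteration costs a single Newton step, i.e., one Hessian solve; for general barriers this linear solve is the dominant per-iteration cost.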
L20.2.1 Update of the parameter $\gamma$
As we did in Lecture 19, we will denote the second-order direction of descent—that is, the Newton step—starting from a point $x$ using the letter $n$. However, since we are now dealing with a continuum of objective functions parameterized on $\gamma$, we will need to also specify which objective (that is, which value of $\gamma$) we are applying the Newton step to. For this reason, we will introduce the notation
$$n_\gamma(x) \coloneqq -[\nabla^2 F(x)]^{-1} (\gamma c + \nabla F(x)).$$
The main technical hurdle in analyzing the short-step barrier method is to quantify the proximity of the iterates to the central path. As is common with self-concordant functions, we will measure such proximity using the lengths of the Newton steps: $x_t$ is near $z(\gamma_t)$ in the sense that the intrinsic norm of the Newton step $n_{\gamma_t}(x_t)$ is small (this should feel natural recalling Theorem L19.6 in Lecture 19).
How close to the central path is close enough, so that the barrier method using a single Newton update per iteration is guaranteed to work?
As we move our attention from the objective $\gamma_t \langle c, x \rangle + F(x)$ to the objective $\gamma_{t+1} \langle c, x \rangle + F(x)$, we can expect the distance from $x_t$ to $z(\gamma_{t+1})$ to be larger than the distance from $x_t$ to $z(\gamma_t)$ by a certain amount. If this amount is not too large, then we can hope to use Theorem L19.8 in Lecture 19 to "recover" in a single Newton step the distance lost, and close the induction. The following theorem operationalizes the idea we just stated, and provides a concrete quantitative answer to what "close enough" means. In particular, we will show that $\|n_{\gamma_t}(x_t)\|_{x_t} \le \frac{1}{9}$ is enough.
Theorem L20.4. If $x_t$ is close to the central path, in the sense that $\|n_{\gamma_t}(x_t)\|_{x_t} \le \frac{1}{9}$, then by setting
$$\gamma_{t+1} \coloneqq \beta\gamma_t \qquad \text{with} \quad \beta \coloneqq 1 + \frac{1}{8\sqrt{\nu_F}},$$
the same proximity is guaranteed at time $t+1$, that is, $\|n_{\gamma_{t+1}}(x_{t+1})\|_{x_{t+1}} \le \frac{1}{9}$.
Proof. We need to go from a statement pertaining to $\|n_{\gamma_t}(x_t)\|_{x_t}$ to one pertaining to $\|n_{\gamma_{t+1}}(x_{t+1})\|_{x_{t+1}}$. We will do so in two steps.
▶ First part. Observe the equality (valid for all $\gamma_{t+1}$ and $\gamma_t$)
$$\begin{aligned}
n_{\gamma_{t+1}}(x_t) &= -[\nabla^2 F(x_t)]^{-1} (\gamma_{t+1} c + \nabla F(x_t)) \\
&= -\frac{\gamma_{t+1}}{\gamma_t} [\nabla^2 F(x_t)]^{-1} \left(\gamma_t c + \frac{\gamma_t}{\gamma_{t+1}} \nabla F(x_t)\right) \\
&= -\frac{\gamma_{t+1}}{\gamma_t} [\nabla^2 F(x_t)]^{-1} (\gamma_t c + \nabla F(x_t)) + \frac{\gamma_{t+1} - \gamma_t}{\gamma_t} [\nabla^2 F(x_t)]^{-1} \nabla F(x_t) \\
&= \frac{\gamma_{t+1}}{\gamma_t} n_{\gamma_t}(x_t) + \left(\frac{\gamma_{t+1}}{\gamma_t} - 1\right) [\nabla^2 F(x_t)]^{-1} \nabla F(x_t).
\end{aligned}$$
Using the triangle inequality for the norm $\|\cdot\|_{x_t}$ and plugging in the hypotheses of the statement, we get
$$\begin{aligned}
\|n_{\gamma_{t+1}}(x_t)\|_{x_t} &\le \frac{\gamma_{t+1}}{\gamma_t} \|n_{\gamma_t}(x_t)\|_{x_t} + \left|\frac{\gamma_{t+1}}{\gamma_t} - 1\right| \cdot \|[\nabla^2 F(x_t)]^{-1} \nabla F(x_t)\|_{x_t} \\
&\le \frac{\gamma_{t+1}}{\gamma_t} \|n_{\gamma_t}(x_t)\|_{x_t} + \left|\frac{\gamma_{t+1}}{\gamma_t} - 1\right| \cdot \sqrt{\nu_F} \\
&\le \frac{1}{9}\left(1 + \frac{1}{8\sqrt{\nu_F}}\right) + \frac{1}{8\sqrt{\nu_F}} \cdot \sqrt{\nu_F} \\
&\le \frac{1}{9}\left(1 + \frac{1}{8}\right) + \frac{1}{8} = \frac{1}{4} \qquad \text{(since } \nu_F \ge 1\text{)}.
\end{aligned}$$
However, the left-hand side of the inequality is $\|n_{\gamma_{t+1}}(x_t)\|_{x_t}$ and not $\|n_{\gamma_{t+1}}(x_{t+1})\|_{x_{t+1}}$. This is where the second step comes in.
▶ Second part. To complete the bound, we will convert from $\|n_{\gamma_{t+1}}(x_t)\|_{x_t}$ to $\|n_{\gamma_{t+1}}(x_{t+1})\|_{x_{t+1}}$. To do so, remember that $x_{t+1}$ is obtained from $x_t$ by taking a Newton step. Hence, using Theorem L19.8 of Lecture 19, we have
$$\|n_{\gamma_{t+1}}(x_{t+1})\|_{x_{t+1}} \le \left(\frac{\|n_{\gamma_{t+1}}(x_t)\|_{x_t}}{1 - \|n_{\gamma_{t+1}}(x_t)\|_{x_t}}\right)^2 \le \left(\frac{1/4}{1 - 1/4}\right)^2 = \frac{1}{9}.$$
This completes the proof. □
Remark L20.1. Remarkably, a safe increase in $\gamma$ depends only on the complexity parameter $\nu_F$ of the barrier, and not on any other property of the function. For example, for a linear program
$$\min_x \;\; \langle c, x \rangle \qquad \text{s.t.} \;\; Ax = b, \;\; x \in \mathbb{R}^n_{\ge 0},$$
using the polyhedral barrier function, the increase in $\gamma$ is independent of the number of constraints of the problem or the sparsity of $A$, and we can increase $\gamma_{t+1} = \gamma_t \cdot \left(1 + \frac{1}{8\sqrt{n}}\right)$.
The result in Theorem L20.4 shows that at every iteration, it is safe to increase $\gamma$ by a factor of $1 + \frac{1}{8\sqrt{\nu_F}} > 1$, which leads to an exponential growth in the weight given to the objective function of the problem.
Hence, combining the previous result with the optimality-gap bounds of Theorems L20.2 and L20.3, we find the following guarantee.
Theorem L20.5. Consider running the short-step barrier method with a barrier function $F$ with complexity parameter $\nu_F$, starting from a point $x_1$ close to $z(\gamma_1)$, i.e., $\|n_{\gamma_1}(x_1)\|_{x_1} \le 1/9$, for some $\gamma_1 > 0$. For any $\epsilon > 0$, after
$$T = \left\lceil 10\sqrt{\nu_F} \log\left(\frac{6\nu_F}{5\epsilon\gamma_1}\right) \right\rceil$$
iterations, the solution computed by the short-step barrier method guarantees an $\epsilon$-suboptimal objective value $\langle c, x_T \rangle \le \left(\min_{x \in \Omega} \langle c, x \rangle\right) + \epsilon$.
Proof. Since at every iteration the value of $\gamma$ is increased by the factor $1 + \frac{1}{8\sqrt{\nu_F}}$, the number of iterations required to increase the value from $\gamma_1$ to any value $\gamma$ is given by
$$T = \left\lceil \frac{\log(\gamma/\gamma_1)}{\log\left(1 + \frac{1}{8\sqrt{\nu_F}}\right)} \right\rceil \le \left\lceil \log\left(\frac{\gamma}{\gamma_1}\right) \cdot \frac{5}{4} \cdot 8\sqrt{\nu_F} \right\rceil = \left\lceil 10\sqrt{\nu_F} \log\left(\frac{\gamma}{\gamma_1}\right) \right\rceil,$$
where the inequality uses $\frac{1}{\log(1+x)} \le \frac{5}{4x}$ for all $0 < x \le \frac{1}{2}$.
On the other hand, we know from Theorem L20.3 that the optimality gap at $x_T$ is at most $6\nu_F/(5\gamma)$ as long as $\|x_T - z(\gamma_T)\|_{x_T} \le \frac{1}{6}$. This is indeed the case from Remark L19.3 of Lecture 19. So, to reach an optimality gap of $\epsilon$, it suffices to take $\gamma = 6\nu_F/(5\epsilon)$. Substituting this value into the previous bound yields the statement. □
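To get a feel for the bound (illustrative numbers, not from the notes): for the log barrier on the positive orthant with $n = 100$ (so $\nu_F = 100$), initial parameter $\gamma_1 = 1$, and target accuracy $\epsilon = 10^{-6}$,
$$T = \left\lceil 10\sqrt{100} \log\left(\frac{6 \cdot 100}{5 \cdot 10^{-6}}\right) \right\rceil = \left\lceil 100 \log(1.2 \times 10^8) \right\rceil = 1861,$$
that is, fewer than two thousand Newton steps suffice for six digits of accuracy.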
L20.2.2 Finding a good initial point
The result in Theorem L20.5 shows that, as long as we know a point $x_1$ that is "close" (in the formal sense of Theorem L20.4) to the central path for a parameter $\gamma_1$ that is not too small, we can guarantee an $\epsilon$-suboptimal solution in roughly $\sqrt{\nu_F} \log(1/\epsilon)$ iterations.
▶ The analytic center. Intuitively, one might guess that a good initial point for the algorithm would be a point close to $e \coloneqq z(0)$ (the minimizer of $F$ on $\Omega$), which is often called the analytic center of $\Omega$. Let's verify that this is indeed the case. By definition, such a point satisfies $\nabla F(e) = 0$, and so we have that
$$n_\gamma(e) = -\gamma [\nabla^2 F(e)]^{-1} c \quad\implies\quad \|n_\gamma(e)\|_e = \gamma \cdot \|[\nabla^2 F(e)]^{-1} c\|_e.$$
Hence, $x_1 = e$ is within proximity $1/9$ (in the sense of Theorem L20.4) of the central path for the value of
$$\gamma_1 = \frac{1}{9 \|[\nabla^2 F(e)]^{-1} c\|_e}.$$
The only thing left to check is that $\gamma_1$ is not excessively small, so that the number of iterations predicted in Theorem L20.5 is not too large. We now show that indeed we can upper bound $\|[\nabla^2 F(e)]^{-1} c\|_e$.
Theorem L20.6. Let $e$ be the minimizer of the barrier $F$ on $\Omega$. Then,
$$\|[\nabla^2 F(e)]^{-1} c\|_e \le \langle c, e \rangle - \min_{x \in \Omega} \langle c, x \rangle.$$
(So, in particular, $\|[\nabla^2 F(e)]^{-1} c\|_e \le \max_{x \in \Omega} \langle c, x \rangle - \min_{x \in \Omega} \langle c, x \rangle$.)
Proof. The direction $-[\nabla^2 F(e)]^{-1} c$ is a descent direction for $c$, since
$$\langle c, -[\nabla^2 F(e)]^{-1} c \rangle = -\|[\nabla^2 F(e)]^{-1} c\|_e^2 \le 0.$$
Hence, as we consider points $x(\eta) \coloneqq e - \eta \cdot [\nabla^2 F(e)]^{-1} c$ for $\eta \ge 0$ such that $x(\eta) \in \Omega$, we have that the value of the objective $\langle c, x(\eta) \rangle$ decreases monotonically, and in particular
$$\langle c, x(\eta) \rangle = \langle c, e \rangle - \eta \cdot \|[\nabla^2 F(e)]^{-1} c\|_e^2,$$
which implies that
$$\|[\nabla^2 F(e)]^{-1} c\|_e^2 = \frac{\langle c, e \rangle - \langle c, x(\eta) \rangle}{\eta} \le \frac{\langle c, e \rangle - \min_{x \in \Omega} \langle c, x \rangle}{\eta}.$$
To complete the proof, it suffices to show that we can move in the direction of $-[\nabla^2 F(e)]^{-1} c$ for a meaningful amount $\eta$. For this, we will use the property of self-concordant functions that the Dikin ellipsoid $\mathcal{E}(e) \coloneqq \{x : \|x - e\|_e < 1\}$ is contained in $\Omega$. In particular, this implies that any $\eta \ge 0$ such that
$$1 > \|e - x(\eta)\|_e = \eta \|[\nabla^2 F(e)]^{-1} c\|_e$$
generates a point $x(\eta) \in \Omega$. So, we must have
$$\|[\nabla^2 F(e)]^{-1} c\|_e^2 \le \inf\left\{ \frac{\langle c, e \rangle - \min_{x \in \Omega} \langle c, x \rangle}{\eta} : 0 < \eta < \frac{1}{\|[\nabla^2 F(e)]^{-1} c\|_e} \right\} = \left(\langle c, e \rangle - \min_{x \in \Omega} \langle c, x \rangle\right) \|[\nabla^2 F(e)]^{-1} c\|_e,$$
which implies the statement. □
So, plugging the resulting bound $\gamma_1 \ge \frac{1}{9(\langle c, e \rangle - \min_{x \in \Omega} \langle c, x \rangle)}$ into Theorem L20.5, and using $\frac{6 \cdot 9}{5} = 10.8 \le 11$, we have shown the following.
Theorem L20.7 (The analytic center $e$ is a good initial point). Let $F$ be a barrier function with complexity parameter $\nu_F$. If the short-step barrier method is initialized at the analytic center $e$, then the number of iterations required to obtain an $\epsilon$-suboptimal solution is bounded by
$$T = \left\lceil 10\sqrt{\nu_F} \log\left(\frac{11 \nu_F}{\epsilon} \left(\langle c, e \rangle - \min_{x \in \Omega} \langle c, x \rangle\right)\right) \right\rceil.$$
▶ Path switching and the auxiliary central path. In practice, we might not know where the analytic center is. In this case, the typical solution is to first approximate the analytic center, and then start the short-step barrier method from there as usual.
To approximate the analytic center, one can use the auxiliary central path. The idea is the following: start from an arbitrary point $x' \in \Omega$. Such a point is on the central path traced by the solutions to
$$z'(\tau) \coloneqq \arg\min_x \;\; -\tau \langle \nabla F(x'), x \rangle + F(x) \qquad \text{s.t.} \;\; x \in \Omega.$$
Indeed, note that $x'$ is the solution for $\tau = 1$, that is, $x' = z'(1)$.
We can then run the short-step barrier method chasing $z'$ in reverse. At every step, we will decrease the value of $\tau$ by a factor of $1 - \frac{1}{8\sqrt{\nu_F}}$. Once the value of $\tau$ is sufficiently small that $\|[\nabla^2 F(x)]^{-1} \nabla F(x)\|_x \le 1/6$, we will have reached a point that is close to the analytic center, and we can start the regular short-step barrier method for $z(\gamma)$ from there. This technique is called path switching, since we follow two central paths (one from $x'$ to the analytic center, and one from the analytic center to the solution), switching around the analytic center which path to follow. [▷ Try to work out the details and convince yourself this works! A sketch of the reverse phase is given below.]
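As a concrete sketch of this reverse phase (illustrative code, assuming the box $\Omega = (0,1)^n$ with barrier $F(x) = -\sum_i (\log x_i + \log(1 - x_i))$, whose complexity parameter is $2n$, chosen purely as an example):

```python
import numpy as np

def approx_analytic_center(x0):
    """Sketch: follow the auxiliary central path z'(tau) in reverse to
    approximate the analytic center of the box (0,1)^n, using the barrier
    F(x) = -sum(log(x_i) + log(1 - x_i)) with complexity parameter 2n."""
    nu = 2 * len(x0)
    grad_F = lambda x: -1 / x + 1 / (1 - x)
    hess_F = lambda x: 1 / x**2 + 1 / (1 - x)**2    # diagonal of the Hessian of F
    x, tau, g0 = x0.copy(), 1.0, grad_F(x0)         # x0 = z'(1) by construction
    while True:
        newton = -grad_F(x) / hess_F(x)             # Newton step of F alone
        if np.sqrt(newton @ (hess_F(x) * newton)) <= 1 / 6:
            return x                                # close to the analytic center: switch paths
        tau *= 1 - 1 / (8 * np.sqrt(nu))            # shrink tau, then one Newton step
        x = x + (tau * g0 - grad_F(x)) / hess_F(x)  # Newton step on -tau*<g0, x> + F(x)
```

The returned point can then be used as the initial point $x_1$ for the main short-step method.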
L20.3 Further readings
The short book by Renegar [Ren01] and the monograph by Nesterov [Nes18] (Chapter 5) provide a comprehensive introduction to self-concordant functions and their applications in optimization. I especially recommend the book by Renegar [Ren01] for a concise yet rigorous account.
[NN94] Nesterov, Y., & Nemirovskii, A. (1994). Interior-Point Polynomial Algorithms in Convex Programming. SIAM.
[Ren01] Renegar, J. (2001). A Mathematical View of Interior-Point Methods in Convex Optimization. SIAM. https://doi.org/10.1137/1.9780898718812
[Nes18] Nesterov, Y. (2018). Lectures on Convex Optimization. Springer International Publishing. https://link.springer.com/book/10.1007/978-3-319-91578-4
Changelog
• Fixed a few typos (thanks Jonathan Huang!)
• Fixed a few typos (thanks Khizer!)
† These notes are class material that has not undergone formal peer review. The TAs and I are grateful
for any reports of typos.
¹Remember that lower-bounded self-concordant functions always have a unique minimizer, as seen in
Theorem L19.7 of Lecture 19.