􀀲􀊠􀀲􀀲􀀲􀀲􀄽􀀲􀀲􀀲􀀲􀊖􀊖􀊖􀀲􀀲􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀀲􀊖􀊖􀊗􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖
MIT 6.7220/15.084 — Nonlinear Optimization Tue, Feb 27th 2024
Lecture 5
Lagrange multipliers and KKT conditions
Instructor: Prof. Gabriele Farina ( gfarina@mit.edu)
These notes are class material that has not undergone formal peer review. The TAs and I are grateful for any
reports of typos.
With separation in our toolbox, in this lecture we revisit normal cones, and extend our machinery
to feasible sets defined by functional constraints.
1 A second look at the normal cone of linear constraints
In Lecture 2, we considered normal cones for a few classes of feasible
sets that come up often: hyperplanes, affine subspaces, halfspaces, and
intersection of halfspaces. In that latter case, we drew a picture and
were convinced that the normal cone at a point at the intersection of
halfspaces was given by the conic hull of the directions orthogonal to
those halfspaces (see the picture on the right). 𝑥
𝑎1
𝑎2
𝒩Ω(𝑥)
Ω
We now give the proof of this result. We will do that by invoking the
machinery of separation seen in Lecture 4 to argue that a direction
outside of the normal cone must form an acute angle with at least one
direction that remains in the feasible set Ω.
Theorem 1.1. Let Ω ⊆ 𝑛 be defined as the intersection of 𝑚 linear inequalities
Ω ≔ {𝑥 ∈ ℝ𝑛 : 𝐴𝑥 ≤ 𝑏}, where 𝐴 =

⎛— 𝑎
1

— 𝑎
𝑚 —⎠
⎞ ∈ ℝ𝑚×𝑛, 𝑏 ∈ ℝ𝑚.
Given a point 𝑥 ∈ Ω, define the index set of the “binding” (also known as “active”) constraints
𝐼(𝑥) ≔ {𝑗 ∈ {1, ..., 𝑚} : 𝑎
𝑗 𝑥 = 𝑏𝑗}.
Then, the normal cone at any 𝑥 ∈ Ω is given by
𝒩Ω(𝑥) =
{
{

𝑗∈𝐼(𝑥)
𝜆𝑗𝑎𝑗 : 𝜆𝑗 ≥ 0
}
}
= {𝐴⊤𝜆 : 𝜆⊤(𝑏 − 𝐴𝑥) = 0, 𝜆 ∈ ℝ𝑚
≥0},
where the second equality rewrites the condition 𝑗 ∈ 𝐼(𝑥) via the complementary slackness con-
dition (see Lecture 2).
Remark 1.1. As a reminder, the coefficients 𝜆 in the normal cone above are typically called
“Lagrange multipliers”.
Remark 1.2. The result above immediately implies the correctness of the normal cone to an
affine subspace too. This is because we can rewrite the condition 𝐴𝑥 = 𝑏 as the intersection of
𝐴𝑥 ≤ 𝑏 and −𝐴𝑥 ≤ −𝑏:
Ω ≔ {𝑥 ∈ ℝ𝑛 : 𝐴𝑥 = 𝑏} = {𝑥 ∈ ℝ𝑛 : ( 𝐴
−𝐴)𝑥 ≤ ( 𝑏
−𝑏)}.
Applying the characterization in Theorem 1.1 to any 𝑥 ∈ Ω using the right-hand side formulation
yields that
𝒩Ω(𝑥) =
{
{
𝐴⊤𝜆1 − 𝐴⊤𝜆2 : (
𝜆1
𝜆2
)

( 𝑏 − 𝐴𝑥
−𝑏 + 𝐴𝑥) = 0, (𝜆1
𝜆2
) ∈ ℝ2𝑚
≥0
}
}
.
Since by hypothesis 𝑥 ∈ Ω, then 𝐴𝑥 = 𝑏 and so the complementary slackness condition is vacu-
ous, and we are left with
𝒩Ω(𝑥) = {𝐴⊤(𝜆1 − 𝜆2) : (
𝜆1
𝜆2
) ∈ ℝ2𝑚
≥0 }.
It is clear that for any 𝜆1, 𝜆2 ∈ 𝑚
≥0, their difference is 𝜆1 𝜆2 𝑚. Similarly, for any value of
𝜆 ∈ ℝ𝑚, we can find 𝜆1, 𝜆2 ∈ 𝑚
≥0 such that 𝜆 = 𝜆1 𝜆2. Thus,
𝒩Ω(𝑥) = {𝐴⊤𝜆 : 𝜆 ∈ ℝ𝑚}
as we argued (albeit using a different technique) in Lecture 2.
1.1 Separating a point from a convex cone
Before we can prove Theorem 1.1, we will find it helpful to use the following corollary of separation
for convex cones. A cone is a set with the property that the ray {𝜆 ⋅ 𝑥 : 𝜆 ≥ 0} generated by any
point 𝑥 in the set is fully contained in the set.
Definition 1.1 (Cone). A set 𝑆 is a cone if, for any 𝑥 ∈ 𝑆 and 𝜆 ∈ ≥0, the point 𝜆 ⋅ 𝑥 ∈ 𝑆.
Convex cones are among the simplest convex sets, and they appear all the time in optimization
theory.¹ In particular, in the next theorem we show that separation of a point from a nonempty
closed convex cone can always be achieved using a hyperplane passing through the origin.
¹Hiriart-Urruty, J.-B., & Lemaréchal, C. [HL01], speaking of convex cones, say: “they are important in convex
analysis (the “unilateral” realm of inequalities), just as subspaces are important in linear analysis (the “bilateral”
realm of equalities)”.
Theorem 1.2. Let 𝑆 ⊆ 𝑛 be a nonempty closed convex cone, and 𝑦 ∉ 𝑆 be a point in 𝑛. Then,
there exists a hyperplane passing through the origin that separates 𝑦 from 𝑆; formally, there
exists 𝑢 ∈ 𝑛 such that
⟨𝑢, 𝑦⟩ < 0 and ⟨𝑢, 𝑥⟩ ≥ 0 ∀𝑥 ∈ 𝑆.
Proof . We already know from Lecture 4 that there exist 𝑢 ∈ 𝑛, 𝑣 ∈ such that
⟨𝑢, 𝑦⟩ < 𝑣 and ⟨𝑢, 𝑥⟩ ≥ 𝑣 ∀𝑥 ∈ 𝑆. (1)
Consider any point 𝑎 ∈ 𝑆. By definition of cone, 𝜆 ⋅ 𝑎 ∈ 𝑆 for all 𝜆 ≥ 0. Thus, the separation
condition on the right in (1) implies that 𝑣 ≤ 𝜆 ⋅ ⟨𝑢, 𝑎⟩ for all 𝜆 ≥ 0. In particular, by plugging
𝜆 = 0, we find that 𝑣 ≤ 0, yielding ⟨𝑢, 𝑦⟩ < 0. Furthermore, dividing by 𝜆 we find that
⟨𝑢, 𝑎⟩ ≥ 𝑣
𝜆 ∀𝜆 ≥ 0 ⟨𝑢, 𝑎⟩ ≥ sup
𝜆→∞
𝑣
𝜆 = 0.
Since 𝑎 ∈ 𝑆 was arbitrary, the statement follows.
1.2 The proof of Theorem 1.1
Proof (of Theorem 1.1) . Fix any 𝑥 ∈ Ω and let
𝒞(𝑥) ≔
{
{

𝑗∈𝐼(𝑥)
𝜆𝑗𝑎𝑗 : 𝜆𝑗 ≥ 0
}
}
.
We will show that 𝒩Ω(𝑥) = 𝒞(𝑥) by proving the two directions of inclusion separately.
• We start by showing that any 𝑑 ∈ 𝒞(𝑥) belongs to 𝒩Ω(𝑥), that is,
⟨𝑑, 𝑦 − 𝑥⟩ ≤ 0 for all 𝑦 ∈ Ω.
Let 𝑑 be expressed as 𝑗∈𝐼(𝑥) 𝜆𝑗𝑎𝑗 with 𝜆𝑗 0. Then, for any 𝑦 ∈ Ω,
⟨ ∑
𝑗∈𝐼(𝑥)
𝜆𝑗𝑎𝑗, 𝑦 − 𝑥⟩ =
𝑗∈𝐼(𝑥)
𝜆𝑗⟨𝑎𝑗, 𝑦 − 𝑥⟩
=
𝑗∈𝐼(𝑥)
𝜆𝑗(⟨𝑎𝑗, 𝑦⟩ − 𝑏𝑗) (by definition of 𝐼(𝑥), ⟨𝑎𝑗, 𝑥⟩ = 𝑏𝑗)

𝑗∈𝐼(𝑥)
𝜆𝑗(𝑏𝑗 − 𝑏𝑗) = 0. (since 𝑦 ∈ Ω and 𝜆𝑗 ≥ 0)
This shows that 𝑑 ∈ 𝒩Ω(𝑥) and concludes the proof of this direction of the inclusion.
• We now look at the other direction. Take any 𝑑 ∉ 𝒞(𝑥). Since 𝒞 is a nonempty closed convex
cone [▷ you should verify this claim], by the conic separation result of Theorem 1.2, there
must exist 𝑢 ∈ 𝑛 such that
⟨𝑢, 𝑑⟩ < 0, and ⟨𝑢, 𝑎⟩ ≥ 0 ∀𝑎 ∈ 𝒞(𝑥). (2)
We argue that for 𝛿 > 0 small enough, the point 𝑦 ≔ 𝑥 − 𝛿 ⋅ 𝑢 belongs to Ω. We do so by
showing that it satisfies all the inequalities 𝑎
𝑗 𝑥 ≤ 𝑏𝑗 that define Ω:
if 𝑗 ∈ 𝐼(𝑥), then ⟨𝑎𝑗, 𝑥 − 𝛿 ⋅ 𝑢⟩ = 𝑏𝑗 𝛿 ⋅ ⟨𝑎𝑗, 𝑢⟩ ≤ 𝑏𝑗 since ⟨𝑎𝑗, 𝑢⟩ ≥ 0 by (2).
if 𝑗 ∉ 𝐼(𝑥), then 𝑏𝑗 ⟨𝑎𝑗, 𝑥⟩ > 0. By continuity, small enough perturbations of 𝑥, in
any direction, will not affect the strict inequality.
Thus, the direction 𝛿 ⋅ 𝑢 remains inside of Ω starting from 𝑥. We now argue that it forms a
strictly positive inner product with 𝑑. Indeed, note that from (2)
⟨𝑑, 𝑦 − 𝑥⟩ = ⟨𝑑, −𝛿 ⋅ 𝑢⟩ = −𝛿 ⋅ ⟨𝑑, 𝑢⟩ > 0.
This shows that 𝑑 ∉ 𝒞(𝑥) ⟹ 𝑑 ∉ 𝒩Ω(𝑥), completing the proof.
2 Karush-Kuhn-Tucker (KKT) conditions
The result of Theorem 1.1 gives a complete characterization of the normal cone for sets defined as
intersections of linear constraints. We now turn our attention to more general constraint sets, defined
as the intersection of differentiable² functional constraints
²While in previous classes we could have gotten away with Gâteaux differentiability (i.e., linearity of directional
derivatives), in this lecture we assume Fréchet differentiability, that is, the fact that the functions can be locally
approximated by their linearization at any point.
min
𝑥
s.t.
𝑓(𝑥)
𝑖(𝑥) = 0 𝑖 ∈ {1, ..., 𝑟}
𝑔𝑗(𝑥) ≤ 0 𝑗 ∈ {1, ..., 𝑠}.
(3)
2.1 The general idea
Consider any point 𝑥 on the boundary
of the feasible set Ω depicted on the side,
which is the intersection of three inequal-
ity constraints 𝑔𝑖(𝑥) ≤ 0 for 𝑖 ∈ {1, 2, 3}
(in particular, in the figure the 𝑔𝑖(𝑥) are
quadratic constraints). The main insight is
the following:
The set of directions that form an obtuse
angle with all directions that from 𝑥
remain inside of the set coincides with
the normal cone of the linearization of
the constraints that are binding (that is,
holding with equality) at 𝑥.
0 0.5 1 1.5 2 2.5 3 3.5 4 4.5
−2
−1.5
−1
−0.5
0.5
1
1.5
Ω
𝑔1(𝑥) = 0
𝑔2(𝑥) = 0
𝑔3(𝑥) = 0
𝑥1
𝑥2
𝒩Ω(𝑥1)
𝒩Ω(𝑥2)
The above observation suggests that for a nonlinear optimization problem with functional con-
straints, −∇𝑓(𝑥) should belong to the normal cone to the linearization of the binding constraints at
𝑥. This condition goes under the name of Karush-Kuhn-Tucker (KKT) optimality condition
³The KKT conditions used to be called “Kuhn-Tucker conditions” and were first published by Kuhn, H. W., &
Tucker, A. W. [KT51]. It was later discovered that the same conditions had appeared more than 10 years earlier in
the unpublished master’s thesis of Karush, W. [Kar39]. An analysis of the history of the KKT conditions is given by
Kjeldsen, T. H. [Kje00].
Since the binding linearized constraints are of the form
𝑖(𝑥) + ⟨∇ℎ𝑖(𝑥), 𝑥 − 𝑥⟩ = 0 ⟨∇ℎ𝑖(𝑥), 𝑥⟩ = ⟨∇ℎ𝑖(𝑥), 𝑥⟩ − ℎ𝑖(𝑥)
𝑔𝑖(𝑥) + ⟨∇𝑔𝑖(𝑥), 𝑥 − 𝑥⟩ ≤ 0 ⟨∇𝑔𝑖(𝑥), 𝑥⟩ ≤ ⟨∇𝑔𝑖(𝑥), 𝑥⟩ − 𝑔𝑖(𝑥),
from Theorem 1.1 we know that the normal cone of the linearization ̃ Ω of Ω around 𝑥∗ is
𝒩̃Ω(𝑥) =
{
{

𝑟
𝑖=1
𝜆𝑖∇ℎ𝑖(𝑥) +
𝑗∈𝐼(𝑥)
𝜇𝑗∇𝑔𝑗(𝑥) : 𝜆𝑖 ∈ ℝ, 𝜇𝑗 ∈ ℝ≥0
}
}
,
where the set of binding inequality constraints 𝐼(𝑥) is
𝐼(𝑥) ≔ {𝑗 ∈ {1, ..., 𝑠} : 𝑔𝑗(𝑥) = 0}.
(All equality constraints are always binding, and so we can directly sum over all 𝑖 = 1, ..., 𝑟.)
By using the complementary slackness reformulation of 𝐼(𝑥), the first-order optimality conditions
induced by the linearization of the feasible set are typically rewritten as follows.
Definition 2.1 (KKT conditions). Consider a nonlinear optimization problem with differentiable
objective function and functional constraints, in the form given in (3), and let 𝑥 be a point in
the feasible set (“Primal Feasibility”). The KKT conditions at 𝑥 are given by
−∇𝑓(𝑥) = ∑
𝑟
𝑖=1
𝜆𝑖∇ℎ𝑖(𝑥) + ∑
𝑠
𝑗=1
𝜇𝑗∇𝑔𝑗(𝑥) (“Stationarity”)
𝜆𝑖 ∈ ℝ, 𝜇𝑗 ≥ 0 ∀𝑖 = 1, ..., 𝑟, 𝑗 = 1, ..., 𝑠 (“Dual feasibility”)
𝜇𝑗 ⋅ 𝑔𝑗(𝑥) = 0 ∀𝑗 = 1, ..., 𝑠. (“Complementary slackness”)
In the definition above, we have noted in quotes the typical names for each of the different conditions.
However, please do not get distracted by these names:
• What the KKT conditions are really saying is that −∇𝑓(𝑥) must be in the normal cone to the
linearization of the constraint set.
• The complementary slackness condition is just a fancier way of writing “if 𝑗 ∉ 𝐼(𝑥), then 𝜇𝑗 =
0”.
The KKT conditions are often necessary conditions for optimality (for example, in the picture above),
but not always.
2.2 Failure of the KKT conditions
It is important to keep in mind that in some cases KKT conditions might fail to hold at optimality.
This typically happens when the linearization of the constraints collapses. To fix ideas, consider the
following simple example.
Example 2.1 (Failure of KKT). Consider the problem
𝑥
𝑦
0 0.2 0.4 0.6 0.8 1
0.25
0.5
0.75
1
Ω
min
(𝑥,𝑦)
s.t.
−𝑥
𝑦 − (1 − 𝑥)3 ≤ 0
𝑥 ≥ 0
𝑦 ≥ 0.
Let’s denote the objective and functional constraints as
𝑓(𝑥, 𝑦) ≔ −𝑥,
𝑔1(𝑥, 𝑦) ≔ 𝑦 − (1 − 𝑥)3 ≤ 0,
𝑔2(𝑥, 𝑦) ≔ −𝑥 ≤ 0,
𝑔3(𝑥, 𝑦) ≔ −𝑦 ≤ 0.
The feasible set Ω for this problem is shown on the right. At the optimal point (𝑥∗, 𝑦∗) ≔ (1, 0),
the gradients of the objective and the binding constraints (𝑔1 and 𝑔3) are
∇𝑓(𝑥∗, 𝑦∗) = (−1
0 ); ∇𝑔1(𝑥∗, 𝑦∗) = (0
1); ∇𝑔3(𝑥∗, 𝑦∗) = ( 0
−1).
It is then clear that there exist no Lagrange multipliers 𝜆1, 𝜆3 such that
−∇𝑓(𝑥∗, 𝑦∗) = 𝜆1∇𝑔1(𝑥∗, 𝑦∗) + 𝜆3∇𝑔3(𝑥∗, 𝑦∗).
The KKT conditions fail in this case.
The reason why the KKT conditions failed at the optimal point (𝑥, 𝑦) = (1, 0) in Example 2.1 is
due to the fact that the linearization of constraint 𝑔1(𝑥, 𝑦) ≤ 0 around the optimal point (𝑥, 𝑦) =
(1, 0) is 𝑦 ≤ 0. This is parallel to the existing constraint 𝑦 ≥ 0, and fails the capture the fact that
𝑥 ≤ 1 on the feasible set Ω.
2.3 Constraint qualification
Conditions that prevent the degenerate behavior illustrated in Example 2.1 above go under the name
of constraint qualification. Several constraint qualification conditions are known in the literature.
Concave and affine constraints. We already know that when the feasible set Ω is defined via linear
constraints (that is, all 𝑖 and 𝑔𝑗 in (3) are affine functions), then no further constraint qualifications
hold, and the necessity of the KKT conditions is implied directly by Theorem 1.1.
With only very little work, we can show that the same remains true if the 𝑔𝑗 are allowed to be
concave functions (that is, the −𝑔𝑗 are convex functions).
Theorem 2.1 (Concave and linear constraints). Let 𝑥 ∈ Ω ⊆ 𝑛 be a minimizer of (3). If
• the binding inequality constraints {𝑔𝑗}𝑗∈𝐼(𝑥) are concave differentiable functions in a con-
vex neighborhood of 𝑥; and
• the equality constraints {ℎ𝑖}𝑟
𝑖=1 are affine functions on 𝑛,
then the KKT conditions hold at 𝑥.
Linear independence of gradients. In Example 2.1, the linearization of two constraints coincided,
causing problems. When all linearized constraints are linearly independent, the issue is avoided.
Theorem 2.2 (Linear independence of gradients). Let 𝑥 ∈ Ω ⊆ 𝑛 be a minimizer of (3). If all
functions 𝑖 are continuously differentiable and the set of gradients
{∇ℎ𝑖(𝑥) : 𝑖 = 1, ..., 𝑟} ∪ {∇𝑔𝑗(𝑥) : 𝑗 ∈ 𝐼 (𝑥)}
is linearly independent, then the KKT conditions hold at 𝑥.
For those interested, the conditions above is a special case of a much more general condition called
Mangasarian-Fromowitz constraint qualification [MF67].
Slater’s condition. Finally, we consider a popular constraint qualification condition for problems
with convex inequality constraints and affine equality constraints.
Theorem 2.3 (Slater’s condition [Sla59]). Let 𝑥 ∈ Ω ⊆ 𝑛 be a minimizer of (3). If
• the binding inequality constraints {𝑔𝑗}𝑗∈𝐼(𝑥) are convex differentiable functions; and
• the equality constraints {ℎ𝑖}𝑟
𝑖=1 are affine functions; and
• there exists a feasible point 𝑥0 that is strictly feasible for the binding inequality constraints,
that is,
𝑔𝑗(𝑥0) < 0 ∀𝑗 ∈ 𝐼(𝑥)
then the KKT conditions hold at 𝑥.
3 Further readings and bibliography
The following books all contain an excellent and approachable treatment of the KKT condititions:
[Gül10] Güler, O. (2010). Foundations of Optimization. Springer. https://link.springer.com/book/
10.1007/978-0-387-68407-9
[Jah07] Jahn, J. (2007). Introduction to the Theory of Nonlinear Optimization. Springer. https://
link.springer.com/book/10.1007/978-3-540-49379-2
[Ber16] Bertsekas, D. P. (2016). Nonlinear Programming (3rd edition). Athena Scientific.
For a more advanced treatment with connections to metric regularity and nonsmooth analysis, I
recommend the following book by Borwein and Lewis.
[BL06] Borwein, J., & Lewis, A. (2006). Convex Analysis and Nonlinear Optimization. Springer.
https://link.springer.com/book/10.1007/978-0-387-31256-9
Bibliography
[HL01] J.-B. Hiriart-Urruty and C. Lemaréchal, Fundamentals of Convex Analysis. Springer, 2001.
[Online]. Available: https://link.springer.com/book/10.1007/978-3-642-56468-0
[KT51] H. W. Kuhn and A. W. Tucker, “Nonlinear Programming,” Proceedings of the Second
Berkeley Symposium on Mathematical Statistics and Probability, vol. 2. University of Cali-
fornia Press, Ewing, NJ, USA, pp. 481–493, Jan. 1951.
[Kar39] W. Karush, “Minima of functions of several variables with inequalities as side conditions.,”
1939.
[Kje00] T. H. Kjeldsen, “A Contextualized Historical Analysis of the Kuhn–Tucker Theorem in
Nonlinear Programming: The Impact of World War II,” Historia Math., vol. 27, no. 4, pp.
331–361, Nov. 2000, doi: 10.1006/hmat.2000.2289.
[MF67] O. L. Mangasarian and S. Fromovitz, “The Fritz John necessary optimality conditions in
the presence of equality and inequality constraints,” J. Math. Anal. Appl., vol. 17, no. 1,
pp. 37–47, Jan. 1967, doi: 10.1016/0022-247X(67)90163-1.
[Sla59] M. Slater, “Lagrange Multipliers Revisited,” 1959.
[Gül10] O. Güler, Foundations of Optimization. New York, NY, USA: Springer, 2010. [Online].
Available: https://link.springer.com/book/10.1007/978-0-387-68407-9
[Jah07] J. Jahn, Introduction to the Theory of Nonlinear Optimization. Berlin, Germany: Springer,
2007. [Online]. Available: https://link.springer.com/book/10.1007/978-3-540-49379-2
[Ber16] D. P. Bertsekas, Nonlinear Programming, 3rd edition. Nashua, NH, USA: Athena Scientific,
2016.
[BL06] J. Borwein and A. Lewis, Convex Analysis and Nonlinear Optimization. New York, NY,
USA: Springer, 2006. [Online]. Available: https://link.springer.com/book/10.1007/978-
0-387-31256-9

Content

PDF File

Typo or question?

Get in touch!
gfarina AT mit.edu

Metadata

Course: MIT 6.7220 / 15.084
Term: Spring 2024
Date: 2024-02-27