In mathematical optimization problems, Lagrange multipliers are a method for dealing with constraints. Suppose the problem is to find the local extrema of a function of several variables, subject to one or more constraints given by setting further functions of the variables to given values. The method introduces a new unknown scalar variable, the Lagrange multiplier, for each constraint, and forms a linear combination of the objective and constraint functions with the multipliers as coefficients. This reduces the constrained problem to an unconstrained one, which may then be solved, for example, by the usual gradient method.
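Concretely, for a single constraint g(x) = c, this combination (written here under one common sign convention; the notation is illustrative, not from the original text) is

    Λ(x, λ) = f(x) - λ(g(x) - c),

and the method looks for points at which all partial derivatives of Λ vanish; note that the derivative with respect to λ vanishing recovers the constraint itself.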

To explain why this has a chance of working, consider a two-dimensional case. Suppose we have a function f(x,y) to maximise, subject to g(x,y) = c, where c is a given constant. We can visualise 'contours' of f given by f(x,y) = d for various values of d. The constraint is to stay on one particular contour, given by fixing the value of g at c. Suppose we walk along that contour, taking note of the various contours f = d we cross. Wherever a contour of f crosses our path transversally, we can increase the value of f by walking 'uphill': that is, by following the direction leading to higher values of f. Only where the contour f = d we are on actually touches (is tangent to) the contour g = c we are confined to is this impossible. At a constrained maximum of f, that must be the case.

A familiar example can be obtained from weather maps, with their contours for temperature and pressure: the constrained extrema will occur where the superposed maps show touching lines (isopleths).

Geometrically, the tangency condition translates to saying that the gradients of f and g are parallel vectors at the maximum. Introducing an unknown scalar λ, the gradient of f - λg is then 0 for some value of λ. This, in geometrical form, is the Lagrange multiplier argument: f - λg must be stationary at a local extremum, where the multiplier λ is a new variable.
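As a small worked illustration (an example chosen here, not one from the original text): take f(x, y) = x + y subject to g(x, y) = x² + y² = 1. The gradients are ∇f = (1, 1) and ∇g = (2x, 2y); requiring them to be parallel forces x = y, and the constraint then gives x = y = ±1/√2, so the constrained maximum is at (1/√2, 1/√2) and the constrained minimum at (-1/√2, -1/√2).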


The method of Lagrange multipliers

Let f(x1, x2, …, xn) be a function defined on an n-dimensional open set, with continuous partial derivatives on that set. A constraint is defined by the hypersurface of points satisfying g(x1, x2, …, xn) = C within the open set, where C is a constant. Additionally, we require that ∇g ≠ 0 everywhere on the hypersurface (where ∇ denotes the gradient).

Given that there is a local maximum (or minimum) of f on the set {x ∈ Rn | g(x) = C}, by setting the derivative of f along the hypersurface to 0, one can show that there exists a real number λ (the Lagrange multiplier) such that

    ∇f(x) = λ ∇g(x).

Written differently,

    ∇(f - λg)(x) = 0

for each x that is such a local extremum. This is therefore a necessary condition: it holds at all solutions of the local extremum problem, though points satisfying it need not all be extrema.

In general one can apply the reasoning to several constraints, by introducing multipliers λ, μ, ... for each.
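For two constraints g(x) = C1 and h(x) = C2, for example, the condition becomes (assuming, analogously, that ∇g and ∇h are linearly independent on the constraint set)

    ∇f(x) = λ ∇g(x) + μ ∇h(x).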

The constraint g(x) = C will have to be used again when applying the new criterion to find the solutions, since as it stands the criterion does not mention the particular value of C.
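As a sketch of how this plays out in practice (this snippet is illustrative and assumes the Python library sympy; the example function is the one used above, not one from the original text), one solves the stationarity equations together with the constraint:

    # Extrema of f(x, y) = x + y on the circle x**2 + y**2 = 1,
    # via grad(f - lam*g) = 0 together with the constraint g = 1.
    from sympy import symbols, diff, solve

    x, y, lam = symbols('x y lam', real=True)
    f = x + y
    g = x**2 + y**2

    equations = [diff(f - lam*g, x), diff(f - lam*g, y), g - 1]
    print(solve(equations, [x, y, lam]))
    # Two candidate points: (x, y) = (±sqrt(2)/2, ±sqrt(2)/2);
    # the positive one is the constrained maximum.

Note that appending the equation g - 1 = 0 to the system is exactly the 'use the constraint again' step.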

Example

Suppose we wish to find the discrete probability distribution p1, p2, …, pn with maximal information entropy. Then the function to be maximised is

    f(p1, p2, …, pn) = -∑k pk log2 pk.

Of course, the sum of these probabilities equals 1, so our constraint is

    g(p1, p2, …, pn) = ∑k pk = 1.

To find the point of maximum entropy (as a function of the probabilities), we can use a Lagrange multiplier λ. We set

    ∂/∂pk (f - λ(g - 1)) = 0   for k = 1, …, n.

Carrying out the differentiation of these n equations, we get

    -log2 pk - 1/ln 2 - λ = 0   for k = 1, …, n.

This shows that all the pk are equal (each depends on λ only). Using the constraint ∑k pk = 1 again, we get the uniform distribution

    pk = 1/n.
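As a quick numerical sanity check (a sketch, not from the original article; it assumes the Python library numpy), one can confirm that random distributions never beat the uniform one:

    # Among many random distributions on n = 4 outcomes, the entropy
    # never exceeds that of the uniform distribution, log2(n) = 2 bits.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 4

    def entropy(p):
        return -np.sum(p * np.log2(p))

    best = max(entropy(p / p.sum()) for p in rng.random((10000, n)))
    print(best, '<', np.log2(n))  # the bound is attained only by the uniform p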
For another example, see also Derivation of the partition function.

The method of Lagrange multipliers is generalized by the Kuhn-Tucker Theorem.

External links

For references to Lagrange's original work and for an account of the terminology see the Lagrange Multiplier entry in