Variational Calculus Part 2: The Euler-Lagrange Equation

Deriving the Euler-Lagrange equation, the fundamental differential equation that extremizing functions must satisfy in variational problems, using the first variation and the fundamental lemma.

In Part 1, we introduced functionals

\[ J[y] \]

– functions of functions – and established that a necessary condition for a function

\[ y(x) \]

to extremize

\[ J[y] \]

is that its first variation must vanish for all admissible variations

\[ \eta(x) \]

:

\[ \delta J[y; \eta] = \left. \frac{d}{d\epsilon} J[y + \epsilon \eta] \right\vert_{\epsilon=0} = 0 \]

This is the variational calculus equivalent of setting

\[ f'(x)=0 \]

in ordinary calculus. Now, our goal is to transform this abstract condition into a concrete, usable tool. We will focus on a very common type of functional and derive the celebrated Euler-Lagrange equation, a differential equation that the extremizing function

\[ y(x) \]

must satisfy.

1. The Standard Functional Form

Many problems in physics, engineering, and even machine learning involve functionals of the following form:

Definition. Standard Functional

A common type of functional depends on a function

\[ y(x) \]

, its first derivative

\[ y'(x) = dy/dx \]

, and the independent variable

\[ x \]

, integrated over an interval

\[ [a, b] \]

:

\[ J[y] = \int_a^b F(x, y(x), y'(x)) \, dx \]

Here,

\[ F(x, y, y') \]

is a given function of three variables, often called the Lagrangian or the integrand function. We assume

\[ F \]

has continuous first and second partial derivatives with respect to all of its arguments. We also assume

\[ y(x) \]

is twice continuously differentiable.

Examples from Part 1, like the arc length functional (

\[ F = \sqrt{1+(y')^2} \]

) and the Fermat’s principle functional (

\[ F = \sqrt{1+(y')^2}/v(x) \]

), fit this form.
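To make the definition concrete, here is a minimal numerical sketch that evaluates a standard functional for a given curve. The helper name `functional`, the midpoint rule, and the two test curves are illustrative assumptions, not part of the original text; the derivative is supplied by hand rather than computed.

```python
import math

def functional(F, y, dy, a, b, n=2000):
    """Approximate J[y] = integral_a^b F(x, y(x), y'(x)) dx with the midpoint rule."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h          # midpoint of the i-th subinterval
        total += F(x, y(x), dy(x))
    return total * h

def arc_length_F(x, y, yp):
    """Integrand of the arc length functional: sqrt(1 + (y')^2)."""
    return math.sqrt(1.0 + yp * yp)

# Two curves joining (0, 0) and (1, 1): a straight line and a parabola.
line_length = functional(arc_length_F, lambda x: x, lambda x: 1.0, 0.0, 1.0)
parabola_length = functional(arc_length_F, lambda x: x * x, lambda x: 2.0 * x, 0.0, 1.0)
print(line_length, parabola_length)
```

Both curves share the same endpoints, and the straight line gives the smaller value of the functional, as one would expect for arc length.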

2. Calculating the First Variation Explicitly

Let’s compute

\[ \delta J[y; \eta] \]

for the standard functional. Recall that

\[ \tilde{y}(x; \epsilon) = y(x) + \epsilon \eta(x) \]

. Then, its derivative is

\[ \tilde{y}'(x; \epsilon) = y'(x) + \epsilon \eta'(x) \]

. Substituting into the functional:

\[ J[y + \epsilon \eta] = \int_a^b F(x, y(x) + \epsilon \eta(x), y'(x) + \epsilon \eta'(x)) \, dx \]

To find the first variation, we differentiate this expression with respect to

\[ \epsilon \]

and then set

\[ \epsilon = 0 \]

. Assuming we can differentiate under the integral sign (Leibniz integral rule, valid here due to our smoothness assumptions):

\[ \delta J[y; \eta] = \left. \frac{d}{d\epsilon} \int_a^b F(x, y + \epsilon \eta, y' + \epsilon \eta') \, dx \right\vert_{\epsilon=0} \]
\[ \delta J[y; \eta] = \int_a^b \left. \frac{d}{d\epsilon} F(x, y + \epsilon \eta, y' + \epsilon \eta') \right\vert_{\epsilon=0} \, dx \]

Now, we apply the chain rule to the integrand

\[ F \]

. Let

\[ Y = y + \epsilon \eta \]

and

\[ Y' = y' + \epsilon \eta' \]

. Then

\[ F = F(x, Y, Y') \]

. So,

\[ \frac{dF}{d\epsilon} = \frac{\partial F}{\partial Y} \frac{\partial Y}{\partial \epsilon} + \frac{\partial F}{\partial Y'} \frac{\partial Y'}{\partial \epsilon} \]

. We have:

\[ \frac{\partial Y}{\partial \epsilon} = \frac{\partial}{\partial \epsilon}(y + \epsilon \eta) = \eta \]
\[ \frac{\partial Y'}{\partial \epsilon} = \frac{\partial}{\partial \epsilon}(y' + \epsilon \eta') = \eta' \]

Therefore,

\[ \frac{d}{d\epsilon} F(x, y + \epsilon \eta, y' + \epsilon \eta') = \frac{\partial F}{\partial (y + \epsilon \eta)} \eta + \frac{\partial F}{\partial (y' + \epsilon \eta')} \eta' \]

Setting

\[ \epsilon = 0 \]

, we get:

\[ \left. \frac{d}{d\epsilon} F(x, y + \epsilon \eta, y' + \epsilon \eta') \right\vert_{\epsilon=0} = \frac{\partial F}{\partial y}(x, y, y') \eta(x) + \frac{\partial F}{\partial y'}(x, y, y') \eta'(x) \]

For brevity, we’ll write

\[ \frac{\partial F}{\partial y} \]

and

\[ \frac{\partial F}{\partial y'} \]

, understanding they are evaluated at

\[ (x, y(x), y'(x)) \]

.

Plugging this back into the integral for

\[ \delta J[y; \eta] \]

:

\[ \delta J[y; \eta] = \int_a^b \left( \frac{\partial F}{\partial y} \eta(x) + \frac{\partial F}{\partial y'} \eta'(x) \right) \, dx \]

The necessary condition for an extremum is

\[ \delta J[y; \eta] = 0 \]

:

\[ \int_a^b \left( \frac{\partial F}{\partial y} \eta(x) + \frac{\partial F}{\partial y'} \eta'(x) \right) \, dx = 0 \]

This equation must hold for all admissible variation functions

\[ \eta(x) \]

.
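As a sanity check on the formula for the first variation, the sketch below compares it with a direct finite-difference derivative of J[y + εη] with respect to ε. The integrand F = y² + y'², the curve y = x², and the variation η = sin(πx) are arbitrary choices made for illustration only; the helper `integrate` is a plain midpoint rule.

```python
import math

def integrate(f, a, b, n=4000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Illustrative choices, not taken from the text:
# F(x, y, y') = y^2 + y'^2 on [0, 1], y(x) = x^2, eta(x) = sin(pi x).
def F(x, y, yp):
    return y * y + yp * yp

def F_y(x, y, yp):      # partial derivative of F with respect to y
    return 2.0 * y

def F_yp(x, y, yp):     # partial derivative of F with respect to y'
    return 2.0 * yp

def y(x): return x * x
def dy(x): return 2.0 * x
def eta(x): return math.sin(math.pi * x)
def deta(x): return math.pi * math.cos(math.pi * x)

def J(eps):
    """J[y + eps * eta], evaluated numerically."""
    return integrate(
        lambda x: F(x, y(x) + eps * eta(x), dy(x) + eps * deta(x)), 0.0, 1.0)

# Central difference in eps versus the integral formula for the first variation.
eps = 1e-5
finite_difference = (J(eps) - J(-eps)) / (2.0 * eps)
formula = integrate(
    lambda x: F_y(x, y(x), dy(x)) * eta(x) + F_yp(x, y(x), dy(x)) * deta(x),
    0.0, 1.0)
print(finite_difference, formula)
```

The two numbers should agree closely. They are not zero for this pair, which simply means that y(x) = x² does not extremize this particular functional.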

3. The Key Maneuver: Integration by Parts

The expression above involves both

\[ \eta(x) \]

and its derivative

\[ \eta'(x) \]

. To make progress, we want to factor out

\[ \eta(x) \]

from the entire integrand. We can achieve this by applying integration by parts to the second term:

\[ \int u \, dv = uv - \int v \, du \]

.

Let

\[ u = \frac{\partial F}{\partial y'} \]

and

\[ dv = \eta'(x) \, dx \]

. Then

\[ du = \frac{d}{dx} \left( \frac{\partial F}{\partial y'} \right) \, dx \]

and

\[ v = \eta(x) \]

.

So, the second term becomes:

\[ \int_a^b \frac{\partial F}{\partial y'} \eta'(x) \, dx = \left[ \frac{\partial F}{\partial y'} \eta(x) \right]_a^b - \int_a^b \eta(x) \frac{d}{dx} \left( \frac{\partial F}{\partial y'} \right) \, dx \]

The boundary term

\[ \left[ \frac{\partial F}{\partial y'} \eta(x) \right]_a^b = \frac{\partial F}{\partial y'}(b) \eta(b) - \frac{\partial F}{\partial y'}(a) \eta(a) \]

. Recall from Part 1 that for problems with fixed endpoints

\[ y(a)=y_a \]

and

\[ y(b)=y_b \]

, the admissible variations

\[ \eta(x) \]

must satisfy

\[ \eta(a) = 0 \]

and

\[ \eta(b) = 0 \]

. Therefore, for such problems, the boundary term vanishes:

\[ \left[ \frac{\partial F}{\partial y'} \eta(x) \right]_a^b = 0 \]

Note on Boundary Conditions: If the endpoints are not fixed (free-endpoint problems), then

\[ \eta(a) \]

and

\[ \eta(b) \]

are not necessarily zero, and the boundary term no longer vanishes automatically. Requiring it to vanish for all such variations leads to additional conditions on

\[ \frac{\partial F}{\partial y'} \]

at the endpoints, the so-called “natural boundary conditions”. We will focus on fixed endpoints for now.

Substituting the result of the integration by parts (with the vanishing boundary term) back into the equation for

\[ \delta J = 0 \]

:

\[ \int_a^b \frac{\partial F}{\partial y} \eta(x) \, dx - \int_a^b \eta(x) \frac{d}{dx} \left( \frac{\partial F}{\partial y'} \right) \, dx = 0 \]

Combining the integrals:

\[ \int_a^b \left( \frac{\partial F}{\partial y} - \frac{d}{dx} \left( \frac{\partial F}{\partial y'} \right) \right) \eta(x) \, dx = 0 \]

This equation is crucial. It states that the integral of the product of the term in parentheses and

\[ \eta(x) \]

is zero for any admissible variation function

\[ \eta(x) \]

. This leads us to a powerful lemma.
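The integration by parts step, including the role of the boundary term, can be checked numerically. This is a small sketch under assumed choices: F = y² + y'² with y = x³ (so ∂F/∂y' = 6x² and its x-derivative is 12x), one variation that vanishes at both endpoints, and one that does not. None of these specific functions come from the original text.

```python
def integrate(f, a, b, n=4000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Assumed example: F = y^2 + y'^2 and y(x) = x^3 on [0, 1].
def dF_dyp(x):          # dF/dy' = 2 y' = 6 x^2 along the curve
    return 6.0 * x * x

def ddx_dF_dyp(x):      # d/dx (dF/dy') = 2 y'' = 12 x along the curve
    return 12.0 * x

# Case 1: eta(x) = x (1 - x) vanishes at both endpoints, so no boundary term.
def eta(x): return x * (1.0 - x)
def deta(x): return 1.0 - 2.0 * x

lhs = integrate(lambda x: dF_dyp(x) * deta(x), 0.0, 1.0)
rhs = -integrate(lambda x: eta(x) * ddx_dF_dyp(x), 0.0, 1.0)

# Case 2: eta(x) = x does not vanish at x = 1, so the boundary term is needed.
boundary = dF_dyp(1.0) * 1.0 - dF_dyp(0.0) * 0.0
lhs_free = integrate(lambda x: dF_dyp(x) * 1.0, 0.0, 1.0)
rhs_free = boundary - integrate(lambda x: x * ddx_dF_dyp(x), 0.0, 1.0)

print(lhs, rhs, lhs_free, rhs_free)
```

In the first case the two sides match without any boundary contribution; in the second they match only once the boundary term is included, which is why the fixed-endpoint assumption matters.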

4. The Fundamental Lemma of Variational Calculus

The equation we’ve reached is of the form

\[ \int_a^b g(x) \eta(x) \, dx = 0 \]

, where

\[ g(x) = \frac{\partial F}{\partial y} - \frac{d}{dx} \left( \frac{\partial F}{\partial y'} \right) \]

.

Lemma. Fundamental Lemma of Variational Calculus (du Bois-Reymond)

If a function

\[ g(x) \]

is continuous on the interval

\[ [a, b] \]

, and if

\[ \int_a^b g(x) \eta(x) \, dx = 0 \]

for every continuously differentiable function

\[ \eta(x) \]

such that

\[ \eta(a) = 0 \]

and

\[ \eta(b) = 0 \]

, then

\[ g(x) = 0 \]

for all

\[ x \in [a, b] \]

.

Intuition behind the Lemma: Suppose, for the sake of contradiction, that

\[ g(x_0) \neq 0 \]

for some

\[ x_0 \in (a, b) \]

. Let’s say

\[ g(x_0) > 0 \]

. Since

\[ g(x) \]

is continuous, there must be a small subinterval around

\[ x_0 \]

, say

\[ [c, d] \subset (a, b) \]

, where

\[ g(x) > 0 \]

throughout this subinterval.

Now, we can construct a specific variation function

\[ \eta(x) \]

that is positive in the interior of

\[ [c, d] \]

and zero outside this subinterval (and still satisfies

\[ \eta(a)=\eta(b)=0 \]

because

\[ [c,d] \]

is strictly inside

\[ (a,b) \]

). Such functions, often called “bump functions,” can be made smooth. For example, one could choose

\[ \eta(x) = (x-c)^2(x-d)^2 \]

for

\[ x \in [c,d] \]

and

\[ \eta(x)=0 \]

otherwise, which is continuously differentiable as the lemma requires (a smoother version built from exponentials is described under “More on Bump Functions” below).

For such an

\[ \eta(x) \]

:

\[ g(x) \eta(x) > 0 \]

for

\[ x \in (c, d) \]
\[ g(x) \eta(x) = 0 \]

for

\[ x \notin [c, d] \]

Then the integral

\[ \int_a^b g(x) \eta(x) \, dx = \int_c^d g(x) \eta(x) \, dx \]

would be strictly positive. This contradicts our premise that the integral is zero for all admissible

\[ \eta(x) \]

. Therefore, our assumption that

\[ g(x_0) \neq 0 \]

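must be false; the case

\[ g(x_0) < 0 \]

is handled the same way with the signs reversed. Since

\[ g \]

then vanishes on the open interval and is continuous, it also vanishes at the endpoints. Thus,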

\[ g(x) = 0 \]

for all

\[ x \in [a, b] \]

.
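The contradiction argument can be illustrated numerically. In this sketch, g(x) = x − 0.5 is a hypothetical function that is positive near x₀ = 0.75, the subinterval [c, d] = [0.6, 0.9] is chosen by hand, and η is the polynomial variation from the text; the integral of their product comes out strictly positive.

```python
def integrate(f, a, b, n=4000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

a, b = 0.0, 1.0
c, d = 0.6, 0.9           # subinterval on which g is positive (chosen by hand)

def g(x):
    return x - 0.5        # hypothetical g with g(0.75) > 0

def eta(x):
    """(x - c)^2 (x - d)^2 on [c, d], zero elsewhere; vanishes at a and b."""
    if c <= x <= d:
        return (x - c) ** 2 * (x - d) ** 2
    return 0.0

# Only the part of the integral over [c, d] contributes, and it is positive there.
value = integrate(lambda x: g(x) * eta(x), a, b)
print(value)
```

A single admissible η with a nonzero integral is enough to violate the hypothesis of the lemma, so a continuous g that is nonzero anywhere in the interior cannot satisfy it.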

More on Bump Functions

A common example of a smooth bump function that is non-zero only on a finite interval, say

\[ (-1, 1) \]

, is:

\[ B(t) = \begin{cases} \exp\left(-\frac{1}{1-t^2}\right) & \text{if } \vert t \vert < 1 \\ 0 & \text{if } \vert t \vert \ge 1 \end{cases} \]

This function is infinitely differentiable everywhere, including at

\[ t=\pm 1 \]

where all derivatives are zero. By scaling and translating

\[ t \]

, we can create such a bump function

\[ \eta(x) \]

over any desired subinterval

\[ [c, d] \]

within

\[ [a, b] \]

. This rigorous construction underpins the Fundamental Lemma.
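The scaling and translation mentioned above can be sketched in a few lines. The function names `bump` and `bump_on` are made up for this illustration, and the subinterval [0.6, 0.9] is an arbitrary choice.

```python
import math

def bump(t):
    """B(t) = exp(-1 / (1 - t^2)) for |t| < 1, and 0 otherwise."""
    if abs(t) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - t * t))

def bump_on(c, d):
    """Return eta(x) = B((x - mid) / half), which is nonzero only inside (c, d)."""
    mid, half = 0.5 * (c + d), 0.5 * (d - c)
    return lambda x: bump((x - mid) / half)

eta = bump_on(0.6, 0.9)
print(eta(0.75), eta(0.7), eta(0.6), eta(0.95))
```

The map x ↦ (x − mid) / half sends [c, d] onto [−1, 1], so the rescaled function inherits the smoothness of B and vanishes outside the chosen subinterval.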

5. The Euler-Lagrange Equation

Applying the Fundamental Lemma of Variational Calculus to our equation:

\[ \int_a^b \left( \frac{\partial F}{\partial y} - \frac{d}{dx} \left( \frac{\partial F}{\partial y'} \right) \right) \eta(x) \, dx = 0 \]

The term in parentheses plays the role of

\[ g(x) \]

. If this integral is zero for all admissible

\[ \eta(x) \]

, then the term itself must be identically zero:

Theorem. The Euler-Lagrange Equation

A function

\[ y(x) \]

that extremizes the functional

\[ J[y] = \int_a^b F(x, y(x), y'(x)) \, dx \]

with fixed boundary conditions

\[ y(a)=y_a \]

and

\[ y(b)=y_b \]

, must satisfy the following second-order ordinary differential equation:

\[ \frac{\partial F}{\partial y} - \frac{d}{dx} \left( \frac{\partial F}{\partial y'} \right) = 0 \]

This is known as the Euler-Lagrange equation.

Understanding the terms:

\[ \frac{\partial F}{\partial y} \]

: The partial derivative of

\[ F(x, y, y') \]

with respect to its second argument

\[ y \]

, treating

\[ x \]

and

\[ y' \]

as constants.

\[ \frac{\partial F}{\partial y'} \]

: The partial derivative of

\[ F(x, y, y') \]

with respect to its third argument

\[ y' \]

, treating

\[ x \]

and

\[ y \]

as constants.

\[ \frac{d}{dx} \left( \frac{\partial F}{\partial y'} \right) \]

: The total derivative with respect to

\[ x \]

of the expression

\[ \frac{\partial F}{\partial y'} \]

. Since

\[ y \]

and

\[ y' \]

are functions of

\[ x \]

, this derivative will generally involve

\[ y'(x) \]

and

\[ y''(x) \]

via the chain rule:

\[ \frac{d}{dx} \left( \frac{\partial F}{\partial y'}(x, y(x), y'(x)) \right) = \frac{\partial^2 F}{\partial x \partial y'} + \frac{\partial^2 F}{\partial y \partial y'} y' + \frac{\partial^2 F}{\partial y'^2} y'' \]

The Euler-Lagrange equation is a differential equation for the unknown function

\[ y(x) \]

. Solving it (subject to the boundary conditions) provides the candidate functions that could extremize the functional.
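One way to get a feel for the equation is to evaluate its left-hand side numerically for a candidate curve. The sketch below is a rough finite-difference estimate, not a solver: `el_residual` is a hypothetical helper, the step size h is an arbitrary choice, and the candidate's derivative is supplied by hand. It is applied to the arc length integrand from Part 1.

```python
import math

def el_residual(F, y, dy, x, h=1e-4):
    """Finite-difference estimate of dF/dy - d/dx(dF/dy') at the point x."""
    def F_y(s):
        # Partial derivative with respect to the second argument, at fixed x and y'.
        return (F(s, y(s) + h, dy(s)) - F(s, y(s) - h, dy(s))) / (2.0 * h)

    def F_yp(s):
        # Partial derivative with respect to the third argument, at fixed x and y.
        return (F(s, y(s), dy(s) + h) - F(s, y(s), dy(s) - h)) / (2.0 * h)

    # Total derivative in x: F_yp is re-evaluated along the curve at x + h and x - h.
    total_derivative = (F_yp(x + h) - F_yp(x - h)) / (2.0 * h)
    return F_y(x) - total_derivative

def arc_length_F(x, y, yp):
    return math.sqrt(1.0 + yp * yp)

line = el_residual(arc_length_F, lambda x: 2.0 * x + 1.0, lambda x: 2.0, 0.3)
parabola = el_residual(arc_length_F, lambda x: x * x, lambda x: 2.0 * x, 0.0)
print(line, parabola)
```

The straight line gives a residual of essentially zero, so it satisfies the equation at that point, while the parabola gives a residual close to −2 at x = 0 and therefore cannot be an extremal of the arc length functional.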

6. Significance and What’s Next

The derivation of the Euler-Lagrange equation is a monumental step in variational calculus. It converts the problem of optimizing over an infinite-dimensional space of functions into the more familiar problem of solving a differential equation.

  1. Global Criterion to Local Rule: The original problem was to extremize a global quantity (the integral
\[ J[y] \]

). The Euler-Lagrange equation provides a local condition (a differential equation) that must hold at every point

\[ x \]

.

  2. Universality: This single method, encapsulated by the Euler-Lagrange equation, can tackle a vast array of problems that seek to find an optimal function, simply by identifying the correct integrand
\[ F(x, y, y') \]

.

In the next part of this crash course, we will:

  • Apply the Euler-Lagrange equation to solve some classic variational problems, such as finding the shortest path between two points (revisiting our initial example) and the brachistochrone problem.
  • Discuss some special cases and first integrals of the Euler-Lagrange equation (Beltrami identity).

This will demonstrate the power and utility of the machinery we’ve developed.

This post is licensed under CC BY 4.0 by the author.