Variational Calculus Part 2: The Euler-Lagrange Equation
Deriving the Euler-Lagrange equation, the fundamental differential equation that extremizing functions must satisfy in variational problems, using the first variation and the fundamental lemma.
In Part 1, we introduced functionals – functions of functions – and established that a necessary condition for a function \[ y(x) \] to extremize a functional \[ J[y] \] is that its first variation must vanish for all admissible variations \[ \eta(x) \]:

\[ \delta J[y; \eta] = 0 \]

This is the variational calculus equivalent of setting \[ f'(x) = 0 \] in ordinary calculus. Now, our goal is to transform this abstract condition into a concrete, usable tool. We will focus on a very common type of functional and derive the celebrated Euler-Lagrange equation, a differential equation that the extremizing function \[ y(x) \] must satisfy.
1. The Standard Functional Form
Many problems in physics, engineering, and even machine learning involve functionals of the following form:
Definition. Standard Functional
A common type of functional depends on a function
\[ y(x) \], its first derivative
\[ y'(x) = dy/dx \], and the independent variable
\[ x \], integrated over an interval
\[ [a, b] \]:
\[ J[y] = \int_a^b F(x, y(x), y'(x)) \, dx \]

Here, \[ F(x, y, y') \] is a given function of three variables, often called the Lagrangian or the integrand function. We assume \[ F \] has continuous partial derivatives with respect to its arguments. We also assume \[ y(x) \] is twice continuously differentiable.
Examples from Part 1, like the arc length functional ( \[ F = \sqrt{1 + (y')^2} \] ) and the Fermat's principle functional ( \[ F = \frac{\sqrt{1 + (y')^2}}{v(x, y)} \] ), fit this form.
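To make the standard form concrete, here is a small numerical sketch (assuming NumPy is available; the trial curves and grid size are illustrative choices, not from the text) that evaluates the arc-length functional for two curves joining the same endpoints:

```python
import numpy as np

def J(yp, a=0.0, b=1.0, n=10001):
    """Approximate the arc-length functional J[y] = integral of sqrt(1 + y'(x)^2)
    over [a, b] by the trapezoidal rule, given a callable for y'(x)."""
    x = np.linspace(a, b, n)
    F = np.sqrt(1.0 + yp(x) ** 2)                   # integrand F(x, y, y')
    return np.sum((F[1:] + F[:-1]) / 2 * np.diff(x))

# Two curves joining (0, 0) to (1, 1): the straight line y = x and y = x^2.
line  = J(lambda x: np.ones_like(x))   # y' = 1
parab = J(lambda x: 2 * x)             # y' = 2x

print(line)    # ≈ sqrt(2) ≈ 1.41421
print(parab)   # ≈ 1.47894 — the parabola is longer, as expected
```

The straight line gives the smaller value, in line with the intuition that it should minimize arc length; the Euler-Lagrange equation derived below confirms this.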
2. Calculating the First Variation Explicitly
Let’s compute \[ \delta J \] for the standard functional. Recall that \[ y_\varepsilon(x) = y(x) + \varepsilon \eta(x) \]. Then, its derivative is \[ y_\varepsilon'(x) = y'(x) + \varepsilon \eta'(x) \]. Substituting into the functional:

\[ J[y + \varepsilon \eta] = \int_a^b F\big(x, \, y(x) + \varepsilon \eta(x), \, y'(x) + \varepsilon \eta'(x)\big) \, dx \]

To find the first variation, we differentiate this expression with respect to \[ \varepsilon \] and then set \[ \varepsilon = 0 \]. Assuming we can differentiate under the integral sign (Leibniz integral rule, valid here due to our smoothness assumptions):

\[ \frac{d}{d\varepsilon} J[y + \varepsilon \eta] = \int_a^b \frac{d}{d\varepsilon} F\big(x, \, y + \varepsilon \eta, \, y' + \varepsilon \eta'\big) \, dx \]

Now, we apply the chain rule to the integrand \[ F(x, u, v) \]. Let \[ u = y + \varepsilon \eta \] and \[ v = y' + \varepsilon \eta' \]. Then \[ \frac{du}{d\varepsilon} = \eta \] and \[ \frac{dv}{d\varepsilon} = \eta' \]. We have:

\[ \frac{d}{d\varepsilon} F(x, u, v) = \frac{\partial F}{\partial u} \frac{du}{d\varepsilon} + \frac{\partial F}{\partial v} \frac{dv}{d\varepsilon} = \frac{\partial F}{\partial u} \eta + \frac{\partial F}{\partial v} \eta' \]

Therefore,

\[ \frac{d}{d\varepsilon} J[y + \varepsilon \eta] = \int_a^b \left( \frac{\partial F}{\partial u} \eta + \frac{\partial F}{\partial v} \eta' \right) dx \]

Setting \[ \varepsilon = 0 \] (so that \[ u = y \] and \[ v = y' \]), we get:

\[ \delta J[y; \eta] = \int_a^b \left( \frac{\partial F}{\partial y} \eta + \frac{\partial F}{\partial y'} \eta' \right) dx \]

For brevity, we’ll write \[ F_y = \frac{\partial F}{\partial y} \] and \[ F_{y'} = \frac{\partial F}{\partial y'} \], understanding they are evaluated at \[ (x, y(x), y'(x)) \].

Plugging this back into the integral for \[ \delta J \]:

\[ \delta J[y; \eta] = \int_a^b \left( F_y \, \eta + F_{y'} \, \eta' \right) dx \]

The necessary condition for an extremum is \[ \delta J[y; \eta] = 0 \]:

\[ \int_a^b \left( F_y \, \eta + F_{y'} \, \eta' \right) dx = 0 \]

This equation must hold for all admissible variation functions \[ \eta(x) \].
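The first-variation formula can be sanity-checked numerically. The sketch below (assuming NumPy; the trial function \[ y = x^2 \] and variation \[ \eta = x(1 - x) \] are arbitrary illustrative choices) compares the formula against a finite-difference estimate of \[ \frac{d}{d\varepsilon} J[y + \varepsilon \eta] \] for the arc-length integrand:

```python
import numpy as np

x = np.linspace(0.0, 1.0, 20001)

def trapz(f):
    """Trapezoidal rule on the fixed grid x."""
    return np.sum((f[1:] + f[:-1]) / 2 * np.diff(x))

# Arc-length integrand F = sqrt(1 + y'^2): F_y = 0, F_{y'} = y' / sqrt(1 + y'^2).
yp   = 2 * x              # y = x^2, so y' = 2x
eta  = x * (1 - x)        # admissible variation: eta(0) = eta(1) = 0
etap = 1 - 2 * x          # eta'

def J(yp_vals):
    return trapz(np.sqrt(1.0 + yp_vals ** 2))

# First variation from the formula (F_y = 0 for arc length):
dJ_formula = trapz(yp / np.sqrt(1.0 + yp ** 2) * etap)

# Central finite-difference estimate of dJ/d(eps) at eps = 0:
eps = 1e-6
dJ_numeric = (J(yp + eps * etap) - J(yp - eps * etap)) / (2 * eps)

print(dJ_formula, dJ_numeric)   # the two values should agree closely
```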
3. The Key Maneuver: Integration by Parts
The expression above involves both \[ \eta \] and its derivative \[ \eta' \]. To make progress, we want to factor out \[ \eta \] from the entire integrand. We can achieve this by applying integration by parts to the second term: \[ \int_a^b F_{y'} \, \eta' \, dx \].

Let \[ u = F_{y'} \] and \[ dv = \eta' \, dx \]. Then \[ du = \frac{d}{dx}\left( F_{y'} \right) dx \] and \[ v = \eta \].

So, the second term becomes:

\[ \int_a^b F_{y'} \, \eta' \, dx = \Big[ F_{y'} \, \eta \Big]_a^b - \int_a^b \frac{d}{dx}\left( F_{y'} \right) \eta \, dx \]

The boundary term is \[ \Big[ F_{y'} \, \eta \Big]_a^b = F_{y'}(b) \, \eta(b) - F_{y'}(a) \, \eta(a) \]. Recall from Part 1 that for problems with fixed endpoints \[ y(a) = y_a \] and \[ y(b) = y_b \], the admissible variations \[ \eta \] must satisfy \[ \eta(a) = 0 \] and \[ \eta(b) = 0 \]. Therefore, for such problems, the boundary term vanishes:

\[ \Big[ F_{y'} \, \eta \Big]_a^b = 0 \]
Note on Boundary Conditions: If the endpoints are not fixed (so-called “natural boundary conditions”), then
\[ \eta(a) \]and
\[ \eta(b) \]are not necessarily zero, and the boundary terms must be handled differently. This leads to additional conditions on
\[ \frac{\partial F}{\partial y'} \]at the endpoints. We will focus on fixed endpoints for now.
Substituting the result of the integration by parts (with the vanishing boundary term) back into the equation for \[ \delta J \]:

\[ \delta J[y; \eta] = \int_a^b F_y \, \eta \, dx - \int_a^b \frac{d}{dx}\left( F_{y'} \right) \eta \, dx \]

Combining the integrals:

\[ \delta J[y; \eta] = \int_a^b \left( F_y - \frac{d}{dx} F_{y'} \right) \eta \, dx = 0 \]

This equation is crucial. It states that the integral of the product of the term in parentheses and \[ \eta(x) \] is zero for any admissible variation function \[ \eta \]. This leads us to a powerful lemma.
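The integration-by-parts step can itself be checked numerically. In this sketch (assuming NumPy; the curve \[ y = x^2 \] and variation \[ \eta = x(1 - x) \] are illustrative choices, with \[ \frac{d}{dx} F_{y'} \] worked out by hand for the arc-length integrand), the two forms of the integral agree because the boundary term vanishes:

```python
import numpy as np

x = np.linspace(0.0, 1.0, 20001)

def trapz(f):
    """Trapezoidal rule on the fixed grid x."""
    return np.sum((f[1:] + f[:-1]) / 2 * np.diff(x))

eta, etap = x * (1 - x), 1 - 2 * x     # variation with eta(0) = eta(1) = 0

# Along y = x^2 with F = sqrt(1 + y'^2):
#   F_{y'}      = 2x / sqrt(1 + 4x^2)
#   d/dx F_{y'} = 2 / (1 + 4x^2)^(3/2)   (computed by hand)
Fyp     = 2 * x / np.sqrt(1.0 + 4.0 * x ** 2)
dFyp_dx = 2.0 / (1.0 + 4.0 * x ** 2) ** 1.5

before = trapz(Fyp * etap)        # integral of F_{y'} eta', before parts
after  = -trapz(dFyp_dx * eta)    # minus integral of (d/dx F_{y'}) eta, after parts

print(before, after)              # equal up to quadrature error
```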
4. The Fundamental Lemma of Variational Calculus
The equation we’ve reached is of the form \[ \int_a^b g(x) \, \eta(x) \, dx = 0 \], where \[ g(x) = F_y - \frac{d}{dx} F_{y'} \].
Lemma. Fundamental Lemma of Variational Calculus (du Bois-Reymond)
If a function \[ g(x) \] is continuous on the interval \[ [a, b] \], and if \[ \int_a^b g(x) \eta(x) \, dx = 0 \] for every continuously differentiable function \[ \eta(x) \] such that \[ \eta(a) = 0 \] and \[ \eta(b) = 0 \], then \[ g(x) = 0 \] for all \[ x \in [a, b] \].
Intuition behind the Lemma: Suppose, for the sake of contradiction, that \[ g(x_0) \neq 0 \] for some \[ x_0 \in (a, b) \]. Let’s say \[ g(x_0) > 0 \]. Since \[ g \] is continuous, there must be a small subinterval around \[ x_0 \], say \[ (x_0 - \delta, x_0 + \delta) \], where \[ g(x) > 0 \] throughout this subinterval.

Now, we can construct a specific variation function \[ \eta(x) \] that is positive within \[ (x_0 - \delta, x_0 + \delta) \] and zero outside this subinterval (and still satisfies \[ \eta(a) = \eta(b) = 0 \] because \[ (x_0 - \delta, x_0 + \delta) \] is strictly inside \[ (a, b) \]). Such functions, often called “bump functions,” can be made smooth. For example, one could choose \[ \eta(x) = \big( x - (x_0 - \delta) \big)^2 \big( (x_0 + \delta) - x \big)^2 \] for \[ x \in (x_0 - \delta, x_0 + \delta) \] and \[ \eta(x) = 0 \] otherwise (or a smoother version using exponentials, as described below).

For such an \[ \eta \]:

- \[ g(x) \eta(x) > 0 \] for \[ x \in (x_0 - \delta, x_0 + \delta) \]
- \[ g(x) \eta(x) = 0 \] for \[ x \notin (x_0 - \delta, x_0 + \delta) \]

Then the integral \[ \int_a^b g(x) \eta(x) \, dx \] would be strictly positive. This contradicts our premise that the integral is zero for all admissible \[ \eta \]. Therefore, our assumption that \[ g(x_0) \neq 0 \] must be false. Thus, \[ g(x) = 0 \] for all \[ x \in [a, b] \].
More on Bump Functions

A common example of a smooth bump function that is non-zero only on a finite interval, say \[ (-1, 1) \], is:

\[ \phi(x) = \begin{cases} e^{-\frac{1}{1 - x^2}} & \text{if } |x| < 1 \\ 0 & \text{if } |x| \geq 1 \end{cases} \]

This function is infinitely differentiable everywhere, including at \[ x = \pm 1 \], where all derivatives are zero. By scaling and translating \[ \phi \], we can create such a bump function \[ \eta(x) \] over any desired subinterval \[ (x_0 - \delta, x_0 + \delta) \] within \[ (a, b) \]. This rigorous construction underpins the Fundamental Lemma.
5. The Euler-Lagrange Equation
Applying the Fundamental Lemma of Variational Calculus to our equation:

\[ \int_a^b \left( F_y - \frac{d}{dx} F_{y'} \right) \eta \, dx = 0 \]

The term in parentheses plays the role of \[ g(x) \]. If this integral is zero for all admissible \[ \eta \], then the term itself must be identically zero:
Theorem. The Euler-Lagrange Equation
A function \[ y(x) \] that extremizes the functional \[ J[y] = \int_a^b F(x, y(x), y'(x)) \, dx \] with fixed boundary conditions \[ y(a) = y_a \] and \[ y(b) = y_b \] must satisfy the following second-order ordinary differential equation:

\[ \frac{\partial F}{\partial y} - \frac{d}{dx} \left( \frac{\partial F}{\partial y'} \right) = 0 \]

This is known as the Euler-Lagrange equation.
Understanding the terms:

- \[ \frac{\partial F}{\partial y} \]: The partial derivative of \[ F \] with respect to its second argument \[ y \], treating \[ x \] and \[ y' \] as constants.
- \[ \frac{\partial F}{\partial y'} \]: The partial derivative of \[ F \] with respect to its third argument \[ y' \], treating \[ x \] and \[ y \] as constants.
- \[ \frac{d}{dx} \left( \frac{\partial F}{\partial y'} \right) \]: The total derivative with respect to \[ x \] of the expression \[ \frac{\partial F}{\partial y'}(x, y(x), y'(x)) \]. Since \[ y \] and \[ y' \] are functions of \[ x \], this derivative will generally involve \[ y'(x) \] and \[ y''(x) \] via the chain rule:

\[ \frac{d}{dx} \left( \frac{\partial F}{\partial y'}(x, y(x), y'(x)) \right) = \frac{\partial^2 F}{\partial x \partial y'} + \frac{\partial^2 F}{\partial y \partial y'} y' + \frac{\partial^2 F}{\partial y'^2} y'' \]
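This chain-rule expansion can be verified symbolically. The sketch below (assuming SymPy is available; the sample integrand \[ F = x \, y^2 \, (y')^3 \] is an arbitrary choice for illustration) compares the term-by-term expansion against a direct total derivative:

```python
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')
u, v = sp.symbols('u v')          # stand-ins for the arguments y and y'

# Sample integrand with explicit x-, y-, and y'-dependence (arbitrary choice).
F = x * u ** 2 * v ** 3
Fyp = sp.diff(F, v)               # partial F / partial y'

# Term-by-term chain-rule expansion of d/dx F_{y'} along the curve:
expanded = (sp.diff(Fyp, x)
            + sp.diff(Fyp, u) * y(x).diff(x)
            + sp.diff(Fyp, v) * y(x).diff(x, 2)
            ).subs({u: y(x), v: y(x).diff(x)})

# Direct total derivative for comparison:
direct = sp.diff(Fyp.subs({u: y(x), v: y(x).diff(x)}), x)

print(sp.simplify(expanded - direct))   # 0
```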
The Euler-Lagrange equation is a differential equation for the unknown function \[ y(x) \]. Solving it (subject to the boundary conditions) provides the candidate functions that could extremize the functional.
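As a worked example (assuming SymPy is available), one can form the Euler-Lagrange expression for the arc-length integrand \[ F = \sqrt{1 + (y')^2} \] and confirm that it reduces to \[ y'' = 0 \], whose solutions are straight lines:

```python
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')
yp = y(x).diff(x)

# Arc-length integrand F(x, y, y') = sqrt(1 + y'^2)
F = sp.sqrt(1 + yp ** 2)

# Euler-Lagrange expression: dF/dy - d/dx(dF/dy')
EL = sp.diff(F, y(x)) - sp.diff(sp.diff(F, yp), x)
print(sp.simplify(EL))      # proportional to y''(x), so E-L reduces to y'' = 0

# Every straight line satisfies the equation:
print(sp.simplify(EL.subs(y(x), 3 * x + 1).doit()))   # 0

# Solving y'' = 0 recovers the general straight line:
print(sp.dsolve(sp.Eq(y(x).diff(x, 2), 0), y(x)))     # Eq(y(x), C1 + C2*x)
```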
6. Significance and What’s Next
The derivation of the Euler-Lagrange equation is a monumental step in variational calculus. It converts the problem of optimizing over an infinite-dimensional space of functions into the more familiar problem of solving a differential equation.
- Global Criterion to Local Rule: The original problem was to minimize a global quantity (the integral \[ J[y] \]). The Euler-Lagrange equation provides a local condition (a differential equation) that must hold at every point \[ x \in (a, b) \].
- Universality: This single method, encapsulated by the Euler-Lagrange equation, can tackle a vast array of problems that seek to find an optimal function, simply by identifying the correct integrand \[ F \].
In the next part of this crash course, we will:
- Apply the Euler-Lagrange equation to solve some classic variational problems, such as finding the shortest path between two points (revisiting our initial example) and the brachistochrone problem.
- Discuss some special cases and first integrals of the Euler-Lagrange equation (Beltrami identity).
This will demonstrate the power and utility of the machinery we’ve developed.