Motivating Hilbert Spaces: Encoding Geometry

Generalizing the dot product to function spaces and demanding completeness leads to Hilbert spaces, essential for geometry and analysis in infinite dimensions.

Posted May 29, 2025

By Ji-Ha Kim

16 min read

Welcome back! Previously, we laid down some basic definitions in functional analysis and distinguished “kets” (vectors) from “bras” (linear functionals). To motivate this abstraction, we explore how functions can be viewed as infinite-dimensional vector spaces equip these function spaces with geometric structure, transferring lots of intuition and results from finite-dimensional linear algebra to the realm of functions.

1. Introduction: Beyond Vector Spaces

Functions forming a vector space is a start, but finite-dimensional vector spaces like \(\mathbb{R}^n\) offer more. The dot product provides crucial geometric tools:

Length (Norm): The magnitude of a vector.
Distance: Separation between two vectors.
Angles & Orthogonality: Orientation and perpendicularity.

To ask “how large is this function?”, “how similar are these functions?”, or “are these functions orthogonal?”, we need to generalize the dot product. This leads to inner products and ultimately to Hilbert spaces, vital for mathematics, physics, and machine learning.

2. The Inner Product: A “Dot Product” for Functions

Intuition from \(\mathbb{R}^n\)

For \(\mathbf{x} = (x_1, \dots, x_n)\) and \(\mathbf{y} = (y_1, \dots, y_n)\) in \(\mathbb{R}^n\), the dot product is:

\[\mathbf{x} \cdot \mathbf{y} = \sum_{i=1}^n x_i y_i\]

Generalizing to Functions: The \(L_2\) Inner Product

How can this extend to real-valued functions \(f(t), g(t)\) on an interval \([a,b]\)? View a function as a “vector” with infinitely many components, indexed by \(t \in [a,b]\). Then, as a common trick in analysis, we apply a limit on the discrete case to recover the continuous case.

Approximate with Step Functions: Divide \([a,b]\) into \(N\) subintervals \(I_k = [t_{k-1}, t_k]\) of width \(\Delta t_k\). Pick sample points \(t_k^\ast \in I_k\). Define step functions:
- \(f_N(t) = f(t_k^\ast )\) for \(t \in I_k\)
- \(g_N(t) = g(t_k^\ast )\) for \(t \in I_k\)
Inner Product for Step Functions: A natural generalization of \(\sum x_i y_i\) weights each product \(f(t_k^\ast )g(t_k^\ast )\) by the subinterval length \(\Delta t_k\):
\[\sum_{k=1}^N f(t_k^\ast )g(t_k^\ast ) \Delta t_k\]
This is a Riemann sum for \(\int_a^b f(t)g(t)dt\).
Taking the Limit: As \(N \to \infty\) and max \(\Delta t_k \to 0\), if \(f,g\) are Riemann integrable, the sum converges:
\[\lim_{N \to \infty} \sum_{k=1}^N f(t_k^\ast )g(t_k^\ast ) \Delta t_k = \int_a^b f(t)g(t)dt\]

This integral, \(\int_a^b f(t)g(t)dt\), is the natural dot product extension for real functions. For complex-valued functions, to ensure \(\langle f \vert f \rangle \ge 0\), we use a complex conjugate:

\[\langle f \vert g \rangle_{L_2} = \int_a^b \overline{f(t)} g(t) dt\]

This ensures \(\langle f \vert f \rangle_{L_2} = \int_a^b \vert f(t) \vert^2 dt \ge 0\). We generally use \(\langle f \vert g \rangle\) for inner products.

Definition 2.1: Inner Product
An inner product on a vector space \(V\) over \(\mathbb{F}\) (\(\mathbb{R}\) or \(\mathbb{C}\)) is a function \(\langle \cdot \vert \cdot \rangle : V \times V \to \mathbb{F}\) satisfying for all kets \(\vert f \rangle, \vert g \rangle, \vert h \rangle \in V\) and scalars \(\alpha, \beta \in \mathbb{F}\):
Conjugate Symmetry: \(\langle f \vert g \rangle = \overline{\langle g \vert f \rangle}\). (For real spaces: \(\langle f \vert g \rangle = \langle g \vert f \rangle\)).
Linearity in the second argument (ket): \(\langle f \vert (\alpha \vert g \rangle + \beta \vert h \rangle) \rangle = \alpha \langle f \vert g \rangle + \beta \langle f \vert h \rangle\). (This implies conjugate-linearity in the first argument (bra): \(\langle (\alpha \vert f \rangle + \beta \vert g \rangle) \vert h \rangle = \bar{\alpha} \langle f \vert h \rangle + \bar{\beta} \langle g \vert h \rangle\)).
Positive-definiteness: \(\langle f \vert f \rangle \ge 0\), and \(\langle f \vert f \rangle = 0 \iff \vert f \rangle = \vert \mathbf{0} \rangle\).
A vector space with an inner product is an inner product space.

Example: The \(L_2\) Inner Product (Formalized)

The \(L_2\) inner product for complex-valued, square-integrable functions \(f, g\) on \([a,b]\) (or a measure space \((\Omega, \Sigma, \mu)\)) is:

\[\langle f \vert g \rangle = \int_a^b \overline{f(x)} g(x) dx\]

(Or \(\int_\Omega \overline{f(x)} g(x) d\mu(x)\) for measure \(\mu\)). The space of functions where \(\int_a^b \vert f(x) \vert^2 dx < \infty\) is \(L_2([a,b])\). (Technically, \(L_2\) consists of equivalence classes of functions differing on sets of measure zero).

3. Geometric Toolkit from the Inner Product

An inner product unlocks several geometric concepts:

Induced Norm (Length/Magnitude)

The inner product defines a norm (length):

\[\Vert f \Vert = \sqrt{\langle f \vert f \rangle}\]

For the \(L_2\) inner product, this is the \(L_2\)-norm:

\[\Vert f \Vert_2 = \left( \int_a^b \vert f(x) \vert^2 dx \right)^{1/2}\]

Often represents “energy” or “RMS value”.

Metric (Distance)

The norm defines a metric (distance):

\[d(f,g) = \Vert f - g \Vert = \sqrt{\langle f-g \vert f-g \rangle}\]

Angles and Orthogonality

The Cauchy-Schwarz Inequality is key:

\[\vert \langle f \vert g \rangle \vert \le \Vert f \Vert \Vert g \Vert\]

For real functions, this allows defining angle \(\theta\) via \(\cos \theta = \frac{\langle f \vert g \rangle}{\Vert f \Vert \Vert g \Vert}\). Crucially, it defines orthogonality: \(\vert f \rangle\) and \(\vert g \rangle\) are orthogonal if:

\[\langle f \vert g \rangle = 0\]

Example: Orthogonal Sines
Consider real functions on \([-\pi, \pi]\) with \(\langle f \vert g \rangle = \int_{-\pi}^{\pi} f(x)g(x)dx\). Let \(\vert f_m \rangle \leftrightarrow f_m(x) = \sin(mx)\) and \(\vert f_n \rangle \leftrightarrow f_n(x) = \sin(nx)\) for positive integers \(m, n\). Then \(\langle f_m \vert f_n \rangle = \int_{-\pi}^{\pi} \sin(mx)\sin(nx)dx\).
If \(m \neq n\), \(\langle f_m \vert f_n \rangle = 0\) (orthogonal).
If \(m = n\), \(\langle f_n \vert f_n \rangle = \Vert f_n \Vert^2 = \int_{-\pi}^{\pi} \sin^2(nx)dx = \pi\). This orthogonality underpins Fourier series.

Projections

The projection of \(\vert g \rangle\) onto the direction of \(\vert f \rangle\) (\(\Vert f \Vert \neq 0\)) is:

\[\text{proj}_{\vert f \rangle} \vert g \rangle = \frac{\langle f \vert g \rangle}{\langle f \vert f \rangle} \vert f \rangle = \frac{\langle f \vert g \rangle}{\Vert f \Vert^2} \vert f \rangle\]

This is the component of \(\vert g \rangle\) along \(\vert f \rangle\). For an orthogonal set \(\{\vert \phi_k \rangle\}\), the best approximation of \(\vert f \rangle\) in their span is:

\[\vert f_{\text{approx}} \rangle = \sum_k \frac{\langle \phi_k \vert f \rangle}{\Vert \phi_k \Vert^2} \vert \phi_k \rangle\]

Projections and Interpolation Formulas: A Duality Perspective

Projecting onto basis functions connects to constructing interpolation formulas. This often involves finding basis functions \(\{\vert \phi_i \rangle\}\) and linear functionals \(\{\langle \Lambda_j \vert\}\) (e.g., point evaluations) forming a biorthogonal system.

Biorthogonality and Interpolation

Given a basis \(\{\vert \phi_i \rangle\}_{i=0}^n\) for a function space \(V\) (e.g., polynomials \(\mathcal{P}_n\)) and \(n+1\) linearly independent linear functionals \(\{\langle \Lambda_j \vert\}_{j=0}^n\) on \(V\). They are biorthogonal if:

\[\langle \Lambda_j \vert \phi_i \rangle = \delta_{ji} \quad (\text{Kronecker delta})\]

If such a system exists, any \(\vert P \rangle \in V\) expands as:

\[\vert P \rangle = \sum_{i=0}^n \langle \Lambda_i \vert P \rangle \vert \phi_i \rangle\]

Application to Interpolation: To find \(\vert P \rangle \in V\) satisfying \(\langle \Lambda_j \vert P \rangle = y_j\) (given values \(y_j\)), if \(\{\vert \phi_i \rangle\}\) is biorthogonal to \(\{\langle \Lambda_j \vert\}\), then:

\[\vert P \rangle = \sum_{j=0}^n y_j \vert \phi_j \rangle\]

The coefficients are the target values \(y_j\).

1. Lagrange Interpolation:

Functionals: \(\langle \Lambda_j \vert P \rangle = P(x_j)\) (point evaluations at distinct \(x_j\)).
Target: \(P(x_j) = y_j\).
Biorthogonal Basis (Lagrange Polynomials \(L_i(x)\)) : \(L_i(x_j) = \delta_{ji}\).
\[L_i(x) = \prod_{k=0, k \neq i}^n \frac{x-x_k}{x_i-x_k}\]
Formula: \(P(x) = \sum_{j=0}^n y_j L_j(x)\).

2. Taylor Series (Polynomial Approximation):

Functionals: \(\langle \Lambda_j \vert P \rangle = P^{(j)}(x_0)\) (derivatives at \(x_0\)).
Target: \(P^{(j)}(x_0) = f^{(j)}(x_0)\) for some function \(f\). So, \(y_j = f^{(j)}(x_0)\).
Biorthogonal Basis: \(\phi_i(x) = \frac{(x-x_0)^i}{i!}\) satisfy \(\phi_i^{(j)}(x_0) = \delta_{ji}\).
Formula (Taylor Polynomial): \(P(x) = \sum_{j=0}^n f^{(j)}(x_0) \frac{(x-x_0)^j}{j!}\).

3. Newton Interpolation:

Functionals: Same as Lagrange, \(\langle \Lambda_j \vert P \rangle = P(x_j)\).
Target: \(P(x_j) = y_j\).
Basis (Newton Polynomials): \(\phi_i(x) = \prod_{k=0}^{i-1} (x - x_k)\), for \(i \ge 1\) and \(\phi_0(x)=1\). This basis is computationally efficient because adding a new point only requires adding one new basis element and coefficient, unlike the Lagrange basis which must be completely recomputed.
Formula: \(P(x) = \sum_{j=0}^n c_j \phi_j(x)\). The coefficients \(c_j\) are called divided differences, denoted \(f[x_0, \dots, x_j]\), and are determined recursively.
\[c_0 = y_0, \quad c_1 = \frac{y_1 - y_0}{x_1 - x_0}, \quad c_2 = \frac{\frac{y_2 - y_1}{x_2 - x_1} - \frac{y_1 - y_0}{x_1 - x_0}}{x_2 - x_0}, \quad \dots\]
This basis is not biorthogonal to the point evaluation functionals, but it provides a different, powerful approach to solving the same interpolation problem.

Note on Convergence: Biorthogonality gives the form of interpolating functions. Whether these \(P_n(x)\) converge to an underlying \(f(x)\) as \(n \to \infty\) is a separate issue in approximation theory (e.g., Runge’s phenomenon, Fourier series convergence).

4. The Crucial Ingredient: Completeness

An inner product provides geometry. For robust analysis (convergent algorithms, solution existence), we also need completeness.

Convergence and Cauchy Sequences

In a normed space, a sequence \((\vert f_n \rangle)\) converges to \(\vert f \rangle\) if \(\lim_{n \to \infty} \Vert \vert f_n \rangle - \vert f \rangle \Vert = 0\). A sequence \((\vert f_n \rangle)\) is Cauchy if its terms get arbitrarily close: \(\lim_{n,m \to \infty} \Vert \vert f_n \rangle - \vert f_m \rangle \Vert = 0\). Convergent sequences are Cauchy. The converse is not always true; the space might have “holes.”

The Problem of “Missing Limits”

Warning: Not All Inner Product Spaces Are Complete
Consider \(C([-1,1])\) (continuous functions on \([-1,1]\)) with the \(L_2\) inner product. The sequence \(f_n(x) = \tanh(nx)\) consists of continuous functions and is Cauchy in the \(L_2\)-norm. However, \(f_n(x)\) converges pointwise to the sign function:
\[f(x) = \lim_{n\to\infty} \tanh(nx) = \begin{cases} -1 & \text{if } x < 0 \\ 0 & \text{if } x = 0 \\ 1 & \text{if } x > 0 \end{cases}\]
This limit \(f(x)\) is in \(L_2([-1,1])\) but is not continuous. Thus, the limit of this Cauchy sequence of continuous functions is not in \(C([-1,1])\). So, \(C([-1,1])\) with the \(L_2\)-norm is not complete.

Why Completeness is Vital

Completeness ensures that if an iterative algorithm generates a Cauchy sequence of approximations, its limit exists within the working space as a valid solution.

Definition 4.1: Hilbert Space
A Hilbert space is an inner product space that is complete with respect to the norm induced by its inner product.

A Hilbert space has “filled in all its holes.” \(L_2([a,b])\) with the standard \(L_2\) inner product is a Hilbert space. It is the completion of \(C([a,b])\) under the \(L_2\)-norm.

5. Hilbert Spaces: Power and Applications

Hilbert spaces combine:

Rich geometric structure (inner product).
Strong analytical properties (completeness).

This combination yields powerful tools. The Best Approximation Theorem states: for any closed subspace \(M\) of a Hilbert space \(H\) and any \(\vert f \rangle \in H\), there exists a unique \(\vert f_M \rangle \in M\) closest to \(\vert f \rangle\). This \(\vert f_M \rangle\) is the orthogonal projection of \(\vert f \rangle\) onto \(M\). This theorem underpins methods like least squares and Fourier series.

Another profound result, fundamental to Hilbert space theory, is the Riesz Representation Theorem. It states that for every continuous linear functional \(\langle \psi \vert\) (a “bra”) on a Hilbert space \(\mathcal{H}\), there exists a unique vector \(\vert \psi_0 \rangle \in \mathcal{H}\) (a “ket”) such that \(\langle \psi \vert \phi \rangle = \langle \psi_0 \vert \phi \rangle_{\mathcal{H}}\) for all \(\vert \phi \rangle \in \mathcal{H}\). Moreover, the norm of the functional equals the norm of the vector: \(\Vert \langle \psi \vert \Vert = \Vert \vert \psi_0 \rangle \Vert_{\mathcal{H}}\). This theorem establishes an isometric isomorphism between a Hilbert space and its continuous dual space, uniquely and norm-preservingly pairing bras and kets.

Fourier Analysis as a Prime Example

Fourier theory thrives in Hilbert spaces. The space \(L_2([-\pi, \pi])\) of square-integrable complex-valued functions on \([-\pi, \pi]\) is a Hilbert space. Crucially, it is also separable, meaning it contains a countable dense subset. This property guarantees the existence of a countable orthonormal basis. Indeed, all separable, infinite-dimensional Hilbert spaces (over \(\mathbb{C}\) or \(\mathbb{R}\)) are isometrically isomorphic to \(\ell_2(\mathbb{N})\), the space of square-summable sequences. This highlights a universal structure among these spaces.

The set \(\left\{ \vert \phi_k \rangle \mid \phi_k(x) = \frac{1}{\sqrt{2\pi}} e^{ikx} \right\}_{k \in \mathbb{Z}}\) forms an orthonormal basis for \(L_2([-\pi, \pi])\).
- Orthogonal: \(\langle \phi_k \vert \phi_j \rangle = \delta_{kj}\) (Kronecker delta).
- Basis: Any \(\vert f \rangle \in L_2([-\pi, \pi])\) has a unique Fourier series representation (converging in \(L_2\)-norm):
  \[f(x) = \sum_{k=-\infty}^{\infty} c_k \frac{e^{ikx}}{\sqrt{2\pi}}, \quad \text{where} \quad c_k = \langle \phi_k \vert f \rangle = \frac{1}{\sqrt{2\pi}}\int_{-\pi}^{\pi} e^{-ikx} f(x) dx\]
Parseval’s Identity (infinite-dimensional Pythagorean theorem):
\[\Vert f \Vert^2 = \sum_{k=-\infty}^{\infty} \vert c_k \vert^2\]
The Fourier Transform is a unitary operator on \(L_2(\mathbb{R})\), preserving inner products and norms.

Deep Dive: The Fourier Transform as a Rotation in Function Space

The Fourier Transform (\(\mathcal{F}\)) on \(L_2(\mathbb{R})\) has remarkable properties, linking it to rotations, complex numbers, and the Gaussian function.

The Unitary FT and the Gaussian

The unitary Fourier transform is often defined as \(\mathcal{F}\{f\}(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty f(x) e^{-ikx} dx\).

Unitarity and Normalization: The factor \(1/\sqrt{2\pi}\) ensures \(\mathcal{F}\) is unitary on \(L_2(\mathbb{R})\) (\(\Vert \mathcal{F}f \Vert_2 = \Vert f \Vert_2\)) and stems from the Gaussian integral \(\int_{-\infty}^\infty e^{-x^2/2} dx = \sqrt{2\pi}\). This same integral makes the standard normal PDF, \(\varphi(x) = \frac{1}{\sqrt{2\pi}}e^{-x^2/2}\), integrate to 1.
Eigenfunctions: The Gaussian function \(g(x) = e^{-x^2/2}\) is the unique (+1)-eigenfunction of \(\mathcal{F}\). The full set of eigenfunctions are the Hermite-Gaussian functions, \(\psi_n(x)\), which form an orthonormal basis for \(L_2(\mathbb{R})\). They satisfy \(\mathcal{F}\{\psi_n\} = (-i)^n \psi_n\).
Involutive Property: Applying the transform twice gives the parity operator: \(\mathcal{F}^2 f(x) = f(-x)\). Consequently, \(\mathcal{F}^4 = \mathrm{Id}\), just like \(i^4 = 1\).

The “Quarter-Turn” Analogy Made Precise

The property \(\mathcal{F}^4 = \mathrm{Id}\) makes \(\mathcal{F}\) analogous to a “quarter-turn” or \(90^\circ\) rotation. This analogy is not just metaphorical; it is exact in the time-frequency phase space.

The Wigner Distribution, \(W_f(x,k)\), represents a function’s energy jointly in the time (\(x\)) and frequency (\(k\)) domains.
Applying the Fourier transform to a function \(f\) corresponds to a rigid \(90^\circ\) rotation of its Wigner distribution: \(W_{\mathcal{F}\{f\}}(x,k) = W_f(-k, x)\). Applying it twice rotates the distribution by \(180^\circ\), corresponding to \(W_f(-x,-k)\), which matches the parity operation \(\mathcal{F}^2 f(x) = f(-x)\).

Generalizing to Arbitrary Rotations

This idea can be generalized. The Fractional Fourier Transform (\(\mathcal{F}_\alpha\)) is a family of operators that rotate a function’s Wigner distribution by an arbitrary angle \(\alpha\).

\(\mathcal{F}_0 = \mathrm{Id}\) (no rotation)
\(\mathcal{F}_{\pi/2} = \mathcal{F}\) (the standard FT)
\(\mathcal{F}_{\pi} = P\) (the parity operator)

This provides a continuous interpolation between a function and its Fourier transform. The underlying mathematical framework for these operators is the metaplectic representation.

Other Key Areas Benefiting from Hilbert Space Theory

Quantum Mechanics: System states are vectors in a complex Hilbert space; observables are self-adjoint operators.
Signal Processing: Signal analysis, filtering, and compression use orthogonal bases (Fourier, wavelets) in Hilbert spaces.
Partial Differential Equations (PDEs): Existence/uniqueness proofs and numerical methods (e.g., FEM) are often set in Sobolev spaces (specific Hilbert spaces). PDEs are recast as variational problems: find a function minimizing an “energy” or satisfying a “weak” form. Theorems like Lax-Milgram use Hilbert space properties (completeness, Riesz Representation) to guarantee solutions.

Relevance to Machine Learning & Optimization

Function Approximation & Learning: Many ML problems seek an optimal function. Hilbert spaces provide the setting, often employing regularization:
\[\text{Minimize: } \text{Loss}(\text{data}, f) + \lambda \Vert f \Vert_H^2\]
Here, \(\Vert f \Vert_H^2\) is the squared norm in a Hilbert space \(H\) (e.g., Sobolev, RKHS), penalizing “complexity” (like roughness) to improve generalization. The choice of \(H\) encodes priors about good solutions.
Reproducing Kernel Hilbert Spaces (RKHS): Central to kernel methods (SVMs, Gaussian Processes). The “kernel trick” operates implicitly in an RKHS, whose norm is used for regularization.
Optimization in Function Spaces: The Riesz Representation Theorem is key. The derivative of a loss functional (a bra) can be converted to a gradient ket in the Hilbert space, enabling gradient-based optimization directly on functions.

6. Conclusion: Geometry and Analysis United

We began by needing more than basic vector space properties for functions. The inner product introduced geometry: length, distance, and orthogonality, analogous to \(\mathbb{R}^n\). This enables projections and decompositions, underpinned by the Best Approximation Theorem.

Infinite dimensions, however, demand completeness for robust analysis. Cauchy sequences must converge to limits within the space. Hilbert spaces are inner product spaces that are complete. This fusion of geometry and completeness makes them exceptionally powerful. They support rigorous application of geometric intuition and analytical techniques to infinite-dimensional problems.

Understanding Hilbert spaces is crucial for Fourier analysis, quantum mechanics, signal processing, solving PDEs, and the theory behind ML algorithms (especially those using regularization in function spaces like RKHS or relying on the Riesz Representation Theorem for optimization). The journey highlights how appropriate mathematical abstractions transform complex problems into more tractable, often elegant, forms.

Next Up: What if a function space has a norm and is complete, but the norm doesn’t stem from an inner product? We lose angles and orthogonality but retain a strong analytical framework. This leads to Banach spaces.

7. Summary Cheat Sheet

Concept	Description	Example/Analogy in \(\mathbb{R}^n\)	Key Implication in Function Spaces
Inner Product	Generalizes dot product; defines geometry.	\(\mathbf{x} \cdot \mathbf{y}\)	\(\langle f \vert g \rangle = \int \overline{f(x)}g(x) dx\) (\(L_2\))
Induced Norm	Length/magnitude from inner product.	\(\Vert \mathbf{x} \Vert = \sqrt{\mathbf{x} \cdot \mathbf{x}}\)	\(\Vert f \Vert = \sqrt{\langle f \vert f \rangle}\)
Orthogonality	“Perpendicular” if inner product is zero.	\(\mathbf{x} \cdot \mathbf{y} = 0\)	\(\langle f \vert g \rangle = 0\)
Cauchy Sequence	Terms get arbitrarily close: \(\Vert f_n - f_m \Vert \to 0\).	“Appears to converge.”	Defines “converging-like” behavior.
Completeness	Every Cauchy sequence converges to a limit within the space.	\(\mathbb{R}^n\) is complete.	No “holes”; limits of Cauchy sequences exist in the space.
Hilbert Space	Complete inner product space.	\(\mathbb{R}^n\) with dot product.	\(L_2([a,b])\) (prototypical); combines geometry & analysis.
Best Approx. Thm.	Unique closest point in a closed subspace via orthogonal projection.	Projecting vector onto a plane.	Justifies Fourier series, least squares.
Riesz Rep. Thm.	Every continuous linear functional is an inner product with a unique vector.	Dual vectors identified with original vectors.	Links dual space \(H^\ast\) to \(H\); enables gradients in function spaces.
Separability	Contains a countable dense subset.	\(\mathbb{Q}^n\) is dense in \(\mathbb{R}^n\).	Allows countable orthonormal bases (e.g., Fourier for \(L_2\)).
Isomorphism to \(\ell_2\)	All separable infinite-dim. Hilbert spaces are isomorphic to \(\ell_2\).	-	Universal structure for such spaces.
\(L_2\) Space	Functions \(f\) with \(\int \vert f(x) \vert^2 dx < \infty\).	-	Standard infinite-dimensional Hilbert space.
Orthonormal Basis	Mutually orthogonal unit-norm functions spanning the space.	Standard basis \((\mathbf{e}_i)\).	Fourier basis (\(\frac{1}{\sqrt{2\pi}}e^{ikx}\)); enables function decomposition.

References

Kowalski, E. (2013). Spectral theory in Hilbert spaces (ETH Z¨urich, FS 09). https://people.math.ethz.ch/~kowalski/spectral-theory.pdf

Mathematical Foundations, Machine Learning

This post is licensed under CC BY 4.0 by the author.