Differential Geometry Crash Course: Cheat Sheet

A quick reference guide summarizing key concepts, definitions, and formulas from the Differential Geometry crash course, covering manifolds, tangent spaces, metrics, geodesics, connections, and curvature.

Introduction

This cheat sheet provides a condensed summary of the core concepts covered in our three-part Differential Geometry Crash Course. Use it as a quick reference to recall definitions, key formulas, and the significance of these ideas, especially as they relate to understanding advanced topics in machine learning and optimization.

Part 1: Smooth Manifolds and Tangent Spaces – The Landscape

| Concept | Intuition/Brief Description | Key Formula/Representation | Relevance/Use |
| --- | --- | --- | --- |
| Smooth Manifold \(M\) | A space that locally looks like \(\mathbb{R}^n\) but whose global shape may differ (e.g., a sphere or a torus). | Locally Euclidean, Hausdorff, second-countable, with smooth transition maps between charts. | Represents parameter spaces, state spaces, or constraint surfaces in ML. |
| Chart \((U, \phi)\) | A local coordinate system on the manifold; maps a patch \(U \subset M\) to \(\mathbb{R}^n\). | \(\phi: U \to V \subseteq \mathbb{R}^n\), a homeomorphism onto an open set \(V\). | Allows performing calculus locally using familiar \(\mathbb{R}^n\) tools. |
| Atlas \(\mathcal{A}\) | A collection of charts that cover the entire manifold \(M\). | \(\mathcal{A} = \{(U_\alpha, \phi_\alpha)\}\) s.t. \(\bigcup_\alpha U_\alpha = M\). | Provides a complete description of the manifold’s local structure. |
| Tangent Space \(T_p M\) | Vector space of all possible “directions” or “velocities” at a point \(p \in M\). | Set of derivations at \(p\), or equivalence classes of curves through \(p\). Dimension \(n\). | Space where gradients and directional derivatives live on the manifold. Crucial for optimization. |
| Tangent Vector \(v \in T_p M\) | An element of the tangent space; a direction or instantaneous velocity. | \(v(fg) = f(p)v(g) + g(p)v(f)\) (as derivation). Or \(\gamma'(0)\) for a curve \(\gamma\) with \(\gamma(0) = p\). | Represents a specific direction of change, e.g., a gradient step. |
| Coordinate Basis \(\{\frac{\partial}{\partial x^i}\vert_p\}\) | A basis for \(T_p M\) induced by local coordinates \(x^i\). | \(v = \sum_i v^i \frac{\partial}{\partial x^i}\Big\vert_p\). | Practical way to represent tangent vectors and perform calculations. |
| Differential \((F_\ast )_p\) (Pushforward) | Linear map induced by \(F: M \to N\) that maps tangent vectors from \(T_p M\) to \(T_{F(p)} N\). | \(((F_\ast )_p v)(g) = v(g \circ F)\). Matrix: Jacobian \([\frac{\partial F^j}{\partial x^i}]\). | Describes how infinitesimal changes are transformed by maps. Used in change of variables, chain rule. |
| Vector Field \(X\) | A smooth assignment of a tangent vector \(X_p \in T_p M\) to each point \(p \in M\). | \(X(p) = \sum_i X^i(p) \frac{\partial}{\partial x^i}\Big\vert_p\). | Represents gradient fields, velocity fields for flows (e.g., gradient flow in optimization). |
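
To make the pushforward concrete, here is a minimal sketch that uses the polar-to-Cartesian map as a stand-in for \(F\). This map is an illustrative choice, not an example from the course, and the function names and the step size `h` are assumptions. It checks that multiplying by the Jacobian gives the same vector as differentiating the image curve \(F \circ \gamma\) at \(t = 0\).

```python
import math

def F(p):
    """Polar-to-Cartesian map F(r, theta) = (r cos theta, r sin theta)."""
    r, theta = p
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian_F(p):
    """Jacobian [dF^j/dx^i] of F at p (row j, column i)."""
    r, theta = p
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def pushforward(p, v):
    """Components of (F_*)_p v in the Cartesian coordinate basis."""
    J = jacobian_F(p)
    return tuple(sum(J[j][i] * v[i] for i in range(2)) for j in range(2))

def curve_velocity(p, v, h=1e-6):
    """Central-difference velocity of t -> F(p + t v) at t = 0."""
    plus = F((p[0] + h * v[0], p[1] + h * v[1]))
    minus = F((p[0] - h * v[0], p[1] - h * v[1]))
    return tuple((a - b) / (2 * h) for a, b in zip(plus, minus))

p = (2.0, math.pi / 2)   # the point r = 2, theta = pi/2
v = (0.0, 1.0)           # the coordinate vector d/dtheta at p
print("Jacobian pushforward:", pushforward(p, v))
print("curve velocity      :", curve_velocity(p, v))
```

The two results should agree up to finite-difference error, which is the "derivation" and "velocity of a curve" descriptions of a tangent vector giving the same answer.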

Part 2: Riemannian Metrics and Geodesics – Measuring and Moving

| Concept | Intuition/Brief Description | Key Formula/Representation | Relevance/Use |
| --- | --- | --- | --- |
| Riemannian Metric \(g\) | Smoothly assigns an inner product \(g_p\) to each tangent space \(T_p M\). | \(g_p(v,w)\); components \(g_{ij}(p) = g_p(\partial_i, \partial_j)\). Symmetric, positive-definite matrix \([g_{ij}]\). | Defines lengths, angles, volumes. Equips the manifold with geometric structure (e.g., the Fisher Information Metric). |
| Arc Length Element \(ds^2\) | Infinitesimal squared length along a curve. | \(ds^2 = \sum_{i,j} g_{ij}(x) \, dx^i dx^j\). | Used to calculate length of curves: \(L(\gamma) = \int_\gamma ds\). |
| Length of Curve \(L(\gamma)\) | Total length of a curve \(\gamma: [a,b] \to M\). | \(L(\gamma) = \int_a^b \sqrt{g_{\gamma(t)}(\gamma'(t), \gamma'(t))} \, dt\). | Measures path costs; fundamental for defining distance. |
| Riemannian Distance \(d(p,q)\) | Infimum of the lengths of curves joining two points \(p,q \in M\). | \(d(p,q) = \inf_{\gamma} L(\gamma)\), over piecewise-smooth curves \(\gamma\) from \(p\) to \(q\). | Intrinsic distance measure on the manifold. |
| Volume Form \(\text{vol}_g\) | An \(n\)-form that allows integration over the \(n\)-dimensional manifold. | \(\text{vol}_g = \sqrt{\det(g_{ij})} \, dx^1 \wedge \dots \wedge dx^n\) (in local oriented coords). | Defines volume, enables integration of functions, probability densities. |
| Geodesic \(\gamma(t)\) | “Straightest” possible curve on \(M\); locally length-minimizing. | Geodesic Eq: \(\frac{d^2 x^k}{dt^2} + \sum_{i,j} \Gamma^k_{ij} \frac{dx^i}{dt} \frac{dx^j}{dt} = 0\). | Idealized optimization paths, shortest paths. Model for “inertial motion” on the manifold. |
| Christoffel Symbols (2nd kind) \(\Gamma^k_{ij}\) | Coefficients in the geodesic equation; they depend on \(g_{ij}\) and its first derivatives. (See Part 3 for the full definition via a connection.) | \(\Gamma^k_{ij} = \frac{1}{2} \sum_l g^{kl}(\partial_i g_{jl} + \partial_j g_{il} - \partial_l g_{ij})\). | Quantify how coordinate basis vectors change; essential for defining geodesics and covariant derivatives. |
| Exponential Map \(\exp_p\) | Maps \(v \in T_p M\) to the point \(\gamma_v(1)\) reached by following a geodesic from \(p\) with initial velocity \(v\) for unit time. | \(\exp_p(v) = \gamma_v(1)\), defined on a neighborhood of \(0 \in T_p M\) (on all of \(T_p M\) if \(M\) is complete). | Provides “straight line” coordinates (normal coordinates); used in manifold optimization algorithms. |
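
The Christoffel formula can be checked numerically. The sketch below assumes the round metric on the unit sphere in coordinates \((\theta, \phi)\), \(g = \mathrm{diag}(1, \sin^2\theta)\), which is an illustrative choice rather than an example from the course. It approximates the metric derivatives by central differences, and the helper names and step size `h` are assumptions.

```python
import math

def metric(x):
    """Round metric on the unit sphere in coordinates x = (theta, phi)."""
    theta, _ = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix, giving the components g^{kl}."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def metric_derivative(x, l, h=1e-5):
    """Central-difference approximation of d g_ij / d x^l as a 2x2 matrix."""
    xp, xm = list(x), list(x)
    xp[l] += h
    xm[l] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)]
            for i in range(2)]

def christoffel(x):
    """Gamma[k][i][j] = 1/2 sum_l g^{kl} (d_i g_jl + d_j g_il - d_l g_ij)."""
    n = 2
    g_inv = inverse_2x2(metric(x))
    dg = [metric_derivative(x, l) for l in range(n)]  # dg[l][i][j] = d_l g_ij
    return [[[0.5 * sum(g_inv[k][l] * (dg[i][j][l] + dg[j][i][l] - dg[l][i][j])
                        for l in range(n))
              for j in range(n)]
             for i in range(n)]
            for k in range(n)]

theta = math.pi / 3
Gamma = christoffel((theta, 0.0))
# Closed forms on the unit sphere, for comparison:
#   Gamma^theta_{phi phi} = -sin(theta) cos(theta)
#   Gamma^phi_{theta phi} =  cos(theta) / sin(theta)
print(Gamma[0][1][1], -math.sin(theta) * math.cos(theta))
print(Gamma[1][0][1], math.cos(theta) / math.sin(theta))
```

On the equator (\(\theta = \pi/2\)) the symbol \(\Gamma^\theta_{\phi\phi}\) vanishes, which matches the geodesic equation being satisfied by the equator traversed at constant speed.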

Part 3: Connections, Covariant Derivatives, and Curvature – Bending and Twisting

| Concept | Intuition/Brief Description | Key Formula/Representation | Relevance/Use |
| --- | --- | --- | --- |
| Affine Connection \(\nabla\) | Operator defining differentiation of vector fields along other vector fields. | \(\nabla_X Y\). Satisfies \(C^\infty(M)\)-linearity in \(X\), \(\mathbb{R}\)-linearity in \(Y\), and the Leibniz rule \(\nabla_X (fY) = X(f)Y + f\nabla_X Y\). | Generalizes directional derivative to manifolds. |
| Covariant Derivative \(\nabla_X Y\) | Derivative of vector field \(Y\) along vector field \(X\), respecting manifold geometry. | \((\nabla_X Y)^k = X(Y^k) + \sum_{i,j} X^i Y^j \Gamma^k_{ij}\). | How vector fields (e.g., gradient fields) change across the manifold. |
| Christoffel Symbols (of connection) \(\Gamma^k_{ij}\) | Coefficients of the connection \(\nabla\) in local coordinates, defined by \(\nabla_{\partial_i} \partial_j = \sum_k \Gamma^k_{ij} \partial_k\). | Same formula as in Part 2 when \(\nabla\) is the Levi-Civita connection. | Define how basis vectors “turn”; crucial for explicit calculations of covariant derivatives and curvature. |
| Levi-Civita Connection | Unique connection on a Riemannian manifold that is metric-compatible and torsion-free. | \(\nabla g = 0\) (metric-compatible), \(\Gamma^k_{ij} = \Gamma^k_{ji}\) in a coordinate basis (torsion-free). | The “natural” connection for Riemannian geometry; standard for most geometric analyses in physics and ML (e.g., with the Fisher Information Metric). |
| Parallel Transport \(P_\gamma\) | Moving a vector along a curve \(\gamma\) such that its covariant derivative along \(\gamma\) is zero. | \(\frac{DV}{dt} = \nabla_{\gamma'(t)} V = 0\). Preserves lengths and angles for the Levi-Civita connection. | Defines how to “keep a vector constant” along a path; essential for comparing vectors at different points. Path-dependent in curved spaces. |
| Riemann Curvature Tensor \(R(X,Y)Z\) | Measures non-commutativity of covariant derivatives; quantifies intrinsic curvature. | \(R(X,Y)Z = \nabla_X\nabla_Y Z - \nabla_Y\nabla_X Z - \nabla_{[X,Y]}Z\). Components \(R^l_{ijk}\). | Vanishes identically iff the manifold is locally flat (locally isometric to Euclidean space). Describes how geodesics deviate/converge. Influences complexity of loss landscapes. |
| Sectional Curvature \(K(\sigma)\) | Curvature of a 2D plane section \(\sigma \subset T_p M\). | \(K(u,v) = g(R(u,v)v, u)\) for orthonormal \(u,v\) spanning \(\sigma\). | Generalizes Gaussian curvature of surfaces. Positive/negative values affect local geometry significantly. |
| Ricci Curvature \(\text{Ric}_{jk}\) | A contraction of the Riemann tensor; for a unit vector \(v\), \(\text{Ric}(v,v)\) is \((n-1)\) times the average sectional curvature of planes containing \(v\). | \(\text{Ric}_{jk} = \sum_i R^i_{jik}\). | Measures how volume of geodesic balls deviates from Euclidean. Linked to Hessian in some ML contexts (e.g., information geometry). |
| Scalar Curvature \(S\) | Total contraction of Riemann tensor; a single number at each point. | \(S = \sum_{j,k} g^{jk} \text{Ric}_{jk}\). | Overall measure of curvature at a point. For surfaces (\(n=2\)), \(S=2K\), where \(K\) is the Gaussian curvature. |
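
The path dependence of parallel transport, and its link to curvature, can be seen in a small experiment. The sketch below assumes the unit sphere with its Levi-Civita connection and transports a vector once around the latitude circle \(\theta = \theta_0\). The choice of curve, the RK4 integrator, and all function names are illustrative assumptions rather than part of the course.

```python
import math

def transport_rhs(theta0, V):
    """dV^k/dt = -sum_{i,j} Gamma^k_ij gamma'^i V^j along gamma(t) = (theta0, t).

    Only gamma'^phi = 1 is nonzero, so only Gamma^theta_{phi phi} and
    Gamma^phi_{phi theta} of the unit sphere contribute.
    """
    V_theta, V_phi = V
    gamma_theta_phiphi = -math.sin(theta0) * math.cos(theta0)
    gamma_phi_phitheta = math.cos(theta0) / math.sin(theta0)
    return (-gamma_theta_phiphi * V_phi, -gamma_phi_phitheta * V_theta)

def parallel_transport(theta0, V0, steps=2000):
    """Transport V0 once around the latitude circle with classical RK4."""
    h = 2 * math.pi / steps
    V = V0
    for _ in range(steps):
        k1 = transport_rhs(theta0, V)
        k2 = transport_rhs(theta0, (V[0] + 0.5 * h * k1[0], V[1] + 0.5 * h * k1[1]))
        k3 = transport_rhs(theta0, (V[0] + 0.5 * h * k2[0], V[1] + 0.5 * h * k2[1]))
        k4 = transport_rhs(theta0, (V[0] + h * k3[0], V[1] + h * k3[1]))
        V = (V[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             V[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
    return V

def inner(theta0, V, W):
    """g(V, W) for the sphere metric diag(1, sin^2 theta) at theta = theta0."""
    return V[0] * W[0] + math.sin(theta0) ** 2 * V[1] * W[1]

def metric_norm(theta0, V):
    return math.sqrt(inner(theta0, V, V))

def holonomy_angle(theta0, V0, V1):
    """Unsigned angle between the initial and the transported vector."""
    c = inner(theta0, V0, V1) / (metric_norm(theta0, V0) * metric_norm(theta0, V1))
    return math.acos(c)

theta0 = math.pi / 4
V0 = (1.0, 0.0)                       # unit vector along d/dtheta
V1 = parallel_transport(theta0, V0)
print("norm after transport:", metric_norm(theta0, V1))
print("holonomy angle      :", holonomy_angle(theta0, V0, V1))
print("cap area 2*pi*(1 - cos(theta0)):", 2 * math.pi * (1 - math.cos(theta0)))
```

For this loop the rotation angle should match the area of the enclosed polar cap, \(2\pi(1 - \cos\theta_0)\), because the unit sphere has Gaussian curvature \(K = 1\). The length of the vector is preserved, as expected for the Levi-Civita connection.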

Reflection

This cheat sheet serves as a high-level map to the essential landmarks of differential geometry introduced in the crash course. Mastering these concepts involves more than memorization; it requires developing geometric intuition and understanding how these tools interrelate to describe the structure of complex spaces. These foundational ideas pave the way for understanding more specialized areas like Information Geometry, where the “space” is a family of probability distributions and the “metric” (often the Fisher Information Metric) quantifies distinguishability, profoundly impacting the design and analysis of learning algorithms.

This post is licensed under CC BY 4.0 by the author.