Tags

Abstract Vector Spaces (1), Accelerated Methods (1), AdaGrad (4), Adam (4), Adam Optimizer (1), Adaptive Algorithms (1), Adaptive Learning Rates (1), Adaptive Methods (1), Adaptive Optimization (1), Affine Sets (1), Algebra (1), Algebraic Methods (1), Arc Length (1), Atlases (1), Automatic Differentiation (1)
Banach Spaces (1), Bandit Algorithms (1), Basis Vectors (1), Batch Learning (1), Beltrami Identity (1), Benchmarks (1), Best Practices (1), Boolean Algebra (1), Bra-Ket Notation (1), Brachistochrone (1), Bregman Divergence (3)
Calculus of Variations (2), Calculus Rules (1), Catenary (1), Charts (1), Cheat Sheet (8), Cheatsheet (1), Christoffel Symbols (2), Combinatorics (1), Completeness (2), Complex Numbers (1), Computational Cost (1), Condition Number (4), Conditional Expectation (1), Conditional Probability (1), Cones (1), Conjugate Gradient (2), Connections (2), Continuous Optimization (1), Contraction (1), Contravariance (3), Contravariant (2), Convergence (1), Convergence Analysis (1), Convergence Theory (1), Convex Functions (1), Convex Optimization (2), Convex Sets (1), Convexity (1), ConViT (1), Coordinate Invariance (1), Coordinate Transformations (2), Covariance (3), Covariant (2), Covariant Derivative (4), Covectors (1), Crash Course (29), Cross-Entropy (1), Curvature (3), Cutkosky-Sarlos (1)
Deep Learning (3), Definitions (1), Determinants (1), Differential Geometry (4), Diffusion Processes (1), Discrete Optimization (1), Discretization (2), Distance (1), DNF Formulas (1), Dropout (1), Dual Connections (1), Dual Numbers (1), Dual Spaces (2), Duality (4)
Early Stopping (1), Efficiency (1), Eigenvalues (1), Einstein Notation (3), Empirical Risk Minimization (1), Entropy (1), Epigraph (1), Euler Method (2), Euler's Formula (1), Euler-Lagrange Equation (7), Euler-Poisson Equation (1), Expectation (2), Exponential Moving Average (1)
FAdam (2), Fenchel Conjugate (1), First-Order Convexity (1), Fisher Information (2), Fisher Information Matrix (1), Fisher Information Metric (3), Fixed Point Theorems (1), Fokker–Planck Equation (1), Formulas (2), Foundations (1), Fourier Analysis (1), Francesco Orabona (1), FTL (1), FTRL (6), Functional Analysis (6), Functional Optimization (1), Functionals (4), Fundamental Lemma (1)
Generalization (4), Generalization Theory (1), Generating Functions (1), Geodesics (2), Gradient (4), Gradient Descent (6), Gradient Diversity (1), Gradient Flow (2), Gradient-Based Optimization (1), Gradient-Free Optimization (1), Gradients (2), Graph Networks (1)
Hahn-Banach Theorem (1), Hamiltonian Mechanics (3), Hessian (4), Hessian Spectrum (1), Hilbert Spaces (3), Hyperplanes (1)
Identities (1), Imaginary Unit (1), Implicit Bias (1), Implicit Regularization (2), Incremental Learning (1), Indicator Functions (1), Inductive Biases (1), Information Geometry (6), Information Theory (2), Inner Product Spaces (1), Integration by Parts (1), Interior-Point Methods (1), Intuition (2), Invariance (1), Isoperimetric Problems (2), Iterative Methods (4), Ito Calculus (2), Ito Lemma (2)
Jacobian (1), Jensen's Inequality (1)
KKT Conditions (1), KL Divergence (1), Kolmogorov Axioms (1)
L1 Regularization (1), L2 Regularization (1), L2 Spaces (1), Lagrange Multipliers (1), Lagrangian (1), Lagrangian Mechanics (3), Langevin Dynamics (1), Legendre Transform (3), Linear Algebra (5), Linear Multi-step Methods (1), Linear Programming (1), Linear Regression (1), Loss Function (1), Loss Functions (1), Loss Landscape Geometry (1), Loss Landscapes (1), Lp Spaces (1)
Machine Learning (3), Machine Learning Optimization (1), Manifolds (2), Matrices (1), Matrix Algebra (1), Matrix Decompositions (1), Matrix Norms (2), Matrix Sign Function (1), Matrix-Free Methods (1), Measure Theory (1), Metric Tensor (3), Metrized Learning (1), Minibatch Design (1), Mirror Descent (4), MLE (1), Modular Duality (1), Momentum (2), Multivariable Calculus (3), Muon Optimizer (1)
Natural Gradient (3), Natural Gradient Descent (1), Nesterov Accelerated Gradient (1), Newton's Method (4), Noise Engineering (1), Non-convex Optimization (3), Non-Euclidean Geometry (1), Non-smooth Optimization (1), Normed Spaces (2), Norms (1), Nuclear Norm (1), Numerical Linear Algebra (1)
Objective Function (1), OCO (8), ODE Discretization (1), ODE Solvers (1), OGD (1), OMD (4), Online Gradient Descent (1), Online Learning (14), Online-to-Batch (1), Optimal Control (1), Optimality (1), Optimality Conditions (1), Optimization (11), Optimization Algorithms (4), Optimization Challenges (1), Optimization Methods (1), Optimization Problems (1), Optimization Theory (4), Optimizer Properties (1), Optimizers (1), Ordinary Differential Equations (2), Orthogonality (2), Outer Product (1)
PAC-Bayes (1), Parallel Transport (1), Parameters (1), Partial Derivatives (3), Partial Differential Equations (1), Physics (2), PolarGrad (1), Polyak Heavy Ball (1), Polyhedra (1), Preconditioning (5), Prerequisite (1), Probability (2), Probability Distributions (2), Probability Theory (1), Problem Formalization (1), Problem Formulation (1), Proximal Algorithms (1)
Quadratic Programming (1), Quasi-Newton Methods (1), Quick Reference (1)
Random Matrix Theory (1), Random Variables (2), Regret (2), Regret Analysis (1), Regret Minimization (2), Regularization (2), Reinforcement Learning (1), Riemann Tensor (1), Riemannian Metrics (2), RMS Norm (1), RMSProp (2), Robustness (1)
Saddle Points (1), Scale-Free Optimization (1), Score-Based Generative Modeling (1), SDP (1), Second-Order Convexity (1), Separation Theorems (1), Sequential Decision Making (1), Set Theory (1), SGD (2), SGD Noise (1), Slater's Condition (1), Smooth Manifolds (1), SOCP (1), Soft Inductive Biases (1), Spectral Norm (1), Spectral Theory (1), Stability (1), Statistical Manifolds (2), Statistics (2), Stochastic Differential Equations (2), Stochastic Gradient Descent (1), Stochastic Optimization (2), Stratonovich Calculus (2), Subdifferential (1), Subgradient (1), Subgradient Method (1), Summary (2), Supremum Norm (1), SVD (1)
Tangent Spaces (1), Taylor Series (3), Tensor Algebra (2), Tensor Calculus (2), Tensors (4), Theorems (1), Time-Reversal (1), Transformations (1)
Variational Calculus (7), Variations (1), Vector Algebra (1), Vector Norms (1), Vectors (1)