Machine Learning (56)
- A Modern Introduction to Online Learning - Ch 1
- Preface
- Introduction to Basic Mathematical Optimization
- Momentum: Second-Order Gradient Flow ODE and a Multi-Step Method
- Speedrun of Common Gradient-Based ML Optimizers
- Adam Through the Lens of Information Geometry: A Diagonal Fisher Approximation
- Iterative Methods: Gradient-Free vs. Gradient-Based Optimization
- Desirable Properties of Optimizers
- Adam Optimizer: Online Learning of Updates and Efficacy with EMA
- Stochastic Gradient Descent: Noise as a Design Feature
- Challenges of High-Dimensional Non-Convex Optimization in Deep Learning
- Gradient Descent and Gradient Flow
- Adaptive Methods and Preconditioning: Reshaping the Landscape
- Problem Formalization - First Principles and Modern Perspectives in ML Optimization
- Soft Inductive Biases: Improving Generalization
- Parameter-Free Optimization: Letting Algorithms Tune Themselves
- Metrized Deep Learning: Muon
- Optimization Theory for Machine Learning
- Linear Algebra Part 2: Orthogonality, Decompositions, and Advanced Topics
- Cheat Sheet: Linear Algebra
- Cheat Sheet: Elementary Functional Analysis for Optimization
- Matrix Norms: Foundations for Metrized Deep Learning
- Motivating Banach Spaces: Norms Measure Size
- Convex Analysis: A Crash Course for Optimization
- Differential Geometry – A Crash Course for Machine Learning
- Elementary Functional Analysis: Why Types Matter in Optimization
- Motivating Hilbert Spaces: Encoding Geometry
- Information Geometry Part 3: Applications in ML and Further Horizons
- Information Geometry: Cheat Sheet
- Basics of Complex Numbers
- Properties of Matrix Norms
- Elementary Functional Analysis for Optimization
- Information Geometry Part 1: Statistical Manifolds and the Fisher Metric
- Information Geometry Part 2: Duality, Divergences, and Natural Gradient
- Information Geometry – A Crash Course
- Linear Algebra Part 1: Foundations & Geometric Transformations
- Online Learning Crash Course – Part 0: Setting & Motivation
- Online Learning Crash Course – Part 1: Regret & Benchmarks
- Online Learning Crash Course – Part 2: Online Gradient Descent
- Online Learning Crash Course – Part 3: FTL & FTRL
- Online Learning Crash Course – Part 4: Mirror Descent & Geometry
- Online Learning Crash Course – Part 5: Adaptivity (AdaGrad & Beyond)
- Online Learning Crash Course – Part 6: Online-to-Batch Conversions
- Online Learning Crash Course – Part 7: Beyond Convexity (Teaser)
- Online Learning Crash Course – Part 8: Summary & Guidance
- Online Learning Crash Course – Cheat Sheet
- Online Learning Crash Course
- Tensor Calculus: Quick Reference Cheat Sheet
- Statistics & Info Theory Part 1: Statistical Foundations for ML
- Statistics & Info Theory Part 2: Information Theory Essentials for ML
- Statistics & Info Theory Cheat Sheet: Key Formulas & Definitions
- Statistics and Information Theory for Machine Learning
- Tensor Calculus Part 1: From Vectors to Tensors – Multilinear Algebra
- Tensor Calculus Part 2: Coordinate Changes, Covariance, Contravariance, and the Metric Tensor
- Tensor Calculus Part 3: Differentiating Tensors and Applications in Machine Learning
- Tensor Calculus: A Primer for Machine Learning & Optimization