Mathematical Optimization
- Preface
- Metrized Deep Learning: Finding the Right "Measure" for Neural Network Optimization
- Introduction to Basic Mathematical Optimization
- Momentum: Second-Order Gradient Flow ODE and a Multi-Step Method
- Speedrun of Common Gradient-Based ML Optimizers
- Adam Through the Lens of Information Geometry: A Diagonal Fisher Approximation
- Iterative Methods: Gradient-Free vs. Gradient-Based Optimization
- Desirable Properties of Optimizers
- Adam Optimizer: Online Learning of Updates and Efficacy with EMA
- Stochastic Gradient Descent: Noise as a Design Feature
- Challenges of High-Dimensional Non-Convex Optimization in Deep Learning
- Gradient Descent and Gradient Flow
- Adaptive Methods and Preconditioning: Reshaping the Landscape
- Problem Formalization: First Principles and Modern Perspectives in ML Optimization
- Soft Inductive Biases: Improving Generalization
- Optimization Theory for Machine Learning
- Crash Course: Numerical Methods for ODEs in Optimization
- Crash Course: Numerical Linear Algebra for Optimization
- Crash Course Cheat Sheet: Numerical Analysis for Optimization
- Convex Analysis: A Crash Course for Optimization
- Convex Analysis Part 1: Convex Sets – The Building Blocks
- Convex Analysis Part 2: Convex Functions – Shaping the Landscape
- Convex Analysis Part 3: Subdifferential Calculus – Handling Non-Smoothness
- Convex Analysis Part 4: Convex Optimization Problems – Formulation and Properties
- Convex Analysis Part 5: Duality, Conjugates, and Optimality Conditions
- Convex Analysis Part 6: Introduction to Convex Optimization Algorithms
- Convex Analysis: References and Further Reading
- Matrix Norms: Foundations for Metrized Deep Learning
- Numerical Analysis for Optimization