Mathematical Optimization (11)
- Transformers as Constrained Optimization
- Lion-K CCWD: Corrected Cautious Weight Decay
- Fast Polar Decomposition for Muon Optimizer with Rational and Polynomial Iterations
- When Equivalent Weights Train Differently
- Crash Course: Numerical Linear Algebra for Optimization
- Crash Course Cheat Sheet: Numerical Analysis for Optimization
- Matrix Norms: Foundations for Metrized Deep Learning
- Basics of Complex Numbers
- Properties of Matrix Norms
- Crash Course: Numerical Methods for ODEs in Optimization
- Numerical Analysis for Optimization