Note

A Modern Introduction to Online Learning - Context

These are notes on the text A Modern Introduction to Online Learning by Francesco Orabona, available on arXiv.

Contrast with offline learning:

Definition - Offline Learning (Batch Learning)

A learning paradigm where the model is trained using the entire available dataset at once (in a “batch”). The learning process is completed before the model is deployed to make predictions.

Updates typically require periodic retraining on the full (potentially augmented) dataset, making it slow to adapt to new data patterns.
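As a minimal sketch of the batch paradigm (the synthetic data and least-squares task here are illustrative assumptions, not from the text), training happens in one shot over the full dataset:

```python
import numpy as np

# Synthetic regression data: in the batch setting, the full dataset
# (X, y) is assumed to be available before training begins.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
true_w = np.array([2.0, -1.0, 0.5])           # hypothetical ground-truth weights
y = X @ true_w + 0.1 * rng.normal(size=1000)

# Offline (batch) learning: a single fit over the entire dataset,
# here via the closed-form least-squares solution.
w_batch, *_ = np.linalg.lstsq(X, y, rcond=None)
print(w_batch)  # approximately recovers true_w
```

Incorporating new data would mean re-running this fit on the augmented dataset, which is exactly the slow-adaptation cost noted above.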

Definition - Online Learning (Incremental / Sequential Learning)

A learning paradigm where the model learns sequentially, updating itself incrementally as new data points or small mini-batches arrive.

Learning is continuous and interleaved with the prediction process, allowing the model to adapt rapidly to new patterns or changes in data streams without needing the entire dataset upfront.
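A minimal sketch of this protocol, using online gradient descent on the squared loss (the step size eta and the data stream are my own illustrative assumptions, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])   # hypothetical ground-truth weights

w = np.zeros(3)        # online model, updated one example at a time
eta = 0.05             # assumed constant step size

# Online protocol: predict on x_t, observe y_t, suffer a loss,
# then update the model before the next example arrives.
for t in range(2000):
    x_t = rng.normal(size=3)                  # next point in the stream
    y_hat = w @ x_t                           # predict
    y_t = true_w @ x_t + 0.1 * rng.normal()   # observe the true outcome
    grad = (y_hat - y_t) * x_t                # gradient of 0.5 * (y_hat - y_t)**2
    w -= eta * grad                           # incremental update

print(w)  # approaches true_w without ever storing the stream
```

Each round touches a single example and then discards it, which is what keeps the per-update memory and computation costs low.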

Analogy:

  • Offline Learning: Reading an entire textbook cover-to-cover, taking a final exam, and then using that knowledge. To learn updates, you need to get a whole new edition of the textbook and study it again.
  • Online Learning: Reading news articles one by one as they are published and constantly updating your understanding of current events based on each new piece of information.

Here’s a table summarizing the key differences:

| Feature | Online Learning | Offline Learning (Batch Learning) |
| --- | --- | --- |
| Data Requirement | Data arrives sequentially (streams) | Entire dataset needed upfront |
| Model Update | Incremental, per instance/mini-batch | On the entire dataset, periodically |
| Training Phase | Continuous / interleaved with use | Distinct, separate from deployment |
| Adaptability | High, fast adaptation to change | Low, slow adaptation (requires retraining) |
| Memory Usage | Low (per update) | High (during batch training) |
| Computation (Update) | Low per update | High during batch training |
| Handling Large Data | Excellent | Challenging if data exceeds memory |
| Concept Drift | Handles well | Handles poorly without retraining |
| Data Order | Can be sensitive | Less sensitive (often shuffled) |
| “Forgetting” | Potential issue (catastrophic) | Less prone (sees all data repeatedly) |

In essence, offline learning is suitable for static environments where batch processing is feasible, while online learning excels in dynamic environments with streaming data where continuous adaptation is crucial.
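To make the concept-drift row of the table concrete, here is a toy variation of the sketch above (entirely illustrative) in which the underlying target shifts mid-stream and the online learner tracks it:

```python
import numpy as np

rng = np.random.default_rng(1)
w, eta = np.zeros(3), 0.05   # same assumed OGD setup as before

for t in range(4000):
    # Concept drift: the target changes halfway through the stream.
    true_w = np.array([2.0, -1.0, 0.5]) if t < 2000 else np.array([-1.0, 0.0, 3.0])
    x_t = rng.normal(size=3)
    y_t = true_w @ x_t + 0.1 * rng.normal()
    w -= eta * ((w @ x_t) - y_t) * x_t        # same incremental update

print(w)  # tracks the post-drift target; a batch model would need full retraining
```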

This post is licensed under CC BY 4.0 by the author.